Alejandro García Magos PhD
This article was originally published by Este Pais magazine in its July 2023 issue titled “Technopolitics: Digital Democracy.” The translation was made by the author with a few edits, solely to make it accessible to a broader audience. The original title in Spanish was “Encuestas electorales: pasado y futuro.”
Recent history has revealed a concerning trend in electoral surveys: their numbers are drifting away from the results at the ballot box. We have witnessed this in many parts of the world. A notable example in our hemisphere was the constitutional plebiscite in Chile, where polls slightly favored the “Reject” option, but it ultimately secured a comfortable 18-point victory over “Approve.” A similar situation occurred in Brazil in 2022, with all surveys showing Lula 10 points ahead of Bolsonaro, but they ended up almost tied. Not to mention cases like Trump in 2016, Brexit, or more recently, Erdogan in Turkey, who won easily despite all polls predicting his defeat.
What explains this disparity between electoral polls and election results? In short: social and technological changes that have put survey research as a method of social science in crisis. To provide some background, three key moments can be identified in the development of survey research. The first is its invention between the 1930s and 1960s, during which the theoretical and conceptual foundations that are familiar to us today were laid down: representativeness, biases, sampling frame, questionnaires, etc. The second moment is its expansion between 1960 and 1990, largely due to the rise of landline telephones among middle and lower classes. Finally, the third moment is one of crisis and adaptation, which began in 1990 and continues to this day, characterized by two circumstances: first, the decline in the number of participants in telephone surveys and face-to-face interviews; second, the emergence of the internet and new technologies.
These two circumstances have created a serious crisis for one of the most traditional methodologies for conducting polls: random digit dialing, which relies on telephone directories as a sampling frame and gives a significant group of residents in a given jurisdiction the same probability of being selected to participate in a survey. The expansion of mobile phones and the trend among young people to communicate through internet platforms rather than making calls have caused participation rates to fall sharply in just a few years. According to the Pew Research Center, between 1997 and 2018, the participation rate in their telephone surveys dropped from 36 to 6 percent.
All of this is to say that nowadays conducting telephone surveys is more difficult, as the target populations are becoming increasingly elusive. On the other hand, returning to face-to-face surveys is not always the best option, as they introduce social desirability biases, are very costly, and can even be physically impossible to carry out in conflict-ridden areas. I don’t mean to say that these methodologies are obsolete. What I am pointing out is that polling firms are currently facing a perfect storm: 1) declining participation rates, both by phone and in person; 2) an increase in costs driven by low participation and inflation; and 3) high demand for more abundant and higher-quality data in shorter timeframes.
And then, like a deus ex machina, the internet arrived. Indeed, it revolutionized surveys as a tool of social research by cutting the Gordian knot they were caught in. On one hand, it allowed reaching millions of connected users every day. On the other, it significantly reduced costs, to the point that nowadays anyone can conduct a survey on Google. Moreover, it opened the possibility of collecting massive and real-time data.
Fantastic, right? Well, not exactly. Conducting surveys over the internet also has its own difficulties. Let’s look at some of them.
Firstly, it is important to note that online surveys only consider the population that uses the internet. In poorer countries, this population tends to be small due to low internet penetration and/or biased towards young males. Even in developed countries, there is a bias towards the young population, who naturally spend more time online. However, perhaps the biggest problem is the lack of a sampling frame: there is no database with names, email addresses, and a unique identifier for each internet user in all countries. In other words, we do not know who is in cyberspace or how many people are there. This contrasts with what we see in telephone and face-to-face surveys, where a sampling frame does exist. For telephone surveys, the telephone directories that include the names and phone numbers of city residents served as the frame for a long time. For face-to-face surveys, there is always the possibility of using the population census.
To solve this problem, many online polling firms have created their own sampling frames in the form of panels by inviting or recruiting internet users. Like the pieces of a puzzle that together form an image, panels attempt to replicate society in miniature, aiming to be a faithful representation of a country’s population in terms of gender, age, income, ethnicity, language, etc. Once the panel is assembled, a random sample can be drawn in which all panelists have an equal chance of being selected for a survey, theoretically making it representative.
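The equal-chance draw described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the panel, its fields, and the function name `draw_sample` are hypothetical and do not describe any polling firm's actual procedure.

```python
import random


def draw_sample(panel, n, seed=None):
    """Draw a simple random sample of n panelists without replacement.

    Every panelist has the same probability of being selected.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(panel, n)


# Hypothetical panel of 1,000 recruited internet users.
panel = [
    {"id": i, "age_group": "18-34" if i % 3 == 0 else "35+"}
    for i in range(1000)
]

sample = draw_sample(panel, 100, seed=42)
print(len(sample), "panelists selected")
```

Note that an equal chance of selection only makes the sample representative of the panel. Whether it also represents the country depends on how well the panel itself mirrors the population.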
Easy, right? Well, not quite again.
Panel or No Panel
The problem lies in the fact that panels tend to be biased in terms of who they include and exclude in their databases. This is particularly notable in Latin America. Researchers from the Latin American Public Opinion Project (LAPOP) at Vanderbilt University published a report earlier this year revealing that polling firms using panels in the region have a strong bias towards educated individuals with medium and high incomes. They also point out that these firms find it very difficult to recruit and obtain responses from individuals outside these groups. As a result, surveys based on these panels only capture the voices of individuals with higher levels of education and political participation than the average population. And I haven’t even mentioned non-response bias, which tends to be high among panelists, who understandably may be hesitant to express their opinions on sensitive topics due to lack of anonymity. For some analysts, all of the above explains, to some extent, Clinton’s surprising defeat in 2016, precisely because the panels had insufficient representation of citizens without college degrees who mostly supported Trump, and/or many panelists turned out to be “shy Trumpers.”
As we can see, achieving an acceptable level of representativeness in an online survey is a complicated task. Much of the current work in both universities and technology laboratories is about finding a formula that allows us to take advantage of the benefits of the internet while respecting the methodological principles that guarantee an acceptable level of representativeness, and doing so in a transparent and clear manner. This is exactly what we try to do every day at RIWI Corp.: combining innovative methodologies with technological advancements.
To err is human, to rectify is wise.
Now, one of the current concerns with polls is that we’ve had a string of bad luck with them. Surveys can be wrong, that much is certain. And in hindsight, it’s easy to find explanations: hidden vote, biased or insufficient sample, misinterpretation or misuse by the media or political actors, among others. However, making mistakes is not indicative of manipulation. This is politics, and surprises abound. There is a very clear red line between making mistakes and committing fraud. The uncomfortable truth is that politicians of all stripes feel a great temptation to use surveys, essentially a research tool with its strengths and weaknesses, as a political weapon against their adversaries.
On this last point, there are many ways to manipulate an electoral poll. What is important to know is that the methodology, questionnaire, and sampling frame are fundamental aspects of a survey and require great technical expertise to be carried out correctly. In other words, they leave ample room to introduce biases, passively or actively. The selection of methodology, for instance, will have a decisive impact on the results: a telephone survey will give more weight to older voters and underestimate the youth vote, while an online survey will do the opposite. There is no perfect methodology, but it is necessary to know the method in detail when interpreting the results.
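A small worked example may help show how the choice of methodology moves the headline number. Every figure below is invented for illustration: the age split of the electorate, the support levels, and the make-up of each sample are assumptions, not data from any real poll.

```python
def estimated_support(sample_share, support_by_group):
    """Poll estimate: group support averaged by each group's share of the sample."""
    return sum(sample_share[g] * support_by_group[g] for g in sample_share)


# Hypothetical support for a candidate within each age group.
support = {"young": 0.60, "older": 0.40}

# Hypothetical shares of each group in the electorate and in two samples.
electorate = {"young": 0.50, "older": 0.50}
phone_sample = {"young": 0.20, "older": 0.80}   # skews toward older voters
online_sample = {"young": 0.80, "older": 0.20}  # skews toward younger voters

actual = estimated_support(electorate, support)
phone = estimated_support(phone_sample, support)
online = estimated_support(online_sample, support)

print(f"electorate: {actual:.2f}, phone: {phone:.2f}, online: {online:.2f}")
```

With these made-up numbers, the candidate's true support is 50 percent, yet the telephone sample puts it at about 44 percent and the online sample at about 56 percent. Nothing was falsified in either case. The gap comes entirely from who ended up in each sample.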
All of this must be taken into consideration when interpreting the data resulting from a survey, particularly an electoral poll. Polls are not “won.” Elections are won. Polls provide us with a signal at a specific moment and place, subject to methodological and technological limitations.