I woke up this morning to my friends checking whether they were among the more than 1 million Filipinos whose data had been stolen by Cambridge Analytica (CA). The Philippines was the second hardest-hit country in the data leak. With news reports investigating whether CA had a hand in influencing the Philippine national elections, many people came to the (largely incorrect) conclusion that if your data was leaked, you were then “brainwashed” into voting for a certain candidate.
It’s not that straightforward. Here’s how I understand what could have happened to our data (and yes, I was one of those whose information was scraped). Check if your Facebook account was affected.
The data from 1 million Filipino Facebook users can be treated as one giant data set, or can be further sifted into subsets to crunch — according to age, sex, location, etc. Analyzing such data, you can find correlations and links between certain preferences and political leanings. Even in 2014 (as featured in this Slate article), Facebook Likes were already being used as shockingly accurate predictors for personal traits.
These predictions didn’t come from obvious likes. For example, if someone likes the Republican Party’s Facebook page, it probably indicates that person is a Republican. But the connections used in this research are not obvious. The top likes that were strongly indicative of high intelligence, for instance, were “Thunderstorms,” “The Colbert Report,” “Science,” and “Curly Fries.”
How could liking curly fries be predictive? The reasoning relies on a few insights from sociology. Imagine one of the first people to like the page happened to be smart. Once she liked it, her friends saw it. A social science concept called homophily tells us that people tend to be friends with people like themselves. Smart people tend to be friends with smart people. Liberals are friends with other liberals. Rich people hang out with other rich people.
So if a smart person likes curly fries, her (smart) friends see that and some of them like the page, too. The same goes for their friends and so on. Basically, the liking spreads through a part of the social network that happens to be more intelligent. After a while, liking the curly fries page happens to become a thing that smart people do. When you like it, the algorithm guesses that you must also be smart.
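To make the curly fries idea concrete, here is a minimal sketch of how one might score a like as a predictor. It is not the method used in the research or by CA; the user data, the trait labels, and the function name `like_lift` are all made up for illustration. The score is "lift": how much more common a trait is among people who like a page than among everyone in the sample.

```python
from collections import defaultdict

# Hypothetical toy data: (pages a user likes, whether the user has the trait).
# These six users are invented purely for illustration.
users = [
    ({"Curly Fries", "Science"}, True),
    ({"Curly Fries"}, True),
    ({"Curly Fries", "Basketball"}, False),
    ({"Basketball"}, False),
    ({"Basketball", "Science"}, True),
    ({"Basketball"}, False),
]

def like_lift(users):
    """Return {page: P(trait | likes page) / P(trait)} for every page seen."""
    # Share of all users who have the trait (assumed to be non-zero here).
    base_rate = sum(1 for _, has_trait in users if has_trait) / len(users)
    likers = defaultdict(int)             # users who like each page
    likers_with_trait = defaultdict(int)  # of those, how many have the trait
    for likes, has_trait in users:
        for page in likes:
            likers[page] += 1
            if has_trait:
                likers_with_trait[page] += 1
    return {
        page: (likers_with_trait[page] / likers[page]) / base_rate
        for page in likers
    }

lift = like_lift(users)
for page, score in sorted(lift.items(), key=lambda item: -item[1]):
    print(f"{page}: lift {score:.2f}")
```

A lift above 1 means the trait is more common among people who like that page than in the sample overall; a lift below 1 means it is less common. Nothing in the calculation knows why the link exists, which is how a page like “Curly Fries” can end up being a useful signal.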
Market research companies do this all the time, but they use data from willing respondents. The difference here is that we did not consent to having our data used this way.
From this data analysis, you can build general profiles of the kinds of people you have in your sample. (UPDATE: This is called psychographics. Here’s a nice CNN article about this whole mess that explains what psychographics is and how it can be used to influence people towards a certain action.)
The bigger your sample is, the more precise the conclusions you can draw from it about a wider population, but only about the kind of people the sample actually represents. For instance, the usual Pulse Asia and SWS survey sample is only 1,200 respondents (but a representative one that cuts across socioeconomic and geographic divides). The sample from Facebook is over 1 million “respondents” but is limited to those who have cellphones and/or internet access. Still, 1 million is a huge data set and can be useful for making inferences about “those who have cellphones and/or internet access”. From the 1 million people who could have been profiled through their stolen data, the campaign could have targeted the wider voting population.
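As a rough illustration of what sample size buys you, the sketch below uses the standard textbook margin-of-error formula for a proportion at 95% confidence. It assumes a simple random sample, which the Facebook data is not, and the function name `margin_of_error` is my own, not something taken from the survey firms.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from
    a simple random sample of size n. p=0.5 is the worst case."""
    return z * math.sqrt(p * (1 - p) / n)

# The two sample sizes mentioned above.
for n in (1_200, 1_000_000):
    print(f"n = {n:>9,}: about +/- {margin_of_error(n):.1%}")
```

With 1,200 respondents the margin comes out to roughly 2.8 percentage points; with 1 million it shrinks to about 0.1. That smaller number only describes random sampling noise. It says nothing about who was left out of the sample, so a huge but skewed sample can still be precisely wrong about the population as a whole.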
Strategists can then decide on what kind of person to target & refine the strategies to use on them. They can determine what kind of person would be most responsive to their messaging, because it’s easier to reinforce an existing belief than to change someone’s mind. (Confirmation bias!) They can also find out which “influencers” get plenty of traction with the “target market”, then co-opt them or cooperate with them to unify messaging.
They can use various means of messaging: SMS, seeded stories to newspapers & radio show hosts, “overheard” copy-pastes on FB groups — I’m sure we’ve encountered all sorts of propaganda. 😉 Propaganda is most effective when delivered to those who would believe. “Converts” and “true believers” will then spread this in personal circles online & offline.
It’s a little like Inception: plant the seed of an idea so insidiously that the person thinks they came to that conclusion by themselves.