Election 2016: The Data Game

Elections are won or lost by candidates. But in a close race, data can be crucial to a campaign. Donald Trump won the presidency with razor-thin margins in swing states. The president-elect flipped Florida, Pennsylvania, Michigan and Wisconsin by margins of less than two per cent. If Hillary Clinton had taken those last three states, she would have won the election. Trump won those three states by a combined margin of approximately 107,000 votes.

The 2016 presidential election showed that the use of data to identify, persuade and turn out voters has become increasingly sophisticated. Cambridge Analytica’s Data Science, Digital Marketing, and Research teams informed key decisions on campaign travel, communications, and resource allocation.

The firm provided daily intelligence and recommendations to senior Trump campaign staff. The president-elect’s principal advisers were updated in real time with the latest data and guidance via a simple digital dashboard on their laptops.

How great a role did data play in this election? What lessons can be learned for the future? And is polling dead, as some observers have suggested?


Understanding Voters

It all starts with polling. Every week Cambridge Analytica collected responses from 1,500 to 2,000 people in each battleground state. This gave the firm an unrivalled insight into where the race stood every day, as well as giving it fresh information to add to its commercial and demographic data.
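To put those sample sizes in perspective, the sketch below estimates the sampling margin of error for a single weekly state poll. It assumes simple random sampling and a 95 per cent confidence level, which is a textbook simplification and not a description of the firm's actual methodology; the function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a proportion.

    Assumes simple random sampling; proportion=0.5 gives the widest interval.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Sample sizes quoted in the text: 1,500 to 2,000 respondents per state per week.
for n in (1500, 2000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")
```

Under these assumptions, a weekly sample of this size carries a margin of error of roughly two to two and a half points for each state.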

It used this research and data to model scores for all voters across key states: which candidate they preferred, which were “persuadable”, the issues they cared about, and how likely they were to actually vote on Election Day. Every voter in each battleground state was also segmented by ethnicity, religion, and the issues that concerned them most.
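As a rough illustration of what per-voter scores and segments might look like, here is a minimal sketch. The field names, thresholds, and segment labels are all hypothetical and chosen only for the example; the scores themselves are assumed to come from a model that is not shown, and nothing here reflects the firm's real models.

```python
from dataclasses import dataclass

@dataclass
class VoterScore:
    voter_id: str
    support: float   # modelled probability of preferring the candidate (0 to 1)
    turnout: float   # modelled probability of voting on Election Day (0 to 1)
    top_issue: str   # issue the voter is modelled to care about most

def segment(v: VoterScore) -> str:
    """Assign a voter to an outreach segment using hypothetical thresholds."""
    if 0.40 <= v.support <= 0.60:
        return "persuadable"          # undecided enough to target with persuasion
    if v.support > 0.60 and v.turnout < 0.50:
        return "turnout target"       # supportive but needs encouragement to vote
    if v.support > 0.60:
        return "base"                 # supportive and likely to vote
    return "unlikely supporter"

# Made-up records for illustration only.
voters = [
    VoterScore("v1", support=0.55, turnout=0.80, top_issue="trade"),
    VoterScore("v2", support=0.75, turnout=0.35, top_issue="immigration"),
    VoterScore("v3", support=0.85, turnout=0.90, top_issue="wages"),
    VoterScore("v4", support=0.20, turnout=0.70, top_issue="law and order"),
]

for v in voters:
    print(v.voter_id, segment(v), v.top_issue)
```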

This gave the firm a sophisticated understanding of the voter landscape at any given moment, rather than a picture based on how it had looked in elections four or eight years earlier.

Cambridge Analytica understood the so-called “Trump effect”. It recognised that some of his supporters were not typical Republicans and had not necessarily voted for John McCain or Mitt Romney. They were often more white, more rural, slightly younger, and cared about different issues, such as law and order, immigration, wages and trade. About 10 to 15 per cent were Democrats.

The firm’s scientists also analyzed absentee ballots and early voting returns. They knew that turnout was likely to be key, and that voter history was not going to be a reliable indicator for who was going to vote this year. So they updated their models to take into account the trends that they saw emerging in absentee ballots and early voting. The team saw that Donald Trump was going to get a lift in the critically important Rust Belt, and had a much wider path to victory than anyone else had anticipated.


It’s How You Use the Data

“There are no longer any experts except Cambridge Analytica. They were Trump’s digital team who figured out how to win.”

Frank Luntz, American political consultant and pollster

Understanding the electorate allowed Cambridge Analytica to advise the Trump campaign how to win. Its analysis and recommendations influenced resource allocation, where the candidate travelled to, voter persuasion, get-out-the-vote, messaging, TV targeting, digital ads, and fundraising.

Using fresh polling data, the firm continually ran simulations to calculate which combination of states formed the easiest “path to victory” in terms of winning the required number of Electoral College votes. From this it created a “State Priority Ranking” tool, which told the campaign which states were the most important to focus on in order to win the election: a vital tool for deciding where to spend resources. Priority scores were also given for cities and counties within each state, based on the number of “persuadable” voters in each location as well as those voters who needed to be encouraged to vote.
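The idea of a path-to-victory calculation and a state ranking can be sketched as follows. This is not the firm's tool: the win probabilities and the count of "safe" electoral votes are invented for illustration, states are assumed to be independent, and the sketch enumerates every combination of outcomes exactly instead of running random simulations, so that the result is reproducible. The priority score used here, the gap between the overall win probability when a state is won and when it is lost, is one plausible choice among many.

```python
from itertools import product

NEEDED = 270  # Electoral College votes required to win

def win_probability(safe_votes, states):
    """Probability of reaching 270 votes.

    states maps name -> (electoral_votes, win_probability).
    Assumes state outcomes are independent (a strong simplification).
    """
    names = list(states)
    total = 0.0
    for outcome in product([True, False], repeat=len(names)):
        prob, votes = 1.0, safe_votes
        for name, won in zip(names, outcome):
            ev, p = states[name]
            prob *= p if won else 1 - p
            if won:
                votes += ev
        if votes >= NEEDED:
            total += prob
    return total

def priority_ranking(safe_votes, states):
    """Rank states by P(win | state won) - P(win | state lost)."""
    scores = {}
    for name, (ev, _) in states.items():
        if_won = win_probability(safe_votes, {**states, name: (ev, 1.0)})
        if_lost = win_probability(safe_votes, {**states, name: (ev, 0.0)})
        scores[name] = if_won - if_lost
    return sorted(scores, key=scores.get, reverse=True)

# Electoral vote counts are the 2016 figures; probabilities and the
# 213 "safe" votes are hypothetical.
battlegrounds = {
    "Florida": (29, 0.50),
    "Pennsylvania": (20, 0.45),
    "Ohio": (18, 0.60),
    "Michigan": (16, 0.40),
    "Wisconsin": (10, 0.40),
}

print(f"Overall win probability: {win_probability(213, battlegrounds):.3f}")
print("Priority order:", priority_ranking(213, battlegrounds))
```

The same scoring idea could in principle be applied one level down, to cities and counties within a state, using counts of persuadable and low-turnout voters in place of electoral votes.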

This data was used to inform the president-elect’s travel schedule, so that he could campaign where he was most needed and speak to the issues that people there cared about. In the last week of the race, Cambridge Analytica’s top three priority states were Florida, Pennsylvania, and Ohio. These were where Donald Trump held the most rallies in the final week. Indeed, he visited the top seven cities that Cambridge had recommended.

When, in the final weeks of the race, the firm’s data scientists recalculated voter turnout and recalibrated their models to show how Donald Trump could win, the GOP candidate revisited states like Michigan and Wisconsin.

Cambridge Analytica’s data analysis informed the campaign on how to communicate with voters, and it was the firm’s Digital Team that planned and executed the digital advertising strategy that won over undecided voters. Online ads placed by the firm were viewed a staggering 1.5 billion times by millions of Americans, after the company ran 4,000 individual digital ad campaigns backing the Republican candidate. Its messaging was continually tested and refined using the firm’s polling and data. Through the digital team, the campaign was the first to leverage many of the emerging trends in the advertising industry, such as native advertising.

Finally, data was used for fundraising. Cambridge Analytica was initially hired by the campaign to help with its efforts to raise money. The firm’s data analytics played a role in raising more low-dollar donations than any Republican candidate has ever raised before. The return-on-investment was probably unprecedented for a presidential campaign.

What lessons can be drawn at this early stage?


Data Science Needs to be Client-Specific

Cambridge Analytica built a very specific data science program geared to its client. This was crucial. Donald Trump’s very direct and outspoken approach resulted in a highly volatile campaign, so his data program had to be highly responsive to changes in public opinion.

If Hillary Clinton’s campaign was the Titanic, then Donald Trump’s campaign was a speedboat: nimble, flexible, and able to adapt fast.

The firm built this state-of-the-art data operation and advertising technology infrastructure for a presidential campaign from scratch in less than six months: a first in U.S. political history.


Integrated Teams Work Better

“Donald Trump may not have had as impressive an operation on the ground as Hillary Clinton, but since June a very sophisticated data analysis operation had been underway.” 

Rory Cellan-Jones, BBC News Technology correspondent

The Trump campaign benefited hugely from having fully integrated teams carrying out research, data science and digital marketing. The Cambridge Analytica teams knew how to get the best out of one another. In addition, their workflow created a circular learning process. Field surveys directly influenced the data modelling, which in turn built audiences for digital marketing, TV ads, mail and other engagement. Field research then tested the effectiveness of the voter targeting, which was adapted and improved accordingly. This circular process meant the campaign was constantly learning and improving its outreach.

Cambridge Analytica blended experienced political operatives with the most talented and highly qualified PhD data scientists.


Polling and Data are Very Much Alive

The firm’s polling reflected changes in the electorate one to two days after significant developments on the campaign trail, because of its sample sizes and data modeling. Public polling lagged behind by about two weeks. Cambridge Analytica was able to quickly react by updating models to take these shifts into account.

Most pollsters underestimated Donald Trump’s support. Ahead of the election, Cambridge Analytica’s internal data showed the race tightening considerably, because its data scientists had seen the trends that would play out on election night.

Absentee ballots and early voting showed a decrease in African-American turnout, a marginally increased Hispanic turnout, and a big increase in rural and older turnout. So polling and modeling were re-weighted to more accurately reflect who was going to vote, and this gave Trump a one to three point boost in Rust Belt states like Michigan, Pennsylvania, Ohio, Iowa, and Indiana.
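The effect of re-weighting can be illustrated with a small worked example. Every number below is invented purely to show the mechanics: the group labels follow the turnout shifts described above, but the support levels and turnout shares are hypothetical and do not come from the firm's data.

```python
def weighted_support(support_by_group, turnout_share):
    """Overall support as a turnout-weighted average of group-level support."""
    total_share = sum(turnout_share.values())
    weighted = sum(support_by_group[g] * share for g, share in turnout_share.items())
    return weighted / total_share

# Hypothetical share of each group supporting the candidate.
support = {"rural_older": 0.60, "african_american": 0.10, "hispanic": 0.30, "other": 0.48}

# Hypothetical share of the electorate, based on past voter history...
history_based = {"rural_older": 0.30, "african_american": 0.13, "hispanic": 0.10, "other": 0.47}
# ...and after adjusting for trends seen in absentee and early voting.
early_vote_adjusted = {"rural_older": 0.34, "african_american": 0.11, "hispanic": 0.11, "other": 0.44}

before = weighted_support(support, history_based)
after = weighted_support(support, early_vote_adjusted)
print(f"Before re-weighting: {before:.1%}")
print(f"After re-weighting:  {after:.1%}")
print(f"Shift: {(after - before) * 100:+.1f} points")
```

The support within each group is unchanged; only the assumed make-up of the electorate moves, and that alone is enough to shift the topline by about a point in this made-up case.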

Also in play was a so-called “reverse-Bradley” or “hidden Trump” effect: a group of people who were not telling pollsters that they intended to vote for the president-elect. This probably gave Donald Trump a small boost of one to two per cent on election day.

Research is the lifeblood of data science. In order to quickly and accurately reflect changes in the electorate, data science needs polling data. Conversely, polling needs real data science if it wants to get it right. Polling can’t thrive in isolation, relying on assumptions drawn from past elections to understand today’s voters.

If data feels cold and impersonal, or perhaps downright spooky at times, then consider this: the data revolution is in the end making politics (or shopping) more intimate by restoring the human scale. Author and journalist Sasha Issenberg wrote about this paradox in his 2012 book The Victory Lab: “Campaigns are learning to quantify the ineffable – the value of a neighbor’s knock, of a stranger’s call, the delicate condition of being undecided – and isolate the moment when a behavior can be changed, or a heart won. Campaigns have started treating voters like people again.”