Cambridge Analytica Teaches Us Data is More Powerful Than AI

Cambridge Analytica is all over the news this week, and for good reason. The company has been hired to work on a large number of elections around the world, and there appear to have been some serious breaches of both law and ethics in the activities described in the press. As this piece explains, the targeting technology described is readily available and has been for the past several years. In fact, the marketing and advertising world is ahead of the political communications world in this type of capability. The technology used here is not new, but the type (and quantity) of data takes the AI to another level. Let me explain.

Data brokers and online marketers all collect or obtain data about individuals, including browsing history, location data, who your friends are and the like. This data can then be used to infer additional, unknown information about you, like your gender, political affiliation, current emotional state and how reliable you are. It can even be fed to AI to predict what you’re likely to buy next.

The capability goes even further to include predicting personality traits, aptitudes, and abilities in what is called psychometric profiling. Traditionally, these traits could only be measured through specifically designed tests and questionnaires. Through machine learning, they have now been correlated with other data about you, which can predict things like how neurotic you are, how open you are to new experiences or whether you are conscientious. Today, Twitter is an open data source that advertisers leverage to understand individuals’ motivations and desires. These insights help predict whom to advertise to.

IBM, as an example, introduced Watson Personality Insights as a cloud-based API service in 2015. Using Twitter and other data, it predicts people’s habits and preferences on an individual level by identifying personality characteristics, needs and values from written text such as social media posts.

The service offers three kinds of personality insights, detailed below along with a graphical representation of the API results:

  • Personality characteristics, a portrait of how an individual engages with the world across five primary dimensions: Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism (which IBM labels Emotional range).
  • Needs, which can infer certain aspects of a product that will resonate with an individual across twelve needs: Excitement, Harmony, Curiosity, Ideal, Closeness, Self-expression, Liberty, Love, Practicality, Stability, Challenge, and Structure.
  • Values, which can identify the motivating factors that influence a person’s decision-making across five dimensions: Self-transcendence / Helping others, Conservation / Tradition, Hedonism / Taking pleasure in life, Self-enhancement / Achieving success, and Open to change / Excitement.
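
To make the three kinds of insight concrete, here is a minimal Python sketch of how a profile like this might be represented and queried. It is not IBM's response format: the Profile class, the top_traits helper, the 0-to-1 score scale and every score shown are assumptions made up for illustration. Only the dimension names come from the list above.

```python
from dataclasses import dataclass
from typing import Dict, List

# Dimension names taken from the list above.
PERSONALITY = ["Openness", "Conscientiousness", "Extroversion",
               "Agreeableness", "Neuroticism"]
NEEDS = ["Excitement", "Harmony", "Curiosity", "Ideal", "Closeness",
         "Self-expression", "Liberty", "Love", "Practicality",
         "Stability", "Challenge", "Structure"]
VALUES = ["Self-transcendence", "Conservation", "Hedonism",
          "Self-enhancement", "Open to change"]


@dataclass
class Profile:
    """Hypothetical container for one person's scores (assumed 0.0 to 1.0)."""
    personality: Dict[str, float]
    needs: Dict[str, float]
    values: Dict[str, float]

    def __post_init__(self) -> None:
        # Reject any dimension name that is not in the lists above.
        checks = ((self.personality, PERSONALITY),
                  (self.needs, NEEDS),
                  (self.values, VALUES))
        for scores, allowed in checks:
            unknown = set(scores) - set(allowed)
            if unknown:
                raise ValueError(f"unknown dimensions: {sorted(unknown)}")


def top_traits(scores: Dict[str, float], n: int = 3) -> List[str]:
    """Return the n highest-scoring dimension names, highest first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Invented example scores, purely for illustration.
example = Profile(
    personality={"Openness": 0.82, "Neuroticism": 0.31},
    needs={"Excitement": 0.9, "Stability": 0.2, "Curiosity": 0.7},
    values={"Hedonism": 0.6, "Conservation": 0.1},
)
strongest_needs = top_traits(example.needs, n=2)
print(strongest_needs)
```

A marketer could use a ranking like strongest_needs to decide which aspect of a product to emphasize for a given person.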

This capability isn’t expensive. The price IBM charges for each profile generated ranges between $0.005 and $0.02 per API call, depending on the volume. In other words, the technology is mature, widely available and inexpensive.
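
To put those prices in perspective, the short Python sketch below works out the cost range for a large audience. It assumes one API call per profile and simply applies the low and high per-call prices quoted above. The actual volume tiers are not given here, so the function name and the flat-rate treatment are simplifications.

```python
from typing import Tuple

# Per-call price range quoted in the text, in US dollars.
PRICE_LOW = 0.005
PRICE_HIGH = 0.02


def profiling_cost_range(num_profiles: int) -> Tuple[float, float]:
    """Return (lowest, highest) total cost, assuming one API call per profile."""
    if num_profiles < 0:
        raise ValueError("num_profiles must be non-negative")
    return (num_profiles * PRICE_LOW, num_profiles * PRICE_HIGH)


low, high = profiling_cost_range(1_000_000)
print(f"1,000,000 profiles: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions, profiling a million people would cost somewhere between $5,000 and $20,000.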

What is unique in the case of Cambridge Analytica is that they appear to have run their 2016 U.S. presidential targeting campaigns leveraging immense amounts of Facebook user data that were acquired unethically and/or illegally. This amplified their capabilities, but again, this is not new technology. The takeaway is that the data is the key that unlocked the capability that Cambridge Analytica brought to the 2016 U.S. presidential election. They combined advanced machine learning and a powerful data set to gain an advantage.

So how do companies get the data to do this type of analysis without crossing ethical or legal lines? There are data sets available on the open market. Data on individuals that can feed personality prediction models are freely available from social media sites and also from paid sources.

Predicting personality is possible from Instagram photos, Twitter profiles and even phone-usage metrics. Crystal Knows, a startup, gives its users personality reports on their contacts drawn from Google or social media, and offers real-time suggestions for how to personalize emails or messages. This IBM demo uses free Twitter data.

However, the most impactful data for predictions comes from intelligently joining multiple data sets, like those available from social media and those available for purchase from data brokers. Putting together these data sets can enable both more predictive capabilities and more accurate predictions.
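
As a rough illustration of what joining data sets means in practice, the Python sketch below merges two small record lists on a shared identifier. Everything in it is hypothetical: the field names, the user_id key, the values and the inner_join helper are invented for this example. Real data sets rarely share a clean key, so matching people across sources is usually much harder than this.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]

# Hypothetical records: all field names and values are invented.
social_profiles: List[Record] = [
    {"user_id": "u1", "openness": 0.82, "neuroticism": 0.31},
    {"user_id": "u2", "openness": 0.40, "neuroticism": 0.77},
]
broker_records: List[Record] = [
    {"user_id": "u1", "zip_code": "78701", "recent_purchase": "running shoes"},
    {"user_id": "u3", "zip_code": "10001", "recent_purchase": "headphones"},
]


def inner_join(left: List[Record], right: List[Record], key: str) -> List[Record]:
    """Merge records from both lists that share the same value for `key`."""
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Combine the fields from both sources into one richer record.
            joined.append({**row, **match})
    return joined


combined = inner_join(social_profiles, broker_records, "user_id")
print(combined)
```

Only the person who appears in both sources survives the join, but that one record now carries both personality scores and purchase data, which is what makes the combined set more useful for prediction than either source alone.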

The future is in predictions based on data. This graphic from Gartner neatly sums up where the world is heading. The value of data goes up as you move from looking in the rear view mirror, to gaining insights into why things happened (both of which are focused on the past), to predicting what will happen. The final step in the Gartner model is prescriptive analytics, which is what Cambridge Analytica pitched as their capability. It’s very unclear what impact they had on actual voting, but it’s where marketers are always looking to go.

At KUNGFU.AI, we’ve already received questions from potential customers who reference the Cambridge Analytica articles and ask about using psychographics to target potential clients for marketing purposes.

Our answer to whether we can do this work is: yes. The answer to whether we’re willing to do this work is: maybe. We have taken a strong, public stance on only using AI for Good and only working with clients on projects that follow this ethos.

With that said, there are myriad ways to use this targeting and marketing technology for good without crossing any ethical lines. Unfortunately, that does not appear to be the case in the work Cambridge Analytica did, as described in the press.