How the Personal Data Markets Threaten National Security

This is the second of a three-part series covering, respectively, the current state of the data broker industry, the threat it poses to national security, and a novel proposal for resolving these issues while maintaining or increasing profitability for data brokers.


As covered last week, the current market for personal data allows groups, including countries that might wish the U.S. harm, to easily and cheaply obtain detailed personal information on almost any U.S. citizen. Not only could this give potentially hostile nations an espionage advantage over the U.S., it could also facilitate their attempts to influence popular opinion.

Both China and Russia have been implicated in bulk acquisition of data on U.S. citizens. U.S. officials have concluded that the 2015 Office of Personnel Management breach affecting over 4 million U.S. federal government employees, the Anthem breach of 78 million health records, the theft of travel records from United Airlines, and the theft of data from other targets are part of a campaign intended to build a massive database of U.S. citizen records.

Given the historically close relationship between the Russian government and organized crime, especially cybercrime, one must also consider the possibility of state involvement in several breaches attributed to Russian hackers. These include the 2015 theft of over 100,000 tax returns; the 2013 theft of 1 billion Yahoo accounts, discovered only in 2016 (one buyer of which, the lead investigator believed, “was potentially a foreign intelligence organization because the questions they were asking were very specific”); the allegedly state-sponsored 2014 Yahoo breach of 500 million accounts; the theft of 117 million LinkedIn login credentials; and the theft of approximately 40 million credit card records through a series of breaches of retailers, in which cyber criminals seemingly referenced the American and European sanctions against Russia in force at the time.

The most conventional application for these data is espionage. Not only could they be used to identify U.S. intelligence assets, they could also point toward potential recruitment targets who are vulnerable because of blackmail material, financial difficulties, ideological susceptibility, or personality traits that facilitate recruitment. Fraudulent accounts or identity theft could also allow infiltration of institutional and social networks to acquire additional information.

The less conventional threat these data pose becomes apparent when one considers that the entire purpose of aggregating marketing data is to influence people more efficiently.

Russia has certainly demonstrated an interest in, and apparent success at, mass influence of foreign populations. Recent Russian ventures have included influencing elections in the United States, the Netherlands, and Germany, as well as the Brexit referendum and the Italian referendum, through spreading fake news and (in the case of at least the U.S.) cyberwarfare. Even outside the context of elections, Russia has sponsored sophisticated covert networks of trolls that spread both terrifying hoaxes and subtle propaganda. Large networks of Twitter bots have also been identified propagating pro-Kremlin narratives.

The use of personal data could make targeting and tailoring these influence campaigns more efficient. The effectiveness of such data in shaping public opinion is highlighted by how reliant U.S. politicians have become on using them to fine-tune their messages to each audience, as touched upon last week. Even Donald Trump, a former skeptic of data’s efficacy, ended up using the services of the firm Cambridge Analytica (which claims to have 4,000 to 5,000 data points on 220 million U.S. adults, and even more when supplemented with the databases of larger firms) to build 100,000 web pages micro-targeted to appeal to specific voter segments. Clinton, of course, had her own extensive data mining and analytics operation, and Obama’s was often credited with giving him a strong campaign advantage.

Outside of politics, personal data analytics applied to marketing has proven effective at increasing a company’s profitability, even though (as of a 2015 study) most companies implemented it inefficiently.

Algorithmic analysis of consumer data can make targeting even more efficient. In a 2014 Telenor/MIT study, an automated marketing program used social network analysis to obtain 13 times the initial conversion rate of an experienced marketing team; that is to say, the people it selected as likely customers were 13 times as likely to buy services as the ones chosen by the human marketers. In addition, 98% of the customers that the algorithm converted continued to use the service for the next month, compared to 37% of the customers converted by the human marketers.

As tools for converting personal data into persuasion continue to advance in sophistication, they will provide ever more potent weapons for public manipulation by hostile governments unless we can prevent our citizens’ information from falling into their hands.

Next week’s installment will cover how to keep our personal data out of hostile governments’ hands while improving the industry’s profitability and curbing its excesses.