Improving the Personal Data Markets for Everyone through Transparency

This is the third of a three-part series respectively covering the current state of the data broker industry, the threat it poses to national security, and a novel proposal for resolving these issues while maintaining or increasing profitability for data brokers.


Despite the huge risks posed by the current lack of transparency and oversight in the personal data broker industry, reform has been glacial to non-existent. As previously discussed, U.S. legislators use data brokers’ information to target their political campaigns despite general U.S. discomfort with such practices. Reforms are possible, however, if they can be done without threatening data broker profits. The proposal here is intended to enhance those profits while also safeguarding against some of the industry’s excesses by creating publicly transparent markets for the exchange of data.

Reforms improving transparency in both electoral processes and the data broker industry have proven successful in the past, so change is certainly possible. Over the course of the 1970s, federal regulations were put into place requiring elected officials to disclose all campaign contributions. In 2003, the FACT Act required credit bureaus to give consumers access to their credit reports at least once per year.

Public outrage over the industry’s excesses could kindle the drive for such reforms. In addition to the issues already discussed, there are a few incidents and concerns that play to partisan priorities. Liberals might be concerned over anti-abortion groups using phone location tracking to send women mobile ads as they enter Planned Parenthood clinics, or that the government could bypass intelligence agency objections to obtain lists of every Muslim in the U.S. for under $20,000 from several different brokers. Conservatives could realize that lists like these also record their own religions, political affiliations, gun ownership, etc., or that terrorist groups like ISIS could use these data to more efficiently target recruitment or attacks.

But the motive for reform matters little without a workable alternative to the status quo. So, I propose bringing the transactions of brokers into the light by establishing markets for the bulk exchange of data, inspired by the likes of commodities markets, Amazon, and Alibaba, but in which all transactions are matters of public record. On this market, groups could buy access to, rights to, storage of, or analysis of such data.

From a security perspective, this would make it considerably easier for U.S. intelligence and security agencies to protect U.S. citizens’ information. Buyers and sellers on the market would be subject to regular federal certification to ensure that they comply with general security practices and have a legitimate use for the data (keeping the data out of the hands of criminals, terrorists, and potentially hostile foreign governments), with the degree of scrutiny scaled to the quantity and sensitivity of the information. With an index of which groups had access to which data, security forces could better identify sources of leaks, which is currently nearly impossible. Additionally, brokers could easily outsource the actual storage of their data to specialized, highly secure firms rather than keeping redundant copies in their own systems, an arrangement under which hackers need only seek out the most weakly defended of many possible targets to breach.

For consumer advocates it means that watchdog groups can get a solid handle on these transactions, quickly flagging any particularly worrisome incidents, so that the worst of the industry can be tidied up without hindering more benign exchanges.

The data brokers might initially seem to risk a lot in this shift: opening themselves and their clients to public scrutiny plus compliance expenses. However, it also opens several new ways to enhance their own profitability.

First, it streamlines and unifies the market, reducing transaction costs. Second, it could facilitate standardization of data formats, revolutionizing data exchanges in the same way that standardized shipping containers revolutionized global trade. Third, it gives all data purchasers direct access to the history and provenance of data so that they can better assess its reliability and appropriateness for their uses. Fourth, it could open the brokers up to new sources of clients. University researchers in particular spring to mind here, as one could see them purchasing access to the data through their institution (which would act as the screened intermediary and would actually retain control of the data for security and IRB compliance purposes), but private sector groups could also find it easier to navigate a transparent market for data than the current disorganized and opaque one. Fifth, the overall transparency in the market would make it far easier to identify underserved niches in data collection and analysis, highlighting new opportunities for revenue.

With the interests of brokers, legislators, and the public aligned, we can all reap the benefits of the Age of Omniscience while keeping an eye on the watchmen.

How the Personal Data Markets Threaten National Security

This is the second of a three part series respectively covering the current state of the data broker industry, the threat it poses to national security, and a novel proposal for resolving these issues while maintaining or increasing profitability for data brokers.


As covered last week, the current market for personal data allows groups, including countries that might wish the U.S. harm, to easily and cheaply obtain detailed personal information on almost any U.S. citizen. Not only could this give potentially hostile nations an espionage advantage over the U.S., but it could also facilitate their attempts to influence popular opinion.

Both China and Russia have been implicated in bulk acquisition of data on U.S. citizens. U.S. officials have concluded that the 2015 Office of Personnel Management breach affecting over 4 million U.S. federal government employees, the Anthem breach of 78 million health records, the theft of travel records from United Airlines, and the theft of data from other targets are part of a campaign intended to build a massive database of U.S. citizen records.

Given the historically close relationship between the Russian government and organized crime, especially cybercrime, one must also consider the possibility of state involvement in several breaches attributed to Russian hackers. These include the 2015 theft of over 100,000 tax returns; the 2013 theft of 1 billion Yahoo accounts, discovered only in 2016 (one of the buyers of which, the lead investigator believed, “was potentially a foreign intelligence organization because the questions they were asking were very specific”); the allegedly state-sponsored 2014 Yahoo breach of 500 million accounts; the theft of 117 million LinkedIn login credentials; and the theft of approximately 40 million credit card records through a series of breaches of retailers, in which cyber criminals seemingly referenced American and European sanctions against Russia at the time.

The most conventional application for these data is espionage. Not only could they be used to identify U.S. intelligence assets, they could also point towards potential recruitment targets by revealing vulnerability to blackmail, financial difficulties, ideological susceptibility, and/or personality traits that facilitate recruitment. Fraudulent accounts or identity theft could also allow for infiltration of institutional and social networks to acquire additional information.

The less conventional threat it poses becomes apparent when one considers that the entire purpose of marketing data aggregation is to influence people more efficiently.

Russia has certainly demonstrated an interest, and apparent success, in mass influence of foreign populations. Recent Russian ventures have included influencing elections in the United States, the Netherlands, and Germany, as well as the Brexit referendum and the Italian referendum, through spreading fake news and (in the case of at least the U.S.) cyberwarfare. Even outside the context of elections, Russia has sponsored sophisticated covert networks of trolls that spread both terrifying hoaxes and subtle propaganda. Large networks of Twitterbots have also been identified propagating pro-Kremlin narratives.

The use of personal data could make targeting and tailoring these influence campaigns more efficient. Its effectiveness in shaping public opinion is highlighted by how reliant U.S. politicians have become on such data to fine-tune their messages to each audience, as touched upon last week. Even Donald Trump, a former skeptic of data’s efficacy, ended up using the services of the firm Cambridge Analytica (which claims to have 4,000 to 5,000 data points on 220 million U.S. adults, and even more when supplemented with the databases of larger firms) to build 100,000 web pages micro-targeted to appeal to specific voter segments. Clinton, of course, had her own extensive data mining and analytics operation, and Obama’s was often credited with giving him a strong campaign advantage.

Outside of politics, personal data analytics applied to marketing have proven effective at increasing a company’s profitability, even though (as of a 2015 study) most companies implemented them inefficiently.

Algorithmic analysis of consumer data can make targeting even more efficient. In a 2014 Telenor/MIT study, an automated marketing program used social network analysis to obtain 13 times the initial conversion rate of an experienced marketing team; that is to say, the people it selected as likely customers were 13 times as likely to buy services as the ones chosen by the human marketers. Moreover, 98% of the customers converted by the algorithm continued to use the service for the next month, compared to 37% of those converted by the human marketers.

As tools for converting personal data into persuasion continue to advance in sophistication, they will provide ever more potent weapons for public manipulation by hostile governments unless we can prevent our citizens’ information from falling into their hands.

Next week will cover how to keep our personal data out of their hands while improving the industry’s profitability and curbing its excesses.

Bringing Light to the Shadowy World of Data Brokers

This is the first of a three part series respectively covering the current state of the data broker industry, the threat it poses to national security, and a novel proposal for resolving these issues while maintaining or increasing profitability for data brokers.


Experian’s new report predicting data breach trends in 2017 provides a glimpse into just how much information on every one of us is stored in private databases and how poorly managed that information is.

As early as 2014, a White House report stated that “we live in a world of near-ubiquitous data collection.” This data collection provides the backbone for the huge (and largely opaque) ecosystem of private and government groups that buy and sell personal information. On the one hand, this ecosystem provides enormous convenience and indeed makes our current economy possible. On the other, the systemic lack of transparency of any of these groups or their transactions poses numerous risks, not only of individuals being defrauded but also to national security. To resolve these issues, I propose an alternative market-based framework intended to increase data security and accountability while enhancing the profitability of private stakeholders.

To begin grasping the issue’s scale, consider the example of one of the most visible data aggregators: Facebook. Facebook records a phenomenal amount of detail on the behaviors of its nearly 2 billion active users, which it uses to assign them over 52,000 unique attributes. When helping its customers (that is, advertisers) target their marketing, Facebook supplements its own data by partnering with nine different data broker firms around the world. Facebook also works with these brokers to match specific offline purchases with ad views by users to ensure that its ad targeting is effectively persuasive. Considering that Facebook Likes alone have proven sufficient for impressively reliable algorithmic extrapolations of users’ personal information such as sexual orientation, religion, relationship status, and alcohol use, one can only begin to imagine how much Facebook can infer about its users.

But even Facebook touches on only a portion of this opaque marketplace. Just one of its data suppliers, Acxiom, has multi-sourced information on over 700 million individuals, whose data it uses to conduct over 1 trillion transactions per week globally with over 3,000 clients, including over half of the Fortune 100, according to its 2016 annual report. Strangely, the 2016 report did not repeat the claim from the 2014 and 2013 reports that it has records of “Over 3,000 propensities for nearly every U.S. consumer”.

Such information can include categories such as social relationships, legal history, financial records, health information, Social Security number, ethnicity, address history, religion, political affiliation, purchase histories, and more. Indeed, minute details on nearly every aspect of individuals’ lives are available for sale.

Some of the information being sold by these brokers is incredibly sensitive, such as the lists of rape victims, dementia sufferers, HIV/AIDS sufferers, or those with addictive behaviors (helpfully sub-categorized into “alcohol”, “drugs”, and “gambling”) being sold for 7.9 cents per name in 2013. Even six years prior, in 2007, federal agencies acknowledged that criminals were using data sources like these to target telemarketing scams, buying lists like “‘Elderly Opportunity Seekers,’ 3.3 million older people ‘looking for ways to make money,’ and ‘Suffering Seniors,’ 4.7 million people with cancer or Alzheimer’s disease. ‘Oldies but Goodies’ contained 500,000 gamblers over 55 years old, for 8.5 cents apiece. One list said: ‘These people are gullible. They want to believe that their luck can change.’” Nor is this merely sporadic: Experian, one of Facebook’s partners and a major player in the industry, was advertising marketing lists based on medical prescription use, among its many categories.

In an even more direct case of fraud, recently settled by the FTC, criminals used information from re-sold payday loan applications purchased from a data broker to directly steal over $25 million from millions of bank accounts.

The current arrangement of the data broker industry also facilitates identity theft. Experian subsidiary Court Ventures was providing access to the financial records, SSNs, and other information of 200 million Americans to an online identity theft marketplace; 3.1 million of those records were actually queried. When these identity theft sites can’t purchase the data from the brokers, they steal them, like ssndob[dot]ru, which sold records on four million Americans obtained by covertly piping them from the databases of several brokers. (And brokers have been found stealing from each other as well.) These identity theft sites are sufficiently comprehensive that in a 2014 experiment, Brian Krebs was able to obtain from just two sites the SSNs, phone numbers, and address histories for all 13 members of the Senate Subcommittee on Consumer Protection, Product Safety and Insurance, as well as the heads of the FTC and Consumer Financial Protection Bureau. However, due to a great deal of redundancy between data brokers’ databases, it’s typically impossible to determine from which brokers these data originate and thus to identify any vulnerabilities or illicit sales in the overall industry.

This may be only the tip of the iceberg of the abuse. Industry oversight in the U.S. is minimal, so the problems described above have come to light largely by chance. In 2014, FTC Chair Edith Ramirez, whose agency theoretically has jurisdiction over data brokers, admitted that her agency didn’t even know how many data brokers were active, let alone details of their activities.

What the FTC can report is “a fundamental lack of transparency about data broker industry practices.” Indeed, the brokers are so resistant to sharing information about themselves that they refused to provide specifics on their data sources and customers even to a 2013 Congressional inquiry.

Their confidence in refusing such high-level inquiries may originate from the identities of a few of their customers in particular. Jeffrey Chester, executive director of the Center for Digital Democracy, notes that political campaigns frequently use data from brokers to target their advertisements, so “There’s no political pressure on Congress, really, to act. The data-broker lobby is incredibly powerful.”

Politicians may have good reason to worry about unveiling their relationships with data brokers, as a 2012 survey found that the majority of Americans are deeply uncomfortable with the idea of political ads tailored to them by their personal information, which is more or less the standard practice.

Unfortunately, the resultant lack of industry transparency raises not just a criminal threat to individuals, but several national security threats. Not only does it create several espionage risks, but it could facilitate foreign manipulation of U.S. public opinion.

But we’ll get into that next week.

North Korea: Interview with a Political Vampire

In light of the substantial media coverage of the Sony hack, Sony’s decision to cancel its release of The Interview, and the recent statement by US officials that North Korea was “centrally involved,” I thought I’d devote some time to discussing the implications of a nation-state perpetrating a cyber-attack on a private company. Most of the discussion I’ve seen thus far has focused on Sony’s response, with many expressing criticism of the company for acquiescing to a terrorist threat. And while there are some interesting proposals for how Sony should respond, (such as releasing the film online for free; or airlifting copies of the film into North Korea), I would like to focus on how we should view this under international law.