Bringing Light to the Shadowy World of Data Brokers

This is the first of a three-part series covering, respectively, the current state of the data broker industry, the threat it poses to national security, and a novel proposal for resolving these issues while maintaining or increasing profitability for data brokers.


Experian’s new report predicting data breach trends in 2017 provides a glimpse into just how much information on every one of us is stored in private databases and how poorly managed that information is.

As early as 2014, a White House report stated that “we live in a world of near-ubiquitous data collection.” This data collection provides the backbone for the huge (and largely opaque) ecosystem of private and government groups that buy and sell personal information. On the one hand, this ecosystem provides enormous convenience and indeed makes our current economy possible. On the other, the systemic lack of transparency around these groups and their transactions poses numerous risks, not only of fraud against individuals but also to national security. To resolve these issues, I propose an alternative market-based framework intended to increase data security and accountability while enhancing the profitability of private stakeholders.

To begin grasping the issue’s scale, consider the example of one of the most visible data aggregators: Facebook. Facebook records a phenomenal number of details on the behaviors of its nearly 2 billion active users, which it uses to assign them over 52,000 unique attributes. When helping its customers (that is, advertisers) target their marketing, Facebook supplements its own data by partnering with nine different data broker firms around the world. Facebook also works with these brokers to match specific offline purchases with ad views by users to ensure that its ad targeting is effectively persuasive. Considering that Facebook Likes alone have proven sufficient for impressively reliable algorithmic extrapolations of users’ personal information such as sexual orientation, religion, relationship status, and alcohol use, one can only begin to imagine how much Facebook can infer about its users.

But even Facebook touches on only a portion of this opaque marketplace. Just one of its data suppliers, Acxiom, has multi-sourced information on over 700 million individuals, whose data it uses to conduct over 1 trillion transactions per week globally with over 3,000 clients, including over half of the Fortune 100, according to its 2016 annual report. Strangely, the 2016 report did not repeat the claim from the 2014 and 2013 reports that it has records of “Over 3,000 propensities for nearly every U.S. consumer”.

Such information can include categories such as social relationships, legal history, financial records, health information, Social Security number, ethnicity, address history, religion, political affiliation, purchase histories… Indeed, minute details on nearly every aspect of individuals’ lives are available for sale.

Some of the information being sold by these brokers is incredibly sensitive, such as the lists of rape victims, dementia sufferers, HIV/AIDS sufferers, or those with addictive behaviors (helpfully sub-categorized into “alcohol”, “drugs”, and “gambling”) being sold for 7.9 cents per name in 2013. Even six years prior, in 2007, federal agencies acknowledged that criminals were using data sources like these to target telemarketing scams, buying lists like “‘Elderly Opportunity Seekers,’ 3.3 million older people ‘looking for ways to make money,’ and ‘Suffering Seniors,’ 4.7 million people with cancer or Alzheimer’s disease. ‘Oldies but Goodies’ contained 500,000 gamblers over 55 years old, for 8.5 cents apiece. One list said: ‘These people are gullible. They want to believe that their luck can change.’” Nor is this merely sporadic: Experian, one of Facebook’s partners and a major player in the industry, was advertising marketing lists based on medical prescription use, among its many categories.

In an even more direct case of fraud, recently settled by the FTC, criminals used information from re-sold payday loan applications, purchased from a data broker, to steal over $25 million directly from millions of bank accounts.

The current arrangement of the data broker industry also facilitates identity theft. Experian subsidiary Court Ventures was providing access to the financial records, SSNs, and other information of 200 million Americans to an online identity theft marketplace; 3.1 million of those records were actually queried. When these identity theft sites can’t purchase the data from the brokers, they steal it, as did ssndob[dot]ru, which sold records on four million Americans obtained by covertly piping them from the databases of several brokers. (And brokers have been found stealing from each other as well.) These identity theft sites are sufficiently comprehensive that in a 2014 experiment, Brian Krebs was able to obtain from just two sites the SSNs, phone numbers, and address histories for all 13 members of the Senate Subcommittee on Consumer Protection, Product Safety and Insurance, as well as the heads of the FTC and Consumer Financial Protection Bureau. However, due to a great deal of redundancy between data brokers’ databases, it’s typically impossible to determine from which brokers these data originate and thus to identify any vulnerabilities or illicit sales in the overall industry.

This may be only the tip of the iceberg of the abuse. Industry oversight in the U.S. is minimal, so all of the problems described above were uncovered largely by chance. In 2014, FTC Chair Edith Ramirez, whose agency theoretically has jurisdiction over data brokers, admitted that her agency didn’t even know how many data brokers were active, let alone details of their activities.

What the FTC can report is “a fundamental lack of transparency about data broker industry practices.” Indeed, the brokers are so resistant to sharing information about themselves that they refused to provide specifics on their data sources and customers even to a 2013 Congressional inquiry.

Their confidence in refusing such high-level inquiries may originate from the identities of a few of their customers in particular. Jeffrey Chester, executive director of the Center for Digital Democracy, notes that political campaigns frequently use data from brokers to target their advertisements, so “There’s no political pressure on Congress, really, to act. The data-broker lobby is incredibly powerful.”

Politicians may have good reason to worry about unveiling their relationships with data brokers, as a 2012 survey found that the majority of Americans are deeply uncomfortable with the idea of political ads tailored to them using their personal information, which is more or less the standard practice.

Unfortunately, the resultant lack of industry transparency poses not just a criminal threat to individuals, but several national security threats. Not only does it create several espionage risks, but it could also facilitate foreign manipulation of U.S. public opinion.

But we’ll get into that next week.

FCC Adopts Broadband Privacy and Data Security Rules

Today, the Federal Communications Commission adopted new rules that apply to Internet Service Providers, but not “edge” providers such as Twitter or Facebook. (The Federal Trade Commission has jurisdiction over edge providers.) The 3-2 vote divided along party lines. The rules seek to protect consumer privacy and security. Here are some highlights.


Safer Harbor?

This week heralded some very big news: the EU and US have apparently finalized a replacement for the Safe Harbor: the EU-US Privacy Shield. And just in time, as the moratorium on enforcement was slated to end, ahem, 3 days ago. Regardless, a new deal is good for stability, even if all of my pithy commentary had to be rewritten. Although you can’t blame me for planning ahead, as just a week ago the reported “sticking points” in the negotiation were the same two issues I highlighted from the Schrems decision invalidating the Safe Harbor back in October. The pieces have certainly come together pretty quickly, and while many EU privacy advocates see this as an EU capitulation, it looks poised to be the framework moving forward.

I’ll start with a quick overview of how I’m approaching this issue generally. My primary concern is whether the new Privacy Shield will withstand future legal challenges before the European Court of Justice, which invalidated the original Safe Harbor. In my analysis of Schrems, I highlighted two substantive issues (national security and standing) and one technical issue (the Safe Harbor didn’t expressly recognize US privacy law as adequate) that were the foundation for invalidating the Safe Harbor. The new agreement must resolve each of these to stand a realistic chance in the (likely) event that the Privacy Shield is challenged. The Privacy Shield solves the technical problem easily by fixing the technicality: it will expressly recognize the United States’ “adequacy.” As for the two substantive issues, I’ll be devoting the remainder of this post to their resolution.

Aggregation Episode 3: Revenge of Mosaic Theory

A long time ago, in a courtroom far, far away . . . The DC Circuit was locked in an epic struggle between privacy and advancing technology, where GPS tracking looked to spell the doom of civil liberties ’neath the mighty tread of . . . alright, I’ll stop the Star Wars roleplaying. The Mosaic Theory is back in the news again, with the Fourth Circuit’s decision in United States v. Graham resurrecting that presumed-dead interpretation of the Fourth Amendment, and I want to talk about it. The ruling involves cellphone cell-site data (the stuff they talked about in Serial), but really implicates location data generally, and has commentators wondering if the Supreme Court will take a second look at how it treats location data under the Fourth Amendment.

SCOTUS and Privacy: Spokeo v. Robins

The Supreme Court has been on a roll this past week. Obergefell v. Hodges found that same-sex marriage is a fundamental right and therefore legal in all 50 states; King v. Burwell upheld Obamacare (I would encourage you to also read Justice Scalia’s dissent); Kimble v. Marvel gave us some wonderful Supreme Court Spider-Man puns; and Los Angeles v. Patel resolved a case with potentially major privacy ramifications in a manner that was decidedly uneventful. As such, I thought this would be a good time to talk about an upcoming Supreme Court case involving consumer privacy: Spokeo v. Robins. Spokeo is a corporate challenge to class action lawsuits that are based purely on statutory violations: basically, it asks if consumers can sue a company for shady privacy practices even if they cannot show that anything bad happened. This case is potentially a big deal, particularly for big business, and so it’s worth a closer look.