Data Localization, Web Scrapers, and Even More EU Privacy Problems

I want to return again to the realm of territoriality and data sovereignty. The privacy world is still reeling from the Safe Harbor invalidation, and although the US and EU have agreed “in principle” to a new data sharing pact (presumably to provide a semblance of stability), the future of data governance is anything but clear. While I won’t devote another post entirely to the Safe Harbor, one specific issue, referred to as the Lindqvist loophole, caught my eye, and should serve as a good jumping-off point to talk about broader issues the global Internet is poised to face. With an apparent trend towards data localization (storing personal data in the State where the data subject is located), the conflict between an open Internet and domestic legislation is increasingly coming to a head.

“Keep your friends close, and their data closer”

I’ll start with data localization. This is the seemingly simple concept that data about a State’s citizens must be stored within that State. Data localization is in many ways a reaction to the un-territoriality of data: it uses domestic laws to force data to be held within a State’s territorial borders. International jurisdiction can be tricky, particularly with regard to the Internet, so requiring commercial activities (in this case, data storage) to be done within your domestic territory provides a seemingly simple solution. And while localization raises tricky questions about whose data is regulated and what governs the data a State may localize, it nonetheless preserves a worldview of territorial sovereignty. Or at least attempts to.

Yet looking at the data localization laws that currently exist or have been considered, there is little consistency. Localization may be required for all data, all the time (Russia); all data, some of the time (EU); certain types of data, all of the time (China); or certain types of data, some of the time (Australia). Some provide exceptions if the user consents (Malaysia), or if the transfer is “necessary” for business operations (India), but almost all of them are difficult to interpret. And data localization isn’t necessarily exclusive: many States just want to ensure that they have a copy, as is the case in Russia, whereas others don’t require data localization per se, but instead limit where else data can be stored, as with the EU. Nevertheless, all of these laws are fundamentally at odds with the global Internet, which thrives on the uninhibited flow of information across State borders.

The Lindqvist Loophole

Which brings me to the Lindqvist loophole. The Lindqvist loophole is a common-sense interpretation of the EU’s data localization law, found in the Data Protection Directive (DPD), that provides an exception for when data isn’t “sent” abroad, but is instead “accessed” from abroad. (Unlike other data localization laws, the DPD regulates where you cannot store data, but the effect is largely the same.) For example: if an EU citizen posts a photo of their friend on a personal website, and someone from NoPrivacyLand downloads it, has the poster breached EU privacy laws? Is viewing the website a “transfer” that triggers the DPD? The ECJ said no: the poster didn’t transfer the data; it was merely accessed. The fact that websites (which often contain personal data) can be accessed from anywhere in the world is insufficient to invoke the DPD’s wrath. Any alternative would potentially break the global Internet, as only approved countries would be allowed to view EU websites.

The Lindqvist loophole, despite seeming like a necessary corollary to an open Internet, appears to be viewed increasingly unfavorably in the EU. The principle has always been problematic, as an enterprising “accessor” could abuse this position to circumvent EU laws, but the alternatives are equally unappealing: one would effectively isolate the EU Internet from outside viewers, while the other would extend the EU’s regulations to reach anyone who views EU websites. And while the Lindqvist loophole is still technically good law, it represents a broader jurisdictional problem with data localization laws, and indeed data protection regulations in general. Without a Lindqvist loophole, could downloading photos from a foreign Facebook page violate that foreign State’s data localization law? What would it mean for that foreign State to have jurisdiction over me? Would the scale of my activity impact the discussion?

These are complicated questions, so I’ll break the remainder of my discussion into two sections: the first on jurisdiction, and the second on web scrapers.

Data Jurisdiction

Broadly speaking, jurisdiction refers to the right of a government to impose its laws upon you. While there are several bases for asserting jurisdiction over an individual, I’ll be focusing on two: territoriality and effects. Territoriality is the older of the two, and is based upon the territorial boundaries of the State making the laws: inside the United States? US law applies. Outside the United States? US law does not apply. Easy. Effects jurisdiction, by contrast, reaches activity having effects within a State’s territory, regardless of where the activity occurred. A straightforward example would be a person in Canada firing a gun across the US border: the activity (firing the gun) occurred extraterritorially, but the effect (a person being shot) impacted US interests, so the US can assert a claim to regulate that activity. (This is sometimes cast as “passive personality.”) Given the potentially long arms of effects jurisdiction, it is typically interpreted narrowly. Yet even with an effects justification, a State can only enforce that jurisdiction on subjects within its territory, or with the agreement of the State where the subject is currently located. Territoriality, by and large, is still king.

It is this territoriality/effects tension that motivates data localization laws, because data and the entities that control data are less bounded by geography, making it easier for multiple States to assert a territorial justification for regulating them. To overcome this, a State wanting to ensure regulatory control over the personal data of its citizens needs that data to be stored domestically. Through their territorial jurisdiction over the data controllers, which typically have branches everywhere they operate, States are able to bring what would otherwise be extraterritorial activity within their control, unilaterally. While data localization is disfavored in the US (most recently seen in the Trans-Pacific Partnership), it is a natural outgrowth of a worldview in which data located extraterritorially is beyond your regulatory reach, and is increasingly popular in States where data protection is a high priority.

Yet the jurisdictional basis for data localization raises some thorny questions: if Russia can require data on Russian citizens to be stored domestically, can it require the same for data on foreign citizens? Or perhaps the better question is, what is to stop it from doing so? Is there some limiting principle on which data can be localized? Other jurisdictional principles, like nationality, certainly support the notion that States have an interest in the well-being of their citizens, but they do not place limits on the State’s power to regulate domestic corporations. And while some international agreements, particularly trade agreements, may come into play, these have as yet failed to limit data localization measures, notwithstanding their substantial financial impact. And that’s without considering the potential fallout when a data processor is subject to conflicting laws. (Consider, for instance, data on dual citizens.)

I’ve raised a similar hypothetical in discussing the Microsoft Ireland case – if the US cannot compel Microsoft to disclose extraterritorially stored data, could it instead require that a copy of all data be stored domestically? The territorial sovereignty argument that Microsoft relies on would be largely moot, as this would regulate the data before it is stored anywhere. While this may seem like an end run around jurisdiction, it is probably better characterized as a diplomatic struggle over data governance, fought through the realm of jurisdiction. While the US uses territoriality primarily to assert a right to access information, others, like the EU, use territoriality to assert a right to regulate information, as seen with the right to be forgotten. Without the international equivalent of a sovereign to resolve these disputes, data controllers have to get creative.

One solution to this jurisdictional dilemma is to compartmentalize. Rather than attempt to comply with the laws of each State, which are often conflicting and financially unappealing, a corporation could create a separate legal entity for each State it operates in. This avoids the potential for conflicting laws, because no individual entity is subject to more than one State’s jurisdiction via territoriality. This also simplifies who has access to the data, as each branch should have access only to the data held in its own State. For example, Amazon Web Services compartmentalizes itself into separate entities so that it can localize into States with more restrictive laws, like China. In a similar vein, Microsoft is increasingly using partnerships with local data controllers in an attempt to remove itself from direct legal accountability, as seen in Germany. And if this compartmentalization strategy ultimately fails, corporations can always choose to not do business in that State. Because without territoriality, a State can’t regulate you, right?

Except the EU is rethinking that proposition. The General Data Protection Regulation (GDPR), slated to be finalized this year, greatly expands the scope of the EU’s privacy laws, regulating anyone that handles EU citizens’ personal data, regardless of physical location. This is a substantial expansion, and a marked departure from the EU’s prior reliance on territoriality and transfer restrictions. (While this theoretically makes localization unnecessary, the GDPR still has transfer restrictions.) Despite relying on several traditional notions of jurisdiction, the new rules nonetheless raise practical questions about implementation, as territoriality is still the default for enforcing regulations. And without any express mention of a Lindqvist loophole (itself a judicial creation), it is unclear just how far-reaching these regulations might be.


Web Scrapers

Take web scrapers as an example. A web scraper is a program that collects information online and repurposes that information, usually for some ulterior business motive. So, for example, a savvy coder could use a web scraper to collect all of the prices from a competitor’s website, and guarantee to undercut them. (A popular example is airline prices, which are notoriously arcane.) Other sites use scrapers to create web portals, bringing lots of data from several sites to one place, and others simply attempt to duplicate information to generate (some would say steal) ad revenue. While many scrapers offer clear commercial benefits, others are much more nefarious, and all of them occupy a legal grey area. But the most important point about web scrapers is that they can operate from effectively anywhere with an Internet connection.
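
For readers who have never seen one, here is a minimal sketch of the price-undercutting scraper described above, written in Python using only the standard library. Everything specific in it is an assumption made for illustration: the sample page, the "name" and "price" class names, and the one-cent margin are all hypothetical, and a real scraper would fetch live pages over the network rather than parse a hard-coded string.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a competitor's product page.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Widget</span> <span class="price">$19.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">$5.00</span></li>
</ul>
"""


class PriceScraper(HTMLParser):
    """Collects name -> price pairs from spans with class "name" and "price"."""

    def __init__(self):
        super().__init__()
        self._field = None  # which labelled span we are currently inside, if any
        self._name = None   # most recent product name seen
        self.prices = {}

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        if self._field == "name":
            self._name = data.strip()
        elif self._field == "price" and self._name is not None:
            # Strip the currency symbol and store the price as a number.
            self.prices[self._name] = float(data.strip().lstrip("$"))
            self._name = None


def scrape_prices(html):
    """Parse a page and return the prices it lists."""
    parser = PriceScraper()
    parser.feed(html)
    return parser.prices


def undercut(prices, margin=0.01):
    """Return each scraped price reduced by a fixed margin."""
    return {name: round(price - margin, 2) for name, price in prices.items()}


if __name__ == "__main__":
    print(undercut(scrape_prices(SAMPLE_PAGE)))
```

Nothing in the sketch depends on where it runs, which is the point that matters for jurisdiction: the same few dozen lines behave identically whether they are executed inside the State whose data is being collected or on the other side of the world.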

Under the GDPR, web scrapers would potentially be subject to EU law. The language I’ve seen thrown around for the GDPR’s scope is that it applies to activities “related to the offering of goods or services to EU individuals, or to the monitoring of their behaviour.” While the latter seems targeted at US surveillance, the language could apply to a range of activities. So, for example, if I decided to start a US website that collected all the data it could about the British royal family (tweets, news articles, photos, etc.) and put it all in one place, I would presumably be regulated by the GDPR. And even if there were some form of public figure exception, the same could be said for any website that aggregated, say, the bizarre social media posts of Russians. (I could provide a direct link, but I thought better of it.) And supposing one of those Russians decided that a photo should be removed à la right to be forgotten, would I have to comply? What are the consequences of refusing? While Russia isn’t in the EU, hopefully you get my point.


So after all that lead-up, do I have anything actually useful to say? There aren’t any easy solutions, but the most likely is some form of international agreement. While bilateral treaties are the most common, I could also imagine the creation of a limited forum for disputes over Internet jurisdiction generally, where State actors and data controllers could bring suit over whether that data controller (or more specifically, the data) has sufficient contacts with the State to establish jurisdiction. This would almost assuredly have to proceed on narrow, bilateral grounds, but it could begin building a framework of international cooperation. This would also provide an elegant solution to things like web scrapers, as the scale of the activity could be considered in assessing whether it would be fundamentally fair for jurisdiction to apply. Downloading a single picture? Not enough to warrant jurisdiction (akin to the Lindqvist loophole). Downloading all of the pictures? That might justify regulation. This would also provide an easy solution to the Microsoft Ireland case, as it would establish an internationally recognized authority governing extraterritorial data access, based on the State’s interest in the data. And while host States would probably retain trump cards for compelling national interests (e.g. free speech; national security), this would provide a starting ground for international data governance.

There is plenty more to say on these topics, including several rabbit holes I cut for length (and coherence), but ultimately the goal is to preserve the openness that made the Internet great. Given the Internet’s international nature, this will almost assuredly mean compromise moving forward. At least until I can convince the entire world to adopt my views on privacy.

Until next time,

