Legal Rules vs. Legal Norms for Web Scraping


Every time I drive my car, I follow the same algorithm for how fast I drive. If I’m driving on a highway or an open road, I drive between 6 and 9 miles per hour over the speed limit. If I’m driving in a neighborhood or near a school, I drive 3 to 4 miles per hour over the speed limit. I’m not sure how I came up with this system, but I’ve been driving like this for about as long as I’ve been driving, more than 25 years now.

Every time I drive, I violate the legally posted limits for how fast I’m supposed to drive. At this point, I’ve probably done this more than 10,000 times. And I’m sure I’ve done it in plain sight of hundreds if not thousands of law enforcement officials.

And yet, never once have I gotten a ticket. I have a perfectly clean driving record. I’ve never even been pulled over for speeding.

To me, this is a perfect illustration of the distinction between laws and norms. The speed at which law enforcement has a right to hold me accountable is posted on the speed limit sign. But we all know that the posted speed limits are somewhat conservative compared to how fast people actually drive. And so the majority of people drive slightly faster than the posted speed limits. Do this within reason, and law enforcement almost never enforces the technical limits posted on the sign. Go way over the limit, and you’re likely to get a ticket.

And if you happen to catch the attention of law enforcement for some other reason (a suspicious-looking car, swerving across the road, blasting death metal within earshot of the police), all of a sudden law enforcement might just enforce the letter of the law against you, simply because they perceive that you violated some other norm.

This analogy is helpful, I think, in understanding the legal landscape associated with web scraping in the United States. If you are familiar with web scraping and have made it this far into this article, you probably know that web scraping is ubiquitous and becoming more so by the day. You probably know that many small businesses and enterprise-level companies engage in web scraping as a matter of routine. And perhaps you know that there are dozens of laws that are implicated by web scraping, and that many web-scraping businesses have gotten entangled in web-scraping litigation and been run out of business.

Most casual observers of this topic feel inclined to describe web scraping as a “gray area of the law.” But that’s not really correct, either. Legal interpretations of the Computer Fraud and Abuse Act (CFAA) are a mess, admittedly. But the CFAA is not the most heavily litigated issue with web scraping anymore. Today, most sophisticated companies looking to enforce web-scraping claims do so under breach of contract, misappropriation, and intellectual property and quasi-intellectual property theories of law. CFAA claims are the primary focus in web-scraping litigation only in unique circumstances, such as when jurisdictional issues prevent the plaintiff from pursuing easier claims.

In most respects, web scraping is not that much more of a “gray area of the law” than the law related to speed limits. It’s just not as well understood.

And that’s because the norms associated with web-scraping enforcement are industry-specific and even company-specific.

Often, web scrapers operate within an ecosystem that drives additional traffic to the host websites that serve as the original source of the data. Sometimes, this creates a dynamic where host websites might technically prohibit web scraping in their terms of service but are inclined to tolerate or even cooperate with web scrapers if the scrapers play by the host’s rules. But run afoul of the norms associated with their website, and all of a sudden the technical prohibition on web scraping might apply to you.

But there are some companies that have a zero-tolerance policy toward web scraping. Access their data once, and it’s off with your head. And by that, I mean some companies will litigate against any business that prominently scrapes their data.

I often write about legal issues with web scraping, on this site and on others. It’s a fast-developing area of the law, and one that I find fascinating. But when I write about the laws that pertain to web scraping, that’s only telling half the story. The other half of the story, which businesses that scrape (and businesses that want to stop scraping of their sites) need to understand, is the legal norms associated with web scraping. And this is something that I think takes longer to understand than the laws themselves, even though the law of web scraping is fairly nuanced. But if one follows this area of the law closely, one can assess what a company is planning to do with web scraping and determine with a good degree of confidence whether it’s going to be problematic or not. The same goes in the other direction: one can assess whether it’s feasible for a company that wants to stop scraping to effectively do so.

And the answer to those questions may or may not be reflective of a pure legal analysis applied to a company’s web scraping tactics. It’s not always about the technical permutations of the laws. That’s relevant and important, of course, but it’s not the only consideration. It’s just as much about understanding the norms of how these things have been handled in the past. Get it right, and you can proceed with few concerns about legal entanglements. Get it wrong, and you’ll find yourself looking in the rear-view mirror with sirens and lights on your tail.