The Most Popular Blog Post Ever on Web Scraping and the Law

In August 2023, I wrote a blog post that was unusual for me. Rather than cover a case or a particular legal issue, I wrote a long rant about hypocrisy in the legal world of web scraping.

It became perhaps the most popular thing I have ever written. To this day, it is one of the highest-traffic blog posts in the 20-year history of Professor Eric Goldman’s Technology & Marketing Blog. It was featured on Tyler Cowen’s Marginal Revolution. It shot straight to the top of Hacker News.

I suspect it’s the most popular piece of writing ever on web scraping and the law. If you’re here and you’re interested in the topic, it’s a good place to start.

Some of the biggest companies on earth—including Meta and Microsoft—take aggressive, litigious approaches to prohibiting web scraping on their own properties, while taking liberal approaches to scraping data on other companies’ properties. When we talk about web scraping, what we’re really talking about is data access. All the world’s knowledge is available for the taking on the internet, and web scraping is how companies acquire it at scale. But the question of who can access and use that data, and for what purposes, is a tricky legal question, which gets trickier the deeper you dig.

[…]

But make no mistake, these companies view this data, generated by their users on their platforms, as their property. This is true even though the law does not recognize that they have a property interest in it, and even though they expressly disclaim any property rights in that data in their terms of use. Since the law does not give them a cognizable property interest in this data, they must resort to other legal theories to prevent others from taking it and using it.

Here’s the link.

Hope you enjoy!