Data or web scraping has become standard practice for many companies operating in the digital landscape. They might use web scraping tools for competitive price monitoring, to help them fetch product descriptions and images, or to aggregate news article data.
There are dozens of applications of web scraping across all sorts of fields and industries. But the fact that a lot of people do it doesn’t make it legal.
What do you really know about the legality of scraping the web? Below, you’ll find arguments for both sides of the courtroom, plus a few landmark legal cases that contested its legality.
So, just how legal is web scraping? Well, to use a common SEO catchphrase: It depends…
Please note: The information in this article does not provide legal advice, and we are not legal professionals. Before doing any web scraping, we advise you to consult with a legal team or lawyer first.
How does web scraping work?
Web scraping, data scraping, or web harvesting is the process of extracting data from web pages. This includes any web page, from search engine results pages (SERPs) to Amazon product pages to the front page of the New York Times.
After extraction, the acquired data is usually reformatted and then stored in a database or copied into a program like Excel or Google Sheets. Although you can technically scrape the web manually, the term web scraping typically refers to the automated process, which uses a web crawler.
The web crawler (e.g. SERPMaster) systematically browses the internet for you, retrieving the requested data from web pages. This bot can extract data from thousands of websites in no time, giving you a wealth of competitor and industry knowledge right at your fingertips.
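To make the extract-then-store flow described above concrete, here is a minimal sketch using only Python’s standard library. The sample markup, the `name` and `price` class names, and the helper names `ProductParser`, `extract_products`, and `to_csv` are all assumptions made up for illustration. A real scraper would download the page over HTTP and would have to cope with far messier HTML.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page markup; a real scraper would download this over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Kettle</span><span class="price">24.99</span></li>
  <li class="product"><span class="name">Toaster</span><span class="price">39.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished rows, e.g. {"name": ..., "price": ...}
        self._field = None   # the field whose text we are currently inside
        self._current = {}   # the row being built for the current <li>

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rows.append(self._current)
            self._current = {}


def extract_products(html):
    """Extraction step: turn raw HTML into a list of row dictionaries."""
    parser = ProductParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Storage step: reformat the rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_products(SAMPLE_HTML)
print(to_csv(rows))
```

The two functions mirror the two stages in the text: `extract_products` pulls the data out of the page, and `to_csv` reformats it for storage.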
The (potential) problem with web scraping
Web scraping as a practice is not illegal. However, in some instances, it can be. And whether it is or isn’t often has to do with how the data is handled.
Let’s illustrate the problem around web scraping legality with two short stories.
Say you were asked to perform competitor research for your company to see how your products’ prices stack up.
Instead of doing this manually, you decide to use an automated web scraper tool. You let the bot scrape data from your main competitors’ websites, extracting pricing data and saving it in a database. The tool analyzes all the data and automatically updates your company’s prices accordingly.
Now, your competitors’ websites are publicly accessible on the World Wide Web. Your bot crawling them and extracting the data isn’t illegal. Neither is the tool using this data to update your company’s prices. So is this type of web scraping legal? You’d probably agree it is.
Now let’s look at the second story.
Like most people, you have many different social media accounts: Facebook, Twitter, Instagram. It’s annoying to have to check all of them separately. If only there were an aggregator to combine it all for you in one handy site.
So you go ahead and build a site like that. You use automated web scrapers to scrape content from Facebook, Twitter, and Instagram and have it displayed on your site.
Congratulations, you may well have just broken U.S. law, as the courts found in Facebook, Inc. v. Power Ventures, Inc.!
You see, in late 2008, Facebook filed a civil lawsuit against Power Ventures, which ran exactly this kind of aggregator. Facebook claimed, among other things, copyright infringement and a violation of the Computer Fraud and Abuse Act (CFAA): Power Ventures had scraped Facebook’s website, against its terms and conditions, to extract information and repurpose it somewhere else. The court ruled in favour of Facebook.
So what do these two stories tell us about the legality of web scraping?
Well, that web scraping can be done in both legal and illegal ways. And that’s the whole problem.
When web scraping is (potentially) illegal
What exactly counts as illegal and what doesn’t is often a legal grey zone. And to make things worse, different countries have different regulations. So how do you know when web scraping is allowed and when it isn’t?
For example, the EU’s General Data Protection Regulation (GDPR) states that you are not allowed to process personal data, often called personally identifiable information (PII), without a lawful basis, such as the person’s consent. Such data includes (among others):
- Email address
- Phone number
- Date of birth
- Social security number
Scraping this data without consent will often be labelled illegal.
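One practical way to limit exposure is to strip such data from scraped text before storing it. The sketch below is illustrative only: the two regular expressions, the placeholder strings, and the `redact_pii` helper are assumptions rather than any standard, and they will miss many real-world formats. Treat it as a starting point, not as a way to achieve GDPR compliance.

```python
import re

# Deliberately simple patterns, assumed for illustration only; real-world
# email addresses and phone numbers come in many more formats than these match.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text):
    """Replace anything that looks like an email address or phone number."""
    text = EMAIL_RE.sub("[email removed]", text)
    text = PHONE_RE.sub("[phone removed]", text)
    return text


scraped = "Contact Jane at jane.doe@example.com or +1 555 010 9999 for a quote."
print(redact_pii(scraped))
```

Note that the name in the sample sentence survives redaction, which shows how limited pattern matching is: names, dates of birth, and ID numbers need their own handling.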
This is just one of many examples where web scraping can infringe someone’s rights and therefore be deemed illegal.
In the United States, the three most common legal claims a website owner can make to prevent web scraping are:
- Violation of the Computer Fraud and Abuse Act, or CFAA (unauthorized access to a computer system)
- Copyright infringement (unauthorized use of someone else’s copyrighted material)
- Trespass to chattels (unauthorized interference with someone else’s personal property)
However, to make things more complicated, there have been court rulings going both ways based on each of these claims. Let’s have a look at two famous examples going different ways.
eBay v. Bidder’s Edge
One of the best-known cases (aside from the Facebook case mentioned above) is eBay v. Bidder’s Edge.
Bidder’s Edge used a web scraper to access, collect, and index auctions from eBay. In 2000, a U.S. federal court granted eBay a preliminary injunction, reasoning that eBay’s computer system was personal property and that Bidder’s Edge, as the user of the scraper, had interfered with it without authorization. As such, it was treated as a likely case of trespass to chattels, which is a civil wrong (a tort) in the U.S.
hiQ Labs v. LinkedIn Corp
In hiQ Labs v. LinkedIn Corp, the dispute centred on hiQ Labs, a data analytics company, using a web scraper to access information from public profiles on LinkedIn. After the professional networking site sent hiQ Labs a cease-and-desist letter, hiQ Labs took LinkedIn to court to keep its access.
Although LinkedIn relied largely on the same grounds as eBay and Facebook, including trespass to chattels and violation of the CFAA, the court ruled in favour of hiQ Labs.
So, is web scraping legal?
Honestly, there is no simple yes-or-no answer here.
The practice of web scraping, that is, visiting web pages and extracting data from them, is not illegal in its own right. The public web is open to everyone, so visiting a website and using the data presented there for your site (or for analysis) will often not be a crime.
That said, many large corporations have tried to prevent web scrapers from accessing their data in the past. Sometimes they won, sometimes they didn’t. So before you start web scraping, we would always advise you to talk to a lawyer or law firm first.