Is Web Scraping Legal? A Practical Guide for Developers in 2026
Web scraping exists in a legal gray area. Here's what developers and data teams actually need to know — key court cases, privacy regulations, and how to scrape responsibly.
The Short Answer
Web scraping of publicly available data is generally legal in the United States, thanks to several landmark court decisions. However, scraping can become illegal depending on what data you collect, how you collect it, and what you do with it.
The legality varies significantly by jurisdiction, and the law is still evolving. This guide covers the current landscape as of 2026, but it is not legal advice — consult a lawyer for your specific situation.
Key Court Cases That Define Web Scraping Law
hiQ Labs v. LinkedIn (2022)
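The most significant case for web scraping law. hiQ Labs scraped publicly available LinkedIn profiles to build workforce analytics products. LinkedIn sent a cease-and-desist letter and blocked hiQ's access. hiQ sued and won a preliminary injunction, which the Ninth Circuit upheld in 2019. In 2021 the Supreme Court vacated that decision and sent the case back for reconsideration in light of Van Buren; the Ninth Circuit reaffirmed its ruling in 2022.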
Key ruling: The Ninth Circuit ruled that scraping publicly available data likely does not violate the Computer Fraud and Abuse Act (CFAA). The CFAA requires accessing a computer "without authorization," and the court held that publicly available data — accessible to anyone with a web browser — doesn't meet that threshold.
What it means for you: If data is publicly visible without logging in, scraping it is unlikely to violate the CFAA. However, this ruling applies specifically to the CFAA; other legal theories (trespass to chattels, breach of contract) may still apply. On remand, the district court found in late 2022 that hiQ had breached LinkedIn's User Agreement, and the parties settled.
Van Buren v. United States (2021)
A Supreme Court case that narrowed the scope of the CFAA. The Court ruled that the CFAA's "exceeds authorized access" provision applies only to those who access data they are not entitled to access at all — not to those who access permitted data for unauthorized purposes.
What it means for you: This ruling strengthens the argument that accessing publicly available web pages — even for purposes a website owner wouldn't approve of — doesn't violate the CFAA.
Clearview AI Lawsuits (2020–present)
Clearview AI scraped billions of photos from social media platforms to build a facial recognition database. Multiple lawsuits and regulatory actions followed, with fines under GDPR in Europe and settlements under BIPA (Biometric Information Privacy Act) in Illinois.
What it means for you: Scraping personal data — especially biometric data — carries significant legal risk under privacy laws, even if the data is technically "public." The public availability of data doesn't exempt you from privacy regulations.
Meta v. Bright Data (2024)
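Meta (Facebook) sued Bright Data for scraping Facebook and Instagram data. In January 2024, a federal court in California granted summary judgment to Bright Data on Meta's breach-of-contract claim, holding that Meta's terms did not bar scraping public data while logged out. The court indicated that the terms do apply to scraping carried out while logged in to an account, so collecting data behind a login wall could still constitute a breach of contract.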
What it means for you: The distinction between public and authenticated data matters. Scraping data that requires logging in is legally riskier than scraping publicly accessible pages.
Legal Frameworks That Affect Web Scraping
Computer Fraud and Abuse Act (CFAA) — United States
The primary federal law governing unauthorized computer access. After Van Buren and hiQ, scraping publicly available data is unlikely to violate the CFAA. However, bypassing technical access controls (CAPTCHAs, IP blocks, authentication walls) could change this analysis.
GDPR — European Union
The General Data Protection Regulation applies when you scrape personal data of EU residents, regardless of where you're located. Key requirements:
- You need a lawful basis for processing personal data (consent, legitimate interest, contract, etc.)
- "Legitimate interest" is the most common basis for scraping, but requires a balancing test against the data subject's rights
- Data subjects have rights to access, deletion, and objection — you must be able to honor these
- Fines can reach 4% of global annual revenue or 20 million euros, whichever is greater
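As a quick illustration of the "whichever is greater" rule in the last bullet, here is a minimal sketch of the arithmetic. The function name and the revenue figures are hypothetical, and the result is only the statutory ceiling, not a prediction of any actual fine.

```python
def gdpr_max_fine_eur(global_annual_revenue_eur: float) -> float:
    """Ceiling described above: the greater of 4% of revenue or EUR 20 million."""
    return max(0.04 * global_annual_revenue_eur, 20_000_000.0)


# Hypothetical revenue figures, for illustration only.
for revenue in (100_000_000, 2_000_000_000):
    ceiling = gdpr_max_fine_eur(revenue)
    print(f"revenue EUR {revenue:,} -> ceiling EUR {ceiling:,.0f}")
```

For smaller companies the flat EUR 20 million figure dominates; the 4% term only takes over once revenue exceeds EUR 500 million.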
CCPA / CPRA — California
The California Consumer Privacy Act (and its amendment, CPRA) gives California residents rights over their personal data. Unlike GDPR, CCPA has an exemption for "publicly available information": data from government records, or information that a consumer has made available to the general public. This partially protects scraping of genuinely public data, but the exemption is narrow.
Copyright Law
Scraping copyrighted content (articles, images, creative works) raises copyright issues. Facts and data generally aren't copyrightable, but the creative expression of those facts is. Scraping prices and product specs is generally safe. Scraping and republishing full article text is likely infringement.
Terms of Service
Many websites prohibit scraping in their terms of service. Whether ToS violations create enforceable legal claims is debated. Courts have been inconsistent — some treat ToS violations as breach of contract, others have dismissed them as unenforceable "browsewrap" agreements that users never actually agreed to.
Best practice: Always review the target site's ToS. While enforcement varies, violating ToS increases your legal risk.
The robots.txt Question
robots.txt is a file that websites use to tell crawlers which pages they should or shouldn't access. It's a convention, not a legal requirement — but ignoring it carries risk.
- Not legally binding: robots.txt is a voluntary standard. Ignoring it doesn't automatically create legal liability.
- But it's evidence: In litigation, a plaintiff can point to robots.txt as evidence that you were aware they didn't want their site scraped. Ignoring it looks bad in court.
- Best practice: Respect robots.txt directives. If a site explicitly blocks your scraper, consider whether the data is available from alternative sources.
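One way to act on the best practice above is to check each URL against the site's robots.txt before requesting it. The sketch below uses Python's standard-library `urllib.robotparser`; the robots.txt content, the URLs, and the `example-research-bot` user agent are made-up placeholders, and a real crawler would fetch the file from the target site instead of parsing an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

USER_AGENT = "example-research-bot"  # hypothetical, honestly identified user agent


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a reusable rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(is_allowed(parser, "https://example.com/products/1"))      # not under /private/
print(is_allowed(parser, "https://example.com/private/report"))  # under /private/
print(parser.crawl_delay(USER_AGENT))                            # requested delay, if any
```

Skipping any URL for which `is_allowed` returns False, and pausing for the `Crawl-delay` value when one is set, keeps the crawler within the directives the site has published.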
What Makes Web Scraping Illegal?
While scraping public data is generally legal, several factors can push scraping into illegal territory:
1. Bypassing Access Controls
Circumventing login walls, CAPTCHAs, IP blocks, or other technical barriers to access data may violate the CFAA, and the risk is highest for authentication barriers such as logins. If a website actively tries to prevent your access and you bypass those measures, you're in dangerous territory.
2. Scraping Personal Data Without a Legal Basis
Collecting names, email addresses, phone numbers, or other personal data without a valid legal basis under GDPR, CCPA, or similar laws can result in significant fines and lawsuits.
3. Causing Server Damage
Aggressive scraping that overwhelms a website's servers can be treated as a denial-of-service attack. This can trigger liability under the CFAA and state computer crime laws.
4. Using Scraped Data for Harmful Purposes
Even if the scraping itself is legal, using the data for spam, discrimination, stalking, or other harmful purposes creates separate legal issues.
5. Republishing Copyrighted Content
Scraping and republishing copyrighted articles, images, or creative works without permission is copyright infringement, regardless of how you obtained the content.
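On the server-load point (item 3), a scraper can reduce the chance of overwhelming a site by backing off whenever the server signals distress. The following is a minimal sketch, assuming the caller passes in the HTTP status code and a plain dict of response headers; the function name and the default values are illustrative, and only the numeric-seconds form of `Retry-After` is handled.

```python
def backoff_seconds(status, headers, attempt, base=1.0, cap=60.0):
    """Return how long to wait before retrying, or None if no backoff is needed.

    status  -- HTTP status code of the last response
    headers -- dict of response headers (assumed to use the exact key "Retry-After")
    attempt -- zero-based count of retries already made
    """
    if status not in (429, 503):  # 429 Too Many Requests, 503 Service Unavailable
        return None
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        # The server asked for a specific pause: honor it in full.
        return float(retry_after)
    # Otherwise fall back to capped exponential backoff: base, 2*base, 4*base, ...
    return min(cap, base * 2 ** attempt)


print(backoff_seconds(200, {}, 0))
print(backoff_seconds(429, {"Retry-After": "30"}, 0))
print(backoff_seconds(503, {}, 3))
```

The cap applies only to the fallback schedule; a server-supplied `Retry-After` value is never shortened, since waiting less than the site requested defeats the purpose.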
Best Practices for Legal Web Scraping
Follow these guidelines to minimize legal risk:
1. Only scrape publicly available data
Stick to data visible to any anonymous visitor. Avoid scraping behind login walls or paywalls.
2. Respect robots.txt
Honor crawl directives. If a site blocks your user agent, don't disguise it.
3. Rate-limit your requests
Don't overwhelm target servers. Add delays between requests and limit concurrent connections.
4. Don't bypass access controls
Never circumvent CAPTCHAs, IP blocks, or authentication systems designed to prevent automated access.
5. Be careful with personal data
If you collect personal information, ensure compliance with GDPR, CCPA, and other applicable privacy laws. Document your legal basis for processing.
6. Extract facts, not creative expression
Scraping product prices, business hours, and stock levels is generally safe. Scraping and republishing full articles or images is not.
7. Review terms of service
Check the target site's ToS for anti-scraping provisions. While enforceability varies, awareness reduces risk.
8. Use an API when available
If the target site offers an official API, use it. An API is an explicitly authorized access channel, so staying within its terms of use and rate limits removes most of the legal ambiguity.
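The rate-limiting guideline (item 3) can be sketched as a small helper that enforces a minimum gap between consecutive requests. This is an illustrative sketch, not a complete crawler: the class name, the 0.2-second interval, and the URLs are assumptions, and the actual fetch is left out so the example runs without network access.

```python
import time


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds pass between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._last = None     # time of the previous request, None before the first

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay


limiter = MinIntervalLimiter(min_interval=0.2)  # hypothetical interval
for url in ["https://example.com/a", "https://example.com/b"]:
    slept = limiter.wait()
    print(f"would fetch {url} after sleeping {slept:.2f}s")
```

Using one limiter per target host, and keeping concurrency low, spreads load so that no single site sees a burst of requests.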
Web Scraping Legality by Country
| Region | Key Law | Status |
|---|---|---|
| United States | CFAA | Public data scraping generally legal after hiQ ruling |
| European Union | GDPR | Legal with valid basis; strict on personal data |
| United Kingdom | UK GDPR + DPA 2018 | Similar to EU; research exemptions exist |
| Australia | Privacy Act 1988 | No specific anti-scraping law; privacy rules apply |
| China | PIPL + CSL | Strict; personal data heavily regulated |
| Japan | APPI | Generally permissive for non-personal data |
How API Everything Helps You Scrape Responsibly
Using a managed extraction API like API Everything helps with legal compliance in several ways:
- Built-in rate limiting: Our infrastructure automatically manages request rates to avoid overwhelming target servers.
- robots.txt compliance: Configurable respect for robots.txt directives.
- No access control bypassing: We focus on extracting publicly available data — no login credential harvesting or CAPTCHA solving.
- Structured data output: Our AI extracts facts (prices, ratings, specifications) not copyrighted creative content, reducing copyright risk.
- Data minimization: You specify exactly what fields you need — no bulk collection of unnecessary data.
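Independent of any particular tool, data minimization can be approximated with a field allowlist applied to each record before it is stored. The sketch below is generic Python and does not represent API Everything's interface; the field names and the sample record are hypothetical.

```python
# Hypothetical allowlist: only the fields the project actually needs.
ALLOWED_FIELDS = frozenset({"name", "price", "rating"})


def minimize(record, allowed=ALLOWED_FIELDS):
    """Keep only explicitly requested fields and drop everything else."""
    return {key: value for key, value in record.items() if key in allowed}


raw = {
    "name": "Widget",
    "price": 19.99,
    "rating": 4.5,
    "seller_email": "seller@example.com",  # personal data that is not needed
}
print(minimize(raw))
```

Filtering at ingestion time means personal fields such as the seller's email address never reach storage, which is simpler than deleting them later.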
Frequently Asked Questions
Can I scrape Amazon, Google, or Facebook?
Scraping publicly visible data from these sites is likely legal under current US case law. However, all three have Terms of Service prohibiting scraping, and all have the resources to pursue legal action. Use official APIs when available (for example, Google's Custom Search JSON API or Amazon's Product Advertising API) and limit scraping to genuinely public data.
Is scraping prices and product data legal?
Generally yes. Prices and product specifications are facts, not copyrightable expression. Competitive price monitoring is a well-established and legal business practice. This is one of the most common use cases for web scraping.
Can I scrape email addresses?
Technically possible, but legally risky. Email addresses are personal data under GDPR and CCPA. Collecting and using them for unsolicited marketing may violate CAN-SPAM, GDPR, and other laws. Proceed with extreme caution and legal counsel.
What about scraping for AI training?
This is the hottest legal question of 2026. Multiple lawsuits (NYT v. OpenAI, Getty v. Stability AI) are testing whether scraping copyrighted content for AI training constitutes fair use. The law is unsettled — watch for major rulings.
Bottom Line
Web scraping is legal in most cases when done responsibly: targeting publicly available data, respecting rate limits and robots.txt, and being careful with personal information. The legal landscape continues to evolve, with courts increasingly distinguishing between accessing public data (generally lawful) and bypassing access controls (much riskier).
For developers and data teams, the safest path is using a managed extraction API that handles compliance concerns for you. Get started with API Everything to extract data responsibly — with built-in rate limiting, public data focus, and structured output that minimizes legal risk.