The Short Answer

Web scraping of publicly available data is generally legal in the United States, thanks to several landmark court decisions. However, scraping can become illegal depending on what data you collect, how you collect it, and what you do with it.

The legality varies significantly by jurisdiction, and the law is still evolving. This guide covers the current landscape as of 2026, but it is not legal advice — consult a lawyer for your specific situation.

Key Court Cases That Define Web Scraping Law

hiQ Labs v. LinkedIn (2022)

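The most significant case for web scraping law. hiQ Labs scraped publicly available LinkedIn profiles to build workforce analytics products. LinkedIn sent a cease-and-desist letter and blocked hiQ's access. hiQ sued and won a preliminary injunction, which the Ninth Circuit upheld in 2019. The Supreme Court vacated that decision in 2021 and sent it back for reconsideration in light of Van Buren; the Ninth Circuit reaffirmed its ruling in 2022.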

Key ruling: The Ninth Circuit ruled that scraping publicly available data likely does not violate the Computer Fraud and Abuse Act (CFAA). The CFAA requires accessing a computer "without authorization," and the court held that publicly available data — accessible to anyone with a web browser — doesn't meet that threshold.

What it means for you: If data is publicly visible without logging in, scraping it is unlikely to violate the CFAA. However, this ruling applies specifically to the CFAA. Other legal theories (trespass to chattels, breach of contract) may still apply; in later proceedings the district court found that hiQ had breached LinkedIn's User Agreement.

Van Buren v. United States (2021)

A Supreme Court case that narrowed the scope of the CFAA. The Court held that the CFAA's "exceeds authorized access" provision covers people who obtain information from parts of a computer system that are off-limits to them, not people who access permitted data for unauthorized purposes.

What it means for you: This ruling strengthens the argument that accessing publicly available web pages — even for purposes a website owner wouldn't approve of — doesn't violate the CFAA.

Clearview AI Lawsuits (2020–present)

Clearview AI scraped billions of photos from social media platforms to build a facial recognition database. Multiple lawsuits and regulatory actions followed, with fines under GDPR in Europe and settlements under BIPA (Biometric Information Privacy Act) in Illinois.

What it means for you: Scraping personal data — especially biometric data — carries significant legal risk under privacy laws, even if the data is technically "public." The public availability of data doesn't exempt you from privacy regulations.

Meta v. Bright Data (2024)

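Meta (Facebook) sued Bright Data for scraping Facebook and Instagram data. A federal court in California granted summary judgment to Bright Data on Meta's breach-of-contract claim, finding that Meta's terms did not bar scraping of public data while logged out. The court's reasoning turned on the fact that the terms govern activity conducted while logged in to an account, so scraping behind a login wall could still constitute a breach of contract.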

What it means for you: The distinction between public and authenticated data matters. Scraping data that requires logging in is legally riskier than scraping publicly accessible pages.

Legal Frameworks That Affect Web Scraping

Computer Fraud and Abuse Act (CFAA) — United States

The primary federal law governing unauthorized computer access. After Van Buren and hiQ, scraping publicly available data is unlikely to violate the CFAA. However, bypassing technical access controls (CAPTCHAs, IP blocks, authentication walls) could change this analysis.

GDPR — European Union

The General Data Protection Regulation applies when you scrape personal data of EU residents, regardless of where you're located. Key requirements:

You need a lawful basis for processing personal data (consent, legitimate interest, contract, etc.)
"Legitimate interest" is the most common basis for scraping, but requires a balancing test against the data subject's rights
Data subjects have rights to access, deletion, and objection — you must be able to honor these
Fines can reach 4% of global annual revenue or 20 million euros, whichever is greater
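
The fine ceiling in the last point is simply the larger of two quantities. A minimal sketch of that arithmetic, assuming revenue is expressed in euros; the function name and the sample revenue figures are illustrative and do not come from any official calculator:

```python
FLAT_CAP_EUR = 20_000_000   # fixed ceiling stated above
REVENUE_SHARE_PERCENT = 4   # percentage of global annual revenue stated above


def gdpr_max_fine_eur(global_annual_revenue_eur: float) -> float:
    """Return the upper bound on a fine: the greater of the two ceilings."""
    revenue_based_cap = global_annual_revenue_eur * REVENUE_SHARE_PERCENT / 100
    return max(FLAT_CAP_EUR, revenue_based_cap)


# Hypothetical companies: one small enough that the flat cap dominates,
# one large enough that the revenue-based cap dominates.
small_company_cap = gdpr_max_fine_eur(100_000_000)
large_company_cap = gdpr_max_fine_eur(1_000_000_000)
```

For the smaller company the 20 million euro figure applies, because 4% of its revenue is only 4 million euros. For the larger one the revenue-based figure of 40 million euros applies. These are ceilings, not predictions of an actual fine.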

CCPA / CPRA — California

The California Consumer Privacy Act (and its amendment, CPRA) gives California residents rights over their personal data. Unlike GDPR, CCPA has an exemption for "publicly available information": data from government records, or data that a consumer has made available to the general public. This partially protects scraping of genuinely public data, but the exemption is narrow.

Copyright Law

Scraping copyrighted content (articles, images, creative works) raises copyright issues. Facts and data generally aren't copyrightable, but the creative expression of those facts is. Scraping prices and product specs is generally safe. Scraping and republishing full article text is likely infringement.

Terms of Service

Many websites prohibit scraping in their terms of service. Whether ToS violations create enforceable legal claims is debated. Courts have been inconsistent — some treat ToS violations as breach of contract, others have dismissed them as unenforceable "browsewrap" agreements that users never actually agreed to.

Best practice: Always review the target site's ToS. While enforcement varies, violating ToS increases your legal risk.

The robots.txt Question

robots.txt is a file that websites use to tell crawlers which pages they should or shouldn't access. It's a convention, not a legal requirement — but ignoring it carries risk.

Not legally binding: robots.txt is a voluntary standard. Ignoring it doesn't automatically create legal liability.
But it's evidence: In litigation, a plaintiff can point to robots.txt as evidence that you were aware they didn't want their site scraped. Ignoring it looks bad in court.
Best practice: Respect robots.txt directives. If a site explicitly blocks your scraper, consider whether the data is available from alternative sources.
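
Checking robots.txt before fetching a page can be automated. The sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the bot name, and the example.com URLs are made up so that the example runs offline; a real crawler would load the live file with set_url() and read() instead.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example needs no network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

# Hypothetical bot name used when asking what the rules permit.
USER_AGENT = "example-research-bot"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer can_fetch() queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# True when the rules permit this user agent to fetch the URL.
public_allowed = parser.can_fetch(USER_AGENT, "https://example.com/products")
private_allowed = parser.can_fetch(USER_AGENT, "https://example.com/private/data")

# Seconds the site asks crawlers to wait between requests (None if unspecified).
requested_delay = parser.crawl_delay(USER_AGENT)
```

A scraper built this way would skip any URL for which can_fetch() returns False and treat the crawl delay as a lower bound on its request spacing. This demonstrates the convention only; it does not settle any legal question.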

What Makes Web Scraping Illegal?

While scraping public data is generally legal, several factors can push scraping into illegal territory:

1. Bypassing Access Controls

Circumventing login walls, CAPTCHAs, IP blocks, or other technical barriers to access data may violate the CFAA, and it removes the "publicly available" argument that protected scrapers in hiQ. If a website actively tries to prevent your access and you bypass those measures, you're in dangerous territory.

2. Scraping Personal Data Without a Legal Basis

Collecting names, email addresses, phone numbers, or other personal data without a valid legal basis under GDPR, CCPA, or similar laws can result in significant fines and lawsuits.

3. Causing Server Damage

Aggressive scraping that overwhelms a website's servers can be treated as a denial-of-service attack. This can trigger liability under the CFAA and state computer crime laws.

4. Using Scraped Data for Harmful Purposes

Even if the scraping itself is legal, using the data for spam, discrimination, stalking, or other harmful purposes creates separate legal issues.

5. Republishing Copyrighted Content

Scraping and republishing copyrighted articles, images, or creative works without permission is copyright infringement, regardless of how you obtained the content.

Best Practices for Legal Web Scraping

Follow these guidelines to minimize legal risk:

1. Only scrape publicly available data

Stick to data visible to any anonymous visitor. Avoid scraping behind login walls or paywalls.

2. Respect robots.txt

Honor crawl directives. If a site blocks your user agent, don't disguise it.
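
Not disguising your user agent means, in practice, sending a User-Agent header that says who you are. A minimal sketch with the standard library; the bot name and contact URL are placeholders to replace with your own:

```python
from urllib.request import Request

# Hypothetical bot identity: a project name, a version, and a page where
# site owners can learn about the crawler and how to contact its operator.
BOT_USER_AGENT = "example-research-bot/1.0 (+https://example.com/bot-info)"


def build_request(url: str) -> Request:
    """Create a request that openly identifies the scraper."""
    return Request(url, headers={"User-Agent": BOT_USER_AGENT})


request = build_request("https://example.com/products")

# urllib stores header names capitalized as "User-agent".
sent_user_agent = request.get_header("User-agent")
```

A stable, honest identifier also lets a site owner write a robots.txt rule for your bot specifically, instead of blocking it by other means.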

3. Rate-limit your requests

Don't overwhelm target servers. Add delays between requests and limit concurrent connections.
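
One simple way to add delays is to enforce a minimum interval between consecutive requests. The class below is an illustrative sketch, not a production rate limiter: the class name and the two-second interval are assumptions, and it covers request spacing only, not concurrency limits.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock    # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._last_request_at = None

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        slept = 0.0
        if self._last_request_at is not None:
            elapsed = self._clock() - self._last_request_at
            remaining = self.min_interval_s - elapsed
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last_request_at = self._clock()
        return slept


# Hypothetical policy: at most one request every two seconds.
limiter = MinIntervalLimiter(min_interval_s=2.0)
```

Calling limiter.wait() immediately before each request keeps the request rate at or below one per interval. If the site's robots.txt specifies a crawl delay, using that value as the interval is a reasonable starting point.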

4. Don't bypass access controls

Never circumvent CAPTCHAs, IP blocks, or authentication systems designed to prevent automated access.
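
In code, not bypassing access controls usually means treating a refusal from the server as a signal to stop or slow down, not as an obstacle to work around. The sketch below maps HTTP status codes to an action. Which codes count as a refusal and the 60-second default are assumptions made for illustration; they are not a legal standard.

```python
from typing import Optional, Tuple

STOP_STATUSES = {401, 403}      # authentication required / access forbidden
BACKOFF_STATUSES = {429, 503}   # too many requests / temporarily unavailable
DEFAULT_BACKOFF_S = 60.0        # assumed wait when the server gives no hint


def next_action(status_code: int, retry_after: Optional[str] = None) -> Tuple[str, float]:
    """Decide how to react to a response: 'stop', 'wait' (with seconds), or 'continue'."""
    if status_code in STOP_STATUSES:
        # The server is refusing access; do not retry with other identities or IPs.
        return ("stop", 0.0)
    if status_code in BACKOFF_STATUSES:
        # Honor a Retry-After header given in seconds; the HTTP-date form of the
        # header is not handled in this sketch and falls back to the default.
        if retry_after is not None and retry_after.strip().isdigit():
            return ("wait", float(retry_after.strip()))
        return ("wait", DEFAULT_BACKOFF_S)
    return ("continue", 0.0)


forbidden = next_action(403)
throttled = next_action(429, retry_after="30")
ok = next_action(200)
```

The important design choice is the "stop" branch. Rotating IP addresses or user agents after a 403 is exactly the kind of circumvention this guideline warns against.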

5. Be careful with personal data

If you collect personal information, ensure compliance with GDPR, CCPA, and other applicable privacy laws. Document your legal basis for processing.
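
One practical way to reduce exposure is to discard personal fields you do not need before anything is stored. The sketch below filters each scraped record against an allow-list. The field names and the sample record are hypothetical, and filtering alone does not establish a legal basis for data you do keep.

```python
# Hypothetical allow-list: only the non-personal fields the project actually needs.
ALLOWED_FIELDS = frozenset({"product_name", "price", "currency", "in_stock"})


def minimize(record: dict, allowed_fields=ALLOWED_FIELDS) -> dict:
    """Return a copy of the record containing only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed_fields}


# Made-up scraped record that happens to include a personal field.
raw_record = {
    "product_name": "Desk lamp",
    "price": 24.99,
    "currency": "EUR",
    "in_stock": True,
    "seller_email": "seller@example.com",
}

clean_record = minimize(raw_record)
```

An allow-list is safer than a block-list here, because a new field that appears on the page is dropped by default until someone decides it is needed.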

6. Extract facts, not creative expression

Scraping product prices, business hours, and stock levels is generally safe. Scraping and republishing full articles or images is not.

7. Review terms of service

Check the target site's ToS for anti-scraping provisions. While enforceability varies, awareness reduces risk.

8. Use an API when available

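If the target site offers an official API, use it. APIs are explicitly authorized access channels, which removes most of the uncertainty about whether your access is permitted. The API's own terms of use and applicable privacy laws still apply to what you do with the data.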

Web Scraping Legality by Country

Region	Key Law	Status
United States	CFAA	Public data scraping generally legal after hiQ ruling
European Union	GDPR	Legal with valid basis; strict on personal data
United Kingdom	UK GDPR + DPA 2018	Similar to EU; research exemptions exist
Australia	Privacy Act 1988	No specific anti-scraping law; privacy rules apply
China	PIPL + CSL	Strict; personal data heavily regulated
Japan	APPI	Generally permissive for non-personal data

How API Everything Helps You Scrape Responsibly

Using a managed extraction API like API Everything helps with legal compliance in several ways:

Built-in rate limiting: Our infrastructure automatically manages request rates to avoid overwhelming target servers.
robots.txt compliance: Configurable respect for robots.txt directives.
No access control bypassing: We focus on extracting publicly available data — no login credential harvesting or CAPTCHA solving.
Structured data output: Our AI extracts facts (prices, ratings, specifications) not copyrighted creative content, reducing copyright risk.
Data minimization: You specify exactly what fields you need — no bulk collection of unnecessary data.

Frequently Asked Questions

Can I scrape Amazon, Google, or Facebook?

Scraping publicly visible data from these sites is unlikely to violate the CFAA under current US case law. However, all three have Terms of Service prohibiting scraping, and all have the resources to pursue legal action. Use official APIs where they cover your use case (for example, Google's Custom Search JSON API or Amazon's Product Advertising API) and limit scraping to genuinely public data.

Is scraping prices and product data legal?

Generally yes. Prices and product specifications are facts, not copyrightable expression. Competitive price monitoring is a well-established and legal business practice. This is one of the most common use cases for web scraping.

Can I scrape email addresses?

Technically possible, but legally risky. Email addresses are personal data under GDPR and CCPA. Collecting and using them for unsolicited marketing may violate CAN-SPAM, GDPR, and other laws. Proceed with extreme caution and legal counsel.

What about scraping for AI training?

This is the hottest legal question of 2026. Multiple lawsuits (NYT v. OpenAI, Getty v. Stability AI) are testing whether scraping copyrighted content for AI training constitutes fair use. The law is unsettled — watch for major rulings.

Bottom Line

Web scraping is legal in most cases when done responsibly: targeting publicly available data, respecting rate limits and robots.txt, and being careful with personal information. The legal landscape continues to evolve, with courts increasingly distinguishing between accessing public data (generally lawful) and bypassing access controls (legally risky).

For developers and data teams, the safest path is using a managed extraction API that handles compliance concerns for you. Get started with API Everything to extract data responsibly — with built-in rate limiting, public data focus, and structured output that minimizes legal risk.