Web scraping, or web data extraction, involves extracting data from websites through automated means. This can include scraping data like real estate listings, product information, or user profiles from sites like Zillow, Amazon, and Facebook. But is scraping sites like Zillow actually legal?
The legality of web scraping tends to fall into a gray area. Currently, no federal laws in the United States explicitly make web scraping illegal.
However, a few factors must be considered when determining if scraping a particular website is legal or ethical. This article will look in-depth at web scraping laws and the legal and ethical implications of scraping real estate sites like Zillow.
What is Web Scraping?
Web scraping refers to the automated process of extracting data from websites. This usually involves writing a program or script that can crawl across web pages, analyze their content, and copy certain types of data into a database or spreadsheet.
Some common examples of web scraping include:
- Scraping product data – Retailers may scrape competitor websites to collect and compare product descriptions, pricing, shipping costs, and more.
- Scraping social media data – Companies use web scraping tools to pull social media profiles into spreadsheets for marketing and analytics.
- Scraping real estate data – Real estate investors use scrapers to extract MLS listings from sites like Zillow to analyze market trends.
- Scraping contact information – Recruiters and salespeople scrape sites like LinkedIn to collect names, emails, and phone numbers.
Web scraping allows you to extract large volumes of data from websites far more quickly than manual copying and pasting. It transforms unstructured web data into structured, machine-readable data that can be easily exported and analyzed.
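As a minimal sketch of that transformation, the Python snippet below turns a small piece of HTML into structured records and then into CSV, using only the standard library. The markup, the class names (`listing`, `address`, `price`), and the sample values are hypothetical and chosen for illustration; they do not reflect any real site's page structure.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup for illustration only; real pages differ and change often.
SAMPLE_HTML = """
<ul>
  <li class="listing"><span class="address">12 Oak St</span><span class="price">$350,000</span></li>
  <li class="listing"><span class="address">98 Elm Ave</span><span class="price">$425,500</span></li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collects the text of the address and price spans inside each listing item."""

    FIELDS = {"address", "price"}

    def __init__(self):
        super().__init__()
        self.rows = []        # finished records
        self._current = None  # record currently being built
        self._field = None    # field whose text is being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "listing":
            self._current = {}
        elif tag == "span" and css_class in self.FIELDS and self._current is not None:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside one of the spans we care about.
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def parse_listings(html):
    """Return a list of {'address': ..., 'price': ...} dictionaries."""
    parser = ListingParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Serialize the records as CSV text, ready for a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["address", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_listings(SAMPLE_HTML)
```

Calling `to_csv(rows)` then yields text that a spreadsheet or database import can read directly, which is the "structured, machine-readable" form described above.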
Is Web Scraping Illegal?
There is no single federal law in the US that explicitly outlaws web scraping. The Computer Fraud and Abuse Act (CFAA) is sometimes interpreted as making scraping illegal, but this depends a lot on the circumstances and how courts interpret the law.
In general, here are some of the key factors that influence the legality of web scraping:
- Terms of service – Almost all major websites prohibit scraping in their ToS. Violating a site’s ToS can potentially lead to civil lawsuits or CFAA charges in some cases.
- Authorization and access – Scraping data you don’t have authorization to access often raises legal concerns. However, scraping publicly accessible data is less likely to cause legal issues.
- Volume of scraping – Scraping a few pages periodically is less risky than scaling up to scrape an entire site aggressively, which may be seen as abuse or a denial-of-service attack.
- Use of scraped data – Scraping for non-commercial personal use is generally less legally risky than large-scale scraping for commercial purposes. The latter raises concerns about copyright and intellectual property.
So, in summary – there is no blanket ban on web scraping in the US. Small-scale, non-commercial scraping of public data is generally low legal risk. But large-scale scraping – especially of private user data or for commercial use – raises red flags that may lead to lawsuits or prosecution under the CFAA in some cases.
Is Scraping Zillow Legal?
Probably not, at least without permission. Scraping Zillow breaks the site’s terms of use, which exposes you to legal risk even though no statute bans scraping outright.
Zillow says in their terms of service that scraping is not allowed without permission. Ignoring their rules and taking data anyway could get you in trouble.
You might get a warning letter from their lawyers or even sued. In very serious cases, you could potentially face criminal charges for something called computer fraud.
So even just taking a small amount of real estate information from Zillow is risky without their okay. It’s not clear-cut illegal, but it’s in a gray area where you could encounter legal penalties.
The only way to get Zillow’s data 100% legally is to use their official APIs or purchase a license. If you want to stay on the safe side of the law, it’s best to get their direct approval before scraping.
Here are some key factors to consider regarding the legality of scraping Zillow listings:
- Zillow’s Terms of Use – Zillow’s Terms of Use explicitly prohibit scraping without express permission. So any unauthorized scraping activity violates their ToS and opens you up to potential civil action.
- Accessing private/unauthorized data – Zillow contains agent contact info, off-market listings, and other non-public data. Scraping and using this private data raises legal issues.
- Commercial use – Using scraped Zillow data for personal real estate research is lower risk. However, larger-scale scraping for commercial purposes increases legal exposure.
- Respecting blocking – If Zillow blocks or bans your IP for scraping, attempting to circumvent those blocks may be treated as unauthorized access under the CFAA.
- Data volume – Light scraping periodically may go unnoticed, but heavy scraping runs higher risk of retribution from Zillow and raises trespass to chattels issues.
So in summary, any unauthorized scraping of Zillow violates their ToS. You reduce, but do not eliminate, legal risk by scraping minimally and slowly, never circumventing blocks, and avoiding private or unauthorized data. Commercial use and high data volumes raise the risk further. Overall, consult an attorney before any large-scale Zillow scraping.
Zillow’s Position on Web Scraping
Given the prominence of Zillow in real estate data, it’s helpful to understand their specific policies and perspective when it comes to web scraping of their site.
In Zillow’s Terms of Use, they specifically prohibit:
- Copying, reproducing, modifying, distributing, or otherwise exploiting Zillow data and listings without express permission.
- Using manual or automated software to “crawl”, “scrape”, or “spider” their sites.
- Circumventing their access controls, rate limits, or blocks on scraping activity.
Zillow states they use several technical means to detect and prevent scraping of their data. They block detected scrapers at the IP level and may pursue legal action if needed.
Zillow essentially considers any form of unauthorized, large-scale automated scraping of their site data to be in violation of their ToS. They view excessive scraping as draining their resources and profitability.
Zillow generally permits individuals to manually access and use minimal amounts of listing data for personal use only. They draw the line at systematic scraping and at commercial use of aggregated listing data from their site.
What Does the Law Say About Web Scraping?
While there is no specific anti-scraping law, there are a few existing US laws that are relevant when determining the legality of scraping sites like Zillow:
Computer Fraud and Abuse Act (CFAA)
This law prohibits accessing a computer or website “without authorization” or in a manner that “exceeds authorized access.” Violating a site’s Terms of Service is sometimes interpreted as “exceeding authorized access,” which means the CFAA may prohibit scraping in some cases.
Digital Millennium Copyright Act (DMCA)
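Copyright law protects creative materials such as the listing photos on Zillow. The DMCA adds a notice-and-takedown process that rights holders can use against anyone who republishes scraped copyrighted content, and it prohibits circumventing technical measures that control access to copyrighted works.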
Electronic Communications Privacy Act (ECPA)
The ECPA prevents unauthorized access to private electronic communications. Scraping confidential user info protected by the ECPA would be illegal.
Trespass to Chattels
This is a common law tort that involves interfering with someone else’s personal property. Aggressive scraping that overburdens a site’s servers could potentially lead to a trespass to chattels claim.
So in summary, the CFAA, the DMCA, the ECPA, and trespass to chattels may all apply in cases where large-scale scraping crosses certain lines. While scraping is not explicitly illegal, any unauthorized scraping carries some civil litigation risk.
Common Legal Risks of Web Scraping
Given the complex legal landscape around web scraping, what are some of the specific legal risks you may encounter if you decide to scrape sites like Zillow?
- Cease-and-desist letters – Receiving a C&D letter from a site’s legal team is very common. It demands that you stop scraping their site.
- Civil lawsuits – Sites may sue scrapers for things like copyright infringement, data theft, violating the CFAA, etc. This can lead to damages and injunctions.
- Criminal charges – In rare cases, very egregious scraping may result in criminal CFAA charges for “computer hacking”. But this is less common.
- DMCA takedown notices – A site may issue them if you re-use their copyrighted photos or other materials scraped from the site.
- Banning/blacklisting – Sites will typically ban and blacklist repeat scrapers at the IP address level through firewall rules.
So in most cases, an angry C&D letter or DMCA notice is the most likely outcome. But civil lawsuits and criminal charges are also possible depending on the nature of your unauthorized scraping activity.
Best Practices for Legal Web Scraping
Given the questionable legality of scraping sites like Zillow without explicit permission, what are some best practices to keep your web scraping activities legal and ethical?
- Review terms of use – Check each site’s ToS to understand their policies before scraping. Never violate their expressed terms.
- Make requests fairly and slowly – Use proper throttling and delays to minimize server load and avoid detection as an attacker.
- Request public data only – Avoid scraping private user info or other non-public or restricted data you don’t have the right to access.
- Do not circumvent blocks – If a site blocks your IP, the only lawful response is to stop scraping.
- Use scraped data minimally – Only re-purpose a minimal amount of scraped data for personal or research purposes rather than commercial gain.
- Use your own computing resources – Don’t tap other sites’ resources like servers and bandwidth without permission.
- Consult an attorney if needed – Get qualified legal advice for riskier large-scale scraping uses or questionable gray areas.
If in doubt, it’s always best to politely request a site provide API access to their data. Otherwise, ensure your scraping is low-volume, harmless, doesn’t violate their ToS, and puts minimal burden on their systems.
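The throttling and "do not circumvent blocks" practices can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: the function names, the `example-research-bot` User-Agent string, the five-second default delay, and the choice to treat HTTP 403 and 429 as a block signal are all assumptions made for the example. The `fetch` and `sleep` parameters are injectable so the logic can be exercised without touching a real site.

```python
import time
import urllib.error
import urllib.request

# Assumption: these status codes are treated as "the site is blocking or rate limiting us".
BLOCK_STATUSES = {403, 429}


def http_fetch(url, timeout=10):
    """Fetch one URL and return (status_code, body_bytes)."""
    request = urllib.request.Request(url, headers={"User-Agent": "example-research-bot"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; report the status instead of crashing.
        return error.code, b""


def polite_crawl(urls, fetch=http_fetch, delay_seconds=5.0, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between requests and halting on a block."""
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # throttle: never send back-to-back requests
        status, body = fetch(url)
        if status in BLOCK_STATUSES:
            # The site has signalled a block: stop entirely.
            # No retries, no proxy switching, no remaining URLs.
            break
        if status == 200:
            pages[url] = body
    return pages
```

The key design choice is that a block ends the run instead of triggering a retry, which mirrors the guidance above that stopping is the only safe response once a site has blocked you.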
Is it Ethical to Scrape Zillow Listings?
Even if you determine that lightly scraping Zillow data is technically lawful in your jurisdiction, there are still some ethical implications to consider before scraping real estate sites.
Here are some key ethical perspectives when web scraping Zillow or other MLS listing sites:
- Purpose behind scraping – If you’re scraping to empower renters or analyze market trends, it may be more justifiable than commercial profit.
- Transparency – Making sites aware of your intended scraping and uses of the data through a formal request is more ethical than covert scraping.
- Data privacy – Accessing only public listing data reduces privacy issues versus scraping private user info.
- Attribution – Giving credit if you re-use any site content or data is good ethical practice.
- Impact on target site – Excessive scraping can slow target sites and cost them money, which raises ethical questions.
- Compliance with norms – Scraping in a manner consistent with standard conventions online is more ethical than pushing boundaries.
- Legality – Legally questionable scraping is not inherently unethical, but it does create an ethical dilemma.
So, in summary, lighter scraping of truly public data for personal use, with attribution, is generally more ethical. Larger-scale commercial scraping should be transparent, protect data privacy, and minimize harm to the target sites.
Scraping Ethics – Striking the Right Balance
At the end of the day, web scraping exists in a nebulous space between powerful innovation and potentially unethical disruption. The technology itself is neutral, but how we choose to deploy it raises important ethical questions.
When scraping sites like Zillow, here are some ethical guiding principles to keep in mind:
- Innovation vs. harm – Scrape only in ways that maximize innovation while minimizing harm and costs to the target sites. Seek win-win value exchanges.
- Transparency vs. deception – Let sites know what data you seek and why. However, some limited covert scraping may be justifiable in certain cases.
- Workflow enhancement vs. misuse – Use scraping intelligently to enhance human workflows and analysis rather than for mindless data hoarding.
- Public data use vs. data abuse – Respect private data and limited commercial licensing of proprietary data.
- Reasonable defaults vs. excessive force – For gray areas lacking consensus, take the most conservative, reasonable path with less potential harm.
By keeping these principles in mind, we can advance the constructive upside of web scraping while avoiding many of the potential downsides and abuses. With mindfulness and ethics, data scraping can drive knowledge and progress forward.
Is Scraping Zillow Safe?
No, scraping Zillow is not completely safe. This is because it breaks their website rules.
Zillow does not allow people to scrape their websites for data. Their rules say you cannot copy their information without permission. If you scrape Zillow, you can get in trouble for ignoring their rules.
You may get an angry letter from Zillow’s lawyers. They could also sue you and try to get money as punishment. In very rare and serious cases, you could even face criminal charges.
However, scraping a small amount slowly and carefully is less risky. But there is still a chance Zillow catches you and asks you to stop. So it’s safer to get Zillow’s permission first.
The best way to get their real estate data safely is to use Zillow’s official APIs or buy their database. This makes sure you stay on Zillow’s good side and avoid any trouble.
The main risks of scraping Zillow are:
- It violates Zillow’s Terms of Service, which explicitly prohibit scraping. This means Zillow could potentially send you a cease-and-desist letter or even sue you for violating their ToS.
- Scraping a large volume of data from Zillow could potentially overload their servers and get you blocked. Circumventing blocks could violate the Computer Fraud and Abuse Act.
- Using scraped Zillow data commercially creates more legal risks around copyright infringement.
- There are ethical concerns around undermining Zillow’s business model by taking data without permission.
However, there are some scenarios where scraping Zillow may have lower risks:
- Scraping small amounts of purely public listing data (no private info) minimally for personal research purposes.
- Using scraped data internally or for journalism purposes rather than commercial gain.
- Being transparent with Zillow when possible about your scraping activities.
- Stopping, rather than circumventing blocks, if Zillow detects and blocks your scraping.
- Not republishing copied listing photos or copyrighted listing descriptions.
So in summary – scraping Zillow does carry real legal and ethical risks in many cases. There are no guarantees of safety. But following prudent practices like scraping minimally, slowly, transparently and only accessing public data can potentially reduce risks of negative consequences. Consult an attorney before any wide-scale scraping. The safest approach is to use Zillow’s official APIs if possible or subscribe to their bulk data services.
Best Practices When Scraping Real Estate Listings
If you do choose to gather MLS real estate listings via web scraping, here are some best practices to consider:
- Be cautious with residential proxies and IP rotation. They make blacklisting harder, but using them to get around a block that a site has already imposed counts as circumvention and raises CFAA risk.
- Build in random delays of 5-10+ seconds between page requests to limit volume and avoid bot detection.
- Only access public listings, not private realtor data, pocket listings, etc., that require authentication.
- Scrape discreetly during overnight low-traffic hours to minimize impact and visibility.
- Use a robust scraping framework like Scrapy that handles throttling, retries, proxies, etc. for you.
- Consult a lawyer if doing anything beyond minimal personal research scraping to assess legal exposure.
- Avoid republishing scraped copyrighted images or listing descriptions. Only use limited extracted data.
- Use scraped listings data for research purposes rather than competing directly with the source site.
- Regularly check each site’s ToS and respect any prohibitions or blocking they enact.
- Consider using paid data services from sites like Zillow or realtor associations, which give full legal access.
Exercising good judgment and respect for each real estate platform while minimizing commercial harms are the keys to keeping any scraping activity ethical and prudent.
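For readers using Scrapy, the delay and throttling advice above maps onto a handful of project settings. The fragment below is a sketch of a `settings.py` excerpt; the setting names are Scrapy's own, but every value is an illustrative assumption and should be tuned to the site and to your own risk tolerance.

```python
# settings.py fragment for a Scrapy project.
# All values below are illustrative assumptions, not recommendations from Scrapy or any site.

# Base delay between requests to the same site, in seconds.
DOWNLOAD_DELAY = 10

# Vary the wait between 0.5x and 1.5x of DOWNLOAD_DELAY (here roughly 5 to 15 seconds).
RANDOMIZE_DOWNLOAD_DELAY = True

# Send one request at a time to any single domain.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let the AutoThrottle extension slow down further when the server responds slowly.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 10
AUTOTHROTTLE_MAX_DELAY = 60
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0

# Retry transient server errors once, but do not retry 429 (Too Many Requests):
# a rate-limit response is treated here as a signal to back off, not to try again.
RETRY_ENABLED = True
RETRY_TIMES = 1
RETRY_HTTP_CODES = [500, 502, 503, 504]
```

Leaving 429 out of the retry list is a deliberate choice in this sketch, in keeping with the advice to respect any blocking a site enacts.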
Scraping Real Estate Sites – Where to Draw the Line
With laws ambiguous and enforcement unpredictable, it’s tempting to push the boundaries when scraping Zillow and other MLS listing sites. But where exactly should ethical scrapers draw the line?
Here are some recommendations on where the line exists between reasonable scraping and clearly unethical territory:
- No circumventing blocks – If a site blocks you, stop. Never try to evade blocks with VPNs or proxies.
- Public listings only – Scraping realtor data that sits behind a login is unethical. Respect data privacy.
- Non-commercial use – Avoid competitive commercial use-cases. Research and journalism uses are more defensible.
- Minimal data volume – Pulling around 100 listings per city is reasonable. Scraping every property nationwide is excessive.
- Low frequency – Scrape a given site no more than once per week, ideally much less. Don’t relentlessly scrape daily.
- Limited automation – Some automation is fine, but human-in-the-loop checks help avoid excessive, unethical scraping.
- Transparency when possible – Making scrapers feel comfortable disclosing activities helps raise visibility on ethical norms.
Of course, these are just guidelines. Ethical scraping exists across a spectrum. But following principles of transparency, moderation, and respecting sites’ policies is the wise path forward.
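The volume and frequency guidelines are easier to follow when the scraper enforces them itself. Below is a small Python sketch of such a guard. The class name `ScrapeBudget`, its methods, and the site label used in the usage note are hypothetical; the defaults of roughly 100 listings per city and one run per week come from the guidelines above.

```python
from datetime import datetime, timedelta

# Defaults taken from the guidelines above; adjust downward if in doubt.
MAX_LISTINGS_PER_CITY = 100
MIN_INTERVAL = timedelta(days=7)


class ScrapeBudget:
    """Tracks per-site run frequency and per-city listing counts."""

    def __init__(self, max_per_city=MAX_LISTINGS_PER_CITY, min_interval=MIN_INTERVAL):
        self.max_per_city = max_per_city
        self.min_interval = min_interval
        self._last_run = {}  # site -> datetime of the most recent run
        self._counts = {}    # (site, city) -> listings recorded in the current run

    def can_start_run(self, site, now):
        """True if the site has never been scraped or the minimum interval has passed."""
        last = self._last_run.get(site)
        return last is None or now - last >= self.min_interval

    def start_run(self, site, now):
        """Begin a new run for a site, refusing if it was scraped too recently."""
        if not self.can_start_run(site, now):
            raise RuntimeError(f"{site} was scraped too recently; wait before running again")
        self._last_run[site] = now
        # Reset the per-city counters that belong to this site.
        self._counts = {key: n for key, n in self._counts.items() if key[0] != site}

    def record_listing(self, site, city):
        """Count one listing; return False once the per-city cap has been reached."""
        key = (site, city)
        if self._counts.get(key, 0) >= self.max_per_city:
            return False
        self._counts[key] = self._counts.get(key, 0) + 1
        return True
```

In use, a scraper would call `start_run("example-site", datetime.now())` once per session and stop collecting for a city as soon as `record_listing` returns `False`.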
Should You Hire a Web Scraping Service or Consultant?
Given all the complex technical, legal, and ethical hurdles, you may be tempted just to hire a web scraping consultant or managed service to scrape Zillow for you. But there are tradeoffs to consider here as well.
Potential benefits of hiring scrapers:
- Expertise – They have the technical skills and experience to handle CAPTCHAs, blocks, clustering, etc., which you may lack.
- Efficiency – A qualified scraper gets data fast with minimum time and effort on your end.
- Anonymity – They abstract some legal risk away from you directly.
- Managed service – Some scraper companies offer scraping as an ongoing managed data feed.
Potential downsides of hiring scrapers:
- Questionable ethics – You are enabling sketchy behavior that flies in the face of sites’ policies.
- Legal uncertainty – You likely retain some legal liability despite using a third-party scraper.
- Data risks – Scraped data quality can suffer without oversight, and scrapers can mishandle private data.
- Indirect harm – Hiring scrapers contributes to an automation arms race of blocks and circumvention.
- Cost – Reputable scraping firms can be quite expensive, with fees sometimes reaching $10,000 or more.
Ideally, it’s best to work directly with sites like Zillow to access listing data through official APIs or bulk data services when you need large, up-to-date feeds. But if scraping is your only viable option, thoroughly vet scraping contractors to ensure proper ethics, compliance, and data handling.
Key Takeaways – Scraping Real Estate Sites
As we’ve explored throughout this article, scraping sites like Zillow exists in a legal and ethical gray zone that contains nuance. Here are some of the key principles to keep in mind:
- There is no absolute definitive answer on the blanket legality of scraping real estate sites – outcomes depend heavily on specific circumstances.
- All scraping without explicit permission violates most sites’ ToS. But ToS alone do not determine legality.
- Light scraping of purely public data is less legally risky than heavy commercial scraping or accessing private data.
- Scraping politely at reasonable volumes after reading sites’ policies is best to avoid lawsuits or prosecution.
- There are non-legal ethical obligations around not harming target sites and respecting data licensing systems.
- Scrape transparently and only as needed to balance innovation vs. disruption. Seek win-win exchanges of value.
- When possible, work with sites directly to get bulk access legally through official APIs or data subscriptions.
- Consult qualified legal counsel before undertaking any large-scale or questionable scraping project.
In the right circumstances and handled ethically, web scraping can unlock powerful data insights. But scrape thoughtfully and judiciously to avoid becoming the villain. With principles of fairness, transparency, and minimizing harm, we can advance data access in a responsible way.
Conclusion
Web scraping exists in a gray legal zone, requiring careful attention to stay on the right side of the law and ethics. While not categorically illegal, scraping sites like Zillow without permission certainly carries civil and criminal legal risks. Beyond strict law, we also have an ethical obligation to respect sites’ policies and terms of use, scrape transparently when possible, minimize harm, restrict private data access, and restrain large-scale automated scraping. With mindfulness and reasonable restraint, web scraping can safely unlock exciting new data insights and innovations. But scrapers must thoughtfully walk the line between value creation and destruction. With wisdom and principle, we can ethically tap into data abundance while respecting the interests of others. Approached properly, web scraping can be a force for empowerment and truth.