Federal Court makes clear: Website scraping is illegal

As a general rule, we all know it is not a good idea to scrape content from a website, yet some companies persist in this behaviour contrary to law and best practice.

On April 15, Justice Richard Southcott of the Federal Court of Canada issued a permanent injunction against Mongohouse.com, aka, MongoHouse.ca, Sheng Lan Mai aka Maxim Mai, Kun Xu, 2565707 Ontario Inc. and Jing Liu (collectively, the Mongohouse defendants) in a stinging rebuke against web/data scrapers, upholding the copyright of the Toronto Real Estate Board in its internal multiple-listings service systems and database.

The Toronto Real Estate Board, a not-for-profit corporation representing more than 50,000 realtors across the Greater Toronto Area, is the creator, author and custodian of the TREB Multiple Listing Service®, a co-operative service that provides more than 100 online services, including access to active real estate listings, detailed property descriptions, archival information, photography, neighbourhood descriptions and other curated information related to real property (including purchase prices). The service is available, for a fee, to TREB’s members and to the members of its partner real estate boards. TREB has reciprocal agreements with other real estate boards across Canada and is affiliated with the Canadian Real Estate Association, the registered owner of the multiple listing service registered trademark and the MLS design.

The Toronto Real Estate Board’s statement of claim, filed on Sept. 12, 2018, alleged that Mongohouse.com’s entire existence (and business) was based upon its unauthorized access of the TREB MLS® system and infringing use and distribution of TREB MLS® information for a commercial purpose, namely monetizing TREB’s content for Mongohouse’s and its owners’ financial benefit. 

The Mongohouse defendants were accused of deactivating, bypassing and circumventing the various technological protection measures actively deployed by TREB to limit and restrict access to the TREB MLS® system and the MLS® information, in violation of various confidentiality and copyright protection obligations of TREB’s listing agreements, its authorized user agreements, statutory obligations and third-party licence agreements with information-supply partners including Teranet Inc. and the Municipal Property Assessment Corporation.   

The claim also named the software engineer who was the alleged author of the software used to crack TREB’s technological protection measures to gain access to the TREB MLS® information and display it on Mongohouse. 

Various U.S. and Canadian internet service providers were also originally named in the claim for the purposes of obtaining injunctive relief requiring them to comply with take-down notices and to cease hosting Mongohouse as well as providing all information regarding the identity of the current and past site owners and operators.

Currently, TREB members can access the TREB MLS® system only by providing two levels of credential authorization to authenticate their user names and passwords, and by using a PIN to gain access. Members are also required to abide by the TREB MLS® rules and policies, which require them to agree to the TREB authorized user agreement terms and conditions.

Section four of the AUA explicitly prohibits authorized users from using, copying, reproducing or exploiting “the MLS Database contrary to various By-Laws, the MLS Rules and MLS Policies” or Ontario’s Real Estate and Business Brokers Act. 

Authorized Users are also expressly forbidden, under s. 7(c) of the AUA, to “decompile, reverse-engineer, disable, modify, analyze or create derivative works of the software, MLS Database or BRS Database.” TREB presently uses, as described in the claim, a variety of software applications and protection measures to actively prevent third parties from gaining unauthorized access to, and downloading or streaming, the TREB MLS® information, including antivirus software, third-party anti-scraping services (ongoing monitoring and check/validation processing), hosting-service firewalls and intrusion-detection systems, anti-malware software and detection systems and encrypted token authentication protocols.

Mongohouse stood accused of subverting the TPMs put in place by TREB and populating its website, on a daily basis and at no charge, with content that it had copied from the TREB MLS® system, including new property listings, prices, photography and detailed property descriptions.

Using maps with indicators to show new property listings and recently sold properties to its 50,000 registered users, in a similar form and with a similar content and layout to that provided by TREB, Mongohouse also offered advertising space to real estate-related businesses in competition with TREB.

The claim alleged that the information contained on the Mongohouse site could only have come from TREB’s MLS® system, and TREB asserted that it had verified this fact by placing certain unique information in the TREB MLS® system for access by members (and restricting how this information could be displayed). TREB subsequently found that the information appeared on the Mongohouse website within 24 to 48 hours of its initial placement on the TREB site, which TREB presented as proof that the content was being scraped from the TREB MLS® system.

TREB argued that the TREB MLS® system, including its design, layout, presentation, manner of access and form/selection of information as well as the information contained therein, is proprietary to TREB (even though not all of the information is exclusive to it) and that it had spent millions of dollars annually for upkeep, maintenance and support of the online service for its members. Moreover, TREB claimed that the unique collection of information compiled, organized and maintained by TREB in the TREB MLS® system is a copyrightable work (namely a compilation that is original, independently created and organized and that requires a great degree of skill, judgment and labour in its overall selection and arrangement) and that it also contains confidential and proprietary information. Accordingly, TREB argued, as the author and content creator it holds the copyright interest associated with the TREB MLS® information (and associated copyrights as defined in the Copyright Act) and, therefore, only TREB has the right to authorize its use, copying, streaming, distribution or dissemination. On TREB’s account, Mongohouse, through the use of the illegally obtained TREB MLS® information, was passing itself off as offering the same services as offered by TREB (without users having to pay the associated fees), infringing TREB’s copyrights and exclusive rights in order to profit from advertising revenue, etc.

In addition to an interlocutory and permanent injunction against Mongohouse and the defendants, TREB sought: damages for each breach and infringement of TREB’s proprietary information and copyrights in the amount of $100,000; an accounting as to the receipts by each defendant arising from such infringement of TREB’s copyrights and breaches of confidential information; damages in the amount of $2,000,000 under the Trade-Marks Act for infringement, passing off, confusion and loss of reputation; and pre- and post-judgment interest as provided by law and TREB’s costs on a solicitor and client basis.

By Oct. 1, 2018, not long after the filing of the claim, Mongohouse had taken down its site and it remained offline. However, on Oct. 30, 2018, the Mongohouse defendants responded with their own spirited, 73-page statement of defence and counterclaim denying virtually every allegation in the claim and counterclaiming for lost revenue. The Federal Court was not convinced and, with the consent of the Mongohouse defendants, definitively found and declared in its order that: as the owner of the TREB MLS® listing services and TREB MLS® database, TREB is the owner of the associated copyrights pursuant to the act; the unauthorized copying, data scraping, downloading, display, distribution, access to make available for distribution and streaming for public display of any TREB MLS® data is a breach of TREB’s proprietary rights and copyrights associated with the TREB MLS® service; and any access to the TREB MLS® system other than as authorized by TREB using any means to avoid, bypass, deactivate, impair or to circumvent in any manner a TPM is a breach of s. 41 of the act and is an infringement of TREB’s rights.

The court further granted a permanent injunction against the Mongohouse defendants, restraining each of them (including their officers, directors, employees, agents, assigns or any person acting under their instructions) from: accessing, copying, data scraping, downloading, displaying, distributing, accessing to make available for distribution and streaming for public display any TREB MLS® data or information, unless expressly authorized in writing by TREB; using any method to avoid, bypass, remove, deactivate, impair or circumvent any technological protection measures put in place to protect or limit access to the TREB MLS® system and data; operating, conducting or having any involvement in or providing or offering means to access the TREB MLS® system or assisting in the collection or display of the TREB MLS® data, unless expressly authorized in writing by TREB; maintaining, operating, implementing, marketing or having any involvement with any business or enterprise used in any manner or form for the purpose of providing or offering a means to access the TREB MLS® system via any means or method, including any internet-based technology, without the express written permission of TREB. 

The action was otherwise dismissed on a without costs basis and Mongohouse’s counterclaim was also dismissed on a without costs basis.

What are the takeaways of this decision? 

The Federal Court has clearly laid to rest any question regarding the legality of web scraping.  The bottom line for prospective digital companies is: Engaging in unauthorized copying, data scraping, downloading and distributing third-party content without the consent of the original rights holders is illegal under the act; and web scraping is not the basis of a good business or revenue model that will likely be profitable or have staying power in the longer term.   

  • Copyright and parasitism

    John G
    Presumably a plaintiff would have to demonstrate at least some of what TREB did in this case, to support a claim of copyright. Usually what gets scraped is some kind of compilation, the value of which the scraper wants to take to itself - what French courts have described as parasitism (a cause of action not found in the Intellectual Property Code, but the judges just were not going to stand for it). I guess the Fed Court has clearly laid it down until someone in similar circumstances to the defendants here takes the issue to appeal. But such a party will have some barriers to overcome to make an appeal court sympathetic to it.
  • It's Just a Consent Order - No Precedential Value - Apparently Inconsistent with a Federal Court of Appeal Judgment

    Howard Knopf
    This was only a consent order - not an actual judgment with precedential value - and is apparently inconsistent with a clear ruling of the Federal Court of Appeal. See "The Toronto Real Estate Board is Back in the Copyright News": http://excesscopyright.blogspot.com/2019/04/the-toronto-real-estate-board-is-back.html
  • Overreach

    Laurel L. Russwurm
    It seems clear the defendant both broke its agreement and circumvented TPMs, and perhaps caused confusion by representing itself. Although I am not a lawyer, it seems these issues could better be addressed by contract and fraud law rather than IP law. Particularly troubling to me is the idea that "design, layout, presentation, manner of access and form/selection of information" is being considered proprietary. Instead of allowing anyone to claim ownership of reasonable ways in which to organize data, such things ought to be promoted as web standards to improve Internet access for all users. Is it only a matter of time before Google Maps copyrights the process of determining the fastest route from point A to point B? Supporting such copyright overreach is as bad an idea as patenting a shape (like, say, a rectangle with rounded corners) or laying claim to words or letters of the alphabet.
  • Really?

    Jon Festinger, Q.C.
    Unless I'm missing something, it's drawing a rather long and IMHO inaccurate bow to say that what sounds like a consent decree unaccompanied by detailed reasons of a court, and which seems to be mostly supported by contractual terms in the AUA, is the definitive statement that every possible instance of web scraping is illegal. What if there was no AUA being agreed to? Is there never a fair dealing argument? Or an educational use etc. etc. Based on the above I believe your statement "The Federal Court has clearly laid to rest any question regarding the legality of web scraping." is likely far too broad.