Thursday, July 14, 2005

Wayback Machine Sued for Copyright Infringement

Earlier I blogged about the Wayback Machine. Now NYT reports:

The Internet Archive was created in 1996 as the institutional memory of the online world, storing snapshots of ever-changing Web sites and collecting other multimedia artifacts. Now the nonprofit archive is on the defensive in a legal case that represents a strange turn in the debate over copyrights in the digital age. Beyond its utility for Internet historians, the Web page database, searchable with a form called the Wayback Machine, is also routinely used by intellectual property lawyers to help learn, for example, when and how a trademark might have been historically used or violated.

That is what brought the Philadelphia law firm of Harding Earley Follmer & Frailey to the Wayback Machine two years ago. The firm was defending Health Advocate, a company in suburban Philadelphia that helps patients resolve health care and insurance disputes, against a trademark action brought by a similarly named competitor. In preparing the case, representatives of Earley Follmer used the Wayback Machine to turn up old Web pages - some dating to 1999 - originally posted by the plaintiff, Healthcare Advocates of Philadelphia.

Last week Healthcare Advocates sued both the Harding Earley firm and the Internet Archive, saying the access to its old Web pages, stored in the Internet Archive's database, was unauthorized and illegal.

And SE Round Table reports:

Wayback Machine is used so often by many of us for many reasons. Want to see the first RustyBrick Web site, use the wayback machine. But it is also used as a legal tool "to turn up old Web pages" that can be used in a legal case. Due to this, one law firm wanted to prevent this from happening and decided to bring in the Wayback Machine into a lawsuit.

This lawsuit has an incredible number of implications; to name a few:
(1) The fun aspect of bringing up old versions of Web sites
(2) A surefire way to prove copyright infringement
(3) Many search engines have "caching" functionality

There is a definite search-related impact from this case for all of us. The forums are buzzing on this topic; two major threads are at WebmasterWorld & Search Engine Watch Forums.


At SEWF Jenstar wrote: If you are wondering how to know if people are checking out your site via the archive.org, you can check the referral for images on your page, since it hotlinks all images for all copies of the pages it indexes. You might be surprised to see how many people are peeking at the older copies of your pages; I have spotted the IPs of many competitors in those image referrals to archive.org. I am guessing this is how Healthcare Advocates knew those pages were still active in archive.org for competitors to access, since they state exactly how many pages their specific competitor accessed. If archive.org no longer hotlinked images, then it would not be apparent who and how often those historic pages were accessed through the archive, so proving access would be a lot more difficult, although many pages wouldn't be as user friendly as they currently are. This will be a lawsuit to watch and see how it will affect how things are done at archive.org and how it keeps older versions of webpages, particularly the hotlinked image situation. It will be unfortunate if it makes this tool less valuable for those researching trademarks and copyright infringement.
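For anyone who wants to try the referral check Jenstar describes, here is a rough sketch of how it might look. It is only an illustration, not anything posted in the thread: it assumes your server writes the common Apache "combined" log format, that the referrer field contains "archive.org" when an archived copy hotlinks one of your images, and that images can be recognized by file extension. The function name `archive_image_hits` and the extension list are made up for this example.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# host ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)"'
)

# Hypothetical list of extensions treated as images.
IMAGE_EXT = (".gif", ".jpg", ".jpeg", ".png")


def archive_image_hits(lines):
    """Count image requests per client IP whose referrer mentions archive.org."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue  # skip lines that do not look like combined-format entries
        path = m.group("path").split("?", 1)[0].lower()
        referer = m.group("referer").lower()
        if path.endswith(IMAGE_EXT) and "archive.org" in referer:
            hits[m.group("ip")] += 1
    return hits
```

To use it, pass in an open log file (for example `archive_image_hits(open("access.log"))`, where the file name is a placeholder) and look at the IPs with the most hits. Keep in mind that referrers can be missing or spoofed, so this only hints at who has been looking at old copies.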

At WMW Webwork (9:14 pm on July 13, 2005) wrote: At times like this it pays to read the law, which I now lawfully post: http://www.copyright.gov/title17/92chap1.html#101

§ 108. Limitations on exclusive rights: Reproduction by libraries and archives

(a) . . . it is not an infringement of copyright for a library or archives, or any of its employees acting within the scope of their employment, to reproduce no more than one copy or phonorecord of a work, except as provided in subsections (b) and (c), or to distribute such copy or phonorecord, under the conditions specified by this section, if —

(1) the reproduction or distribution is made without any purpose of direct or indirect commercial advantage;

(2) the collections of the library or archives are (i) open to the public, or (ii) available not only to researchers affiliated with the library or archives or with the institution of which it is a part, but also to other persons doing research in a specialized field; and

(3) the reproduction or distribution of the work includes a notice of copyright that appears on the copy or phonorecord that is reproduced under the provisions of this section, or includes a legend stating that the work may be protected by copyright if no such notice can be found on the copy or phonorecord that is reproduced under the provisions of this section.


hunderdown wrote (3:08 am on July 14, 2005):

"At least you should be able to permanently remove YOUR site/pages when you ask. I don't think they remove anything, other than allow you to block it via robots."
Not sure I agree. After all, it contains materials that had been freely and publicly available at some time in the past. If a company produces a brochure which a library collects, does that company have a right to ask for it to be discarded just because they say so? I don't think so.

It was interesting to read that law firms use the Internet Archive as a resource in trademark cases--sort of a cheap and easy discovery process alternative.

Any questions?
