The Google Books Story

Cover to Cover
A comprehensive report on the origins, development, functionality, social impact, ethics, legality, and outlook of the Google Inc. Book Search service. David J. Thompson Winter 2010

Table of Contents
Foreword
Introduction
Google and Contextual Advertising
The Birth of the Books Project
Acquiring Content: The Partner Program & the Library Project
Google Book Search: Presentation & Functionality
Controversy
Relevant Legal Concepts
Infringement Allegations
Google's Case
The Proposed Settlement
Public Reaction to the Settlement: Further Controversy
Hypothetical: Google Inc. vs. AAP & Authors Guild
Legal Precedent
Final Analysis: Google Books and Fair Use
Conclusion
Works Cited

Foreword
This paper is a report on the Google Book Search service. The first half of the paper is dedicated to chronicling Google's growth as a company, its profit model, the historical development of Google Book Search, and an in-depth description of the functions of the service. I felt it necessary to include a comprehensive overview of the history and technology because I believe they are integral to understanding and appreciating the Google Books controversy and legal discussion. The second half of the paper discusses the origins of the controversy, the ethical dilemmas it posed, the ensuing legal battles, relevant legislation and case precedent, a summary of the 2008 settlement, and a prospective analysis of the Google Books service and similar digitization projects.

The scope of the report is limited to Google's activities in the United States. The primary reason for confining the discussion to domestic affairs is that copyright infringement, the crux of the Google Books legal conflict, is defined by jurisdiction and can vary drastically between countries. Consequently, a thorough discussion of the legal outcomes of the Google Books Project on an international scale would require a prohibitively large amount of research and reporting.

Also, a word on the tone and depth of analysis in the report: it is written for those completely unfamiliar with the aforementioned events and concepts. As an undergraduate business major with a concentration in finance and a minor in philosophy, I have had little education in the way of computer science, advertising-based profit models for website owners, programming, or copyright law. As I read about these concepts to familiarize myself with the issues and ideas underlying the Google Books topic, I attempted to distill the basics and convey them in the body of the report in the simplest terms I could manage. If you are already well-versed in any of these topics and find yourself dozing off during one of these inexpert discussions, feel free to skip ahead.

Introduction
Why has Google become such an important part of our culture? Why has its name been assimilated into dictionaries[1], and why are history books being written about the company only years after its inception?[2] The answer to these questions is at once simple and complicated. The simple part of the answer is its web search engine. It quickly emerged as the most effective and most popular search tool during the dot-com boom of the late nineties and early 2000s and has since maintained those titles. (Nielson)

Another factor contributing to Google's rise to prominence is that it has evolved to become much more than just a search engine. In fact, it now has a hand in almost every imaginable facet of internet use. To name a few:

Gmail, short for Google Mail, is its email service, which boasts over 146 million users per month. (Arrington, 2009) With a user-friendly interface and unique features like conversation-view email chains, its user base is growing at a staggering annual rate of 43 percent, and it has become the third-most popular email service only four years after its release. (Schonfeld, 2009)

Google Earth and Google Maps are applications that bring geographic satellite imagery to the masses. This is amazing technology that offers, among other things, street maps, a route planner, and business locators for several different countries around the globe. (ABC, 2010)

Google Apps are office productivity programs that are similar in functionality to the Microsoft Office suite, but instead of creating documents via a program installed on your hard drive, the Google products allow users to create, modify, and store word-processing documents, presentations, or databases online. (Berlind, 2007) The Apps are proving to be a major factor in the trend toward exclusively online computing and remote storage, a movement which may eventually eliminate the need for anything but internet accessibility in personal computing hardware.

[1] Joining the elite ranks of products such as Coke, Xerox, and Kleenex, the Google company title has transcended its role as a mere reference to a particular organization and attained genericized brand name status. The term "google" now refers to a search or inquiry, usually in the context of a Web search, and its spin-off verb "to google" is the act of performing such an investigation. The word was officially added to the Oxford English Dictionary on June 15, 2006. (Bylund, 2006)
[2] Cf. Stross, Planet Google: One Company's Audacious Plan to Organize Everything We Know; Girard, The Google Way: How One Company is Revolutionizing Management As We Know It; Battelle, The Search: How Google and its Rivals Rewrote the Rules of Business and Transformed Our Culture; and so on.

Google has also established a foothold in the social networking world with Orkut, an answer to Facebook and MySpace that has become particularly popular in India and Brazil. In 2010, Orkut ranked 60th among all websites in user traffic, and it has more than 100 million active users worldwide. (Alexa) YouTube, the popular video-streaming website, is a wholly-owned subsidiary of Google and is a juggernaut in and of itself. It has been estimated that in 2007, YouTube consumed as much bandwidth as the entire internet did in 2000. (Daily Telegraph, 2007) Amazingly, the video-clip giant has since increased in popularity and is revolutionizing the way video is disseminated and experienced.

Google also has a finance page, with live updates on stock quotes and other developments. In the same vein, it has a sports page with live updates on scores for almost every imaginable sport. It has a search function for images, scholarly articles, and patents. It has its own web browser, called Chrome, to compete with Microsoft's Internet Explorer and Mozilla Firefox. It released its first smartphone, the Nexus One, in early 2010. (Google, 2010) It owns the Android mobile operating system, which is projected to become the world's second most common smartphone platform by 2012. (ComputerWorld, 2009) And there's talk of a Google operating system in the works, dubbed Chrome OS, which would no doubt claim a significant portion of Microsoft's veritably uncontested Windows market share. (Mediati, 2009) From Google Voice, to Wave, to Latitude, to Calendar, to Reader, to Health, to Chat, to SketchUp, to AdSense, to Blogger, the list goes on almost indefinitely, and nearly all of the 150+ Google domains are integrated such that a single search can return relevant results from any of the aforementioned sites and applications. (Google Corporate)

Perhaps the most essential element contributing to Google's transcendent status and massive user base is the fact that all of the services listed above are completely free to users. There are no subscription fees, no one-time membership fees, and no bothersome biweekly pop-ups pestering you to upgrade to a 'premium' version. It appears that we have finally reached an age our parents said would never come: we can now get something for nothing. (Actually, Grandpa, there is such a thing as a free lunch. It's called Gmail, and there is no scam.)

But are we actually getting something for nothing? Well, that depends on your definition of compensation. If you think of payment as most people do, i.e. as an outlay of cash or cash equivalents in exchange for a product or service, then yes, we are getting something for nothing. However, this raises another question: how does Google manage to invest so heavily in the production of all these original, high-quality services of paradigm-shifting proportions, only to give them away? And how does it generate billions of dollars in revenue each quarter without selling anything tangible?[3] These are reasonable questions, especially given the fact that most tech firms today do not give their services away. To the contrary, most tech companies (like Microsoft, for example) continue to approach business more or less like traditional brick-and-mortar organizations, in which bottom-line performance is determined almost exclusively by their margins and their ability to increase sales volume. Thus, despite the cutting-edge appearance of the products they deliver, the traditional "the more copies of software program X or hardware component Y we sell, the more money we make" mentality remains the prevalent business outlook of the industry.

Returning to Google, we again pose the question: How does it manage to saturate every corner of the personal computing and internet service markets with innovative, practical, and powerful applications, offer them for free, and all the while generate huge cash flows? The solution to this apparent paradox is as novel and ground-breaking as the technology on which its search engine is founded.

[3] In Q4 of Google's 2009 fiscal year, the company generated revenues of $6.67 billion. (Google Investor Relations)

Google and Contextual Advertising


In short, the solution to the riddle of Google's seemingly too-good-to-be-true business model is advertising revenue. More specifically, Google sells contextual advertising space to companies seeking the attention of groups who are statistically more likely to purchase their product or service than a random cross section of the population. And because of its superior search technology, Google is able to do it bigger and better than anyone. To fully appreciate Google's mastery of generating ad revenue and its ability to sustain its competitive edge, we first need to familiarize ourselves with some of the fundamentals of advertising.

Google is not the first to practice targeted advertising. In fact, the concept is probably as old as modern marketing itself. As early as the 1880s, business owners recognized the two inherent limitations of traditional forms of marketing: one being the lack of control over who ultimately sees the advertisement, and the other being the lack of feedback as to how effective the ad actually was, i.e. whether the message had a material impact on the purchasing predilections of the customer or any effect on top-line performance. This uncertainty about the effectiveness of dollars spent on advertising was captured perfectly by the retail giant John Wanamaker in 1886 when he said, "Half the money I spend on advertising is completely wasted. Trouble is, I don't know which half."

Over the years, marketing efforts to deliver advertisements to specific target markets have become increasingly effective, and have been aided by the evolving and expanding forms of media. Initially, signs and the printed word afforded business owners very little control over the ultimate recipients of their message. The success or failure of their marketing effort was dependent upon whatever random mix of people happened to be driving by their billboard or flipping past a particular page in a newspaper. However, their situation improved with the advent of popular magazines and radio, as they provided at least a modicum of insight as to the tastes and worldviews of their audiences. Television, and the diverse array of channels that came with it, brought marketers a step closer to realizing their goal of reaching a specific population subset by allowing for even more specialized and niche-specific ads to be delivered.

However, the most dynamic and revolutionary changes in targeted marketing took place at the turn of the new millennium, when internet access became commonplace. Advertising on the web provided marketers with a host of new opportunities and virtually eradicated the limitations inherent in classic forms of advertising. Space was now cheap and plentiful (as opposed to steeply-priced television time slots and cramped newspaper pages), messages could be delivered quickly and efficiently, and marketers could establish a one-to-one dialogue with the consumer. (Adams, 2003) The potential for unique, creative articulation of brand messages had also evolved, as the palette of web-based coding enabled the integration of video, high-definition graphics, Flash animation, and even interactive games into a single advertisement. Furthermore, given the vast, multifarious, and ever-growing body of web sites, the ability to execute highly targeted advertising campaigns was becoming more feasible than ever before.

Perhaps the most important improvement of internet-based advertising over traditional mediums was the ability to directly and accurately quantify the intended audience's exposure and response to an ad. Companies with online advertisements could now collect precise data on the number of users who saw their ad, the number of users who clicked on their ad, what time they did so, what they looked at once they were redirected to the advertiser's site, and whether they made a purchase or not. With these statistics at their fingertips, companies were no longer content paying a flat fee to a site owner for the mere opportunity to display their name on some banner. They could now directly observe how much revenue was generated by a particular ad in a particular location, and in real time.[4] Thus, the systems of payment and revenue models for internet advertising came to reflect these traffic and sales metrics. With the guesswork regarding exposure and effectiveness out of the picture, companies were finally able to pay ad hosts for exactly what the host's site contributed to their sales: no more, no less.
Today, the three most common ways of pricing and measuring the effectiveness of advertising are (a) Cost Per Mille, or CPM, in which advertisers pay website owners for exposure of their message to every 1,000 people, (b) Cost Per Click, or CPC, in which advertisers pay each time a user clicks on their listing and is redirected to their website, and (c) Cost Per Action, or CPA, in which the advertiser pays the host only according to the number of users who complete a transaction, such as a purchase or a sign-up. (Alexandrou, 2007) The basic lesson here is that if you want to make a lot of money as a third-party host of advertisements on the internet, two factors will be critical to your success: one is the level of user traffic your website generates, and the other is how interesting or attractive your visitors find the advertisements on your page.[5] That is, you must have a large number of people seeing the ads and a large number of people actually clicking on the ads with a genuine interest in the products or services offered therein.[6]

Which brings us full circle back to Google: Google is uniquely positioned to meet the two aforementioned criteria on a larger scale and with better accuracy than any other company on the globe. With regard to user traffic, Google and its subsidiaries occupy fourteen of the top fifty most-visited websites on the entire World Wide Web, with Google.com and YouTube.com holding places one and three, respectively. (Alexa, 2010) More importantly, Google is equipped with some of the most sophisticated internet search technology in existence, which enables it to return highly relevant and often useful (as opposed to intrusive) ads to its users.[7] This drastically improves the advertisers' odds of reaching an audience that is at least interested in their products, if not in need of them.

Part of Google's ability to return spot-on advertisements to web surfers is based on an acute recognition of key words using its search technology. But Google also employs information that hasn't been explicitly volunteered by the user, such as their geographic location, the time of day, and cached data about their browsing history. Furthermore, it runs these ads on all of its sites and tailors the message according to the context of the user's activity. For example, suppose you open an email in your Gmail account from your friend and coworker Bob that says, "Beers after work?" Without prompting, Google scans this text, picks out key words, identifies your geographic location, returns several short ads for different bars near your workplace (maybe telling you which ones have happy hour specials and when those discounts occur), pinpoints their location on Google Maps, offers to give you directions, and suggests automatically adding the outing to your schedule in Google Calendar. Google performs this type of highly contextualized advertising on all of its websites.

Some people find this concept unsettling. After all, just think of the diversity and sheer quantity of information Google can collect about a single user of its products: it can see your photos, read your emails, read your instant-message chats, and see your schedule, your contacts, your profession, your place of residence, your workplace, your most frequent travel destinations, and your documents, among other things. Indeed, Google is privy to almost everything there is to know about you if you use its products. It can use this private data about you because you've already agreed to let it do so in the terms of service. (The terms of service agreement is that jumble of text you clicked through in three seconds before activating your account.) That's your form of payment for using Google's services: agreeing to let the company connect you with advertisers based on the most detailed and comprehensive personal information about you it can produce.[8] This, of course, is an advertiser's dream come true, and the demand for Google ad space reflects this. In 2008, Google raked in a cool $21 billion in advertising revenue alone. (Google Investors Relations, 2008)

And here's the kicker: As Google collects massive amounts of revenue for all of the advertising leads it's providing, it's also collecting massive amounts of data about the purchasing tendencies of its users. What leads are users actually following? What ads seem to get the most attention? Google stores and processes this information and then uses it to predict future consumer behavior. Based on its forecasts (which are probably highly accurate given the sample size, the comprehensive scope of the data, and the sophistication of its technology and staff), it tailors its products and product line to conform to the evolving tastes, needs, and wants of the ever-growing internet population. It is a self-perpetuating cycle of creating new services, advertising within these services, earning revenue on those ads, and then using the capital and data garnered from that advertising to create more ad-bearing products and services.

Google is writing its own ticket. By carrying out its mission of organizing the world's knowledge and making it universally accessible and useful, it is also building a giant following, with each new pair of eyes and ears attuned to a Google website representing an improvement in its informational resources and an increase in advertising demand. Now you can see why Google has an incentive to spend capital on developing high-quality, useful services and also why it has an incentive to give these services away.

One such service, which I have not yet mentioned, is more audacious than any other Google project to date and arguably the most faithful embodiment of the company's mission statement. It is called Google Book Search, and it is an attempt to make all the world's printed word as searchable and accessible as information on the internet is today. The basic idea is that Google scans a book's pages, converts the scanned image into computer-readable text, and then stores the image and corresponding text in its memory banks. Today, a Google search query can return results from any of the books in its digital library alongside those from its more than 1 trillion indexed web pages. A Books search can be conducted in over 35 languages, ranging from Japanese, to Czech, to Finnish; over 10,000 publishers and authors from over 100 countries are participating in the effort; its library partners include such distinguished learning centers as Harvard University, U.C. Berkeley, and Ghent University in Belgium (Google BS, 2007); and it has already scanned over 10 million books into its database and counting. (Skidelsky, 2009) We may be only years away from being able to search the entire corpus of recorded human knowledge with a single tool.

[4] As you may recall from one of the previous paragraphs, the two inherent limitations in traditional advertising mediums were (a) the lack of control over the audience that ultimately receives the ad, and (b) the uncertainty regarding the effect on potential consumers.
[5] The latter is a proxy for CPC- and CPA-based revenues.
[6] Caveat: The vast majority of online advertising is paid for via the Cost Per Click model rather than the Cost Per Action model, so ultimately, purchasing the product or service at the advertiser's website is generally secondary to visitation in terms of revenue production.
[7] The method by which Google receives payment for hosting advertisements is as fascinating and unique to Google as the PageRank algorithm. Via the AdWords program (which is Google's main advertising product and primary source of revenue), advertisers specify key words they would like to have associated with their product, and the maximum amount they are willing to pay Google per 'click,' or visit, by one of Google's visitors. The bidding between advertisers is blind, so each enters how much the ads are truly worth to them, without a clear price point. It is completely automated, and prices of keywords are redefined after every search as their frequency, or demand, changes. That's many millions of times per day. In essence, Google is running a 24-7, global auction between millions of advertisers, for eleven slots per search. An economist at the Haas School of Business at U.C. Berkeley speculates that it may be the most successful business idea in history.
[8] In fairness to Google, it doesn't actually disclose the information to the advertisers; it keeps all the information in-house and simply uses it to connect you to advertisers. (Google Privacy Policy, 2010)
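
As a worked illustration of the three pricing models described earlier in this section (CPM, CPC, and CPA), the short Python sketch below computes what an advertiser would owe a host under each one. Everything in it is an assumption made for illustration: the Campaign class, the function names, the traffic figures, and the rates are hypothetical and do not come from Google or from any cited source.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Hypothetical traffic figures for one ad on one host site."""
    impressions: int  # number of times the ad was shown
    clicks: int       # number of times the ad was clicked
    actions: int      # number of completed purchases or sign-ups


def cpm_cost(campaign: Campaign, rate_per_thousand: float) -> float:
    """Cost Per Mille: pay a fixed rate for every 1,000 exposures."""
    return campaign.impressions / 1000 * rate_per_thousand


def cpc_cost(campaign: Campaign, rate_per_click: float) -> float:
    """Cost Per Click: pay only when a user clicks through."""
    return campaign.clicks * rate_per_click


def cpa_cost(campaign: Campaign, rate_per_action: float) -> float:
    """Cost Per Action: pay only when a user completes a transaction."""
    return campaign.actions * rate_per_action


# Made-up example: 200,000 views, 1,000 clicks, 20 purchases.
example = Campaign(impressions=200_000, clicks=1_000, actions=20)
print(cpm_cost(example, 2.00))   # 400.0
print(cpc_cost(example, 0.50))   # 500.0
print(cpa_cost(example, 15.00))  # 300.0
```

The same traffic produces three different bills, which is why a host's revenue depends both on how many people see an ad (CPM) and on how many of them find it interesting enough to click or buy (CPC and CPA).
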

The Birth of the Books Project


The Book Project's roots can be traced back to the inception of Google itself. Actually, it may be more accurate to say that Google.com, the search engine that brought Google to fame and remains its primary service, can trace its roots back to the books project. Well, not the Books Project that would one day become Google Books, but rather the Stanford Digital Library Technologies Project (hereafter "SDLP") that began in the mid-1990s at Stanford University in California. SDLP was a research program whose primary goal was "to provide an infrastructure that affords interoperability among heterogeneous, autonomous digital library services." (Baldonado, 1997) In colloquial terms, SDLP was an effort to develop some kind of program that could navigate a complex system and analyze how the independent entities within that system were interrelated. The ideal result of the SDLP would be a program that could easily and accurately search a large index of digitized books, as the project leaders believed that digitization of libraries was very likely to occur in the near future.

At the time, Google co-founders Sergey Brin and Larry Page were graduate students at Stanford University pursuing computer science degrees and were a part of the SDLP team. As a junior, Page was looking for inspiration as to which topic to study for his dissertation. After a talk with one of his advisers, he decided to undertake a project similar in purpose (but larger in scope) to the SDLP; he would explore the mathematical properties of the World Wide Web, and in doing so, attempt to understand its structure by illustrating pages' interconnectivity using a graph. (Battelle, 2005)

During his time spent working on the SDLP, Page had been exposed to the idea of ranking books in a search field according to their relative importance, as measured by the number of times they had been cited by other authors. The idea behind this model was that a book's popularity with other authors was more indicative of high-quality content and general relevance than simply the number of times the search term appeared in the text. Page had the insight to apply this idea to his modeling of the World Wide Web. Instead of ranking web sites by the number of times the search term appeared within a page, he thought that a site's relevance should be determined by how many other pages linked back to it, as well as the size and rank of those backlinked pages.[9] In pursuit of this idea, he decided to develop a web crawler, which is an automated program that explores web pages one site at a time (a much more feasible task in 1996 than it would be today), and to customize it so that it would return specific information about how pages were linked to each other.

It was around this time that Page was joined by friend and fellow graduate student Sergey Brin, who was also studying computer science at Stanford. Together, they developed an algorithm to convert the raw backlink data gathered by the web crawler into a measure of importance for a given page. They dubbed their invention the PageRank algorithm, and after using it for a short time to analyze the output of their web crawler, they realized its potential to improve internet searches.
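
To make the backlink idea concrete, here is a minimal Python sketch of ranking pages by who links to them. It uses the commonly published power-iteration formulation with a damping factor of 0.85, which is an assumption on my part; the report does not describe Google's actual implementation, and the function name, parameters, and four-page toy web below are all hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by backlinks using simple power iteration.

    links maps each page to the list of pages it links out to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links,
                # so a link from a high-ranked page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy web: A, B, and D all link to C; C links back to A.
toy_web = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy web, C comes out on top because three pages link to it, and A outranks B and D even though it has only one backlink, because that one backlink comes from the highly ranked C. That is the sense in which the rank of the linking pages matters, not just their number.
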
Soon after, Page filed a patent for the PageRank algorithm, Google's secret recipe for its search technology, and the company was incorporated roughly a year later. The rest is history. (Wiki: History of Google, 2009)

However, Brin and Page never lost sight of the company's roots, and as early as 2002, they began talking with experts about the feasibility of scanning every book in the world and organizing it into a digital database. Within the next two years, Page and a small team of Google engineers and researchers would visit some of the largest library digitization projects in progress across the globe, including those at the Library of Congress and the Million Book Project, to study their processes and techniques. By 2004, Google had acquired an ultra high-speed, non-damaging scanner/camera and programs that enabled optical character recognition, and the process of scanning text became quicker and gentler on the books themselves. In 2005 at a statewide book fair in Arizona, the co-founders personally announced Google's ambitious goal for their current project of scanning entire volumes into their databases and making them available for free to the public. Brin and Page referred to the effort as Google Print, which was its working title at the time. By the date of the announcement, Google had already forged several partnerships with prestigious libraries and prolific publishers, bringing the number of books at Google's disposal to a total of 15 million. By 2007, Google Books was fully functional, integrated with other Google services such as maps and search, and available in over thirty-five languages.[10] (Google BS, 2007)
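
For readers curious how scanned pages become searchable once optical character recognition has turned them into text, the following Python sketch builds a tiny inverted index: a lookup table from each word to the pages on which it appears. This is only an illustration under stated assumptions. The recognition step itself is skipped (the sample text stands in for recognizer output), the book titles and page contents are made up, and nothing here is claimed to reflect how Google actually stores or searches its scans.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(pages):
    """Map each word to the set of (title, page number) locations containing it.

    pages maps (title, page_number) -> recognized text of that page.
    """
    index = defaultdict(set)
    for location, text in pages.items():
        for word in tokenize(text):
            index[word].add(location)
    return index


def search(index, query):
    """Return the locations that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up stand-in for the text a recognizer might produce from scans.
scanned_pages = {
    ("Book A", 1): "The whale surfaced near the ship.",
    ("Book A", 2): "The crew lowered the boats.",
    ("Book B", 1): "A ship sailed into the quiet harbor.",
}
index = build_index(scanned_pages)
print(sorted(search(index, "ship")))
print(sorted(search(index, "whale ship")))
```

The point of the index is that a query never has to reread the books: each search is a handful of set lookups, which is what makes searching millions of scanned volumes feasible in principle.
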

[9] The term "backlink" describes an incoming link to a website. Before search engines were commonly used, backlinks were the primary means of web navigation.
[10] Google's description of Google Books on its corporate website says that "over 10,000 publishers and authors from 100+ countries are participating in the Book Search Partner Program. The Library Project [has] expand[ed] to 28 partners, including seven international library partners: Oxford University (UK), University of Complutense of Madrid (Spain), the National Library of Catalonia (Spain), University Library of Lausanne (Switzerland), Ghent University (Belgium) and Keio University (Japan)." (Google BS, 2009)

Acquiring Content: The Partner Program & the Library Project


As mentioned briefly in the preceding section, texts are procured by Google via two major sources: publishers and libraries. Google's effort to collect volumes directly from publishers for the Books database is referred to as the Partner Program. This method of acquisition involves establishing an agreement between Google and a publisher. Users are allowed to read several pages of a copyrighted work, and in return, Google pays the publisher a share of the advertising revenue generated by that visit.

The primary source of the Book Project's scanned material, however, is institutional libraries that have agreed to grant Google full access to their entire collections. These institutions are not compensated monetarily for their cooperation in the program, but they are given copies of all of the scans for free, a service that would otherwise take years of tedious labor and millions of dollars of investment. The libraries may also be motivated by a sense of global community-mindedness, or a desire to make their collections more accessible and useful.

Google Book Search: Presentation & Functionality


Google.com/books is fashioned in traditional Google styling; its homepage is uncluttered and minimalistic with a user-friendly interface. The primary functions of the service include:

Bookmarking: This feature allows the user to flag texts. Once you flag a text, it is preserved in your account's memory so you can find it easily when you want to read it later.
Organizing: A feature that enables you to create your own 'bookshelves,' or personal collections of scanned books. It also allows users to rate books and write reviews.
Share: You can share books, bookshelves, and/or recommendations with friends or the entire community.
Status: This function allows you to maintain lists of books you have already read, reviewed, and/or plan on reading.

With regard to finding specific volumes, users can browse by subject or medium (e.g. newspaper, magazine, novel, etc.), or conduct a Google search using key terms, titles, or authors' names. Search results are returned in one of four formats depending on the copyright restrictions in place for the text in question. The first format is full view. As its name implies, full view offers the user full access to the text, along with the option of downloading the work in its entirety in portable document format. This is the least restrictive mode of viewing and is available only for books that are out of copyright and for books whose authors have granted Google express permission to reproduce them in full.


The second-least restrictive setting is limited preview, which lets users read a predefined number of pages from the selected work. The third of the four formats, snippet view, allows only a few sentences surrounding the search term to be displayed. The majority of the results returned by a Book Search inquiry are shown in snippet view because it constitutes the maximum amount of text Google can legally display for a work still under copyright. The most restrictive setting states that there is no preview available and offers only basic bibliographic information. In all cases, Google provides links directing the user to online bookstores where the text can be purchased or nearby libraries where it can be checked out. (GB, Views)

Screenshot of the Google Books homepage.

Controversy
As you might already be aware, the Google Book Search program was not greeted with open arms by everyone when it premiered in 2004. In fact, it was met with more widespread and vehement opposition than any of Google's actions to date.11 The battle lines of the debate were not clear, as scholars, authors, and laymen alike came down on all sides of the issue.

Those in favor of the service applauded Google's commitment to the democratization of knowledge. Now, due to the efforts of a single organization, anyone with as much as a 56K modem or a smartphone could have access to the same educational resources as a Rhodes Scholar at Oxford. That goes not only for undergraduate college students and researchers, but also for grammar school students in rural areas and people in third-world countries who would never have had such an opportunity otherwise. Discussing the domestic impact of the new service, Harvard's Robert Darnton expressed a similar sentiment in his article, "Google & the Future of Books": "Google's generosity will be a boon to the small-town, Carnegie-library readers, who will have access to more books than are currently available in the New York Public Library. Google can make the Enlightenment dream come true." To people of this mindset, Google's service was symbolic of everything that is good about the internet; namely, the ability to enrich lives all over the world through the profound leveling effects that come with equal access to knowledge.

Proponents of the Book Search program also cited the potential financial benefit to authors and publishers as an upside. With the financial health of the publishing industry in decline, the Google Books effort held promise for a new, reviving profit model: Authors and publishers would receive a
11 Save for the controversy over google.cn's compliance with China's internet-censorship policies, but that's another paper altogether.


percentage of the total revenue generated by Google's advertisements that ran when one of their works was viewed, and a proportional share of any money received from users who pay to view the entirety of a copyrighted book. Furthermore, it would increase the public exposure of authors and their books, whose names and titles might appear in web searches instead of just sitting on a library shelf collecting dust, lapsing into anonymity. Google would eventually produce statistics to reinforce this position; the average increase in the number of visits to a publisher's website after one of its books had been indexed in the Google Books search catalogue was 124 percent. (Zhang, 2008)

Google also made a point of scanning many orphan, or out-of-copyright, books. Among other reasons, Google's motivation for concentrating on these unattributed works was that it could display the entire text without compromising any author's or publisher's rights to profit and/or dissemination. However, as the pro-Book Search crowd pointed out, it also had the effect of publicizing works that are exceptionally rare, unusual, or old, which were previously known by or accessible to only a small group of scholars. This dissemination of previously closely-held materials was also cited by proponents as an upside to the project.

Yet another argument in favor of the Books program is what has been called the "disaster scenario" argument, which is the notion that Google is essentially creating a back-up copy of the entire written word. In the event of a natural or man-made disaster taking place on a global scale, there would be several digital copies of the world's books in several different locations.

Google was also lauded for making a socially responsible decision by undertaking the Books Project at a time when other non-profit public and private institutions such as Ivy League universities and the Library of Congress were unwilling to do so, despite their nearly unlimited financial resources and human capital.
While certain well-endowed libraries such as Harvard and the University of Michigan had actually begun digitization efforts in their own respective libraries, no single institution showed the initiative or the vision to unify their efforts and create a large, collective digital library. (Cites and Insights, 2009) Furthermore, the rate at which each was digitizing its books was hopelessly inadequate in terms of actually creating something useful for the public; the aforementioned universities were scanning fewer than ten thousand volumes per year, whereas Google started at a rate of ten thousand per week. According to Mary Sue Coleman, president of the University of Michigan, the university's internal estimate of the time needed to digitize its entire library was over 1,000 years. (Google, BS) Then Google came along and did it in under a year. Thus, believers in the Google Books project saw the corporation as a servant of society, filling the void left by the inaction of public institutions such as the Library of Congress or an alliance of research libraries supported by a network of foundations, which could have achieved the same result at a feasible cost.

Finally, many saw Google as a leader of change, spearheading humankind's evolution from physical to virtual mediums. They saw the change as an inevitable consequence of the digital age -- a progressive shift which would have a beneficial impact on the environment and improve the way information is disseminated, assimilated, and used. (Blank, 2009)

The parties in opposition to Book Search were also myriad, and even more outspoken than those in favor of the Project. Arguments ranged from speculations about its cultural impact to legal concerns. For example, certain politicians and scholars in Europe have criticized Google on the grounds that it is perpetuating linguistic imperialism, meaning that it is fortifying English's stronghold as the lingua franca, or dominant language in the world.
Supporters of this sentiment cite the disproportionately high

number of books in English on Google's Book Search as evidence, arguing that other languages in which scholarship is common, such as German, French, and Spanish, are underrepresented. Consequently, they worry that the disproportionate access to past works of scholarship in English will influence future works to also be in English, creating a self-perpetuating cycle of English dominance in academia. (Jeanneney, 2006)

Those with a slightly paranoid bent expressed concern over the privatization of a public good (the contents of libraries) and the loss of the privacy that one normally enjoys when checking out or browsing books in a traditional, brick-and-mortar library.12 The first complaint is a variation of the story of the burning of the Great Library at Alexandria; were some disaster of Biblical proportion to take place, Google, Inc. (a for-profit corporation) might become the sole remaining bearer of the entire body of human knowledge. The notion of a business in control of our entire culture and history evokes worries of manipulation of this precious information in the interest of profit. An editorial in the New York Times expressing worry over the lack of privacy in the act of reading a book via Google's new service propounded the possibility that in theory, Google could collect data on what books people read and create a dossier of their political views and other information. (Editorial, 2009) In combination with reading people's emails, analyzing their internet searches, and knowing their schedules, Google could become a very effective instrument for some Orwellian, totalitarian regime, or any group with malicious intent, if it were to fall into the wrong hands. The Googleplex in Mountain View could become the Ministry of Truth, or so they worry.

However, the most outspoken and prolific party on the opposing side was a huge collective of publishers and authors.
They were unhappy because Google had not consulted them before it began the project in 2002, nor after it unveiled it in 2004, for that matter. Most cried copyright infringement; others complained about the bastardization of the medium. The two largest organizations to emerge on the side of the opposition were the Association of American Publishers (AAP), a trade association representing over 300 publishers, and the Authors Guild, another American trade alliance made up of 8,000 published authors, literary agents, and attorneys. Both organizations' primary activities are concerned with intellectual property issues and related litigation.

In 2005, the two organizations brought separate lawsuits against Google, each similar in scope, alleging that the search-engine conglomerate was engaging in massive copyright infringement by digitally reproducing the plaintiffs' works for commercial profit, and then publicly distributing and displaying copies of those works. The lawsuits demanded that the court block Google from copying the books so the authors would not suffer irreparable harm by being deprived of the right to control reproduction of their works. (Sturcke, 2005) Nick Taylor, then-president of the Authors Guild, asserted that "it's not up to Google or anyone other than the authors, the rightful owners of these copyrights, to decide whether and how their works will be copied," adding, "This is a plain and brazen violation of copyright law." (Sturcke, 2005)

Google countered with a publicly-issued response consisting of many of the pro-Books arguments
12 Google Books proponents would assert that the first complaint stems from a misunderstanding of the program; not only will the physical libraries still be there, but they are receiving free, independent digital copies of all books scanned by Google. Furthermore, they are free to create online digital libraries with those copies as they please.


enumerated in the previous section. In one of the earliest of Google's statements in response to the lawsuit, Google's product management vice-president briefly touched upon the core issue: "The ability to introduce millions of users to millions of titles can only expand the market for the authors' books, which is precisely what copyright was intended to foster." From the beginning, it was clear that the ensuing battle between the Authors Guild, the AAP and Google would be a fight decided by the interpretation of copyright law; specifically, the doctrine of fair use.

Relevant Legal Concepts


Before we delve into the legal aspects of the controversy, let's define a couple of basic terms and concepts:

A. Copyright: Copyright is a property right attached to original works of art or literature. The terms art and literature are intended to encompass all creative innovations of a tangible nature. Copyright does not protect facts or ideas, meaning that historical or present truths about the world as well as abstract concepts cannot be owned by any one person or entity. (White, 2008) Specifically, Title 17 of the United States Code states that the owner of a copyrighted work has the exclusive right to do and authorize any of the following:
i. to reproduce the copyrighted work in copies or phonorecords;
ii. to prepare derivative works based upon the copyrighted work;
iii. to distribute copies or phonorecords of the copyrighted work to the public by sale or other transfer of ownership, or by rental, lease, or lending;
iv. in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works, to perform the copyrighted work publicly;
v. in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work, to display the copyrighted work publicly; and
vi. in the case of sound recordings, to perform the copyrighted work publicly by means of a digital audio transmission. (17 U.S.C. 106)

B. Fair Use: Fair use is essentially a set of rules that defines the context under which copyrighted works can be legally used without the author's permission, or alternatively, a list of exemptions to the six components of copyright law enumerated in the preceding paragraph. (Stanford, 2010) Also codified in 17 U.S.C., there are four standards for determining whether use of a copyrighted work constitutes fair use or not:
i. Purpose of use: Copying and using small parts of a protected work is acceptable, as long as they are used in specific educational contexts, especially if the copies are made spontaneously, used temporarily, and are not reproduced in a permanent collection or anthology.
ii. Nature of the work: The type or nature of the original work in question is an important part of determining fair use. For example, an excerpt from a newspaper would be treated differently than a clip of a song or movie. Similarly, non-fiction is treated differently than fiction in the eyes of the law.
iii. Amount and Substantiality: Length of the copied section in relation to the entire original work is a central consideration as to whether something qualifies as fair use. Are only a few sentences from the copyrighted work being copied? This definitely constitutes fair use. A couple of paragraphs? Still okay. An entire chapter? Here is where the line gets blurry. The entire book? Almost certainly not. The outcome hinges on the length of the copied material in relation to the entirety of the original.
iv. The Effect on Marketability: If the copied work will have no effect on the original work's ability to sell, then the fair use exemption is likely to apply. Conversely, if the copy is seen as supplanting the original or infringing upon the potential revenues of the original, it does not qualify as fair use. (Stanford, 2010)

At the heart of the AAP and Authors Guild lawsuits were the issues of copyright and fair use: Did Google's scanning and archiving of these copyrighted works constitute copyright infringement? Or was it the indexing and making them available online that was cause for concern? What was the purpose of Google's use of these published works? Was it for educational purposes? Or did Google's profits from the ad revenue generated by the Books content outweigh the educational aspect, effectively characterizing the service as a commercial enterprise? Furthermore, did the snippets of text from protected works that were returned in Google Books searches meet the fair use criterion for amount and substantiality?

Infringement Allegations
The foremost consideration in the legal debate was whether Google's scanning and indexing of copyrighted works constituted copyright infringement. A key issue in the deliberation over whether Google had infringed upon the rights of the authors was the question of who should bear the burden of determining whether or not a book could be scanned into the Books database. A rule of thumb in traditional copyright law is that it is the prospective user's responsibility to seek permission from the creator or publisher to use the work in question. Google turned this rule on its head with the Books Project: It announced its intention to digitize entire collections, and would only refrain from scanning certain volumes if the author or publisher explicitly withheld permission. Google provided publishers

with what it called an "opt out" plan, which required action on the part of the copyright holders if they wished not to have their books scanned into the collection and indexed. (Jewler, 2005) The position of the AAP and the Authors Guild was that the creators and rights holders should not have to bear the burden of specifying each and every single book they did and did not want in this index. Rather, it was Google's responsibility to regulate what entered the online collection. As the publishers13 were well aware, traditional copyright practice suggests that Google should at least ask, if not pay a fee, for every copyrighted work it adopted into its database. Furthermore, they argued that proactively keeping their books out of Google's digitization project would place an unduly large burden upon the publishers; they could not realistically meet Google's demand of contacting every author, asking them whether they were willing to participate in Google's Book Project, and reporting this list back to Google, especially given the volume of published works being scanned.

But the AAP and Authors Guild held that there was a deeper concern than just the issue of which party should answer to the other. Namely, they felt that the act of scanning and indexing their creative works was a fundamental violation of their exclusive right to copy and/or display protected work. (Jewler, 2005) They felt that regardless of whether Google was legally permitted to copy works before being granted permission by the publishers and authors, it was culpable for reproducing and using these works for a strictly commercial purpose, which constitutes a violation of the terms of fair use.
The publishers justified their characterization of the Books Project as a strictly commercial enterprise by pointing to (a) the payment of digitized, scanned copies of a library's collection in exchange for unfettered access to its stacks, a service that would otherwise take years and potentially cost libraries millions of dollars, and (b) the advertising revenue that Google would generate as a result of making these texts searchable online. (Jewler, 2005) In their opinion, this was revenue that was being taken directly out of their pockets, as it robbed them of the opportunity to control the terms of the digital dissemination of their work in a way that could potentially generate earnings for themselves in the future. The Authors Guild characterized this prospective lost opportunity in its formal complaint as "irreparable harm [caused] by depriving [the publishers] of both the right to control the reproduction and/or dissemination of their copyrighted Works and to receive revenue therefrom." (Duman, Bershad & Shulman LLP, 2005)

Moreover, they held that Google's compliance with the amount and substantiality aspect of the fair use criterion was suspect. While a Books search might only return a snippet from a protected work, which in and of itself would not constitute a substantial portion of the entire text, the snippet displayed could be from any part of the entire text. (Lafeber & Saunders, 2008) The greater concern of the AAP and Authors Guild was not so much with the few sentences returned in the search results, but rather with Google's act of storing these proprietary works in their entirety in a database.

Google's Case
Google denied breaking any copyright rules, asserting that the opt-out program it offered to copyright owners absolved it of any infringement liability. Adopting the exact opposite position from the publishers, Google argued that abiding by the traditional copyright practice of actively seeking permission from every individual rights-holder would be prohibitively laborious and unrealistically
13 Hereafter I will refer to the AAP and the Authors Guild interchangeably by their formal titles and as "the publishers."


demanding in the context of a project of such scope as the Books Project. Google also believed that even if its activities were found to be infringement, they were protected by the fair use doctrine. But as we have already discussed, the publishers argued that Google was ineligible for fair use protection due to the amount and substantiality of the indexed material and because of Book Search's supposed effect upon their works' marketability. Google basically brushed off the amount and substantiality allegation, as it was unprecedented that such minimal use of copyrighted material as one- to three-sentence previews would constitute a violation. With regard to the supposed detrimental effects of the Books Project on the works' marketability, however, Google countered by questioning the legitimacy of the Authors Guild's and the AAP's claims of injury to their business operations. Essentially, the publishers' complaints of commercial harm caused by the Books Project boiled down to three elements, accusing Google of:
1. Causing damage to their goodwill14 and reputation
2. Contributing to lost profits and opportunities
3. Causing depreciation in the value and ability to license and sell their works
According to Google, these arguments just didn't hold water. For one, there was no evidence or sound logic underlying the assertion that Book Search contributed to lost profits and/or opportunities or that it eroded the publishers' ability to sell their works. Quite to the contrary, Google cited statistical evidence indicating that the Books index actually increased exposure and revenues for the publishers. (Publisher Case Studies, 2010) Furthermore, the Authors Guild's and AAP's argument that Google was robbing them of the opportunity to gain from a future profit-generating mechanism of some kind involving the copyrighted content was purely hypothetical and would thus be unlikely to be acknowledged as legitimate evidence of injury in a formal court proceeding.
However, regardless of the strength of Google's position, the possibility that the act of indexing book content to enable automated searching would be found to constitute copyright infringement by a court of law was a risk Google was unwilling to take. If Google were in fact found liable for massive copyright infringement, such a ruling might pose dangerous legal implications for its crown jewel, i.e. the Google search engine. After all, a simplified explanation of the mechanism in Google's search technology is that it makes copies of web pages (cached copies) and indexes them to promote navigability. Google Books works in much the same way; it makes copies of books, and then uses those scanned images to create a massive index, allowing for word-based searching on a large scale. So even though Google has always maintained an unflinching appearance of confidence to the public and has been willing to bet on the legality of its actions in court in the past (cf. Viacom v. YouTube), the prospect of going to trial over the Books Project controversy was an unnecessary risk, as it could have jeopardized the existence of its entire operation.

Thus, protracted negotiations began in the spring of 2006, and on October 28, 2008, Google announced that a settlement had been reached between it, the Authors Guild, and the Association of American Publishers that would end the current lawsuits and set the terms for future operations of the
14 The legal definition of goodwill is a business's reputation, patronage, and other intangible assets that are considered when appraising the business, esp. for purchase; the ability to earn income in excess of the income that would be expected from the business viewed as a mere collection of assets. ~ Because an established business's trademark or servicemark is a symbol of goodwill, trademark infringement is a form of theft of goodwill. (Black's Law Dictionary, 8th ed. 2004)


Google Book Search. (Google Books Settlement Agreement, 2008)

The Proposed Settlement


Spanning 141 pages (before attachments and indices) of pure legalese, the Google Books settlement document quickly gained notoriety for its incomprehensibility to anyone but trained legal professionals. (Settlement, 2008) In an attempt to foster more widespread understanding and conversation, the American Library Association issued a document shortly after Google's announcement entitled the "Google Book Settlement -- 2 page Super Simple Summary" that captured the essential points in bullet point format. The following paragraphs outline the content of the settlement based on the ALA's summary.

One major component of the settlement agreement was the definition of future parameters for Google's relationships with content providers, i.e. libraries, by establishing four broad categories of partnership. The first was a fully participating library, which was an institution that provided Google with full access to its collection (including in-copyright material) in return for (a) digital copies of the books, (b) freedom from liability for copyright infringement, (c) the ability to provide special access to these in-house digital copies to library members with a disability that inhibits traditional, visual reading, and (d) liberal policies regarding use of digital copies for scholarly purposes. The limitations of the agreement demanded that these libraries never use these digital copies to create their own digital database, for any commercial purpose, or for inter-library loans, and that they comply with a demanding Security Standard that addressed issues of identification, authentication, access control, network security, and other provisions. The second class was a cooperating library, which provided Google with content for scanning but did not receive digital copies of the scanned books provided to Google.
However, as a result, these libraries also did not have to comply with the Security Standard, which greatly reduced the costs of their participation and still provided the benefit of making their collections more widely available to the public. They were also absolved of any copyright infringement liability. The third group was public domain libraries, which received an agreement identical to that of the cooperating libraries, except that they provided only public domain books to Google. The final group was the other libraries, which agreed to provide Google books to scan, but chose not to participate in the settlement benefits.

The other major component of the settlement defined the parties affected by Google Book Search and set the terms for how each would be compensated or charged depending upon their involvement. The first of the four parties was the user base of Google Books. In essence, the users were unaffected by the settlement; Google would continue to scan books into its search database, publishers and authors would agree not to sue, and users would continue to be allowed to search the entire content of scanned books. If anything, the settlement improved access for users, as they would now be able to purchase online copies of out-of-print books through an account with Google at a price set by the rights-holders, a service not offered before the settlement. The agreement also stipulated specific rules regarding the amount of each purchased book that could be printed and/or cut & pasted into other documents.

The second party mentioned in the settlement was U.S. public libraries (hence the "2 page Super Simple Summary" provided by the American Library Association). Any public library that requested

access to the database would receive free Public Access Service (PAS), which is essentially a subscription for unlimited viewing of all content in the index, at a single terminal (computer workstation) in its building. If these institutions desired more terminals with full access, they could pay Google a monthly subscription fee for this access. Though the fee would be set by Google, it would be reasonably priced and based upon the estimated number of individuals who would be using the service full-time. Libraries, particularly small and geographically remote ones, are also generally seen as winners in this agreement, as they will greatly increase their access to content at no cost or at a moderate subscription rate.

The third affected group was the collection of authors and publishers who had initiated the legal proceedings several years earlier, i.e. the AAP and the Authors Guild. They received a hearty share of the revenues generated by Google through advertising and sales of access to the full text of in-copyright, non-commercially available books to users. Specifically, the authors and publishers would receive 63 percent of all future revenues generated by the aforementioned sources and would receive these funds through a new system called the Book Rights Registry, or BRR, which allowed individual copyright holders to sign up and receive their payments directly from Google. The BRR would also play a significant role in determining the prices set by Google for access to the work. Authors who did not wish to participate would have until April 5, 2011 to request complete removal of a specific volume from the database. Lastly, the AAP and Authors Guild would receive $45 million up front through the newly erected BRR for previous scanning and use of their work.

The fourth and final player in the settlement was Google itself.
By implication of the deal with the publishers, Google would get to keep approximately 37 percent of the revenue generated by the Books Service. More importantly, it would be allowed to continue scanning books and operating its database. As mentioned in the earlier discussion of Google's business model, the true value generated by its services lies less in its earnings from advertising revenue and more in generating reliable information about user trends, habits, and behaviors. The settlement ensured Google would be able to continue collecting this user data on a huge scale. And though the $45 million back payment might seem large at face value, the amount is minuscule in relation to Google's earnings. For some perspective, consider that in 2008, the year of the settlement, Google's net income was over $4.2 billion. This implies that it could have covered this settlement payment close to a hundred times over before tapping into its financial reserves. (Mergent Online, 2008) Finally, Google agreed that within five years of the settlement approval, it would provide free search, the Public Access Service, and institutional subscriptions for 85 percent of the out-of-print books it had scanned, and would make commercially reasonable efforts to accommodate users with conditions inhibiting their ability to read. All things considered, the settlement terms were highly favorable to Google as well.
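The arithmetic behind the figures above can be checked with a short sketch. It uses only the numbers quoted in the text (the 63/37 percent revenue split, the $45 million back payment, and 2008 net income of "over $4.2 billion"); the function and constant names are illustrative assumptions, not terms from the settlement, and the net income figure is treated as a lower bound.

```python
# Figures quoted in the text; names and structure are illustrative assumptions.
RIGHTS_HOLDER_PERCENT = 63            # share paid to authors and publishers
BACK_PAYMENT_USD = 45_000_000         # up-front payment for past scanning
NET_INCOME_2008_USD = 4_200_000_000   # "over $4.2 billion", so a lower bound


def split_revenue(total_usd: int) -> tuple[int, int]:
    """Split whole-dollar revenue into (rights holders' share, Google's share)."""
    rights_holders = total_usd * RIGHTS_HOLDER_PERCENT // 100
    return rights_holders, total_usd - rights_holders


def coverage_ratio(income_usd: int, payment_usd: int) -> float:
    """Return how many times over the income could cover the payment."""
    return income_usd / payment_usd


if __name__ == "__main__":
    print(split_revenue(1000))
    print(round(coverage_ratio(NET_INCOME_2008_USD, BACK_PAYMENT_USD), 1))
```

On these inputs the ratio comes out at roughly 93, which is consistent with the "close to a hundred times over" characterization.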

Public Reaction to the Settlement: Further Controversy


Despite Google's claims that the agreement between it and the AAP and Authors Guild was groundbreaking and final, and that it would accommodate the interests of all parties involved, the

settlement evoked more public ire and critical attention than the original controversy itself. One of the most common complaints was that the settlement allowed Google to effectively purchase a monopoly over the market for digital book scanning and dissemination. Robert Darnton of Harvard captured the issue eloquently in his article, "Google and the Future of Books":

"As an unintended consequence [of the settlement], Google will enjoy what can only be called a monopoly -- a monopoly of a new kind, not of railroads or steel but of access to information. Google has no serious competitors. Microsoft dropped its major program to digitize books several months ago, and other enterprises like the Open Knowledge Commons and the Internet Archive are minute and ineffective in comparison with Google. Google alone has the wealth to digitize on a massive scale. And having settled with the authors and publishers, it can exploit its financial power from within a protective legal barrier; for the class action suit covers the entire class of authors and publishers. No new entrepreneurs will be able to digitize books within that fenced-off territory, even if they could afford it, because they would have to fight the copyright battles all over again. If the settlement is upheld by the court, only Google will be protected from copyright liability." (Darnton, 2009)

Another vocal proponent of the anti-trust complaint against the settlement was Gary Reback, the anti-trust attorney who worked on the anti-trust case against Microsoft a decade ago. He argued that Google could never have attained the dominant position in the market for digital books it is about to inherit via the settlement through free-market competition, and that it chose instead to use court process to achieve dominance. (Masnick, 2009)

Others cited the potential effects of the settlement on future application of the fair use doctrine.
For example, Siva Vaidhyanathan, a professor at the University of Virginia, argued that the fair use claims made by Google are so excessive that they may provoke judicial limitation of that right. (Vaidhyanathan, 2007) That is, Google may be pushing its luck so far that courts will deem the existing body of fair use law too lenient and generous to parties not originally intended to benefit from it, and will impose more restrictive measures in response.

Still others worried that the settlement would allow Google to start managing and filtering its scanned content in much the same way as it censors its web searches or removes videos from YouTube. In the event that Google Book Search one day becomes the primary medium for experiencing the written word, censorship could have a far-reaching impact on the propagation of ideas and literature. In the same vein, concerns were raised about the lack of any mention of individual privacy in the settlement. Many involved were expecting privacy issues to be a central target of revision in the settlement, but no guarantees of privacy protection have been extended, nor have any improvements been implemented.

Even those in favor of Google and the Books Project were unhappy with the outcome, wondering why a settlement was necessary in the first place. They saw Google as having a very strong legal case, so to them the settlement amounted to caving to the greedy interests of publishers and authors, who were demanding a share of Google's profits for no reason other than that Google had scanned their works and pointed more people to them. Furthermore, they worried that this settlement would only open the door for music labels, the press, video producers, and anyone else whose content Google has indexed to sue for their share, ultimately leaving Google with less freedom and fewer funds to continue to deliver useful services and programs. (Masnick, 2009)

Since the 2008 announcement, numerous parties have created organizations in protest of the settlement or filed amicus curiae briefs expressing concerns about its antitrust implications and its effect on fair use. One of these parties, the New York Law School, wrote that its motive in filing the amicus brief was to "help understand the interplay of law and technology and influence their development to serve democratic values in a digital age, [as well as to] extend human knowledge and harness new information tools to the goals of social justice." (Levy, 2009) (However, these motives appeared slightly less noble-minded when the NYLS disclosed that its research on Google's Book Project was funded entirely by the Microsoft Corporation.) Another group, the Open Book Alliance, was formed in opposition to the settlement, working to counter the scheme to monopolize the access, distribution, and pricing of the largest digital database of books in the world. (Open Book Alliance, 2008) Similarly, the Public Index, a group of professors, students, and volunteers from New York, was formed to advance the public's involvement in the discussion of the ethics of the Google Books settlement. (Open Content Alliance, 2008) Even Harvard University Library, one of the original Library Partners in the Google Books effort, publicly announced its dissatisfaction with the settlement and threatened to end its partnership with Google unless drastic changes were implemented. (Mirviss, 2008)

The settlement has yet to gain final approval from the courts. The process has been slowed by ongoing negotiations between the publishers and Google to refine the settlement terms, negotiations prompted by a brief filed by the U.S. Department of Justice suggesting that the settlement violated U.S. antitrust laws. Though the final hearing was supposed to have taken place in early February of 2010, it has been indefinitely postponed.

Hypothetical: Google Inc. vs. AAP & Authors Guild


While Google's decision to settle out of court was the prudent course of action from a business perspective, it was unfortunate for other library digitization projects and for everyone else seeking clarification of the fair use provisions. A formal court opinion would have provided substance and detail for future applications of the fair use doctrine in technological contexts. In this section, I will discuss what factors might have come into play had Google and the AAP and Authors Guild actually squared off in court. First, I will summarize three important previous cases, and their outcomes, that would have been relevant to the Google Books hearing. Then I will conduct a fair use analysis in an attempt to emulate how a judge might actually have ruled.

Legal Precedent
Sony Corp. of America v. Universal City Studios, Inc. (1984)
Sony was one of the earliest cases to begin to shape copyright law in the digital age. In 1976, Universal Studios sued Sony for producing VCRs with the capability of recording television shows and films. Universal's argument was that Sony was manufacturing a device that enabled people to commit copyright infringement, and that Sony should therefore be held liable for all illegally reproduced work made with the VCR. The case eventually reached the Supreme Court, which ruled in favor of Sony because it believed the VCR had other significant, non-infringing uses and because Universal was unable to disprove this. Since the case ended, the decision has been cited primarily in the context of deciding whether devices and/or computer programs with built-in recording or copying functions are in violation of copyright law. For example, Sony has been arguably the most oft-cited case in the peer-to-peer networking and file sharing debate. More generally, however, the decision found fair use to be an "equitable rule of reason" to be applied in light of the overall purposes of the Copyright Act. (Sony Corp. v. Universal, 1984) And given that the original, stated object of the Act was "the encouragement of learning," Sony may have been a cornerstone of Google's case for the Books Project.

Whelan Assocs., Inc. v. Jaslow Dental Lab., Inc. (1985)
The Whelan case was a dispute between two individuals who both worked in the dental industry. Elaine Whelan had created an original program and method for managing dental laboratories. Rand Jaslow, who worked closely with Whelan, soon thereafter created his own management system that bore deep, conceptual similarities to Whelan's. Whelan cried foul and sued Jaslow for copyright infringement. Though much of the decision was concerned with sorting out technical details and determining whether certain forms of digital communication were subject to copyright, the decision also clearly established the distinction between ideas and expression in the context of modern media.
That is, the appellate court found that the purpose of a utilitarian work, such as a computer program, is the work's idea, whereas the means by which this idea is executed is the expression. Thus, given the similarities between the non-essential aspects of the two programs, the court ruled in favor of Whelan, finding copyright infringement to have occurred.

So what does this have to do with Google Books? None of the technological details would have been relevant to the Google case, nor would the distinction drawn between ideas and expression. However, the decision's distinct emphasis on the importance of upholding the spirit of copyright law is extremely relevant: "We must remember that the purpose of the copyright law is to create the most efficient and productive balance between protection (incentive) and dissemination of information, to promote learning, culture, and development." (Whelan v. Jaslow, 1985) This decision suggests that a court would place significant weight upon the scholarly and cultural benefits of the Google Books service.

Kelly v. Arriba Soft Corp. (2002 to 2003)
If Google Inc. ever did find itself in court defending the Book Search service, Kelly v. Arriba Soft Corporation might have been the single most important precedent on which to base its defense.

In 1999, a commercial photographer named Les Kelly sued the search engine company Arriba Soft Corporation for displaying thumbnails, or scaled-down images, of his photographs in its search results and for providing "deep links" to the photos by directing the user to the owner's website where the full pictures were stored. In short, the California-based 9th Circuit Court of Appeals ruled that the thumbnail version of an image was sufficiently transformative to qualify as non-infringing. Because the primary product of Arriba Soft Corp. was the index, and the cached web pages and scanned images were only incidental copies made in order to create the index, Arriba was absolved of liability for copyright infringement. (Kelly v. Arriba Soft Corp., 2003)

In its analysis of whether the image indexing constituted fair use in Kelly, the Court came to the following conclusions:

Purpose and character of use: As mentioned above, the use was found to be commercial but transformative, and the thumbnails were found permissible because of their utility in facilitating the search engine.

Nature of the copyrighted work: Creative works tend to favor findings of infringement, whereas uses of published works are more likely to be viewed as fair. Because the photographs were both creative and published, the Court gave slight favor to Kelly in this consideration.

Amount and substantiality: Though Arriba was in fact using the entire image (albeit a scaled-down version) of Kelly's work, the Court found that this was necessary for the search engine to provide its service; if the entire image were not displayed, users would not be able to recognize the image and the service would be of no use to anyone. Therefore, because Arriba was using the minimum amount necessary for its intended use, the Court found this consideration to be neutral.

Effect upon marketability and value: The Court proclaimed that Arriba's thumbnails would guide people to Kelly's work rather than direct them away from it or supersede it with something else. Concluding that the use was indeed transformative, the Court found this factor favorable to Arriba.

The case would have been indispensable to Google at trial because it (a) emphasized the importance of flexibility and realism in copyright law, and (b) demonstrated that effective search engines are essential to the culture and economy of the Internet and the United States. (Vaidhyanathan, 2007) Furthermore, the analysis under nearly all four of the fair use considerations would be identical, with the slight exception of the amount and substantiality factor, because Google is copying and storing works permanently whereas Arriba was redirecting and caching, a more impermanent form of storage.

Final Analysis: Google Books and Fair Use


In conjunction with the relevant case precedent, a court overseeing the proceedings between Google Inc. and the publishers' associations would also have considered the four fair use criteria to determine whether the doctrine should be applied.

I. Purpose and Character of Use

The Google Book Search program has great educational value. Given the popularity of Google and the ever-increasing rates of internet consumption per capita (Statistical Abstract, 2010), the service would have an undeniably positive impact upon literacy and the availability of information in the United States and abroad. Yet, at the same time, the Book Search service is undeniably commercial. There is an exchange of goods between Google and its partner libraries (i.e., the libraries receive digital copies of the books in exchange for providing Google with content), and Google will receive future revenue streams from advertising displayed on the Book Search site.

From a utilitarian perspective, the conclusion is simple: Book Search's pedagogical and scholarly value greatly outweighs the disutility to the publishers resulting from Google's commercial interests, as thousands (perhaps millions) more people would benefit from allowing the service to continue than from shutting it down. However, a more literal adherence to copyright law would dictate that the purpose and character of use criterion in this instance be split between Google and the publishers, given the presence of both commercial and educational interests.

II. Nature of the Copyrighted Work

The nature of the copyrighted work consideration attempts to distinguish between different kinds of creation, such as the broad genre of a book or film, and to determine whether the work in question rightfully belongs in the public domain. Given the vast array of copyrighted works in question in the Google Books case, the nature consideration would prove to be largely irrelevant. However, as discussed in the summary of the Kelly v. Arriba Soft Corp. decision, creative works tend to favor findings of infringement, whereas uses of published works are more likely to be viewed as fair. If this general rule were applied to the Google Books case, it would fall in favor of Google, since the works being scanned have all been published.

III. & IV. Amount and Substantiality AND Effect on Marketability and/or Value

I have combined the discussions of the third and fourth criteria because I believe they are inextricably linked in the context of the Google Books case and thus are more effectively explained together. Of all the fair use judgments to be made, the amount and substantiality judgment is the most ambiguous for Google Books. The lines are murky because it is uncertain where the rule should be applied, i.e., whether it should be applied to the amount Google scans into its database (the entire text) or to the amount it displays after returning search results (a few sentences).

Returning to the original intentions of the law, the purpose of the amount and substantiality criterion is to determine what percentage of the entire original work has been copied, in order to establish the degree to which the defendant replicated the original and to assess the likelihood of the copy essentially replacing the original. This is applicable to the Google Books case because it suggests that the rule should be applied to whatever portion of the copied material is viewable by the public, since anything kept privately by Google would not threaten to render the original, public copy superfluous or reduce its marketability. If a court held this opinion, then Google's snippets would be the focus of the amount and substantiality estimation, and they would almost certainly be deemed a fair use.

As for the marketability aspect, there is a strong likelihood that Google's service could bring vitality back to a stagnant publishing industry by offering a new business model for authors. Furthermore, it is creating a market for authors whose books are no longer in print, who can now make money from royalties and digital sales of their work. Finally, it is making the industry as a whole more consumer-friendly by improving the navigability of the total selection and by helping consumers confirm that the book they are buying is actually something they need. An improvement to the marketability of a book or of an industry weighs strongly in favor of a fair use finding.

However, there is also precedent for copyright holders charging sampling fees for use of even small amounts of their work, fees they would not receive from Google or from the users viewing their snippets. This potential reduction in value would reduce the likelihood of a fair use verdict. Another point against Google is the issue of presentation: historically, authors and publishers have been able to dictate which quotes can or cannot be advertised in isolation when promoting their book. With Google Book Search, the sentences displayed from the text would be determined exclusively by the terms the user enters in the search field, without any input from the author.
If a court believed that selective presentation had a substantial impact on the marketability and/or sales of a book, this could also hurt Google's cause.

On balance, it appears that Google would have a strong case if a court's reasoning followed such a trajectory. The decision would, however, hinge crucially upon subjective factors such as the court's interpretation of (a) the relative weighting of the educational value and commercial interests, and (b) Google's practice of scanning entire volumes, in terms of amount and substantiality.

In addition to the specific fair use considerations listed above, I believe that a broader, more fundamental issue would have aided the Google Books cause in court: namely, the emphasis in recent intellectual property decisions upon upholding the original purpose and spirit of copyright law. We would expect a court to undertake a fair use analysis of Google Book Search in a manner so as to "avoid rigid application of the copyright statute when, on occasion, it would stifle the very creativity which that law is designed to foster." (Kelly v. Arriba Soft Corp., 2003) There is an undeniably significant social and cultural benefit provided by Google Book Search, and a court would have recognized this and made it a central consideration in its decision.


Conclusion
Though we will never know what the outcome of a confrontation between Google Inc. and the AAP and Authors Guild would have been in a courtroom setting, we will still have a form of precedent going forward. That is, our precedent will be the settlement agreement that is still being revised and improved by the two parties. It will, for better or for worse, become an essential reference point for future disputes regarding intellectual property law in the digital age. While the settlement approval process continues to be fraught with setbacks due to intense scrutiny from the public and the Justice Department, it appears as though it is going to be approved sometime between mid-2010 and early 2011 in more or less its original form.

Approval of the settlement will be a great step forward in terms of making information widely available in the digital age. With the consent of the publishers and the federal government, there will be no limit to the size and scope of Google's online library, and if the company's past projects are any indication of future potential, it could become the most comprehensive and accessible collection of the written word ever assembled. However, the settlement could also prove to inhibit future projects that depend on fair use of copyrighted material, as Google will have set a precedent of paying off the plaintiff. In any case, the Google Books controversy and the public debate it has sparked over the ethics and application of fair use will echo for decades in discussions about copyright law and its ability to evolve in accordance with the ever-changing technological landscape.


Works Cited
Adams, R. (2003). www.advertising. Broadway, NY: The Illex Press Limited. ISBN 0823058611.
Advertisement: "Nexus One Phone". Google.com. Google. 2010. Retrieved January 6, 2010.
Alexa. "Orkut.com - Traffic Details from Alexa". Alexa Internet, Inc. Retrieved 2009-10-17.
Alexandrou, Marios. All Things SEM: CPM v. CPC v. CPA. February 4, 2007. Retrieved February 1, 2010. <http://www.allthingssem.com/cpm-cpc-cpa/>
Arrington, Michael. "Bing Comes to Hotmail". Techcrunch. Published July 9, 2009. Retrieved December 30, 2009.
Australian Broadcasting Corporation. "ABC Fora- Lars Rasmussen on Inventing Google Maps". Abc.net.au. Retrieved 2010-01-12.
Baldonado, Michelle; Chen-chuan K Chang, Luis Gravano, Andreas Paepcke (1997). The Stanford Digital Library Metadata Architecture. Retrieved 2009-07-24.
Battelle, John. "The Birth of Google." Wired Magazine. August 2005.
Berlind, David. "Google improves Apps, offers organizations clear path off Exchange, Notes, etc. to GMail". Published 2007-06-25. Retrieved 2008-05-29.
Blank, Kim. The Importance of Being Google. The New English Review. September 2009. <http://www.newenglishreview.org/custpage.cfm/frm/45773/sec_id/45773>
Bylund, Anders. "To Google or Not to Google." The Motley Fool. July 5, 2006. Retrieved on March 28, 2007.
Carter, Lewis. "Web could collapse as video demand soars". Daily Telegraph. Published 2008-04-07. Retrieved 2008-04-21.
Crawford, Walt. Perspective: The Google Books Search Settlement. Cites & Insights: Crawford at Large. A Publication Sponsored by YBT Library Services. Volume 9, Number 4: March 2009. <http://citesandinsights.info/civ9i4.pdf>
Darnton, Robert. Google & the Future of Books. The New York Review of Books. Volume 56, Number 2. February 12, 2009. Retrieved March 1, 2010. <http://www.nybooks.com/articles/22281>
Dumain, Bershad & Schulman LLP. Class Action Complaint: The Authors Guild v. Google Inc. Filed September 20, 2005. United States District Court, Southern District of New York.
Editorial: Google's Big Plan for Books. The New York Times Online. Published July 28, 2009. Retrieved February 23, 2010. <www.nytimes.com/2009/07/29/opinion/29wed3.html>
Google Books Settlement Agreement. The Future of Google Books: Our Groundbreaking Agreement with Authors and Publishers. 2008. <http://books.google.com/googlebooks/agreement/>
Google Books Views. Google Books Library Project: An enhanced card catalog of the world's books. Retrieved February 19, 2010. <http://books.google.com/googlebooks/library.html>


Google BS: About Google Books: History. Published 2007. <http://books.google.com/googlebooks/history.html>
Google Corporate Information. <http://www.google.com/corporate/>
Google Investor Relations. "Financial Tables". Retrieved 2008-01-31.
Google Privacy Policy, 2010. <https://checkout.google.com/files/privacy.html>
Hamblen, Matt. Android to Grab No. 2 Spot by 2012. ComputerWorld. Published October 6, 2009. Retrieved March 3, 2010. <http://www.computerworld.com/s/article/9139026/Android_to_grab_No._2_spot_by_2012_says_Gartner.>
Jeanneney, Jean-Noel. Google and the Myth of Universal Knowledge: A View from Europe. October 23, 2006. Retrieved via the Wikipedia article on Google Books, February 28, 2010.
Jewler, Robin. CRS Report for Congress. The Google Book Search Project: Is Online Indexing a Fair Use Under Copyright Law? December 28, 2005. Retrieved March 1, 2010.
Kelly v. Arriba Soft Corp., 335 F.3d 811. 9th Circuit, 2003. <http://docs.law.gwu.edu/facweb/claw/ArribaSo.htm>
Lafeber, Michael & Saunders, Lindsey. ABA Intellectual Property Roundtable. Copyright Protection in the Digital Age: Google Book Settlement and Beyond. November / December 2008. Retrieved March 1, 2010.
Levy, Steven. Who's Messing With the Google Book Settlement? Hint: They're in Redmond, Washington. March 31, 2009. Wired Magazine Online. <www.wired.com/epicenter/2009/03/whos-messing-wi/>
Masnick, Mike. Complaints Against Google Book Scanning Project Reach Ridiculous Levels. Techdirt.com blog. Wednesday, September 9, 2009. <http://www.techdirt.com/articles/20090908/2342546135.shtml>
Mediati, Nick (2009-07-07). "Google Announces Chrome OS". PC World. Retrieved 2009-07-08.
Mergent Online. Historical Financial Data. Google, Inc. (NMS: GOOG) As reported annual income statement: 12/31/2008.
Mirviss, Laura G. (30 October 2008). "Harvard-Google Online Book Deal at Risk". The Harvard Crimson. Retrieved March 6, 2010.
Nielsen Global Web Site Rankings. Insights: Top 10 Search Providers, Home & Work. <http://en-us.nielsen.com/rankings/insights/rankings/internet>
Open Book Alliance. Mission Statement. <www.openbookalliance.org/mission/>
Open Content Alliance. Google Claims to be the Lone Defender of Orphans: Not Lone, Not Defender. 2008. <www.opencontentalliance.org/2009/10/07/google-claims-to-be-the-lone-defender-of-orphans-not-lone-not-defender/>
Publisher Case Studies. Thoughts from Publishers. <www.google.com/au/intl/en/googlebooks/newviews/pub.html>
Schonfeld, Erick. January 14, 2009. Gmail grew 43 Percent last year. AOL and Hotmail need to start worrying. <Techcrunch.com>


Settlement Agreement: The Authors Guild, Inc., the Association of American Publishers, Inc., et al. v. Google Inc. Case No. 05 CV 8136-JES.
Skidelsky, William. The Observer. Google's Plan for World's Biggest Online Library: Philanthropy or an Act of Piracy? Sunday, August 30, 2009. <http://www.guardian.co.uk/technology/2009/aug/30/google-library-project-books-settlement>
Sony Corp. of America v. Universal City Studios, Inc., 464 U.S. 417 (1984). <http://supreme.justia.com/us/464/417/case.html>
Stanford University. Center for Copyright and Fair Use, Stanford University Libraries. Copyright Overview: NOLO. Chapter 9: Fair Use. <http://fairuse.stanford.edu/Copyright_and_Fair_Use_Overview/chapter9/index.html>
Statistical Abstract of the United States from the National Data Book, 2010. Information and Communications: Sections 1118-1120: Internet Access and Usage; Internet Activities of Adults by Type of Home Internet Connection. Provided by the United States Census Bureau.
Sturcke, James. Writers sue Google Print over copyright. The Guardian, UK. September 21, 2005. Retrieved on February 28, 2010. <www.guardian.co.uk/media/2005/sep/21/newmedia.business>
Vaidhyanathan, Siva. The Googlization of Everything and the Future of Copyright. University of California Davis Law Review, 40 (3): 1207-1231. 2007. ISSN 0197-4564. Retrieved March 6, 2010.
Whelan Assocs., Inc. v. Jaslow Dental Lab., Inc., 609 F. Supp. 1307, 225 U.S.P.Q. (BNA) 156 (E.D. Pa. 1985). <http://scholar.google.com/scholar_case?case=15336516076885890624&q=609+F.+Supp.+1307&hl=en&as_sdt=2002>
White, Charles. Beginner Guide to Copyright. Presentation Transcript. SlideShare.net, presentation. Retrieved February 22, 2010. <www.slideshare.net/cwhite449/beginner-guide-to-copyright>
Zhang, Dennis. Google's Book Project. PowerPoint presentation. Fall 2008. Retrieved February 28, 2010.

