Paris

Now booking: Google wants to digitize research libraries. Credit: GOOGLE

A spat is brewing between academic publishers and Google over the Internet-search company's plans to digitize and index library collections at major research universities.

Late last year, Google, based in Mountain View, California, announced a decade-long project to scan millions of volumes at the universities of Harvard, Stanford, Michigan and Oxford, as well as the New York Public Library. The resulting archive would allow computer users worldwide to search the texts online. But some publishers complain that they weren't consulted by Google, and that scanning library collections could be illegal.

Under the scheme, people searching with Google would find library volumes relevant to their query at the top of their search results. Clicking on a title would allow them to browse images of the full text of works in the public domain. Only brief excerpts and bibliographic data would be shown for material under copyright. Participating libraries would also be given a digital copy of their collection.

Google describes the initiative as an extension of Google Print (http://www.print.google.com), which is based on agreements with publishers and allows the full text of books to be searched. Google Print's results provide a brief excerpt of the text, together with a link to publishers or booksellers that sell the book and to libraries that hold it.

But Google has not yet struck any legal agreements with publishers, either individually or collectively, for the research-library initiative, says Sally Morris, chief executive of the Association of Learned and Professional Society Publishers, the international trade body for not-for-profit publishers. Few publishers would want to opt out of the library scheme, Morris says — but they need to be asked to provide the appropriate permission.

Copyright material generally carries some variation of a warning banning the reproduction, storage or distribution of copies of the work without the publisher's permission. Scanning a book constitutes making a copy and so is only allowed with permission, say lawyers from several publishers. They also argue that an exception under US law that allows libraries to copy texts for preservation purposes would not apply in this case. Nor, they say, would the exception for ‘fair use’ apply, given that Google is a commercial company.

A spokesman for Google says that it will “respect the rights of copyright holders”, and that it “prefers to work directly with publishers to bring copyrighted books online”. Google “has been working closely with publishers to help them connect with more readers online”, he adds.

Part of the uncertainty stems from the fact that there seems to have been little discussion so far between Google and publishers, says Terry Hulbert, head of electronic development and strategy at the UK Institute of Physics. “Someone clearly needs to have a chat with the 800-pound gorilla sat in the corner,” he observes. “There is no question that Google should have spoken to the learned societies and publishers beforehand. Systematic digitization of copyright content is absolutely something they cannot do without seeking approval of the rights holders.”

Peter Kosewski, director of publications and communications at Harvard University Library, says the library believes that the way Google intends to handle copyright works is consistent with the law. Harvard is carrying out a pilot with Google on 40,000 titles before making a decision on digitizing its entire 15-million-volume collection. “We have a number of questions that will be answered by the pilot project, and that includes copyright issues,” he says. “We think it is a great programme Google has put together.”