The non-profit Internet Archive, libraries at the University of California and the University of Toronto and technology suppliers Hewlett-Packard Co and Adobe Systems Inc are among the founders of the group.
The organisation, known as the Open Content Alliance (OCA), plans to create a unified storehouse of both public domain and copyright materials, hosted by the Internet Archive.
This potentially vast library would be searchable and freely available to anyone, whether individual web surfers or commercial sites, its promoters said.
“The goal is to really spur the expansion of books, audio and video being made available online through this alliance,” said David Mandelbrot, Yahoo’s vice president of search content – in charge of licensing the media featured on Yahoo’s site.
The Yahoo-backed consortium poses a challenge to Google Inc, which has been working for the past year on an ambitious project to scan the contents of five of the world’s great academic libraries to make the books freely available online – unless copyright holders first object.
The Google and Yahoo projects are just the latest in a long list of projects to digitise academic collections. The pioneering Project Gutenberg, which scans literary works in the public domain, has been underway since the early 1970s.
Initially, several OCA members will work on digitising about 18,000 works of American literature that have been defined as the “canon collection” by the University of California.
These works – which include many of the writings of Mark Twain, Henry James and Edgar Allan Poe – will begin appearing on the OCA site by the end of this year, with the entire collection set to be online by the end of next year.
The European Archive and the National Archives in Britain have also signed on as founders of the OCA, but are still determining what material to contribute, Mandelbrot said.
Yahoo will supply its search technology for use on the OCA archive. It also plans to make the contents of the OCA digital media archive searchable through its own Yahoo search site.
Academic and commercial publishers praised the concept.
“The initiative seems to respect the rights of creators to determine how their works will be used, and this has been our concern and objective all along,” said Pat Schroeder, CEO of the Association of American Publishers.
Gary Price, an analyst with SearchEngineWatch, envisions a literature professor being able to build a custom search system of OCA’s archive that would have students link only to the contents of specific books assigned by the professor.
“Not only is the material in the database open but also the database itself,” said Price, who was briefed on Yahoo’s plan.