The Project Gutenberg eBook of Technology and Books for All
Title: Technology and Books for All
Author: Marie Lebert
Release date: October 29, 2008 [eBook #27098]
Most recently updated: March 14, 2009
Language: English
Credits: Produced by Al Haines
TECHNOLOGY AND BOOKS FOR ALL
MARIE LEBERT
Updated Version
NEF, University of Toronto, 2008
Copyright © 2008 Marie Lebert
——-
From Project Gutenberg in 1971 to the Encyclopedia of Life in 2007, 38 milestones and as many pages, with an overview and an in-depth description for each milestone. This book is also available in French, with a different text. Both versions are available on the NEF <http://www.etudes-francaises.net/dossiers/technologies.htm>.
——-
Marie Lebert is a researcher and journalist specializing in technology and books, other media and languages. She is the author of Les mutations du livre (Mutations of the Book, in French, 2007) and Le Livre 010101 (The 010101 Book, in French, 2003). All her books have been published by NEF (Net des études françaises / Net of French Studies), University of Toronto, Canada, and are freely available online at <http://www.etudes-francaises.net>.
——-
Most quotations are excerpts from NEF interviews. With many thanks to everyone quoted here, who kindly answered my questions over the years. Most interviews are available online at <http://www.etudes-francaises.net/entretiens/index.htm>.
——
With many thanks to Greg Chamberlain, Laurie Chamberlain, Kimberly Chung, Mike Cook, Michael Hart and Russon Wooldridge, who kindly edited and/or proofread some parts in previous versions. The author, whose mother tongue is French, is responsible for any remaining mistakes.
——
TABLE
Introduction
1968: ASCII
1971: Project Gutenberg
1974: Internet
1977: UNIMARC
1984: Copyleft
1990: Web
1991: Unicode
1993: Online Books Page
1993: PDF
1994: Library Websites
1994: Bold Publishers
1995: Amazon.com
1995: Online Press
1996: Palm Pilot
1996: Internet Archive
1996: New Ways of Teaching
1997: Digital Publishing
1997: Logos Dictionary
1997: Multimedia Convergence
1998: Online Beowulf
1998: Digital Librarians
1998: Multilingual Web
1999: Open eBook Format
1999: Digital Authors
2000: yourDictionary.com
2000: Online Bible of Gutenberg
2000: Distributed Proofreaders
2000: Public Library of Science
2001: Wikipedia
2001: Creative Commons
2002: MIT OpenCourseWare
2004: Project Gutenberg Europe
2004: Google Books
2005: Open Content Alliance
2006: Microsoft Live Search Books
2006: Free WorldCat
2007: Citizendium
2007: Encyclopedia of Life
Websites
INTRODUCTION
Michael Hart, who founded Project Gutenberg in 1971, wrote: "We consider eText to be a new medium, with no real relationship to paper, other than presenting the same material, but I don't see how paper can possibly compete once people each find their own comfortable way to eTexts, especially in schools." (excerpt from a NEF interview, August 1998)
Tim Berners-Lee, who invented the web in 1989-90, wrote: "The dream behind the web is of a common information space in which we communicate by sharing information. Its universality is essential: the fact that a hypertext link can point to anything, be it personal, local or global, be it draft or highly polished. There was a second part of the dream, too, dependent on the web being so generally used that it became a realistic mirror (or in fact the primary embodiment) of the ways in which we work and play and socialize. That was that once the state of our interactions was on line, we could then use computers to help us analyse it, make sense of what we are doing, where we individually fit in, and how we can better work together." (excerpt from: The World Wide Web: A Very Short Personal History, May 1998)
John Mark Ockerbloom, who created The Online Books Page in 1993, wrote: "I've gotten very interested in the great potential the net had for making literature available to a wide audience. (…) I am very excited about the potential of the internet as a mass communication medium in the coming years. I'd also like to stay involved, one way or another, in making books available to a wide audience for free via the net, whether I make this explicitly part of my professional career, or whether I just do it as a spare-time volunteer." (excerpt from a NEF interview, September 1998)
Here is the journey we are going to follow:
1968: ASCII is a 7-bit coded character set.
1971: Project Gutenberg is the first digital library.
1974: The internet takes off.
1977: UNIMARC is set up as a common bibliographic format.
1984: Copyleft is a new license for computer software.
1990: The web takes off.
1991: Unicode is a universal double-byte character set.
1993: The Online Books Page is a list of free eBooks.
1993: The PDF format is launched by Adobe.
1994: The first library website goes online.
1994: Publishers put some of their books online for free.
1995: Amazon.com is the first main online bookstore.
1995: The mainstream press goes online.
1996: The Palm Pilot is the first PDA.
1996: The Internet Archive is founded to archive the web.
1996: Teachers explore new ways of teaching.
1997: Online publishing begins spreading.
1997: The Logos Dictionary goes online for free.
1997: Multimedia convergence is the topic of an international symposium.
1998: Library treasures like Beowulf go online.
1998: Librarians become webmasters.
1998: The web becomes multilingual.
1999: The Open eBook format is a standard for eBooks.
1999: Authors go digital.
2000: yourDictionary.com is a language portal.
2000: The Bible of Gutenberg goes online.
2000: Distributed Proofreaders digitizes books from the public domain.
2000: The Public Library of Science (PLoS) works on free online journals.
2001: Wikipedia is the first main online cooperative encyclopedia.
2001: Creative Commons works on new ways to respect authors' rights on the web.
2002: MIT offers its course materials for free in its OpenCourseWare.
2004: Project Gutenberg Europe is launched as a multilingual project.
2004: Google launches Google Print, later renamed Google Books.
2005: The Open Content Alliance (OCA) launches a world public digital library.
2006: Microsoft launches Live Search Books as its own digital library.
2006: The union catalog WorldCat goes online for free.
2007: Citizendium is a main online "reliable" cooperative encyclopedia.
2007: The Encyclopedia of Life will document all species of animals and plants.
[Unless specified otherwise, all quotations are excerpts from NEF interviews. These interviews are available online at <http://www.etudes-francaises.net>.]
1968: ASCII
[Overview]
Used since the beginning of computing, ASCII (American Standard Code for Information Interchange) is a 7-bit coded character set for information interchange in English. It was published in 1968 by ANSI (American National Standards Institute), with updates in 1977 and 1986. The 7-bit plain ASCII, also called Plain Vanilla ASCII, is a set of 128 characters with 95 printable unaccented characters (A-Z, a-z, numbers, punctuation and basic symbols), i.e. the ones that are available on the English/American keyboard. Plain Vanilla ASCII can be read, written, copied and printed by any simple text editor or word processor. It is the only format compatible with 99% of all hardware and software. It can be used as it is or to create versions in many other formats. Extensions of ASCII (also called ISO-8859 or ISO-Latin) are sets of 256 characters that include accented characters as found in French, Spanish and German, for example ISO 8859-1 (Latin-1) for French.
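The figures above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are invented for illustration, and "printable" is taken to mean code points 32 to 126 (the space character included), which is an assumption about how the 95 characters are counted.

```python
# 7 bits give 2**7 = 128 code points, numbered 0 to 127.
ascii_table = [chr(code) for code in range(2 ** 7)]

# Assumption: "printable" means code points 32..126, space included.
printable = [ch for ch in ascii_table if 32 <= ord(ch) <= 126]

print(len(ascii_table))  # 128
print(len(printable))    # 95

# An accented letter has no 7-bit ASCII code...
try:
    "é".encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

# ...but fits in a single byte in ISO 8859-1 (Latin-1).
latin1_bytes = "é".encode("latin-1")
print(fits_in_ascii, len(latin1_bytes))  # False 1
```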
[In Depth (published in 2005)]
Whether digitized years ago or now, all Project Gutenberg books are created in 7-bit plain ASCII, called Plain Vanilla ASCII. When 8-bit ASCII (also called ISO-8859 or ISO-Latin) is used for books with accented characters, as in French or German, Project Gutenberg also produces a 7-bit ASCII version with the accents stripped. (This doesn't apply to languages that cannot be converted to ASCII, like Chinese, which is encoded in Big-5.)
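As an illustration of what "accents stripped" can mean in practice, here is a minimal Python sketch. It is not Project Gutenberg's own tooling, and the function name strip_accents is hypothetical. It decomposes each accented letter into a base letter plus a combining mark, then keeps only the 7-bit ASCII characters. Characters with no such decomposition (for example ligatures) are simply dropped by this approach, so a real conversion would need extra rules.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return a 7-bit ASCII version of text with accents removed.

    Hypothetical helper for illustration only.
    """
    # NFKD splits "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping everything outside ASCII removes the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents("Les Misérables"))  # Les Miserables
print(strip_accents("château"))         # chateau
```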
Project Gutenberg sees Plain Vanilla ASCII as the best format by far. It is "the lowest common denominator." It can be read, written, copied and printed by any simple text editor or word processor on any electronic device. It is the only format compatible with 99% of hardware and software. It can be used as it is or to create versions in many other formats. It will still be in use when other formats are obsolete (or are already obsolete, like the formats of a few short-lived reading devices launched since 1999). It is the assurance that collections will never become obsolete and will survive future technological changes. The goal is to preserve the texts not only over decades but over centuries. No other standard is as widely used as ASCII right now, not even Unicode, a universal double-byte character encoding launched in 1991 to support any language and any platform.
1971: PROJECT GUTENBERG
[Overview]
In July 1971, Michael Hart created Project Gutenberg with the goal of making available for free, and electronically, literary works belonging to the public domain. A pioneer site in a number of ways, Project Gutenberg was the first information provider on the internet and is the oldest digital library. When the internet became popular in the mid-1990s, the project got a boost and gained an international dimension. The number of electronic books rose from 1,000 (in August 1997) to 5,000 (in April 2002), 10,000 (in October 2003), 15,000 (in January 2005), 20,000 (in December 2006) and 25,000 (in April 2008), with a current production rate of around 340 new books each month. With 55 languages and 40 mirror sites around the world, books are being downloaded by the tens of thousands every day. Project Gutenberg promotes digitization in "text format", meaning that a book can be copied, indexed, searched, analyzed and compared with other books. Unlike other formats, the files are accessible for low-bandwidth use. The main source of new Project Gutenberg eBooks is Distributed Proofreaders, conceived in October 2000 by Charles Franks to help in the digitizing of books from the public domain.
[In Depth (published in 2005, updated in 2008)]
The electronic book (eBook) is now 37 years old, still a short life compared to the five and a half centuries of the print book. eBooks were born with Project Gutenberg, created by Michael Hart in July 1971 to make available for free electronic versions of literary books belonging to the public domain. A pioneer site in a number of ways, Project Gutenberg was the first information provider on an embryonic internet and is the oldest digital library. Long considered by its critics as impossible on a large scale, Project Gutenberg had 25,000 books in April 2008, with tens of thousands of downloads daily. To this day, nobody has done a better job of putting the world's literature at everyone's disposal, while creating a vast network of volunteers all over the world, without wasting people's skills or energy.
During the first twenty years, Michael Hart himself keyed in the first hundred books, with the occasional help of others. When the internet became popular, in the mid-1990s, the project got a boost and gained an international dimension. Michael still typed and scanned in books, but now coordinated the work of dozens and then hundreds of volunteers across many countries. The number of electronic books rose from 1,000 (in August 1997) to 2,000 (in May 1999), 3,000 (in December 2000) and 4,000 (in October 2001).
37 years after its birth, Project Gutenberg is running at full capacity. It had 5,000 books online in April 2002, 10,000 books in October 2003, 15,000 books in January 2005, 20,000 books in December 2006 and 25,000 books in April 2008, with 340 new books available per month, 40 mirror sites worldwide, and books downloaded by the tens of thousands every day.
Whether they were digitized 30 years ago or digitized now, all the books are captured in Plain Vanilla ASCII (the original 7-bit ASCII), with the same formatting rules, so they can be read easily by any machine, operating system or software, including on a PDA, a cellphone or an eBook reader. Any individual or organization is free to convert them to different formats, without any restriction except respect for copyright laws in the country involved.
In January 2004, Project Gutenberg spread across the Atlantic with the creation of Project Gutenberg Europe. On top of its original mission, it also became a bridge between languages and cultures, with a number of national and linguistic sections, while adhering to the same principle: books for all and for free, through electronic versions that can be used and reproduced indefinitely. As a second step, the digitization of images and sound would follow, in the same spirit.
1974: INTERNET
[Overview]
When Project Gutenberg began in July 1971, the internet was not even born. On July 4, 1971, Independence Day, Michael keyed The United States Declaration of Independence (signed on July 4, 1776) into the mainframe he was using, in upper case, because there was no lower case yet. But sending a 5K file to the 100 users of the embryonic internet would have crashed the network. So Michael mentioned where the eText was stored (though without a hypertext link, because the web was still 20 years ahead). It was downloaded by six users. The internet was born in 1974 with the creation of TCP/IP (Transmission Control Protocol / Internet Protocol) by Vinton Cerf and Bob Kahn. It began spreading in 1983. It got a boost with the invention of the web in 1990 and of the first browser in 1993. At the end of 1997, there were 90 to 100 million users, with one million new users every month. At the end of 2000, there were over 300 million users.
1977: UNIMARC
[Overview]
In 1977, the IFLA (International Federation of Library Associations) published the first edition of UNIMARC: Universal MARC Format, followed by a second edition in 1980 and a UNIMARC Handbook in 1983. UNIMARC (Universal Machine Readable Cataloging) is a common bibliographic format for library catalogs, as a solution to the 20 existing national MARC (Machine Readable Cataloging) formats, which meant lack of compatibility and extensive editing when bibliographical records were exchanged. With UNIMARC, catalogers would be able to process records created in any MARC format. Records in one MARC format would first be converted into UNIMARC, and then be converted into another MARC format.
[In Depth (published in 1999)]
At the time, the future of online catalogs was linked to the harmonization of the MARC format. Set up in the early 1970s, MARC is an acronym for Machine Readable Catalogue. This acronym is rather misleading as MARC is neither a kind of catalog nor a method of cataloguing. According to UNIMARC: An Introduction, a document of the Universal Bibliographic Control and International MARC Core Programme, MARC is "a short and convenient term for assigning labels to each part of a catalogue record so that it can be handled by computers. While the MARC format was primarily designed to serve the needs of libraries, the concept has since been embraced by the wider information community as a convenient way of storing and exchanging bibliographic data."
After MARC came MARC II. MARC II established rules to be followed consistently over the years. The MARC communication format intended to be "hospitable to all kinds of library materials; sufficiently flexible for a variety of applications in addition to catalogue production; and usable in a range of automated systems."
Over the years, however, despite cooperation efforts, several versions of MARC emerged, e.g. UKMARC, INTERMARC and USMARC, whose paths diverged because of different national cataloguing practices and requirements. We had an extended family of more than 20 MARC formats. Differences in data content meant some extensive editing was needed before records could be exchanged.
One solution to incompatible data was to create an international MARC format - called UNIMARC - which would accept records created in any MARC format. Records in one MARC format would first be converted into UNIMARC, and then be converted into another MARC format, so that each national bibliographic agency would need to write only two programs - one to convert into UNIMARC and one to convert from UNIMARC - instead of having to write twenty programs for the conversion of each MARC format (e.g. INTERMARC to UKMARC, USMARC to UKMARC etc.).
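The saving described above can be sketched in Python. Everything in this example is hypothetical: the two "national formats", their field labels and the hub labels are invented for illustration and are not real MARC or UNIMARC tags. The point is only the routing: each format needs one converter into the common format and one out of it, rather than one converter per pair of formats.

```python
from typing import Dict, Tuple

Record = Dict[str, str]

# Invented field labels: national label -> common (hub) label.
TO_HUB: Dict[str, Dict[str, str]] = {
    "format_a": {"ttl": "title", "aut": "author"},
    "format_b": {"T": "title", "A": "author"},
}


def to_hub(record: Record, source: str) -> Record:
    """Convert a record from a national format into the hub format."""
    mapping = TO_HUB[source]
    return {mapping[label]: value for label, value in record.items()}


def from_hub(record: Record, target: str) -> Record:
    """Convert a hub-format record into a national format."""
    reverse = {hub: label for label, hub in TO_HUB[target].items()}
    return {reverse[label]: value for label, value in record.items()}


def convert(record: Record, source: str, target: str) -> Record:
    """Convert between two national formats by passing through the hub."""
    return from_hub(to_hub(record, source), target)


def programs_needed(n_formats: int) -> Tuple[int, int]:
    """Return (one-way converters for every pair, converters via a hub)."""
    return n_formats * (n_formats - 1), 2 * n_formats


print(convert({"ttl": "Beowulf", "aut": "Anonymous"}, "format_a", "format_b"))
print(programs_needed(20))  # (380, 40)
```

With 20 formats, direct conversion between every ordered pair would take 380 one-way programs in total, against 40 when every format converts to and from a single common format.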
In 1977, the IFLA (International Federation of Library Associations and Institutions) published UNIMARC: Universal MARC Format, followed by a second edition in 1980 and a UNIMARC Handbook in 1983. These publications focused primarily on the cataloguing of monographs and serials, while taking into account international efforts towards the standardization of bibliographic information reflected in the ISBDs (International Standard Bibliographic Descriptions).
In the mid-1980s, UNIMARC expanded to cover documents other than monographs and serials. A new UNIMARC Manual was produced in 1987, with an updated description of UNIMARC. By this time UNIMARC had been adopted by several bibliographic agencies as their in-house format.
Developments didn't stop there. A standard for authorities files was set up in 1991, as explained on the website of IFLA in 1998: "Previously agencies had entered an author's name into the bibliographic format as many times as there were documents associated with him or her. With the new system they created a single authoritative form of the name (with references) in the authorities file; the record control number for this name was the only item included in the bibliographic file. The user would still see the name in the bibliographic record, however, as the computer could import it from the authorities file at a convenient time. So in 1991 UNIMARC/Authorities was published."
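The mechanism in this quotation (one authoritative form of a name, referenced by its control number) can be sketched as follows. The control number, field names and sample records are all invented for illustration and do not follow the actual UNIMARC/Authorities layout.

```python
from typing import Dict, List

# Authorities file: one authoritative form per name, keyed by a
# hypothetical record control number.
authority_file: Dict[str, Dict[str, object]] = {
    "AUTH-0001": {"name": "Hugo, Victor", "references": ["Hugo, V."]},
}

# Bibliographic file: records store only the control number, not the name.
bibliographic_file: List[Dict[str, str]] = [
    {"title": "Les Misérables", "author_control_number": "AUTH-0001"},
    {"title": "Notre-Dame de Paris", "author_control_number": "AUTH-0001"},
]


def display(record: Dict[str, str]) -> str:
    """Build the line a user sees, importing the name from the authorities file."""
    authority = authority_file[record["author_control_number"]]
    return f"{authority['name']}. {record['title']}"


for record in bibliographic_file:
    print(display(record))
```

Because the name is stored once, correcting it in the authorities file changes what is displayed for every bibliographic record that points to it.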
In 1991 a Permanent UNIMARC Committee was also created to regularly monitor the development of UNIMARC. Users realized that continuous maintenance - and not just the occasional rewriting of manuals - was needed, to make sure all changes were compatible with what already existed.
On top of adopting UNIMARC as a common format, The British Library (using UKMARC), the Library of Congress (using USMARC) and the National Library of Canada (using CAN/MARC) worked on harmonizing their national MARC formats. A three-year program to achieve a common MARC format was agreed on by the three libraries in December 1995.
Other libraries began using SGML (Standard Generalized Markup Language) as a common format for both the bibliographic records and the hypertextual and multimedia documents linked to them. As most publishers were using SGML for book records, librarians and publishers began working on a convergence between MARC and SGML. The Library of Congress worked on a DTD (Document Type Definition, which defines the logical structure of a document) for the USMARC format. A DTD for the UNIMARC format was developed by the European Union. Some European libraries chose SGML to encode their bibliographic data. In the Belgian Union Catalog, for example, the use of SGML made it possible to add descriptive elements and to facilitate the production of an annual CD-ROM.
1984: COPYLEFT
[Overview]
The term "copyleft" was invented in 1984 by Richard Stallman, who was a computer scientist at MIT (Massachusetts Institute of Technology). "Copyleft is a general method for making a program or other work free, and requiring all modified and extended versions of the program to be free as well. (…) Copyleft says that anyone who redistributes the software, with or without changes, must pass along the freedom to further copy and change it. Copyleft guarantees that every user has freedom. (…) Copyleft is a way of using the copyright on the program. It doesn't mean abandoning the copyright; in fact, doing so would make copyleft impossible. The word 'left' in 'copyleft' is not a reference to the verb 'to leave' — only to the direction which is the inverse of 'right'. (…) The GNU Free Documentation License (FDL) is a form of copyleft intended for use on a manual, textbook or other document to assure everyone the effective freedom to copy and redistribute it, with or without modifications, either commercially or non commercially." (excerpt from the GNU website)
1990: WEB
[Overview]
The internet got its first boost with the invention of the web and its hyperlinks by Tim Berners-Lee at CERN (European Laboratory for Particle Physics) in 1990, and a second boost with the invention of the first browser, Mosaic, in 1993. The internet could now be used by anyone, and not only by computer-literate people. There were 100 million internet users in December 1997, with one million new users per month, and 300 million internet users in December 2000. In summer 2000, the number of non-English-speaking users caught up with the number of English-speaking users, for a 50-50 split. According to Netcraft, an internet services company, the number of websites went from one million (April 1997) to 10 million (February 2000), 20 million (September 2000), 30 million (July 2001), 40 million (April 2003), 50 million (May 2004), 60 million (March 2005), 70 million (August 2005), 80 million (April 2006), 90 million (August 2006) and 100 million (November 2006).
[In Depth (published in 1999, updated in 2008)]
The World Wide Web (which became the Web, or web) was invented by Tim Berners-Lee in 1989-90. In 1998, he stated: "The dream behind the web is of a common information space in which we communicate by sharing information. Its universality is essential: the fact that a hypertext link can point to anything, be it personal, local or global, be it draft or highly polished. There was a second part of the dream, too, dependent on the web being so generally used that it became a realistic mirror (or in fact the primary embodiment) of the ways in which we work and play and socialize. That was that once the state of our interactions was on line, we could then use computers to help us analyze it, make sense of what we are doing, where we individually fit in, and how we can better work together." (excerpt from: The World Wide Web: A Very Short Personal History, May 1998)
Christiane Jadelot, researcher at INaLF-Nancy (INaLF: National Institute of the French Language) wrote: "I began to really use the internet in 1994, with a browser called Mosaic. I found it a very useful way of improving my knowledge of computers, linguistics, literature… everything. I was finding the best and the worst, but as a discerning user, I had to sort it all out and make choices. I particularly liked the software for e-mail, file transfers and dial-up connections. At that time I had problems with a programme called Paradox and character sets that I couldn't use. I tried my luck and threw out a question in a specialist news group. I got answers from all over the world. Everyone seemed to want to solve my problem!" (July 1998)
The W3C (World Wide Web Consortium) was founded in October 1994 to develop interoperable technologies (specifications, guidelines, software and tools) for the web, as a forum for information, commerce, communication and collective understanding. The W3C develops common protocols to lead the evolution of the web, for example the specifications of HTML (HyperText Markup Language) and XML (eXtensible Markup Language). HTML is used for publishing hypertext on the web. XML was originally designed as a tool for large-scale electronic publishing. It now plays an increasingly important role in the exchange of a wide variety of data on the web and elsewhere.
According to the network tracking firm Netcraft, there were 100 million websites on November 1st, 2006. Previous milestones in the survey were reached in April 1997 (1 million sites), February 2000 (10 million), September 2000 (20 million), July 2001 (30 million), April 2003 (40 million), May 2004 (50 million), March 2005 (60 million), August 2005 (70 million), April 2006 (80 million) and August 2006 (90 million).
1991: UNICODE
[Overview]
First published in January 1991, Unicode is the universal character encoding maintained by the Unicode Consortium. "Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language." (excerpt from the website) This double-byte, platform-independent encoding provides a basis for the processing, storage and interchange of text data in any language, across all modern software and information technology protocols. Unicode is a component of the W3C (World Wide Web Consortium) specifications.
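The "unique number for every character" can be inspected directly in Python, whose strings are sequences of Unicode characters. This is a minimal sketch: the helper name describe and the three sample characters are chosen for illustration. It prints each character's number (its code point) and the number of bytes it takes in two common encoding forms, UTF-16 and UTF-8. The three samples each take two bytes in UTF-16, in line with the "double-byte" description above.

```python
from typing import Dict, Union


def describe(ch: str) -> Dict[str, Union[str, int]]:
    """Report the code point of one character and its encoded sizes."""
    return {
        "char": ch,
        "code_point": f"U+{ord(ch):04X}",                # the unique number
        "utf16_bytes": len(ch.encode("utf-16-be")),     # 16-bit encoding form
        "utf8_bytes": len(ch.encode("utf-8")),          # variable-length form
    }


for sample in ["A", "é", "中"]:
    print(describe(sample))
```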
1993: ONLINE BOOKS PAGE
[Overview]
Founded in 1993 by John Mark Ockerbloom while he was a student at Carnegie Mellon University, The Online Books Page is "a website that facilitates access to books that are freely readable over the internet. It also aims to encourage the development of such online books, for the benefit and edification of all." (excerpt from the website) John Ockerbloom first maintained this page on the website of the School of Computer Science of Carnegie Mellon University. In 1999, he moved it to its present location at the University of Pennsylvania Library, where he is a digital library planner and researcher. The Online Books Page listed 12,000 books in 1999, 20,000 books in 2003 (including 4,000 books published by women), 25,000 books in 2006 and 30,000 books in 2007. The books "have been authored, placed online, and hosted by a wide variety of individuals and groups throughout the world", with 7,000 books from Project Gutenberg. The FAQ also lists copyright information about most countries in the world with links to further reading.
[In Depth (published in 1999)]
John Mark Ockerbloom first started the website of the School of Computer Science of Carnegie Mellon University (CMU CS), and began maintaining The Online Books Page on it. Web space and computing resources were provided by the School of Computer Science.
Interviewed by email in September 1998, John wrote: "I was the original webmaster here at CMU CS, and started our local web in 1993. The local web included pages pointing to various locally developed resources, and originally The Online Books Page was just one of these pages, containing pointers to some books put online by some of the people in our department. (Robert Stockton had made web versions of some of Project Gutenberg's texts.)
After a while, people started asking about books at other sites, and I noticed that a number of sites (not just Gutenberg, but also Wiretap and some other places) had books online, and that it would be useful to have some listing of all of them, so that you could go to one place to download or view books from all over the net. So that's how my index got started.
I eventually gave up the webmaster job in 1996, but kept The Online Books Page, since by then I'd gotten very interested in the great potential the net had for making literature available to a wide audience. At this point there are so many books going online that I have a hard time keeping up (and in fact have a large backlog of books to list). But I hope to keep up my online books works in some form or another.
I am very excited about the potential of the internet as a mass communication medium in the coming years. I'd also like to stay involved, one way or another, in making books available to a wide audience for free via the net, whether I make this explicitly part of my professional career, or whether I just do it as a spare-time volunteer."
In 1998, The Online Books Page listed more than 7,000 books, which could be browsed by author, by title or by subject. It also listed significant directories and archives of online texts, and special exhibits. From the main search page, users could search four types of media: books, music, art, and video.
The Online Books Page began listing serials. As stated on the website: "Along with books, The Online Books Page is also now listing major archives of serials (such as magazines, published journals, and newspapers), as of June 1998. Serials can be at least as important as books in library research. Serials are often the first places that new research and scholarship appear. They are sources for firsthand accounts of contemporary events and commentary. They are also often the first (and sometimes the only) place that quality literature appears. (For those who might still quibble about serials being listed on a 'books page', back issues of serials are often bound and reissued as hardbound 'books'.)"
The Online Books Page participated in the Experimental Search System of the Library of Congress. It also worked with The Universal Library Project, hosted at Carnegie Mellon University.
In 1999, after graduating from Carnegie Mellon with a Ph.D. in computer science, John moved to work as a digital library planner and researcher at the University of Pennsylvania Library. He also moved The Online Books Page there, and went on expanding it.
1993: PDF
[Overview]
PDF (Portable Document Format) was conceived by Adobe in 1992, launched in June 1993 with Adobe Acrobat software, and perfected over 15 years as the global standard for distribution and viewing of information. It "lets you capture and view robust information from any application, on any computer system and share it with anyone around the world. Individuals, businesses, and government agencies everywhere trust and rely on Adobe PDF to communicate their ideas and vision." (excerpt from the website) Adobe Acrobat gives the tools to create and view PDF files and is available in many languages and for many platforms (Macintosh, Windows, Unix, etc.). Ten years later, over 500 million copies of PDF-based Adobe Reader (formerly Acrobat Reader, until May 2003) have been downloaded worldwide. Approximately 10% of the documents on the internet are available in PDF.
1994: LIBRARY WEBSITES
[Overview]
The first library website was the one created by the Helsinki City Library in Finland, which went live in February 1994. Traditional libraries began using a website as a new virtual window for their patrons and beyond. Patrons could check opening hours, browse the online catalog, or explore a broad selection of websites on various topics, depending on their needs. Libraries also began developing digital libraries alongside their standard collections, so that a large audience could access their specialized, old, local and regional collections. Librarians could now fulfill two goals that used to be in contradiction - book preservation (on shelves) and book communication (on the internet).
[In Depth (published in 1999)]
The first library website was the one created by the Helsinki City Library in Finland, which went live in February 1994. Many libraries began developing a digital library alongside their standard collections. Digital libraries allowed a large audience to have access to documents belonging to specialized, old, local or regional collections. Thanks to their digital libraries, traditional libraries could achieve a long-standing dream and fulfill two goals which used to be in contradiction - book preservation and book communication. On the one hand, books were taken off their shelves only once, to be scanned. On the other hand, books could easily be accessed anywhere at any time and read on a computer screen, without the need to go to the library and struggle through a lengthy process to get access to the originals. That process had many hurdles: concern for the preservation of rare and fragile documents, reduced opening hours, forms to fill out, long waiting periods to get the document, and shortage of staff. Getting over them often required unfailing patience and out-of-the-ordinary determination from the researcher.
Some virtual libraries were created from scratch, right on the internet from the beginning, with no back up from a traditional library. This was the case of Athena, founded in 1994 by Pierre Perroud, a Swiss teacher, and hosted on the website of the University of Geneva, Switzerland. Athena was created as a multilingual digital library focusing on philosophy, science, classics, literature, history, and economics. As Geneva is in French-speaking Switzerland, it also focused on putting French texts online. The Helvetia section gathered documents about Switzerland. A specific page offered a number of links to other digital libraries in the world.
In an interview dated February 1996, Pierre Perroud explained: "Electronic texts represent an encouragement to reading and a convivial participation to culture dissemination, (…) [and] a good complement to the paper book, which remains irreplaceable for reading (…). [The paper book] remains a mysteriously holy companion with profound symbolism for us: we grip it in our hands, we hold it against our bodies, we look at it with admiration; its small size comforts us and its content impresses us; its fragility contains a density we are fascinated by; like man it fears water and fire, but it has the power to shelter man's thoughts from time." (excerpt from the Swiss magazine Informatique-Informations)
The Internet Public Library (IPL) opened in March 1995 as the first digital public library of and for the internet community. Its different sections were: Reference, Exhibits, Magazines and Serials, Newspapers, Online Texts, and Web Searching. There were also sections for Teen and Youth. All the items of the collections were carefully selected, catalogued and described by the IPL staff. As an experimental library, IPL also listed the most interesting projects run by librarians on the internet, in the section Especially for Librarians.
1994: BOLD PUBLISHERS
[Overview]
Some publishers decided to use the web as a new marketing tool. In the U.S., NAP (National Academy Press) was the first publisher in 1994 to post the full text of some books, for free, with the authors' consent. NAP was followed by MIT Press (MIT: Massachusetts Institute of Technology) in 1995. Michael Hart, founder of Project Gutenberg, wrote in 1997: "As university publishers struggle to find the right business model for offering scholarly documents online, some early innovators are finding that making a monograph available electronically can boost sales of hard copies." (excerpt from the Project Gutenberg Newsletter of October 1997)
[In Depth (published in 1999)]
The web became a marketing tool for publishers. Some publishers decided to put the full text of some books on the web, for free, with their authors' consent. Oddly enough, there was no drop in sales - on the contrary, sales increased. In the US, NAP was the first publisher to take such a risk in 1994, followed by the MIT Press in 1995, and it worked.
NAP (National Academy Press) was created by the National Academy of Sciences to publish its own reports and the ones of the National Academy of Engineering, the Institute of Medicine, and the National Research Council. In 1994, NAP was publishing 200 books a year in science, engineering, and health. The new NAP Reading Room offered 1,000 entire books, available online for free in various formats ("image" format, HTML format and PDF format).
In 1995, the MIT Press (MIT: Massachusetts Institute of Technology) was publishing 200 new books a year and 40 journals, first in science and technology, and then in architecture, social theory, economics, cognitive science, and computational science. The MIT Press decided to put a number of books online for free, as "a long-term commitment to the efficient and creative use of new technologies." Sales of the print books increased.
Michael Hart, founder of Project Gutenberg, wrote in 1997: "As university publishers struggle to find the right business model for offering scholarly documents online, some early innovators are finding that making a monograph available electronically can boost sales of hard copies. The National Academy Press has already put 1,700 of its books online, and is finding that the electronic versions of some books have boosted sales of the hard copy monographs - often by two to three times the previous level. It's 'great advertising', says the Press's director. The MIT Press is experiencing similar results: 'For each of our electronic books, we've approximately doubled our sales. The plain fact is that no one is going to sit there and read a whole book online. And it costs money and time to download it'." (excerpt from the Project Gutenberg Newsletter of October 1997)
1995: AMAZON.COM
[Overview]
Amazon.com was a "pioneer" online bookstore that created an entirely new economic model. Amazon.com was launched by Jeff Bezos in July 1995, in Seattle, on the west coast of the U.S., after a market study which led him to conclude that books were the best products to sell on the internet. When Amazon.com started, it had 10 employees and a catalog of 3 million books. Unlike traditional bookstores, Amazon.com didn't have windows looking out on the street and books skillfully lined up on shelves or piled upon displays. The virtual window is its website, with all transactions made through the internet. Books are stored in huge storage facilities before being put into boxes and sent by mail. In November 2000, Amazon.com had 7,500 employees, a catalog of 28 million items, 23 million clients worldwide and four subsidiaries: in the UK (August 1998), in Germany (August 1998), in France (August 2000) and in Japan (October 2000). A fifth subsidiary opened in Canada in June 2002. A sixth subsidiary - named Joyo - opened in China in September 2004.
[In Depth (published in 1999)]
Jeff Bezos launched Amazon.com in July 1995, after a market study which led him to conclude that books were the best products to sell on the internet.
In Spring 1994, he drew up a list of twenty products that could be sold online, ranging from clothing to gardening tools, and then researched the top five, which were CDs, videos, computer hardware, computer software, and books.
"I used a whole bunch of criteria to evaluate the potential of each product, but among the main criteria was the size of the relative markets. Books, I found out, were an $82 billion market worldwide. The price point was another major criterion: I wanted a low-priced product. I reasoned that since this was the first purchase many people would make online, it had to be non-threatening in size. A third criterion was the range of choice: there were 3 million items in the book category and only a tenth of that in CDs, for example. This was important because the wider the choice, the more the organizing and selection capabilities of the computer could be put in good use." (excerpt from the Amazon.com press kit)
In 1998, Amazon.com was offering 3 million books, CDs, audio books, DVDs, computer games - more than 14 times as many titles as the large chain superstores - to 3 million people in 160 countries. "Businesses can do things on the web that simply cannot be done any other way", wrote Jeff Bezos. "We are changing the way people buy books and music." Amazon.com quickly became the largest online bookstore, with its catalog of 3 million items that could be ordered online, authoritative reviews, author interviews, excerpts, customer reviews, and book recommendations. As an internet retailer, Amazon.com could offer more services than traditional retailers: lower prices, larger selection, and a wealth of product information.
Any book lover could post his own reviews of books on Amazon's website, and read others. He could read many interviews with authors, and a number of blurbs and excerpts from books. He could search for books by author, subject, title, ISBN or publication date. Prices were discounted, with savings of 20-40% on 400,000 titles (40% on selected feature books, 30% on hardcovers, and 20% on paperbacks). The client usually received the books within a week. If he requested it, he could receive an email announcing a new book by a favorite author or a new book on a favorite topic. He could select some book categories (44 listed), to be sent a monthly review of new books by email. All these things were entirely new at the time.
What we take for granted now, i.e. buying a book in Europe from the US site of Amazon.com, or buying a book in the US from the German site of Amazon.de, was making big waves at the time, first as "unfair competition" with the local online bookstores, then for taxation. A first outline agreement was concluded between the US and the European Union in December 1997, and this agreement was followed by an international convention. The internet was declared a free trade area, i.e. without any customs duties for software, films and electronic books bought online. Material goods (books, CDs, DVDs, and so on) and services were subject to existing regulations, with collection of the VAT for example, but with no additional customs duties.
Amazon.com and others had great assets, but there was bad news for small bookstores, like the small bookstore set up in 1971 by my friend Catherine Domain in central Paris, on the island Ile Saint-Louis, surrounded by the Seine river.
The small Ulysses Bookstore is known as the oldest travel bookstore in the world. It has more than 20,000 books, maps and magazines, out of print and new, in a number of languages, about any country and any kind of travel, all packed up in a tiny space. Catherine has been a traveller since she was a child. She travels every summer - usually sailing - while her boyfriend runs the bookstore. She is also a member of the French National Union of Antiquarian and Modern Bookstores (SLAM), the Explorers' Club and the International Club of Long-Distance Travellers.
Catherine has visited 140 countries, where she sometimes had a hard time. But one of her most difficult challenges was to set up a website on her own, from scratch, without knowing anything about computers. Catherine wrote in December 1999: "My site is still pretty basic and under construction. Like my bookstore, it is a place to meet people before being a place of business. The internet is a pain in the neck, takes a lot of my time and I earn hardly any money from it, but that doesn't worry me…" Nevertheless, despite the internet, she was pessimistic about the future. "I am very pessimistic, because the internet is killing off specialist bookstores."
1995: ONLINE PRESS
[Overview]
The first electronic versions of print newspapers were available in the early 1990s through commercial services like America Online and CompuServe. In 1995, newspapers and magazines began creating their own websites to offer a partial or full version of their latest issue - available freely or through subscription (free or paid) - with online archives. In Europe, the Times and the Sunday Times set up a common website called Times Online, with a way to create a personalized edition. The weekly publication The Economist also went online in the UK, as did the weeklies Focus and Der Spiegel in Germany, the dailies Le Monde and Libération in France, and the daily El País in Spain. The computer press logically went online as well, like the monthly Wired, created in 1992 in California to cover cyberculture as "the magazine of the future at the avant-garde of the 21st century", or ZDNet, another leading computer magazine. More and more "only" electronic magazines were also created.
[In Depth (published in 1999)]
The first electronic versions of newspapers were available in the early 1990s through commercial services like America Online or CompuServe. Then, in 1995, newspapers and magazines began to create websites to offer the full version of their latest issue - available freely or through subscription (free or paid) - which was then archived online. There were also heated debates on copyright issues for articles posted on the web. More and more "only" electronic magazines were created.
In 1996, the New York Times site could be accessed free of charge. It included the contents of the daily newspaper, breaking news updates every ten minutes, and original reporting available only online. The Washington Post site provided the daily news online, with a full database of articles including images, sound and video.
In Europe, the Times and the Sunday Times set up a common website called Times Online, with the option of creating a personalized edition. The respected Economist was also available online, as were the French daily newspapers Le Monde and Libération, the Spanish daily newspaper El País, and the German weekly magazines Focus and Der Spiegel.
The computer press went online as well: first the monthly Wired, created in 1992 in California to focus on cyberculture and be the magazine of the future at the avant-garde of the 21st century, then ZDNet, a major publisher of computer magazines.
Behind the news, the web was providing a whole encyclopedia to help us understand it. The web was providing instant access to a wealth of information (geographical maps, biographical notes, official texts, political and economic data, audiovisual and video data); speed in information dissemination; access to major photographic archives; links to articles, archives and data on the same topic; and a search engine to browse articles by date, author, title, subject, etc.
From the start, there were also all these zines using the internet as a cheap way to get published. John Labovitz launched The E-Zine-List in Summer 1993 to list electronic zines (e-zines) around the world, the ones that were accessible via the web, FTP, gopher, email, and other services. The list was updated monthly.
What exactly is a zine? John Labovitz explained on his website: "For those of you not acquainted with the zine world, 'zine' is short for either 'fanzine' or 'magazine', depending on your point of view. Zines are generally produced by one person or a small group of people, done often for fun or personal reasons, and tend to be irreverent, bizarre, and/or esoteric. Zines are not 'mainstream' publications - they generally do not contain advertisements (except, sometimes, advertisements for other zines), are not targeted towards a mass audience, and are generally not produced to make a profit. An 'e-zine' is a zine that is distributed partially or solely on electronic networks like the internet."
3,045 zines were listed on November 29, 1998. John wrote on his website: "Now the e-zine world is different. The number of e-zines has increased a hundredfold, crawling out of the FTP and Gopher woodworks to declaring themselves worthy of their own domain name, even asking for financial support through advertising. Even the term 'e-zine' has been co-opted by the commercial world, and has come to mean nearly any type of publication distributed electronically. Yet there is still the original, independent fringe, who continue to publish from their heart, or push the boundaries of what we call a 'zine'." John stopped updating his list a few years later.
1996: INTERNET ARCHIVE
[Overview]
Founded in April 1996 by Brewster Kahle, the Internet Archive is a non-profit organization that built an "internet library" to offer permanent access to historical collections in digital format for researchers, historians and scholars. An archive of the web is stored every two months or so. In October 2001, with 30 billion web pages stored, the Internet Archive launched the Wayback Machine, for users to be able to surf the archive of the web by date. In 2004, there were 300 terabytes of data, with a growth of 12 terabytes per month. In 2006, there were 65 billion pages from 50 million websites. In late 1999, the Internet Archive also started to include more collections of archived web pages on specific topics. It also became an online digital library of text, audio, software, image and video content. In October 2005, the Internet Archive launched the Open Content Alliance (OCA) with other contributors as a collective effort to build a permanent archive of multilingual digitized text (Text Archive) and multimedia content.
1996: NEW WAYS OF TEACHING
[Overview]
With more and more computers available in schools and at home, and more and more internet connections, teachers began exploring new ways of teaching. Going from print book culture to digital culture was changing their relationship to knowledge, and the way both scholars and students were seeing teaching and learning. Print book culture provided stable information whereas digital culture provided "moving" information. During the September 1996 meeting of IFIP (International Federation of Information Processing), Dale Spender gave a lecture about Creativity and the Computer Education Industry, with insightful comments on forthcoming trends.
[In Depth (published in 1999)]
Going from print book culture to digital culture began changing our relationship to knowledge. Book culture provided stable information whereas digital culture provided "moving" information. During the September 1996 meeting of the IFIP (International Federation of Information Processing), Dale Spender gave an interesting lecture about Creativity and the Computer Education Industry.
Here are some excerpts:
"Throughout print culture, information has been contained in books - and this has helped to shape our notion of information. For the information in books stays the same - it endures.
And this has encouraged us to think of information as stable - as a body of knowledge which can be acquired, taught, passed on, memorized, and tested of course.
The very nature of print itself has fostered a sense of truth; truth too is something which stays the same, which endures. And there is no doubt that this stability, this orderliness, has been a major contributor to the huge successes of the industrial age and the scientific revolution. (…)
But the digital revolution changes all this. Suddenly it is not the oldest information - the longest lasting information that is the most reliable and useful. It is the very latest information that we now put the most faith in - and which we will pay the most for. (…)
Education will be about participating in the production of the latest information. This is why education will have to be ongoing throughout life and work. Every day there will be something new that we will all have to learn. To keep up. To be in the know. To do our jobs. To be members of the digital community. And far from teaching a body of knowledge that will last for life, the new generation of information professionals will be required to search out, add to, critique, 'play with', and daily update information, and to make available the constant changes that are occurring."
1996: PALM PILOT
[Overview]
In the 1990s, Jacques Gauchey was a journalist and writer living in Silicon Valley and specializing in IT (information technology). He was also working as a "facilitator" between the United States and Europe. Jacques was among the first to buy a Palm Pilot in March 1996, and wrote about it in his free online newsletter. As a side remark, he remembered in July 1999: "In 1996 I published a few issues of a free English newsletter on the internet. It had about 10 readers per issue until the day (in January 1996) when the electronic version of Wired Magazine created a link to it. In one week I got about 100 emails, some from French readers of my book La vallée du risque - Silicon Valley [editor's note: The Valley of Risk - Silicon Valley, published by Plon, Paris, in 1990], who were happy to find me again." He added: "All my clients now are internet companies. All my working tools (my mobile phone, my PDA and my PC) are or will soon be linked to the internet." Despite fierce competition, Palm stayed the leader in the PDA market, with 23 million Palm Pilots sold between 1996 and 2002.
1997: DIGITAL PUBLISHING
[Overview]
Digital publishing became mainstream in 1997. This was a new step in the changes undergone by the traditional publishing chain since the 1970s. The traditional printing business was first disrupted by new photocomposition machines, with lower costs. Text and image processing began to be handed over to desktop publishing shops and graphic art studios. Printing costs went on decreasing with desktop publishing, photocopiers, color photocopiers and digital printing equipment. Digitization also accelerated the publication process. Editors, designers and other contributors could all work at the same time on the same book. For educational, academic and scientific publications, online publishing became a cheaper solution than print books, with the possibility of regular updates to include the latest information.
[In Depth (published in 1999)]
Since the 1970s, the traditional publishing chain has drastically changed. The printing work done by pre-press shops was first disrupted by new photocomposition machines. Text and image processing began to be handed over to advertising and graphic art agencies. Printing costs went on decreasing with desktop publishing, copiers, color copiers and digital printing equipment.
In 1997, text and image processing was provided at a low price by desktop publishing shops and graphic art studios. Digitization accelerated the publication process. Editors, designers and other contributors could all work at the same time on the same book.
Digitization also made possible the online publishing of educational and scientific publications, which appeared as a far better solution than print books, because they could be regularly updated with the latest information. Some universities began distributing their own textbooks online, with chapters selected in an extensive database, and articles and commentaries from professors. For a seminar, a small print run could be made upon request with a selection of online articles sent to a printer.
Electronic publishing allowed some academic publishers to keep running their business, with lower costs and quick access. This way, small publishers went on publishing specialized books, for which printing a small number of copies had become more and more difficult over the years for budgetary reasons. These books could now be regularly updated, and their readers could benefit from the latest version. Readers didn't need to wait any more for a new printed edition, often postponed if not cancelled because of commercial constraints.
Electronic publishing and traditional publishing became complementary. The frontier between the two media - electronic and paper - was vanishing. Most recent print media already stemmed from an electronic version anyway, on a word processor, a spreadsheet or a database. More and more documents became only electronic. And more and more print books were scanned to be included in digital bookstores and libraries.
At the end of the 1990s, there were no reliable statistics yet proving that the large-scale use of computers and electronic documents would make us paperless and save some trees, as hoped by all of us who believe in nature preservation. We were still in a transition period. Many people still needed a print version for easier reading, or to keep track of a document in case the electronic file was accidentally deleted, or to have some paper support for their documentation or archives.
1997: LOGOS DICTIONARY
[Overview]
Logos is a leading translation company located in Modena, Italy. In 1997, Logos had 200 in-house translators in Modena and 2,500 free-lance translators worldwide, who processed around 200 texts per day. The company made a bold move at the time, and decided to put on the web all the linguistic tools used by its translators, for the internet community to freely use them as well. The linguistic tools were the Logos Dictionary, a multilingual dictionary with 7 million words (in Fall 1998); the Logos Wordtheque, a multilingual library with 300 million words extracted from translated novels, technical manuals and other texts; the Logos Linguistic Resources, a database of 500 glossaries; and the Logos Universal Conjugator, a database for verbs in 17 languages.
[In Depth (published in 1999)]
The Logos Dictionary is a multilingual dictionary with 7,580,560 words (as of December 10, 1998). The Logos Wordtheque is a word-by-word multilingual library with a massive database of 325,916,827 words extracted from multilingual novels, technical literature and translated texts. Logos Linguistic Resources is a database of 553 glossaries. The Logos Universal Conjugator is a database for the conjugation of verbs in 17 languages.
Logos is an international translation company based in Modena, Italy. In 1997, Logos decided to put all the linguistic tools used by its translators on the web for free. Logos had 200 in-house translators and 2,500 free-lance translators all over the world, who processed around 200 texts per day.
When interviewed by Annie Kahn in the French daily newspaper Le Monde of December 7, 1997, Rodrigo Vergara, the head of Logos, explained: "We wanted all our translators to have access to the same translation tools. So we made them available on the internet, and while we were at it we decided to make the site open to the public. This made us extremely popular, and also gave us a lot of exposure. The operation has in fact attracted a great number of customers, but also allowed us to widen our network of translators, thanks to the contacts made in the wake of the initiative."
In the same article, Annie Kahn wrote: "The Logos site is much more than a mere dictionary or a collection of links to other online dictionaries. A system cornerstone is the document search software, which processes a corpus of literary texts available free of charge on the web. If you search for the definition or the translation of a word ('didactique', for example), you get not only the answer sought, but also a quote from one of the literary works containing the word (in our case, an essay by Voltaire). All it takes is a click on the mouse button to access the whole text or even to order the book, thanks to a partnership agreement with Amazon.com, the famous online bookstore. Foreign translations are also available. However, if no text containing the required word is found, the system acts as a search engine, sending the user to other websites mentioning the term in question. In the case of certain words, you can even hear the pronunciation. If there is no translation currently available, the system calls on the public to contribute. Everyone can make their own suggestions, after which Logos translators and the company check the forwarded translations."
1997: MULTIMEDIA CONVERGENCE
[Overview]
As more and more people were using digital technology, previously distinct information-based industries, such as printing and publishing, graphic design, media, sound recording and film making, were converging into one industry, with information as a common product. This trend, named "multimedia convergence", came with a massive loss of jobs, and was a serious enough issue to be tackled by the ILO (International Labor Organization) by 1997. The first ILO Symposium on Multimedia Convergence was held in January 1997 at ILO headquarters in Geneva, Switzerland. This international symposium was a tripartite meeting with employers, unionists, and government representatives. Some participants, mostly employers, argued that the information society was generating or would generate jobs, whereas other participants, mostly unionists, argued that there was a rise in unemployment worldwide.
[In Depth (published in 1999)]
The first ILO Symposium on Multimedia Convergence was held in January 1997 at the headquarters of ILO (International Labor Office) in Geneva, Switzerland.
Peter Leisink, associate professor of labor studies at Utrecht University, Netherlands, explained: "A survey of the United Kingdom book publishing industry showed that proofreaders and editors have been externalized and now work as home-based teleworkers. The vast majority of them had entered self-employment, not as a first-choice option, but as a result of industry mergers, relocations and redundancies. These people should actually be regarded as casualized workers, rather than as self-employed, since they have little autonomy and tend to depend on only one publishing house for their work."
This international symposium was held as a tripartite meeting with employers, unionists and government representatives. Some participants still thought our information society would generate jobs, whereas it was already being reported worldwide that multimedia convergence was leading to a massive loss of jobs.
Michel Muller, secretary-general of the French Federation of Book, Paper and Communication Industry, stated that the French graphics industry had lost 20,000 jobs - falling from 110,000 to 90,000 - within the last decade, and that expensive social plans had been necessary to re-employ those people. He explained: "If the technological developments really created new jobs, as had been suggested, then it might have been better to invest the money in reliable studies about what jobs were being created and which ones were being lost, rather than in social plans which often created artificial jobs. These studies should highlight the new skills and qualifications in demand as the technological convergence process broke down the barriers between the printing industry, journalism and other vehicles of information. Another problem caused by convergence was the trend towards ownership concentration. A few big groups controlled not only the bulk of the print media, but a wide range of other media, and thus posed a threat to pluralism in expression. Various tax advantages enjoyed by the press today should be re-examined and adapted to the new realities facing the press and multimedia enterprises. Managing all the social and societal issues raised by new technologies required widespread agreement and consensus. Collective agreements were vital, since neither individual negotiations nor the market alone could sufficiently settle these matters."
Quite theoretical compared to the unionists' interventions, here was the answer of Walter Durling, director of AT&T Global Information Solutions: "Technology would not change the core of human relations. More sophisticated means of communicating, new mechanisms for negotiating, and new types of conflicts would all arise, but the relationships between workers and employers themselves would continue to be the same. When film was invented, people had been afraid that it could bring theatre to an end. That has not happened. When television was developed, people had feared that it would do away with cinemas, but it had not. One should not be afraid of the future. Fear of the future should not lead us to stifle creativity with regulations. Creativity was needed to generate new employment. The spirit of enterprise had to be reinforced with the new technology in order to create jobs for those who had been displaced. Problems should not be anticipated, but tackled when they arose." In short, humanity shouldn't fear technology.
In fact, employees were not so much afraid of the future as they were afraid of losing their jobs. In 1997, our society already had a high unemployment rate, which was not the case when film was invented and television developed. During the next years, what would be the balance between job creation and layoffs? Unions were struggling worldwide to promote the creation of jobs through investment, innovation, vocational training, computer literacy, retraining for new jobs, fair conditions for contracts and collective agreements, defense of copyright, protection of workers in the artistic field, and defense of teleworkers as workers having full rights. The European Commission was expecting 10 million European teleworkers in the year 2000, which would represent 20% of teleworkers worldwide.
Despite unions' efforts, would the situation become as tragic as what we read in the report of the symposium? "Some fear a future in which individuals will be forced to struggle for survival in an electronic jungle. And the survival mechanisms which have been developed in recent decades, such as relatively stable employment relations, collective agreements, employee representation, employer-provided job training, and jointly funded social security schemes, may be sorely tested in a world where work crosses borders at the speed of light."