Updated Version

NEF, University of Toronto, 2008

Copyright © 2008 Marie Lebert


From Project Gutenberg in 1971 to the Encyclopedia of Life in 2007, 38 milestones and as many pages, with an overview and an in-depth description for each milestone. This book is also available in French, with a different text. Both versions are available on the NEF <>.


Marie Lebert is a researcher and journalist specializing in technology and books, other media and languages. She is the author of Les mutations du livre (Mutations of the Book, in French, 2007) and Le Livre 010101 (The 010101 Book, in French, 2003). All her books have been published by NEF (Net des études françaises / Net of French Studies), University of Toronto, Canada, and are freely available online at <>.


Most quotations are excerpts from NEF interviews. With many thanks to all the persons who are quoted here, and who kindly answered my questions over the years. Most interviews are available online at <>.


With many thanks to Greg Chamberlain, Laurie Chamberlain, Kimberly Chung, Mike Cook, Michael Hart and Russon Wooldridge, who kindly edited and/or proofread some parts in previous versions. The author, whose mother tongue is French, is responsible for any remaining mistakes.




1968: ASCII
1971: Project Gutenberg
1974: Internet
1977: UNIMARC
1984: Copyleft
1990: Web
1991: Unicode
1993: Online Books Page
1993: PDF
1994: Library Websites
1994: Bold Publishers
1995:
1995: Online Press
1996: Palm Pilot
1996: Internet Archive
1996: New Ways of Teaching
1997: Digital Publishing
1997: Logos Dictionary
1997: Multimedia Convergence
1998: Online Beowulf
1998: Digital Librarians
1998: Multilingual Web
1999: Open eBook Format
1999: Digital Authors
2000:
2000: Online Bible of Gutenberg
2000: Distributed Proofreaders
2000: Public Library of Science
2001: Wikipedia
2001: Creative Commons
2002: MIT OpenCourseWare
2004: Project Gutenberg Europe
2004: Google Books
2005: Open Content Alliance
2006: Microsoft Live Search Books
2006: Free WorldCat
2007: Citizendium
2007: Encyclopedia of Life



Michael Hart, who founded Project Gutenberg in 1971, wrote: "We consider eText to be a new medium, with no real relationship to paper, other than presenting the same material, but I don't see how paper can possibly compete once people each find their own comfortable way to eTexts, especially in schools." (excerpt from a NEF interview, August 1998)

Tim Berners-Lee, who invented the web in 1989-90, wrote: "The dream behind the web is of a common information space in which we communicate by sharing information. Its universality is essential: the fact that a hypertext link can point to anything, be it personal, local or global, be it draft or highly polished. There was a second part of the dream, too, dependent on the web being so generally used that it became a realistic mirror (or in fact the primary embodiment) of the ways in which we work and play and socialize. That was that once the state of our interactions was on line, we could then use computers to help us analyse it, make sense of what we are doing, where we individually fit in, and how we can better work together." (excerpt from: The World Wide Web: A Very Short Personal History, May 1998)

John Mark Ockerbloom, who created The Online Books Page in 1993, wrote: "I've gotten very interested in the great potential the net had for making literature available to a wide audience. (…) I am very excited about the potential of the internet as a mass communication medium in the coming years. I'd also like to stay involved, one way or another, in making books available to a wide audience for free via the net, whether I make this explicitly part of my professional career, or whether I just do it as a spare-time volunteer." (excerpt from a NEF interview, September 1998)

Here is the journey we are going to follow:

1968: ASCII is a 7-bit coded character set.
1971: Project Gutenberg is the first digital library.
1974: The internet takes off.
1977: UNIMARC is set up as a common bibliographic format.
1984: Copyleft is a new license for computer software.
1990: The web takes off.
1991: Unicode is a universal double-byte character set.
1993: The Online Books Page is a list of free eBooks.
1993: The PDF format is launched by Adobe.
1994: The first library website goes online.
1994: Publishers put some of their books online for free.
1995: is the first main online bookstore.
1995: The mainstream press goes online.
1996: The Palm Pilot is the first PDA.
1996: The Internet Archive is founded to archive the web.
1996: Teachers explore new ways of teaching.
1997: Online publishing begins spreading.
1997: The Logos Dictionary goes online for free.
1997: Multimedia convergence is the topic of an international symposium.
1998: Library treasures like Beowulf go online.
1998: Librarians become webmasters.
1998: The web becomes multilingual.
1999: The Open eBook format is a standard for eBooks.
1999: Authors go digital.
2000: is a language portal.
2000: The Bible of Gutenberg goes online.
2000: Distributed Proofreaders digitizes books from the public domain.
2000: The Public Library of Science (PLoS) works on free online journals.
2001: Wikipedia is the first main online cooperative encyclopedia.
2001: Creative Commons works on new ways to respect authors' rights on the web.
2003: MIT offers its course materials for free in its OpenCourseWare.
2004: Project Gutenberg Europe is launched as a multilingual project.
2004: Google launches Google Print, later renamed Google Books.
2005: The Open Content Alliance (OCA) launches a world public digital library.
2006: Microsoft launches Live Search Books as its own digital library.
2006: The union catalog WorldCat goes online for free.
2007: Citizendium is a main online "reliable" cooperative encyclopedia.
2007: The Encyclopedia of Life will document all species of animals and plants.

[Unless specified otherwise, all quotations are excerpts from NEF interviews. These interviews are available online at <>.]

1968: ASCII


Used since the beginning of computing, ASCII (American Standard Code for Information Interchange) is a 7-bit coded character set for information interchange in English. It was published in 1968 by ANSI (American National Standards Institute), with updates in 1977 and 1986. The 7-bit plain ASCII, also called Plain Vanilla ASCII, is a set of 128 characters with 95 printable unaccented characters (A-Z, a-z, numbers, punctuation and basic symbols), i.e. the ones that are available on the English/American keyboard. Plain Vanilla ASCII can be read, written, copied and printed by any simple text editor or word processor. It is the only format compatible with 99% of all hardware and software. It can be used as it is or to create versions in many other formats. Extensions of ASCII (also called ISO-8859 or ISO-Latin) are sets of 256 characters that include accented characters as found in French, Spanish and German, for example ISO 8859-1 (Latin-1) for French.
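
The figures quoted above (128 characters, 95 of them printable) can be checked with a short Python sketch. The names all_codes, printable and is_plain_ascii are chosen here for illustration only and do not come from the standard.

```python
# A 7-bit code can represent 2**7 = 128 distinct values (code points 0-127).
all_codes = range(2 ** 7)

# Code points 32 (space) through 126 (tilde) are the printable characters;
# 0-31 and 127 are control characters.
printable = [chr(code) for code in all_codes if 32 <= code <= 126]


def is_plain_ascii(text: str) -> bool:
    """Return True if every character of text fits in 7 bits."""
    return all(ord(ch) < 128 for ch in text)


print(len(all_codes))                        # 128
print(len(printable))                        # 95
print(is_plain_ascii("Project Gutenberg"))   # True
print(is_plain_ascii("études"))              # False
```

The last line shows why accented text needs an extension such as ISO 8859-1: the character "é" has a code above 127 and does not fit in 7 bits.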

[In Depth (published in 2005)]

Whether digitized years ago or now, all Project Gutenberg books are created in 7-bit plain ASCII, called Plain Vanilla ASCII. When 8-bit ASCII (also called ISO-8859 or ISO-Latin) is used for books with accented characters like French or German, Project Gutenberg also produces a 7-bit ASCII version with the accents stripped. (This doesn't apply to languages that are not "convertible" to ASCII, like Chinese, encoded in Big-5.)
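
As an illustration of what "accents stripped" means in practice, here is a minimal Python sketch. It is not the tool Project Gutenberg uses; the function names strip_accents and to_plain_ascii are hypothetical, and the approach (Unicode decomposition followed by removal of combining marks) is only one possible way to do it.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Split each accented letter into base letter + accent, then drop the accent."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_plain_ascii(text: str) -> str:
    """Return a 7-bit version of text; anything still outside ASCII becomes '?'."""
    return strip_accents(text).encode("ascii", errors="replace").decode("ascii")


print(to_plain_ascii("Net des études françaises"))
# Net des etudes francaises
```

Letters that are not a base letter plus an accent (for example the German "ß") have no decomposition and would come out as "?" in this sketch, so a real conversion would need an explicit substitution table for them.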

Project Gutenberg sees Plain Vanilla ASCII as the best format by far. It is "the lowest common denominator." It can be read, written, copied and printed by any simple text editor or word processor on any electronic device. It is the only format compatible with 99% of hardware and software. It can be used as it is or to create versions in many other formats. It will still be in use when other formats are obsolete (or are already obsolete, like the formats of a few short-lived reading devices launched since 1999). It is the assurance that collections will never become obsolete, and will survive future technological changes. The goal is to preserve the texts not only over decades but over centuries. There is no other standard as widely used as ASCII right now, not even Unicode, a universal double-byte character encoding launched in 1991 to support any language and any platform.



1971: Project Gutenberg


In July 1971, Michael Hart created Project Gutenberg with the goal of making available for free, and electronically, literary works belonging to the public domain. A pioneer site in a number of ways, Project Gutenberg was the first information provider on the internet and is the oldest digital library. When the internet became popular in the mid-1990s, the project got a boost and gained an international dimension. The number of electronic books rose from 1,000 (in August 1997) to 5,000 (in April 2002), 10,000 (in October 2003), 15,000 (in January 2005), 20,000 (in December 2006) and 25,000 (in April 2008), with a current production rate of around 340 new books each month. With 55 languages and 40 mirror sites around the world, books are being downloaded by the tens of thousands every day. Project Gutenberg promotes digitization in "text format", meaning that a book can be copied, indexed, searched, analyzed and compared with other books. Contrary to other formats, the files are accessible for low-bandwidth use. The main source of new Project Gutenberg eBooks is Distributed Proofreaders, conceived in October 2000 by Charles Franks to help in the digitizing of books from the public domain.

[In Depth (published in 2005, updated in 2008)]

The electronic book (eBook) is now 37 years old, which is still a short life compared with the five-and-a-half-century-old print book. eBooks were born with Project Gutenberg, created by Michael Hart in July 1971 to make available for free electronic versions of literary books belonging to the public domain. A pioneer site in a number of ways, Project Gutenberg was the first information provider on an embryonic internet and is the oldest digital library. Long considered by its critics as impossible on a large scale, Project Gutenberg had 25,000 books in April 2008, with tens of thousands of downloads daily. To this day, nobody has done a better job of putting the world's literature at everyone's disposal, while creating a vast network of volunteers all over the world, without wasting people's skills or energy.

During the first twenty years, Michael Hart himself keyed in the first hundred books, with the occasional help of others. When the internet became popular, in the mid-1990s, the project got a boost and gained an international dimension. Michael still typed and scanned in books, but now coordinated the work of dozens and then hundreds of volunteers across many countries. The number of electronic books rose from 1,000 (in August 1997) to 2,000 (in May 1999), 3,000 (in December 2000) and 4,000 (in October 2001).

37 years after its birth, Project Gutenberg is running at full capacity. It had 5,000 books online in April 2002, 10,000 books in October 2003, 15,000 books in January 2005, 20,000 books in December 2006 and 25,000 books in April 2008, with 340 new books available per month, with 40 mirror sites worldwide, and

