User guide

Using the website

Electronic texts

Search engine

Electronic images

Darwin Online database

Charles Darwin c. 1854


Using the website

Texts can be located via the table of contents pages (Publications, Manuscripts and Supplementary works), via the search pages for the Freeman bibliographical database and the Darwin Manuscript catalogue, or via the Advanced search page, which searches all content. See Search help for advanced search options. A basic entryway is provided on the Major Works page. See also: Site map

For most internet browsers, to search within an open document, press Ctrl+F (Control and F simultaneously) and a find box will appear.

Navigating within large documents: to jump to the top of a document press Home and to jump to the bottom press End. To jump to a particular page, use the drop down menu at the top of each document, between the Back and Next buttons (see screen shot below). One can also press Page Up and Page Down or use the window's scroll bar. The arrow keys and mouse scroll wheel also move one up or down through a text or image.

Drop down box example

Font size is controlled by internet browsers. To increase or decrease the size of the text on the screen, hold Ctrl and scroll with the mouse wheel, or press Ctrl and + or Ctrl and -.

To view two documents at once, open a second browser window (normally Ctrl n) and arrange the two side-by-side or one above the other. One can also adjust the size of the two windows in side-by-side view by clicking and dragging the central scroll bar right or left. The top frame containing the navigation buttons can be enlarged or decreased by dragging the frame divider.

Citations. If you use Darwin Online in your research or publication, please cite it as follows:

Wyhe, John van ed., 2002- The Complete Work of Charles Darwin Online (http://darwin-online.org.uk/)

Giving only the URL (web address) is not a proper citation, just as giving only the ISBN of a book would not be. There should be no "www." at the front of the Darwin Online web address, although links that include it will also function.

The materials on Darwin Online can also be cited in the conventional manner using the short reference found in the Record at the top of each document (see Publications below) e.g.:

Darwin, C. R. 1869. On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. London: John Murray. 5th edition. Tenth thousand, p. 53.

To quote the material as explicitly from the website the following is suggested:

Darwin, C. R. 1869. On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. London: John Murray. 5th edition. Tenth thousand, p. 53 (http://darwin-online.org.uk/content/frameset?itemID=F387&viewtype=text&pageseq=82) RN1.

To cite new manuscript transcriptions on Darwin Online, the transcriber and/or editor of the document should be cited. Citing only the document itself implies that you have transcribed the original document. Using the work of others and allowing it to appear to be one's own is plagiarism. There have been several instances of published plagiarism of Darwin Online transcriptions by popular writers, scientists and even historians of science.

To download or print images use the PDF versions provided on the contents pages. Many of these files are very large and will take some time to download. A free PDF reader is available here.

The search engine

There are two kinds of searchable material on this site:

1. Electronic documents, such as books and transcribed manuscripts. Search the texts here.

2. A database of bibliographical and manuscript archive records. Search the database: Bibliography or Manuscripts.

The search facilities of this site are built on the popular and stable open-source Lucene search engine, which is used by numerous institutions and businesses around the world, including JSTOR - The Scholarly Journal Archive and NSDL - The National Science Digital Library. This use of Lucene 'fuses together' the free text indexes acquired from the electronic documents themselves with the traditional bibliographic and manuscript records to present a unified view of the data, integrating date and metadata. See Search help for advanced search options such as Boolean, proximity and fuzzy searching.
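As a rough illustration of the Boolean, proximity and fuzzy options mentioned above, the sketch below composes query strings in the syntax of Lucene's classic query parser. The helper names are hypothetical, and it is an assumption that the Darwin Online search boxes accept exactly this syntax; Search help remains the authority.

```python
# A minimal sketch of Lucene-style query strings (classic query parser syntax).
# Helper names are hypothetical; whether the site accepts every form is an assumption.

def phrase(words: str) -> str:
    """Quote several words so they are matched as one exact phrase."""
    return f'"{words}"'


def proximity(words: str, distance: int) -> str:
    """Match the quoted words when they occur within `distance` positions."""
    return f'"{words}"~{distance}'


def fuzzy(term: str) -> str:
    """Match terms spelled similarly to `term`."""
    return f"{term}~"


def combine(operator: str, *clauses: str) -> str:
    """Join clauses with a Boolean operator and group them in parentheses."""
    if operator not in {"AND", "OR"}:
        raise ValueError("operator must be 'AND' or 'OR'")
    return "(" + f" {operator} ".join(clauses) + ")"


query = combine("AND", phrase("natural selection"), fuzzy("barnacle"))
print(query)
```

A proximity search for two words within five positions of each other would be written `proximity("coral reef", 5)`.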

Three main types of search interface are currently provided.

Unless restricted, Advanced Search searches the entire website, including all documents, the bibliography and the manuscript catalogue. The advanced search includes 'Darwin Charles Robert' and 'English' language by default, but these can be deleted or changed by the user.

Freeman Bibliographical Database Search searches only the Freeman 'F' and additional 'A' entries in the database.

Darwin Manuscript Catalogue Search searches only the manuscript records in the database.

The electronic texts

'Man is but a worm' from Punch

Categories

The electronic texts are divided into four categories:

Books and Pamphlets

Contains printed individual items written by or containing contributions by Darwin or works in a series to which he contributed or edited. Also included are works which printed manuscripts left unpublished by Darwin. Letters published in his lifetime and the major collections published by his family after his death are included here. Also included are modern published transcriptions of Darwin manuscripts. Those which have appeared in print are given a Freeman number.

Publications in Serials or Articles

Contains papers, notes and letters by Darwin which were originally published in serials (periodicals) in Darwin's lifetime or shortly after his death. Translations or summaries in other languages or quotations from Darwin's works are only included when they were given a number by Freeman. All of Darwin's articles have been edited and annotated here for the first time.

Manuscripts

Contains items left unpublished by Darwin or catalogued amongst his private papers in the Darwin Archive at the Cambridge University Library or elsewhere. Transcriptions for Darwin Online retain their archival manuscript number e.g. CUL-DAR158.1-76

Supplementary Works

A wide range of publications not written by Darwin, including secondary reference works, contemporary reviews of Darwin's books, obituaries, recollections of Darwin, published descriptions of Darwin's Beagle specimens by other scientists and important related works for students studying the period or discussing Darwin or his life and work.

Publications

The 1st edition of Origin of species, 1859

The electronic texts provided here have been produced in a number of ways and over several years. The first were transcribed by dedicated volunteers like Sue Asscher. Others have been produced by optical character recognition (OCR) software and then corrected by hand. Many of those created for Darwin Online by AEL Data are transcribed by double-key, double-compare entry, which provides excellent accuracy (99.995%). Although it is difficult to ensure that an electronic text is an exact reproduction of the original, it has a number of important advantages over facsimile images, including those which have background text search facilities. Electronic texts are much smaller and therefore faster to manipulate, and possess the very great advantage of being searchable for key words or phrases. The text can be copied out and pasted into the reader's notes. Most of the texts provided here are available in both text and image so readers can choose which form they wish to use and verify the accuracy of the transcriptions for themselves. Please email us if you spot any errors.

The transcriptions are meant to resemble the originals in every way relevant to most scholarly uses, but they are not meant to be strict type facsimiles. The font, text size and line breaks therefore do not exactly reproduce the original appearance. However, all printed characters, formatting and page breaks are accurately represented.

Only line breaks and hyphenation at the ends of lines are removed, because these are artifacts of typesetting which would impede searching and are irrelevant to quotations. Page breaks, even where a word is hyphenated across them, are scrupulously preserved because they are reflected in scholarly quotations and accurately represent the original. Italics, bold type and capitalization, which are retained when quoting, are meticulously preserved on Darwin Online.

For details of the coding and transcription of publications see the Print transcription policy.

Each document is provided in a single XHTML file. These are headed with a standard concise record entry (see screen shot above). This includes the:

author(s) or editor(s). publication date. title. place: publisher (or periodical title, volume, month and page numbers).

Following this is the item's revision history which records when the item was electronically scanned, transcribed, and further corrections or additions with dates and the names of those who have worked on them. At the end is an RN (Revision Number) which allows readers to determine if any changes to the content have been made since a previous visit.

Some items include a note providing an acknowledgement of reproduction permission or further details about the item. Editorial introductions, when available, are also given a link here. The record entry is divided from the historical document by a horizontal rule. The record and the document are further distinguished by different fonts.

 

When calling up a document the software takes one to the first page 'code' which leaves the record out of view just above it, unless one scrolls or pages up to see it. The end of the document is indicated by a second horizontal rule under which a standard footing is affixed providing a link back to the homepage and website copyright declaration and finally the date when the XHTML file was last updated. This date does not necessarily indicate a change in the file's textual content, but may simply record that a formatting or footer change has been carried out on the entire site.

Page breaks are indicated thus:

[page] 5

This indicates the start of page 5. In other instances, when no page number exists, as with end pages or with inserted plates, page breaks are indicated thus:

[page break]

Some selected works, such as the first edition of On the Origin of Species, include the original running headers. These page breaks appear, as in the original volume, thus:

[page] 8 VARIATION CHAP. I.

where '8' is the page number and 'VARIATION CHAP. I.' is the running title. This is exactly the string of characters used in the original text. (However, if the page number comes after the running header it is moved to the front of the line.)

Screen shot of side-by-side view
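The three page break forms described above can be recognised with a short parser. This is a minimal sketch that works on plain text copied from a document; the names `PageMarker` and `parse_page_marker` are hypothetical, and it assumes each marker sits on a line of its own.

```python
import re
from typing import NamedTuple, Optional


class PageMarker(NamedTuple):
    number: Optional[str]  # None for an unnumbered "[page break]"
    running_title: str     # empty when no running header is present


# "[page] <number>" optionally followed by a running title.
# The number is kept as text so that roman numerals also pass through.
_NUMBERED = re.compile(r"^\[page\]\s+(\S+)(?:\s+(.*\S))?\s*$")


def parse_page_marker(line: str) -> Optional[PageMarker]:
    """Return a PageMarker for a page break line, or None for ordinary text."""
    line = line.strip()
    if line == "[page break]":
        return PageMarker(None, "")
    match = _NUMBERED.match(line)
    if match is None:
        return None
    return PageMarker(match.group(1), match.group(2) or "")


for sample in ("[page] 5", "[page break]", "[page] 8 VARIATION CHAP. I."):
    print(sample, "->", parse_page_marker(sample))
```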

The electronic text can be viewed alone or, by clicking the link Text & Image view (where available), side-by-side with a facsimile of the original.

One can also bring up the side-by-side view at any time by clicking the page link in the Text view (such as [page] 8). By clicking the back or next buttons both views will advance together.

A robust browser, such as the free Mozilla Firefox, is recommended. Download it here.

Editorial notes or other insertions are made distinct by using a different font colour and size. Editorial references to other works are given in a short form as 'author date' e.g. 'Caldcleugh 1825'. These references are a link to the Bibliography of works cited on Darwin Online if the cited work is not available on this website. If the work is available on the website then the short reference is a link to the work itself.

The texts are prepared in accordance with the XHTML 1.0 (Transitional) DTD. UTF-8 encoding is used throughout. The content pages are presented in a stable URL space, housed in a navigation frameset to ease browsing complete items while keeping track of the relation between textual and image renditions of the same document. The browsing interface is designed to work well in any of the most recent generation of web browsers. It has been tested on Mozilla Firefox and versions of Internet Explorer 5 and above, SeaMonkey 1.0.2 (on SuSe Linux 10.1), KDE's Konqueror 3.5.1 (Linux) and Mac Safari 2.0.4 (419.3).

Despite being rendered in a frameset, the site is accessible to non-Javascript and non-frames-based browsers (including search engines) through a link to the plain content file present in the <noframes> area.

The top frame contains the name of the website (which is also a link to the homepage), button links to the main types of materials provided (Publications (electronic texts and images), Manuscripts, Biography and Credits), navigation buttons, a quick search box which searches the contents of all electronic texts, and a link to the advanced search. A drop down box in the top frame provides access to every page in the currently opened document. Using this it is easy to jump to separated plates, for example.

Links to texts or particular pages are permanent stable URLs. If a user wants to make a link to a particular page in the Origin of species one simply clicks on the page link, for example:
[page] 490 CONCLUSION. CHAP. XIV.

The URL displayed by the browser will then be:
http://darwin-online.org.uk/content/frameset?itemID=F373&viewtype=side&pageseq=507

i.e. dual text and image view. If a link to the text only page is desired, clicking Text view will generate the appropriate link for the text only view of that page:
http://darwin-online.org.uk/content/frameset?itemID=F373&viewtype=text&pageseq=507

This URL contains the domain (darwin-online.org.uk) followed by the item ID (F373), the view type and the page sequence, which in this case is 507 because the spine, covers and end pages are also included in the numbering sequence. The item ID is also the identifier in the bibliographical and manuscript database. It is therefore possible to change the ID field in the URL displayed in the browser to quickly open another document.

For example, if the browser currently displays:

http://darwin-online.org.uk/content/frameset?itemID=F373&viewtype=text&pageseq=1

Changing the item ID from F373 to F376 gives the URL for F376, the second edition of Origin of species:

http://darwin-online.org.uk/content/frameset?itemID=F376&viewtype=text&pageseq=1
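The URL pattern shown above can also be assembled or taken apart programmatically. The following is a minimal sketch using Python's standard library; the function names are hypothetical, and the only parameters it relies on are the three visible in the example URLs (itemID, viewtype and pageseq), with 'text' and 'side' as the two view types mentioned.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://darwin-online.org.uk/content/frameset"


def build_url(item_id: str, pageseq: int = 1, viewtype: str = "text") -> str:
    """Assemble a page URL from an item ID, page sequence and view type."""
    query = urlencode({"itemID": item_id, "viewtype": viewtype, "pageseq": pageseq})
    return f"{BASE}?{query}"


def parse_url(url: str) -> dict:
    """Extract the item ID, view type and page sequence from a page URL."""
    params = parse_qs(urlsplit(url).query)
    return {
        "itemID": params["itemID"][0],
        "viewtype": params["viewtype"][0],
        "pageseq": int(params["pageseq"][0]),
    }


# Swap the item ID while keeping the view type and page sequence.
current = build_url("F373", pageseq=1, viewtype="text")
parts = parse_url(current)
print(build_url("F376", parts["pageseq"], parts["viewtype"]))
```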

 

Manuscripts

A sample Darwin manuscript
A page from Darwin's barnacle notes

Manuscripts which are newly transcribed follow the same basic rules so that the site software can integrate all documents. Previously published transcriptions are treated as publications and so follow the rules above. For further details on how manuscripts are transcribed see the Manuscript Transcription Policy.

Darwin Online provides the largest collection of Darwin's manuscripts and private papers ever published. The great majority of these are taken from microfilms of the Cambridge University Library collection. Browse the papers here. See a general introduction to Darwin's manuscripts and private papers here.

 

The electronic images

For information on reproducing images from Darwin Online click here.

Images of Publications

Most of the images provided on this site were scanned by Darwin Online using a Bell & Howell 730DC FB scanner at 300dpi (dots per inch) as 16 million colour TIFFs. Special delicate materials have been commercially scanned or photographed by the Cambridge Anatomy Department Imaging group. Normal printed book page images average about 25 megabytes. Special pages such as maps, illustrations or diagrams are scanned at 400dpi and in the case of some of the Zoology engravings, at 500 dpi. Image files are named according to the rule:

date_titleword_identifier_number.extension

So, for example, an image might be named 1859_Origin_F373_004.tif
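The naming rule can be checked or generated with a few lines of code. This is a minimal sketch with hypothetical names; it assumes that no field contains an underscore and that the number is zero-padded to three digits, which is inferred from the single example above rather than stated in the rule.

```python
import re
from typing import NamedTuple


class ImageName(NamedTuple):
    date: str
    titleword: str
    identifier: str
    number: int
    extension: str


# date_titleword_identifier_number.extension
_NAME = re.compile(
    r"^(?P<date>[^_]+)_(?P<titleword>[^_]+)_(?P<identifier>[^_]+)"
    r"_(?P<number>\d+)\.(?P<extension>\w+)$"
)


def parse_image_name(name: str) -> ImageName:
    """Split an image file name into the fields of the naming rule."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid image name: {name!r}")
    return ImageName(
        match["date"],
        match["titleword"],
        match["identifier"],
        int(match["number"]),
        match["extension"],
    )


def format_image_name(date: str, titleword: str, identifier: str,
                      number: int, extension: str = "tif") -> str:
    """Build a file name; three-digit padding is an assumption from the example."""
    return f"{date}_{titleword}_{identifier}_{number:03d}.{extension}"


print(parse_image_name("1859_Origin_F373_004.tif"))
```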

These TIFFs are then converted into more compact, but equal quality, PNG files (which are about 11 megabytes each). All images are then burned on DVD in three copies (of two different manufacturers) which are kept in two locations. An index is kept of the content of the backup DVDs. Two further copies are kept on external hard drives.

These images are then sent to AEL Data to split each dual page scan into two, so that each page becomes a separate file. Back in Cambridge the files are then downsized to 44% of their original size and converted to JPEG with a 90% quality setting (which at this resolution results in no visible loss of quality) for web display and a copyright notice added to the image (applicable to the image, not Darwin's out of copyright words).

Many of the images are simply scanned from photocopies or were scanned in greyscale by commercial companies or contributors without the facility for colour scanning. Almost all of the shorter publications and supplementary text images were collected for the research of John van Wyhe. Most of these were photocopies or scanned from originals as 400dpi bitonal or greyscale TIFFs.

Some items are currently available only in image form. These are not yet searchable.

The controls for the image viewing window are based on those of the Newton Project. The default scaling is 1:1. If an image is too large to fit within the window at that scale, it is scaled down to fit the width of the window. One can choose any scaling mode with the buttons provided. When advancing to the next image the chosen image scaling is preserved. Most images are presently provided at a resolution viewed best at 1:1 at typical monitor resolutions. Given further funding it will be possible to also provide high-resolution images.

PDFs

A copy of most images has been provided in PDF format to allow readers to download documents onto their own computers for off-line viewing and to print them. PDF documents also possess multiple viewing options such as thumbnail views and image side-by-side viewing. The PDFs are not searchable. A free reader is available here.

Images of Manuscripts

Darwin's Journal

The manuscript images are very heterogeneous, reflecting their diverse provenances. Most are scanned from microfilms at 300dpi greyscale before being downsized, usually by 80%, for web display. Unfortunately some of the microfilm images suffered loss of contrast when copied for us. We have digitally enhanced these where possible. Others have been provided or scanned in colour. It is hoped that at some stage it will be possible to scan the manuscripts in colour (but for this substantially more funding is required). It is important to note that the manuscript images remain copyright of the owner(s). Any requests to reproduce them must be addressed to the owner(s) and not Darwin Online.

The Darwin Online database

The Freeman bibliographical database and the Darwin manuscript catalogue are actually contained in a single database, although usually treated as separate on this website. Each has its own introduction: Freeman Bibliographical database and Darwin manuscript catalogue. The c. 80,000 entries are fully normalized.

The content for both was first entered or imported into an interim Access database used for desktop editing. This is converted to a plain text file in the UTF-8 character set for use by the Lucene search engine indexing and website software.

Technical documentation:

Basman, Antranig. Technical Documentation for Darwin Online

Basman, Antranig. Content Markup Standard for Darwin Online

Basman, Antranig. Date Encoding Standard for Darwin Online

 

The Darwin Online concept, design, layout, graphics, editing and annotations are by John van Wyhe unless otherwise indicated. Web design by van Wyhe and Antranig Basman, new web design from 2012. Texts for digitization selected by John van Wyhe, Janet Browne and Jim Secord. See Credits and Acknowledgements for a complete list of contributors and those involved. For an overview of the history of Darwin Online see history.

John van Wyhe

October 2006

 

File last updated 23 August, 2012