Collection Management
BHL welcomes content into its collection that is relevant to the wide range of fields that support biodiversity research. We prioritize the selection of content from our BHL Partner collections as well as biodiversity-relevant materials available within the Internet Archive “texts” corpus. For more details about our selection process, please review our collection development policy.
Tags: selection, deduplication, collections, collection development, deaccession
BHL is committed to maintaining persistent and open access to the materials in our collection. We make our best effort to incorporate materials that are relevant to the broad scope of biodiversity, available as high-quality digital copies, and free of copyright restriction or otherwise contributed with permission from the copyright holder. Due to the nature of the BHL program as a mass digitization project, we are not able to scrutinize every item that enters the BHL collection. On rare occasions, content may be removed from the collection for a few reasons:
- Relevance — the BHL Collections Committee may determine that the content is irrelevant to the spectrum of fields related to biodiversity knowledge.
- Poor Image Quality — content may be temporarily removed from BHL if the scan quality is so poor as to render the digital content illegible within a reasonable zoom level. Items digitized from the collections of BHL partners will be resubmitted to the scanning queue and replaced to the best of our ability.
- Copyright Concern — please refer to our Take Down Guidelines for details.
Tags: error, ocean perch, title, relevance, removed, unavailable, unpublished
If you are interested in bringing this to our attention, please use our feedback form.
BHL can review examples of harmful content to improve our understanding of BHL’s collection and support research on the origins, extent, and impact of harmful biases in the natural sciences. BHL is committed to rethinking the development and curation of its collection of natural science materials. Where we have the opportunity to add or connect our collection materials to diverse, inclusive perspectives and voices, we will.
BHL will continue to support use of the collection, including research into biases and errors represented in the corpus, as well as lack of representation and knowledge loss resulting from historical biases and actions. As a cooperative global consortium library program, we hold a variety of personal and professional perspectives on our mission to “improve research methodology by collaboratively making biodiversity literature openly available to the world as part of a global biodiversity community.” BHL has decided not to identify or flag harmful language or images available in its collection. Removing or obscuring harmful content fragments the historical record and hides evidence of injustice critical to addressing harmful biases that are still held today.
References
- UNESCO. (n.d.). Five Laws of MIL | United Nations Educational, Scientific and Cultural Organization. Retrieved December 13, 2021, from http://www.unesco.org/new/en/communication-and-information/media-development/media-literacy/five-laws-of-mil/
- United Nations Office on Genocide Prevention and the Responsibility to Protect. (n.d.). Retrieved December 13, 2021, from https://www.un.org/en/genocideprevention/hate-speech-strategy.shtml
- International Federation of Library Associations and Institutions (IFLA). (2021). Objectionable Third-Party Content: Library Responses. https://repository.ifla.org/handle/123456789/1754
BHL’s goal is to maintain access to the published record of biodiversity science. We uphold the value that more access to information, even if it contains harmful content, is critical to understanding the historical context in which knowledge has been created. Access to historical documents and records helps researchers gain a more complete understanding of scientific knowledge, including its moral and ethical impacts, and how they have changed over time. Therefore, BHL will not remove materials with harmful content from its collection. For other reasons, materials may be removed from the collection in accordance with our Take Down Guidelines as well as our Deaccession Policy.
Removing access to harmful content does not necessarily reduce harm. By exposing the perpetrators of harmful ideas, the BHL collection documents evidence of the biases and prejudices that persist to this day as barriers to equitable knowledge creation and dissemination.
BHL encourages users of its collection to critically evaluate content and use it responsibly. The following resource evaluation methods can help researchers identify accurate data, information gaps, and biases, as well as the historical context within which the scholarly record was created:
Checklists
- Fitzpatrick, T. (n.d.). Research Guides: Evaluating Sources: ACT UP: Evaluating sources. Retrieved March 8, 2022, from https://libguides.salemstate.edu/c.php?g=955102&p=6892068
- Stahura, D. (2018). ACT UP for evaluating sources: Pushing against privilege | Stahura | College & Research Libraries News. https://doi.org/10.5860/crln.79.10.551
- Smith, O. (2020, August 11). Evaluation using PROMPT. https://www.open.ac.uk/library/help-and-support/advanced-evaluation-using-prompt
Framework
- Marsteller, M. (n.d.). LibGuides: Science and Technology Section (STS): Science Information Literacy Framework: Home. Retrieved March 7, 2022, from https://acrl.libguides.com/sts/STSILFramework/draft
Online lessons
- COR for the Science Classroom | Civic Online Reasoning. (n.d.). Retrieved March 7, 2022, from https://cor.stanford.edu/curriculum/collections/cor-for-the-science-classroom/
BHL collection management revolves around our strategic planning goal #1:
Grow BHL into the most comprehensive, reliable, reputable repository of data-rich biodiversity literature, and other original materials, to support a response to global challenges.
Please visit BHL collection management to learn more. Our collection development policy underwent significant revision as of November 2022.
Content and Data Reuse
Yes of course! The BHL makes its metadata available for public use under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license. This Creative Commons license allows you to reuse, modify, repurpose, and distribute the metadata for all purposes including commercial and non-commercial, with no need to ask for permission.
Metadata, in this case, refers to:
- Library catalog records, i.e. bibliographic data, used to describe the books and journals in the BHL collection (e.g. title and author data).
- Page-level data such as page numbers and page types (e.g. “Title page” and “Illustration”).
- Scientific name data, e.g. “Zea mays”.
The data in BHL’s collection is sourced and aggregated from its consortium partners and Internet Archive contributors. It is provided “as is,” without express or implied warranty as to accuracy, reliability, or fitness for any particular application. Please see our Data Disclaimer for more information.
Go ahead, take our metadata and do something creative with it! If you do repurpose BHL metadata please share your story with us. We often like to feature stories of reuse on our BHL blog.
Tags: download, export, METS, MODS, MARC, data, data mining, data export, data quality, text mining, copyright, license, licensing
Many of the items in BHL’s collection are in the public domain and free to reuse without risk of copyright infringement. Check the copyright status field to determine the copyright status of any given item in our collection.
BHL does our best to indicate the copyright status of each item digitized by our partner institutions. Ultimately, BHL does not hold copyright on the materials in our collection and cannot grant permission. It is up to you to review the copyright status of the image or item you wish to reuse and abide by any copyright restrictions that may apply. For additional guidance on reusing content from BHL, visit Copyright and Reuse.
Generally speaking, the country where you plan to use the content and the nature of the use dictate what you can/cannot do as well as the level of risk involved. Commercial use carries higher risk than non-commercial use and in different countries there are different copyright laws governing use. Please review the copyright law for the country in which you plan to use or publish the content. BHL recommends consulting the World Intellectual Property Organization’s (WIPO) Lex database at https://wipolex.wipo.int/en/main/legislation for further information.
If the content you wish to reuse is from a work where the copyright status is:
“In copyright. Digitized with the permission of the rights holder.”
…you may use the materials for non-commercial purposes so long as you provide attribution to the rights holder and share the materials under the same license. Please see the specific terms set forth in the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license (CC BY-NC-SA license).
Tags: permission, reuse, republication, licensing, license, copyright
I’m working on a project in Wikimedia using BHL data. Are you interested in collaborating with me?
Unfortunately, we do not have the resources to collaborate on projects at this time, but we welcome the use of our data and collections by our user community in Wikimedia projects. You can find more information about reusing BHL content below:
- What kind of files are available for download?
- How do I export data from BHL?
- How can I access BHL data via APIs?
- Is BHL data free to reuse?
- Can I reuse a book or image I find in BHL?
Tags: Wikipedia, Wikidata, Wikicommons
We welcome the use of our data and collections in Wikimedia projects. For example: BHL’s collections can be used as references for Wikipedia articles. BHL metadata (e.g. authors, scientific names, bibliographic references, etc.) can be used in Wikidata, Wikicite, and Wikispecies. BHL images can be uploaded to Wikicommons.
All of BHL’s collections and data are open access. Collections are either public domain or available under Creative Commons BY-NC-SA licenses. BHL makes its metadata available for public use under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license.
See the links below to learn more about the types of files and data available from BHL and how to search for, download, and export this content:
- How do I search the BHL collection?
- How can I search for images in the BHL collection?
- What kind of files are available for download?
- How do I export data from BHL?
- How can I access BHL data via APIs?
- Is BHL data free to reuse?
- Can I reuse a book or image I find in BHL?
Tags: Wikipedia, Wikidata, Wikicommons, Wikicite, Wikispecies
While BHL is not currently contributing to Wikimedia through any centrally managed or sponsored projects, we welcome the use of our data and collections by our user community in Wikimedia projects. Several of our Partners also facilitate contribution of BHL collections to Wikimedia through workshops and local initiatives. We encourage you to reach out to BHL Partners in your region to inquire about Wikimedia projects at these institutions.
Our FAQ also provides more information about how you can use BHL collections and data in Wikimedia.
Tags: Wikipedia, Wikidata, Wikicommons, Wikispecies, Wikicite
BHL is working to make our library catalog records accessible via standard repositories and formats. We provide our title level and item (book or volume) level records through the following methods:
- BHL title records are available in OCLC’s WorldCat® database at https://www.worldcat.org/. If you have access to OCLC metadata service products, please look for the OCLC symbol “BHLMR” when reviewing holdings.
- BHL has implemented the NISO “Knowledge Bases and Related Tools” (KBART) standard to facilitate the harvest of our item-level holdings data into index-based discovery layer tools. If your institution is a subscriber to a major discovery product or knowledge base supplier, please talk to your provider about indexing BHL records in their system. OCLC’s Knowledgebase, Primo, and Summon are a few of many examples of index-based discovery layer tools that utilize data available in the KBART standard. Please refer to the BHL KBART FAQ for more details.
- Exports of BHL bibliographic data, scientific name data, and full optical character recognition (OCR) text are available in a variety of formats via the Biodiversity Heritage Library Open Data Collection on Smithsonian’s Figshare: https://doi.org/10.25573/data.21081727.v1.
The Biodiversity Heritage Library and Plazi joint Statement of Collaboration describes the cooperation and collaboration of BHL and Plazi, outlining common goals and areas of common interests and clarifying key areas of responsibility.
Tags: partnerships, reciprocal partners, BLR, FAIR data, GBIF
The Biodiversity Heritage Library (BHL) is an open access digital library that aggregates digitized monographs, journal volumes, and archival materials from hundreds of different contributors. In some cases, a consecutive series of journal volumes is available, but in other cases there are gaps in coverage. With the metadata available it is difficult, if not impossible, to accurately identify gaps in coverage. Identifying what is available is easy; identifying what is missing (the gaps) is hard.
These gaps in coverage occur for a variety of reasons: the volumes were not available for digitization due to contributor holdings or condition of the materials, or copyright restrictions prevented BHL from including the volumes in its open access repository. In order to more accurately present these coverage gaps, BHL followed guideline 6.4.6 (pg.17) in “NISO RP-9-2014, KBART Phase II Recommended Practice” where “A title should be listed twice if there is a coverage gap of greater than or equal to 12 months, with only the coverage field changing.”
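The 12-month rule quoted above can be illustrated with a short sketch. This is not BHL's code: the function names, the month-level granularity, and the sample dates are all assumptions made for illustration. It merges adjacent coverage ranges and starts a new entry whenever the gap is 12 months or more.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def split_coverage(ranges, min_gap_months=12):
    """Merge (start, end) ranges; begin a new entry when the gap is >= min_gap_months."""
    merged = []
    for start, end in sorted(ranges):
        if merged and months_between(merged[-1][1], start) < min_gap_months:
            # Gap is short: extend the current coverage entry.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            # First range, or a gap of 12 months or more: list the title again.
            merged.append((start, end))
    return merged


# Hypothetical holdings: two consecutive years, then a multi-year gap.
sample = [
    (date(1900, 1, 1), date(1900, 12, 31)),
    (date(1901, 1, 1), date(1901, 12, 31)),
    (date(1905, 1, 1), date(1905, 12, 31)),
]
entries = split_coverage(sample)
for start, end in entries:
    print(start.isoformat(), end.isoformat())
```

With the sample data, the first two years collapse into one coverage entry and the 1905 volume becomes a second entry for the same title.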
The NISO KBART Standing Committee reviewed BHL’s KBART data file in 2020 and recommended that each volume be listed individually. As a result, BHL’s KBART file contains many redundancies but guarantees that each volume held in the collection is represented accurately. When a new volume of a title is added to BHL’s collection, a new item (row) is added to the KBART file for that title. From the options listed in the BHL KBART Documentation, BHL chose option 2 as the best method to provide holdings data in its KBART file.
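As a rough illustration of the one-row-per-volume approach, the sketch below writes a tab-delimited file with one row for each held volume. It uses only a small subset of the KBART field names, and the title, URL, and volume data are hypothetical; the layout of BHL's actual KBART file may differ.

```python
import csv
import io

# A subset of KBART field names; a complete KBART file has more columns.
FIELDS = [
    "publication_title",
    "date_first_issue_online",
    "num_first_vol_online",
    "date_last_issue_online",
    "num_last_vol_online",
    "title_url",
]


def volume_rows(title, title_url, volumes):
    """Yield one row per held volume; each row's coverage is that single volume."""
    for number, year in volumes:
        yield {
            "publication_title": title,
            "date_first_issue_online": str(year),
            "num_first_vol_online": str(number),
            "date_last_issue_online": str(year),
            "num_last_vol_online": str(number),
            "title_url": title_url,
        }


def write_kbart(rows) -> str:
    """Serialize rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical title with volume 9 missing from the collection.
held = [(7, 1864), (8, 1865), (10, 1869)]
text = write_kbart(
    volume_rows("Example Journal", "https://example.org/title/1", held)
)
print(text)
```

Because each volume has its own row, the missing volume simply has no row, and no coverage range has to be inferred across the gap.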
Contribute
Financial contributions from individual and corporate donors represent an important part of BHL’s financial sustainability. From digitization support to endowments for ongoing technological enhancements and leadership continuity, BHL offers a range of meaningful opportunities for those wishing to make a financial contribution.
Support BHL’s digital collection growth: Donate
Learn more about BHL’s campaign and endowment opportunities in the BHL Case Statement.
Thank you to our supporters!
Tags: support, contribute, donation, tax deductible
At this time, the Biodiversity Heritage Library is not accepting applications for new consortium partners. We will continue to evaluate our capacity to expand our partnerships into the future and provide updates as appropriate. Please stay tuned.
Tags: member, affiliate, contributor, participate, get involved, contribute
There are many ways to get involved with BHL, from donating to support BHL to participating in one of our many volunteer activities. Explore our opportunities to learn how you can get involved today.
Tags: get involved, citizen science, crowdsourcing, contribute, transcription, Flickr tagging
Thank you for considering the Biodiversity Heritage Library. Incorporating content from outside our consortium partnership is tricky, but we would be glad to review your content for inclusion. Please keep in mind that the BHL is a voluntarily staffed consortium program that coordinates among its partners to incorporate content as time and resources allow.
For Publishers, we need the following:
- Permission if the material is in copyright:
- Explicit permission from the copyright holder is required in writing. BHL requires rights holders to sign a standard license agreement form legally permitting us to add in-copyright content to our free and open access collection. Please review a sample of our permissions agreement form as well as our Permissions page for more information. Our standard agreement applies a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
- Metadata:
- BHL aggregates bibliographic information from the existing library catalogs of our consortium partnership, or WorldCat (https://www.worldcat.org/). Please verify that your content is cataloged.
- Resources:
- We will then work within our consortium to find at least one BHL partner who can digitize or upload the materials on your behalf.
For Individuals
BHL is a digitization program rather than a publishing organization. We cannot accept contributions to our digital library directly from individuals. If you are the sole copyright holder of a publication that is held by one of our consortium library partners, we can work with you to obtain permission to include the work in our collection. We will then 1) digitize the content from a physical copy or 2) upload an existing electronic copy.
For Institutions or Organizations
The Biodiversity Heritage Library accepts content into our collection from our consortium partners. In addition, we are able to harvest selected biodiversity relevant content held within the Internet Archive’s “texts” collection. At this time, BHL is not accepting applications for new consortium partners. We will continue to evaluate our capacity to expand our partnerships into the future and provide updates as appropriate. Please stay tuned.
Please Note
Depending on the extent and condition of materials to be incorporated, as well as existing priorities, we will process the content for inclusion at our earliest convenience.
Content is incorporated by one of two methods: 1) digitization of the physical copies held within our BHL partner library collections or 2) upload of born-digital materials, typically in PDF format. We do not commonly request materials directly from publishers, but should we find that we are missing any issues, we will be in contact to request donations, if at all possible.
Please note that it may take considerable time to process the complete set of image files and metadata into our digital library. For more information on how we build and maintain our collection, please refer to our collection development policy.
Tags: digitize, digitization, upload, collection development, get involved, licensing, copyright, permissions, contribute
The OCR, or Optical Character Recognition, text in BHL is generally automatically-generated and uncorrected, meaning that it may be incomplete or contain errors. We do not yet have a way for users to directly correct the OCR within BHL.
We have implemented functionality to allow BHL Partners to upload transcriptions in place of the automatically-generated OCR (Optical Character Recognition) for archival materials digitized in BHL. You are welcome to help improve the OCR for these materials by participating in various Partner crowdsourcing projects from which BHL sources transcriptions, such as the Smithsonian Transcription Center, FromThePage, and DigiVol.
Tags: crowdsourcing, OCR, text recognition, text correction, citizen science
Thank you for your interest in contributing a guest post to the Biodiversity Heritage Library Blog.
BHL only accepts solicited guest blog posts. If you would like to submit your resume, qualifications, and examples of your work for consideration, please email feedback@biodiversitylibrary.org. After review, your information may be added to our files for future reference. If we are interested in soliciting a post, we may contact you. Please note that BHL does not provide compensation for guest posts.
Tags: outreach, press room, marketing, public affairs, media, public relations, communication, resources, announcements, alerts, content development, content marketing
Will you share information about my project, website, resources, events, etc. with your audiences?
In keeping with Smithsonian Directives 814 and 950, BHL restricts promotion of third-party content through its website and communication channels to entities with which BHL has an official relationship.
Tags: outreach, press room, marketing, public affairs, media, public relations, communication, blog, social media, announcements, alerts
We have implemented functionality to allow BHL Staff to upload transcriptions in place of the automatically-generated OCR (Optical Character Recognition) text for digitized materials in BHL’s collection. This functionality supports transcriptions generated as part of in-house or crowdsourced transcription projects hosted by BHL Partners. The Show Text tab now indicates whether the text has been:
- automatically generated and uncorrected;
- automatically generated and error corrected, by machines, which may still include inconsistencies;
- or manually transcribed by humans.
Please note that BHL’s OCR is generated by its Internet Archive digitization partner using Tesseract Open Source OCR (as of 2020) or ABBYY FineReader.
Web-based crowdsourced transcription projects are largely managed through the following providers: DigiVol, FromThePage, and the Smithsonian Transcription Center.
Especially for archival materials, like field notes and correspondence with handwritten text, transcriptions make these items full-text searchable and enable our taxonomic name recognition software to index scientific names within their pages. Since the transcribed text can be viewed alongside the digitized page image, users can also more easily read materials with difficult-to-decipher handwriting. Thus, this new functionality makes it easier for researchers and the public to explore these valuable primary source materials and access specific information from their pages.
Interested in transcribing materials? Several BHL Partners have transcription projects on various crowdsourcing platforms. Follow the links below to explore the opportunities and get involved:
- Auckland War Memorial Museum Tamaki Paenga Hira on FromThePage
- Ernst Mayr Library of Harvard University on DigiVol
- Harvard Botany Libraries on FromThePage
- Lenhardt Library, Chicago Botanic Garden on FromThePage
- The John Torrey Papers from The New York Botanical Garden on FromThePage
- Smithsonian Institution Archives on the Smithsonian Transcription Center
- Ukrainian Collection items from the National Agricultural Library (NAL) on FromThePage
Tags: crowdsourcing, citizen science, transcription, OCR, full text search, archives
No, BHL can only accept materials into its collection from contributors that are part of our consortium or from materials available within the Internet Archive corpus that conform to our metadata and subject matter standards.
We work directly with the Internet Archive, which often digitizes our materials and also stores and serves them through our website. Materials from the Internet Archive that contain the appropriate metadata and scope may be considered for inclusion in the BHL collection.
BHL is not able to include materials from other digital repositories, such as HathiTrust and Google Books, for a variety of reasons including:
- legal restrictions that prevent the distribution of materials across various digital repositories
- technical incongruities with file formats and/or metadata that prevent the materials from being compatible with our digital library file structure.
Tags: harvest, IA, selection, digitization, collection development
No, the BHL is strictly a digital library and is not able to accept physical materials. Although headquartered at Smithsonian Libraries, BHL is made up of a consortium of institutions that collaboratively digitize their respective collections. As such, BHL is not the digital version of a single collection, but a series of collections virtually brought together into our digital repository.
Please consider donating your materials to one of our partner institutions or to your local library.
Tags: selection, collection development, donate
We do not offer advertising placement or other advertising opportunities for the Biodiversity Heritage Library at this time.
If you would like to learn more about our program, or receive notification about potential future opportunities, please subscribe to our mailing list.
Tags: promotion, product placement, link placement, banners, link insertion, post placement, in-text link
DOIs and Stable URLs
BHL produces stable URLs for our content and will ensure viability of these URLs. Please read the following blog post for an explanation of how BHL redirects certain IDs when a book has been taken offline.
Stable URLs are available for the following areas of content, with examples:
- Subject: Insects
http://www.biodiversitylibrary.org/subject/Insects
- Author: Darwin, Charles, (1809 – 1882)
http://www.biodiversitylibrary.org/creator/93
- Title: The Journal of the Linnean Society
http://www.biodiversitylibrary.org/bibliography/350
- Item/Book: The Journal of the Linnean Society, v. 8 1865
http://www.biodiversitylibrary.org/item/8361
- Article: Davidse, G, & Pohl, R. New taxa and nomenclatural combinations of Mesoamerican grasses (Poaceae). Novon a journal of botanical nomenclature from the Missouri Botanical Garden, 2,81-110. https://doi.org/10.2307/3391667
https://www.biodiversitylibrary.org/part/2491
- Page: Pl. XXX Falco Aesalon from Die Raubvögel Deutschlands und des angrenzenden Mitteleuropas…
http://www.biodiversitylibrary.org/page/47850703
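The URL patterns in the examples above can be summarized in a small helper. This is a minimal sketch based only on those example URLs; the function and dictionary names are hypothetical, and the `https` scheme is an assumption (one of the examples uses it, the others use `http`).

```python
from urllib.parse import quote

BASE_URL = "https://www.biodiversitylibrary.org"

# Path segment for each kind of content, read off the example URLs above.
PATH_SEGMENTS = {
    "subject": "subject",
    "author": "creator",
    "title": "bibliography",
    "item": "item",
    "article": "part",
    "page": "page",
}


def stable_url(kind: str, identifier) -> str:
    """Build a stable URL of the form BASE_URL/<segment>/<identifier>."""
    segment = PATH_SEGMENTS[kind]  # raises KeyError for an unknown kind
    return f"{BASE_URL}/{segment}/{quote(str(identifier))}"


print(stable_url("title", 350))
print(stable_url("subject", "Insects"))
```

Note that subjects are addressed by name while the other content types are addressed by numeric ID.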
With regards to persistent identifiers, BHL also assigns DOIs to a selection of its content. Learn more.
Tags: URI, permalink, linking, identifiers, persistent identifiers
A DOI (Digital Object Identifier) is a unique permanent identifier that provides a persistent link to a digital object. When a DOI is assigned to a publication, such as a journal article, that DOI becomes part of the publication’s bibliographic information and should be included (as a link) whenever the publication is referenced. DOIs thus create a linked network of scholarly research, enabling readers to click from publication to publication. They also facilitate persistent linking to publications from websites, blog posts, tweets, Wikipedia articles, etc.
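Since a DOI should be cited as a link, a reference tool may need to turn a DOI written in different ways into one consistent link. The sketch below shows one way to do that; the function name and the list of accepted prefixes are assumptions for illustration, and the sample DOI is the article DOI that appears elsewhere on this page.

```python
DOI_RESOLVER = "https://doi.org/"

# Common ways a DOI is written before the identifier itself.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_link(doi: str) -> str:
    """Return a resolver link for a DOI given bare, with 'doi:', or as a URL."""
    doi = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]  # keep only the identifier part
            break
    return DOI_RESOLVER + doi


print(doi_to_link("doi:10.2307/3391667"))
```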
The vast majority of historic publications lack DOIs and thus sit outside this linked network. BHL is working to change this. BHL started minting DOIs for historic publications in 2011, focusing primarily on books and monographs. In October 2020, BHL launched a new Persistent Identifier Working Group (PIWG) whose first assignment is to retrospectively mint DOIs for the articles on BHL, thereby bringing the world’s legacy biodiversity journals into the modern linked network of knowledge.
DOIs not only enable persistent linking to BHL content; they also allow us to track how that content is being used. For example, in October 2020, BHL minted a DOI for the first scientific description of the Duck-billed Platypus (published in 1799 and contributed to BHL by Museums Victoria): https://doi.org/10.5962/p.304567. By April 2021, the article had been tweeted by 219 Twitter accounts, referenced in six Wikipedia pages, picked up by one news outlet, and cited in an academic paper (data from Altmetric, April 2021). We know this because the article has a DOI.
To learn more about DOIs, please see:
https://en.wikipedia.org/wiki/Digital_object_identifier.
Want to know more about BHL’s Persistent Identifier Working Group? See:
- What Is BHL’s New Persistent Identifier Working Group DOI’ng?
BHL Blog Post, 10 May 2021: https://blog.biodiversitylibrary.org/2021/05/persistent-identifier-working-group.html
- Discovering the Platypus: From its scientific description to its DOI, Biodiversity Information Science and Standards (TDWG) Conference, 6 October 2020: https://youtu.be/4UVSEoWsSrw?t=1285
- #RetroPIDs: making historic Platypus Infinitely Discoverable (PID) PIDapalooza: the festival of persistent identifiers, 28 January 2021: https://youtu.be/CSeQNe5KR5U
With regards to persistent identifiers, BHL also produces persistent and stable URLs for our content and will ensure viability of these URLs. Learn more.
Tags: digital object identifier, DOI, IDs, Crossref, cite, citation, permalink, linking, persistent identifiers, RetroPIDs.
Download
Yes of course! The BHL makes its metadata available for public use under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license. This Creative Commons license allows you to reuse, modify, repurpose, and distribute the metadata for all purposes including commercial and non-commercial, with no need to ask for permission.
Metadata, in this case, refers to:
- Library catalog records, i.e. bibliographic data, used to describe the books and journals in the BHL collection (e.g. title and author data).
- Page-level data such as page numbers and page types (e.g. “Title page” and “Illustration”).
- Scientific name data, e.g. “Zea mays”.
The data in BHL’s collection is sourced and aggregated from its consortium partners and Internet Archive contributors. It is provided “as is,” without express or implied warranty as to accuracy, reliability, or fitness for any particular application. Please see our Data Disclaimer for more information.
Go ahead, take our metadata and do something creative with it! If you do repurpose BHL metadata please share your story with us. We often like to feature stories of reuse on our BHL blog.
Tags: download, export, METS, MODS, MARC, data, data mining, data export, data quality, text mining, copyright, license, licensing
BHL provides all its bibliographic and scientific name data for download and reuse via various web-services. Please visit our Developer and Data Tools page for more information.
Tags: download, METS, MODS, data export, API, technical development, text mining, data mining, OAI-PMH
There are many ways to download resources, and some are better suited for certain purposes than others. Below you will see some common uses suggested on the “useful for” lines, but of course there are many more ways to use the BHL collection.
You can find most of these options when you click on the drop-down menu for “Download Contents” to see various options while viewing a book.
- View Metadata takes you to the bibliography page for the title.
- Useful for: citations, bibliographic needs
- You can also download the MODS for the title from the bibliography page.
- Select pages to download allows you to generate a custom pdf containing only the pages you need.
- Useful for: choosing pages from one article to print or save
- Learn more and see instructions for downloading.
- Download Article allows you to retrieve a PDF of an entire article.
- Useful for: Quickly and easily downloading an article without having to manually select pages
- Note that the “Download Article” option will only be available if you are viewing content that has been defined as part of an article. Defining articles in BHL is an ongoing process, and not all articles in the Library have been indexed. Use the “select pages to download” option (above) if “Download Article” is not available.
- Learn more and see instructions for downloading.
- Download Book has further options to view:
- PDF = Download a PDF of the entire volume.
Useful for: saving or printing the entire book, saving to read on a tablet
- See instructions for downloading.
- All = Download all files associated with this book – links to the Internet Archive where you have access to image files, metadata files, and other derivative files – more information about available files including descriptions and file types.
Useful for: downloading the grayscale/black and white PDF version of a book (if available), and more
- JP2 = Download ALL the jpeg2000 image files – there is one image per page, and each image file is usually 2-5MB so this can take quite a long time.
Useful for: saving or printing most of a book in high resolution
- See instructions for downloading.
- OCR = Download the plain text of the entire volume. Note: the plain text is created automatically using optical character recognition (OCR) software and has not been corrected or edited by humans.
Useful for: translating the text, searching for a particular word in a book, copying long passages
- Download Citation allows you to download the BibTex or RIS files for the volume or part.
- Useful for: citations
- Learn more and see instructions for exporting.
- View at Internet Archive – takes you to a copy of the book at the Internet Archive
- Useful for: searching inside the book, viewing it in a different interface
Options for downloading images include:
- Printing or downloading a single page image (screen quality)
- Useful for: printing a single image for personal use
- See instructions for downloading.
- Downloading a single page image (high quality)
- Useful for: professional publication, large print of the image
- See instructions for downloading.
- Downloading a single image from the BHL Flickr
- Useful for: printing a single image for personal use
- See instructions for downloading.
- JP2 = Download ALL the jpeg2000 image files in a volume. There is one image per page, and each image file is usually 2-5MB, so this can take quite a long time.
- Useful for: saving or printing most of a book in high resolution
Other download options include:
- Downloading all of the data in BHL
- Useful for: scientific name research, etc.
- Learn more.
From the BHL book viewer
- Select the Download Contents drop down menu
- Choose “Download Book”
Finally, select the “Download PDF” link to receive a PDF of the whole book
Tags: Internet Archive, IA, images, pages
The “Select pages to download” feature allows you to generate a custom PDF containing only the pages you need from a book. This can be useful if an article in a journal has not yet been defined in BHL and is therefore not downloadable via the article download options.
DO NOT use this method to generate a PDF of the whole book. Instead, see these instructions.
Locating the article
Suppose you have a specific article in mind, for example, “Abundance and Distribution of Queen Conch Veligers (Strombus gigas Linne) in the Central Bahamas. I. Horizontal Patterns in Relation to Reproductive and Nursery Grounds” by Allan W. Stoner and Megan Davis, from the Journal of Shellfish Research (v. 16, pp. 7-18).
1. From the BHL homepage, enter “Journal of Shellfish Research” into the title search box.
2. Select volume 16 from the results.
3. You will be taken directly to volume 16. Choose “Select pages to download” from the “Download Contents” menu on the upper right hand side of the Book viewer.
Creating the PDF
1. A grid of the pages will appear. Select the pages you’d like to include.
2. Choose “Review” from the menu bar in the center of the Book Viewer.
3. If the pages displayed are correct, select “Generate My PDF” at the bottom of the pop-up. Otherwise, you may edit this screen to remove pages, or go back to the page-selection grid.
4. Please enter the email address to which you’d like us to send your completed PDF, along with any title, author, or subject descriptors.
5. You will receive a link to download your PDF and a confirmation number.
If you have questions or need to report a problem, please contact us and include the PDF generation confirmation number.
Tags: download, email
If you only need one page image from a book, you can right-click (Control + click on a Mac) on the page you want and, in most browsers, choose “Save As” or “Open image in new window”. This will give you a screen-quality jpeg of the page.
You can also select the “Print” icon to print a single page image from the book.
Tags: download, jpg, #SciArt, illustration, art
To retrieve a full-size, full-resolution JPG image of any BHL page, use the following: http://www.biodiversitylibrary.org/pageimage/pageid. For example, http://www.biodiversitylibrary.org/pageimage/1000000.
Then right-click on the image to save it to your computer.
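If you need several page images, the URL pattern above can be built programmatically. The following is a minimal sketch in Python, not an official BHL tool; the function name page_image_url is a hypothetical helper introduced here for illustration, and the base URL is copied from the pattern above.

```python
# Base URL copied from the pattern described above.
PAGE_IMAGE_BASE = "http://www.biodiversitylibrary.org/pageimage/"


def page_image_url(page_id: int) -> str:
    """Return the full-resolution JPG URL for a BHL page ID.

    page_image_url is a hypothetical helper name, not part of any BHL API.
    """
    if page_id <= 0:
        raise ValueError("page_id must be a positive integer")
    return f"{PAGE_IMAGE_BASE}{page_id}"


# Build the URL for the example page ID used above.
print(page_image_url(1000000))
```

Open the printed URL in a browser and save the image as described above.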
Watch the video below to see where to find the page ID and how to follow these instructions to download a single page, high resolution image.
Tags: download, jpeg, #SciArt, illustration, art
To download an image in the BHL Flickr, use the download button located to the right below each image:
If you would like to download a high resolution copy of an image in the BHL Flickr, please follow these instructions.
For more information about rights, licensing, and reusing BHL materials, please see our Copyright and Reuse page.
Tags: download, jpg, #SciArt, illustration, art, jpeg, social media
Did you find a great image in Flickr and want to access a higher resolution image of it? In many cases, higher resolution copies of images available in Flickr can be accessed and downloaded from the BHL website. The BHL link to every image in Flickr is contained in the image descriptions field. Simply click on the link to access the image in BHL, and then follow our instructions to either download a single page high resolution JPEG image or the JPEG2000 image files (the highest resolution offered by BHL) for all images in the book.
For more information about rights, licensing, and reusing BHL materials, please see our Copyright and Reuse page.
Tags: download, jpg, #SciArt, illustration, art, jpeg, social media
You can download JPEG2000 image files (the highest resolution offered by BHL) for any book in BHL for free using the following methods. Note that this download will include jpeg2000 image files for every page in the book – there is one image per page, and each image file is usually 2-5MB so this can take quite a long time.
If you want to download only select pages in high resolution, see these instructions.
1.) Select the JP2 option in the download options listed under the volume details on the book’s bibliography page.
— OR —
2.) Select the Download Book option from the Download Contents dropdown menu on the book viewer screen and choose the JP2 option.
Tags: jpg, #SciArt, illustration, art, images
Sometimes our “Select pages to download” feature, also known as our custom PDF generator, may experience temporary technical difficulties that may delay or, at worst, prevent your PDF from successfully reaching your email Inbox. We apologize for the error and kindly request that you try to generate your custom PDF once again.
Depending on your connection speed it may take some time to download the PDF to your machine as our custom PDF files can sometimes be very large in size.
Note that some browsers offer a built-in PDF viewer, which may not correctly display the images. If you experience viewing problems in your browser, open the PDF in an alternative viewer or try these troubleshooting tips: https://helpx.adobe.com/acrobat/kb/cant-view-pdf-web.html
Tags: PDF, download, article
Depending on your connection speed it may take some time to download a PDF generated using the “Select pages to download” feature. Our custom PDF files can sometimes be very large in file size. If you have trouble loading the PDF file on your machine, try finding the downloaded PDF file (in your downloads folder) and opening the file with your preferred PDF reading software.
Note that some browsers offer a built-in PDF viewer, which may not correctly display the images. If you experience viewing problems in your browser, open the PDF in an alternative viewer or try these troubleshooting tips: https://helpx.adobe.com/acrobat/kb/cant-view-pdf-web.html.
Tags: article, download
There are two ways to download a PDF of an article that has been defined in BHL.
When you are viewing an article landing page, you can download a PDF of the article using the Download PDF icon in the Article Links section.
When you are viewing an article within a book in BHL, the “Download Article” option in the “Download Contents” dropdown menu will also download the PDF of the article.
The “Download Article” option will only be available if you are viewing content that has been defined as part of an article. Articles that have been defined for the item you are viewing are listed in the Table of Contents in the book viewer. Click on any of the entries to navigate directly to the first page of that article. If you are viewing pages that have been defined as part of an article, the article title will display below the series title in the book viewer.
Please note that defining articles in BHL is an ongoing process, and not all articles in the Library have been indexed. If the article you need has not yet been indexed, you can still use our “Select Pages to Download” feature to manually select the article pages and generate a PDF. See instructions for the “Select Pages to Download” feature.
Full Text Search
BHL’s full text search looks for your term in the generally uncorrected text derived through Optical Character Recognition (OCR). Since this OCR is automatically generated and uncorrected, it may be incomplete or have errors that prevent the search engine from locating all instances of your term.
Tags: troubleshooting, quality control, quality assurance, QA
ElasticSearch examines multiple fields (e.g., title, keywords, and full OCR text) and assigns a relevancy score based on:
- The number of times a term appears in the field
- The length of the field
- How often that term appears across all of the texts in the corpus
If a field is short, like the title, then a term appearing in that field is weighted higher.
If a field is very long, and a term appears many times in the field, and many times over all of the texts in the corpus, then that term is weighted lower. This helps give a lower weight to words like a, and, or the.
If a field is very long, and a term appears infrequently across all of the texts in the corpus, then it’s ranked higher. This helps give a stronger weight to words like hippopotamus or giraffe.
All three of these factors are combined to produce a score for each field in the document, and then the scores for each field are combined to assign a score to the entire document. On top of this, we can force certain fields to be given a greater “weight” than others.
Fields with a higher “weight” have more of an effect on the score than fields with a lower “weight”. The final document score is used to rank the document in the search results.
Nitty gritty details: https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html
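To make the three factors concrete, here is a simplified sketch in Python of a classic TF-IDF score with field-length normalization, along the lines of the formulas in the guide linked above. It is an illustration only: BHL’s actual similarity settings and field weights are not stated here, and the function names, the sample document, the weights, and the document counts are all assumptions made up for the example.

```python
import math


def field_score(term, field_tokens, doc_freq, num_docs):
    """Score one field for one term (simplified classic TF-IDF).

    doc_freq is the number of documents containing the term;
    num_docs is the total number of documents in the corpus.
    """
    freq = field_tokens.count(term)
    if freq == 0:
        return 0.0
    tf = math.sqrt(freq)                             # more occurrences in the field -> higher
    idf = 1.0 + math.log(num_docs / (doc_freq + 1))  # rarer across the corpus -> higher
    norm = 1.0 / math.sqrt(len(field_tokens))        # shorter field -> higher
    return tf * idf * norm


def document_score(term, fields, weights, doc_freq, num_docs):
    """Combine per-field scores, applying a per-field weight (boost)."""
    total = 0.0
    for name, tokens in fields.items():
        total += weights.get(name, 1.0) * field_score(term, tokens, doc_freq, num_docs)
    return total


# Hypothetical document: a short title and a 100-word OCR text.
fields = {
    "title": ["the", "hippopotamus"],
    "text": ["the"] * 40 + ["hippopotamus"] * 2 + ["river"] * 58,
}
weights = {"title": 3.0, "text": 1.0}  # assumed weights, not BHL's configuration
NUM_DOCS = 1000                        # assumed corpus size

rare = document_score("hippopotamus", fields, weights, doc_freq=9, num_docs=NUM_DOCS)
common = document_score("the", fields, weights, doc_freq=999, num_docs=NUM_DOCS)
print(rare > common)  # the rare term outranks the very common one
```

In this sketch the rare word scores higher than the common word even though the common word appears far more often in the text, and a match in the short, heavily weighted title contributes more than matches in the long OCR field.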
Tags: user interface, browse
BHL’s full text search functionality uses the ElasticSearch engine to try and find your search term(s) within:
— the bibliographic metadata about the items in BHL’s collection and
— the Optical Character Recognized (OCR) text from digitized page images.
Learn more about OCR and how BHL uses it.
Tags: technical development, user interface, search inside
Yes you can! To search inside a book in BHL, navigate to the top right corner of the book viewer, select the Search Inside tab and enter your search terms.
In that same panel, results will display the pages where your search terms are found, along with snippets of the surrounding text. Navigate to any of those pages by clicking on the hyperlinked page number.
There are also two alternative “search inside” methods for searching within each individual book from outside of the book viewer:
- Search using the Internet Archive’s interface
From the book viewer, go to the upper right hand corner of the screen. Select the “Download Contents” drop-down menu and choose “View at Internet Archive.” You are now viewing the item on the Internet Archive’s website. Select the magnifying glass icon to the right of the book. Note: Do not use the top right search bar, which searches the entire Internet Archive corpus.
- Download and search the full PDF
From the book viewer, go to the upper right hand corner of the screen. Select the “Download Contents” drop-down menu and choose “Download Book.” Select PDF to download the entire book/volume as a PDF with the full text included as a layer. To search, use your PDF viewer’s search tool (often a magnifying glass icon), or Ctrl+F (Command+F on a Mac).
IMPORTANT: the full text provided by IA/BHL is uncorrected text derived automatically through Optical Character Recognition (OCR) software. As it is automatically generated and uncorrected, it may contain errors. Full text search is a powerful research tool, but should not be considered exhaustive. Learn more about how full text search works in BHL.
Tags: user interface, book viewer, display, find
General
BHL operates as a worldwide consortium of over 80 partners from natural history, botanical, research, and national libraries working together to digitize the natural history literature held in their collections and make it freely available for open access as part of a global “biodiversity community.”
Tags: institutions, members, affiliates, contributors
The Biodiversity Heritage Library (BHL) is funded in large part by BHL Member Dues and by individual donations. Members and Affiliates have also received generous support from their parent institutions to enable significant contributions in the form of staffing and other in-kind costs. Please see our complete list of funding sources for more information.
Tags: grants, funding, endowments, federal, money, budget, expenses, donors
We will take suggestions from users about additional materials to include in our collection but we cannot guarantee fulfillment of any requests submitted. We retain requests received from our users as a running list of potential additions and will process requests as time and resources allow. Please keep in mind that it commonly takes several years before requests are fulfilled.
Please review our Guidelines for Submitting Scanning Requests before submitting your request via this form.
Tags: digitize, digitization, recommendations, collections, collection development, contribute
If you notice a problem with the BHL website or an error with any of the materials in our collection, please let us know by filling out our feedback form at http://biodiversitylibrary.org/contact. BHL is voluntarily staffed by our Partner Libraries and we are limited in our ability to respond personally to each contact with our patrons. We appreciate your patience. A BHL staff member may contact you if we require further information.
If the form is unavailable for some reason (e.g., a BHL website outage), you can email us at feedback@biodiversitylibrary.org.
Tags: troubleshooting, error, quality control, quality assurance, QA, technical problem
Thank you for your feedback! BHL is voluntarily staffed by our Partner Libraries and we are limited in our ability to respond personally to each contact with our patrons. We appreciate your patience. A BHL staff member may contact you if we require further information.
Tags: error, reporting, technical problem, quality control, quality assurance, QA, ask a librarian, reference
Many of the items in BHL’s collection are in the public domain and free to reuse without risk of copyright infringement. Check the copyright status field to determine the status of any given item in our collection.
BHL does its best to indicate the copyright status of each item digitized by our partner institutions. Ultimately, BHL does not hold copyright on the materials in our collection and cannot grant permission. It is up to you to review the copyright status of the image or item you wish to reuse and abide by any copyright restrictions that may apply. For additional guidance on reusing content from BHL, visit Copyright and Reuse.
Generally speaking, the country where you plan to use the content and the nature of the use dictate what you can/cannot do as well as the level of risk involved. Commercial use carries higher risk than non-commercial use and in different countries there are different copyright laws governing use. Please review the copyright law for the country in which you plan to use or publish the content. BHL recommends consulting the World Intellectual Property Organization’s (WIPO) Lex database at https://wipolex.wipo.int/en/main/legislation for further information.
If the content you wish to reuse is from a work where the copyright status is:
“In copyright. Digitized with the permission of the rights holder.”
…you may use the materials for non-commercial purposes so long as you provide attribution to the rights holder and share the materials under the same license. Please see the specific terms set forth in the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 license (CC BY-NC-SA license).
Tags: permission, reuse, republication, licensing, license, copyright
If the BHL website is down, please monitor our status on Twitter https://twitter.com/biodivlibrary or Facebook https://www.facebook.com/BioDivLibrary/. If BHL is down, you can access content from our partners in the biodiversity collection via Internet Archive.
If the BHL website is available but page images for books are not showing then this means that the Internet Archive’s website at http://archive.org is down. We partner with the Internet Archive to digitize and host our entire collection and as such their outages directly affect the availability of our materials. We will provide updates on our homepage and via our social media outlets as described above.
If you are experiencing website difficulties for whatever reason, please try http://www.downforeveryoneorjustme.com/ to test your connection. If the problem persists, submit your question or problem to our feedback form or, if the form is unavailable due to website outages, email us at feedback@biodiversitylibrary.org, describing your issue in as much detail as possible. The more information we have to try to replicate the problem, the better our chance of addressing it.
Tags: troubleshooting, technical issues, website problem, feedback, technical support
In addition to the information provided in the BHL FAQ, many of our consortium partners have created training materials about BHL collections and services. You can explore these materials below:
- BHL LibGuides from Partners
- Harvard Library BHL LibGuide: https://guides.library.harvard.edu/bhlguide
- University Library at University of Illinois, Urbana-Champaign BHL LibGuide: https://guides.library.illinois.edu/BHL
- Training Videos
- Training videos created by BHL partners are available on YouTube: https://www.youtube.com/playlist?list=PLYj_4vJM9EpsLEduvEuszDvjJ3h5aT9Bv
Tags: training, tutorials, how-tos, documentation, user training
Metadata
BHL offers metadata exports in MODS, BibTex, and RIS formats. MODS is an XML-based bibliographic description schema used in a variety of library applications. BibTex and RIS are bibliographic citation files that are compatible with a variety of citation management tools. The MODS file is a title-level download. The BibTex and RIS files are item-level downloads.
The MODS download is available at the bottom of the title and part bibliography pages.
The BibTex and RIS downloads are available in two places:
1) Under the volume or part details on bibliography pages.
2) Under the “Download Contents” menu in the book viewer, via the “Download Citation” option.
Tags: zotero, refworks, cite, citations, export, EndNote, Mendeley
When an item has “[YYYY printing]” in the volume information, this is meant to highlight that the items associated with this title are all different printings of the same edition. These books have the same content and pagination but the title page is different.
Another way to detect different printings of the same edition is to compare the copyright statements, for example:
- In the case of Birdcraft, the 1899 printing has an original copyright date of 1895 and a later date of 1897: “Set up and electrotyped April 1895. Reprinted January 1896. Reprinted with additions, and new illustrations October 1897; June 1899.”
- https://www.biodiversitylibrary.org/item/56637
- 1900 printing: same as 1899, except for addition of “December 1900”
- https://www.biodiversitylibrary.org/item/115300
- 1903 printing adds: “April, 1903”
- https://www.biodiversitylibrary.org/item/33139
The Biodiversity Heritage Library follows established practices for cataloging works. Each book or journal in BHL’s collection is represented by a bibliographic metadata record that describes the title, creator(s), and other relevant information about the work’s creation and content. The creator(s) listed as part of the records are the personal, family, or corporate names responsible for the overall production of the work. Requests to add additional creators will be considered by the Cataloging Group if the creator is noted on the chief source of information (i.e., the title page) and/or is the predominant contributor to the work (e.g., the sole illustrator). These criteria are in accordance with the following factors:
- cataloging best practices – contributions to portions of a work may not qualify under standard cataloging rules for addition at the title level,
- resource constraints – BHL lacks dedicated cataloging resources to generate original catalog records at scale,
- technical constraints of BHL’s database model – BHL’s collection is primarily structured around books, volumes, and articles. It is not possible to add page level metadata at this time.
Enriching Page Level Metadata
BHL acknowledges the work being done to highlight historically marginalized creators whose contributions have gone unrecognized. For now, the best way to add illustrator or photographer information at the page level is to add tags with the creator name(s) to the images in BHL’s Flickr photostream. Tagging the images in Flickr that were uploaded from works in BHL is the best way for us to identify the creator of each specific image contributed to the work. Please see our Flickr image tagging guide for instructions.
The Biodiversity Heritage Library (BHL) is an open access digital library that aggregates digitized monographs, journal volumes, and archival materials from hundreds of different contributors. In some cases, a consecutive series of journal volumes is available, but in other cases there are gaps in coverage. With the metadata available it is difficult, if not impossible, to accurately identify gaps in coverage. Identifying what is available is easy; identifying what is missing (the gaps) is hard.
These gaps in coverage occur for a variety of reasons: the volumes were not available for digitization due to contributor holdings or condition of the materials, or copyright restrictions prevented BHL from including the volumes in its open access repository. In order to more accurately present these coverage gaps, BHL followed guideline 6.4.6 (pg.17) in “NISO RP-9-2014, KBART Phase II Recommended Practice” where “A title should be listed twice if there is a coverage gap of greater than or equal to 12 months, with only the coverage field changing.”
The NISO KBART Standing Committee reviewed BHL’s KBART data file in 2020 and recommended each volume be listed individually. BHL’s KBART file contains many redundancies, but guarantees that each volume that is held in the collection is represented accurately. Therefore, when a new volume of a title is added to BHL’s collection, a new item (row) is added to the KBART file for that title. From the options listed in the BHL KBART Documentation, the decision was made to elect option 2 as the best method to provide holdings data in BHL’s KBART file.
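The 12-month gap rule quoted from guideline 6.4.6 can be illustrated with a short sketch in Python. This is not how BHL builds its KBART file (as noted above, BHL now lists each volume on its own row); it only shows what “listing a title twice with only the coverage field changing” looks like. It assumes coverage is known only at the year level, so a gap is treated as at least 12 months when one or more whole years are missing. The journal title and years are made up, and coverage_ranges and kbart_rows are hypothetical helper names. The column names are standard KBART fields.

```python
def coverage_ranges(years):
    """Group years into continuous coverage ranges.

    Assumption: with year-level data, a new range starts whenever at
    least one whole year is missing between two held years.
    """
    ranges = []
    for year in sorted(set(years)):
        if ranges and year - ranges[-1][1] <= 1:
            ranges[-1][1] = year          # extends the current range
        else:
            ranges.append([year, year])   # gap found: start a new range
    return [tuple(r) for r in ranges]


def kbart_rows(title, years):
    """Emit one row per coverage range, changing only the coverage fields."""
    return [
        {
            "publication_title": title,
            "date_first_issue_online": str(first),
            "date_last_issue_online": str(last),
        }
        for first, last in coverage_ranges(years)
    ]


# Hypothetical holdings with 1903 and 1904 missing.
rows = kbart_rows("Example Journal", [1900, 1901, 1902, 1905, 1906])
for row in rows:
    print(row)
```

Here the title appears twice, once for 1900 to 1902 and once for 1905 to 1906. Listing each volume individually, as BHL does, avoids having to infer these ranges at all.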
Outreach
Our Promotional Materials page offers a wide range of resources to support outreach activities around BHL.
Tags: outreach, swag, press room, logos, flyers, marketing, public affairs, media, public relations, communication, resources
Our logos and logo guide are available on the BHL Logos page.
Tags: outreach, swag, press room, marketing, public affairs, media, design, style guide, public relations, communication, resources
Subscribe to our newsletter and follow our blog to stay up-to-date with all the latest BHL news.
Tags: outreach, swag, press room, marketing, public affairs, media, public relations, communication, resources, social media, announcements, alerts
Explore BHL’s social media activities on the BHL Community page.
Tags: outreach, press room, marketing, public affairs, media, public relations, communication, resources, announcements, alerts, Twitter, Facebook, Instagram, Flickr, Pinterest, blog, WordPress
Thank you for your interest in contributing a guest post to the Biodiversity Heritage Library Blog.
BHL only accepts solicited guest blog posts. If you would like to submit your resume, qualifications, and examples of your work for consideration, please email feedback@biodiversitylibrary.org. After review, your information may be added to our files for future reference. If we are interested in soliciting a post, we may contact you. Please note that BHL does not provide compensation for guest posts.
Tags: outreach, press room, marketing, public affairs, media, public relations, communication, resources, announcements, alerts, content development, content marketing
Will you share information about my project, website, resources, events, etc. with your audiences?
Guided by Smithsonian Directives 814 and 950, promotion of 3rd party content through BHL’s website and communication channels is restricted to entities with which BHL has an official relationship.
Tags: outreach, press room, marketing, public affairs, media, public relations, communication, blog, social media, announcements, alerts
We do not offer advertising placement or other advertising opportunities for the Biodiversity Heritage Library at this time.
If you would like to learn more about our program, or receive notification about potential future opportunities, please subscribe to our mailing list.
Tags: promotion, product placement, link placement, banners, link insertion, post placement, in-text link
Scientific Names
You can generate a bibliography for a scientific name by either searching for a name (see instructions) or using the following URL structure:
https://www.biodiversitylibrary.org/name/Scientific_name
Where Scientific_name is any uninomial, binomial, or trinomial. Replace spaces with the underscore ( _ ) character.
Examples:
https://www.biodiversitylibrary.org/name/Orchidaceae (Orchid family)
https://www.biodiversitylibrary.org/name/Carcharodon_carcharias (Great white shark)
https://www.biodiversitylibrary.org/name/Phalacrocorax_carbo_maroccanus (Great Cormorant)
Bibliographies use the canonical form of the species name and include results for both the canonical and long name (with author/date) forms. Bibliography URLs for the long name include references for both name forms.
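The URL structure above is easy to generate in a script. The sketch below, in Python, replaces spaces with underscores as described; name_bibliography_url is a hypothetical helper name, and the sketch only covers plain uninomials, binomials, and trinomials (it does not attempt to encode author or date strings from long names).

```python
# Base URL copied from the URL structure described above.
NAME_BASE = "https://www.biodiversitylibrary.org/name/"


def name_bibliography_url(scientific_name: str) -> str:
    """Build a BHL name bibliography URL from a scientific name.

    Hypothetical helper: whitespace between name parts is replaced
    with underscores, as the instructions above describe.
    """
    parts = scientific_name.split()
    if not parts:
        raise ValueError("scientific_name must not be empty")
    return NAME_BASE + "_".join(parts)


for name in ["Orchidaceae", "Carcharodon carcharias", "Phalacrocorax carbo maroccanus"]:
    print(name_bibliography_url(name))
```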
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, taxa, taxon, species, genus, global names
The Biodiversity Heritage Library uses taxonomic intelligence tools, including gnfinder developed by Global Names Architecture, to locate, verify, and record scientific names located within the text of each digitized page. The text used for this identification is usually uncorrected OCR, so it may not include all results expected or visible in the page.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, taxa, taxon, species, binomial, genus, trinomial, optical character recognition
Unfortunately we are unable to correct scientific name issues due to the fact that we use an external service through the Global Names Architecture (GNA) (http://globalnames.org/) to identify scientific names within our corpus. We have chosen to be less restrictive in order to match more names. This choice, however, sometimes allows incorrect names to be identified. We feel this is the best choice in order to give our users the most useful results.
Additionally, due to automated processes that regularly update existing content, we are not able to make changes to the scientific names. The changes would simply be reverted. We do this to keep BHL up to date with changes and additions that are made to the external service.
Furthermore, it is not uncommon for the optical character recognized (OCR) text behind the page images to contain errors that may result in the GNA algorithm missing relevant scientific names. We are working on strategies to improve our OCR for the future but do not yet have a way to correct it.
We recommend you inform GNA about the scientific name issue(s) by submitting your feedback to their GitHub site https://github.com/gnames/gnfinder/issues.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, taxa, taxon, binomial, trinomial, genus, contribute, troubleshooting, report an error
You can search for a scientific name starting from either the general search bar or advanced search page.
If you are starting from General Search…
Enter a taxonomic name into the general search box:
Select the Scientific Names tab from the results screen:
If you are starting from Advanced Search…
Select the Scientific Names tab to search for your name:
Both methods will return a list of names matching your search. Click on the desired name:
You will be taken to the Species Bibliography for that name, which lists all of the individual pages where that name occurs. Use the Pages column to access specific pages.
Bibliographies use the canonical form of the species name and include results for both the canonical and long name (with author/date) forms. Bibliography URLs for the long name include references for both name forms.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, binomials, genus, global names, taxa, taxon, search, data mining, text mining, OCR, optical character recognition
For each of the scientific names found on a page in BHL, you can access a variety of taxonomic data sources by clicking on the DNA icon next to each name.
A pop-up window will appear displaying a list of data sources in which that name is indexed. The dropdown menu for the name displays all the variations by which that name is indexed in each of the listed data sources. Selecting one of those variations will display the source(s) in which that variation appears.
Clicking on any source will take you to the entry for that name in that data source on the web.
These data sources are populated from the Global Names Verifier, which draws from many taxonomic name databases from around the internet. The names themselves are indexed as part of BHL’s implementation of gnfinder, a taxonomic intelligence tool developed by the Global Names Architecture that is used to locate, verify, and record scientific names within the text of each digitized page in BHL. The text used for this identification is usually uncorrected OCR, so it may not include all of the names expected or visible on the page (see more information on this in our FAQ).
If you have questions about, or would like to report an issue related to, the data sources in the Global Names Verifier, please contact the Global Names Architecture on GitHub.
Learn more about BHL’s implementation of gnfinder for taxonomic name recognition.
Tags: taxonomy, taxa, taxon, taxonomic name finding, taxonomic data sources
When new or updated page text is added to BHL, that text is used as input for the gnfinder tool from Global Names. See https://github.com/gnames/gnfinder for more information.
The gnfinder tool does the following:
- Analyzes the text
- Identifies text strings that might be scientific names
- Compares potential names to multiple repositories of scientific names (such as EOL and Catalogue of Life) to identify known names
- Compiles the results
- Outputs the results
Questions about the details of how all of this is done should be directed to Global Names Architecture.
Here is an example gnfinder response to a fragment of text that includes the name “Strix varia”.
NOTE: This example uses a response from gnfinder version 0.11.1. Other versions of the tool may format the response differently.
Fields that are evaluated and/or stored by BHL are highlighted:
{
  "metadata": {
    "date": "2020-06-24T16:39:42.4189206-05:00",
    "gnfinderVersion": "v0.11.1",
    "withBayes": true,
    "tokensAround": 0,
    "language": "eng",
    "detectLanguage": false,
    "totalWords": 462,
    "totalCandidates": 68,
    "totalNames": 7
  },
  "names": [
    {
      "cardinality": 2,
      "verbatim": "Strix varia,",
      "name": "Strix varia",
      "odds": 550545.1983958198,
      "start": 2296,
      "end": 2308,
      "annotationNomenType": "NO_ANNOT",
      "annotation": "",
      "verification": {
        "bestResult": {
          "dataSourceId": 1,
          "dataSourceTitle": "Catalogue of Life",
          "taxonId": "3809730",
          "matchedName": "Strix varia Barton, 1799",
          "matchedCardinality": 2,
          "matchedCanonicalSimple": "Strix varia",
          "matchedCanonicalFull": "Strix varia",
          "classificationPath": "Animalia|Chordata|Aves|Strigiformes|Strigidae|Strix|Strix varia",
          "classificationRank": "kingdom|phylum|class|order|family|genus|species",
          "classificationIds": "3939792|3940184|3944244|3944475|3944476|4195146|3809730",
          "matchType": "ExactCanonicalMatch"
        },
        "dataSourcesNum": 28,
        "dataSourceQuality": "HasCuratedSources",
        "retries": 1
      }
    }
  ]
}
The “dataSourceQuality”, “matchType”, and “odds” fields are evaluated to determine which data to keep and which to discard (responses can include some very uncertain, or “fuzzy”, matches that BHL does not keep).
Once the names to keep are identified, the following data fields are read from the response and stored in BHL:
- name
- matchedName
- matchedCanonicalFull
- dataSourceId (ID for the repository in which the name string was matched)
- dataSourceTitle (the repository in which the name string was matched)
- localId (the ID for the name in the repository in which it was matched)
- taxonId (used in place of the localId, if no localId value exists)
So, in this example, BHL would store the following information (a name and three identifiers for that name):
- name: Strix varia
- matchedName: Strix varia Barton, 1799
- matchedCanonicalFull: Strix varia
- dataSourceId: 1
- dataSourceTitle: Catalogue of Life
- taxonId: 3809730
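The filtering and extraction steps described above can be sketched in Python. This is only an illustration, not BHL's actual code: the function name `names_to_store`, the `identifier` key, and all three acceptance rules (`MIN_ODDS`, `ACCEPTED_MATCH_TYPES`, `ACCEPTED_QUALITY`) are hypothetical, because the real thresholds are not stated here. The input is a trimmed copy of the example response.

```python
import json

# Trimmed copy of the example gnfinder response (only the fields read below).
RESPONSE = json.loads("""
{
  "names": [
    {
      "name": "Strix varia",
      "odds": 550545.1983958198,
      "verification": {
        "bestResult": {
          "dataSourceId": 1,
          "dataSourceTitle": "Catalogue of Life",
          "taxonId": "3809730",
          "matchedName": "Strix varia Barton, 1799",
          "matchedCanonicalFull": "Strix varia",
          "matchType": "ExactCanonicalMatch"
        },
        "dataSourceQuality": "HasCuratedSources"
      }
    }
  ]
}
""")

# ASSUMPTIONS: illustrative acceptance rules, not BHL's real ones.
MIN_ODDS = 100.0
ACCEPTED_MATCH_TYPES = {"ExactCanonicalMatch"}
ACCEPTED_QUALITY = {"HasCuratedSources"}


def names_to_store(response):
    """Return one record per name that passes the (hypothetical) checks."""
    kept = []
    for entry in response.get("names", []):
        verification = entry.get("verification", {})
        best = verification.get("bestResult")
        if not best:
            continue  # name was not matched in any repository
        if entry.get("odds", 0.0) < MIN_ODDS:
            continue
        if best.get("matchType") not in ACCEPTED_MATCH_TYPES:
            continue
        if verification.get("dataSourceQuality") not in ACCEPTED_QUALITY:
            continue
        kept.append({
            "name": entry["name"],
            "matchedName": best.get("matchedName"),
            "matchedCanonicalFull": best.get("matchedCanonicalFull"),
            "dataSourceId": best.get("dataSourceId"),
            "dataSourceTitle": best.get("dataSourceTitle"),
            # localId is preferred; taxonId is used when localId is absent.
            "identifier": best.get("localId") or best.get("taxonId"),
        })
    return kept


for record in names_to_store(RESPONSE):
    print(record)
```

Because the example response has no localId, the sketch falls back to the taxonId, as in the stored values listed above.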
Search
BHL’s full text search looks for your term in the generally uncorrected text derived through Optical Character Recognition (OCR). Since this OCR is automatically generated and uncorrected, it may be incomplete or have errors that prevent the search engine from locating all instances of your term.
Tags: troubleshooting, quality control, quality assurance, QA
ElasticSearch examines multiple fields (i.e., title, keywords, and full OCR text) and assigns a relevancy score based on:
- The number of times a term appears in the field
- The length of the field
- How often that term appears across all of the texts in the corpus
If a field is short, like the title, then a term appearing in that field is weighted higher.
If a field is very long, and a term appears many times in the field, and many times over all of the texts in the corpus, then that term is weighted lower. This helps give a lower weight to words like a, and, or the.
If a field is very long, and a term appears infrequently across all of the texts in the corpus, then it’s ranked higher. This helps give a stronger weight to words like hippopotamus or giraffe.
All three of these factors are combined to produce a score for each field in the document, and then the scores for each field are combined to assign a score to the entire document. On top of this, we can force certain fields to be given a greater “weight” than others.
Fields with a higher “weight” have more of an effect on the score than fields with a lower “weight”. The final document score is used to rank the document in the search results.
Nitty gritty details: https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html
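For readers who prefer code, here is a minimal Python sketch of how these three factors and the field weights could combine. It is loosely modelled on the classic TF/IDF factors described in the guide linked above and is not BHL's or ElasticSearch's actual implementation. The function names and the values in `FIELD_WEIGHTS` are hypothetical.

```python
import math

# ASSUMPTION: illustrative field weights; BHL's real weights are not given here.
FIELD_WEIGHTS = {"title": 3.0, "text": 1.0}


def field_score(term, field_tokens, docs_with_term, total_docs):
    """Score one term against one field (a list of lowercase tokens)."""
    freq = field_tokens.count(term)
    if freq == 0:
        return 0.0
    # More occurrences in the field raise the score, with diminishing returns.
    tf = math.sqrt(freq)
    # Terms that are rare across the corpus count for more than common ones.
    idf = 1.0 + math.log(total_docs / (docs_with_term + 1))
    # A match in a short field counts for more than one in a long field.
    length_norm = 1.0 / math.sqrt(len(field_tokens))
    return tf * idf * length_norm


def document_score(term, fields, docs_with_term, total_docs):
    """Combine per-field scores, applying the per-field weight."""
    return sum(
        FIELD_WEIGHTS.get(name, 1.0)
        * field_score(term, tokens, docs_with_term, total_docs)
        for name, tokens in fields.items()
    )


doc = {"title": ["the", "giraffe"], "text": ["the"] * 50 + ["giraffe"]}
rare = document_score("giraffe", doc, docs_with_term=10, total_docs=1000)
common = document_score("the", doc, docs_with_term=990, total_docs=1000)
print(rare > common)
```

In this toy corpus the rare word "giraffe" outranks the very common word "the", even though "the" appears far more often in the document.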
Tags: user interface, browse
BHL’s full text search functionality uses the ElasticSearch engine to try and find your search term(s) within:
— the bibliographic metadata about the items in BHL’s collection and
— the Optical Character Recognized (OCR) text from digitized page images.
Learn more about OCR and how BHL uses it.
Tags: technical development, user interface, search inside
You can generate a bibliography for a scientific name by either searching for a name (see instructions) or using the following URL structure:
https://www.biodiversitylibrary.org/name/Scientific_name
Where Scientific_name is any uninomial, binomial, or trinomial. Replace spaces with the underscore ( _ ) character.
Examples:
https://www.biodiversitylibrary.org/name/Orchidaceae (Orchid family)
https://www.biodiversitylibrary.org/name/Carcharodon_carcharias (Great white shark)
https://www.biodiversitylibrary.org/name/Phalacrocorax_carbo_maroccanus (Great Cormorant)
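If you are generating these links in bulk, the URL structure above is easy to script. The following Python sketch shows one way to do it. The function name `name_bibliography_url` is hypothetical, and the percent-encoding step is an added precaution rather than something the URL structure requires.

```python
from urllib.parse import quote

BASE_URL = "https://www.biodiversitylibrary.org/name/"


def name_bibliography_url(scientific_name):
    """Build a name bibliography URL from a uninomial, binomial, or trinomial."""
    parts = scientific_name.split()
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected a name of one, two, or three words")
    # Spaces become underscores; quote() leaves letters and underscores as-is.
    return BASE_URL + quote("_".join(parts))


print(name_bibliography_url("Carcharodon carcharias"))
```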
Bibliographies use the canonical form of the species name and include results for both the canonical and long name (with author/date) forms. Bibliography URLs for the long name include references for both name forms.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, taxa, taxon, species, genus, global names
Using the basic search box, select either the “Full-text” or “Catalog” radio button. The default search is a full-text search.
- Full-text — searches bibliographic metadata + the full character recognized text of the entire BHL collection. Yields (many) more results than a “catalog” search.
- Catalog — searches bibliographic metadata only, such as title, author, etc. as you would expect to search within a library catalog. Yields fewer results than full-text search.
Note that full text matches rely on automatically generated Optical Character Recognition (OCR) which is of variable quality for each scanned book. Please review more information about how our full text search works in BHL.
Select either the “Full-text” (searches bibliographic metadata + full text) or “catalog” (searches bibliographic metadata only) button by the search box.
- Enter keywords in a string:
- Example: Darwin origin species
- For an exact search, enter a phrase in quotes:
- Example: “origin of species”
- Use quotes when searching for a DOI
- Example: “10.5962/bhl.title.98662”
- Truncate to search across multiple possible term variants or spellings
- Example: Linn*
- Capitalize Boolean operators
- AND: Search terms combined; Results include any items that contain both terms in the metadata or full text
- Example: barred AND owl
- OR: Results include any items that include either term in the metadata or full text
- Example: Linne OR Linnaeus
- NOT: Results exclude any items that contain the term following the operator ‘NOT’
- Example: owl NOT barred
- For more complex searches that combine Boolean operators, enter keywords in parentheses:
- Example: (origin OR descent) AND Darwin
For more advanced search strategies, please see our Advanced Search recommendations.
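If you assemble many queries, for example for a spreadsheet of search terms, the conventions above can be captured in a few small helpers. This Python sketch only builds the text you would type into the search box. The helper names (`phrase`, `any_of`, `all_of`, `exclude`) are hypothetical and are not part of any BHL tool.

```python
def phrase(text):
    """Wrap a phrase (or a DOI) in quotes for an exact search."""
    return f'"{text}"'


def any_of(*terms):
    """Match items containing at least one of the terms."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Match items containing every term."""
    return " AND ".join(terms)


def exclude(query, term):
    """Drop items that contain the given term."""
    return f"{query} NOT {term}"


print(all_of(any_of("origin", "descent"), "Darwin"))
print(exclude("owl", "barred"))
print(phrase("10.5962/bhl.title.98662"))
```

The Boolean operators are always written in capitals, as the search box expects.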
Tags: how to, troubleshooting, support
Each search result displays item information including some basic metadata (e.g., Title, Volume, Publication, etc.). Expand Details to view your search terms in context.
To help refine your search, BHL offers Did You Mean terms presented above the search results. These can be used to help identify potential variant spellings of your search term or other terms of similar spelling.
Tags: facet, filter, advanced search, authors, titles, sort, browse, troubleshooting, support
BHL now offers the ability to filter your results using facets on the left-hand side. Refine your search by Type, Material, Author, Publication Date, Subject, and Language. By default, each facet will display up to 10 values for filtering. When there are more than 10 available, selecting more… will expand the list to up to 30.
Results automatically update upon selection of a facet value. For best results, select one at a time, allowing the results set to update, before selecting an additional facet value if you wish to further limit your results.
Each facet value displays the number of matches found in parentheses. Some facets are organized by the number of matches while others may be chronological or alphabetical, as outlined below.
Filter by Type
Options sort alphabetically
Filter by Material
Options sort alphabetically
Filter by Author
Options sort by relevance (i.e., names with the most matches listed first)
Filter by Publication Date
Options sort chronologically
Filter by Subject
Options sort by relevance (i.e., names with the most matches listed first)
Filter by Language
Options sort by relevance (i.e., names with the most matches listed first)
Tags: help desk, user interface, browse
Yes you can! To search inside a book in BHL, navigate to the top right corner of the book viewer, select the Search Inside tab and enter your search terms.
In that same panel, results will display the pages where your search terms are found, along with snippets of the surrounding text. Navigate to any of those pages by clicking on the hyperlinked page number.
There are also two alternative “search inside” methods for searching within each individual book from outside of the book viewer:
- Search using the Internet Archive’s interface
From the book viewer, go to the upper right hand corner of the screen. Select the “Download Contents” drop-down menu, choose “View at Internet Archive.” You are now viewing the item on the Internet Archive’s website. Select the magnifying glass icon to the right of the book. Note: Do not use the top right search bar, as this searches the entire Internet Archive corpus.
- Download and search the full PDF
From the book viewer, go to the upper right hand corner of the screen. Select the “Download Contents” drop-down menu, choose “Download Book.” Select PDF to download the entire book/volume as a PDF with the full text included as a layer. To search, use your PDF viewer’s search tool (often a magnifying glass icon), or Ctrl+F (Command+F on a Mac).
IMPORTANT: the full text provided by IA/BHL is uncorrected text derived automatically through Optical Character Recognition (OCR) software. As it is automatically generated and uncorrected, it may contain errors. Full text search is a powerful research tool, but should not be considered exhaustive. Learn more about how full text search works in BHL.
Tags: user interface, book viewer, display, find
You can browse the BHL’s collection by Title, Author, Date, Collection, or Contributor.
- Browsing by Title will provide an alphabetical list of all titles in the BHL collection — ignores leading English articles such as The, A, An, etc.; does not ignore non-English leading articles such as La, Die, etc.
- Author browse allows you to review an alphabetized list of all creators in the BHL corpus – choose the starting letter of the Last Name, then scroll or search within the results using your browser’s Find feature.
- Browsing by Collection allows you to explore a variety of title subsets curated around a particular topic or theme.
- When browsing by Contributor you are able to view subsets of items grouped according to the institution that contributed them to the BHL collection.
Browse pages display 250 results at a time. You can page through the results using the “Next” and “Prev” controls.
To browse each page more effectively, use your browser’s Find features (Generally Ctrl+F for PCs and Command+F for Macs). This will allow you to search within the browse ‘results.’
Tags: user interface, book viewer, display, find
BHL’s Advanced Search provides a number of options for restricting your search. Advanced searches are catalog-only (i.e. bibliographic metadata like title, author, etc.) searches by default. However, you can use the field on the “Publications” tab to search publication text in addition to bibliographic metadata.
Publications
Search by entering your keywords into specific bibliographic fields to better limit your results. Searching on a combination of two or more fields will limit your results even further:
Title
Author Name
Year
Subject
Language
Collection
Text
For the and fields, select the “All Words” option to search for all of the words specified (in any order) and the “Exact Phrase” option to search for an exact phrase.
Authors
Limit your search for an author’s last name to BHL’s table of author names.
Subjects
Search within the subject keyword database to pull up all subjects related to your search term.
Scientific Names
Biodiversity Heritage Library uses Global Names Architecture’s gnfinder, a taxonomic name finding tool, to search through all of the texts digitized in BHL and extract the scientific names. Searching for a name will return a list of all the individual pages where that name occurs.
IMPORTANT: the full text provided by IA/BHL is uncorrected text derived automatically through Optical Character Recognition (OCR) software. As it is automatically generated and uncorrected, it may contain errors. Full text search is a powerful research tool, but should not be considered exhaustive.
Tags: how to, troubleshooting, support, user interface
You can search for a scientific name starting from either the general search bar or advanced search page.
If you are starting from General Search…
Enter a taxonomic name into the general search box:
Select the Scientific Names tab from the results screen:
If you are starting from Advanced Search…
Select the Scientific Names tab to search for your name:
Both methods will return a list of names matching your search. Click on the desired name:
You will be taken to the Species Bibliography for that name, which lists all of the individual pages where that name occurs. Use the Pages column to access specific pages.
Bibliographies use the canonical form of the species name and include results for both the canonical and long name (with author/date) forms. Bibliography URLs for the long name include references for both name forms.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, binomials, genus, global names, taxa, taxon, search, data mining, text mining, OCR, optical character recognition
Of course. If you know the title of the book you are looking for and none of the other approaches to searching the BHL corpus has worked for you, try doing a site-specific Google search of BHL. To do this:
- go to www.google.com
- in the search box, enter your search terms
- after your search terms, add site:biodiversitylibrary.org
- press enter!
This makes it possible to use Google’s search algorithm on BHL’s website (searches book metadata).
Example search string: “resultats campagnes scientifiques accomplies site:biodiversitylibrary.org”
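The same search can be opened directly from a link. This Python sketch builds a Google search URL that includes the site: restriction described above. The function name `bhl_site_search_url` is hypothetical.

```python
from urllib.parse import urlencode

BHL_SITE = "biodiversitylibrary.org"


def bhl_site_search_url(search_terms):
    """Build a Google search URL restricted to the BHL website."""
    query = f"{search_terms} site:{BHL_SITE}"
    # urlencode() escapes spaces, colons, and non-ASCII characters safely.
    return "https://www.google.com/search?" + urlencode({"q": query})


print(bhl_site_search_url("resultats campagnes scientifiques accomplies"))
```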
Hidden within the pages of BHL books and journals are millions of visual resources including drawings, paintings, diagrams, maps, charts, tables, and photographs. They range in size from small, black and white line drawings interspersed within text to full page plate color images.
Image searching within BHL is limited at this time. Searching can only be done at the item (i.e., book or volume) level. Once you’ve identified an item of interest, you can open it in the book viewer and use the thumbnail view to quickly browse for images.
BHL also makes many of its images available on Flickr. Visit BHL’s Flickr Photostream at
https://www.flickr.com/photos/biodivlibrary.
Use the “search photostream” magnifying glass icon to restrict your search to BHL’s image collection on Flickr.
To learn more about searching for BHL images or other useful tips and training on using BHL’s website, please see Where can I find training materials…?
Tags: illustrations, #SciArt, art, artworks, photographs
If you cannot find what you are looking for in the BHL collection, please consider submitting a Scanning Request via our webform. We will do our best to process your request as time allows. Please keep in mind that we may be limited in our ability to fulfill requests depending on the holdings of BHL partner institutions, condition of materials, and any copyright restrictions that may apply.
For more information, see our Guidelines for Submitting Scanning Requests.
Tags: digitization, collection development
- Visit BHL’s photostream at flickr.com/biodivlibrary.
- Click on the magnifying glass.
- Enter your search term (such as a scientific name or keyword) in the search box and press “Enter”.
Note that not all of the images in our Flickr collection have been tagged with scientific names. These tags are added by volunteers as part of a citizen science initiative. Learn more about how you can get involved.
Tags: illustrations, #SciArt, art, artworks, photographs, image tagging, crowdsourcing
We have implemented functionality to allow BHL Staff to upload transcriptions in place of the automatically-generated OCR (Optical Character Recognition) text for digitized materials in BHL’s collection. This functionality supports transcriptions generated as part of in-house or crowdsourced transcription projects hosted by BHL Partners. The Show Text tab now indicates whether the text has been:
- automatically generated and uncorrected;
- automatically generated and then error-corrected by machines, which may still leave inconsistencies;
- or manually transcribed by humans.
Please note that BHL’s OCR is generated by its Internet Archive digitization partner using Tesseract Open Source OCR (as of 2020) or ABBYY FineReader.
Web-based crowdsourced transcription projects are largely managed through the following providers: DigiVol, FromThePage, and the Smithsonian Transcription Center.
Especially for archival materials, like field notes and correspondence with handwritten text, transcriptions make these items full-text searchable and enable our taxonomic name recognition software to index scientific names within their pages. Since the transcribed text can be viewed alongside the digitized page image, users can also more easily read materials with difficult-to-decipher handwriting. Thus, this new functionality makes it easier for researchers and the public to explore these valuable primary source materials and access specific information from their pages.
Interested in transcribing materials? Several BHL Partners have transcription projects on various crowdsourcing platforms. Follow the links below to explore the opportunities and get involved:
- Auckland War Memorial Museum Tamaki Paenga Hira on FromThePage
- Ernst Mayr Library of Harvard University on DigiVol
- Harvard Botany Libraries on FromThePage
- Lenhardt Library, Chicago Botanic Garden on FromThePage
- The John Torrey Papers from The New York Botanical Garden on FromThePage
- Smithsonian Institution Archives on the Smithsonian Transcription Center
- Ukrainian Collection items from the National Agricultural Library (NAL) on FromThePage
Tags: crowdsourcing, citizen science, transcription, OCR, full text search, archives
Technical
If the BHL website is down, please monitor our status on Twitter https://twitter.com/biodivlibrary or Facebook https://www.facebook.com/BioDivLibrary/. If BHL is down, you can access content from our partners in the biodiversity collection via Internet Archive.
If the BHL website is available but page images for books are not showing then this means that the Internet Archive’s website at http://archive.org is down. We partner with the Internet Archive to digitize and host our entire collection and as such their outages directly affect the availability of our materials. We will provide updates on our homepage and via our social media outlets as described above.
If you are experiencing website difficulties for whatever reason, please try http://www.downforeveryoneorjustme.com/ to test your connection. If the problem persists, submit your question or problem to our feedback form or (if the form is unavailable due to website outages) email us at feedback@biodiversitylibrary.org, describing your issue in as much detail as possible. The more information we have to replicate the problem, the better our chance of addressing it.
Tags: troubleshooting, technical issues, website problem, feedback, technical support
Status Update: October 28, 2024, 11:30 EDT
The BHL website is currently available. The Internet Archive is back online, though maintenance and upgrades continue until the site is fully restored. If archive.org access is disrupted again, BHL page images will not load.
Why aren’t page images loading in BHL?
BHL serves its page images directly from the Internet Archive, our digitization partner. Outages affecting archive.org directly affect the availability of BHL materials.
What happened to archive.org?
Starting on October 8, archive.org and the Internet Archive experienced a significant cyber attack, resulting in a systems outage, defacement of the website, and in a separate incident, a breach of user data. For security reasons, the Internet Archive chose to take its sites and services offline to focus on securing, prioritizing, and preparing systems to be restored incrementally. Visit the Internet Archive blog for more information on the attacks.
Can I still use BHL while archive.org is offline?
Many BHL services remain available even when page images are not loading.
You can:
- Search the collection
- Search for scientific names
- View or download OCR text of page images
- Download PDFs for individual articles
- View over 300,000 BHL images via Flickr
Can I view BHL items on other platforms?
BHL is a global consortium housing contributions from partners around the world. Our partners have contributed their content to a number of platforms. You can find portions of BHL’s collection in these other repositories:
- Academy of Natural Sciences, Art Collection Guide
- AnimalBase
- American Museum of Natural History Library Digital Repository
- Bibliotheca Alexandrina – contains snapshot of 100,000+ BHL volumes via Advanced Search
- Botanicus
- Cornell University Library Digital Collections
- Digital library of the Caribbean
- Digitale Sammlungen– Bavarian State Library via the Munich Digitization Center
- Europeana
- Field Museum Annual Reports
- Harvard Digital Library Series
- “Dumbarton Oaks Digitization Project. Garden and Landscape Studies. Rare books”
- “Nature Prints from the Botany Libraries”
- Archival collections
- Hathi Trust – primarily Google Books scans but some other things as well
- Gallica (Bibliothèque nationale de France)
- German Botanical Journals Collection, 1753-1914
- Illinois Digital Archives, Chicago Botanic Garden
- Library and Archives Canada Collection Search
- Library of Congress Digital Collections
- Linda Hall Library
- Muséum national d’Histoire naturelle
- New York Botanical Garden Digital Library
- Real Jardín Botánico, CSIC Digital Library
- SciELO.org
When will BHL services be fully restored?
Archive.org came back online on Monday 21 October in read-only mode. At that time, most page images were once again viewable in BHL. Some BHL page images may still be unavailable until all IA services are fully restored.
BHL resumed its custom PDF generator and processed the backlog of requests. Some requests may encounter missing page images until IA services are fully restored.
Archive.org has not resumed all services, so BHL partners are not yet uploading new content to the Internet Archive, and therefore no new content is being contributed to BHL at this time. Our partners will resume contributions when archive.org services are fully restored.
Where can I get the latest updates?
For the latest status updates on BHL services, follow @biodivlibrary.
For the latest status updates on Internet Archive services, follow their official accounts on Twitter/X, Bluesky or Mastodon.
Tools and Services
BHL produces stable URLs for our content and will ensure viability of these URLs. Please read the following blog post for an explanation of how BHL redirects certain IDs when a book has been taken offline.
Stable URLs are available for the following areas of content, with examples:
- Subject: Insects
http://www.biodiversitylibrary.org/subject/Insects
- Author: Darwin, Charles, (1809 – 1882)
http://www.biodiversitylibrary.org/creator/93
- Title: The Journal of the Linnean Society
http://www.biodiversitylibrary.org/bibliography/350
- Item/Book: The Journal of the Linnean Society, v. 8 1865
http://www.biodiversitylibrary.org/item/8361
- Article: Davidse, G, & Pohl, R. New taxa and nomenclatural combinations of Mesoamerican grasses (Poaceae). Novon a journal of botanical nomenclature from the Missouri Botanical Garden, 2,81-110. https://doi.org/10.2307/3391667
https://www.biodiversitylibrary.org/part/2491
- Page: Pl. XXX Falco Aesalon from Die Raubvögel Deutschlands und des angrenzenden Mitteleuropas…
http://www.biodiversitylibrary.org/page/47850703
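The URL patterns in these examples can be summarized in a short Python sketch. The function name `stable_url` and the content-type labels used as dictionary keys are hypothetical. The sketch also assumes that the https form of each URL resolves in the same way as the http form shown in most of the examples.

```python
BASE_URL = "https://www.biodiversitylibrary.org"

# Content type (hypothetical label) -> path segment seen in the examples.
PATH_SEGMENTS = {
    "subject": "subject",
    "author": "creator",
    "title": "bibliography",
    "item": "item",
    "article": "part",
    "page": "page",
}


def stable_url(content_type, identifier):
    """Build a stable BHL URL from a content type and its ID or subject term."""
    if content_type not in PATH_SEGMENTS:
        raise ValueError(f"unknown content type: {content_type}")
    return f"{BASE_URL}/{PATH_SEGMENTS[content_type]}/{identifier}"


print(stable_url("title", 350))
print(stable_url("page", 47850703))
```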
With regards to persistent identifiers, BHL also assigns DOIs to a selection of its content. Learn more.
Tags: URI, permalink, linking, identifiers, persistent identifiers
A DOI (Digital Object Identifier) is a unique permanent identifier that provides a persistent link to a digital object. When a DOI is assigned to a publication, such as a journal article, that DOI becomes part of the publication’s bibliographic information and should be included (as a link) whenever the publication is referenced. DOIs thus create a linked network of scholarly research, enabling readers to click from publication to publication. They also facilitate persistent linking to publications from websites, blog posts, tweets, Wikipedia articles, etc.
The vast majority of historic publications lack DOIs and thus sit outside this linked network. BHL is working to change this. BHL started minting DOIs for historic publications in 2011, focusing primarily on books and monographs. In October 2020, BHL launched a new Persistent Identifier Working Group (PIWG) whose first assignment is to retrospectively mint DOIs for the articles on BHL, thereby bringing the world’s legacy biodiversity journals into the modern linked network of knowledge.
DOIs not only enable persistent linking to BHL content; they also allow us to track how that content is being used. For example, in October 2020, BHL minted a DOI for the first scientific description of the Duck-billed Platypus (published in 1799 and contributed to BHL by Museums Victoria): https://doi.org/10.5962/p.304567. By April 2021, the article had been tweeted by 219 twitter accounts, referenced in six Wikipedia pages, picked up by one news outlet and cited in an academic paper (data from Altmetric, April 2021). We know this because the article has a DOI.
To learn more about DOIs, please see:
https://en.wikipedia.org/wiki/Digital_object_identifier.
Want to know more about BHL’s Persistent Identifier Working Group? See:
- What Is BHL’s New Persistent Identifier Working Group DOI’ng? BHL Blog Post, 10 May 2021: https://blog.biodiversitylibrary.org/2021/05/persistent-identifier-working-group.html
- Discovering the Platypus: From its scientific description to its DOI, Biodiversity Information Science and Standards (TDWG) Conference, 6 October 2020: https://youtu.be/4UVSEoWsSrw?t=1285
- #RetroPIDs: making historic Platypus Infinitely Discoverable (PID), PIDapalooza: the festival of persistent identifiers, 28 January 2021: https://youtu.be/CSeQNe5KR5U
With regards to persistent identifiers, BHL also produces persistent and stable URLs for our content and will ensure viability of these URLs. Learn more.
Tags: digital object identifier, DOI, IDs, Crossref, cite, citation, permalink, linking, persistent identifiers, RetroPIDs.
Yes of course! The BHL makes its metadata available for public use under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license. This Creative Commons license allows you to reuse, modify, repurpose, and distribute the metadata for all purposes including commercial and non-commercial, with no need to ask for permission.
Metadata in this case, refers to:
- Library catalog records, i.e. bibliographic data, used to describe the books and journals in the BHL collection (e.g. title and author data).
- Page level data such as page numbers and page types (e.g. “Title page” and “Illustration”).
- Scientific name data, e.g. “Zea mays”.
The data in BHL’s collection is sourced and aggregated from its consortium partners and Internet Archive contributors. It is provided “as is,” without express or implied warranty as to accuracy, reliability, or fitness for any particular application. Please see our Data Disclaimer for more information.
Go ahead, take our metadata and do something creative with it! If you do repurpose BHL metadata please share your story with us. We often like to feature stories of reuse on our BHL blog.
Tags: download, export, METS, MODS, MARC, data, data mining, data export, data quality, text mining, copyright, license, licensing
Of course! As a free and open access repository of biodiversity literature, we provide all our code and documentation on our GitHub site: https://github.com/gbhl/bhl-us. Please let us know if you reuse our code in any new or interesting ways. We’d love to hear from you!
Tags: technical development, database, data model
Indeed we do and we encourage people to reuse our data in new and innovative ways! Please see more information about our APIs and other Developer and Data Tools.
Tags: data export, data mining, text mining, technical development, code
BHL provides all its bibliographic and scientific name data for download and reuse via various web-services. Please visit our Developer and Data Tools page for more information.
Tags: download, METS, MODS, data export, API, technical development, text mining, data mining, OAI-PMH
Optical Character Recognition (OCR) is the process of converting images of text into machine readable text characters. This process is performed by special software such as ABBYY FineReader or Tesseract Open Source OCR. It is important to note that OCR text derived from this automated process is uncorrected and of varying quality.
BHL uses OCR to process all the page images in our collection so that the text contained within the images can be indexed and made searchable in support of full text search functionality and the taxonomic name finding algorithm.
BHL’s OCR is generated by its Internet Archive digitization partner using Tesseract Open Source OCR (as of 2020). Items uploaded to BHL prior to 2020 used ABBYY FineReader to generate OCR. To learn more about the quality of Tesseract generated OCR, please see the BHL blog post on OCR Improvements (July 2022).
Tags: search, text mining, data mining, text recognition
The OCR, or Optical Character Recognition, text in BHL is generally automatically-generated and uncorrected, meaning that it may be incomplete or contain errors. We do not yet have a way for users to directly correct the OCR within BHL.
We have implemented functionality to allow BHL Partners to upload transcriptions in place of the automatically-generated OCR (Optical Character Recognition) for archival materials digitized in BHL. You are welcome to help improve the OCR for these materials by participating in various Partner crowdsourcing projects from which BHL sources transcriptions, such as the Smithsonian Transcription Center, FromThePage, and DigiVol.
Tags: crowdsourcing, OCR, text recognition, text correction, citizen science
BHL offers metadata exports in MODS, BibTeX, and RIS formats. MODS is an XML-based bibliographic description schema used in a variety of library applications. BibTeX and RIS are bibliographic citation formats that are compatible with a variety of citation management tools. The MODS file is a title-level download. The BibTeX and RIS files are item-level downloads.
The MODS download is available at the bottom of the title and part bibliography pages.
The BibTeX and RIS downloads are available in two places:
1) Under the volume or part details on bibliography pages.
2) Under the “Download Contents” menu in the book viewer, via the “Download Citation” option.
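For readers who want to process a downloaded citation file themselves, here is a minimal sketch of reading RIS records in Python. It relies only on the general RIS line layout (a two-character tag, two spaces, a hyphen, then the value, with ER closing each record). The sample record is made up for illustration and is not an actual BHL export.

```python
import re

# An RIS line is a two-character tag, two spaces, a hyphen, and the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -(?: (.*))?$")


def parse_ris(text):
    """Parse RIS text into a list of records (dicts of tag -> list of values)."""
    records, current = [], {}
    for line in text.splitlines():
        match = RIS_LINE.match(line.rstrip())
        if not match:
            continue  # this simple sketch skips blank and continuation lines
        tag, value = match.group(1), (match.group(2) or "").strip()
        if tag == "ER":  # ER marks the end of a record
            records.append(current)
            current = {}
        else:
            # Tags such as AU can repeat, so values are kept in a list.
            current.setdefault(tag, []).append(value)
    return records


# Made-up sample record, not an actual BHL export.
sample = """TY  - BOOK
TI  - An example title
AU  - Doe, Jane
AU  - Roe, Richard
PY  - 1900
ER  - 
"""

records = parse_ris(sample)
```

In practice, citation managers such as Zotero or EndNote import these files directly, so a hand-written parser is only needed for custom processing.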
Tags: zotero, refworks, cite, citations, export, EndNote, Mendeley
The Biodiversity Heritage Library uses taxonomic intelligence tools, including gnfinder, developed by Global Names Architecture, to locate, verify, and record scientific names within the text of each digitized page. The text used for this identification is usually uncorrected OCR, so the results may not include every name expected or visible on the page.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, taxa, taxon, species, binomial, genus, trinomial, optical character recognition
Unfortunately, we are unable to correct scientific name issues because we use an external service from the Global Names Architecture (GNA) (http://globalnames.org/) to identify scientific names within our corpus. We have chosen less restrictive matching in order to match more names. This choice, however, sometimes allows incorrect names to be identified. We feel this is the best choice to give our users the most useful results.
Additionally, because automated processes regularly refresh the names in existing content, any manual changes to the scientific names would simply be reverted. These processes keep BHL up to date with changes and additions that are made to the external service.
Furthermore, it is not uncommon for the OCR text behind the page images to contain errors that cause the GNA algorithm to miss relevant scientific names. We are working on strategies to improve our OCR for the future but do not yet have a way to correct it.
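To illustrate why OCR errors lead to missed names, here is a deliberately simplified sketch. It is not how gnfinder works: it only looks up exact strings from a small, hypothetical reference list, and the sample sentences are invented. It does show how a single misread character hides a name from an automated matcher.

```python
# Hypothetical reference list of verified names (illustration only).
KNOWN_NAMES = {"Quercus alba", "Acer rubrum"}


def find_names(text):
    """Return the known names that appear verbatim in the text."""
    return sorted(name for name in KNOWN_NAMES if name in text)


clean_text = "Specimens of Quercus alba and Acer rubrum were collected."
# Simulated OCR error: the "e" in "Quercus" was misread as "c".
ocr_text = "Specimens of Qucrcus alba and Acer rubrum were collected."

names_in_clean = find_names(clean_text)  # both names are found
names_in_ocr = find_names(ocr_text)      # the misread name is missed
```

Real name-finding tools are far more tolerant than an exact string lookup, but the same principle applies: the matcher can only work with the characters the OCR produced.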
We recommend you inform GNA about the scientific name issue(s) by submitting your feedback to their GitHub site https://github.com/gnames/gnfinder/issues.
Learn more about how the Taxonomic Name Recognition algorithm in BHL works.
Tags: taxonomy, taxa, taxon, binomial, trinomial, genus, contribute, troubleshooting, report an error
BHL is working to make our library catalog records accessible via standard repositories and formats. We provide our title level and item (book or volume) level records through the following methods:
- BHL title records are available in OCLC’s WorldCat® database at https://www.worldcat.org/. If you have access to OCLC metadata service products, please look for the OCLC symbol “BHLMR” when reviewing holdings.
- BHL has implemented the NISO “Knowledge Bases and Related Tools” (KBART) standard to facilitate the harvest of our item level holdings data into index-based discovery layer tools. If your institution is a subscriber to a major discovery product or knowledge base supplier, please talk to your provider about indexing BHL records in their system. OCLC’s Knowledgebase, Primo, and Summon are a few of many examples of index-based discovery layer tools that utilize data available in the KBART standard. Please refer to the BHL KBART FAQ for more details.
- Exports of BHL bibliographic, scientific name, and full OCR text data are available in a variety of formats via the Biodiversity Heritage Library Open Data Collection on Smithsonian’s Figshare: https://doi.org/10.25573/data.21081727.v1.
We have implemented functionality to allow BHL Staff to upload transcriptions in place of the automatically-generated OCR (Optical Character Recognition) text for digitized materials in BHL’s collection. This functionality supports transcriptions generated as part of in-house or crowdsourced transcription projects hosted by BHL Partners. The Show Text tab now indicates whether the text has been:
- automatically generated and uncorrected;
- automatically generated and error corrected, by machines, which may still include inconsistencies;
- or manually transcribed by humans.
Please note that BHL’s OCR is generated by its Internet Archive digitization partner using Tesseract Open Source OCR (as of 2020) or ABBYY FineReader.
Web-based crowdsourced transcription projects are largely managed through the following providers: DigiVol, FromThePage, and Smithsonian Transcription Center.
Especially for archival materials, like field notes and correspondence with handwritten text, transcriptions make these items full-text searchable and enable our taxonomic name recognition software to index scientific names within their pages. Since the transcribed text can be viewed alongside the digitized page image, users can also more easily read materials with difficult-to-decipher handwriting. Thus, this new functionality makes it easier for researchers and the public to explore these valuable primary source materials and access specific information from their pages.
Interested in transcribing materials? Several BHL Partners have transcription projects on various crowdsourcing platforms. Follow the links below to explore the opportunities and get involved:
- Auckland War Memorial Museum Tamaki Paenga Hira on FromThePage
- Ernst Mayr Library of Harvard University on DigiVol
- Harvard Botany Libraries on FromThePage
- Lenhardt Library, Chicago Botanic Garden on FromThePage
- The John Torrey Papers from The New York Botanical Garden on FromThePage
- Smithsonian Institution Archives on the Smithsonian Transcription Center
- Ukrainian Collection items from the National Agricultural Library (NAL) on FromThePage
Tags: crowdsourcing, citizen science, transcription, OCR, full text search, archives
For each of the scientific names found on a page in BHL, you can access a variety of taxonomic data sources by clicking on the DNA icon next to each name.
A pop-up window will appear displaying a list of data sources in which that name is indexed. The dropdown menu for the name displays all the variations by which that name is indexed in each of the listed data sources. Selecting one of those variations will display the source(s) in which that variation appears.
Clicking on any source will take you to the entry for that name in that data source on the web.
These data sources are populated from the Global Names Verifier, which draws from many taxonomic name databases from around the internet. The names themselves are indexed as part of BHL’s implementation of gnfinder, a taxonomic intelligence tool developed by the Global Names Architecture that is used to locate, verify, and record scientific names within the text of each digitized page in BHL. The text used for this identification is usually uncorrected OCR, so the results may not include every name expected or visible on the page (see more information on this in our FAQ).
If you have questions about, or would like to report an issue related to, the data sources in the Global Names Verifier, please contact the Global Names Architecture on GitHub.
Learn more about BHL’s implementation of gnfinder for taxonomic name recognition.
Tags: taxonomy, taxa, taxon, taxonomic name finding, taxonomic data sources
The Biodiversity Heritage Library is committed to continuously improving access and discoverability of the open access biodiversity literature and archives available at https://www.biodiversitylibrary.org/. Thanks to the input of numerous BHL users, partners, and other stakeholders, BHL has identified a number of top priorities to enhance the user experience. As we have a very small team of dedicated staff for technical development efforts, we must focus our limited resources on one or two priorities at a time.
Our current year and past year technical priorities are detailed below.
- 2024 Technical Priorities
- 2023 Technical Priorities
- 2022 Technical Priorities
- 2021 Technical Priorities
- 2020 Technical Priorities
Although our small staff and current priorities prevent us from pursuing most collaborative efforts, we do welcome any and all suggestions and will review them as time allows. These can be especially important as we move to new priorities in the future. We also encourage anyone to download and use BHL data and content for their own projects. To share technical recommendations or the results of your own work processing or improving BHL data, please contact us.
The Biodiversity Heritage Library (BHL) is an open access digital library that aggregates digitized monographs, journal volumes, and archival materials from hundreds of different contributors. In some cases, a consecutive series of journal volumes is available, but in other cases there are gaps in coverage. With the metadata available, it is difficult, if not impossible, to accurately identify these gaps. Identifying what is available is easy; identifying what is missing is hard.
These gaps in coverage occur for a variety of reasons: the volumes were not available for digitization due to contributor holdings or condition of the materials, or copyright restrictions prevented BHL from including the volumes in its open access repository. In order to more accurately present these coverage gaps, BHL followed guideline 6.4.6 (pg.17) in “NISO RP-9-2014, KBART Phase II Recommended Practice” where “A title should be listed twice if there is a coverage gap of greater than or equal to 12 months, with only the coverage field changing.”
The NISO KBART Standing Committee reviewed BHL’s KBART data file in 2020 and recommended that each volume be listed individually. As a result, BHL’s KBART file contains many redundancies but guarantees that each volume held in the collection is represented accurately: when a new volume of a title is added to BHL’s collection, a new item (row) is added to the KBART file for that title. Of the options listed in the BHL KBART Documentation, option 2 was chosen as the best method to provide holdings data in BHL’s KBART file.
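The difference between the two approaches can be sketched as follows. This is an illustration under stated assumptions, not BHL’s export code: the field names are standard KBART column names, while the journal title, the volume data, and the simplified gap test (a missing year starts a new coverage range) are hypothetical.

```python
def _row(title, vol, year):
    """Build a row covering a single volume, using KBART column names."""
    return {
        "publication_title": title,
        "date_first_issue_online": str(year),
        "num_first_vol_online": str(vol),
        "date_last_issue_online": str(year),
        "num_last_vol_online": str(vol),
    }


def rows_per_volume(title, volumes):
    """One row per held volume: redundant, but every volume is explicit."""
    return [_row(title, vol, year) for vol, year in sorted(volumes)]


def rows_per_coverage_range(title, volumes):
    """One row per run of volumes; a missing year starts a new row.

    The year-based gap test is a simplification of the 12-month rule.
    """
    rows = []
    for vol, year in sorted(volumes, key=lambda v: v[1]):
        if rows and year - int(rows[-1]["date_last_issue_online"]) <= 1:
            # Contiguous with the previous volume: extend the current range.
            rows[-1]["date_last_issue_online"] = str(year)
            rows[-1]["num_last_vol_online"] = str(vol)
        else:
            rows.append(_row(title, vol, year))
    return rows


# Hypothetical holdings as (volume, year); volume 3 (1902) is missing.
held = [(1, 1900), (2, 1901), (4, 1903)]

per_volume = rows_per_volume("Example Journal", held)
per_range = rows_per_coverage_range("Example Journal", held)
```

With these sample holdings, the per-volume listing yields three rows and the coverage-range listing yields two (volumes 1 to 2, then volume 4). The per-volume form is longer, but it never depends on detecting gaps correctly.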
Troubleshooting
BHL’s full text search looks for your term in text derived through Optical Character Recognition (OCR). Because this OCR text is generally automatically generated and uncorrected, it may be incomplete or contain errors that prevent the search engine from locating all instances of your term.
Tags: troubleshooting, quality control, quality assurance, QA
If you notice a problem with the BHL website or an error with any of the materials in our collection, please let us know by filling out our feedback form at http://biodiversitylibrary.org/contact. BHL is voluntarily staffed by our Partner Libraries and we are limited in our ability to respond personally to each contact with our patrons. We appreciate your patience. A BHL staff member may contact you if we require further information.
If the form is unavailable for some reason (e.g., a BHL website outage), you can email us at feedback@biodiversitylibrary.org.
Tags: troubleshooting, error, quality control, quality assurance, QA, technical problem
If the BHL website is down, please monitor our status on Twitter https://twitter.com/biodivlibrary or Facebook https://www.facebook.com/BioDivLibrary/. If BHL is down, you can access content from our partners in the biodiversity collection via Internet Archive.
If the BHL website is available but page images for books are not showing, this means that the Internet Archive’s website at http://archive.org is down. We partner with the Internet Archive to digitize and host our entire collection, so their outages directly affect the availability of our materials. We will provide updates on our homepage and via our social media outlets as described above.
If you are experiencing website difficulties for any reason, please try http://www.downforeveryoneorjustme.com/ to test your connection. If the problem persists, submit your question or problem via our feedback form or, if the form is unavailable due to a website outage, email us at feedback@biodiversitylibrary.org, describing your issue in as much detail as possible. The more information we have to replicate the problem, the better our chance of addressing it.
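The troubleshooting steps above amount to a short decision sequence, sketched here in Python. The function name and its two inputs are hypothetical; they stand for what you observe in your browser, and the function performs no network checks of its own.

```python
def suggest_next_step(bhl_site_loads, page_images_load):
    """Map what the user observes to the suggested next step."""
    if not bhl_site_loads:
        # The whole site is unreachable.
        return (
            "Check BHL's social media for status updates, browse partner "
            "content via the Internet Archive, and email "
            "feedback@biodiversitylibrary.org if needed."
        )
    if not page_images_load:
        # The site works but images do not: they are served by archive.org.
        return (
            "The Internet Archive is down; watch the BHL homepage and "
            "social media for updates."
        )
    # Everything loads, so any remaining problem is likely local or specific.
    return (
        "Test your connection, then describe the problem in detail "
        "through the feedback form."
    )


step = suggest_next_step(bhl_site_loads=True, page_images_load=False)
```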
Tags: troubleshooting, technical issues, website problem, feedback, technical support
If you cannot find what you are looking for in the BHL collection, please consider submitting a Scanning Request via our webform. We will do our best to process your request as time allows. Please keep in mind that we may be limited in our ability to fulfill requests depending on the holdings of BHL partner institutions, condition of materials, and any copyright restrictions that may apply.
For more information, see our Guidelines for Submitting Scanning Requests.
Tags: digitization, collection development
Sometimes our “Select pages to download” feature, also known as our custom PDF generator, experiences temporary technical difficulties that may delay or, at worst, prevent your PDF from reaching your email inbox. We apologize for the inconvenience and ask that you try generating your custom PDF again.
Depending on your connection speed, it may take some time to download the PDF to your machine, as our custom PDF files can be very large.
Note that some browsers offer a built-in PDF viewer, which may not correctly display the images. If you experience viewing problems in your browser, open the PDF in an alternative viewer or try these troubleshooting tips: https://helpx.adobe.com/acrobat/kb/cant-view-pdf-web.html
Tags: PDF, download, article
Depending on your connection speed, it may take some time to download a PDF generated using the “Select pages to download” feature, as our custom PDF files can be very large. If you have trouble loading the PDF file on your machine, try finding the downloaded file (in your downloads folder) and opening it with your preferred PDF reading software.
Note that some browsers offer a built-in PDF viewer, which may not correctly display the images. If you experience viewing problems in your browser, open the PDF in an alternative viewer or try these troubleshooting tips: https://helpx.adobe.com/acrobat/kb/cant-view-pdf-web.html.
Tags: article, download
In addition to the information provided in the BHL FAQ, many of our consortium partners have created training materials about BHL collections and services. You can explore these materials below:
- BHL LibGuides from Partners
  - Harvard Library BHL LibGuide: https://guides.library.harvard.edu/bhlguide
  - University Library at University of Illinois, Urbana-Champaign BHL LibGuide: https://guides.library.illinois.edu/BHL
- Training Videos
  - Training videos created by BHL partners are available on YouTube: https://www.youtube.com/playlist?list=PLYj_4vJM9EpsLEduvEuszDvjJ3h5aT9Bv
Tags: training, tutorials, how-tos, documentation, user training
Status Update: October 28, 2024, 11:30 EDT
The BHL website is currently available. The Internet Archive is back online, though maintenance and upgrades continue until the site is fully restored. If archive.org access is disrupted again, BHL page images will not load.
Why aren’t page images loading in BHL?
BHL serves its page images directly from the Internet Archive, our digitization partner. Outages affecting archive.org directly affect the availability of BHL materials.
What happened to archive.org?
Starting on October 8, 2024, the Internet Archive (archive.org) experienced a significant cyber attack that resulted in a systems outage and defacement of the website, as well as, in a separate incident, a breach of user data. For security reasons, the Internet Archive chose to take its sites and services offline to focus on securing, prioritizing, and preparing systems to be restored incrementally. Visit the Internet Archive blog for more information on the attacks.
Can I still use BHL while archive.org is offline?
Many BHL services remain available even when page images are not loading.
You can:
- Search the collection
- Search for scientific names
- View or download OCR text of page images
- Download PDFs for individual articles
- View over 300,000 BHL images via Flickr
Can I view BHL items on other platforms?
BHL is a global consortium housing contributions from partners around the world. Our partners have contributed their content to a number of platforms. You can find portions of BHL’s collection in these other repositories:
- Academy of Natural Sciences, Art Collection Guide
- AnimalBase
- American Museum of Natural History Library Digital Repository
- Bibliotheca Alexandrina – contains a snapshot of 100,000+ BHL volumes via Advanced Search
- Botanicus
- Cornell University Library Digital Collections
- Digital Library of the Caribbean
- Digitale Sammlungen – Bavarian State Library via the Munich Digitization Center
- Europeana
- Field Museum Annual Reports
- Harvard Digital Library Series
- “Dumbarton Oaks Digitization Project. Garden and Landscape Studies. Rare books”
- “Nature Prints from the Botany Libraries”
- Archival collections
- HathiTrust – primarily Google Books scans, plus some material from other sources
- Gallica (Bibliothèque nationale de France)
- German Botanical Journals Collection, 1753-1914
- Illinois Digital Archives, Chicago Botanic Garden
- Library and Archives Canada Collection Search
- Library of Congress Digital Collections
- Linda Hall Library
- Muséum national d’Histoire naturelle
- New York Botanical Garden Digital Library
- Real Jardín Botánico, CSIC Digital Library
- SciELO.org
When will BHL services be fully restored?
Archive.org came back online on Monday, October 21, in read-only mode. At that time, most page images were once again viewable in BHL. Some BHL page images may still be unavailable until all Internet Archive (IA) services are fully restored.
BHL resumed its custom PDF generator and processed the backlog of requests. Some requests may encounter missing page images until IA services are fully restored.
Archive.org has not resumed all services, so BHL partners are not yet uploading new content to the Internet Archive, and therefore no new content is being contributed to BHL at this time. Our partners will resume contributions when archive.org services are fully restored.
Where can I get the latest updates?
For the latest status updates on BHL services, follow @biodivlibrary.
For the latest status updates on Internet Archive services, follow their official accounts on Twitter/X, Bluesky or Mastodon.