Biodiversity Heritage Library Open Data Collection

Published on by Jacqueline Dearborn

About the Biodiversity Heritage Library Open Data Collection

All BHL data is available for public use under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. This Creative Commons license allows anyone to reuse, modify, repurpose, and distribute the data for any purpose, commercial or non-commercial, without asking for permission.

Go ahead, take our data and do something creative with it! If you repurpose BHL metadata, please share your story with us. We often feature stories of reuse on the BHL blog.

To use this data effectively, it is important to understand how it was cataloged, which format types are available, and what data exists for the named entities in the BHL database. Please consult the definitions below:

Hosted vs. Complete Versions

The Biodiversity Heritage Library Open Data Collection contains two versions of BHL data, hosted and complete:

Hosted: contains data that is hosted on BHL servers.

Complete: contains data that is hosted on BHL servers, plus data that is externally hosted.
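
Because the complete version is described as the hosted data plus externally hosted data, the externally hosted records can be approximated as the difference between the two. The following is a minimal sketch of that comparison, not an official BHL tool. It assumes both versions are read from TSV exports that share an identifier column; the column name `TitleID` and the sample rows are illustrative assumptions, not taken from this page.

```python
import csv
import io


def read_ids(tsv_text, id_column="TitleID"):
    """Return the set of values found in id_column of tab-separated text.

    The default column name "TitleID" is an assumption for illustration.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_column] for row in reader}


def externally_hosted_ids(hosted_tsv, complete_tsv, id_column="TitleID"):
    """IDs present in the complete export but absent from the hosted one."""
    return read_ids(complete_tsv, id_column) - read_ids(hosted_tsv, id_column)


# Tiny made-up samples standing in for the real export files.
hosted_sample = "TitleID\tFullTitle\n1\tFlora A\n2\tFauna B\n"
complete_sample = (
    "TitleID\tFullTitle\n1\tFlora A\n2\tFauna B\n3\tExternal C\n"
)

print(sorted(externally_hosted_ids(hosted_sample, complete_sample)))
```

For real exports, the same functions could be fed the contents of the downloaded files instead of the inline samples.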

Export Formats

The records in this collection are cataloged and made available for export in the following formats:

MODS

BibTeX

RIS

TSV

OCR TXT
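
Of the formats above, TSV is the simplest to process with general-purpose tools. The sketch below shows one way to stream rows from a TSV export using only the Python standard library. The file contents, the column names (`ItemID`, `Volume`), and the UTF-8 encoding are assumptions made for illustration; check the actual export before relying on them.

```python
import csv
import os
import tempfile


def iter_tsv_rows(path):
    """Yield each row of a tab-separated file as a dict keyed by header.

    QUOTE_NONE treats quote characters as ordinary text, which is a
    cautious choice when fields may contain unbalanced quotes.
    UTF-8 is assumed here; the real encoding may differ.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(
            handle, delimiter="\t", quoting=csv.QUOTE_NONE
        )
        yield from reader


# Write a small made-up sample so the sketch runs without a download.
sample = 'ItemID\tVolume\n10\tv. 1 "Plates"\n11\tv. 2\n'
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

rows = list(iter_tsv_rows(sample_path))
os.remove(sample_path)

for row in rows:
    print(row["ItemID"], row["Volume"])
```

Reading rows lazily with a generator keeps memory use low, which may matter given that the page cites size limitations for these exports.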

Each record contains several dataset distributions for the following named entities in the BHL database:

BHL Titles: contains bibliographic metadata about the journals and monographs as extracted from the contributing library’s catalog at the time of digitization or applied post-digitization.

BHL Items: contains information about each bound object (or “book”) digitized from a contributing library. For a serial, journal, or multi-volume monograph, an item represents a volume or multiple volumes bound together. For a single-volume monograph, an item represents the book.

BHL Creators: contains the names of the authors of each journal and monograph.

BHL Parts (Segments): contains information about articles, chapters, treatments, and similar parts. These parts may or may not be contained in material scanned by BHL.

BHL Pages: contains metadata about the scanned pages of an item.

BHL Subjects: contains information about subject headings assigned to each journal and monograph represented in the BHL web portal.

BHL Names: contains all of the names that have been identified by Global Names Scientific Names Services and the pages on which those names are found.

Additional Information

• Data includes only records for entities with the status “Published” in the BHL database.

• Data is refreshed on the first of each month. Only the most recent versions are available due to size limitations.

