The Smithsonian Institution

Biodiversity Heritage Library Open Data Collection

Published by Jacqueline Dearborn

About the Biodiversity Heritage Library Open Data Collection

All BHL data is available for public use under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication license. This Creative Commons license allows anyone to reuse, modify, re-purpose, and distribute the data for all purposes including commercial and non-commercial, without the need to ask for permission.

Go ahead, take our data and do something creative with it! If you do repurpose BHL metadata, please share your story with us. We often feature stories of reuse on our BHL blog.

To use this data effectively, it is important to understand how it was cataloged, what format types are available, and what data exists for the named entities in the BHL database. Please consult the definitions below:

Hosted vs. Complete Versions

The Biodiversity Heritage Library Open Data Collection contains two iterations of BHL data -- hosted and complete:

Hosted: contains data that is hosted on BHL servers
Complete: contains data that is hosted on BHL servers, plus data that is externally hosted.
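
Because the complete version is a superset of the hosted version, the externally hosted records can be recovered as the difference between the two. The sketch below illustrates this with Python's standard library. The `ItemID` column name and the inline sample rows are assumptions made for illustration only; check the headers of the actual export files before relying on them.

```python
import csv
import io


def read_ids(tsv_text, id_column):
    """Return the set of values found in id_column of tab-separated text."""
    # QUOTE_NONE treats quote characters as ordinary text, on the assumption
    # that the export does not use CSV-style quoting.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return {row[id_column] for row in reader}


# Made-up sample data standing in for the hosted and complete exports.
# The column names here are hypothetical.
hosted_tsv = "ItemID\tTitleID\n1\t10\n2\t10\n"
complete_tsv = "ItemID\tTitleID\n1\t10\n2\t10\n3\t11\n"

hosted_ids = read_ids(hosted_tsv, "ItemID")
complete_ids = read_ids(complete_tsv, "ItemID")

# Records present only in the complete version are the externally hosted ones.
external_only = complete_ids - hosted_ids
print(sorted(external_only))
```

With real files, the same function could be fed the text of each downloaded export instead of the inline samples.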

Export Formats

The records in this collection are cataloged and made available for export by format type:

MODS
BibTeX
RIS
TSV
OCR TXT

Each record contains several dataset distributions for the following named entities in the BHL database:
BHL Titles: contains bibliographic metadata about the journals and monographs as extracted from the contributing library’s catalog at the time of digitization or applied post-digitization.
BHL Items: contains information about each bound object (or “book”) digitized from a contributing library. For a serial, journal, or multi-volume monograph, an item represents a volume or multiple volumes bound together. For a single-volume monograph, an item represents the book.
BHL Creators: contains the names of the authors of each journal and monograph.
BHL Parts (Segments): contains information about articles, chapters, treatments, and similar units. These parts may or may not be contained in material scanned by BHL.
BHL Pages: contains the metadata about the scanned pages from an Item.
BHL Subjects: contains information about subject headings assigned to each journal and monograph represented in the BHL web portal.
BHL Names: contains all of the names that have been identified by Global Names Scientific Names Services and the pages on which those names are found.
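
To show how these entities relate, here is a minimal sketch that groups Items under their parent Title using the TSV format. The column names (`TitleID`, `FullTitle`, `ItemID`, `Volume`) and the sample rows are hypothetical and chosen only for illustration; the real exports may use different headers, so consult the BHL documentation first.

```python
import csv
import io
from collections import defaultdict


def load_rows(tsv_text):
    """Parse tab-separated text into a list of dicts keyed by header name."""
    # QUOTE_NONE keeps any quote characters in titles as literal text
    # (an assumption about how the export is written).
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return list(reader)


def items_by_title(titles, items):
    """Map each title's name to the list of item IDs that belong to it."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["TitleID"]].append(item["ItemID"])
    return {t["FullTitle"]: grouped.get(t["TitleID"], []) for t in titles}


# Made-up sample data: a serial with two volumes and a single-volume monograph.
titles_tsv = "TitleID\tFullTitle\n10\tExample journal\n11\tExample monograph\n"
items_tsv = "ItemID\tTitleID\tVolume\n1\t10\tv.1\n2\t10\tv.2\n3\t11\t\n"

result = items_by_title(load_rows(titles_tsv), load_rows(items_tsv))
print(result)
```

The serial maps to several Items (one per bound volume), while the single-volume monograph maps to exactly one, matching the definitions above.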


Please check out our documentation for a more comprehensive overview of BHL datasets.

Additional Information

• Data includes only records for entities with the status of “Published” in the BHL database.

• Data is refreshed on the first of each month. Only the most recent versions are available due to figshare size constraints.


Data Disclaimer

The data in BHL’s collection is sourced and aggregated from its consortium partners and Internet Archive contributors. It is provided "as is," without express or implied warranty as to accuracy, reliability, or fitness for any particular application. Please see our Data Disclaimer for more information.



