The Smithsonian Institution
new_final_txt.tar.gz (111.07 MB)

Smithsonian Annual Reports Text files v2

dataset
Posted on 2022-09-26, 19:19. Authored by Rebecca Dikow and Michael Trizna.

Plain-text (.txt) files produced by running Tesseract OCR on Smithsonian Annual Report documents. The JPG images used as OCR input were downloaded from https://library.si.edu/digital-library/collection/smithsonian-legacy-publications. The text files have not been post-processed, so they contain the raw OCR output, including any recognition errors.
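
The archive can be read without unpacking it to disk. The following is a minimal Python sketch, not part of the dataset itself. It assumes the download is a gzip-compressed tar archive (as the `.tar.gz` name suggests) and that the text files are UTF-8 encoded; the internal directory layout is not documented here, so the code simply matches any member whose name ends in `.txt`. The function name `iter_text_files` is hypothetical.

```python
import tarfile
from pathlib import Path
from typing import Iterator, Tuple


def iter_text_files(archive_path: str, encoding: str = "utf-8") -> Iterator[Tuple[str, str]]:
    """Yield (member_name, text) for each .txt file in a .tar.gz archive.

    Assumption: the files are UTF-8 text. Bytes that do not decode are
    replaced rather than raising, since the OCR output is not post-processed.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Skip directories, links, and anything that is not a .txt file.
            if not member.isfile() or not member.name.lower().endswith(".txt"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            yield member.name, handle.read().decode(encoding, errors="replace")


# Example usage: print the name and word count of the first text file,
# if the archive has been downloaded to the current directory.
archive = Path("new_final_txt.tar.gz")
if archive.exists():
    for name, text in iter_text_files(str(archive)):
        print(name, len(text.split()))
        break
```

Streaming members one at a time keeps memory use low, which may matter for an archive of this size (111.07 MB compressed).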
