Text (.txt) files produced by running Tesseract OCR on Smithsonian Annual Report documents. The JPG images used as input were downloaded from https://library.si.edu/digital-library/collection/smithsonian-legacy-publications. The text files are raw OCR output and have not been post-processed.