Data Downloads

The Smithsonian Libraries is committed to the principles of Open Access and provides as much data as possible under the least restrictive licenses. What about images?

Several datasets are available for reuse below. Most are licensed CC0, but please check each README file for specific licensing and reuse information.

Please note, this page is a work in progress! Check the README files for information on the contents of each dataset and the date it was last refreshed. Some data is updated continually, and some only on a quarterly or annual basis.

If you have questions, feel free to contact us.

Biodiversity Heritage Library

Metadata for digitized texts in the Biodiversity Heritage Library (not just those from the Smithsonian) is available for reuse either through the BHL API or as downloadable delimited text files. The metadata consists primarily of bibliographic data, but lists of species names found in BHL texts are also available.

Information about accessing and reusing BHL data can be found on their wiki: https://about.biodiversitylibrary.org/tools-and-services/developer-and-data-tools/#Data%20Exports
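
As a rough illustration of working with a delimited text export, the sketch below reads rows into dictionaries using only the Python standard library. It assumes a tab-delimited file with a header row and no quoting; the column names in the usage note (TitleID, FullTitle) are hypothetical placeholders, so check the BHL documentation for the actual layout.

```python
import csv
from typing import Dict, Iterator, TextIO


def read_delimited(stream: TextIO, delimiter: str = "\t") -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, keyed by the header row's column names.

    Assumption: the file is tab-delimited with a header line and does not
    use quote characters to wrap fields (hence QUOTE_NONE).
    """
    reader = csv.DictReader(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        yield row
```

To use it on a real file, open the export with open(path, encoding="utf-8", newline="") and pass the file object in; each row can then be read by column name, for example row["FullTitle"] if such a column exists.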

Catalog MARC Data

All MARC (not MARCXML) records from our SirsiDynix Horizon catalog are available as a single downloadable file. Not for the faint of heart! The ZIP file uncompresses to 1.2 GB and contains approximately 1.8 million records.

Last Updated: March 2016
URL: SIL_MARC_20151130.zip (313 MB)
README hints for working with our MARC data 
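
For readers new to binary MARC, here is a minimal sketch of how such a file can be walked with only the Python standard library. It relies on the standard ISO 2709 layout (a 24-byte leader, a directory of 12-byte entries, then the field data) and is not taken from our README. The function names are illustrative, and decoding subfield text as UTF-8 is an assumption, since the records may use another encoding such as MARC-8. A dedicated MARC library will be more robust for real work.

```python
from typing import BinaryIO, Dict, Iterator, List, Tuple

FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
SUBFIELD_DELIMITER = b"\x1f"


def iter_marc_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each raw record from a binary MARC (ISO 2709) stream."""
    while True:
        length_bytes = stream.read(5)  # leader positions 0-4: record length
        if len(length_bytes) < 5:
            return  # end of file
        record_length = int(length_bytes)
        rest = stream.read(record_length - 5)
        if len(rest) < record_length - 5:
            raise ValueError("truncated MARC record")
        yield length_bytes + rest


def parse_fields(record: bytes) -> Dict[str, List[bytes]]:
    """Map each field tag to the raw data of every field with that tag."""
    base_address = int(record[12:17])  # leader positions 12-16
    directory = record[24:base_address - 1]  # drop the directory's terminator
    fields: Dict[str, List[bytes]] = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        data = record[base_address + start:base_address + start + length]
        fields.setdefault(tag, []).append(data.rstrip(FIELD_TERMINATOR))
    return fields


def subfields(field: bytes) -> List[Tuple[str, str]]:
    """Split a data field into (code, value) pairs, skipping the indicators.

    Assumption: values are decoded as UTF-8 with replacement characters.
    """
    parts = field.split(SUBFIELD_DELIMITER)[1:]
    return [(p[:1].decode("ascii"), p[1:].decode("utf-8", errors="replace"))
            for p in parts]
```

To try it, open the extracted file in binary mode, loop over iter_marc_records(f), and look up a tag such as "245" in the dictionary returned by parse_fields. Reading one record at a time keeps memory use low, which matters for a file of this size.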

Catalog data via Z39.50

You can also query the catalog in real time using the Z39.50 protocol. Plenty of free and commercial software supports this. Our catalog is hosted on SirsiDynix Horizon and may be queried using the following parameters:

Address: siris-libraries.si.edu
Port: 210
Attributes:

  • Author Keyword - 1003
  • Title Keyword - 44
  • Subject Keyword - 121
  • General Keyword - 1035 or 1017
  • Barcode - 999
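
Many Z39.50 clients built on the YAZ toolkit accept queries in Prefix Query Format (PQF). The sketch below shows how the attribute numbers listed above could be turned into PQF strings. It assumes the numbers are Bib-1 Use attributes (attribute type 1), which is the usual convention but is not stated on this page. The dictionary keys and function names are illustrative only.

```python
# Attribute numbers copied from the list above; the short keys are made up
# for this example.
USE_ATTRIBUTES = {
    "author": 1003,
    "title": 44,
    "subject": 121,
    "general": 1035,  # the list above also gives 1017 as an alternative
    "barcode": 999,
}


def build_pqf(index: str, term: str) -> str:
    """Build a single-term PQF query, e.g. @attr 1=1003 "darwin".

    Assumption: the number is sent as a Bib-1 Use attribute (type 1).
    """
    if '"' in term:
        raise ValueError("double quotes in search terms are not handled here")
    return f'@attr 1={USE_ATTRIBUTES[index]} "{term}"'


def build_pqf_and(left: str, right: str) -> str:
    """Combine two PQF queries with the prefix-notation @and operator."""
    return f"@and {left} {right}"
```

For example, build_pqf("author", "darwin") produces a query string that could be pasted into a PQF-capable client after connecting to the address and port given above.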

Taxonomic Literature II

The raw data for this important botanical reference work is available in two formats: text and XML. The ZIP file below contains two folders, one for each format. Each folder holds 15 files named by TL-2 Volume or Supplemental Volume (e.g., TL_2_Vol_2.txt or TL_2_Suppl_7.xml). Browse before you download here.

Last Updated: January 23, 2013
URL: Download ZIP File Now (29.3 MB, Version 1.2)
README and Release Notes
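
As a quick check after downloading, the sketch below lists the archive's files grouped by format, using the file-naming pattern described above. It uses only the Python standard library. The function name is illustrative, and the folder names inside the archive are not assumed; only the file names are matched.

```python
import re
import zipfile
from typing import Dict, List

# Matches names such as TL_2_Vol_2.txt or TL_2_Suppl_7.xml, in any folder.
NAME_PATTERN = re.compile(r"TL_2_(Vol|Suppl)_(\d+)\.(txt|xml)$")


def list_tl2_files(zip_source) -> Dict[str, List[str]]:
    """Return archive member names grouped by format ("txt" or "xml").

    zip_source may be a path or a binary file-like object.
    """
    by_format: Dict[str, List[str]] = {"txt": [], "xml": []}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            match = NAME_PATTERN.search(name)
            if match:
                by_format[match.group(3)].append(name)
    for names in by_format.values():
        names.sort()
    return by_format
```

If the archive matches the description above, each of the two lists should contain 15 entries.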

Index Animalium

Charles Davies Sherborn's Index Animalium is a compendium of zoological taxonomic species nomenclature from 1758 to 1850. For each species described in this period it clearly lists the genus name, author, publication, pages, and date. The data elements are laid out on the page in an extremely regular columnar format. The work is approximately 9,000 pages and includes introductions, a full bibliography of the works cited in the main body, and other apparatus.

Last Updated: January 1, 2006
URL: https://www.sil.si.edu/DigitalCollections/indexanimalium/Datasets/ (~48 MB)

Trade Literature

The trade literature collection of the Smithsonian is internationally known as an important source for the history of American business, technology, marketing, consumption, and design. Manufacturers issued trade catalogs to promote and sell their products. The present collection contains more than 500,000 catalogs, technical manuals, advertising brochures, price lists, company histories and related materials representing more than 30,000 companies.
The trade literature database contains over 35,000 records inventorying the holdings of each company in the collection.

Last Updated: Refreshed daily
URL: http://www.sil.si.edu/tradeliterature/TL_solr.cfm (~15.2 MB)
README and data notes

Art and Artist Vertical Files

The Smithsonian Libraries' Art and Artists' Files is an inventory list of the personal and corporate names contained in the vertical file collections held in the art and design libraries of the Smithsonian. Personal names are primarily those of artists and designers, though patrons, sitters, and others related to art are also represented. Corporate names include those of galleries, museums and other arts organizations. Files contain information on over 60,000 individuals and organizations and may include exhibition announcements, newspaper and magazine clippings, press releases, brochures, reviews, invitations, illustrations, résumés, artists' statements, or exhibition catalogs.

Last Updated: Refreshed daily
URL: http://www.sil.si.edu/DigitalCollections/art-design/artandartistfiles/vf_all_solr.cfm (~3.6 MB)
README File

Smithsonian Research Online 

Smithsonian Research Online is a set of services provided to the Smithsonian research community. Managed by the Smithsonian Libraries, the program helps capture the published research output of Smithsonian scholars and makes it available to Institutional administration as well as to scientists and historians worldwide.

Users can download search results as JSON or CSV files. For the full dataset, please contact us at research-online@si.edu.

Last Updated: Refreshed daily
URL: https://research.si.edu