Data Repositories

Whether you choose to deposit your data in a specialty repository, a general-purpose external repository, or a local Smithsonian (SI) repository, make sure that the services and terms offered fit the needs of your data.

Specialty Repositories

Given the large number of specialty repositories that exist or are being built for specific data types, specific organisms, and large grant-funded collaborative projects, it is impractical to list all the data repositories that could be used by SI researchers to conform to the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. Before depositing data in a repository not listed below, you should ensure that the repository:

  • Has a plan and sufficient funding to ensure its long-term viability.
  • Allows export of data and data descriptions in a standards-compliant format, preferably identical to the format you deposited.

Ideally, the repository should also:

  • Enable easy citation of your data, including support for DOIs (minted either by the repository or by Smithsonian Libraries and Archives). For more about DOIs, see Globally Unique Identifiers (GUIDs) (DOIs, ARKs, etc.).
  • Be searchable and indexed in a service such as DataCite or Elsevier's DataSearch.
  • Support application of an appropriate license, and embargo of data if necessary.
  • Support metadata standards for your data, e.g., ISO 19115 for geographic data.
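
The two lists above can be treated as a simple checklist when reviewing a candidate repository. The following is a minimal sketch in Python, not an official SI tool: the class name, field names, and the `review` function are hypothetical and only mirror the required and ideal criteria listed above.

```python
from dataclasses import dataclass

# Hypothetical field names that mirror the criteria in the text above.
REQUIRED = ("long_term_plan_and_funding", "standards_compliant_export")
IDEAL = (
    "supports_dois",
    "indexed_in_search_service",
    "supports_license_and_embargo",
    "supports_metadata_standards",
)


@dataclass
class RepositoryProfile:
    """What a researcher has learned about a candidate repository."""
    name: str
    long_term_plan_and_funding: bool = False
    standards_compliant_export: bool = False
    supports_dois: bool = False
    indexed_in_search_service: bool = False
    supports_license_and_embargo: bool = False
    supports_metadata_standards: bool = False


def review(profile: RepositoryProfile) -> dict:
    """Report which required and ideal criteria are not yet met."""
    missing_required = [c for c in REQUIRED if not getattr(profile, c)]
    missing_ideal = [c for c in IDEAL if not getattr(profile, c)]
    return {
        "meets_required": not missing_required,
        "missing_required": missing_required,
        "missing_ideal": missing_ideal,
    }


if __name__ == "__main__":
    # "Example Repository" is a made-up name used only for illustration.
    candidate = RepositoryProfile(
        name="Example Repository",
        long_term_plan_and_funding=True,
        standards_compliant_export=True,
        supports_dois=True,
    )
    print(review(candidate))
```

A repository that fails either required criterion would not be a suitable place to deposit; unmet ideal criteria are worth weighing but are not disqualifying on their own.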

Find a Specialty Repository

  • RE3DATA.org: A registry of research data repositories and an excellent source of detailed information on individual repositories.
  • PLoS One: You may also want to consult the PLoS One list of recommended repositories (listed by data format and discipline).

Any repository managed by a U.S. Federal agency or national laboratory (e.g., NIH's GenBank, NASA's National Space Science Data Center, or ORNL's DAAC) is considered a preferred repository for any SI research data that meet its criteria for deposit. Data repositories run by established U.S. institutions, such as Harvard's Dataverse, are also acceptable.

General-purpose Repositories

SI has two centrally managed repositories that accept Smithsonian-produced data: Figshare for Institutions and Smithsonian Research Online (SRO). Both accept (or mint) DOIs to improve citation. Figshare is a commercial platform designed for sharing research data, though it is centrally administered by Smithsonian Libraries and Archives and Smithsonian's OCIO.  

  • Figshare for Institutions: Best for sharing data that need a DOI, including data that underlie peer-reviewed publications; bounded datasets of mixed formats; or data that are periodically updated and need to be versioned. It also provides mechanisms for private sharing of datasets prior to publication. See the Figshare Confluence (SI staff) site for more information.
  • SRO: Best for smaller (<50 GB), fixed (inactive) datasets that accompany or support publications deposited in SRO. To deposit data and publications in SRO, you can self-deposit using the forms on the SRO Staff Portal's Enter Data for New Item page (SI staff) or contact research-online@si.edu.
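
The choice between the two SI repositories can be roughly sketched as a decision rule. The Python example below is one loose reading of the guidance above, not an official policy: the `Dataset` fields and the `suggest_repository` function are hypothetical, and the only number taken from the text is the 50 GB size guideline for SRO.

```python
from dataclasses import dataclass

# Size guideline for SRO taken from the text ("<50GB").
SRO_SIZE_LIMIT_GB = 50


@dataclass
class Dataset:
    """Hypothetical description of a dataset awaiting deposit."""
    size_gb: float
    periodically_updated: bool          # True if the data need versioning
    accompanies_sro_publication: bool   # True if it supports a publication in SRO
    needs_private_sharing: bool = False # True if it must be shared before publication


def suggest_repository(ds: Dataset) -> str:
    """Suggest a starting point; the final choice remains a judgment call."""
    fits_sro = (
        ds.size_gb < SRO_SIZE_LIMIT_GB
        and not ds.periodically_updated
        and ds.accompanies_sro_publication
        and not ds.needs_private_sharing
    )
    return "SRO" if fits_sro else "Figshare for Institutions"


if __name__ == "__main__":
    print(suggest_repository(Dataset(2.0, False, True)))    # small, fixed, tied to an SRO publication
    print(suggest_repository(Dataset(120.0, False, True)))  # exceeds the SRO size guideline
```

The rule defaults to Figshare for Institutions whenever a dataset falls outside the SRO description, since the text presents Figshare as the option for versioned, mixed-format, or privately shared data.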
Last Updated October 6, 2023