Data Repositories

Whether you choose to deposit your data in a specialty repository, a general-purpose external repository, or a local SI repository, make sure that the services and terms offered fit the needs of your data.

Specialty Repositories

Given the large number of specialty repositories that exist or are being built for specific data types, specific organisms, and large grant-funded collaborative projects, it is impractical to list all the data repositories that SI researchers could use to conform to the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Before depositing data in a repository not listed below or in the attached best practices document, you should at a minimum ensure that the repository:

  • Has a plan and sufficient funding to ensure its long-term viability.
  • Allows export of data and data descriptions in a standards-compliant format, preferably the same format in which you deposited them.

Ideally, the repository should also:

  • Enable easy citation of your data, including support for DOIs (minted either by the repository or by SIL).
  • Be searchable, and indexed in a service such as DataCite or Elsevier's DataSearch.
  • Support application of an appropriate license, and embargo of data if necessary.
  • Support metadata standards for your data, e.g., ISO 19115 for geographic data.
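
If you are comparing several candidate repositories, the two lists above can be recorded as a simple checklist. The following is a minimal sketch in Python, not an SI tool: the class name, field names, and the evaluate function are all hypothetical and simply mirror the required and ideal criteria listed above.

```python
from dataclasses import dataclass

# Hypothetical record of one candidate repository. Each field mirrors one
# criterion from the lists above; none of these names come from an SI system.
@dataclass
class RepositoryProfile:
    name: str
    # Minimum criteria
    has_sustainability_plan: bool        # plan and funding for long-term viability
    exports_standard_formats: bool       # standards-compliant export of data and descriptions
    # Ideal criteria
    supports_dois: bool = False          # easy citation, including DOIs
    is_indexed: bool = False             # searchable and indexed in a discovery service
    supports_license_and_embargo: bool = False
    supports_metadata_standards: bool = False

REQUIRED = ("has_sustainability_plan", "exports_standard_formats")
IDEAL = (
    "supports_dois",
    "is_indexed",
    "supports_license_and_embargo",
    "supports_metadata_standards",
)

def evaluate(repo: RepositoryProfile) -> dict:
    """Report which minimum and ideal criteria a repository does not meet."""
    missing_required = [f for f in REQUIRED if not getattr(repo, f)]
    missing_ideal = [f for f in IDEAL if not getattr(repo, f)]
    return {
        "acceptable": not missing_required,  # all minimum criteria are met
        "missing_required": missing_required,
        "missing_ideal": missing_ideal,
    }

if __name__ == "__main__":
    candidate = RepositoryProfile(
        name="Example Repository",       # placeholder, not a real repository
        has_sustainability_plan=True,
        exports_standard_formats=True,
        supports_dois=True,
    )
    print(evaluate(candidate))
```

A repository that fails either minimum criterion is reported as not acceptable; unmet ideal criteria are listed so you can weigh them against the needs of your data.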

Find a Specialty Repository

  •—a registry of research data repositories—is an excellent source of detailed information on individual repositories.
  • PLoS One: You may also want to consult the PLoS One list of recommended repositories (listed by data format and discipline).

Any repository managed by a U.S. Federal agency or national laboratory, e.g., NIH's GenBank, NASA's National Space Science Data Center, or ORNL's DAAC, is considered a preferred repository for any SI research data that meet its criteria for deposit. In addition, data repositories run by established U.S. institutions, such as Harvard's Dataverse, are acceptable.

General-purpose Repositories

SI has two centrally managed repositories that accept Smithsonian-produced data: Figshare for Institutions and Smithsonian Research Online (SRO). Both accept (or mint) DOIs for citation. Figshare is a commercial platform designed for sharing research data and is backed up in the cloud, though it is centrally administered by Smithsonian Libraries and Archives and Smithsonian's OCIO. SRO and SIdora have actively managed, backed-up, secure storage in Smithsonian's Herndon Data Center. All platforms support both open (accessible) and closed (private) data, though Figshare is primarily designed for open data.

  • Figshare for Institutions: best for sharing data that need a DOI, including data that underlie peer-reviewed publications; bounded datasets of mixed formats; or data that are periodically updated and need to be versioned. See the Figshare Confluence (SI staff) site for more information.
  • SRO: best for smaller (<50 GB), fixed (inactive) datasets that accompany or support publications deposited in SRO.
    To deposit data and publications in SRO, you can self-deposit using the forms found on the internal staff pages (SI staff) or contact
Last Updated April 3, 2021