Describing Your Data: Data Dictionaries
To enable discovery and citation, all research data should have a corresponding descriptive metadata record as well as explanatory metadata, or a data dictionary, that defines and describes the data itself. Not convinced? The USGS persuasively answers the question "Why do we need metadata?"
A data dictionary, "read me" file, or key explains the contents of the dataset. Any information someone would need to interpret or reuse your data should be included in the data dictionary. It may include full definitions of any abbreviations used, units of measurement, allowable values in a field, data types, thesauri or controlled vocabularies used, and other important details of the data elements, along with a brief description of the provenance or parameters of the data, e.g., the date or location where the data were collected. For an example of a data dictionary, and more detailed guidelines, download the best practices PDF listed at the bottom of this page.
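As a rough illustration of the elements listed above, the following Python sketch stores a small data dictionary as a list of records and uses it to check one row of data. The dataset, field names, units, allowed values, and the helper names `check_row` and `dictionary_to_csv` are all hypothetical, invented for this example; adapt the columns to whatever your own data requires.

```python
import csv
import io
from datetime import date

# Hypothetical data dictionary for an invented stream-temperature dataset.
# Each entry documents one field: its meaning, type, units, and allowed values.
DATA_DICTIONARY = [
    {"field": "site_id",
     "definition": "Identifier of the sampling site",
     "data_type": "string", "units": "", "allowed_values": ""},
    {"field": "sample_date",
     "definition": "Date the sample was collected (ISO 8601, YYYY-MM-DD)",
     "data_type": "date", "units": "", "allowed_values": ""},
    {"field": "water_temp",
     "definition": "Water temperature at time of sampling",
     "data_type": "float", "units": "degrees Celsius", "allowed_values": ""},
    {"field": "qc_flag",
     "definition": "Quality-control flag: G = good, S = suspect, B = bad",
     "data_type": "string", "units": "", "allowed_values": "G;S;B"},
]


def _parses_as(value, data_type):
    """Return True if the string value can be read as the declared type."""
    try:
        if data_type == "float":
            float(value)
        elif data_type == "date":
            date.fromisoformat(value)
    except ValueError:
        return False
    return True  # strings (and unknown types) are accepted as-is


def check_row(row, dictionary=DATA_DICTIONARY):
    """Compare one row (a dict of strings) with the dictionary; list problems."""
    problems = []
    for entry in dictionary:
        name = entry["field"]
        if name not in row:
            problems.append(f"{name}: missing")
            continue
        value = row[name]
        if not _parses_as(value, entry["data_type"]):
            problems.append(f"{name}: {value!r} is not a valid {entry['data_type']}")
        allowed = entry["allowed_values"]
        if allowed and value not in allowed.split(";"):
            problems.append(f"{name}: {value!r} is not one of {allowed}")
    return problems


def dictionary_to_csv(dictionary=DATA_DICTIONARY):
    """Render the dictionary as CSV text so it can be shared with the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(dictionary[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(dictionary)
    return buffer.getvalue()


if __name__ == "__main__":
    row = {"site_id": "A1", "sample_date": "2021-06-01",
           "water_temp": "warm", "qc_flag": "X"}
    for problem in check_row(row):
        print(problem)
    print(dictionary_to_csv())
```

Keeping the dictionary in a plain tabular form such as CSV means it can be deposited alongside the data and read without special software; the validation step is optional and only shows one way the same definitions might be reused.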
“The increased use of data processing and electronic data interchange heavily relies on accurate, reliable, controllable, and verifiable data recorded in databases. One of the prerequisites for a correct and proper use and interpretation of data is that both users and owners of data have a common understanding of the meaning and descriptive characteristics (e.g., representation) of that data. To guarantee this shared view, a number of basic attributes has to be defined.”
-International Organization for Standardization (ISO), Information Technology, Parts 1-6 (2nd Edition), 2004.
Resources for creating data dictionaries:
- Northwest Environmental Data Network Best Practices for Data Dictionary Definitions and Usage (2006) (pdf)
- Open Science Framework "How to make a data dictionary"
- USGS Data Dictionaries and Thesauri