Introduction
In addition to – or instead of – collecting your own data, there are many opportunities to reuse existing research data, public data, registry data, and health data. The potential of data reuse is substantial, and as part of project planning it is useful to map the available options. Below we have selected data sources relevant for different purposes, as well as widely used search services specifically developed for finding datasets. We have also included information on data citation, licenses and terms of use in a reuse context.
Data Sources
Below is a selection of data sources containing material that may be useful for parts of the research activities at Kristiania.
A key challenge when reusing existing data is knowing where to find relevant material. Below we have therefore selected a range of search services that help you find shared datasets.
Data Citation
When using others’ data in a research context, those data must be cited. The purpose is the same as for any other citation in academic work: to acknowledge the creator’s contribution and to enable readers to locate the same information themselves. This strengthens transparency and reproducibility, and helps further increase the visibility of data that are available for reuse.
Normalising data citation is a step toward having data recognised as a creditable part of scientific contributions.
Where dataset references should be placed in a publication depends on the publisher’s or journal’s guidelines. Many require information about the availability of the publication’s data to be given in a dedicated declaration in the text (a “data availability statement” or “data access statement”), and the full reference should then be included in that section. Unless otherwise specified, it is usually natural to describe the use of the data, and include the corresponding reference, in the methods section. You must also consider whether it is most appropriate to cite the dataset directly or the associated publication. CESSDA provides further guidance on data citation.
Did you know that data citation also applies to your own data? If you have archived data that you collected yourself, they should also be cited; in this way, you create a link between the publication and the data on which it is based.
Data citations should follow the reference style used for the rest of the text and should, in practice, include the following information:
- Author(s)
- Publication date
- Title
- Version number (if relevant)
- Resource type (if relevant)
- Repository
- DOI or another persistent identifier
Example syntax (note that the format depends on the reference style used): Author. (Year). Title. (Version). Repository/Publisher. Persistent identifier (e.g., DOI).
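For those who manage references programmatically, the generic syntax can be sketched as a small Python formatter. This is only an illustration under stated assumptions: the class name `DatasetCitation`, its field names, the bracketed placement of the resource type, and all values in the usage example (including the DOI) are hypothetical placeholders, and the output must still be adapted to the reference style you actually use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the citation elements listed above."""
    authors: str
    year: int
    title: str
    repository: str
    identifier: str                      # DOI or another persistent identifier
    version: Optional[str] = None        # only if relevant
    resource_type: Optional[str] = None  # only if relevant

    def format(self) -> str:
        # Follows: Author. (Year). Title. (Version). Repository. Identifier
        parts = [
            f"{self.authors.rstrip('.')}.",
            f"({self.year}).",
            f"{self.title}.",
        ]
        if self.version:
            parts.append(f"(Version {self.version}).")
        if self.resource_type:
            # Assumption: resource type in square brackets after the version.
            parts.append(f"[{self.resource_type}].")
        parts.append(f"{self.repository}.")
        parts.append(self.identifier)
        return " ".join(parts)


# Placeholder values only; this is not a real dataset or DOI.
example = DatasetCitation(
    authors="Doe, J.",
    year=2024,
    title="Example survey data",
    repository="Example Repository",
    identifier="https://doi.org/10.1234/placeholder",
    version="1.0",
)
print(example.format())
```

Optional elements are simply left out when they are not set, which mirrors the "if relevant" qualifiers in the list above.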
Licenses and Terms
If you plan to use data that are made available, whether openly or with restricted access, you must pay attention to the licenses and terms under which the data are provided.
Data made openly available through data repositories and other sharing platforms will usually carry an open license. There are several options, but the most commonly used for datasets are Creative Commons (CC) and Open Data Commons (ODC). Each framework offers a set of licenses that vary in how freely the data may be used and which restrictions apply. Some repositories and institutions also use custom licenses.
Some data are shared under restricted access; in such cases, you must obtain permission to access and use the data. This can be as simple as contacting the data owner and agreeing on use directly, or it may involve a more extensive application process in which, for example, the project and its purpose must be documented and specific conditions must be met before access is granted.