Making Your Data Open: Preserve and Share | Program for Open Scholarship and Education

Data repositories ensure a persistent location where the data file(s) and associated documentation (i.e., metadata) are preserved and available to share.

Data can be published in either an institutional or a disciplinary repository. An example of an institutional repository is UBC’s Scholars Portal Dataverse, where any researcher can create an account and deposit data. Its required metadata template helps make the data easier to discover. The Registry of Research Data Repositories includes many disciplinary repositories, and you can search it by subject to find a resource relevant to your field. If there is no discipline-specific repository in your field, a generalist data repository is also an option. Below are some common generalist repositories:

Scholars Portal Dataverse: A publicly accessible data repository, open to affiliated researchers (primarily from Canadian universities) to deposit and share research data. It is hosted by the University of Toronto libraries.
Dryad: Home for a wide variety of data types.
Figshare: Cloud-based and features the ability to preview data.
Zenodo: Does not impose any requirements on the format, access restrictions, or license of deposited data. All metadata is licensed CC0.

Your choice of repository will depend in part on why you’re depositing your data. Is it open to support a publication, open to encourage its re-use as a standalone dataset, or open to contribute to a larger dataset like a gene databank or a databank of human observations?

Regardless of the route you choose, make sure the repository will provide a persistent identifier, such as a DOI (Digital Object Identifier), that can be shared to allow others to access these files. Unlike a URL, which can be unstable and may lead to broken links over time, a DOI will never change.
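To illustrate how a DOI is shared as a stable link, here is a minimal Python sketch that turns a DOI, as someone might paste it, into a link through the public resolver at https://doi.org/. The function name `doi_to_url`, the list of accepted prefixes, and the simplified pattern check are assumptions made for this example, and the DOI used below is a placeholder rather than a real dataset.

```python
import re

# Public DOI resolver; a DOI appended to this address redirects to
# wherever the repository currently hosts the record.
DOI_RESOLVER = "https://doi.org/"

# Prefixes people commonly paste in front of a DOI (illustrative list).
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(raw: str) -> str:
    """Return a resolver link for a DOI given in any common written form."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    # Simplified shape check: "10." + registrant code + "/" + suffix.
    if not re.match(r"^10\.\d{4,9}/\S+$", doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return DOI_RESOLVER + doi


if __name__ == "__main__":
    # Placeholder DOI for illustration only.
    print(doi_to_url("doi:10.5555/12345678"))
```

The resulting link is the form to cite in an article or readme file: if the repository later moves the files, the DOI stays the same and the resolver points to the new location.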

Scenario – Data persistence

Marija published an article in the journal Conservation Biology that referenced related data sources and a database with citations, all of which could be downloaded from her institutional faculty page. This worked until Marija left that university and the faculty page was disabled. Now, when readers try to download the files referenced in the Conservation Biology article, they are directed to an “Access forbidden” page, meaning the data and related sources for the article can no longer be found.

If the institution had its own data repository, Marija could have deposited the data and related files there. This would have provided a persistent DOI, meaning Marija’s data would remain accessible for a long time no matter where she is based.

Scenario adapted from Case study: Data persistence with permission from Stanford Libraries.

Licensing Open Data

Applying a license when depositing into a repository is an often-neglected step. But without a license, no one knows how they can use the data or how they are expected to credit your hard work.

Choosing a License

Different types of open licenses make sense for different aspects of open scholarship and data. Funding agencies and journals may require that your data be made open. If you are choosing your own licensing option, select the appropriate one based on how you want others to reuse your data. It is good practice to include a rights statement within your dataset or in your readme file. Here are two resources that can help you compare and select the appropriate data license for your work:

Dig Deeper

Licensing Your Data Guide
FAQs about Data and CC Licenses
How to choose a license for open scientific data and code: The first 4 minutes of this video covers applying a license to data.
