Skip to content
Start main Content

Understanding Data Licenses

One important aspect of research data management is data sharing and reuse. This post introduces a few common data licenses that enable data creators to explain what users may or may not do with a given work.

Data is an integral part of a research process. Traditionally, upon completion of a research project, the underlying data likely will be archived and never visited again. Nonetheless, the ecosystem about the use of research data has evolved (Fig.1). The lifespan of research data is getting longer than the research project that creates them, and the potential use of research data will be extended to validate, reproduce, and generate new ideas and hypotheses.

paradigm shift in using research data

Figure 1. Paradigm shift in the use of research data (Image source)

For many researchers, one way to share and to enable re-use of research data is to deposit them in a data repository (e.g., DataSpace@HKUST, FigShare, Zenodo). Rather than releasing data online unrestrictedly, data repositories would normally ask the depositor to choose a data license for their dataset, stating what others may or may not do with a given work.

On the repository website (for example, Figshare), you can usually find out what the licensing options are available and recommended. When choosing data licenses, the decision varies based on a number of considerations, such as:

  • Whether attribution is needed
  • Whether modification or redistribution is permitted
  • Whether commercial use is permitted

The data license you choose affects the future use of your data. Likewise, when browsing or re-using data by others, you need to pay close attention to data licenses assigned by others as well. Here we introduce two common standard licenses, namely Creative Commons Licenses and the Open Data Commons Licenses.

Creative Commons Licenses

Creative Commons (CC) licensing was developed as an alternative to the “all or nothing” copyright. CC allows a number of licenses that can be used with a wide variety of creations that might otherwise fall under the “all rights reserved” copyright restrictions. Although not tailored for data only, CC can be used as data licenses because of its ease of use and prevalence. 

Permission Levels

The permission level provided by a CC license can be understood from its name, which is a combination of two-letter “permission marks”. 

Permission Mark What can I do with the data?
BY Credit must be given to the creator
ND Adaptations must be shared under the same terms
NC Only non-commercial uses of the work are permitted
SA No derivatives or adaptations of the work are permitted


This webpage outlines and explains all six different CC licenses with simple visual symbols. The Creative Commons Attribution License, or CC-BY (example), would be most useful if you want your data to be disseminated as widely as possible – It lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. On the contrary, the least permissive CC license is CC-BY-NC-ND (example). In this case only distribution for non-commercial use of the unmodified original work is allowed.

The Creative Commons License Chooser can assist you to choose a proper CC license with a list of self-guiding questions.

Other than the six CC licenses, you may have also come across CC0 licensed materials (example). CC0 is a public dedication tool, which allows creators to give up their copyright and place their works as completely as possible in the public domain.

Open Data Commons

Apart from CC licenses, data creators may consider applying Open Data Commons (ODC) licenses to their work. Maintained by the Open Knowledge Foundation, ODC licenses are made as legal tools to open data/databases: 

CC vs ODC – A Few Things to Note

Licensing options between CC and ODC are quite similar. However, unlike CC licenses which are mainly for general purposes and can be applied to a variety of works (e.g. artwork, music, photos, writings), the ODC licenses are made specifically to be applied to data, and typically cover only database rights.

Another difference between CC and ODC is the inclusion of a set of community norms with the PDDL (Public Domain Dedication and License), which serves as a useful guide to encourage fair, open sharing of data. CC0, the counterpart of PDDL, however, lacks such a document – How to give attributions often relies on ethical and professional norms in the context of specific academic or scholarly communities. 

Sources and Further Readings 

– By Jennifer Gu, Library

Hits: 436

Go Back to page Top

Tags: , , , ,

published November 18, 2022