In today’s data-driven research landscape, effective data management and sharing are essential for advancing knowledge. DataSpace@HKUST, our institutional data repository, supports researchers in meeting these demands by aligning with the FAIR Data Principles, ensuring research data is Findable, Accessible, Interoperable, and Reusable.
Understanding the FAIR Data Principles
Introduced in 2016, the FAIR principles provide a globally recognized framework for improving the discoverability and reuse of research data. Here is what FAIR means:
- Findable: Data and metadata are easy to find for both humans and machines, using unique identifiers and rich metadata.
- Accessible: Data and metadata are retrievable via standardized protocols, with clear access conditions.
- Interoperable: Data integrates seamlessly with other systems, using standard formats and vocabularies.
- Reusable: Data is well-documented and licensed to enable reuse, maximizing its long-term value.
You can read more about the FAIR principles on the GO FAIR website: https://www.go-fair.org/fair-principles/.
How DataSpace@HKUST Puts FAIR into Practice
DataSpace@HKUST is a data repository built on the open-source Dataverse platform. It enables researchers to share datasets effectively while complying with FAIR principles:
Findable
To ensure discoverability, each dataset is assigned a unique Digital Object Identifier (DOI) and described with rich, searchable metadata, including titles, creators, keywords, subject categories, and publication dates. The DOI provides persistent referencing, while curated metadata improves dataset discovery and citation.
For example, the dataset Sediment nitrogen fluxes and rates in the Pearl River Estuary region includes standardized metadata that enhances its searchability.
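Because a DOI is a persistent identifier, it can always be turned into a link through the public resolver at https://doi.org/. The short Python sketch below illustrates the idea; the function name `doi_to_url` and the DOI in the usage line are placeholders chosen for illustration, not a real DataSpace@HKUST record.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"

# Common ways a DOI is written in citations and spreadsheets.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolvable https://doi.org/ link for a DOI given in any common form."""
    cleaned = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Keep "/" as-is because it separates the DOI prefix from the suffix.
    return DOI_RESOLVER + quote(cleaned, safe="/")


# Placeholder DOI for illustration only.
print(doi_to_url("doi:10.5072/FK2/EXAMPLE"))
```

Citing the resolver link rather than the repository page address keeps the reference stable even if the repository's web address changes.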
Accessible
DataSpace@HKUST supports flexible access controls, allowing researchers to choose open or restricted access. Public datasets are available for download with clear licensing terms to guide reuse.
For example, the dataset Four-band non-Abelian topological insulator and its experimental realization is openly available for download and reuse. The related publication is also linked in the dataset’s metadata.
Interoperable
Interoperability means that data can be integrated with other systems and workflows. DataSpace@HKUST supports this through standardized metadata schemas and export formats such as DataCite, Dublin Core, and OAI-ORE; consistent identifiers such as DOIs; and machine-readable file formats.
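As a rough illustration of machine access to metadata, Dataverse installations generally expose a metadata export endpoint in their native API. The sketch below only builds the request URL and does not contact any server. The base URL, the exporter names (`Datacite`, `oai_dc`, `OAI_ORE`), and the DOI are assumptions for illustration; which exporters are enabled depends on the installation, so please check with the DataSpace Team before relying on them.

```python
from urllib.parse import urlencode

# Assumption: the repository's base address; confirm before use.
BASE_URL = "https://dataspace.hkust.edu.hk"


def metadata_export_url(doi: str, exporter: str = "Datacite",
                        base_url: str = BASE_URL) -> str:
    """Build a Dataverse-style metadata export URL for a dataset DOI.

    `exporter` is assumed to be one of the export format names offered by
    the installation, for example "Datacite", "oai_dc", or "OAI_ORE".
    """
    query = urlencode({"exporter": exporter, "persistentId": f"doi:{doi}"})
    return f"{base_url}/api/datasets/export?{query}"


# Placeholder DOI for illustration only.
print(metadata_export_url("10.5072/FK2/EXAMPLE", exporter="oai_dc"))
```

Opening the resulting address in a browser or a script would, on a typical Dataverse installation, return the dataset's metadata in the requested format for a published dataset.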
To enhance data-level interoperability, researchers are encouraged to use open file formats (e.g., CSV, XML, JSON, TXT), adopt commonly used controlled vocabularies, and include supplementary files (e.g., a data dictionary and a README file) that document their data and metadata.
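A data dictionary can start as a simple table listing each variable in a CSV file. The sketch below drafts one automatically and leaves the description and unit fields blank for the researcher to complete. The function name `draft_data_dictionary` and the sample columns (`site`, `depth_m`, `nitrate`) are hypothetical and used only for illustration.

```python
import csv
import io


def draft_data_dictionary(csv_text: str) -> list[dict]:
    """Draft one data dictionary entry per column of a CSV file.

    Descriptions and units cannot be inferred from the data, so they are
    left empty for the researcher to fill in.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for column in reader.fieldnames or []:
        values = [row[column] for row in rows if row[column] not in ("", None)]
        entries.append({
            "variable": column,
            "example_value": values[0] if values else "",
            "non_empty_values": len(values),
            "description": "",  # to be written by the researcher
            "unit": "",         # to be written by the researcher
        })
    return entries


# Hypothetical sample data for illustration only.
sample = "site,depth_m,nitrate\nA1,0.5,12.3\nA2,,10.1\n"
for entry in draft_data_dictionary(sample):
    print(entry)
```

Saving the completed table as a CSV file next to the data, together with a short README, gives other researchers the context they need to interpret each variable.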
Reusable
To support reuse, datasets include comprehensive metadata, licensing options (e.g., Creative Commons), and citation-ready formats that credit creators.
For example, the dataset Ambient Measurements and Improved Statistical Proxies for Gaseous Sulfuric Acid in Coastal Hong Kong uses a CC BY license, making it easy for others to reuse the data with proper attribution.
Real-World Impact of DataSpace@HKUST
DataSpace@HKUST hosts a wide range of datasets that demonstrate its versatility. The China Government Employee Database-Qing (CGED-Q) Jinshenlu supports historical and social science research, while environmental datasets like North China Plain and Sichuan Basin OC&EC, IN&ON data during winter 2023-2024 advance air quality studies. These datasets are downloaded by researchers around the world, enhancing HKUST’s research impact and promoting interdisciplinary collaboration.
Behind the scenes, the DataSpace Team assists researchers in tidying datasets, resolving formatting issues, and applying licenses (e.g., Creative Commons), ensuring data is ready for reuse in future studies. If you need help with your dataset, please contact the DataSpace Team at lbds@ust.hk.
– By DataSpace Team, Library
published May 8, 2025