High Value Datasets (DCAT-AP-LT) metadata specification

Technical Specification,


1. Implementation of DCAT-AP HVD in Lithuania

The definition of High Value Datasets arose from the need to identify the most important data, for example:

These data have the greatest potential to influence the most important areas identified by the European Commission. Opening and reusing such data creates significant added value, but it also requires additional rules governing their availability, interoperability and use. Because DCAT-AP on its own cannot fully satisfy the quality, reliability and openness requirements of high-value datasets, the DCAT-AP HVD extension was developed, based on the guidelines provided in the High-Value Datasets Implementing Regulation.

Accordingly, this extension is also adapted to DCAT-AP-LT. In Lithuania, High Value Datasets are described "as is" according to the DCAT-AP HVD 2.2.0 specification, with minor adaptations to the citation of the legal source, as ELI has not yet been adopted in Lithuania. In this annex, the LegalResource resource is represented as an attribute named id_local under ELI:LegalResource. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts.
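The requirement that the applicable legislation of an HVD resource includes the Implementing Regulation can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each legal resource is a plain dictionary whose `id_local` key holds the URI of the legal act. The helper name `includes_hvd_regulation` and the example.org URI are hypothetical and not part of DCAT-AP-LT.

```python
# ELI of the HVD Implementing Regulation, as cited in this specification.
HVD_REGULATION_ELI = "http://data.europa.eu/eli/reg_impl/2023/138/oj"


def includes_hvd_regulation(applicable_legislation):
    """Return True if any legal resource points at the HVD regulation.

    Assumption: each legal resource is a dict whose "id_local" value
    is the URI of the legal act.
    """
    return any(
        resource.get("id_local") == HVD_REGULATION_ELI
        for resource in applicable_legislation
    )


# Hypothetical record: the second URI is a placeholder, not a real legal act.
legislation = [
    {"id_local": HVD_REGULATION_ELI},
    {"id_local": "https://example.org/legal-acts/placeholder"},
]
print(includes_hvd_regulation(legislation))
```

A catalogue could run such a check before publishing a record, since the regulation must always be among the listed legal resources while further national acts may be added freely.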

2. General properties of high-value datasets

High-value datasets have specific technical and legal requirements. These data sets are subject to the following general requirements:

Given these features and the resulting requirements, the DCAT-AP-LT model needs to apply additional, modified or stricter metadata rules to high-value datasets. These rules add properties that complement DCAT-AP classes or make existing properties more strictly binding, and they are defined in the DCAT-AP HVD annex. In Lithuania, high-value datasets are described "as is" according to the DCAT-AP HVD 2.2.0 specification, with minor changes to the legal source citation, as ELI has not yet been adopted in Lithuania. In this annex, the LegalResource resource is an ELI attribute named id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts.

3. Classes and properties used in High Value Datasets (DCAT-AP HVD)

3.1. Data service

Definition
A collection of operations that provides access to one or more datasets or data processing functions.
Reference in DCAT
Link
Subclass of
Catalogue Resource
Properties
For this entity the following properties are defined: applicable legislation, contact point, documentation, endpoint description, endpoint URL, HVD category, licence, rights, serves dataset.


| Property | Range | Cardinality | Definition | Usage note | Reuse |
| --- | --- | --- | --- | --- | --- |
| applicable legislation | Legal Resource | 1..* | The legislation that mandates the creation or management of the Data Service. | For HVD, the value MUST include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. Since a resource can be subject to multiple pieces of legislation, there is no upper limit on the cardinality. In Lithuania, high-value datasets are described "as is" according to the DCAT-AP HVD 2.2.0 specification, with minor changes to the legal source citation, as ELI has not yet been adopted in Lithuania. In this annex, the LegalResource resource is the ELI attribute named id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts. | E |
| contact point | Kind | 1..* | Contact information that can be used for sending comments about the Data Service. | Article 3.4 requires the designation of a point of contact for an API. | A |
| documentation | Document | 1..* | A page that provides additional information about the Data Service. | Quality of service covers a broad spectrum of aspects. The HVD regulation does not list any mandatory topic. Therefore quality of service information is considered part of the generic documentation of a Data Service. | P |
| endpoint description | Resource | 0..* | A description of the services available via the endpoints, including their operations, parameters etc. | The property gives specific details of the actual endpoint instances, while dct:conformsTo is used to indicate the general standard or specification that the endpoints implement. Article 3.3 requires API documentation to be provided in a Union or internationally recognised open, human-readable and machine-readable format. | A |
| endpoint URL | Resource | 1..* | The root location or primary endpoint of the service (an IRI). | The endpoint URL SHOULD be persistent. This means that publishers should do everything in their power to keep the value stable and available. | A |
| HVD category | Concept | 1..* | The HVD category to which this Data Service belongs. | - | P |
| licence | Licence Document | 0..1 | A licence under which the Data Service is made available. | Article 3.3 specifies that the terms of use should be provided. According to the guidelines for legal information in DCAT-AP HVD, this is fulfilled preferably by providing a licence. Alternatively, rights can be used. | A |
| rights | Rights statement | 0..* | A statement that specifies rights associated with the Data Service. | Article 3.3 specifies that the terms of use should be provided. According to the guidelines for legal information in DCAT-AP HVD, this is fulfilled preferably by providing a licence. Alternatively, rights can be used. | A |
| serves dataset | Dataset | 1..* | This property refers to a collection of data that this data service can distribute. | An API in the context of HVD is not a standalone resource. It is used to open up HVD datasets. Therefore each Data Service is tightly connected with at least one Dataset. | E |
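The cardinalities in the Data Service table lend themselves to an automated check. The following is a minimal Python sketch, not a normative validator. It assumes a Data Service record is a dictionary mapping the property names used in this section to lists of values; the function names and all example.org values are hypothetical.

```python
# Cardinalities copied from the Data Service table in this section.
DATA_SERVICE_CARDINALITIES = {
    "applicable legislation": "1..*",
    "contact point": "1..*",
    "documentation": "1..*",
    "endpoint description": "0..*",
    "endpoint URL": "1..*",
    "HVD category": "1..*",
    "licence": "0..1",
    "rights": "0..*",
    "serves dataset": "1..*",
}


def parse_cardinality(text):
    """Split "min..max" into (min, max); max is None when unbounded ("*")."""
    lower, upper = text.split("..")
    return int(lower), (None if upper == "*" else int(upper))


def cardinality_violations(record, cardinalities):
    """Return the sorted names of properties whose value count is out of range."""
    violations = []
    for name, cardinality in cardinalities.items():
        lower, upper = parse_cardinality(cardinality)
        count = len(record.get(name, []))
        if count < lower or (upper is not None and count > upper):
            violations.append(name)
    return sorted(violations)


# Hypothetical, deliberately incomplete record with two licences.
service = {
    "applicable legislation": ["http://data.europa.eu/eli/reg_impl/2023/138/oj"],
    "contact point": ["mailto:data@example.org"],
    "endpoint URL": ["https://example.org/api"],
    "licence": ["https://example.org/licence-a", "https://example.org/licence-b"],
}
print(cardinality_violations(service, DATA_SERVICE_CARDINALITIES))
```

Keeping the cardinalities as data rather than as hard-coded conditions means the same function could be reused for the Dataset and Distribution classes by passing a different dictionary.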

3.2. Dataset

Definition
A conceptual entity that represents the information published.
Reference in DCAT
Link
Subclass of
Catalogue Resource
Properties
For this entity the following properties are defined: applicable legislation, conforms to, contact point, distribution, HVD category.


| Property | Range | Cardinality | Definition | Usage note | Reuse |
| --- | --- | --- | --- | --- | --- |
| applicable legislation | Legal Resource | 1..* | The legislation that mandates the creation or management of the Dataset. | For HVD, the value MUST include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. Since a resource can be subject to multiple pieces of legislation, there is no upper limit on the cardinality. In Lithuania, high-value datasets are described "as is" according to the DCAT-AP HVD 2.2.0 specification, with minor changes to the legal source citation, as ELI has not yet been adopted in Lithuania. In this annex, the LegalResource resource is the ELI attribute named id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts. | E |
| conforms to | Standard | 0..* | An implementing rule or other specification. | The provided information should make it possible to verify whether the detailed information requirements of the HVD are satisfied. For more usage suggestions see the section on specific data requirements. | A |
| contact point | Kind | 0..* | Contact information that can be used for sending comments about the Dataset. | - | A |
| distribution | Distribution | 1..* | An available Distribution for the Dataset. | The HVD IR is a quality improvement of existing datasets. The intention is that HVD datasets are publicly and openly accessible. Therefore a Distribution is expected to be present (Article 3.1). | A |
| HVD category | Concept | 1..* | The HVD category to which this Dataset belongs. | - | P |
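To make the mandatory Dataset properties concrete, the sketch below assembles a minimal JSON-LD style description in Python. It is an illustration under stated assumptions: the namespace prefixes and property IRIs (`dcatap:applicableLegislation`, `dcatap:hvdCategory`, `dcat:distribution`) are taken from the upstream DCAT and DCAT-AP HVD vocabularies, the Lithuanian `id_local` adaptation is not shown, and the function name and every example.org URI are hypothetical placeholders.

```python
import json

# Namespace prefixes assumed from the upstream DCAT and DCAT-AP vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "dcatap": "http://data.europa.eu/r5r/",
}


def build_hvd_dataset(dataset_uri, category_uri, distribution_uris):
    """Assemble a minimal JSON-LD style description of an HVD Dataset."""
    if not distribution_uris:
        # The Dataset table gives distribution a cardinality of 1..*.
        raise ValueError("an HVD Dataset needs at least one distribution")
    return {
        "@context": CONTEXT,
        "@id": dataset_uri,
        "@type": "dcat:Dataset",
        "dcatap:applicableLegislation": [
            {"@id": "http://data.europa.eu/eli/reg_impl/2023/138/oj"}
        ],
        "dcatap:hvdCategory": [{"@id": category_uri}],
        "dcat:distribution": [{"@id": uri} for uri in distribution_uris],
    }


# All example.org URIs below are placeholders, not real identifiers.
dataset = build_hvd_dataset(
    "https://example.org/dataset/1",
    "https://example.org/hvd-category/placeholder",
    ["https://example.org/dataset/1/csv"],
)
print(json.dumps(dataset, indent=2))
```

The optional properties (conforms to, contact point) are left out on purpose, so the output shows only what the table marks with a minimum cardinality of 1.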

3.3. Distribution

Definition
A physical embodiment of the Dataset in a particular format.
Reference in DCAT
Link
Usage note
Bulk downloads should be encoded as a Distribution.
Properties
For this entity the following properties are defined: access service, access URL, applicable legislation, licence, linked schema, rights.


| Property | Range | Cardinality | Definition | Usage note | Reuse |
| --- | --- | --- | --- | --- | --- |
| access service | Data Service | 0..* | A data service that gives access to the distribution of the dataset. | - | A |
| access URL | Resource | 1..* | A URL that gives access to a Distribution of the Dataset. | The resource at the access URL contains information about how to get the Dataset. In accordance with the DCAT guidelines, it is preferred to also set the downloadURL property if the URL refers to a downloadable resource. | A |
| applicable legislation | Legal Resource | 1..* | The legislation that mandates the creation or management of the Distribution. | For HVD, the value MUST include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. Since a resource can be subject to multiple pieces of legislation, there is no upper limit on the cardinality. In Lithuania, high-value datasets are described "as is" according to the DCAT-AP HVD 2.2.0 specification, with minor changes to the legal source citation, as ELI has not yet been adopted in Lithuania. In this annex, the LegalResource resource is the ELI attribute named id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts. | E |
| licence | Licence Document | 0..1 | A licence under which the Distribution is made available. | Article 3.3 specifies that the terms of use should be provided. According to the guidelines for legal information in DCAT-AP HVD, this is fulfilled preferably by providing a licence. Alternatively, rights can be used. | A |
| linked schema | Standard | 0..* | An established schema to which the described Distribution conforms. | The provided information should make it possible to verify whether the detailed information requirements of the HVD are satisfied. For more usage suggestions see the section on specific data requirements. | A |
| rights | Rights statement | 0..* | A statement that specifies rights associated with the Distribution. | Article 4.3 specifies that high-value datasets should be made available for reuse. According to the guidelines for legal information in DCAT-AP HVD, this is fulfilled preferably by providing a licence. Alternatively, rights can be used. | A |
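The usage notes for licence and rights describe a preference order: a licence where possible, a rights statement as the alternative. A minimal Python sketch of that rule follows. It assumes a Distribution record is a dictionary with optional `licence` and `rights` lists; the function name, the returned labels and the example.org values are hypothetical.

```python
def terms_of_use_status(distribution):
    """Classify how a Distribution record states its terms of use."""
    licences = distribution.get("licence", [])
    rights = distribution.get("rights", [])
    if len(licences) > 1:
        # The Distribution table gives licence a cardinality of 0..1.
        return "invalid: more than one licence"
    if licences:
        return "licence"  # preferred way of stating the terms of use
    if rights:
        return "rights"   # accepted alternative when no licence is given
    return "missing"      # neither property is present


# Hypothetical records using placeholder URIs.
print(terms_of_use_status({"licence": ["https://example.org/licence"]}))
print(terms_of_use_status({"rights": ["https://example.org/rights"]}))
print(terms_of_use_status({}))
```

Both properties are individually optional in the table, so a check of this kind is what surfaces a record that states no terms of use at all.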

4. Acknowledgments

We would like to express our gratitude to everyone who worked on the development of the DCAT-AP-LT HVD extension: Kęstutis Andrijauskas, Martynas Mockus, Mantas Zimnickas, Darius Amilevičius, Martynas Daugirdas, Gabrielė Stočkūnaitė, Alanas Lukjanovičius, Vladimiras Desiatnikovas.

5. References

[DCAT-AP-LT]
Lithuanian DCAT Application Profile
Lithuanian Register of Legal Acts
[SEMIC]
JoinUp welcomes Interoperable Europe. European Commission. URL: https://joinup.ec.europa.eu/
[vocab-dcat-1]
Data Catalog Vocabulary (DCAT). Fadi Maali; John Erickson. W3C. 4 February 2020. W3C Recommendation. URL: https://www.w3.org/TR/vocab-dcat-1/
[vocab-dcat-2]
Data Catalog Vocabulary (DCAT) - Version 2. Riccardo Albertoni; David Browning; Simon Cox; Alejandra Gonzalez Beltran; Andrea Perego; Peter Winstanley. W3C. 4 February 2020. W3C Recommendation. URL: https://www.w3.org/TR/vocab-dcat-2/
[vocab-dcat-3]
Data Catalog Vocabulary (DCAT) - Version 3. Simon Cox; Andrea Perego; Alejandra Gonzalez Beltran; Peter Winstanley; Riccardo Albertoni; David Browning. W3C. 18 January 2024. W3C Candidate Recommendation. URL: https://www.w3.org/TR/vocab-dcat-3/
[DCAT-AP-HVD]
Usage Guidelines of DCAT-AP for High-Value Datasets. European Commission. URL: https://semiceu.github.io/uri.semic.eu-generated/DCAT-AP/releases/2.2.0-hvd/
[FAIR]
How to make your data FAIR. OpenAire. URL: https://www.openaire.eu/how-to-make-your-data-fair
[HVD]
Implementing Regulation for High Value Datasets. European Union. URL: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32023R0138