DCAT-AP-LT - Version 2

Technical Specification,

This version:
https://ivpk.github.io/DCAT-AP-LT/en/
Issue Tracking:
GitHub
Editor:
VSSA
Translations (non-normative):
lt
Latest published version:
DCAT-AP-LT v2.0
Latest editors draft:
No draft
This document is also available in these non-normative formats:
Turtle
RDF/XML

1. Introduction

DCAT-AP-LT is a metadata specification for describing Lithuanian datasets and services. DCAT-AP-LT is an extension of the European metadata specification DCAT-AP, developed in accordance with the rules and recommendations established for the formation of the DCAT-AP specification, including ontologies and thesauri as auxiliary tools for data description.

DCAT-AP-LT includes the classes used by BREG-DCAT-AP (https://github.com/SEMICeu/BregDCAT-AP/tree/main/releases/2.1.0) and supplements them with an Information System class that extends the dcat:Catalog class, and with two controlled vocabularies that indicate the type and importance of the Information System in accordance with the procedure established in the State Information Resources Management Law.

1.1. Context

Currently, it is expected that the European Commission will provide support for development of these Common European Data Spaces:

This specification is intended to support these planned data spaces and to simplify data management within them. Once this specification is implemented in the information system that catalogs the data of the registers and other information systems of the Republic of Lithuania, tools will be created to help representatives of the Lithuanian data management areas create data management strategies for registers and other information systems and recognize inappropriate duplication of information. It will also become possible to share the data of the Republic of Lithuania with the European Union according to the common semantic model.

1.2. Scope

This specification is focused on the implementation of the provisions of the State Information Resources Management Law on metadata managed in information systems.

2. Status

DCAT-AP-LT 0.1

3. Licence

The material in this model is published under the CC-BY 4.0 license, unless otherwise noted.

4. Terminology

Prefix Namespace URI
dcataplt https://data.gov.lt/onto/DCATAPLT#
adms http://www.w3.org/ns/adms#
cpsv http://purl.org/vocab/cpsv#
cv http://data.europa.eu/m8g/
dcat http://www.w3.org/ns/dcat#
dqv http://www.w3.org/ns/dqv#
eli http://data.europa.eu/eli/ontology#
foaf http://xmlns.com/foaf/0.1/
odrl http://www.w3.org/ns/odrl/2/
prov http://www.w3.org/ns/prov#
rdfs http://www.w3.org/2000/01/rdf-schema#
skos http://www.w3.org/2004/02/skos/core#
spdx http://spdx.org/rdf/terms#
vann http://purl.org/vocab/vann/
vcard http://www.w3.org/2006/vcard/ns#
voaf http://purl.org/vocommons/voaf#
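
The prefix table above can be applied mechanically when reading the compact names used in the rest of this document. The following is a minimal sketch, not part of the specification: the function name expand is hypothetical, and only the prefixes listed in the table are included.

```python
# Prefix table copied from the Terminology section of this specification.
PREFIXES = {
    "dcataplt": "https://data.gov.lt/onto/DCATAPLT#",
    "adms": "http://www.w3.org/ns/adms#",
    "cpsv": "http://purl.org/vocab/cpsv#",
    "cv": "http://data.europa.eu/m8g/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dqv": "http://www.w3.org/ns/dqv#",
    "eli": "http://data.europa.eu/eli/ontology#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "odrl": "http://www.w3.org/ns/odrl/2/",
    "prov": "http://www.w3.org/ns/prov#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "spdx": "http://spdx.org/rdf/terms#",
    "vann": "http://purl.org/vocab/vann/",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "voaf": "http://purl.org/vocommons/voaf#",
}


def expand(curie: str) -> str:
    """Expand a compact name such as 'dcat:Catalog' into a full URI.

    Hypothetical helper: raises KeyError for prefixes missing from the table.
    """
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


print(expand("dcataplt:InformationSystem"))
# prints https://data.gov.lt/onto/DCATAPLT#InformationSystem
```

Prefixes that appear later in this document but are not listed in the table are deliberately rejected by this sketch rather than guessed.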

5. Model overview

DCAT-AP-LT includes all the classes listed in BREG-DCAT-AP 2.1 and supplements them with an Information System class that extends the dcat:Catalog class, and with two controlled vocabularies that mark the type and importance of the Information System as defined in the State Information Resources Management Law. In addition, DCAT-AP-HVD classes and additional properties are used to describe the metadata of high-value datasets.

6. DCAT-AP-LT Specific Classes

6.1. Information system (dcataplt:InformationSystem)

Definition
A set of software for digitizing the operational processes of a state or municipal institution or body, other public administration entity, public institution or state-run company, for managing data and/or for providing administrative or public services electronically.
Properties


Property Range Card Definition Description Reuse
dcataplt:importance skos:Concept 1..1 Information system importance. Information system importance as defined by the State Information Resources Management Law. P
dcataplt:type skos:Concept 1..1 Information system type. Information system type as defined by the State Information Resources Management Law. P
dcataplt:ISImportanceAssessmentURL rdfs:Resource 1..1 Information system importance assessment. URL reference to the information system importance assessment document. P
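
As an illustration of the 1..1 cardinalities in the table above, the sketch below checks a JSON-LD-style description held in a plain Python dictionary. It is an assumption-laden example rather than a normative validator: the function name cardinality_errors, the example.org identifiers and the dictionary layout are all hypothetical.

```python
# The three properties that the table above marks with cardinality 1..1.
REQUIRED_ONCE = (
    "dcataplt:importance",
    "dcataplt:type",
    "dcataplt:ISImportanceAssessmentURL",
)


def cardinality_errors(description: dict) -> list:
    """Return one message per property that does not occur exactly once."""
    errors = []
    for prop in REQUIRED_ONCE:
        value = description.get(prop)
        # Normalise a missing value, a single value or a list to a list.
        if value is None:
            values = []
        elif isinstance(value, list):
            values = value
        else:
            values = [value]
        if len(values) != 1:
            errors.append(f"{prop}: expected exactly 1 value, found {len(values)}")
    return errors


# Hypothetical information system description; identifiers are placeholders.
example_system = {
    "@id": "https://example.org/is/1",
    "@type": "dcataplt:InformationSystem",
    "dcataplt:importance": {"@id": "dcataplt:important"},
    "dcataplt:type": {"@id": "dcataplt:stateInformationSystem"},
    "dcataplt:ISImportanceAssessmentURL": {"@id": "https://example.org/assessment.pdf"},
}

print(cardinality_errors(example_system))  # prints []
```

A production check would more likely be expressed as SHACL shapes; the dictionary form is used here only to keep the example self-contained.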

7. DCAT-AP-LT Specific Controlled Vocabularies

7.1. Importance (dcataplt:Importance)

Definition
Information system importance as defined by the State Information Resources Management Law.
List of possible importance levels


Value URI Range Description Reuse
Highly important dcataplt:highlyImportant skos:Concept They include data important for the entire state and/or state registers or state information systems in which these data are processed - value 1 is indicated. P
Important dcataplt:important skos:Concept They include data important to several institutions and/or state registers or state information systems in which these data are processed - value 2 is indicated. P
Moderately important dcataplt:moderatelyImportant skos:Concept They include data important for one institution and/or departmental registers or state information systems in which these data are processed - value 3 is indicated. P
Slightly important dcataplt:slightlyImportant skos:Concept They include data managed by the institution in the performance of internal administration functions and/or information systems in which these data are processed. The procedure for the establishment, development, modernization and liquidation of the information systems mentioned in this point shall be determined by the institutions authorized by the Government of the Republic of Lithuania - value 4 is indicated. P
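
The table above pairs each importance concept with a numeric value from 1 to 4. A minimal sketch of that mapping follows; the names IMPORTANCE_BY_LEVEL and importance_concept are hypothetical and are not defined by this specification.

```python
# Numeric importance values and the concepts they correspond to,
# as listed in the Importance table of this specification.
IMPORTANCE_BY_LEVEL = {
    1: "dcataplt:highlyImportant",
    2: "dcataplt:important",
    3: "dcataplt:moderatelyImportant",
    4: "dcataplt:slightlyImportant",
}


def importance_concept(level: int) -> str:
    """Return the compact concept name for a numeric importance value."""
    try:
        return IMPORTANCE_BY_LEVEL[level]
    except KeyError:
        raise ValueError(f"importance level must be 1-4, got {level!r}") from None


print(importance_concept(2))  # prints dcataplt:important
```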

7.2. Type (dcataplt:Type)

Definition
Information system types, classified by the data processed in them, as defined by Article 9 of the Law on State Information Resources Management of the Republic of Lithuania.
List of possible information system types


Name URI Range Description Reuse
Register information system dcataplt:registerInformationSystem skos:Concept Information systems of registers in which objects are registered, objects and their registration data are processed. P
State information system dcataplt:stateInformationSystem skos:Concept State information systems in which the data necessary for the performance of the functions established by legal acts of entities are processed, including state information systems intended for performing, and processing the data of, the same kind of operational functions of several or many entities, as defined in the legal acts regulating their activities (hereinafter referred to as the general information system). P
Internal information system dcataplt:internalInformationSystem skos:Concept Internal administration information systems, which process data managed by entities, required for internal administration functions or other functions of an auxiliary nature, which help the entity to implement the tasks established by legal acts. P

8. Used BregDCAT-AP classes

Note: In Lithuania, Datasets are described with minor adaptations regarding the citation of the legal source, as the ELI has not yet been adopted in Lithuania. The LegalResource is identified through the ELI attribute named id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts.

8.1. Catalog

Definition
A catalog or repository that hosts the datasets or data services being described.
Class Description Context Range
Catalog A catalog or repository that hosts the datasets or data services being described. This class describes a specific information system - a state information system or register, the scope of which includes data sets. dcat:Catalog

8.2. Agent

Class Description Context Range
Agent Any entity that performs actions related to the Core classes: Catalog, Dataset, Data Service, and Distribution. - foaf:Agent

8.3. Dataset

Class Description Context Range
Dataset A conceptual class that describes the information being collected/provided. Associates a catalog with a dataset that is part of the catalog. Instances of this class describe data sets managed within the scope of the information system. Each data set is assigned to one instance of the information system (Catalog) through a catalog record (CatalogRecord). In this way, specific data sets are described, controlled and managed within the scope of a specific information system. The basis for registering a specific data set (Dataset) in the scope of the information system (Catalog) is the approval of the provisions of the information system (the procedure defined in the methodology of the IS life cycle). Each data set gives meaning to a certain part of the information structure named in the regulations of the state information system or register. The data set is distinguished and formulated as a meaningful and significant concept in the scope of the information system activity, described by significant properties/attributes. dcat:Dataset
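
The link from an information system (Catalog) to a data set through a catalog record can be sketched with a handful of triples. This example assumes the standard DCAT properties dcat:record (catalog to record) and foaf:primaryTopic (record to data set); this specification does not state which properties it uses for the link, and the ex: identifiers and the function catalogs_of are hypothetical.

```python
# Triples as (subject, predicate, object); all ex: identifiers are placeholders.
TRIPLES = [
    ("ex:is-1", "rdf:type", "dcataplt:InformationSystem"),
    ("ex:is-1", "dcat:record", "ex:record-1"),
    ("ex:record-1", "rdf:type", "dcat:CatalogRecord"),
    ("ex:record-1", "foaf:primaryTopic", "ex:dataset-1"),
    ("ex:dataset-1", "rdf:type", "dcat:Dataset"),
]


def catalogs_of(dataset: str, triples: list) -> list:
    """Find the catalogs that hold a catalog record describing the data set."""
    # Step 1: records whose primary topic is the data set.
    records = {s for s, p, o in triples if p == "foaf:primaryTopic" and o == dataset}
    # Step 2: catalogs that list one of those records.
    return sorted(s for s, p, o in triples if p == "dcat:record" and o in records)


print(catalogs_of("ex:dataset-1", TRIPLES))  # prints ['ex:is-1']
```

Because each data set is assigned to one information system, a result with more than one entry would point to a registration problem under this reading.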

8.4. Data Service

Class Description Context Range
Data Service A set of operations that provide access to one or more data sets or data processing functions. - dcat:DataService

8.5. Category, Publisher type, Status

Class Description Context Range
Category, Publisher type, Status A concept defining the resource publisher type, the resource status or another resource category. - skos:Concept

8.6. Distribution

Class Description Context Range
Distribution The physical form of expression of a data set in a particular format, or the realization of a data set. This class describes a concrete and active implementation of a data set in an information system (a state information system or a data management environment), such as a database table, data files stored in a data warehouse, or structured and semi-structured files in a closed intranet or Internet space. In the context of IS, the primary and most common use of this class is to describe the realizations of data sets managed in the information system environment, and to detail their models further in the classes DataObjectTypeSpecification and DataElement. dcat:Distribution

8.7. Licence Document

Class Description Context Range
Licence document A legal document that gives official permission to do something with a resource. - dct:LicenseDocument

8.8. Category Scheme

Class Description Context Range
Category Scheme skos:ConceptScheme

8.9. Legal Resource

Class Description Context Range
Legal Resource Legal sources are legal acts or policies, which are rules that define the services of base registries and other information systems. Note: In Lithuania, Datasets are described with minor adaptations regarding the citation of the legal source, as the ELI has not yet been adopted in Lithuania. The LegalResource is identified through the ELI attribute named id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts. - eli:LegalResource

8.10. Rule

Class Description Context Range
Rule A rule is a document that sets out specific guidelines or procedures to be followed by a base registry. It may contain information and set requirements for the services provided by the public base registry. - cpsv:Rule

8.11. Public Organisation

Class Description Context Range
Public Organisation A public institution is an Agent that is responsible for the provision of public services. This uses a class from the Core Entity Dictionary, also based on the W3C Organizations Ontology. - cv:PublicOrganisation

8.12. Public Registry Service

Class Description Context Range
Public Registry Service This class represents the ability to perform the processes of providing a public service and exists regardless of whether it is used or not. It is a set of actions performed by or on behalf of a public administration body for the benefit of a citizen, business or other public institution. - cpsv:PublicService

8.13. Catalog Record

Class Description Context Range
Catalog record The entry for a dataset in a catalog. - dcat:CatalogRecord

8.14. Checksum

Class Description Context Range
Checksum A value to authenticate the contents of the file. - spdx:Checksum

8.15. Document

Class Description Context Range
Document An abstract class describing different types of documents foaf:Document

8.16. Frequency

Class Description Context Range
Frequency The frequency with which certain events, such as data updates, recur. - dct:Frequency

8.17. Identifier

Class Description Context Range
Identifier An identifier in a particular context, consisting of the string that is the identifier; an optional identifier for the identifier scheme; an optional identifier for the version of the identifier scheme; an optional identifier for the agency that manages the identifier scheme. - adms:Identifier

8.18. Kind

Class Description Context Range
Kind A description following the vCard specification, e.g. to provide the phone number, email address and postal address of a contact. The Kind class is the parent class for the four vCard types (Individual, Organization, Location, Group). vcard:Kind

8.19. Linguistic system

Class Description Context Range
Linguistic system A system of signs, symbols, sounds, gestures or rules used in communication, e.g. a language. - dct:LinguisticSystem

8.20. Location

Class Description Context Range
Location Spatial region or named place. Usage Note: Can be represented using a controlled vocabulary or geographic coordinates. In the latter case, it is recommended to use the Core Location Vocabulary, following the approach described in the GeoDCAT-AP specification. dct:Location

8.21. Media type

Class Description Context Range
Media type Media type, e.g. computer file format. dct:MediaType

8.22. Period of Time

Class Description Context Range
Period of Time Time period defined by start and end dates. - dct:PeriodOfTime

8.23. Resource

Class Description Context Range
Resource Any object that can be described with RDF. This is an abstract class. This class does not impose any requirements on the properties. rdfs:Resource

8.24. Rights Statements

Class Description Context Range
Rights Statements A statement about the intellectual property rights (IPR) held in or over a resource, a legal document that gives formal permission to do something with the resource, or a statement about access rights. - dct:RightsStatement

8.25. Provenance Statement

Class Description Context Range
Provenance Statement A statement of any changes in the ownership and custody of a resource since its creation that are relevant to its authenticity, integrity, and interpretation; a declaration of the origin of the data set. - dct:ProvenanceStatement

8.26. Standard

Class Description Context Range
Standard A standard or other specification to which a data set or distribution conforms. dct:Standard

8.27. Contact Point

Class Description Context Range
Contact Point This class describes contact point information. - schema:ContactPoint

8.28. Quality Annotation

Class Description Context Range
Quality Annotation Represents quality annotations, including ratings, quality certificates, or feedback, that can be associated with datasets or distributions. Quality annotations must contain a single oa:motivatedBy statement with an instance of oa:Motivation (and skos:Concept) that represents the purpose of the quality assessment. We define this instance as dqv:qualityAssessment. - dqv:QualityAnnotation

8.29. Quality Measurement

Class Description Context Range
Quality Measurement This class defines possible quality assessment based on specific quality assessment criteria - dqv:QualityMeasurement

8.30. Activity

Class Description Context Range
Activity Activities are actions that take place over a period of time and affect resources; this may include the consumption, processing, transformation, modification, transfer, use or generation of data or other resources. - prov:Activity

8.31. Address

Class Description Context Range
Address An "address representation" as conceptually defined by the INSPIRE Address Representation data type. locn:Address

9. Used BReg-DCAT-AP controlled vocabularies

Property URI Used in Class Vocabulary name Vocabulary URI Usage note
dct:accessRights Data Service; Dataset; Registry Catalogue; Distribution Access Rights NAL http://publications.europa.eu/resource/authority/access-right Datasets and Data Services must include the level of access rights according to the list of values (i.e., public, non-public, and restricted).
dct:accrualPeriodicity Dataset; Registry Catalogue Frequency NAL http://publications.europa.eu/resource/authority/frequency Datasets and catalogues must specify the frequency of update using the EU Publications Office Frequency NAL (e.g., continuous, daily, hourly, etc.).
dct:creator Dataset; Registry Catalogue Corporate Bodies NAL http://publications.europa.eu/resource/dataset/corporate-body The Corporate bodies NAL includes all the European institutions and a reduced set of international organisations. The Corporate bodies NAL must be used for European institutions and a small set of international organisations. In case of other types of organisations, national, regional, or local vocabularies should be used, if available.
dct:format Data Service; Distribution; Registry Catalogue File Type NAL http://publications.europa.eu/resource/authority/file-type The media type of distributions must be represented using the concrete list of document file types of the EU Publications Office File Type NAL.
dct:language Dataset; Distribution; Registry Catalogue; Rule Languages NAL http://publications.europa.eu/resource/authority/language Descriptions of Datasets, Data Service, Catalogues, Rules, and Distributions must include the languages using the EU Publications Office NAL.
dct:license Data Service; Distribution; Registry Catalogue Licences NAL http://publications.europa.eu/resource/authority/licence This vocabulary must be used in case the licence of a distribution, dataset or catalogue is internationally recognised and included in the EU Publications Office NAL.
dcat:mediaType Distribution IANA Media Types https://www.iana.org/assignments/media-types/media-types.xhtml Distributions must represent the format of the document using the IANA Media Types list (e.g., application/mp4, application/pdf, etc.).
dct:publisher Data Service; Dataset; Registry Catalogue Corporate Bodies NAL http://publications.europa.eu/resource/dataset/corporate-body The Corporate bodies NAL includes all the European institutions and a reduced set of international organisations. The Corporate bodies NAL must be used for European institutions and a small set of international organisations. In case of other types of organisations, national, regional or local vocabularies should be used, if available.
dct:spatial Dataset; Public Organisation; Registry Catalogue; Registry Service Continents NAL, Countries NAL, Places NAL Continents NAL; Countries NAL; Places NAL Spatial coverage must be represented using the NAL according to the scope of the description (i.e., continent, country or region).
adms:status Distribution; Registry Catalogue; Registry Service ADMS Status vocabulary http://purl.org/adms/status/ Data Distributions must indicate the status of the resource according to the ADMS Status vocabulary (i.e., Completed, Deprecated, Under Development).
cv:thematicArea Public Registry Service EuroVoc http://eurovoc.europa.eu/ The use of EuroVoc is recommended (not mandatory) at any hierarchical level, in a flexible way, to describe the thematic area of a service.
dcat:theme Data Service; Dataset Data Theme Taxonomy NAL; EuroVoc Data Theme Taxonomy NAL; EuroVoc The EU Publications Office Data Theme NAL is mandatory to describe catalogues and datasets in open data portals. EuroVoc is recommended at any hierarchical level, in a flexible way, to describe the theme of a resource, dataset or data service.
dcat:themeTaxonomy Registry Catalogue Data Theme Taxonomy NAL; EuroVoc Data Theme Taxonomy NAL; EuroVoc The Registry Catalogue must specify the URI to the thesaurus that defines the potential themes of its resources, as well as the taxonomy of the Data Theme NAL (mandatory) or EuroVoc (recommended).
dct:type Legal Resource Resource Type NAL http://publications.europa.eu/resource/authority/resource-type Legal Resource must indicate the type of the document represented (e.g., Amended proposal, Agreement, etc.).
dct:type Agent ADMS publisher type vocabulary http://purl.org/adms/publishertype/ The list of terms in the ADMS publisher type vocabulary is included in the ADMS specification.
dcatap:availability Distribution Distribution availability vocabulary http://data.europa.eu/r5r/availability/ The list of terms for the availability levels of a dataset distribution, according to the DCAT-AP specification.
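
Several rows in the table above require property values to come from a specific vocabulary. One rough way to check this, sketched below under stated assumptions, is to test whether a value URI falls under the vocabulary URI given in the table. The names VOCABULARY_BASE and uses_expected_vocabulary are hypothetical, only a few rows are included, and a prefix test does not confirm that the concept actually exists in the vocabulary.

```python
# Vocabulary URIs copied from the table above (a subset, for illustration).
VOCABULARY_BASE = {
    "dct:accessRights": "http://publications.europa.eu/resource/authority/access-right",
    "dct:accrualPeriodicity": "http://publications.europa.eu/resource/authority/frequency",
    "dct:format": "http://publications.europa.eu/resource/authority/file-type",
    "dct:language": "http://publications.europa.eu/resource/authority/language",
    "adms:status": "http://purl.org/adms/status/",
}


def uses_expected_vocabulary(prop: str, value_uri: str) -> bool:
    """Check that value_uri lies under the vocabulary URI listed for prop.

    Assumption: concept URIs are formed as '<vocabulary URI>/<code>'.
    """
    base = VOCABULARY_BASE[prop].rstrip("/") + "/"
    return value_uri.startswith(base)


print(uses_expected_vocabulary(
    "dct:accessRights",
    "http://publications.europa.eu/resource/authority/access-right/PUBLIC",
))  # prints True
print(uses_expected_vocabulary(
    "dct:accessRights",
    "https://example.org/access/public",  # placeholder URI outside the vocabulary
))  # prints False
```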

10. Metadata Standard for High Value Datasets (HVD-DCAT-AP-LT)

Metadata information for high-value datasets is provided in this appendix.

11. Acknowledgements

We would like to express our gratitude to everyone who worked on the development of the DCAT-AP-LT model: Kęstutis Andrijauskas, Martynas Mockus, Mantas Zimnickas, Darius Amilevičius, Martynas Daugirdas, Gabrielė Stočkūnaitė, Alanas Lukjanovičius, Vladimiras Desiatnikovas.

12. References

[DATACITE]
DataCite. DataCite. URL: http://www.datacite.org/
[DEU]
The official portal for European data. European Commission. URL: https://data.europa.eu
[DOI]
Digital Object Identifier. DOI Foundation. URL: http://www.doi.org/
[EZID]
EZID. California Digital Library. URL: https://ezid.cdlib.org/
[ODRL]
The Open Digital Rights Language Ontology Version 2.2. W3C POE Working Group. URL: https://www.w3.org/ns/odrl/2/
[PSI]
Directive on open data and the re-use of public sector information (recast). European Union. URL: https://eur-lex.europa.eu/eli/dir/2019/1024/oj
[SEMIC]
JoinUp welcomes Interoperable Europe. European Commission. URL: https://joinup.ec.europa.eu/
[vocab-dcat-1]
Data Catalog Vocabulary (DCAT). Fadi Maali; John Erickson. W3C. 4 February 2020. W3C Recommendation. URL: https://www.w3.org/TR/vocab-dcat-1/
[vocab-dcat-2]
Data Catalog Vocabulary (DCAT) - Version 2. Riccardo Albertoni; David Browning; Simon Cox; Alejandra Gonzalez Beltran; Andrea Perego; Peter Winstanley. W3C. 4 February 2020. W3C Recommendation. URL: https://www.w3.org/TR/vocab-dcat-2/
[vocab-dcat-3]
Data Catalog Vocabulary (DCAT) - Version 3. Simon Cox; Andrea Perego; Alejandra Gonzalez Beltran; Peter Winstanley; Riccardo Albertoni; David Browning. W3C. 18 January 2024. W3C Candidate Recommendation. URL: https://www.w3.org/TR/vocab-dcat-3/
[W3ID]
Permanent Identifiers for the Web. W3C Permanent Identifier Community Group. URL: https://w3id.org/
[DCAT-AP-HVD]
Usage Guidelines of DCAT-AP for High-Value Datasets. European Commission. URL: https://semiceu.github.io/uri.semic.eu-generated/DCAT-AP/releases/2.2.0-hvd/
[FAIR]
How to make your data FAIR. OpenAire. URL: https://www.openaire.eu/how-to-make-your-data-fair
[geodcat-ap]
GeoDCAT-AP: A geospatial extension for the DCAT application profile for data portals in Europe. European Commission. 23 December 2020. URL: https://semiceu.github.io/GeoDCAT-AP/releases/
[HVD]
Implementing Regulation for High Value Datasets. European Union. URL: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32023R0138
[json-ld11]
JSON-LD 1.1. Gregg Kellogg; Pierre-Antoine Champin; Dave Longley. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11/
[prov-o]
PROV-O: The PROV Ontology. Timothy Lebo; Satya Sahoo; Deborah McGuinness. W3C. 30 April 2013. W3C Recommendation. URL: https://www.w3.org/TR/prov-o/
[rfc3986]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[rfc6497]
BCP 47 Extension T - Transformed Content. M. Davis; A. Phillips; Y. Umaoka; C. Falk. IETF. February 2012. Informational. URL: https://www.rfc-editor.org/rfc/rfc6497
[shacl]
Shapes Constraint Language (SHACL). Holger Knublauch; Dimitris Kontokostas. W3C. 20 July 2017. W3C Recommendation. URL: https://www.w3.org/TR/shacl/
Lithuanian Register of Legal Acts
[UAPI]
Universal Application Programming Interface Specification
[DCAT-AP-LT HVD]
Lithuanian High Value Datasets Metadata Specification