1. Introduction
DCAT-AP-LT is a metadata specification for describing Lithuanian datasets and services. DCAT-AP-LT is an extension of the European metadata specification DCAT-AP, developed in accordance with the rules and recommendations established for the formation of the DCAT-AP specification, including ontologies and thesauri as auxiliary tools for data description.
DCAT-AP-LT includes the classes used by BREG-DCAT-AP (https://github.com/SEMICeu/BregDCAT-AP/tree/main/releases/2.1.0) and supplements them with the Information System class, which extends the dcat:Catalog class, and two controlled dictionaries that indicate the type and importance of an Information System in accordance with the procedure established in the State Information Resources Management Law.
1.1. Context
Currently, it is expected that the European Commission will provide support for development of these Common European Data Spaces:
- Agriculture;
- Cultural Heritage;
- Energy;
- Finance;
- Green deal;
- Language;
- Health;
- Manufacturing;
- Media;
- Mobility;
- Public administration;
- Research and Innovation;
- Skills;
- Tourism.
This specification is intended to support these planned data spaces and to make data management within them easier. Once this specification is implemented in the information system that catalogs the data of the registers and other information systems of the Republic of Lithuania, tools will be created to help representatives of the Lithuanian data management areas create data management strategies for registers and other information systems and recognize inappropriate duplication of information. The implementation will also make it possible to share the data of the Republic of Lithuania with the European Union according to the common semantic model.
1.2. Scope
This specification is focused on the implementation of the provisions of the State Information Resources Management Law on metadata managed in information systems.
2. Status
DCAT-AP-LT 0.1
3. Licence
The material in this model is published under the CC-BY 4.0 license, unless otherwise noted.
4. Terminology
Prefix | Name space URI |
---|---|
dcataplt | https://data.gov.lt/onto/DCATAPLT# |
adms | http://www.w3.org/ns/adms# |
cpsv | http://purl.org/vocab/cpsv# |
cv | http://data.europa.eu/m8g/ |
dcat | http://www.w3.org/ns/dcat# |
dqv | http://www.w3.org/ns/dqv# |
eli | http://data.europa.eu/eli/ontology# |
foaf | http://xmlns.com/foaf/0.1/ |
odrl | http://www.w3.org/ns/odrl/2/ |
prov | http://www.w3.org/ns/prov# |
rdfs | http://www.w3.org/2000/01/rdf-schema# |
skos | http://www.w3.org/2004/02/skos/core# |
spdx | http://spdx.org/rdf/terms# |
vann | http://purl.org/vocab/vann/ |
vcard | http://www.w3.org/2006/vcard/ns# |
voaf | http://purl.org/vocommons/voaf# |
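The prefixes listed above let a compact name such as dcat:Catalog stand for a full URI. As a minimal sketch of that expansion (the `expand` helper and the use of Python are illustrative assumptions, not part of this specification; only a subset of the prefixes is copied):

```python
# Namespace prefixes copied from the terminology table (subset only).
PREFIXES = {
    "dcataplt": "https://data.gov.lt/onto/DCATAPLT#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(compact_name: str) -> str:
    """Expand a compact name such as 'dcat:Catalog' into a full URI.

    `expand` is a hypothetical helper used only for illustration.
    """
    prefix, separator, local_name = compact_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {compact_name!r}")
    return PREFIXES[prefix] + local_name


print(expand("dcataplt:InformationSystem"))
print(expand("dcat:Catalog"))
```

A name with a prefix that is not in the table raises `ValueError` rather than being passed through silently.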
5. Model overview
DCAT-AP-LT includes all the classes listed in BREG-DCAT-AP 2.1 and supplements them with an Information System class that extends the dcat:Catalog class and two controlled dictionaries to mark the type and importance of the Information System as defined in the State Information Resources Management Law. In addition, DCAT-AP-HVD classes and additional properties are used to describe the metadata of high-value datasets.
6. DCAT-AP-LT Specific Classes
6.1. Information system (dcataplt:InformationSystem)
- Definition
- The set of software for digitizing the operational processes of a state or municipal institution or agency, other public administration entity, public institution or state-run company, managing data and/or providing administrative or public services electronically.
- Properties
Property | Range | Card | Definition | Description | Reuse |
---|---|---|---|---|---|
dcataplt:importance | skos:Concept | 1..1 | Information system importance. | Information System importance as defined by law of State Information Resources Management. | P |
dcataplt:type | skos:Concept | 1..1 | Information system type. | Information System type as defined by law of State Information Resources Management. | P |
dcataplt:ISImportanceAssessmentURL | rdfs:Resource | 1..1 | Information system importance assessment. | URL reference to information System importance assessment document. | P |
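Each of the three properties above has cardinality 1..1, so a description of an information system needs exactly one value for each. A minimal sketch of such a check follows; the dictionary representation and the name `check_information_system` are assumptions made for illustration, not part of this specification.

```python
# The three properties that section 6.1 lists with cardinality 1..1.
REQUIRED_PROPERTIES = (
    "dcataplt:importance",
    "dcataplt:type",
    "dcataplt:ISImportanceAssessmentURL",
)


def check_information_system(description: dict) -> list:
    """Return a list of cardinality problems; an empty list means none were found.

    `description` maps a property name to the list of values given for it.
    This representation is an assumption made for the example.
    """
    problems = []
    for prop in REQUIRED_PROPERTIES:
        count = len(description.get(prop, []))
        if count != 1:
            problems.append(f"{prop}: expected exactly 1 value, found {count}")
    return problems


example = {
    "dcataplt:importance": ["dcataplt:important"],
    "dcataplt:type": ["dcataplt:stateInformationSystem"],
    # dcataplt:ISImportanceAssessmentURL is deliberately left out.
}
print(check_information_system(example))
```

In practice a SHACL shape would be the more usual way to express this constraint; the sketch only illustrates what the cardinality column means.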
7. DCAT-AP-LT Specific Controlled Vocabularies
7.1. Importance (dcataplt:Importance)
- Definition
- Information system importance as defined by law of State Information Resources Management.
- List of possible importance levels
Value | URI | Range | Description | Reuse |
---|---|---|---|---|
Highly important | dcataplt:highlyImportant | skos:Concept | They include data important for the entire state and/or state registers or state information systems in which these data are processed - value 1 is indicated. | P |
Important | dcataplt:important | skos:Concept | They include data important to several institutions and/or state registers or state information systems in which these data are processed - value 2 is indicated. | P |
Moderately important | dcataplt:moderatelyImportant | skos:Concept | They include data important for one institution and/or departmental registers or state information systems in which these data are processed - value 3 is indicated. | P |
Slightly important | dcataplt:slightlyImportant | skos:Concept | They include data managed by the institution in the performance of internal administration functions and/or information systems in which these data are processed. The procedure for the establishment, development, modernization and liquidation of the information systems mentioned in this point shall be determined by the institutions authorized by the Government of the Republic of Lithuania - value 4 is indicated. | P |
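The four importance levels pair a concept URI with a numeric value from 1 to 4. The following sketch shows one way to hold that pairing in code. The `Importance` enumeration and `importance_from_level` are hypothetical names, and the fourth level is read here as value 4.

```python
from enum import Enum


class Importance(Enum):
    """Importance levels from section 7.1 with the numeric values given there."""

    HIGHLY_IMPORTANT = ("dcataplt:highlyImportant", 1)
    IMPORTANT = ("dcataplt:important", 2)
    MODERATELY_IMPORTANT = ("dcataplt:moderatelyImportant", 3)
    SLIGHTLY_IMPORTANT = ("dcataplt:slightlyImportant", 4)

    def __init__(self, uri: str, level: int):
        # Each member carries its compact concept URI and its numeric level.
        self.uri = uri
        self.level = level


def importance_from_level(level: int) -> Importance:
    """Look up the importance concept for a numeric level (1 to 4)."""
    for item in Importance:
        if item.level == level:
            return item
    raise ValueError(f"no importance level defined for value {level}")


print(importance_from_level(2).uri)
```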
7.2. Type (dcataplt:Type)
- Definition
- Information system types, distinguished by the data processed in them, as defined by Article 9 of the Law on State Information Resources Management of the Republic of Lithuania.
- List of possible information system types
Name | URI | Range | Description | Reuse |
---|---|---|---|---|
Register information system | dcataplt:registerInformationSystem | skos:Concept | Information systems of registers in which objects are registered, objects and their registration data are processed. | P |
State information system | dcataplt:stateInformationSystem | skos:Concept | State information systems in which the data necessary for the performance of the functions established by legal acts of entities are processed, including state information systems intended for the performance and data processing of several or many entities of the same kind of operational functions defined in the legal acts regulating their activities (hereinafter referred to as the general information system). | P |
Internal information system | dcataplt:internalInformationSystem | skos:Concept | Internal administration information systems, which process data managed by entities, required for internal administration functions or other functions of an auxiliary nature, which help the entity to implement the tasks established by legal acts. | P |
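To show how the class, its properties and the two controlled vocabularies fit together, the sketch below writes one information system description as N-Triples lines. The subject and assessment URLs are placeholders under example.org, and the function name is hypothetical. The dcataplt namespace and the concept names are taken from sections 4, 6 and 7.

```python
DCATAPLT = "https://data.gov.lt/onto/DCATAPLT#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Concept names from sections 7.1 and 7.2.
KNOWN_TYPES = {
    "registerInformationSystem",
    "stateInformationSystem",
    "internalInformationSystem",
}
KNOWN_IMPORTANCE = {
    "highlyImportant",
    "important",
    "moderatelyImportant",
    "slightlyImportant",
}


def information_system_triples(subject, is_type, importance, assessment_url):
    """Return N-Triples lines describing one information system (illustrative only)."""
    if is_type not in KNOWN_TYPES:
        raise ValueError(f"unknown information system type: {is_type}")
    if importance not in KNOWN_IMPORTANCE:
        raise ValueError(f"unknown importance level: {importance}")
    triples = [
        (subject, RDF_TYPE, DCATAPLT + "InformationSystem"),
        (subject, DCATAPLT + "type", DCATAPLT + is_type),
        (subject, DCATAPLT + "importance", DCATAPLT + importance),
        (subject, DCATAPLT + "ISImportanceAssessmentURL", assessment_url),
    ]
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in triples]


for line in information_system_triples(
    "https://example.org/is/1",  # placeholder subject URI
    "stateInformationSystem",
    "important",
    "https://example.org/is/1/assessment.pdf",  # placeholder document URL
):
    print(line)
```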
8. Used BregDCAT-AP classes
Note: In Lithuania, Datasets are described with minor adaptations regarding the citation of the legal source, as the ELI has not yet been adopted in Lithuania. LegalResource resource is an attribute under ELI with the name id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts.
8.1. Catalog
- Definition
- A directory or repository that stores descriptive datasets or data services.
Class | Description | Context | Range |
---|---|---|---|
Catalog | A directory or repository that stores descriptive datasets or data services | This class describes a specific information system - a state information system or register, the scope of which includes data sets. | dcat:Catalog |
8.2. Agent
Class | Description | Context | Range |
---|---|---|---|
Agent | Any entity that performs actions related to the Core classes: Catalog, Dataset, Data Service, and Distribution. | - | foaf:Agent |
8.3. Dataset
Class | Description | Context | Range |
---|---|---|---|
Dataset | A conceptual class that describes the information being collected/provided. Associates a directory with a dataset that is part of the directory. | Instances of this class describe data sets managed within the scope of the information system. Each data set is assigned to one instance of the information system (Catalog) through the catalog record link CatalogRecord. In this way, specific data sets are described, controlled and managed within the scope of a specific information system. The basis for registering a specific data set (Dataset) in the scope of the information system (Catalog) is the approval of the provisions of the information system (the procedure defined in the methodology of the IS life cycle). Each data set gives meaning to a certain part of the information structure to be named in the regulations of the state information system or register. The data set is distinguished and formulated as a meaningful and significant concept in the scope of the information system activity, described by significant properties/attributes. | dcat:Dataset |
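The context above says that each data set is assigned to one information system (Catalog) through a CatalogRecord. A minimal sketch of a check for that rule follows. It assumes each CatalogRecord has been reduced to a (catalog, dataset) pair of identifiers. The function names and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def catalogs_per_dataset(records):
    """Group catalog identifiers by dataset.

    `records` is an iterable of (catalog_id, dataset_id) pairs,
    one pair per CatalogRecord (an assumed, simplified representation).
    """
    grouped = defaultdict(set)
    for catalog_id, dataset_id in records:
        grouped[dataset_id].add(catalog_id)
    return dict(grouped)


def datasets_in_several_catalogs(records):
    """Return the datasets that are assigned to more than one catalog."""
    grouped = catalogs_per_dataset(records)
    return sorted(ds for ds, catalogs in grouped.items() if len(catalogs) > 1)


sample_records = [
    ("is-a", "ds-1"),
    ("is-a", "ds-2"),
    ("is-b", "ds-2"),  # ds-2 is recorded in two information systems
]
print(datasets_in_several_catalogs(sample_records))
```

Data sets that have no CatalogRecord at all do not appear in `records`, so detecting them would need a separate list of all known data sets.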
8.4. Data Service
Class | Description | Context | Range |
---|---|---|---|
Data Service | A set of operations that provide access to one or more data sets or data processing functions. | - | dcat:DataService |
8.5. Category, Publisher type, Status
Class | Description | Context | Range |
---|---|---|---|
Category, Publisher type, Status | A concept defining the resource publisher type, resource status or other resource category | - | skos:Concept |
8.6. Distribution
Class | Description | Context | Range |
---|---|---|---|
Distribution | The physical form of expression of a data set in a particular format, or the realization of a data set. | This Class describes a concrete and active implementation of a data set in an information system - a state information system or a data management environment, such as a database table, data files stored in a data warehouse, structured and semi-structured files in a closed intranet or Internet space. In the context of IS, the primary and most extensive application of this Class is the description of realizations of data sets managed in the information system environment, and further the detailing of their models in the Classes DataObjectTypeSpecification and DataElement. | dcat:Distribution |
8.7. Licence Document
Class | Description | Context | Range |
---|---|---|---|
Licence document | A legal document that gives official permission to do something with a resource. | - | dct:LicenseDocument |
8.8. Category Scheme
Class | Description | Context | Range |
---|---|---|---|
Category Scheme | - | - | skos:ConceptScheme |
8.9. Legal Resource
Class | Description | Context | Range |
---|---|---|---|
Legal Resource | Legal sources are legal acts or policies, which are rules that define the services of base registries and other information systems. Note: In Lithuania, Datasets are described with minor adaptations regarding the citation of the legal source, as the ELI has not yet been adopted in Lithuania. LegalResource resource is an attribute under ELI with the name id_local. The local identifier is used as the URI of the legal resource published in the Register of Legal Acts. | - | eli:LegalResource |
8.10. Rule
Class | Description | Context | Range |
---|---|---|---|
Rule | A rule is a document that sets out specific guidelines or procedures to be followed by a base registry. It may contain information and set requirements for the services provided by the public base registry. | - | cpsv:Rule |
8.11. Public Organisation
Class | Description | Context | Range |
---|---|---|---|
Public Organisation | A public institution is an Agent that is responsible for the provision of public services. This uses a class from the Core Entity Dictionary, also based on the W3C Organizations Ontology. | - | cv:PublicOrganisation |
8.12. Public Registry Service
Class | Description | Context | Range |
---|---|---|---|
Public Registry Service | This class represents the ability to perform the processes of providing a public service and exists regardless of whether it is used or not. It is a set of actions performed by or on behalf of a public administration body for the benefit of a citizen, business or other public institution. | - | cpsv:PublicService |
8.13. Catalog Record
Class | Description | Context | Range |
---|---|---|---|
Catalog record | The entry of a dataset in the catalog. | - | dcat:CatalogRecord |
8.14. Checksum
Class | Description | Context | Range |
---|---|---|---|
Checksum | A value to authenticate the contents of the file. | - | spdx:Checksum |
8.15. Document
Class | Description | Context | Range |
---|---|---|---|
Document | An abstract class describing different types of documents | - | foaf:Document |
8.16. Frequency
Class | Description | Context | Range |
---|---|---|---|
Frequency | The rate at which certain events recur, such as data updates. | - | dct:Frequency |
8.17. Identifier
Class | Description | Context | Range |
---|---|---|---|
Identifier | An identifier in a particular context, consisting of the string that is the identifier; an optional identifier for the identifier scheme; an optional identifier for the version of the identifier scheme; and an optional identifier for the agency that manages the identifier scheme. | - | adms:Identifier |
8.18. Kind
Class | Description | Context | Range |
---|---|---|---|
Kind | A description according to the vCard specification, e.g. to specify a contact's phone number, email and postal address. | The Kind class is the parent class for the four vCard types (Individual, Organization, Location, Group). | vcard:Kind |
8.19. Linguistic system
Class | Description | Context | Range |
---|---|---|---|
Linguistic system | A system of signs, symbols, sounds, gestures or rules used in communication, e.g. a language. | - | dct:LinguisticSystem |
8.20. Location
Class | Description | Context | Range |
---|---|---|---|
Location | Spatial region or named place. | Usage Note: Can be represented using a controlled vocabulary or geographic coordinates. In the latter case, it is recommended to use the Core Location Vocabulary, following the approach described in the GeoDCAT-AP specification. | dct:Location |
8.21. Media type
Class | Description | Context | Range |
---|---|---|---|
Media type | Media type, e.g. computer file format. | - | dct:MediaType |
8.22. Period of Time
Class | Description | Context | Range |
---|---|---|---|
Period of Time | Time period defined by start and end dates. | - | dct:PeriodOfTime |
8.23. Resource
Class | Description | Context | Range |
---|---|---|---|
Resource | Any object that can be described with RDF. | This is an abstract class. This class does not impose any requirements on the properties. | rdfs:Resource |
8.24. Rights Statements
Class | Description | Context | Range |
---|---|---|---|
Rights Statements | A declaration of the intellectual property rights (IPR) held in or over a resource, a legal document that gives formal permission to do something with the resource, or a declaration of access rights. | - | dct:RightsStatement |
8.25. Provenance Statement
Class | Description | Context | Range |
---|---|---|---|
Provenance Statement | A statement of any changes in the ownership and custody of a resource since its creation that are relevant to its authenticity, integrity, and interpretation; a declaration of the origin of the data set. | - | dct:ProvenanceStatement |
8.26. Standard
Class | Description | Context | Range |
---|---|---|---|
Standard | A standard or other specification to which a data set or representation conforms | - | dct:Standard |
8.27. Contact Point
Class | Description | Context | Range |
---|---|---|---|
Contact Point | This class describes contact point information. | - | schema:ContactPoint |
8.28. Quality Annotation
Class | Description | Context | Range |
---|---|---|---|
Quality Annotation | Represents quality annotations, including ratings, quality certificates, or feedback, that can be associated with datasets or distributions. Quality annotations must contain a single oa:motivatedBy statement with an instance of oa:Motivation (and skos:Concept) that represents the purpose of the quality assessment. We define this instance as dqv:qualityAssessment. | - | dqv:QualityAnnotation |
8.29. Quality Measurement
Class | Description | Context | Range |
---|---|---|---|
Quality Measurement | This class defines possible quality assessment based on specific quality assessment criteria | - | dqv:QualityMeasurement |
8.30. Activity
Class | Description | Context | Range |
---|---|---|---|
Activity | Activities are actions that take place over a period of time and affect resources; this may include the consumption, processing, transformation, modification, transfer, use or generation of data or other resources. | - | prov:Activity |
8.31. Address
Class | Description | Context | Range |
---|---|---|---|
Address | An address representation as conceptually defined by the INSPIRE Address Representation data type. | - | locn:Address |
9. BReg-DCAT-AP used controlled vocabularies
Property URI | Used in Class | Vocabulary name | Vocabulary URI | Usage note |
---|---|---|---|---|
dct:accessRights | Data Service; Dataset; Registry Catalogue; Distribution | Access Rights NAL | http://publications.europa.eu/resource/authority/access-right | Datasets and Data Services must include the level of access rights according to the list of values (i.e., public, non-public, and restricted). |
dct:accrualPeriodicity | Dataset; Registry Catalogue | Frequency NAL | http://publications.europa.eu/resource/authority/frequency | Datasets and catalogues must specify the frequency of update using the EU Publications Office File Frequency NAL (e.g., continuous, daily, hourly, etc.). |
dct:creator | Dataset; Registry Catalogue | Corporate Bodies NAL | http://publications.europa.eu/resource/dataset/corporate-body | The Corporate bodies NAL includes all the European institutions and a reduced set of international organisations. The Corporate bodies NAL must be used for European institutions and a small set of international organisations. In case of other types of organisations, national, regional, or local vocabularies should be used, if available. |
dct:format | Data Service; Distribution; Registry Catalogue | File Type NAL | http://publications.europa.eu/resource/authority/file-type | The media type of distributions must be represented using the concrete list of document file types of the EU Publications Office File Type NAL. |
dct:language | Dataset; Distribution; Registry Catalogue; Rule | Languages NAL | http://publications.europa.eu/resource/authority/language | Descriptions of Datasets, Data Service, Catalogues, Rules, and Distributions must include the languages using the EU Publications Office NAL. |
dct:license | Data Service; Distribution; Registry Catalogue | Licences NAL | http://publications.europa.eu/resource/authority/licence | This vocabulary must be used in case the licence of a distribution, dataset or catalogue is internationally recognised and included in the EU Publications Office NAL. |
dcat:mediaType | Distribution | IANA Media Types | https://www.iana.org/assignments/media-types/media-types.xhtml | Distributions must represent the format of the document using the IANA Media Types list (e.g., application/mp4, application/pdf, etc.). |
dct:publisher | Data Service; Dataset; Registry Catalogue | Corporate Bodies NAL | http://publications.europa.eu/resource/dataset/corporate-body | The Corporate bodies NAL includes all the European institutions and a reduced set of international organisations. The Corporate bodies NAL must be used for European institutions and a small set of international organisations. In case of other types of organisations, national, regional or local vocabularies should be used, if available. |
dct:spatial | Dataset; Public Organisation; Registry Catalogue; Registry Service | Continents NAL, Countries NAL, Places NAL | Continents NAL Countries NAL Places NAL | Spatial coverage must be represented using the NAL according to the scope of the description (i.e., continent, country or region). |
adms:status | Distribution; Registry Catalogue; Registry Service | ADMS Status vocabulary | http://purl.org/adms/status/ | Data Distributions must indicate the status of the resource according to the ADMS Status vocabulary (i.e., Completed, Deprecated, Under Development). |
cv:thematicArea | Public Registry Service | EuroVoc | http://eurovoc.europa.eu/ | The use of EuroVoc is recommended (not mandatory) at any hierarchical level, in a flexible way, to describe the thematic area of a service. |
dcat:theme | Data Service; Dataset | Data Theme Taxonomy NAL; EuroVoc | Data Theme Taxonomy NAL EuroVoc | The EU Publications Office Data Theme NAL is mandatory to describe catalogues and datasets in open data portals. EuroVoc is recommended at any hierarchical level, in a flexible way, to describe the theme of a resource, dataset or data service. |
dcat:themeTaxonomy | Registry Catalogue | Data Theme Taxonomy NAL; EuroVoc | Data Theme Taxonomy NAL EuroVoc | The Registry Catalogue must specify the URI to the thesaurus that defines the potential themes of its resources, as well as the taxonomy of the Data Theme NAL (mandatory) or EuroVoc (recommended). |
dct:type | Legal Resource | Resource Type NAL | http://publications.europa.eu/resource/authority/resource-type | Legal Resource must indicate the type of the document represented (e.g., Amended proposal, Agreement, etc.). |
dct:type | Agent | ADMS publisher type vocabulary | http://purl.org/adms/publishertype/ | The list of terms in the ADMS publisher type vocabulary is included in the ADMS specification. |
dcatap:availability | Distribution | Distribution availability vocabulary | http://data.europa.eu/r5r/availability/ | The list of terms for the availability levels of a dataset distribution, according to the DCAT-AP specification. |
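Several rows in the table above require that a property value be drawn from a specific vocabulary. The sketch below checks this by testing whether a value URI starts with the vocabulary URI. It assumes that concept URIs are formed by appending a path segment to the vocabulary URI, which the table itself does not state. The function name is hypothetical and only a few rows are copied.

```python
# Vocabulary URIs copied from the table above (subset only).
REQUIRED_VOCABULARIES = {
    "dct:accessRights": "http://publications.europa.eu/resource/authority/access-right",
    "dct:accrualPeriodicity": "http://publications.europa.eu/resource/authority/frequency",
    "dct:format": "http://publications.europa.eu/resource/authority/file-type",
    "dct:language": "http://publications.europa.eu/resource/authority/language",
}


def uses_required_vocabulary(prop: str, value_uri: str) -> bool:
    """Check that `value_uri` lies under the vocabulary listed for `prop`.

    Assumption: concept URIs extend the vocabulary URI with '/<code>'.
    Properties not listed in this sketch are accepted without a check.
    """
    base = REQUIRED_VOCABULARIES.get(prop)
    if base is None:
        return True
    return value_uri.startswith(base + "/")


print(uses_required_vocabulary(
    "dct:language",
    "http://publications.europa.eu/resource/authority/language/LIT",
))
print(uses_required_vocabulary("dct:language", "http://example.org/lt"))
```

A prefix test only confirms that a value points into the right vocabulary. It does not confirm that the concept exists there.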
10. Metadata Standard for High Value Datasets (HVD-DCAT-AP-LT)
Metadata information for high-value datasets is provided in this appendix.
11. Acknowledgements
We would like to express our gratitude to everyone who worked on the development of the DCAT-AP-LT model: Kęstutis Andrijauskas, Martynas Mockus, Mantas Zimnickas, Darius Amilevičius, Martynas Daugirdas, Gabrielė Stočkūnaitė, Alanas Lukjanovičius, Vladimiras Desiatnikovas.
12. References
- [DATACITE]
- DataCite. DataCite. URL: http://www.datacite.org/
- [DEU]
- The official portal for European data. European Commission. URL: https://data.europa.eu
- [DOI]
- Digital Object Identifier. DOI Foundation. URL: http://www.doi.org/
- [EZID]
- EZID. California Digital Library. URL: https://ezid.cdlib.org/
- [ODRL]
- The Open Digital Rights Language Ontology Version 2.2. W3C POE Working Group. URL: https://www.w3.org/ns/odrl/2/
- [PSI]
- Directive on open data and the re-use of public sector information (recast). European Union. URL: https://eur-lex.europa.eu/eli/dir/2019/1024/oj
- [SEMIC]
- JoinUp welcomes Interoperable Europe. European Commission. URL: https://joinup.ec.europa.eu/
- [vocab-dcat-1]
- Data Catalog Vocabulary (DCAT). Fadi Maali; John Erickson. W3C. 4 February 2020. W3C Recommendation. URL: https://www.w3.org/TR/vocab-dcat-1/
- [vocab-dcat-2]
- Data Catalog Vocabulary (DCAT) - Version 2. Riccardo Albertoni; David Browning; Simon Cox; Alejandra Gonzalez Beltran; Andrea Perego; Peter Winstanley. W3C. 4 February 2020. W3C Recommendation. URL: https://www.w3.org/TR/vocab-dcat-2/
- [vocab-dcat-3]
- Data Catalog Vocabulary (DCAT) - Version 3. Simon Cox; Andrea Perego; Alejandra Gonzalez Beltran; Peter Winstanley; Riccardo Albertoni; David Browning. W3C. 18 January 2024. W3C Candidate Recommendation. URL: https://www.w3.org/TR/vocab-dcat-3/
- [W3ID]
- Permanent Identifiers for the Web. W3C Permanent Identifier Community Group. URL: https://w3id.org/
- [DCAT-AP-HVD]
- Usage Guidelines of DCAT-AP for High-Value Datasets. European Commission. URL: https://semiceu.github.io/uri.semic.eu-generated/DCAT-AP/releases/2.2.0-hvd/
- [FAIR]
- How to make your data FAIR. OpenAire. URL: https://www.openaire.eu/how-to-make-your-data-fair
- [geodcat-ap]
- GeoDCAT-AP: A geospatial extension for the DCAT application profile for data portals in Europe. European Commission. 23 December 2020. URL: https://semiceu.github.io/GeoDCAT-AP/releases/
- [HVD]
- Implementing Regulation for High Value Datasets. European Union. URL: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32023R0138
- [json-ld11]
- JSON-LD 1.1. Gregg Kellogg; Pierre-Antoine Champin; Dave Longley. W3C. 16 July 2020. W3C Recommendation. URL: https://www.w3.org/TR/json-ld11/
- [prov-o]
- PROV-O: The PROV Ontology. Timothy Lebo; Satya Sahoo; Deborah McGuinness. W3C. 30 April 2013. W3C Recommendation. URL: https://www.w3.org/TR/prov-o/
- [rfc3986]
- Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
- [rfc6497]
- BCP 47 Extension T - Transformed Content. M. Davis; A. Phillips; Y. Umaoka; C. Falk. IETF. February 2012. Informational. URL: https://www.rfc-editor.org/rfc/rfc6497
- [shacl]
- Shapes Constraint Language (SHACL). Holger Knublauch; Dimitris Kontokostas. W3C. 20 July 2017. W3C Recommendation. URL: https://www.w3.org/TR/shacl/
- [REGISTER OF LEGAL ACTS IN LITHUANIA]
- Lithuanian Register of Legal Acts
- [UAPI]
- Universal Application Programming Interface Specification
- [DCAT-AP-LT HVD]
- Lithuanian High Value Datasets Metadata Specification