RML takes advantage of W3C-standardized or widely accepted vocabularies, originally used to advertise services or datasets, to define how to retrieve and access data sources, whether or not they are available on the Web. Such descriptions, either provided by data owners/publishers or defined by data publishers/consumers, specify how the data is retrieved.
Data Catalog Vocabulary (DCAT) is the W3C-recommended vocabulary for describing datasets in data catalogs, enabling applications to easily consume the underlying data.
dcat:Dataset represents a dataset in the catalog. DCAT considers a dataset to be a collection of data, published or curated by a single agent, and available for access or download in one or more formats.
dcat:Distribution represents an accessible form of a dataset, e.g., a downloadable file, an RSS feed or a Web Service.
An example is shown below.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
<#DCAT_source>
    a dcat:Dataset ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <http://example.org/file.xml> ] .
Hydra core vocabulary (Hydra) is a lightweight vocabulary, published by the Hydra W3C Community Group, for describing hypermedia-driven Web APIs.
Hydra can describe both static data sources, identified by a URI, and dynamic sources, described by a URI template containing variables whose values depend on information known only to the client.
Hydra enables a server to advertise valid state transitions.
A client can use this information to construct HTTP requests to retrieve the data.
hydra:IriTemplate represents an IRI template.
hydra:IriTemplateMapping represents a mapping from an IRI template variable to a property.
An example is shown below.
@prefix hydra: <http://www.w3.org/ns/hydra/core#> .
<#API_template_source>
    a hydra:IriTemplate ;
    hydra:template "https://biblio.ugent.be/publication/{id}?format={format}" ;
    hydra:mapping
        [ a hydra:IriTemplateMapping ;
          hydra:variable "id" ;
          hydra:required true ],
        [ a hydra:IriTemplateMapping ;
          hydra:variable "format" ;
          hydra:required false ] .
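To illustrate how a client might turn such a description into a concrete request URL, here is a minimal Python sketch. It assumes the template string and the required flags have already been read from the Hydra description; the function name `expand` and the example values are hypothetical, and the substitution is a simplification rather than a full RFC 6570 implementation.

```python
import re
from urllib.parse import quote

# Values copied from the Hydra description above.
TEMPLATE = "https://biblio.ugent.be/publication/{id}?format={format}"
REQUIRED = {"id": True, "format": False}


def expand(template, required, values):
    """Replace each {variable} in the template with a percent-encoded value."""
    missing = [name for name, needed in required.items()
               if needed and name not in values]
    if missing:
        raise ValueError(f"missing required variables: {missing}")
    # Assumption of this sketch: unbound optional variables become "".
    return re.sub(
        r"\{(\w+)\}",
        lambda m: quote(str(values.get(m.group(1), "")), safe=""),
        template,
    )


# Hypothetical values supplied by the client.
url = expand(TEMPLATE, REQUIRED, {"id": "12345", "format": "json"})
```

A variable marked as required that the client cannot supply raises an error here, which mirrors the intent of hydra:required without claiming to be how any particular RML processor behaves.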
The D2RQ Mapping Language (D2RQ) is a declarative mapping language for describing the relation between a relational database schema and RDFS vocabularies or OWL ontologies.
D2RQ defines d2rq:Database to represent a JDBC connection to a local or remote relational database.
An example is shown below.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
<#DB_source> a d2rq:Database ;
    d2rq:jdbcDSN "jdbc:mysql://localhost/example" ;
    d2rq:jdbcDriver "com.mysql.jdbc.Driver" ;
    d2rq:username "user" ;
    d2rq:password "password" .
SPARQL service description (SPARQL-SD) is a W3C-standardized vocabulary for describing SPARQL services made available via the SPARQL 1.1 Protocol.
These descriptions provide a mechanism by which a client or end user can discover information about the SPARQL service and details about the available dataset.
SPARQL-SD defines sd:Service to represent a SPARQL service made available via the SPARQL Protocol, sd:Dataset to represent an RDF Dataset comprising a default graph and zero or more named graphs, and sd:Graph to represent the description of an RDF graph.
Two examples are shown below.
The first is a SPARQL-SD description of a SPARQL endpoint set to return data in JSON format.
The second is a SPARQL-SD description of a SPARQL endpoint set to return data in XML.
@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
<#SPARQL_JSON_source> a sd:Service ;
    sd:endpoint <http://dbpedia.org/sparql/> ;
    sd:supportedLanguage sd:SPARQL11Query ;
    sd:resultFormat <http://www.w3.org/ns/formats/SPARQL_Results_JSON> .

@prefix sd: <http://www.w3.org/ns/sparql-service-description#> .
<#SPARQL_XML_source> a sd:Service ;
    sd:endpoint <http://dbpedia.org/sparql/> ;
    sd:supportedLanguage sd:SPARQL11Query ;
    sd:resultFormat <http://www.w3.org/ns/formats/SPARQL_Results_XML> .
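As an illustration of how a client could use such a service description, the following Python sketch builds (but does not send) a SPARQL 1.1 Protocol GET request. It assumes the endpoint and result format IRIs have already been extracted from the description; the function name `build_request` and the sample query are hypothetical, and the media-type table covers only the two formats used above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Media types corresponding to the two W3C format IRIs used in the examples.
MEDIA_TYPES = {
    "http://www.w3.org/ns/formats/SPARQL_Results_JSON":
        "application/sparql-results+json",
    "http://www.w3.org/ns/formats/SPARQL_Results_XML":
        "application/sparql-results+xml",
}


def build_request(endpoint, result_format, query):
    """Prepare a GET request whose Accept header matches sd:resultFormat."""
    accept = MEDIA_TYPES[result_format]
    # The SPARQL 1.1 Protocol passes the query in the "query" parameter.
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


# Hypothetical query; nothing is sent over the network here.
request = build_request(
    "http://dbpedia.org/sparql/",
    "http://www.w3.org/ns/formats/SPARQL_Results_JSON",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
)
```

Passing the prepared request to `urllib.request.urlopen` would execute it; that step is left out so the sketch stays independent of network access.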
CSV on the Web Vocabulary (CSVW) is a W3C Recommendation vocabulary for metadata that annotates tabular data.
CSVW defines csvw:Table, which represents a table within a CSV file, and csvw:Dialect, which represents a CSV dialect and, as part of a table description, tells parsers how to parse the file.
An example is shown below.
@prefix csvw: <http://www.w3.org/ns/csvw#> .
<#CSVW_source> a csvw:Table ;
    csvw:url "http://rml.io/data/csvw/Airport.csv" ;
    csvw:dialect [ a csvw:Dialect ;
        csvw:delimiter ";" ;
        csvw:encoding "UTF-8" ;
        csvw:header true ] .
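To show how a dialect description can steer parsing, here is a minimal Python sketch that applies the delimiter, encoding and header settings from the example above to raw bytes. The dictionary keys, the function name `read_table` and the sample rows are assumptions made for illustration; the content of the actual Airport.csv file is not reproduced here.

```python
import csv
import io

# Dialect settings taken from the CSVW description above.
DIALECT = {"delimiter": ";", "encoding": "UTF-8", "header": True}


def read_table(raw_bytes, dialect):
    """Decode and parse CSV bytes according to the given dialect settings."""
    text = raw_bytes.decode(dialect["encoding"])
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=dialect["delimiter"])
    rows = list(reader)
    if dialect["header"] and rows:
        # The first row holds column names; return one dict per data row.
        names, body = rows[0], rows[1:]
        return [dict(zip(names, row)) for row in body]
    return rows


# Hypothetical sample data, not the real file.
records = read_table(b"id;name\n1;Brussels Airport\n", DIALECT)
```

With the header flag set, each record can be addressed by column name, which is what a CSV reference in a mapping ultimately relies on.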
Original data: data stored in a local file. Access description: provide the path to the file as the source. An example is shown below.
<#TriplesMapLocalFile> rml:logicalSource [
    rml:source "/path/to/local/file.csv" ;
    rml:referenceFormulation ql:CSV ] .
Original data: data published in a data catalog on the Web. Access description: provide the description of the published dataset and its distribution as the data source (DCAT). An example is shown below.
<#TriplesMapDCAT> rml:logicalSource [
    rml:source <#DCAT_source> ;
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/" ] .
Original data: data published on the Web and accessed via a Web API. Access description: provide the API description or, at least, the description of the IRI (template) as the data source (Web API). An example is shown below.
<#TriplesMapWebAPI> rml:logicalSource [
    rml:source <#API_template_source> ;
    rml:referenceFormulation ql:JSONPath ;
    rml:iterator "$" ] .
Original data: data stored in a database. Access description: provide the database connectivity description as the data source (Database description). An example is shown below.
<#TriplesMapDatabase> rml:logicalSource [
    rml:source <#DB_source> ;
    rr:sqlVersion rr:SQL2008 ;
    rml:query """
        SELECT DEPTNO, DNAME, LOC,
               (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
        FROM DEPT; """ ] .
Original data: data that is already in RDF. Access description: provide the SPARQL service description as the data source (SPARQL). An example is shown below.
<#TriplesMapSPARQL> rml:logicalSource [
    rml:source <#SPARQL_XML_source> ;
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/" ;
    rml:query """ select distinct ?resource ?resource_label
        where { ?resource rdfs:label ?resource_label } """ ] .