RDF Mapping Language (RML)

Unofficial Draft

Latest published version:
https://www.w3.org/RML/
Latest editor's draft:
https://rml.io/specs/rml/
Editors:
Anastasia Dimou (DTAI - KU Leuven) (main editor)
Miel Vander Sande (original contributor)
Authors:
Ben De Meester (IDLab - imec - Ghent University)
Pieter Heyvaert (IDLab - imec - Ghent University)
Thomas Delva (IDLab - imec - Ghent University)

Abstract

This document describes RML, a generic mapping language, based on and extending [R2RML]. The RDF Mapping Language (RML) is a mapping language defined to express customized mapping rules from heterogeneous data structures and serializations to the RDF [RDF-CONCEPTS] data model. RML is defined as a superset of the W3C-standardized mapping language [R2RML], aiming to extend its applicability and broaden its scope by adding support for data in other structured formats. [R2RML] is the W3C standard to express customized mappings from relational databases to RDF. RML follows exactly the same syntax as R2RML; therefore, RML mappings are themselves RDF graphs. The present document describes the RML language and its concepts through definitions and examples.

The Knowledge Graph Construction W3C Community Group is developing a new version of the RML specification at https://w3id.org/rml/portal with prefix https://w3id.org/rml/. This document covers the original RML specification.

The version of this document is v1.1.2.

Status of This Document

This document is a draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organization.

1. Overview

This section is non-normative.

This document describes RML, a language for expressing customized mappings from heterogeneous data structures and serializations to RDF datasets. RML is currently defined for sources in structured formats, e.g. databases as in [R2RML], and CSV, TSV, XML and JSON data sources. Such mappings describe how existing data can be represented using the RDF data model. RML is based on and extends [R2RML]. R2RML is defined to express customized mappings only from relational databases to RDF datasets.

Unlike an R2RML mapping, an RML mapping is not tailored to a specific database schema, but can be defined for data in any other source format (currently defined for data sources of structured formats, e.g. CSV, TSV, XML, JSON). RML keeps the mapping definitions as in R2RML but excludes its database-specific references from the core model of the mapping definition. RML provides a generic way of defining the mappings that is easily transferable to cover references to other data structures. Thus, RML is a generic approach combined with case-specific extensions, but it always remains backward compatible with R2RML, as relational databases form one such specific case. An RML mapping definition follows the same syntax as R2RML.

No mapping formalisation exists to define how to map such heterogeneous sources into RDF in an integrated and interoperable fashion.

The input to an RML mapping can be any data source. The output is an RDF dataset that uses predicates and types from the target vocabulary. RML follows the same syntax as R2RML; therefore the mapping definitions are expressed as RDF graphs.

1.1 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

In this document, examples assume the following namespace prefix bindings unless otherwise stated:

Prefix IRI
rml: http://semweb.mmlab.be/ns/rml#
ql: http://semweb.mmlab.be/ns/ql#
rr: http://www.w3.org/ns/r2rml#
rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs: http://www.w3.org/2000/01/rdf-schema#
xsd: http://www.w3.org/2001/XMLSchema#
ex: http://example.com/ns#

Gray boxes contain RDFS definitions of RML vocabulary terms:

# This box contains RDFS definitions of RML vocabulary terms

Yellow boxes contain example fragments of RML mappings in Turtle syntax:

# This box contains example RML mappings

Blue tables contain example input into an RML mapping:

#This box contains example input

Green boxes contain example output:

# This box contains example output RDF triples or fragments

2. RML Vocabulary

The RML vocabulary namespace is http://semweb.mmlab.be/ns/rml#

The preferred prefix for the RML vocabulary is rml.

An RML mapping defines a mapping from any data in a structured source format to RDF. It consists of one or more triples maps.

The input to an RML mapping is called the input data source. The output of an RML mapping is called the output dataset.

The output dataset of an RML mapping is an RDF dataset that contains the generated RDF triples for each of the triples maps of the RML mapping. RML processors may provide additional triples or graphs.

As in R2RML, conforming RML processors may rename blank nodes when providing access to the output dataset.

The RML vocabulary consists of the RML-specific classes and also includes all the [R2RML] classes.

An RML mapping defines a mapping from any logical source to RDF. It is a structure that consists of one or more triples maps.

The input to an RML mapping is called the input source.

An RML processor has access to one of the following:

An RML processor may include an RML data validator or an RML default mapping generator, but these are not required.

An RML data validator is a system that takes as its input an RML mapping, a base IRI, and an input source, and checks for the presence of data errors. When checking the input source, a data validator must report any data errors that are raised in the process of generating the output dataset.

An RML default mapping generator may introspect the input source and generate an RML mapping, intended for further customization by a mapping author. Such a mapping is known as a default mapping.

A base IRI is used in resolving relative IRIs produced by the RML mapping. According to the [R2RML] spec document, the base IRI must be a valid [IRI]. It should not contain question mark (“?”) or hash (“#”) characters and should end in a slash (“/”) character.
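
As a minimal illustration of how the base IRI is used, assume a processor configured with the hypothetical base IRI http://example.com/base/ (a value chosen for this sketch, not prescribed by RML) and a subject map whose template yields a relative IRI:

```turtle
# Assumption: the base IRI is <http://example.com/base/> (hypothetical).
[] rr:subjectMap [
  rr:template "airport/{id}"    # produces a relative IRI, e.g. "airport/6523"
].

# For a record with id = 6523, the relative IRI would be resolved
# against the base IRI, giving:
#   <http://example.com/base/airport/6523>
```

Because this base IRI ends in a slash and contains no “?” or “#” characters, the resolved value is a well-formed IRI.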

3. RML Overview and Examples

An RML mapping refers to logical sources to retrieve data from the input source. A logical source extends R2RML's logical table. A logical source can be one of the following:

  1. a base source (any input source or base table),
  2. a view (in the case of databases).

3.1 Example: Mapping a CSV data source

The following RML mapping document produces the desired triples from the corresponding CSV data source.
RML mappings for CSV data sources follow exactly the same syntax as in R2RML to refer to the CSV's records.
Each CSV record, delimited by a line break (CRLF), is treated as the counterpart of a row in a database table.

id,stop,latitude,longitude
6523,25,50.901389,4.484444

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix transit: <http://vocab.org/transit/terms/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix wgs84_pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>.
@base <http://example.com/ns#>.

<#AirportMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Airport.csv" ;
    rml:referenceFormulation ql:CSV
  ];
  rr:subjectMap [
    rr:template "http://airport.example.com/{id}";
    rr:class transit:Stop
  ];

  rr:predicateObjectMap [
    rr:predicate transit:route;
    rr:objectMap [
      rml:reference "stop";
      rr:datatype xsd:int
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate wgs84_pos:lat;
    rr:objectMap [
      rml:reference "latitude"
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate wgs84_pos:long;
    rr:objectMap [
      rml:reference "longitude"
    ]
  ].

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix transit: <http://vocab.org/transit/terms/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix wgs84_pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>.

<http://airport.example.com/6523> rdf:type transit:Stop.
<http://airport.example.com/6523> transit:route "25"^^xsd:int.
<http://airport.example.com/6523> wgs84_pos:lat "50.901389".
<http://airport.example.com/6523> wgs84_pos:long "4.484444".

3.2 Example: Mapping an XML data source

The following RML mapping document produces the desired triples from the corresponding XML data source.
RML mappings for XML data sources follow the same syntax as in R2RML.
The references to the XML elements follow the syntax of the reference formulation specified at the logical source.
[XPath] is the default reference formulation used by RML for XML data sources.

<transport>
    <bus id="25">
        <route>
            <stop id="645">International Airport</stop>
            <stop id="651">Conference center</stop>
        </route>
    </bus>
</transport>

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ex: <http://example.com/ns#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix transit: <http://vocab.org/transit/terms/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@base <http://example.com/ns#>.

<#TransportMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Transport.xml" ;
    rml:iterator "/transport/bus";
    rml:referenceFormulation ql:XPath;
  ];

  rr:subjectMap [
    rr:template "http://trans.example.com/{@id}";
    rr:class transit:Stop
  ];

  rr:predicateObjectMap [
    rr:predicate transit:stop;
    rr:objectMap [
      rml:reference "route/stop/@id";
      rr:datatype xsd:int
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
      rml:reference "route/stop"
    ]
  ].

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix transit: <http://vocab.org/transit/terms/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

<http://trans.example.com/25> rdf:type transit:Stop.
<http://trans.example.com/25> transit:stop "645"^^xsd:int.
<http://trans.example.com/25> rdfs:label "International Airport".
<http://trans.example.com/25> transit:stop "651"^^xsd:int.
<http://trans.example.com/25> rdfs:label "Conference center".

3.3 Example: Mapping a JSON data source

The following RML mapping document produces the desired triples from the corresponding JSON data source.
RML mappings for JSON data sources follow the same syntax as in R2RML.
The references to the JSON objects follow the syntax of the reference formulation specified at the logical source.
JSONPath is the default reference formulation used by RML for references to JSON data sources.

{
  "venue":
  {
    "latitude": "51.0500000",
    "longitude": "3.7166700"
  },
  "location":
  {
    "continent": "EU",
    "country": "BE",
    "city": "Brussels"
  }
}

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix schema: <http://schema.org/>.
@prefix wgs84_pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>.
@prefix gn: <http://www.geonames.org/ontology#>.
@base <http://example.com/ns#>.

<#VenueMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Venue.json";
    rml:referenceFormulation ql:JSONPath;
    rml:iterator "$"
  ];

  rr:subjectMap [
    rr:template "http://loc.example.com/city/{location.city}";
    rr:class schema:City
  ];

  rr:predicateObjectMap [
    rr:predicate wgs84_pos:lat;
    rr:objectMap [
      rml:reference "venue.latitude"
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate wgs84_pos:long;
    rr:objectMap [
      rml:reference "venue.longitude"
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate gn:countryCode;
    rr:objectMap [
      rml:reference "location.country"
    ]
  ].

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix schema: <http://schema.org/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix wgs84_pos: <http://www.w3.org/2003/01/geo/wgs84_pos#>.
@prefix gn: <http://www.geonames.org/ontology#>.

<http://loc.example.com/city/Brussels> rdf:type schema:City.
<http://loc.example.com/city/Brussels> wgs84_pos:lat "51.0500000".
<http://loc.example.com/city/Brussels> wgs84_pos:long "3.7166700".
<http://loc.example.com/city/Brussels> gn:countryCode "BE".

4. Defining Logical Sources

A logical source is any source that is mapped to RDF triples. A logical source is a base source (rml:BaseSource).

4.1 Base Sources (rml:iterator, rml:logicalSource, rml:referenceFormulation, rml:source)

A base source (rml:baseSource) is a logical source (rml:logicalSource) pointing to a source that contains the data to be mapped.
At least the source (rml:source) of the data and its logical iterator (rml:iterator) should be defined.

A base source (rml:baseSource) is represented by a resource that has:

The source (rml:source) locates the input data source. It is a [URI] that identifies where the data source is located.
The logical iterator (rml:iterator) defines the iteration loop used to map the data of the input source.
The reference formulation (rml:referenceFormulation) defines the reference formulation used to refer to the elements of the data source. The reference formulation should always be specified using rml:referenceFormulation. In the case of relational databases, to remain backward compatible with [R2RML], rr:sqlVersion can be used instead of rml:referenceFormulation. Examples of reference formulations are SQL2008 for relational databases, as SQL2008 is the default for [R2RML] (rr:sqlVersion rr:SQL2008), [XPath] for XML (rml:referenceFormulation ql:XPath) and JSONPath for JSON data sources (rml:referenceFormulation ql:JSONPath).
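
The backward-compatible form mentioned above could look as follows. This is only a sketch: it reuses the table name from the database example in this section and simply swaps rml:referenceFormulation for rr:sqlVersion; whether a given RML processor accepts this form is an assumption to verify.

```turtle
# Logical source for a relational table, written in the R2RML-compatible style.
[] rml:logicalSource [
  rml:source "TRANSPORT.BUS";     # table name taken from the database example
  rr:sqlVersion rr:SQL2008        # used instead of rml:referenceFormulation
].
```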

The logical source (rml:logicalSource) definition requires:

The value of the source (rml:source) specifies the data source or the database to be mapped. Its value can be either a string (implicit reference to the data source) or a valid [URI] of an existing source.

A logical iterator (rml:iterator) is used to refer to any of the following:

A logical iterator must be a valid identifier, considering the reference formulation (rml:referenceFormulation) specified.

The row is considered the default iterator.

4.1.1 Examples

The following example shows a logical source specified for a CSV data source.

[] rml:logicalSource [
  rml:source "Airport.csv" ;
  rml:referenceFormulation ql:CSV
].

The following example shows a logical source specified for a database.

[] rml:logicalSource [
  rml:source "TRANSPORT.BUS" ;
  rml:referenceFormulation rr:SQL2008;
].

The following example shows a logical source specified for an XML source.

[] rml:logicalSource [
  rml:source "Transport.xml" ;
  rml:referenceFormulation ql:XPath;
  rml:iterator "/transport/bus";
].

The following example shows a logical source specified for a JSON source.

[] rml:logicalSource [
  rml:source "Venue.json";
  rml:referenceFormulation ql:JSONPath;
  rml:iterator "$.venue[*]"
].

5. Mapping Logical Sources to RDF with Triples Maps

A triples map specifies the rules for translating each row/record/element/object of a logical source to zero or more RDF triples.

A triples map in RML is defined as a triples map in R2RML.

A triples map is represented by a resource that references the following other resources:

The references of all term maps of a triples map (subject map, predicate maps, object maps, graph maps) must be references to rows/records/elements/objects that exist in the term map's logical source.

5.1 Examples

The following example shows a triples map including its logical source, its subject map, and a predicate-object map for an XML data source.

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ex: <http://example.com/ns#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@base <http://example.com/ns#>.

<#TransportMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Transport.xml" ;
    rml:iterator "/transport/bus/route/stop";
    rml:referenceFormulation ql:XPath;
  ];

  rr:subjectMap [
    rr:template
      "http://trans.example.com/stop/{@id}";
    rr:class ex:Stop
  ];

  rr:predicateObjectMap [
    rr:predicate rdfs:label;
    rr:objectMap [
      rml:reference "."
    ]
  ].

<http://trans.example.com/stop/645> rdf:type ex:Stop.
<http://trans.example.com/stop/645> rdfs:label "International Airport".

<http://trans.example.com/stop/651> rdf:type ex:Stop.
<http://trans.example.com/stop/651> rdfs:label "Conference center".

<http://trans.example.com/stop/873> rdf:type ex:Stop.
<http://trans.example.com/stop/873> rdfs:label "Central park".

5.2 Creating Resources with Subject Maps

As defined in [R2RML]: A subject map is a term map. It specifies a rule for generating the subjects of the RDF triples generated by a triples map.

5.3 Typing Resources (rr:class)

As defined in R2RML (text adjusted to refer to data in other structured formats):

A subject map may have one or more class IRIs. They are represented by the rr:class property. The values of the rr:class property must be IRIs. For each RDF term generated by the subject map, RDF triples with predicate rdf:type and the class [IRI] as object will be generated.

In the following example, the generated subject will be asserted as an instance of the ex:Stop class. Using the example, the following RDF triples will be generated:

[] rr:template "http://trans.example.com/stop/{@id}";
   rr:class ex:Stop.

<http://trans.example.com/stop/645> rdf:type ex:Stop.
<http://trans.example.com/stop/651> rdf:type ex:Stop.
<http://trans.example.com/stop/873> rdf:type ex:Stop.

Mappings where the class [IRI] is not constant, but needs to be computed based on the contents of the logical source, can be achieved by defining a predicate-object map with predicate rdf:type and a non-constant object map.

In the following example, the generated subject will be asserted based on the contents of the logical source to be mapped.

[] rr:predicateObjectMap [
  rr:predicate rdf:type;
  rr:objectMap [
    rr:template "http://trans.example.com/{@type}"
    ]
  ].

5.4 Creating Properties and Values with Predicate-Object Maps

A predicate-object map is a function that creates one or more predicate-object pairs for each row/record/element/object of a logical source. It is used in conjunction with a subject map to generate RDF triples in a triples map.

A predicate-object map is represented by a resource that references the following other resources:

Both predicate maps and object maps are term maps.

6. Creating RDF Terms with Term Maps

As defined in R2RML (text adjusted to refer to data in other structured formats):

An RDF term is either an [IRI], or a blank node, or a literal.

A term map is a function that generates an RDF term from a logical reference. The result of that function is known as the term map's generated RDF term.

Term maps are used to generate the subjects, predicates and objects of the RDF triples that are generated by a triples map. Consequently, there are several kinds of term maps, depending on where in the mapping they occur: subject maps, predicate maps, object maps and graph maps.

A term map must be exactly one of the following: a constant-valued term map, a reference-valued term map, or a template-valued term map.

The references of a term map are the set of logical references referenced in the term map and depend on the type of term map.

6.1 Constant RDF Terms (rr:constant)

As defined in R2RML (text adjusted to refer to data in other structured formats):

A constant-valued term map is a term map that ignores the logical iterator specified by the query and always generates the same RDF term. A constant-valued term map is represented by a resource that has exactly one rr:constant property.

The constant value of a constant-valued term map is the RDF term that is the value of its rr:constant property.

If the constant-valued term map is a subject map, predicate map or graph map, then its constant value must be an [IRI].

If the constant-valued term map is an object map, then its constant value must be an [IRI] or literal.

The references of a constant-valued term map is the empty set.

Constant-valued term maps can be expressed more concisely using the constant shortcut properties rr:subject, rr:predicate, rr:object and rr:graph. Occurrences of these properties must be treated exactly as if the following triples were present in the mapping graph instead:

Triple involving constant shortcut property    Replacement triples
?x rr:subject ?y.                              ?x rr:subjectMap [ rr:constant ?y ].
?x rr:predicate ?y.                            ?x rr:predicateMap [ rr:constant ?y ].
?x rr:object ?y.                               ?x rr:objectMap [ rr:constant ?y ].
?x rr:graph ?y.                                ?x rr:graphMap [ rr:constant ?y ].

The following example shows a predicate-object map that uses a constant-valued term map both for its predicate and for its object.

[] rr:predicateMap [ rr:constant rdf:type ];
   rr:objectMap [ rr:constant ex:Stop ].

If added to a triples map, this predicate-object map would add the following triple to all resources ?x generated by the triples map:

?x rdf:type ex:Stop.

The following example uses constant shortcut properties and is equivalent to the example above:

[] rr:predicate rdf:type;
   rr:object ex:Stop.

6.2 From a Reference (rml:reference)

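A reference (rml:reference) is used to refer to the data of the logical source, for example a column of a database table or of a CSV or TSV data source, an element or attribute of an XML data source, or an object of a JSON data source.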

A reference must be a valid expression, considering the reference formulation (rml:referenceFormulation) specified. The reference can be an absolute path, or a path relative to the iterator specified at the logical source.

A reference-valued term map is a term map that is represented by a resource that has exactly one rml:reference property.

The object of the rml:reference property must be an RDF literal encoding a valid reference formulation, e.g. a column identifier according to the SQL2008 specification (in the case of databases), a valid name of a column (in the case of CSV and TSV data sources), a valid XPath expression (in the case of XML data sources), or a valid JSONPath expression (in the case of JSON data sources).

A reference extracts a reference value from a given logical iteration and the reference value is used to create an RDF term. The reference value of a reference is the data value of the column/element/object indicated by that reference for a given logical iteration.

The references of a reference-valued term map is the singleton set containing the value of the term map's rml:reference property.

The following examples define an object map that generates literals for each different case of data formats. These examples are based on the sample data specified in the beginning.

6.2.1 Database (SQL) reference

[] rr:objectMap [ rr:column "DNAME" ].

6.2.2 CSV reference

[] rr:objectMap [ rml:reference "city" ].

6.2.3 XPath reference

[] rr:objectMap [ rml:reference "@id" ].

6.2.4 JSONPath reference

[] rr:objectMap [ rml:reference "location.city" ].

RML supports writing relative JSONPath expressions. These do not exist in the proposed JSONPath framework.

To refer to the current reference value, you can use the @ JSONPath expression.
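
The following sketch shows where the @ expression might be useful: when the iterator already selects scalar values, there is no member name left to reference. The file Cities.json, its content and the resource ex:CityList are hypothetical and not part of this document's sample data; support for @ may differ between RML processors.

```turtle
# Hypothetical input file Cities.json:
#   { "cities": [ "Brussels", "London" ] }

[] rml:logicalSource [
     rml:source "Cities.json";
     rml:referenceFormulation ql:JSONPath;
     rml:iterator "$.cities[*]"          # each iteration yields one string
   ];
   rr:subject ex:CityList;               # constant subject (hypothetical IRI)
   rr:predicateObjectMap [
     rr:predicate rdfs:label;
     rr:objectMap [ rml:reference "@" ]  # "@" refers to the current value
   ].

# Triples one would expect from this mapping:
#   ex:CityList rdfs:label "Brussels".
#   ex:CityList rdfs:label "London".
```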

6.3 From a Template (rr:template)

As defined in R2RML (text adjusted to refer to data in other structured formats):

A template-valued term map is a term map that is represented by a resource that has exactly one rr:template property. The value of the rr:template property must be a valid string template.

A string template is a format string that can be used to build strings from multiple components. It can reference logical references by enclosing them in curly braces (“{” and “}”). The following syntax rules apply to valid string templates:

The template value of the term map for a given logical iteration is determined as follows:

  1. Let result be the template string
  2. For each pair of unescaped curly braces in result:
    1. Let value be the data value of the reference whose name is enclosed in the curly braces
    2. If value is NULL, then return NULL
    3. Let value be the natural RDF lexical form corresponding to value
    4. If the term type is rr:IRI, then replace the pair of curly braces with an IRI-safe version of value; otherwise, replace the pair of curly braces with value
  3. Return result

[] rr:subjectMap [ rr:template "http://trans.example.com/stop/{@id}" ].

http://trans.example.com/stop/645
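
To make step 2 of the algorithm more concrete, the sketch below contrasts the two term types. The reference "city" and its values ("New York", and NULL for a missing value) are assumptions made for illustration only.

```turtle
# Term type rr:IRI: the value is replaced by its IRI-safe version.
[] rr:subjectMap [
  rr:template "http://loc.example.com/city/{city}";
  rr:termType rr:IRI
].
# city = "New York" -> <http://loc.example.com/city/New%20York>
# city is NULL      -> the template value is NULL, so no term is generated

# Term type rr:Literal: the value is inserted unchanged.
[] rr:objectMap [
  rr:template "City of {city}";
  rr:termType rr:Literal
].
# city = "New York" -> "City of New York"
```

Only the referenced value is made IRI-safe; the fixed parts of the template string are copied as they are.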

6.4 IRIs, Literal, Blank Nodes (rr:termType)

As defined in R2RML (text adjusted to refer to data in other structured formats):

The term type of a reference-valued term map or template-valued term map determines the kind of generated RDF term (IRIs, blank nodes or literals).

If the term map has an optional rr:termType property, then its term type is the value of that property. The value must be an [IRI] and must be one of the following options:

If the term map does not have a rr:termType property, then its term type is:

6.5 Language Tags (rml:languageMap and rr:language)

As defined in R2RML (text adjusted to refer to data in other structured formats and inclusion of rml:languageMap):

A term map with a term type of rr:Literal may have a specified language tag. This language tag can be specified using a language map, a language property, or both. A specified language tag causes generated literals to be language-tagged plain literals.

A language map is represented by the rml:languageMap property on a term map. If present, its value must be a term map. The value generated by the language map is used as a language tag on the created term.

[] rr:objectMap [
  rml:reference "Localization";
  rml:languageMap [
    rml:reference "Localization/@Culture"
  ]
].

A language property is represented by the rr:language property on a term map. If present, its value must be a valid language tag.

[] rr:objectMap [ rml:reference "location.city"; rr:language "en-us" ].

The language of a generated term is thus determined by the following precedence rules: if there is a language map and its generated value is a valid language tag, that value is used; otherwise, if there is a language property, its value is used; otherwise, no language is specified.
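
The precedence rules can be illustrated with an object map that carries both a language map and a language property. The element text "Bushalte" and the Culture values are hypothetical; the sketch assumes an XML source with XPath references.

```turtle
[] rr:objectMap [
  rml:reference "Localization";
  rml:languageMap [ rml:reference "Localization/@Culture" ];  # considered first
  rr:language "en"                                            # fallback
].

# Hypothetical outcomes for a Localization element with text "Bushalte":
#   Culture="nl"            -> "Bushalte"@nl  (language map yields a valid tag)
#   no Culture attribute    -> "Bushalte"@en  (falls back to rr:language)
```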

6.6 Typed Literals (rr:datatype)

As defined in R2RML (text adjusted to refer to data in other structured formats):

A datatypeable term map is a term map with a term type of rr:Literal that does not have a specified language tag.

Datatypeable term maps may generate typed literals. The datatype of these literals can be explicitly defined using rr:datatype (producing a datatype-override RDF literal).

A datatypeable term map may have a rr:datatype property. Its value must be an [IRI]. This [IRI] is the specified datatype of the term map.

A term map must not have more than one rr:datatype value.

A term map that is not a datatypeable term map must not have an rr:datatype property.

Note

One cannot explicitly state that a plain literal without language tag should be generated. They are the default for string columns. To generate one from a non-string logical reference, a template-valued term map with a template such as "{reference}" and a term type of rr:Literal can be used.

The following example shows an object map that overrides the default datatype of the logical source with an explicitly specified xsd:positiveInteger type. A datatype-override RDF literal of that datatype will be generated from whatever is in the id attribute.

[] rr:objectMap [ rml:reference "@id"; rr:datatype xsd:positiveInteger ].

7. Relationships among Logical Sources (rr:parentTriplesMap, rr:joinCondition, rr:child and rr:parent)

A referencing object map allows using the subjects of another triples map as the objects generated by a predicate-object map.
Since the triples maps may be based on different logical sources, this may require joins between several logical sources.

A referencing object map is represented by a resource that:

A join condition is represented by a resource that has exactly one value for each of the following two properties:

The child query of a referencing object map is the reference of the logical source of the term map containing the referencing object map. In this case, the triples map of this referencing object map is the child triples map.

The parent query of a referencing object map is the reference or the query of the logical source of its parent triples map.

The joint query is used when generating RDF triples from referencing object maps. If the logical source of the child triples map and the logical source of the parent triples map of a referencing object map are not identical, then the referencing object map must have at least one join condition.
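
A referencing object map may carry more than one join condition. The fragment below is a sketch under two assumptions: the child source has columns named latitude and longitude, and the parent triples map <#TriplesMap2> iterates over objects with members of the same names. Following [R2RML], a parent is expected to match only when all join conditions hold.

```turtle
[] rr:predicateObjectMap [
  rr:predicate ex:located;
  rr:objectMap [
    rr:parentTriplesMap <#TriplesMap2>;
    # Both conditions must be satisfied for a pair to be joined.
    rr:joinCondition [ rr:child "latitude";  rr:parent "latitude" ];
    rr:joinCondition [ rr:child "longitude"; rr:parent "longitude" ]
  ]
].
```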

The following example shows a referencing object map as part of a predicate-object map:

<#TriplesMap1> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Airport.csv" ;
    rml:referenceFormulation ql:CSV;
  ];
  rr:subjectMap [
    rr:template "http://trans.example.com/airport/{id}";
  ];
  rr:predicateObjectMap [
    rr:predicate ex:located;
    rr:objectMap [
      rr:parentTriplesMap <#TriplesMap2>;
      rr:joinCondition [
        rr:child "city";
        rr:parent "location.city";
      ];
    ];
  ].

<#TriplesMap2> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Venue.json" ;
    rml:referenceFormulation ql:JSONPath;
    rml:iterator "$.venue[*]";
  ];
  rr:subjectMap [
    rr:template "http://venue.example.com/{location.city}"
  ].

Given the example input, and subject maps as defined in the example mapping, this would result in a triple:

<http://trans.example.com/airport/6523> ex:located <http://venue.example.com/Brussels>.

The following example shows a referencing object map that does not have a join condition.

<#TriplesMap1> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Airport.csv" ;
    rml:referenceFormulation ql:CSV;
  ];
  rr:subjectMap [
    rr:template "http://trans.example.com/airport/{id}";
  ];
  rr:predicateObjectMap [
    rr:predicate ex:located;
    rr:objectMap [
      rr:parentTriplesMap <#TriplesMap2>;
    ];
  ].

<#TriplesMap2> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Airport.csv" ;
    rml:referenceFormulation ql:CSV;
  ];
  rr:subjectMap [
    rr:template "http://venue.example.com/{latitude},{longitude}"
  ].

No join condition is needed as both triples maps use the same logical source. Given the example input, this mapping would result in the following triple:

<http://trans.example.com/airport/6523> ex:located <http://venue.example.com/50.901389,4.484444>.

8. Named Graphs

As defined in R2RML (text adjusted to refer to data in other structured formats):

Each triple generated from an RML mapping is placed into one or more graphs of the output dataset. Possible target graphs are the unnamed default graph, and the IRI-named named graphs.

Any subject map or predicate-object map may have one or more associated graph maps.

They are specified in one of two ways:

  1. using the rr:graphMap property, whose value must be a graph map,
  2. using the constant shortcut property rr:graph.

Graph maps are themselves term maps. When RDF triples are generated, the set of target graphs is determined by taking into account any graph maps associated with the subject map or predicate-object map.

If a graph map generates the special [IRI] rr:defaultGraph, then the target graph is the default graph of the output dataset.
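
As a small sketch of this rule, the subject map below names two target graphs through the constant shortcut property, one of them being rr:defaultGraph. It reuses ex:StopsGraph from the example in this section.

```turtle
[] rr:subjectMap [
  rr:template "http://trans.example.com/stop/{@id}";
  rr:graph rr:defaultGraph, ex:StopsGraph
].

# Triples generated with this subject map would be expected both in the
# default graph and in the named graph ex:StopsGraph.
```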

8.1 Example

[] rr:subjectMap [
  rr:template "http://trans.example.com/stop/{@id}";
  rr:graphMap [ rr:constant ex:StopsGraph ];
].

This is equivalent to the following example, which uses a constant shortcut property:

[] rr:subjectMap [
  rr:template "http://trans.example.com/stop/{@id}";
  rr:graph ex:StopsGraph;
].

[] rr:subjectMap [
  rr:template "http://trans.example.com/stop/{@id}";
  rr:graphMap [ rr:template "http://www.example.com/{@type}" ];
].

In this example, the generated triples are placed into named graphs according to their type.

<http://www.example.com/SingleDecker>
<http://www.example.com/DoubleDecker>

These are the two graphs to be generated.
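
A graph map can equally be attached to a predicate-object map. The following is a minimal sketch rather than part of the original examples: the triples map name <#BusGraphSketch> and the graph name ex:VehicleTypesGraph are hypothetical, while the source and iterator are borrowed from the Transport.xml example in Section 9. Under the R2RML rules that this section adopts, the rdf:type triples produced by rr:class would be placed in the default graph, because the subject map has no graph map, and the ex:type triples would be placed in both graphs listed on the predicate-object map.

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

# Hypothetical triples map, used only to illustrate graph maps.
<#BusGraphSketch> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Transport.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/transport/bus"
  ];
  # No graph map here: the rdf:type triples go to the default graph.
  rr:subjectMap [
    rr:template "http://trans.example.com/bus/{@id}";
    rr:class ex:Transport
  ];
  rr:predicateObjectMap [
    rr:predicate ex:type;
    rr:objectMap [ rr:template "http://trans.example.com/vehicle/{@type}" ];
    # Two target graphs: a named graph (hypothetical name) and the default graph.
    rr:graph ex:VehicleTypesGraph, rr:defaultGraph
  ].
```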

8.2 The scope of Blank Nodes

Blank nodes in the output dataset are scoped to a single RDF graph. If the same blank node identifier occurs in multiple RDF triples that are in the same graph, then the triples will share the same blank node. If, however, the same blank node identifier occurs in multiple graphs, then a distinct blank node will be created for each graph. In other words, blank nodes can never be shared by two triples in two different graphs.

This implies that triples generated from a single logical source will have different subjects if the subjects are blank nodes and the triples are placed into different graphs.
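
The consequence described above can be sketched with a subject map that generates blank nodes and names two target graphs. This is an illustrative assumption, not an example from the specification: the triples map name <#BlankStopSketch>, the template "stop-{@id}" and the graph names ex:TimetableGraph and ex:AccessibilityGraph are all hypothetical.

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

# Hypothetical triples map, used only to illustrate blank node scoping.
<#BlankStopSketch> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Transport.xml";
    rml:referenceFormulation ql:XPath;
    rml:iterator "/transport/bus/route/stop"
  ];
  rr:subjectMap [
    # The expanded template serves as the blank node identifier.
    rr:template "stop-{@id}";
    rr:termType rr:BlankNode;
    # Every generated triple is placed into both graphs.
    rr:graph ex:TimetableGraph, ex:AccessibilityGraph
  ];
  rr:predicateObjectMap [
    rr:predicate ex:stopLabel;
    rr:objectMap [ rml:reference "." ]
  ].
```

For the stop with id 645, the identifier "stop-645" would appear in both graphs, yet it would denote two distinct blank nodes, one per graph.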

9. Integrated mapping

To obtain more complete mappings, triples can be generated by combining triples maps defined over different sources. In these cases, the subjects of the triples come from one triples map, while the objects come from another triples map. This is achieved by adding another predicate-object map (rr:predicateObjectMap) to the first triples map, whose object map refers to the other triples map as its parent triples map. The following subsections illustrate this with an example.
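
Reduced to its skeleton, the pattern might look like the sketch below. All names in it are hypothetical and chosen only for illustration: the sources "child.csv" and "parent.csv", the references "parent_id" and "id", the predicate ex:relatedTo and both triples map names. A triple would be generated for every pair of child and parent records whose values for the rr:child and rr:parent references are equal.

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

# Supplies the subjects of the joined triples.
<#ChildSketch> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "child.csv";
    rml:referenceFormulation ql:CSV
  ];
  rr:subjectMap [ rr:template "http://example.com/child/{id}" ];
  rr:predicateObjectMap [
    rr:predicate ex:relatedTo;
    rr:objectMap [
      # Objects are the subjects generated by the parent triples map.
      rr:parentTriplesMap <#ParentSketch>;
      rr:joinCondition [
        rr:child "parent_id";   # reference evaluated on the child source
        rr:parent "id"          # reference evaluated on the parent source
      ]
    ]
  ].

# Supplies the objects of the joined triples.
<#ParentSketch> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "parent.csv";
    rml:referenceFormulation ql:CSV
  ];
  rr:subjectMap [ rr:template "http://example.com/parent/{id}" ].
```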

9.1 Example Input

Airport.csv - CSV source

id,city,bus,latitude,longitude
6523,Brussels,25,50.901389,4.484444

Venue.json - JSON source

{
  "venue":
  [
    {
      "latitude": "50.901389",
      "longitude": "4.484444",
      "location":
      {
          "continent": "EU",
          "country": "BE",
          "city": "Brussels"
      }
    },
    {
      "latitude": "51.51334",
      "longitude": "-0.08901",
      "location":
      {
          "continent": "EU",
          "country": "GB",
          "city": "London"
      }
    }
  ]
}

Transport.xml - XML source

<transport>
  <bus id="25" type="SingleDecker">
    <route>
      <stop id="645">Airport</stop>
      <stop id="651">Conference center</stop>
    </route>
    <aircondition>
      false
    </aircondition>
  </bus>
  <bus id="47" type="DoubleDecker">
    <route>
      <stop id="873">Central Park</stop>
      <stop id="651">Conference center</stop>
    </route>
    <aircondition>
      true
    </aircondition>
  </bus>
</transport>

9.2 Example custom mappings

@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

<#AirportMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Airport.csv" ;
    rml:referenceFormulation ql:CSV
  ];
  rr:subjectMap [
    rr:template "http://airport.example.com/{id}";
    rr:class ex:Stop
  ];
  rr:predicateObjectMap [
    rr:predicate ex:latlong;
    rr:objectMap [
      rr:parentTriplesMap <#LocationMapping_CSV>
    ]
  ];
  rr:predicateObjectMap [
    rr:predicate ex:route;
    rr:objectMap [
      rr:parentTriplesMap <#TransportMapping> ;
      rr:joinCondition [
        rr:child "bus";
        rr:parent "transport/bus/@id"
      ]
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate ex:location;
    rr:objectMap [
      rr:parentTriplesMap <#VenueMapping> ;
      rr:joinCondition [
        rr:child "city";
        rr:parent "location.city"
      ]
    ]
  ].

@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

<#LocationMapping_CSV> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Airport.csv" ;
    rml:referenceFormulation ql:CSV
  ];
  rr:subjectMap [
    rr:template "http://loc.example.com/latlong/{latitude},{longitude}"
  ];

  rr:predicateObjectMap [
    rr:predicate ex:lat;
    rr:objectMap [
      rml:reference "latitude"
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate ex:long;
    rr:objectMap [
      rml:reference "longitude"
    ]
  ].

@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

<#VenueMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Venue.json";
    rml:referenceFormulation ql:JSONPath ;
    rml:iterator "$.venue[*]"
  ];
  rr:subjectMap
  [
    rr:template "http://loc.example.com/city/{location.city}";
    rr:class ex:City;
  ];
  rr:predicateObjectMap [
    rr:predicate ex:latlong;
    rr:objectMap [
      rr:parentTriplesMap <#LocationMapping_JSON>
    ]
  ];
  rr:predicateObjectMap [
    rr:predicate ex:countryCode;
    rr:objectMap [
      rml:reference "location.country"
    ]
  ];
  rr:predicateObjectMap [
    rr:predicate ex:onContinent;
    rr:objectMap [
      rml:reference "location.continent"
    ]
  ].

@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

<#LocationMapping_JSON> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Venue.json" ;
    rml:referenceFormulation ql:JSONPath ;
    rml:iterator "$.venue[*]"
  ];
  rr:subjectMap [
    rr:template "http://loc.example.com/latlong/{latitude},{longitude}"
  ];
  rr:predicateObjectMap [
    rr:predicate ex:lat;
    rr:objectMap [
      rml:reference "latitude"
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate ex:long;
    rr:objectMap [
      rml:reference "longitude"
    ]
  ].

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@base <http://example.com/ns#>.

<#TransportMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Transport.xml" ;
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/transport/bus"
  ];
  rr:subjectMap [
    rr:template "http://trans.example.com/bus/{@id}";
    rr:class ex:Transport ;
  ];

  rr:predicateObjectMap [
    rr:predicate ex:type ;
    rr:objectMap [
      rr:template "http://trans.example.com/vehicle/{@type}";
    ]
  ];

  rr:predicateObjectMap [
    rr:predicate ex:stop;
    rr:objectMap [
      rr:parentTriplesMap <#StopMapping> ;
      rr:joinCondition [
        rr:child "@id";
        rr:parent "../../@id";
      ]
    ]
  ].

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix ex: <http://example.com/ns#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@base <http://example.com/ns#>.

<#StopMapping> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "Transport.xml" ;
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/transport/bus/route/stop"
  ];
  rr:subjectMap [
    rr:template "http://trans.example.com/stop/{@id}";
    rr:class ex:Stop
  ];
  rr:predicateObjectMap [
    rr:predicate ex:stop;
    rr:objectMap [
      rml:reference "@id";
      rr:datatype xsd:int
    ]
  ];
  rr:predicateObjectMap [
    rr:predicate ex:stopLabel;
    rr:objectMap  [
      rml:reference ".";
    ]
  ].

9.3 Example output

The desired RDF triples to be produced from these logical sources are as follows:

@prefix ex: <http://example.com/ns#> .

<http://airport.example.com/6523> a ex:Stop;
  ex:latlong <http://loc.example.com/latlong/50.901389,4.484444>;
  ex:location <http://loc.example.com/city/Brussels> .

<http://loc.example.com/city/Brussels> a ex:City;
  ex:countryCode "BE";
  ex:latlong <http://loc.example.com/latlong/50.901389,4.484444>;
  ex:onContinent "EU" .

<http://loc.example.com/city/London> a ex:City;
  ex:countryCode "GB";
  ex:latlong <http://loc.example.com/latlong/51.51334,-0.08901>;
  ex:onContinent "EU" .

<http://loc.example.com/latlong/50.901389,4.484444> ex:lat "50.901389";
  ex:long "4.484444" .

<http://loc.example.com/latlong/51.51334,-0.08901> ex:lat "51.51334";
  ex:long "-0.08901" .

<http://trans.example.com/bus/25> a ex:Transport;
  ex:stop <http://trans.example.com/stop/645>, <http://trans.example.com/stop/651>;
  ex:type <http://trans.example.com/vehicle/SingleDecker> .

<http://trans.example.com/bus/47> a ex:Transport;
  ex:stop <http://trans.example.com/stop/651>, <http://trans.example.com/stop/873>;
  ex:type <http://trans.example.com/vehicle/DoubleDecker> .

<http://trans.example.com/stop/645> a ex:Stop;
  ex:stop "645"^^<http://www.w3.org/2001/XMLSchema#int>;
  ex:stopLabel "Airport" .

<http://trans.example.com/stop/651> a ex:Stop;
  ex:stop "651"^^<http://www.w3.org/2001/XMLSchema#int>;
  ex:stopLabel "Conference center" .

<http://trans.example.com/stop/873> a ex:Stop;
  ex:stop "873"^^<http://www.w3.org/2001/XMLSchema#int>;
  ex:stopLabel "Central Park" .

This completes the RML mapping document. An RML processor will generate the triples listed above from the aforementioned triples maps.

10. Index of RML Vocabulary Terms

This section lists all the classes, properties and other terms defined by this specification within the RML vocabulary, together with the terms it imports from the R2RML vocabulary.

An RDFS representation of the vocabulary is available from the namespace IRI http://semweb.mmlab.be/ns/rml#.

10.1 Classes

The following table lists all RML classes.

Classes imported from the R2RML vocabulary are marked as such.

Class                     Represents
rml:BaseSource            Base Source
rr:GraphMap               graph map imported from R2RML vocabulary
rr:Join                   join condition imported from R2RML vocabulary
rml:LanguageMap           language map
rml:LogicalSource         logical source
rr:ObjectMap              object map imported from R2RML vocabulary
rr:PredicateMap           predicate map imported from R2RML vocabulary
rr:PredicateObjectMap     predicate-object map imported from R2RML vocabulary
rml:ReferenceFormulation  query language
rr:RefObjectMap           referencing object map imported from R2RML vocabulary
rr:SubjectMap             subject map imported from R2RML vocabulary
rr:TermMap                term map imported from R2RML vocabulary
rr:TriplesMap             triples map imported from R2RML vocabulary

10.2 Properties

The following table lists all properties in the RML vocabulary, including those imported from the R2RML vocabulary.

Property                  Represents                                                         Cardinality
rr:child                  imported from R2RML vocabulary                                     1
rr:class                  class IRI imported from R2RML vocabulary                           0…∞
rml:reference             reference name                                                     1
rr:datatype               specified datatype imported from R2RML vocabulary                  0…1 !
rr:constant               constant value imported from R2RML vocabulary                      1
rr:graph                  constant shortcut property imported from R2RML vocabulary          0…∞
rr:graphMap               graph map imported from R2RML vocabulary
rr:inverseExpression      inverse expression imported from R2RML vocabulary                  0…1 !
rml:iterator              iterator                                                           1
rr:joinCondition          join condition imported from R2RML vocabulary                      0…∞
rr:language               specified language tag imported from R2RML vocabulary              0…1 !
rml:languageMap           language map                                                       0…1 !
rml:logicalSource         logical source                                                     1
rr:object                 constant shortcut property imported from R2RML vocabulary          1…∞
rr:objectMap              object map, referencing object map imported from R2RML vocabulary
rr:parent                 parent reference imported from R2RML vocabulary                    1
rr:parentTriplesMap       parent triples map imported from R2RML vocabulary                  1
rr:predicate              constant shortcut property imported from R2RML vocabulary          1…∞
rr:predicateMap           predicate map
rr:predicateObjectMap     predicate-object map imported from R2RML vocabulary                0…∞
rml:referenceFormulation  reference formulation                                              0…1
rml:version               version identifier imported from R2RML vocabulary                  0…∞
rr:subject                constant shortcut property imported from R2RML vocabulary          0…1
rr:subjectMap             subject map
rml:source                source name                                                        1
rr:template               string template imported from R2RML vocabulary                     1
rr:termType               term type imported from R2RML vocabulary                           0…1 !

10.3 Other Terms

Term              Denotes
rr:defaultGraph   default graph imported from R2RML vocabulary
rml:JSONPath      JSONPath
rr:SQL2008        Core SQL 2008 imported from R2RML vocabulary
rr:IRI            IRI imported from R2RML vocabulary
rr:BlankNode      blank node imported from R2RML vocabulary
rr:Literal        literal imported from R2RML vocabulary
rml:XPath         [XPath]

11. Changelog

11.1 v1.1.2

  • Added
    • Link to the new RML specification by the Knowledge Graph Construction W3C Community Group.

11.2 v1.1.1

  • Added
    • Clarification that SQL column labels are supported as references
    • Clarification about JSONPath references
  • Fixed
    • Make mapping and output consistent for buses/stops in integrated example
    • Add missing rr:TriplesMap in examples
    • Some small example validations

11.3 v1.1.0

  • Added: Language Map specification.
  • Fixed: General fixes of the code samples and typos.

11.4 v1.0.0

  • This is the RML specification from 2015.

A. References

A.1 Normative references

[IRI]
Internationalized Resource Identifiers (IRIs). M. Duerst; M. Suignard. IETF. January 2005. Proposed Standard. URL: https://www.rfc-editor.org/rfc/rfc3987
[R2RML]
R2RML: RDB to RDF Mapping Language. Souripriya Das; Seema Sundara; Richard Cyganiak. W3C. 27 September 2012. W3C Recommendation. URL: https://www.w3.org/TR/r2rml/
[URI]
Uniform Resource Identifier (URI): Generic Syntax. T. Berners-Lee; R. Fielding; L. Masinter. IETF. January 2005. Internet Standard. URL: https://www.rfc-editor.org/rfc/rfc3986
[XPath]
XML Path Language (XPath) Version 1.0. James Clark; Steven DeRose. W3C. 16 November 1999. W3C Recommendation. URL: https://www.w3.org/TR/xpath-10/

A.2 Informative references

[RDF-CONCEPTS]
Resource Description Framework (RDF): Concepts and Abstract Syntax. Graham Klyne; Jeremy Carroll. W3C. 10 February 2004. W3C Recommendation. URL: https://www.w3.org/TR/rdf-concepts/