Response to "Mapping Diverse Data to RDF in Practice", presented at ISWC2018
October 12th, 2018


In this blog post, we respond to the paper "Mapping Diverse Data to RDF in Practice" by Alexandros Chortaras and Giorgos Stamou, presented at ISWC2018, because the paper contains several inaccuracies.

The paper's related work omits three of our main-track papers: from SEMANTICS2015 ([RML-SEMANTICS2015]), ESWC2017 ([RML+FnO-ESWC2017]), and ISWC2017 ([FnO+DBpedia-ISWC2017]). As a result, it misinforms its readers about our work on RML. The paper claims that existing work does not support certain aspects, which it then presents as its novel contributions. However, the three aforementioned papers show that we already support everything the paper claims we do not. This blog post analyzes in detail how each of the paper's motivating examples, which also serve as its evaluation, is addressed by our work.

At a high level:

  • The paper claims that RML does not support transformations/conditions, but it ignores [RML+FnO-ESWC2017], whose approach was also applied in [FnO+DBpedia-ISWC2017].
  • The paper ignores [RML-SEMANTICS2015] on data source descriptions and their alignment with RML, which explains the well-informed choices behind those descriptions. Instead, it cites [RML-LDOW2016] on automated metadata generation, which builds on top of [RML-SEMANTICS2015].

In detail:

  • Transformation & Defined Column (Section 6.1): What the paper introduces as transformation and defined column is what RML does with transformations. To see this visually, have a look at the RMLEditor integration with the FnO extension of RML (http://rml.io/editor/functions/).
  • dr:CurrentModel & dr:SetModel (Section 6.2): The RMLMapper does allow specifying the order in which the Triples Maps should be executed. One needs to provide the Triples Maps in the desired order using the RMLMapper's CLI.
  • Section 6.3 is fully covered by [RML-SEMANTICS2015].
  • dr:Transformation & dr:DefinedColumn (Section 6.4): dr:Transformation is covered by the RML bind condition [RML-SEMANTICS2015]. dr:DefinedColumn is covered by the RML and FnO alignment [RML+FnO-ESWC2017].
  • Conditions (Section 6.5): Conditions may be defined as functions; functions are not meant only for transformations.
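To illustrate the last point, a condition can be treated as just another function whose boolean result decides whether a term is generated. The following is a minimal Python sketch under that assumption. It is not RMLMapper code, and the names `to_upper`, `is_not_empty`, and `generate_objects`, as well as the sample records, are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def to_upper(value: str) -> str:
    """A transformation: maps a value to a new value."""
    return value.upper()


def is_not_empty(value: str) -> bool:
    """A condition: also a function, but one that returns a boolean."""
    return value.strip() != ""


def generate_objects(
    records: List[Record],
    reference: str,
    transform: Callable[[str], str],
    condition: Callable[[str], bool],
) -> List[str]:
    """Apply `transform` to the referenced value of each record that passes `condition`."""
    objects = []
    for record in records:
        value = record.get(reference, "")
        if condition(value):  # the condition is evaluated like any other function
            objects.append(transform(value))
    return objects


# Illustrative records, not taken from the paper.
records = [{"label": "athens"}, {"label": "  "}, {"label": "ghent"}]
result = generate_objects(records, "label", to_upper, is_not_empty)
print(result)
```

Both the transformation and the condition are passed in the same way; only the return type differs.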

Motivating Examples

The motivating examples are
  1. randomly chosen: why these examples and not others?
  2. covered by RML, contrary to what the paper claims.

Below, we describe how each of the motivating examples is addressed by RML.

Example 1

Data transformations are covered as per [RML+FnO-ESWC2017]. An example mapping snippet is shown below; a full mapping file can be found here.

<#events>
  rr:predicateObjectMap [
    rr:predicate kvoc-t:text;
    rr:objectMap [
      fnml:functionValue [
        rr:predicateObjectMap [
          rr:predicate fno:executes ;
          rr:objectMap [ rr:constant ex:extractKeywordFunction ] ] ;
        rr:predicateObjectMap [
          rr:predicate ex:input ;
          rr:objectMap [ rml:reference "keywords" ] ]
      ]
    ]
  ]
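Conceptually, a processor resolves the function IRI given via fno:executes to an implementation and calls it with the values bound to its parameters. The following is a rough Python sketch of that idea, assuming a simple in-memory registry. The registry, `evaluate_function_value`, the sample record, and the comma-splitting behaviour are assumptions for illustration only, since this post does not define what ex:extractKeywordFunction does.

```python
from typing import Callable, Dict, List


# Hypothetical implementation: the actual behaviour of ex:extractKeywordFunction
# is not specified here; this stand-in simply splits a comma-separated string.
def extract_keywords(text: str) -> List[str]:
    return [part.strip() for part in text.split(",") if part.strip()]


# Hypothetical registry from function IRIs to implementations.
FUNCTIONS: Dict[str, Callable[..., List[str]]] = {
    "ex:extractKeywordFunction": extract_keywords,
}


def evaluate_function_value(
    function_iri: str, parameters: Dict[str, str], record: Dict[str, str]
) -> List[str]:
    """Resolve the function, bind each parameter to a record value, and call it."""
    function = FUNCTIONS[function_iri]
    arguments = [record[reference] for reference in parameters.values()]
    return function(*arguments)


# Illustrative record; "keywords" is the reference used in the mapping above.
record = {"keywords": "bronze, vase , athens"}
objects = evaluate_function_value(
    "ex:extractKeywordFunction", {"ex:input": "keywords"}, record
)
# One triple per returned value; the subject is a placeholder.
triples = [("<#event>", "kvoc-t:text", keyword) for keyword in objects]
print(triples)
```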

Example 2

This is in fact covered by the original RML paper ([RML-LDOW2014]). A mapping in YARRRML is shown below; the full mapping file in RML can be found here.

prefixes:
  ex: "http://example.com/"
  exr: "http://example.com/r/"
  foaf: "http://xmlns.com/foaf/0.1/"

mappings:
  person:
    sources:
      - ['persons.json~jsonpath', '$.periodCollections.*']
    s: http://example.com/$(id)
    po:
      - [a, foaf:Person]
  def:
    sources:
      - ['persons.json~jsonpath', '$.periodCollections.*.definitions.*']
    s: http://example.com/r/$(id)
    po:
      - [a, foaf:Document]
      - p: ex:inSchema
        o:
         - mapping: person
           condition:
            function: equal
            parameters:
              - [str1, $(id)]
              - [str2, $(definitions.*.id)]
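The condition in this mapping amounts to a join: a link is generated only for those pairs of records for which the `equal` function holds. The Python sketch below illustrates that idea under simplifying assumptions. The sample records and the helper names are hypothetical, and a parent reference that yields several values is assumed to match if any one of them is equal.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def equal(str1: str, str2: str) -> bool:
    """The join condition: plain string equality."""
    return str1 == str2


# Illustrative records, not the actual content of persons.json.
persons: List[Dict] = [
    {"id": "p1", "definition_ids": ["d1", "d2"]},
    {"id": "p2", "definition_ids": ["d3"]},
]
definitions: List[Dict] = [{"id": "d1"}, {"id": "d3"}, {"id": "d9"}]


def join(children: List[Dict], parents: List[Dict]) -> List[Triple]:
    """Emit ex:inSchema links for every child/parent pair that satisfies `equal`."""
    triples = []
    for child in children:
        for parent in parents:
            if any(equal(child["id"], pid) for pid in parent["definition_ids"]):
                triples.append(
                    (
                        f"http://example.com/r/{child['id']}",
                        "ex:inSchema",
                        f"http://example.com/{parent['id']}",
                    )
                )
    return triples


links = join(definitions, persons)
print(links)
```

The nested loop is the simplest possible strategy; because the rules are declarative, a processor is free to use an index or a hash join instead.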

Example 3

All data files as presented in the D2RML paper are available at the following links: adminCodes.txt, GR.txt, and alternatenames/GR.txt.

For this example, the paper discusses performance, not whether the rules needed to obtain the required result can be written. The actual question is therefore whether the execution or implementation of a language should be considered at all. Because the rules are declarative, a different implementation can be chosen, e.g., one that can handle the mentioned problems. Bind conditions, introduced in [RML-SEMANTICS2015], can be used to select the correct XX.txt file; see also the detailed page about bind conditions. For joining the tables, we could use an rml:joinCondition.

An example mapping snippet is shown below; a full mapping file can be found here.

<#GR>
  rr:predicateObjectMap [
    rr:predicate gn:parentADM1;
    rr:objectMap [
      rr:parentTriplesMap <#GR>;
      rml:joinCondition [
        fnml:functionValue [
          rr:predicateObjectMap [
            rr:predicate fno:executes ;
            rr:objectMap [ rr:constant idlab-fn:equal ]
          ] ;

          rr:predicateObjectMap [
            rr:predicate grel:valueParameter ;
            rr:objectMap [
              rml:parentTermMap [
                  rml:reference "geonameid"
              ]
            ]
          ] ;

          rr:predicateObjectMap [
            rr:predicate grel:valueParameter2 ;
            rr:objectMap [
                fnml:functionValue [
                  rml:logicalSource [
                    # refer to admin1Codes.txt
                  ];

                  rr:predicateObjectMap [
                    rr:predicate fno:executes ;
                    rr:objectMap [ rr:constant ex:getValueFromClmnIfMatch ] ] ;
...

Example 4

By describing functions recursively, this problem can be covered. An example mapping snippet is shown below; a full mapping file can be found here.

<#T>
  rr:predicateObjectMap [
    rr:predicate kvoc-ont:period;
    rr:objectMap [
      fnml:functionValue [
        rr:predicateObjectMap [
          rr:predicate fno:executes ;
          rr:objectMap [ rr:constant ex:find-period ]
        ] ;

        rr:predicateObjectMap [
          rr:predicate grel:valueParameter ;
          rr:objectMap [
            fnml:functionValue [
                    rr:predicateObjectMap [
                        rr:predicate fno:executes ;
                        rr:objectMap [ rr:constant ex:find-location ] ] ;
                    rr:predicateObjectMap [
                        rr:predicate grel:valueParameter ;
                        rr:objectMap [ rml:reference "geo" ] ]
                ]
          ]
        ] ;
...
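Describing functions recursively means that a parameter of one function can itself be the result of another function. A minimal Python sketch of such an evaluation follows. The tree representation, `evaluate`, the sample values, and the two stand-in implementations are hypothetical, as this post does not specify what ex:find-location and ex:find-period compute.

```python
from typing import Callable, Dict, Union

Node = Union[str, dict]


# Stand-in lookups with made-up values, for illustration only.
def find_location(geo: str) -> str:
    return {"37.97,23.72": "Athens"}.get(geo, "unknown")


def find_period(location: str) -> str:
    return {"Athens": "Classical"}.get(location, "unknown")


# Hypothetical registry from function IRIs to implementations.
FUNCTIONS: Dict[str, Callable[[str], str]] = {
    "ex:find-location": find_location,
    "ex:find-period": find_period,
}


def evaluate(node: Node, record: Dict[str, str]) -> str:
    """A string node is a reference into the record; a dict node is a function call."""
    if isinstance(node, str):
        return record[node]
    # Evaluate the parameter first; it may itself be a nested function call.
    argument = evaluate(node["parameter"], record)
    return FUNCTIONS[node["executes"]](argument)


# Mirrors the nesting in the mapping above: find-period(find-location(geo)).
function_value = {
    "executes": "ex:find-period",
    "parameter": {"executes": "ex:find-location", "parameter": "geo"},
}
period = evaluate(function_value, {"geo": "37.97,23.72"})
print(period)
```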

References

  • [RML-LDOW2014]: Dimou et al. RML: A Generic Language for Integrated RDF Mappings of Heterogeneous Data
  • [RML-SEMANTICS2015]: Dimou et al. Machine-interpretable dataset and service descriptions for heterogeneous data access and retrieval
  • [FnO-PD/ESWC2016]: De Meester et al. An Ontology to Semantically Declare and Describe Functions
  • [RML-LDOW2016]: Dimou et al. Automated Metadata Generation for Linked Data Generation and Publishing Workflows
  • [RML+FnO-ESWC2017]: De Meester et al. Declarative Data Transformations for Linked Data Generation: The Case of DBpedia
  • [FnO+DBpedia-ISWC2017]: Maroy et al. Sustainable Linked Data Generation: The Case of DBpedia
  • [RML-LDOW2018]: Dimou et al. What Factors Influence the Design of a Linked Data Generation Algorithm?

Ben De Meester, Anastasia Dimou, Pieter Heyvaert, Ruben Verborgh

Feel free to refer to this work:

Dimou et al. D2RML Comparison: Response to "Mapping Diverse Data to RDF in Practice", 2018. https://rml.io/blog/2018/d2rml-comparison