ISWC 2020 trip report
December 4th, 2020


A general impression of ISWC 2020

At the beginning of November, Dylan, Thomas, and Anastasia attended one of the biggest events in our field, the International Semantic Web Conference, or ISWC for short. Overall, it was an inspiring experience full of bright new ideas for our research area and beyond!

There are at least two other trip reports on ISWC 2020, by Juan and Alberto, but ours focuses on our favorite topic, namely knowledge graph construction! :)

Other trip reports compare the virtual conference with previous physical editions. However, PhD students like us, Thomas and Dylan, are just at the beginning of our PhD journey and never had the chance to attend a physical conference before they all went virtual because of COVID-19! All we can say is that the online experience was as smooth as we would expect from a physical conference. Paper sessions were timed well, so we could easily switch between them and attend exactly the sessions we wanted.

ISWC 2020 used Remo for the poster and demo sessions and Slack for other discussions. It took some time to get used to Remo, but once past that initial barrier, it was nice to interact with authors and other attendees. Slack was a huge asset, which we hope to see again even when the conference returns to a physical format!

RML in broad research

Since we proposed the RDF Mapping Language (RML), it was quite satisfying to see several works building on top of it!

FunMap was one of the papers whose research revolved around RML. The authors of FunMap investigated how preprocessing transformation rules described with the Function Ontology (FnO) and embedded in mapping rules could be optimized.
They observed that repeatedly executing the same function and processing duplicates caused high execution times in certain use cases. FunMap avoids this by executing each FnO function once in a preprocessing step. The results of the functions are then joined with the TriplesMaps. Furthermore, RML processors without FnO support can also execute these mapping rules, since all FnO functions have already been executed. FunMap was the only research paper that was marked as Fully Reproduced by the conference. This was only possible because all source code, documentation, and data were made available, allowing others to fully reproduce all the experiments in the FunMap paper.

This approach shows the power of a declarative mapping language: because the mapping and transformation rules only describe the intended result, the way they are executed can be adapted to make the overall process more efficient.
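To make the idea of executing each function only once a bit more concrete, here is a minimal Python sketch. It is not FunMap's implementation and does not use RML or FnO syntax: the rows, the `normalize` function, and the example.org IRIs are all made up for illustration. It only shows that evaluating a transformation once per distinct input and joining the results back gives the same triples with fewer function calls.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Triple = Tuple[str, str, str]

# Hypothetical IRIs, used only for this illustration.
SUBJECT_TEMPLATE = "http://example.org/gene/{id}"
PREDICATE = "http://example.org/label"


def normalize(value: str) -> str:
    """Stand-in for a transformation function embedded in the mapping rules."""
    return value.strip().upper()


def naive_mapping(rows: List[Row], column: str,
                  fn: Callable[[str], str]) -> Tuple[List[Triple], int]:
    """Call the function once per row, duplicates included."""
    calls = 0
    triples: List[Triple] = []
    for row in rows:
        calls += 1
        triples.append((SUBJECT_TEMPLATE.format(id=row["id"]), PREDICATE, fn(row[column])))
    return triples, calls


def preprocessed_mapping(rows: List[Row], column: str,
                         fn: Callable[[str], str]) -> Tuple[List[Triple], int]:
    """Evaluate the function once per distinct input, then join the results with the rows."""
    distinct_inputs = {row[column] for row in rows}
    lookup = {value: fn(value) for value in distinct_inputs}  # preprocessing step
    triples = [
        (SUBJECT_TEMPLATE.format(id=row["id"]), PREDICATE, lookup[row[column]])  # join
        for row in rows
    ]
    return triples, len(lookup)


rows = [
    {"id": "1", "name": " brca1 "},
    {"id": "2", "name": " brca1 "},  # duplicate input value
    {"id": "3", "name": "tp53"},
]

naive_triples, naive_calls = naive_mapping(rows, "name", normalize)
fast_triples, fast_calls = preprocessed_mapping(rows, "name", normalize)
print(naive_calls, fast_calls)
```

Because the lookup table already holds the function results, the join step needs no function support at all, which mirrors why processors without FnO support can still execute the rewritten rules.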

Another interesting paper was “Turning Transport Data into EU Compliance while Enabling a Multimodal Transport Knowledge Graph”, in which RML is used to transform transport data into a Knowledge Graph. This Knowledge Graph is later converted into various transport data formats. This way, transport agencies only have to provide their data in a single format while still complying with the EU regulation (2017/1926) on publishing public transport data through a National Access Point.

This approach is a good example of why Knowledge Graphs are ideal for data integration: they not only contain the data, but also describe what the data means. Because of this, converting the Knowledge Graph into other formats is straightforward.

Besides these new research approaches around RML, tutorials on Knowledge Graph construction were presented as well, such as “Knowledge Graph Construction using Declarative Mapping Rules” and “How to build large knowledge graphs efficiently (LKGT)”.

The former tutorial explains the process of constructing knowledge graphs in detail, from writing mapping and transformation rules to executing them. The presenters described Mapeathor, a tool for writing mapping rules; Morph-CSV, a framework for virtual knowledge graph access over tabular data (demo); and Helio, which generates Linked Data from heterogeneous data sources and publishes it with unified access in real time. The latter tutorial covers the whole life cycle, from knowledge creation through knowledge hosting and knowledge curation to knowledge deployment. It shows how these steps apply to a Knowledge Graph that uses schema.org and domain-specific extensions of schema.org as its ontology.

RML & COVID-19

There were also two papers presented showing how Knowledge Graphs are generated: “Facilitating the Analysis of COVID-19 Literature Through a Knowledge Graph” and “Covid-on-the-Web”. Both use a mapping language to define how to generate the corresponding Knowledge Graph. The former uses RML, the latter xR2RML, the sister language of RML :)

Covid-on-the-Web is a Knowledge Graph generated from the COVID-19 Open Research Dataset (CORD-19) to help biomedical scientists find and understand COVID-19 related research papers. Besides generating a Knowledge Graph using the xR2RML mapping language, the authors also analyze the Knowledge Graph to create links between COVID-19 research. “Facilitating the Analysis of COVID-19 Literature Through a Knowledge Graph” generated a Knowledge Graph from the original CORD-19 CSV and JSON files using RML mapping rules, executed by the RMLMapper in the beginning and by the RMLStreamer when their size grew. It is very interesting that the two Knowledge Graphs come from the same data source, yet in the end they are different!
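The observation that one data source can yield different knowledge graphs follows directly from how declarative mapping rules work. The small Python sketch below illustrates this with toy rules; it uses neither RML nor xR2RML syntax, and the records, field names, and example.org IRIs are assumptions made up for the example, not taken from CORD-19.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Record = Dict[str, str]
# A toy rule: (subject template, predicate IRI, object template).
# Placeholders such as {paper_id} refer to fields of a record.
Rule = Tuple[str, str, str]

EX = "http://example.org/"  # hypothetical namespace


def apply_rules(records: List[Record], rules: List[Rule]) -> Set[Triple]:
    """Generate triples by filling every rule's templates with every record."""
    triples: Set[Triple] = set()
    for record in records:
        for subject_template, predicate, object_template in rules:
            triples.add((
                subject_template.format(**record),
                predicate,
                object_template.format(**record),
            ))
    return triples


# The same (made-up) source records are used for both rule sets.
records = [
    {"paper_id": "p1", "title": "Paper one", "journal": "j1"},
    {"paper_id": "p2", "title": "Paper two", "journal": "j1"},
]

# Rule set A describes papers by their titles.
rules_a = [(EX + "paper/{paper_id}", EX + "title", "{title}")]
# Rule set B links papers to the journals they appeared in.
rules_b = [(EX + "paper/{paper_id}", EX + "publishedIn", EX + "journal/{journal}")]

graph_a = apply_rules(records, rules_a)
graph_b = apply_rules(records, rules_b)
print(sorted(graph_a))
print(sorted(graph_b))
```

Both graphs have the same subjects, but they say different things about them, simply because the rules ask for different things.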

Other mapping languages

Besides RML, other mapping languages were also present! We cannot cover them all in this blogpost, so we had to select our personal favorites: G2GML, a mapping language for property graphs, and xR2RML, the language behind Covid-on-the-Web, which we covered above.

G2GML (Graph To Graph Mapping) bridges the gap between the Resource Description Framework (RDF) world and Property Graphs. With G2GML, an existing RDF Knowledge Graph can be transformed into a Property Graph. This way, graph databases such as Neo4j can make use of the RDF data.
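As a rough illustration of what going from RDF to a Property Graph involves, here is a minimal Python sketch. It does not use G2GML or its syntax; instead of user-written mapping rules it applies one fixed convention, which is purely our assumption for this example: triples with a literal object become node properties, and triples with an IRI object become edges. The triples themselves are made up.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Nodes = Dict[str, Dict[str, str]]
Edges = List[Tuple[str, str, str]]


def is_iri(term: str) -> bool:
    """Very rough check: treat anything that looks like an HTTP(S) IRI as a resource."""
    return term.startswith("http://") or term.startswith("https://")


def local_name(iri: str) -> str:
    """Take the part after the last '/' or '#' as a short property or edge name."""
    return iri.rstrip("/#").rsplit("/", 1)[-1].rsplit("#", 1)[-1]


def rdf_to_property_graph(triples: List[Triple]) -> Tuple[Nodes, Edges]:
    """Turn literal-valued triples into node properties and IRI-valued triples into edges."""
    nodes: Nodes = {}
    edges: Edges = []
    for subject, predicate, obj in triples:
        nodes.setdefault(subject, {})
        if is_iri(obj):
            nodes.setdefault(obj, {})
            edges.append((subject, local_name(predicate), obj))
        else:
            nodes[subject][local_name(predicate)] = obj
    return nodes, edges


triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/name", "Bob"),
]

nodes, edges = rdf_to_property_graph(triples)
print(nodes)
print(edges)
```

A real mapping language is useful precisely because such a fixed convention is rarely what you want: which RDF patterns should become nodes, edges, or properties depends on the use case.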

Mapping languages are one way to generate Knowledge Graphs, but Machine Learning is becoming more and more popular for this task as well. During ISWC 2020, several Machine Learning approaches for generating Knowledge Graphs were presented. An interesting one was “Codebreaker” from IBM, which created a Knowledge Graph from Python code using Natural Language Processing (NLP). Using this Knowledge Graph, it can provide detailed code suggestions, search StackOverflow and other information sources with the context of the current code, and more, to help programmers during their work.

Conclusion

We really enjoyed attending ISWC, even though it was virtual! ISWC was well organized and a lot of interesting research was presented. We were definitely inspired by all these new ideas on using RML and other approaches for constructing Knowledge Graphs! The conference also made clear that a mapping language for generating knowledge graphs from heterogeneous data, like RML, is needed, as this topic is researched more and more actively. We have a bright future ahead of us, with more research to come in the coming years!

Written by Dylan Van Assche, Thomas Delva & Anastasia Dimou on behalf of the RML team