Keep an overview of how your data is integrated into a knowledge graph, and keep it simple to streamline your process. The declarative configuration document is a single place to inspect how your data should be integrated.
Normalization functions can be integrated to improve the quality of your data values, and the semantic integrity of your knowledge can be validated even before any data is generated.
For more information on the declarative configuration, see the introduction on RML.
For more information on the data transformations, see the connection between FnO and RML.
For more information on data validation, see Validatrr.
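As an illustration, a minimal declarative configuration with one normalization function might look like the sketch below. The file name `data.csv`, the `ex:` namespace, and the `grel:toUpperCase` function with its `grel:valueParameter` are assumptions for this example, not part of any specific deployment:

```turtle
@prefix rr:   <http://www.w3.org/ns/r2rml#> .
@prefix rml:  <http://semweb.mmlab.be/ns/rml#> .
@prefix ql:   <http://semweb.mmlab.be/ns/ql#> .
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#> .
@prefix fno:  <https://w3id.org/function/ontology#> .
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#> .
@prefix ex:   <http://example.com/> .

ex:PersonMapping a rr:TriplesMap ;
  # The data source: a CSV file (name assumed for this sketch)
  rml:logicalSource [
    rml:source "data.csv" ;
    rml:referenceFormulation ql:CSV
  ] ;
  # How subject IRIs are built from each row
  rr:subjectMap [
    rr:template "http://example.com/person/{id}" ;
    rr:class ex:Person
  ] ;
  # A normalization function applied to the "name" column before generation
  rr:predicateObjectMap [
    rr:predicate ex:name ;
    rr:objectMap [
      fnml:functionValue [
        rr:predicateObjectMap [
          rr:predicate fno:executes ;
          rr:objectMap [ rr:constant grel:toUpperCase ]
        ] ;
        rr:predicateObjectMap [
          rr:predicate grel:valueParameter ;
          rr:objectMap [ rml:reference "name" ]
        ]
      ]
    ]
  ] .
```

Because the mapping is a plain document, it can be inspected, versioned, and validated on its own, independently of the processor that later executes it.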
Big or small, generate your knowledge graph using the right type of software. The RMLMapper is our reference implementation in Java with full functionality, perfect for smaller datasets. The RMLStreamer is our streaming solution in Scala, built on Kafka and Flink, capable of scaling up as much as you need.
Our processors are MIT-licensed open-source projects which you can find on GitHub. Deployment is eased by the Docker files we provide. If you would like to collaborate on deployed solutions, feel free to contact us.