Abstract
Since Google announced its internal use of Knowledge Graphs to improve search and organize information, their
use and application have increased markedly.
Various technologies have been proposed to implement Knowledge Graphs:
RDF-based triplestores are canonical in the Semantic Web, while in the graph database context Labeled
Property Graphs, as implemented by systems like Neo4j, are considered another technology for Knowledge Graphs.
Wikidata, a popular Knowledge Graph, offers RDF through its SPARQL query service, but its data model, which
uses qualifiers and references, aligns closely with Property Graphs.
The RDF-Star proposal, which is expected to become RDF 1.2, can bridge the gap between RDF and
Property Graphs by allowing statements about statements.
The quality of the data within these graphs is pivotal, and the data is often validated against expected data
models or shapes to improve accuracy.
We will present some approaches that have been developed to describe and validate RDF, like Shape Expressions
(ShEx) and Shapes Constraint Language (SHACL), briefly describing them and showing some of their differences.
For Property Graphs, PGSchema has been proposed, along with other proposals like PShEx and ProGS, and
more recently GQL offers a way to define typed graphs.
Wikidata adopted Entity Schemas, which are based on ShEx, alongside its own property constraint system, and
there is also a proposal called WShEx.
This tutorial will explore different types of Knowledge Graphs and approaches for their validation. We will
also review practical applications like inferring shapes from existing data and creating conforming subsets
of Knowledge Graphs.
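To give a rough feel for the shape-inference application mentioned above, the following Python sketch derives, for each class, the properties used by its instances and whether every instance has them. It is only an illustration under simplifying assumptions: triples are plain tuples rather than a real RDF graph, and the function name `infer_shapes` and the `ex:` identifiers are hypothetical. Actual tools work over RDF data and emit ShEx or SHACL.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"  # abbreviated predicate name, assumed for this sketch


def infer_shapes(triples):
    """Return {class: {property: 'required' | 'optional'}} from (s, p, o) tuples."""
    classes_of = defaultdict(set)  # subject -> classes it is declared to have
    props_of = defaultdict(set)    # subject -> properties it uses
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes_of[s].add(o)
        else:
            props_of[s].add(p)

    # Group the subjects by class.
    instances = defaultdict(list)
    for s, classes in classes_of.items():
        for c in classes:
            instances[c].append(s)

    # A property is "required" if every instance of the class uses it.
    shapes = {}
    for c, members in instances.items():
        all_props = set().union(*(props_of[m] for m in members))
        shapes[c] = {
            p: "required" if all(p in props_of[m] for m in members) else "optional"
            for p in sorted(all_props)
        }
    return shapes


data = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
]

shapes = infer_shapes(data)
print(shapes)
```

In this toy data, `ex:name` appears on both persons and `ex:knows` on only one, so the sketch reports the former as required and the latter as optional.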
Slides (work in progress)
Topics
This is a half-day tutorial with the following topics:
- Introduction to Knowledge Graphs
- Types of Knowledge Graphs:
  - RDF graphs
  - Property Graphs
  - Wikidata and Wikibase graphs
  - RDF-Star (RDF 1.2)
- Shaping RDF:
  - Introduction to ShEx
  - Introduction to SHACL
  - ShEx & SHACL compared
- Shaping other types of Knowledge Graphs:
  - Shaping Wikidata and Wikibase graphs: Entity Schemas and WShEx
  - Shaping property graphs: P-ShEx, PGSchema, etc.
  - Shaping RDF-Star: ShEx-Star
- Applications: Inferring shapes from data, Knowledge Graphs Subsets, etc.
We plan to devote the first slot to the first three items (Knowledge Graphs and the RDF validation
technologies ShEx and SHACL), which are more introductory,
and the second slot to the rest of the items, which are more specialised.
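The topics on Property Graphs, Wikidata qualifiers and RDF-Star all deal with attaching data to a statement. As a minimal sketch of that idea, assuming triples are modelled as Python tuples and a statement about a statement is a triple whose subject is itself a triple, one could write the following. The `Triple` type, the `annotations_of` helper and the `ex:` names are hypothetical and do not correspond to any RDF library API.

```python
from typing import NamedTuple, Union


class Triple(NamedTuple):
    """A triple whose subject or object may itself be a triple."""
    subject: Union[str, "Triple"]
    predicate: str
    object: Union[str, "Triple"]


# A plain statement, and a statement about that statement.
base = Triple("ex:alice", "ex:worksFor", "ex:acme")
annotation = Triple(base, "ex:since", "2020")

graph = {base, annotation}


def annotations_of(statement, graph):
    """Return the (predicate, object) pairs attached to a given statement."""
    return {(t.predicate, t.object) for t in graph if t.subject == statement}


print(annotations_of(base, graph))
```

The same information would be a property on an edge in a Property Graph, or a qualifier on a statement in Wikidata, which is why the three models are discussed together.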
Goals
- Participants will understand the different technologies used to implement Knowledge Graphs.
- Participants will understand the differences between the data models of RDF, Property Graphs, Wikibase and
RDF-Star.
- Participants will understand use cases for defining shapes and validating Knowledge Graphs.
- Participants will be able to create their own RDF data shapes or schemas and validate instance data
against them using ShEx and SHACL.
- Participants will see how RDF validation works in ShEx and SHACL.
- Hands-on experience will leave participants comfortable using existing tools to solve practical needs in
communicating schemas and verifying instance data conformance.
- Participants will be able to assess and compare the differences between ShEx, SHACL and other validation
approaches for property graphs and Wikibase.
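To make the goal of validating instance data against a shape more concrete, here is a deliberately simplified Python sketch. It is not ShEx or SHACL: a shape is assumed to be a dictionary from property to a minimum count, a maximum count and a value check, triples are plain tuples, and properties not mentioned in the shape are ignored. The names `validate` and `person_shape` and the `ex:` identifiers are hypothetical.

```python
def validate(node, shape, triples):
    """Return a list of violation messages for `node` (empty if it conforms)."""
    errors = []
    for prop, (min_count, max_count, value_ok) in shape.items():
        # Collect the values this node has for the property.
        values = [o for s, p, o in triples if s == node and p == prop]
        if len(values) < min_count:
            errors.append(f"{node}: {prop} needs at least {min_count} value(s)")
        if max_count is not None and len(values) > max_count:
            errors.append(f"{node}: {prop} allows at most {max_count} value(s)")
        errors.extend(
            f"{node}: {prop} has unexpected value {v!r}"
            for v in values if not value_ok(v)
        )
    return errors


# Hypothetical person shape: exactly one literal name, any number of ex:knows links.
person_shape = {
    "ex:name": (1, 1, lambda v: isinstance(v, str) and not v.startswith("ex:")),
    "ex:knows": (0, None, lambda v: isinstance(v, str) and v.startswith("ex:")),
}

data = [
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:alice"),  # ex:bob has no ex:name
]

report = {n: validate(n, person_shape, data) for n in ("ex:alice", "ex:bob")}
for node, errors in report.items():
    print(node, "conforms" if not errors else errors)
```

Here `ex:alice` conforms, while `ex:bob` is reported because the required `ex:name` is missing. The shape languages covered in the tutorial express such cardinality and value constraints declaratively, with considerably richer semantics than this sketch.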
Tutorial type and intended audience
- Target audience: Anyone interested in Knowledge Graphs and data quality.
- Tutorial type: We consider this an introductory/specialised tutorial, as it can serve as an
introduction for people who are not familiar with the technologies, while also offering specialised knowledge
about validating property graphs in the second slot.
- Prior knowledge: Some rudimentary knowledge of RDF and Turtle is expected, although a short introduction to
the RDF data model will be given.
- Duration: Half-day tutorial (14:00h to 18:00h, 1st June 2025).
- Materials: We will create a GitHub repository containing the running examples, which can be executed online
using the following tools:
Anyone interested in Semantic Web technologies and tools can attend this tutorial.
Tutoring team
- Jose Emilio Labra Gayo.
Full Professor at University of Oviedo, Spain.
Founder and main researcher of WESO (Web Semantics Oviedo) research group,
which collaborates with different companies around the world applying semantic web technologies.
The development of data portals for several companies and public administrations led to his interest in RDF
validation.
He was a member of the W3C Data Shapes working group and of the W3C community groups:
Shape Expressions and SHACL.
He implemented the SHACL and ShEx library SHaclEX in Scala,
maintains the online RDF validator services RDFShape
and WikiShape, and is now implementing the rudof library in Rust, which can also be used to validate RDF
with ShEx, SHACL, DCTAP, etc.
Registration and schedule
To register, visit: ESWC'25
The tutorial will start at 14:00h and has two slots, 14:00h to 15:30h and 16:00h to 18:00h, on Sunday, 1st
June 2025 (see Conference Program).