Abstract
The EU Common Data Spaces initiative aims to enable secure, sovereign and interoperable data sharing across
organizational and national boundaries. However, the high heterogeneity of underlying data models and
formats prevents semantic interoperability from being realized. Publishers can address this challenge by
exposing their internal knowledge through continuous publishing models that reduce operational overhead
for both publishers and consumers. Yet for data consumers, costly alignments remain necessary when
the semantics of published datasets differ from their expected internal data models and schemas. Data spaces
require mechanisms to define, discover, and govern such alignments throughout their entire lifecycle, enabling
eventual interoperability. In this paper, we show that considering additional semantic artifacts as part of
the vocabulary hub, namely dataset profiles defining structural and semantic constraints, and profile
alignments (e.g., in the form of SPARQL CONSTRUCT queries), could provide consumers with a semantic entry
point for dataset discovery and integration. We focus on the interaction patterns afforded by the addition
of these semantic artifacts and provide a demonstrator implementation of a user interface that integrates
this functionality. We validate our approach through a use case from the DeployEMDS project, focused on the
automatic discovery and alignment of traffic measurements. The extended vocabulary hub enables clients to
discover datasets based on profile characteristics such as shapes, ontologies, and publishing data models,
while also identifying available alignment pathways toward target consumer data models. This lowers the
technical barriers to creating and relying on semantic alignments, so that consumers can use data in their
desired vocabularies and schemas. Future work will focus on integrating this component with
existing data space connector implementations to further automate semantic interoperability by enabling
semantic and profile-based content negotiation for data exchanges.
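
As a minimal illustration of what identifying an alignment pathway could involve, the sketch below treats
dataset profiles as nodes and profile alignments as directed edges, and searches for the shortest chain of
alignments from a publisher profile to a consumer profile. It is not the demonstrator's implementation: the
function name, the tuple-based registry, and all `ex:` identifiers are hypothetical and chosen only for this
example.

```python
from collections import deque

# Hypothetical registry of profile alignments, assumed for illustration only:
# (source profile, target profile, alignment identifier).
ALIGNMENTS = [
    ("ex:PublisherTrafficProfile", "ex:IntermediateProfile", "ex:align-1"),
    ("ex:IntermediateProfile", "ex:ConsumerTrafficProfile", "ex:align-2"),
    ("ex:PublisherTrafficProfile", "ex:OtherProfile", "ex:align-3"),
]


def find_alignment_path(alignments, source, target):
    """Return the shortest list of alignment ids leading from source to target.

    Returns an empty list when source and target are the same profile,
    and None when no chain of alignments connects them.
    """
    # Build an adjacency list: profile -> [(next profile, alignment id), ...].
    graph = {}
    for src, dst, alignment_id in alignments:
        graph.setdefault(src, []).append((dst, alignment_id))

    # Breadth-first search guarantees the first path found is the shortest.
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        profile, path = queue.popleft()
        if profile == target:
            return path
        for next_profile, alignment_id in graph.get(profile, []):
            if next_profile not in seen:
                seen.add(next_profile)
                queue.append((next_profile, path + [alignment_id]))
    return None


path = find_alignment_path(
    ALIGNMENTS, "ex:PublisherTrafficProfile", "ex:ConsumerTrafficProfile"
)
print(path)  # ['ex:align-1', 'ex:align-2']
```

In this sketch, each returned identifier would stand for one alignment (for instance a SPARQL CONSTRUCT
query) that a client applies in order; a result of `None` signals that no alignment pathway is registered.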