Generating DCAT-AP Compliant JSON-LD for WMS Layers
To generate DCAT-AP compliant JSON-LD for WMS layers, map the OGC service endpoint to a dcat:DataService, each published layer to a dcat:Dataset, and GetMap/GetFeatureInfo operations to dcat:Distribution resources. A reliable implementation uses Python’s rdflib to construct an RDF graph, bind DCAT/DCT namespaces, and serialize to JSON-LD with an explicit @context. This structural alignment supports interoperability with European data portals, national spatial infrastructures, and automated harvesters that expect DCAT-AP for Spatial Data Portals compliance.
Core Mapping Strategy
WMS Capabilities XML (GetCapabilities) does not natively expose DCAT semantics. You must parse the XML and transform elements into DCAT-AP v3 triples:
| WMS Capabilities Element | DCAT-AP v3 Property | Implementation Notes |
|---|---|---|
| <Service><Title> / <Abstract> | dct:title / dct:description | Always attach @language tags (e.g., lang="en") |
| <EX_GeographicBoundingBox> | dct:spatial → dct:Location | Use dcat:bbox as a "minX minY maxX maxY" string (some profiles expect a WKT literal instead) |
| <KeywordList>/<Keyword> | dcat:theme | Resolve to controlled vocabularies (INSPIRE, AGROVOC, or local SKOS) |
| <OnlineResource xlink:href> | dcat:accessURL | Attach to dcat:Distribution, not the dataset |
| <Layer><Name> | dct:identifier | Use as URI fragment for dataset/distribution nodes |
| Service metadata | dcat:endpointURL, dcat:servesDataset | Endpoint must resolve to the base WMS URL |
This mapping ensures catalog ingestion pipelines traverse from service to dataset without breaking referential integrity. Properly structured metadata also feeds directly into broader Spatial Metadata & Catalog Integration workflows, enabling cross-portal discovery, automated quality scoring, and federated search across distributed spatial infrastructures.
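The parsing step that feeds this mapping can be sketched with the standard library alone. The example below is a minimal illustration, assuming a WMS 1.3.0 capabilities document (default namespace http://www.opengis.net/wms); the function name parse_layers and the SAMPLE document are hypothetical, and the sketch does not resolve bounding boxes that a layer inherits from its parent.

```python
import xml.etree.ElementTree as ET

# Default XML namespace of WMS 1.3.0 capabilities documents.
WMS_URI = "http://www.opengis.net/wms"
NS = {"wms": WMS_URI}

# Hypothetical, heavily trimmed capabilities document for illustration.
SAMPLE = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer>
        <Name>roads</Name>
        <Title>Roads</Title>
        <Abstract>Road network</Abstract>
        <EX_GeographicBoundingBox>
          <westBoundLongitude>-10.5</westBoundLongitude>
          <eastBoundLongitude>5.0</eastBoundLongitude>
          <southBoundLatitude>35.0</southBoundLatitude>
          <northBoundLatitude>60.25</northBoundLatitude>
        </EX_GeographicBoundingBox>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""


def parse_layers(capabilities_xml):
    """Return one dict per named layer: id, title, abstract, bbox."""
    root = ET.fromstring(capabilities_xml)
    layers = []
    for el in root.iter(f"{{{WMS_URI}}}Layer"):
        name = el.findtext("wms:Name", namespaces=NS)
        if not name:
            continue  # layers without a Name are grouping containers
        bbox = None
        box = el.find("wms:EX_GeographicBoundingBox", NS)
        if box is not None:
            bbox = {
                "west": float(box.findtext("wms:westBoundLongitude", namespaces=NS)),
                "south": float(box.findtext("wms:southBoundLatitude", namespaces=NS)),
                "east": float(box.findtext("wms:eastBoundLongitude", namespaces=NS)),
                "north": float(box.findtext("wms:northBoundLatitude", namespaces=NS)),
            }
        layers.append({
            "id": name,
            "title": el.findtext("wms:Title", default="", namespaces=NS),
            "abstract": el.findtext("wms:Abstract", default="", namespaces=NS),
            "bbox": bbox,
        })
    return layers


layers = parse_layers(SAMPLE)
```

Each resulting dict carries the keys id, title, abstract, and bbox, so it can be handed to a graph-building function one layer at a time.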
Production-Ready Python Implementation
The following implementation uses rdflib>=6.2.0 to build the RDF graph, then exports JSON-LD with a simplified DCAT-AP v3 prefix context. It handles spatial bounding boxes, language tagging, and distribution media types.
    from rdflib import BNode, Graph, Literal, URIRef
    from rdflib.namespace import DCAT, DCTERMS, RDF, XSD

    # Compact prefix context for JSON-LD serialization
    # (a simplified stand-in for the full DCAT-AP v3 context)
    DCAT_AP_CONTEXT = {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "xsd": "http://www.w3.org/2001/XMLSchema#",
            "locn": "http://www.w3.org/ns/locn#",
            "geo": "http://www.opengis.net/ont/geosparql#",
        }
    }

    def generate_wms_jsonld(layer: dict, service_url: str) -> str:
        """
        Generates DCAT-AP v3 compliant JSON-LD for a single WMS layer.
        Expects layer dict: {'id', 'title', 'abstract', 'bbox': {'west','south','east','north'}}
        """
        g = Graph()
        g.bind("dcat", DCAT)
        g.bind("dct", DCTERMS)
        g.bind("xsd", XSD)

        svc_uri = URIRef(service_url)
        ds_uri = URIRef(f"{service_url}#dataset/{layer['id']}")
        dist_uri = URIRef(f"{service_url}#distribution/{layer['id']}")

        # 1. DataService node
        g.add((svc_uri, RDF.type, DCAT.DataService))
        g.add((svc_uri, DCTERMS.title, Literal("OGC Web Map Service", lang="en")))
        g.add((svc_uri, DCAT.endpointURL, svc_uri))
        g.add((svc_uri, DCAT.servesDataset, ds_uri))

        # 2. Dataset node (the identifier is dct:identifier; DCAT defines no dcat:identifier)
        g.add((ds_uri, RDF.type, DCAT.Dataset))
        g.add((ds_uri, DCTERMS.title, Literal(layer['title'], lang="en")))
        g.add((ds_uri, DCTERMS.description, Literal(layer['abstract'], lang="en")))
        g.add((ds_uri, DCTERMS.identifier, Literal(layer['id'])))

        # 3. Spatial bounding box: dct:spatial -> dct:Location carrying dcat:bbox
        bbox = layer['bbox']
        bbox_node = BNode()
        g.add((ds_uri, DCTERMS.spatial, bbox_node))
        g.add((bbox_node, RDF.type, DCTERMS.Location))
        bbox_str = f"{bbox['west']} {bbox['south']} {bbox['east']} {bbox['north']}"
        g.add((bbox_node, DCAT.bbox, Literal(bbox_str, datatype=XSD.string)))

        # 4. Distribution (GetMap operation)
        g.add((ds_uri, DCAT.distribution, dist_uri))
        g.add((dist_uri, RDF.type, DCAT.Distribution))
        g.add((dist_uri, DCTERMS.title, Literal(f"{layer['title']} - WMS GetMap", lang="en")))
        g.add((dist_uri, DCAT.accessURL, svc_uri))
        # dcat:mediaType takes an IANA media type URI rather than a plain string
        g.add((dist_uri, DCAT.mediaType, URIRef("https://www.iana.org/assignments/media-types/image/png")))
        g.add((dist_uri, DCTERMS.conformsTo, URIRef("http://www.opengis.net/spec/wms/1.3.0")))

        # Serialize with the explicit context so every node is kept and compacted;
        # taking only the first @graph entry would drop the other nodes.
        return g.serialize(format="json-ld", context=DCAT_AP_CONTEXT["@context"], indent=2)
Key Implementation Details
- Context Injection: Without an explicit context, rdflib’s JSON-LD serializer typically emits expanded output with full IRIs and no @context. Supplying the DCAT-AP v3 prefix context keeps the output compact and lets harvesters resolve prefixes consistently.
- Spatial Handling: dcat:bbox is written here as a space-delimited string ("minX minY maxX maxY") on a dct:Location node attached via dct:spatial. The DCAT specification leaves the literal encoding open and names WKT and GML as examples, so check which encoding your national profile requires.
- Language Tags: Always create literals with lang="en" (or your target language). Some harvesters reject untagged strings in multilingual catalogs.
- Conformance URIs: Use the OGC WMS 1.3.0 specification URI in dct:conformsTo to signal the protocol version to automated validators.
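Where a profile does call for WKT in dcat:bbox, the conversion from the layer's bbox dict is short. The helper below is a sketch, not part of the implementation above; the name bbox_to_wkt is hypothetical, and it assumes longitude/latitude values in WGS 84 and a box that does not cross the antimeridian.

```python
def bbox_to_wkt(bbox):
    """Build a closed WKT POLYGON ring (lon lat order) from a
    {'west', 'south', 'east', 'north'} dict."""
    w, s, e, n = bbox["west"], bbox["south"], bbox["east"], bbox["north"]
    # Counter-clockwise ring that ends on its starting corner.
    ring = [(w, s), (e, s), (e, n), (w, n), (w, s)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


wkt = bbox_to_wkt({"west": -10.5, "south": 35.0, "east": 5.0, "north": 60.25})
```

The resulting string would then be attached as a literal typed with the GeoSPARQL datatype http://www.opengis.net/ont/geosparql#wktLiteral, which the geo prefix in the context already covers.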
Validation & Compliance Best Practices
Before publishing to production portals, validate your output against the W3C DCAT v3 Recommendation and the EU DCAT-AP v3 profile. Common validation failures include:
- Missing @id on nodes: Every dcat:Dataset and dcat:Distribution must have a resolvable URI or blank node.
- Incorrect coordinate order: Bounding boxes must use WGS 84 (EPSG:4326) longitude/latitude values in the order west south east north.
- Unresolved themes: dcat:theme must point to a URI in a registered vocabulary (e.g., http://inspire.ec.europa.eu/theme/oi for orthoimagery). Free-text keywords will fail strict validation.
- Distribution media types: Use IANA-registered MIME types (image/png, image/jpeg) for map output; application/vnd.ogc.wms_xml is the type WMS 1.1.1 servers report for capabilities documents. Avoid generic application/octet-stream.
For automated testing, run the generated JSON-LD through DCAT-AP validators, and test the WMS endpoint itself separately against the OGC Web Map Service (WMS) Implementation Specification. Run periodic harvest simulations using tools like pyld or the json-ld.org playground to verify context resolution and graph flattening.
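Several of the failures listed above can be caught before any graph is built. The following is a rough pre-publish check under stated assumptions: the function name find_metadata_problems and the ALLOWED_MEDIA_TYPES set are illustrative, and the range test only catches obvious coordinate mistakes (a swapped pair that still falls within valid ranges will pass). It complements a proper DCAT-AP validator rather than replacing one.

```python
from urllib.parse import urlparse

# Illustrative subset; extend to match the formats your service offers.
ALLOWED_MEDIA_TYPES = {"image/png", "image/jpeg"}


def find_metadata_problems(bbox, themes, media_type):
    """Return a list of human-readable problems; empty means no obvious issue."""
    problems = []
    w, s, e, n = bbox["west"], bbox["south"], bbox["east"], bbox["north"]
    if not (-180 <= w <= 180 and -180 <= e <= 180):
        problems.append("longitude outside [-180, 180]")
    if not (-90 <= s <= n <= 90):
        problems.append("latitude outside [-90, 90] or south > north")
    for theme in themes:
        parts = urlparse(theme)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"theme is not an http(s) URI: {theme!r}")
    if media_type not in ALLOWED_MEDIA_TYPES:
        problems.append(f"unexpected media type: {media_type!r}")
    return problems


ok = find_metadata_problems(
    {"west": -10.5, "south": 35.0, "east": 5.0, "north": 60.25},
    ["http://inspire.ec.europa.eu/theme/oi"],
    "image/png",
)
bad = find_metadata_problems(
    {"west": -10.5, "south": 35.0, "east": 5.0, "north": 95.0},
    ["roads"],
    "application/octet-stream",
)
```

Returning a list instead of raising on the first failure lets a sync job log every problem for a layer in one pass.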
Deployment Considerations
When scaling this pattern across hundreds of layers:
- Cache GetCapabilities XML and parse it asynchronously to avoid blocking catalog sync jobs.
- Use dcat:theme mappings stored in a configuration file or SKOS endpoint rather than hardcoding.
- Attach dct:license and dct:rights at the dataset level to satisfy open-data mandates.
- Version your JSON-LD output by appending ?version=3 to the endpoint or using content negotiation (Accept: application/ld+json).
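The configuration-driven theme mapping from the list above might look like the sketch below. The JSON layout, the THEME_CONFIG contents, and the function names load_theme_map and resolve_themes are all assumptions for illustration; in practice the text would come from a file or a SKOS lookup.

```python
import json

# Hypothetical config: lowercase keyword -> controlled-vocabulary URI.
THEME_CONFIG = """
{
  "orthoimagery": "http://inspire.ec.europa.eu/theme/oi",
  "hydrography": "http://inspire.ec.europa.eu/theme/hy"
}
"""


def load_theme_map(config_text):
    """Parse the JSON config and normalize keys for case-insensitive lookup."""
    raw = json.loads(config_text)
    return {key.strip().lower(): uri for key, uri in raw.items()}


def resolve_themes(keywords, theme_map):
    """Split WMS keywords into resolved theme URIs and unmapped leftovers."""
    resolved, unmapped = [], []
    for keyword in keywords:
        uri = theme_map.get(keyword.strip().lower())
        if uri is None:
            unmapped.append(keyword)
        elif uri not in resolved:
            resolved.append(uri)
    return resolved, unmapped


theme_map = load_theme_map(THEME_CONFIG)
resolved, unmapped = resolve_themes(["Orthoimagery", " hydrography ", "misc"], theme_map)
```

Keeping the unmapped keywords separate makes it easy to report them for vocabulary curation instead of silently emitting free text as dcat:theme.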
This approach delivers machine-readable, portal-ready metadata that survives strict ingestion pipelines while maintaining full alignment with European spatial data interoperability standards.