Generating DCAT-AP Compliant JSON-LD for WMS Layers

To generate DCAT-AP compliant JSON-LD for WMS layers, map the OGC service endpoint to a dcat:DataService, each published layer to a dcat:Dataset, and the GetMap/GetFeatureInfo operations to dcat:Distribution resources. A dependable way to implement this is to build an RDF graph with Python’s rdflib, bind the DCAT and DCT namespaces, and serialize to JSON-LD with an explicit @context. This structure is what European data portals, national spatial data infrastructures, and automated harvesters look for when they check DCAT-AP for Spatial Data Portals compliance.

Core Mapping Strategy

WMS Capabilities XML (GetCapabilities) does not natively expose DCAT semantics. You must parse the XML and transform elements into DCAT-AP v3 triples:

WMS Capabilities Element | DCAT-AP v3 Property | Implementation Notes
------------------------ | ------------------- | --------------------
<Service><Title> / <Abstract> | dct:title / dct:description | Always attach language tags (e.g., lang="en")
<EX_GeographicBoundingBox> | dct:spatial pointing to a dct:Location | Put the box in dcat:bbox as a geometry literal, typically a WKT polygon in longitude/latitude order
<KeywordList>/<Keyword> | dcat:theme | Resolve to controlled vocabularies (INSPIRE, AGROVOC, or local SKOS)
<OnlineResource xlink:href> | dcat:accessURL | Attach to dcat:Distribution, not the dataset
<Layer><Name> | dct:identifier | Use as URI fragment for dataset/distribution nodes
Service metadata | dcat:endpointURL, dcat:servesDataset | Endpoint must resolve to the base WMS URL

This mapping ensures catalog ingestion pipelines traverse from service to dataset without breaking referential integrity. Properly structured metadata also feeds directly into broader Spatial Metadata & Catalog Integration workflows, enabling cross-portal discovery, automated quality scoring, and federated search across distributed spatial infrastructures.
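
The parsing step that feeds this mapping can be sketched with the standard library alone. The example below is a minimal sketch, assuming a WMS 1.3.0 Capabilities document in the http://www.opengis.net/wms namespace; extract_layers and the SAMPLE document are hypothetical names introduced for illustration, and layers that inherit their bounding box from a parent are simply skipped.

```python
import xml.etree.ElementTree as ET

# Namespace used by WMS 1.3.0 Capabilities documents.
WMS_NS = {"wms": "http://www.opengis.net/wms"}


def extract_layers(capabilities_xml: str) -> list[dict]:
    """Return one dict per named layer that carries its own geographic bounding box."""
    root = ET.fromstring(capabilities_xml)
    layers = []
    for layer in root.iterfind(".//wms:Layer", WMS_NS):
        name = layer.findtext("wms:Name", default=None, namespaces=WMS_NS)
        if not name:
            continue  # container layers without a Name cannot be requested
        box = layer.find("wms:EX_GeographicBoundingBox", WMS_NS)
        if box is None:
            continue  # assumption of this sketch: inherited boxes are not resolved
        layers.append({
            "id": name,
            "title": layer.findtext("wms:Title", default=name, namespaces=WMS_NS),
            "abstract": layer.findtext("wms:Abstract", default="", namespaces=WMS_NS),
            "keywords": [
                k.text for k in layer.iterfind("wms:KeywordList/wms:Keyword", WMS_NS) if k.text
            ],
            # The four child elements are named explicitly, so axis order is unambiguous.
            "bbox": {
                "west": float(box.findtext("wms:westBoundLongitude", namespaces=WMS_NS)),
                "south": float(box.findtext("wms:southBoundLatitude", namespaces=WMS_NS)),
                "east": float(box.findtext("wms:eastBoundLongitude", namespaces=WMS_NS)),
                "north": float(box.findtext("wms:northBoundLatitude", namespaces=WMS_NS)),
            },
        })
    return layers


# Made-up sample document, for illustration only.
SAMPLE = """<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer>
        <Name>ortho_2023</Name>
        <Title>Orthophoto 2023</Title>
        <Abstract>Aerial imagery.</Abstract>
        <KeywordList><Keyword>orthoimagery</Keyword></KeywordList>
        <EX_GeographicBoundingBox>
          <westBoundLongitude>5.8</westBoundLongitude>
          <eastBoundLongitude>15.1</eastBoundLongitude>
          <southBoundLatitude>47.2</southBoundLatitude>
          <northBoundLatitude>55.1</northBoundLatitude>
        </EX_GeographicBoundingBox>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""

layers = extract_layers(SAMPLE)
```

Each resulting dict has the 'id', 'title', 'abstract', and 'bbox' keys that a per-layer generator needs, plus the raw keywords for later theme resolution.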

Production-Ready Python Implementation

The following implementation uses rdflib>=6.2.0 to build the RDF graph and export JSON-LD with a minimal DCAT-AP v3 prefix context. It covers the spatial bounding box, language tagging, and the distribution media type for a single layer.

import json
from rdflib import Graph, URIRef, Literal, BNode
from rdflib.namespace import DCAT, DCTERMS, RDF, XSD

# Minimal prefix context: a simplified stand-in for the full DCAT-AP v3 context
DCAT_AP_CONTEXT = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "locn": "http://www.w3.org/ns/locn#",
        "geo": "http://www.opengis.net/ont/geosparql#"
    }
}

def generate_wms_jsonld(layer: dict, service_url: str) -> str:
    """
    Generates DCAT-AP v3 compliant JSON-LD for a single WMS layer.
    Expects layer dict: {'id', 'title', 'abstract', 'bbox': {'west','south','east','north'}}
    """
    g = Graph()
    g.bind("dcat", DCAT)
    g.bind("dct", DCTERMS)
    g.bind("xsd", XSD)

    # 1. DataService node
    svc_uri = URIRef(service_url)
    g.add((svc_uri, RDF.type, DCAT.DataService))
    g.add((svc_uri, DCTERMS.title, Literal("OGC Web Map Service", lang="en")))
    g.add((svc_uri, DCAT.endpointURL, URIRef(service_url)))
    g.add((svc_uri, DCAT.servesDataset, URIRef(f"{service_url}#dataset/{layer['id']}")))

    # 2. Dataset node
    ds_uri = URIRef(f"{service_url}#dataset/{layer['id']}")
    g.add((ds_uri, RDF.type, DCAT.Dataset))
    g.add((ds_uri, DCTERMS.title, Literal(layer['title'], lang="en")))
    g.add((ds_uri, DCTERMS.description, Literal(layer['abstract'], lang="en")))
    # The identifier property is dct:identifier; DCAT defines no dcat:identifier
    g.add((ds_uri, DCTERMS.identifier, Literal(layer['id'])))

    # 3. Spatial bounding box (DCAT-AP v3 compliant)
    bbox_node = BNode()
    # dct:spatial and dct:Location are Dublin Core terms; DCAT defines neither
    g.add((ds_uri, DCTERMS.spatial, bbox_node))
    g.add((bbox_node, RDF.type, DCTERMS.Location))
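    # dcat:bbox takes a geometry literal; a WKT polygon in longitude/latitude order is the usual encoding
    b = layer['bbox']
    bbox_wkt = (
        f"POLYGON(({b['west']} {b['south']}, {b['east']} {b['south']}, "
        f"{b['east']} {b['north']}, {b['west']} {b['north']}, {b['west']} {b['south']}))"
    )
    wkt_literal = URIRef("http://www.opengis.net/ont/geosparql#wktLiteral")
    g.add((bbox_node, DCAT.bbox, Literal(bbox_wkt, datatype=wkt_literal)))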

    # 4. Distribution (GetMap operation)
    dist_uri = URIRef(f"{service_url}#distribution/{layer['id']}")
    g.add((ds_uri, DCAT.distribution, dist_uri))
    g.add((dist_uri, RDF.type, DCAT.Distribution))
    g.add((dist_uri, DCTERMS.title, Literal(f"{layer['title']} - WMS GetMap", lang="en")))
    g.add((dist_uri, DCAT.accessURL, URIRef(service_url)))
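    # dcat:mediaType points at an IANA media type resource rather than a plain string
    g.add((dist_uri, DCAT.mediaType, URIRef("https://www.iana.org/assignments/media-types/image/png")))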
    g.add((dist_uri, DCTERMS.conformsTo, URIRef("http://www.opengis.net/spec/wms/1.3.0")))

    # Serialize to compact JSON-LD. Passing the context makes rdflib emit
    # prefixed keys (dct:title) instead of a bare list of expanded nodes.
    json_ld_raw = g.serialize(
        format="json-ld",
        context=DCAT_AP_CONTEXT["@context"],
        indent=2,
    )

    # Keep every node: the service, dataset, location, and distribution are all
    # part of the output, so nothing is sliced out of "@graph".
    parsed = json.loads(json_ld_raw)
    return json.dumps(parsed, indent=2, ensure_ascii=False)

Key Implementation Details

  • Context Injection: Without an explicit context, rdflib’s JSON-LD serializer emits expanded JSON-LD with full IRIs as keys and no @context. Supplying a DCAT-AP prefix context produces compact keys such as dct:title, which harvesters that expect DCAT-AP prefixes can parse directly.
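  • Spatial Handling: DCAT defines dcat:bbox as a geometry literal on a dct:Location node, typically a WKT polygon typed as geosparql:wktLiteral (GML is the other common encoding). A bare "minX minY maxX maxY" string has no datatype a harvester can interpret, so use it only if your national profile explicitly asks for it.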
  • Language Tags: Tag human-readable literals with lang="en" (or your target language). In multilingual catalogs, untagged strings may be rejected or filed under the wrong language.
  • Conformance URIs: Use the official OGC WMS 1.3.0 specification URI in dct:conformsTo to signal protocol version to automated validators.
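
For profiles that expect a typed geometry literal in dcat:bbox, the four bounds can be turned into a WKT polygon with a few lines of standard-library Python. This is a minimal sketch: bbox_to_wkt is a hypothetical helper, and it assumes the bounds are already in longitude/latitude degrees.

```python
def bbox_to_wkt(west: float, south: float, east: float, north: float) -> str:
    """Build a closed WKT polygon ring in longitude/latitude order."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    # west > east is deliberately allowed: such a box crosses the antimeridian.
    corners = [(west, south), (east, south), (east, north), (west, north), (west, south)]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON(({ring}))"


wkt = bbox_to_wkt(5.8, 47.2, 15.1, 55.1)
```

The range checks catch the most common mistake, which is passing latitude where longitude is expected.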

Validation & Compliance Best Practices

Before publishing to production portals, validate your output against the W3C DCAT v3 Recommendation and the EU DCAT-AP v3 profile. Common validation failures include:

  1. Missing @id on nodes: Every dcat:Dataset and dcat:Distribution must have a resolvable URI or blank node.
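  2. Incorrect coordinate order: Write bounding boxes in longitude/latitude order (west south east north). WMS 1.3.0 lists <BoundingBox> values for EPSG:4326 latitude first, so take the bounds from <EX_GeographicBoundingBox>, whose elements are named explicitly.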
  3. Unresolved themes: dcat:theme must point to a URI in a registered vocabulary (e.g., http://inspire.ec.europa.eu/theme/orth). Free-text keywords will fail strict validation.
  4. Distribution media types: Use IANA-registered MIME types (image/png, image/jpeg, application/vnd.ogc.wms_xml). Avoid generic application/octet-stream.
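
Several of these failures can be caught before a validator ever sees the output. The sketch below covers items 1, 3, and 4 only, and it assumes compact JSON-LD nodes keyed with the dcat: prefix; preflight and as_list are hypothetical helpers, not part of any validator.

```python
def as_list(value):
    """JSON-LD allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def preflight(nodes: list[dict]) -> list[str]:
    """Return human-readable problems found in compact JSON-LD node objects."""
    problems = []
    for i, node in enumerate(nodes):
        types = set(as_list(node.get("@type")))
        label = node.get("@id", f"node #{i}")

        # Item 1: datasets and distributions need an identifier.
        if types & {"dcat:Dataset", "dcat:Distribution"} and "@id" not in node:
            problems.append(f"{label}: missing @id")

        # Item 3: themes must be IRI references, not free-text keywords.
        for theme in as_list(node.get("dcat:theme")):
            iri = theme.get("@id", "") if isinstance(theme, dict) else ""
            if not iri.startswith(("http://", "https://")):
                problems.append(f"{label}: dcat:theme is not a vocabulary URI")

        # Item 4: flag the generic fallback media type.
        for media in as_list(node.get("dcat:mediaType")):
            value = media.get("@id", "") if isinstance(media, dict) else str(media)
            if value.endswith("application/octet-stream"):
                problems.append(f"{label}: generic media type")
    return problems


# Made-up nodes for illustration: the first one has two problems.
nodes = [
    {"@type": "dcat:Dataset", "dcat:theme": "orthoimagery"},
    {
        "@id": "https://example.org/wms#distribution/ortho",
        "@type": "dcat:Distribution",
        "dcat:mediaType": {"@id": "https://www.iana.org/assignments/media-types/image/png"},
    },
]
problems = preflight(nodes)
```

A check like this does not replace validation against the profile; it only shortens the feedback loop for the most frequent mistakes.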

For automated testing, run the generated JSON-LD through a DCAT-AP validator, and separately test the WMS endpoint itself against the OGC Web Map Service (WMS) Implementation Specification compliance checks, since those exercise the service rather than its metadata. Run periodic harvest simulations with tools such as pyld or the json-ld.org playground to verify context resolution and graph flattening.
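
A harvest simulation can start as a plain link check: follow dcat:servesDataset and dcat:distribution and confirm that every target is actually present in the document. The sketch below assumes compact JSON-LD with a top-level @graph and dcat: prefixed keys; dangling_links and ids are hypothetical helpers, and a real harvester would run full JSON-LD processing instead.

```python
def ids(value) -> list[str]:
    """Collect IRIs from a JSON-LD value that may be a string, a dict, or a list."""
    items = value if isinstance(value, list) else [value]
    found = []
    for item in items:
        if isinstance(item, dict) and "@id" in item:
            found.append(item["@id"])
        elif isinstance(item, str):
            found.append(item)
    return found


def dangling_links(doc: dict) -> list[tuple[str, str, str]]:
    """Return (subject, property, target) for links whose target is not in @graph."""
    nodes = {n["@id"]: n for n in doc.get("@graph", []) if "@id" in n}
    missing = []
    for subject, node in nodes.items():
        for prop in ("dcat:servesDataset", "dcat:distribution"):
            if prop not in node:
                continue
            for target in ids(node[prop]):
                if target not in nodes:
                    missing.append((subject, prop, target))
    return missing


# Made-up document for illustration: the distribution node was dropped.
doc = {
    "@graph": [
        {
            "@id": "https://example.org/wms",
            "@type": "dcat:DataService",
            "dcat:servesDataset": {"@id": "https://example.org/wms#dataset/ortho"},
        },
        {
            "@id": "https://example.org/wms#dataset/ortho",
            "@type": "dcat:Dataset",
            "dcat:distribution": {"@id": "https://example.org/wms#distribution/ortho"},
        },
    ]
}
broken = dangling_links(doc)
```

An empty result means a harvester can walk from the service to each dataset and on to its distributions without leaving the document.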

Deployment Considerations

When scaling this pattern across hundreds of layers:

  • Cache GetCapabilities XML and parse it asynchronously to avoid blocking catalog sync jobs.
  • Use dcat:theme mappings stored in a configuration file or SKOS endpoint rather than hardcoding.
  • Attach dct:license and dct:rights at the dataset level to satisfy open-data mandates.
  • Serve the JSON-LD through content negotiation (Accept: application/ld+json), and make the output version explicit, for example with a ?version=3 query parameter on the metadata endpoint.
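
The caching point above can be sketched with asyncio. This is a minimal sketch under stated assumptions: CapabilitiesCache is a hypothetical class, the fetch coroutine is injected by the caller (any async HTTP client would do), and fake_fetch only stands in for a real request.

```python
import asyncio
import time
from typing import Awaitable, Callable


class CapabilitiesCache:
    """Cache GetCapabilities documents per service URL for a fixed time-to-live."""

    def __init__(self, fetch: Callable[[str], Awaitable[str]], ttl_seconds: float = 3600.0):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[float, str]] = {}
        self._locks: dict[str, asyncio.Lock] = {}

    async def get(self, url: str) -> str:
        lock = self._locks.setdefault(url, asyncio.Lock())
        async with lock:  # concurrent callers for one URL share a single fetch
            cached = self._entries.get(url)
            if cached and time.monotonic() - cached[0] < self._ttl:
                return cached[1]
            xml = await self._fetch(url)
            self._entries[url] = (time.monotonic(), xml)
            return xml


async def demo() -> int:
    """Request the same URL five times concurrently and count real fetches."""
    calls = 0

    async def fake_fetch(url: str) -> str:  # stand-in for a real HTTP request
        nonlocal calls
        calls += 1
        await asyncio.sleep(0)
        return "<WMS_Capabilities/>"

    cache = CapabilitiesCache(fake_fetch, ttl_seconds=60)
    await asyncio.gather(*(cache.get("https://example.org/wms") for _ in range(5)))
    return calls


fetch_count = asyncio.run(demo())
```

The per-URL lock is the design choice that matters here: without it, a sync job that touches many layers of one service would issue one GetCapabilities request per layer before the first response is cached.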

This approach delivers machine-readable, portal-ready metadata that survives strict ingestion pipelines while maintaining full alignment with European spatial data interoperability standards.