Syncing PostgreSQL PostGIS Layers with GeoServer via Python
TL;DR: Call GET /rest/workspaces/{ws}, then POST or PUT to register a PostGIS datastore, then POST /rest/workspaces/{ws}/datastores/{store}/featuretypes with nativeCRS, srs, and projectionPolicy: REPROJECT_TO_DECLARED. Wrap the entire flow in environment-variable-driven Python and the result is an idempotent sync script that runs cleanly on every CI/CD deployment without duplicating resources or leaking connection pools.
The Core Challenge
GeoServer’s REST API for PostGIS datasources uses an uncommon JSON format — connection parameters are not a plain object but an entry-list of {"@key": "...", "$": "..."} pairs. Send a standard dict and GeoServer silently accepts the request with a 201 but never actually connects to PostgreSQL. There is no validation error; the datastore appears registered yet no features are visible.
A second trap is CRS handling. PostGIS stores each table’s coordinate system as an integer SRID in geometry_columns. GeoServer must receive that SRID as a declared srs field AND a matching nativeCRS field in the feature type payload. If either is absent or contradicts the stored SRID, GeoServer either refuses to publish or silently mis-declares the bounding box — a problem that only surfaces when a WMS client requests tiles and receives spatially offset results. The SRS and Coordinate Reference System Handling guide explains on-the-fly reprojection and why a correctly declared projectionPolicy is the reliable fix.
The diagram below shows the three-phase call sequence the sync script follows:
Prerequisites & Compatibility
Verify these constraints before running the sync to prevent silent connection drops, JDBC driver mismatches, or OGC non-compliance:
| Component | Minimum Version | Critical Notes |
|---|---|---|
| Python | 3.10+ | Requires requests and urllib3 |
| GeoServer | 2.20.0+ | REST API v2+ natively supports JSON payloads |
| PostGIS | 3.0+ | geometry_columns and spatial_ref_sys must be populated |
| JDBC Driver | postgresql-42.5.0.jar+ |
Must reside in WEB-INF/lib or data_dir/lib |
| OGC Services | WMS 1.3.0 / WFS 2.0 | Validate via GetCapabilities post-sync |
GeoServer’s REST interface requires exact parameter casing. The host, port, database, user, and passwd keys must align with the JDBC URL schema. Mismatched SRS codes between the declared srs and the stored SRID will trigger on-the-fly reprojection overhead during every WMS tile request. Declare both nativeCRS and srs explicitly when publishing to avoid this penalty.
Production-Ready Code
The script below is a self-contained, idempotent PostGIS-to-GeoServer sync routine. It reads all sensitive values from environment variables, uses requests for HTTP transport, and never duplicates resources on repeated runs:
import os
import requests
import logging
from requests.auth import HTTPBasicAuth
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
logger = logging.getLogger("postgis_sync")
# Inject via CI/CD secret manager or .env — never hard-code credentials
GEOSERVER_URL = os.getenv("GEOSERVER_URL", "http://localhost:8080/geoserver")
AUTH = HTTPBasicAuth(
os.getenv("GEOSERVER_USER", "admin"),
os.getenv("GEOSERVER_PASS", "geoserver"),
)
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}
def _get(endpoint: str) -> requests.Response:
"""Raw GET — does NOT raise on 404; callers inspect status_code."""
return requests.get(f"{GEOSERVER_URL}{endpoint}", auth=AUTH, headers=HEADERS)
def _post(endpoint: str, payload: dict) -> requests.Response:
resp = requests.post(
f"{GEOSERVER_URL}{endpoint}", auth=AUTH, headers=HEADERS, json=payload
)
resp.raise_for_status()
return resp
def _put(endpoint: str, payload: dict) -> requests.Response:
resp = requests.put(
f"{GEOSERVER_URL}{endpoint}", auth=AUTH, headers=HEADERS, json=payload
)
resp.raise_for_status()
return resp
def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
"""Pre-flight connectivity check — prevents partial state commits."""
import socket
try:
with socket.create_connection((host, port), timeout=timeout):
return True
except OSError:
return False
def ensure_workspace(ws: str) -> None:
"""Create workspace if it does not already exist."""
resp = _get(f"/rest/workspaces/{ws}")
if resp.status_code == 404:
_post("/rest/workspaces", {"workspace": {"name": ws}})
logger.info("Created workspace: %s", ws)
elif resp.status_code == 200:
logger.info("Workspace '%s' already exists.", ws)
else:
resp.raise_for_status()
def sync_postgis_store(ws: str, db_config: dict) -> None:
"""Register or update a PostGIS datastore idempotently.
GeoServer REST requires connection parameters as an entry-list of
{"@key": "...", "$": "..."} pairs — plain dicts are silently ignored.
"""
store_name = f"{ws}_postgis"
store_payload = {
"dataStore": {
"name": store_name,
"type": "PostGIS",
"enabled": True,
"connectionParameters": {
"entry": [
{"@key": "dbtype", "$": "postgis"},
{"@key": "host", "$": db_config["host"]},
{"@key": "port", "$": str(db_config["port"])},
{"@key": "database", "$": db_config["database"]},
{"@key": "user", "$": db_config["user"]},
{"@key": "passwd", "$": db_config["password"]},
{"@key": "schema", "$": db_config.get("schema", "public")},
{"@key": "Expose primary keys", "$": "true"},
]
},
}
}
check = _get(f"/rest/workspaces/{ws}/datastores/{store_name}")
if check.status_code == 404:
_post(f"/rest/workspaces/{ws}/datastores", store_payload)
logger.info("Created PostGIS datastore: %s", store_name)
else:
check.raise_for_status()
_put(f"/rest/workspaces/{ws}/datastores/{store_name}", store_payload)
logger.info("Updated PostGIS datastore: %s", store_name)
def publish_layer(ws: str, table: str, srs: str) -> None:
"""Publish a feature type with explicit CRS enforcement.
projectionPolicy REPROJECT_TO_DECLARED treats the declared SRS as
authoritative, eliminating bbox misalignment in WMS tile responses.
"""
store_name = f"{ws}_postgis"
ft_endpoint = f"/rest/workspaces/{ws}/datastores/{store_name}/featuretypes"
layer_payload = {
"featureType": {
"name": table,
"nativeCRS": srs, # must match geometry_columns SRID
"srs": srs, # OGC-advertised CRS in GetCapabilities
"projectionPolicy": "REPROJECT_TO_DECLARED",
"enabled": True,
}
}
check = _get(f"{ft_endpoint}/{table}")
if check.status_code == 404:
_post(ft_endpoint, layer_payload)
logger.info("Published layer: %s", table)
else:
check.raise_for_status()
_put(f"{ft_endpoint}/{table}", layer_payload)
logger.info("Updated layer metadata: %s", table)
def main() -> None:
db_config = {
"host": os.getenv("PG_HOST", "localhost"),
"port": os.getenv("PG_PORT", "5432"),
"database": os.getenv("PG_DB", "spatial_db"),
"user": os.getenv("PG_USER", "gis_user"),
"password": os.getenv("PG_PASS", "secure_password"),
"schema": os.getenv("PG_SCHEMA", "public"),
}
workspace = os.getenv("WORKSPACE", "agency_spatial")
table_name = os.getenv("TABLE_NAME", "parcels_2024")
srs = os.getenv("SRS", "EPSG:4326")
pg_host = db_config["host"]
pg_port = int(db_config["port"])
if not check_tcp(pg_host, pg_port):
raise RuntimeError(
f"Cannot reach PostgreSQL at {pg_host}:{pg_port} — "
"aborting to prevent partial state commit."
)
logger.info("Starting PostGIS → GeoServer sync...")
ensure_workspace(workspace)
sync_postgis_store(workspace, db_config)
publish_layer(workspace, table_name, srs)
logger.info("Sync complete.")
if __name__ == "__main__":
main()
Step-by-Step Walkthrough
Pre-flight TCP check (check_tcp). Before issuing any REST calls, the script opens a raw TCP socket to the PostgreSQL host and port. If the connection fails, the script raises immediately rather than registering a datastore that points at an unreachable database. Without this guard, the workspace and datastore may be created successfully while the underlying PostGIS instance is offline — leaving GeoServer in a partially configured state that is difficult to diagnose from logs alone.
_get does not call raise_for_status. The private _get helper intentionally returns the raw response without raising. If it raised on 404, the first check in ensure_workspace would throw an exception before the create branch could execute. Only _post and _put call raise_for_status, because any non-2xx response from a write operation is always an unrecoverable error.
Entry-list format in sync_postgis_store. The connectionParameters.entry list uses @key/$ pairs — GeoServer’s documented JSON encoding for Java property maps. This is not standard JSON convention; it mirrors the XML attribute syntax the REST API originally used. Sending a plain {"host": "localhost"} object is accepted without error but never applied.
nativeCRS vs srs in publish_layer. nativeCRS tells GeoServer the coordinate system actually stored in PostGIS (the value from geometry_columns.srid). srs is the system GeoServer advertises in WMS GetCapabilities responses and uses when clients specify a requested CRS. Setting projectionPolicy to REPROJECT_TO_DECLARED makes the declared srs authoritative: GeoServer reprojects incoming features on read if nativeCRS and srs differ, rather than failing silently or exposing misaligned bounding boxes to downstream WMS clients.
Idempotency. Every write path is preceded by a _get check. A 404 triggers POST (create); a 200 triggers PUT (update). Running the script twice against an already-configured GeoServer updates configuration without duplicating datastores or creating duplicate connection pools.
Verification
After the script completes, query the WMS capabilities endpoint to confirm the layer is published:
curl -s -u admin:geoserver \
"http://localhost:8080/geoserver/wms?service=WMS&version=1.3.0&request=GetCapabilities" \
| grep -A5 "parcels_2024"
Expected output (abbreviated):
<Layer queryable="1">
<Name>agency_spatial:parcels_2024</Name>
<Title>parcels_2024</Title>
<CRS>EPSG:4326</CRS>
<BoundingBox CRS="EPSG:4326" minx="-180.0" miny="-90.0" maxx="180.0" maxy="90.0"/>
</Layer>
If the layer is absent, check the GeoServer logs at data_dir/logs/geoserver.log. A JDBC driver error appears as org.postgresql.Driver not found; a missing schema appears as relation "public.parcels_2024" does not exist.
Gotchas & Edge Cases
JDBC driver placement. GeoServer loads the PostgreSQL JDBC driver at startup. If postgresql-42.x.x.jar is absent from WEB-INF/lib or data_dir/lib, the datastore registration call returns 201 but GeoServer cannot open any connections. Place the jar and restart the service — a hot reload is not sufficient.
Non-public schemas. By default GeoServer’s PostGIS datastore scans only the public schema. Pass "schema": "parcels_schema" in db_config and confirm the running PostgreSQL user has USAGE privilege on that schema: GRANT USAGE ON SCHEMA parcels_schema TO gis_user;
SRS mismatch without error. If nativeCRS does not match the actual SRID in geometry_columns, GeoServer may accept the payload and generate a WMS layer whose bounding box is expressed in the wrong coordinate space. The symptom is WMS tiles that render in the wrong location. Always cross-check the declared SRS against the PostGIS SRID: SELECT srid FROM geometry_columns WHERE f_table_name = 'parcels_2024';
WFS transaction differences. This workflow publishes layers for read-only WMS access. If you need transactional write access via WFS-T operations, enable the wfs service on the workspace and confirm the PostGIS user has INSERT, UPDATE, and DELETE privileges on the target table.
Back to Environment Parity for Spatial Servers
Related
- Automating GeoServer with the Python REST API — the full REST API reference for workspaces, stores, and styles
- Python Script to Auto-Publish Shapefiles to a GeoServer Workspace — the same idempotent pattern applied to file-backed datastores
- SRS and Coordinate Reference System Handling — on-the-fly reprojection mechanics and how
projectionPolicyinteracts with WMS axis order - WFS Transactional Operations Deep-Dive — enabling write access on published PostGIS layers via WFS-T
- Layer Publishing Workflows in Python — broader pipeline patterns for SLD style deployment and batch layer registration