SchemaScout

SchemaScout is a web app (with an optional API) that continuously monitors public open-data endpoints and files (CKAN, Socrata, ArcGIS, CSV/JSON URLs) and alerts users when datasets change in ways that silently break dashboards, ETL jobs, or research. It snapshots schemas, column types, code lists, row counts, missingness, and basic distribution stats, then compares successive snapshots to detect breaking changes (renamed fields, type flips, sudden spikes in nulls, shifted geographies, changed date formats). Users can subscribe to datasets, set “breakage thresholds,” and receive human-readable change reports plus machine-readable diffs for pipelines. It’s not a data catalog; it’s a reliability layer for people who already depend on open data but get burned by unannounced updates. Realistically, the product lives or dies on trust, clear diffs, and low-noise alerts, not flashy visuals.
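To make the snapshot-and-compare idea concrete, here is a minimal Python sketch of how two snapshots of one dataset might be diffed. Everything in it is an assumption for illustration: the names `ColumnSnapshot`, `DatasetSnapshot`, and `diff_snapshots`, the fields they carry, and the default `null_spike_threshold` of 0.10 are hypothetical, not part of any described SchemaScout design. It covers only removed or added columns, type flips, and null spikes; a rename shows up as one removal plus one addition.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ColumnSnapshot:
    """Hypothetical per-column summary captured at snapshot time."""
    dtype: str            # e.g. "integer", "string", "date"
    null_fraction: float  # share of missing values, 0.0 to 1.0


@dataclass(frozen=True)
class DatasetSnapshot:
    """Hypothetical snapshot of one dataset version."""
    row_count: int
    columns: Dict[str, ColumnSnapshot] = field(default_factory=dict)


def diff_snapshots(old: DatasetSnapshot, new: DatasetSnapshot,
                   null_spike_threshold: float = 0.10) -> List[dict]:
    """Return a machine-readable list of changes between two snapshots."""
    changes: List[dict] = []
    old_cols, new_cols = old.columns, new.columns

    # Columns present before but missing now (also one half of a rename).
    for name in sorted(old_cols.keys() - new_cols.keys()):
        changes.append({"kind": "column_removed", "column": name})

    # Columns that appear only in the new snapshot.
    for name in sorted(new_cols.keys() - old_cols.keys()):
        changes.append({"kind": "column_added", "column": name})

    # Columns present in both: compare type and missingness.
    for name in sorted(old_cols.keys() & new_cols.keys()):
        before, after = old_cols[name], new_cols[name]
        if before.dtype != after.dtype:
            changes.append({"kind": "type_changed", "column": name,
                            "from": before.dtype, "to": after.dtype})
        if after.null_fraction - before.null_fraction > null_spike_threshold:
            changes.append({"kind": "null_spike", "column": name,
                            "from": before.null_fraction,
                            "to": after.null_fraction})
    return changes


# Example: "opened" is renamed to "opened_at" and "zip" flips to string.
old = DatasetSnapshot(1000, {"zip": ColumnSnapshot("integer", 0.01),
                             "opened": ColumnSnapshot("date", 0.0)})
new = DatasetSnapshot(1020, {"zip": ColumnSnapshot("string", 0.01),
                             "opened_at": ColumnSnapshot("date", 0.0)})
changes = diff_snapshots(old, new)
```

In this sketch a user's "breakage threshold" would map onto parameters such as `null_spike_threshold`, and the returned list of dictionaries is the kind of machine-readable diff a pipeline could consume, with the human-readable report rendered from the same data.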
