Case in point:
Overture Maps discovery
We were asked to run a data quality scan on the Overture Maps dataset, a large geospatial dataset backed by Meta and Microsoft. Here’s what Aspen uncovered.
What Aspen found at a glance:
Places Lost at Sea
Overture’s “Places” dataset had 167,021 records located in the ocean. That’s 0.25% of all entries. Planning tools, search APIs, and spatial analytics models all degrade when they consume data containing these misplaced points.
Duplicate Addresses Everywhere
Aspen ran a deduplication sweep across Overture’s “Addresses” dataset and flagged 7,121,175 duplicates, 1.66% of a 429M+ record dataset. Duplicate addresses skew analytics and location-based AI insights, and they increase processing and storage costs.
Even New York City, where data is about as clean as it gets, was filled with duplicates.
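To make the idea concrete, here is a minimal sketch of a duplicate-address check. It is an illustration only, not Aspen's actual pipeline: the field names (`id`, `number`, `street`, `postcode`) and the normalization rules are assumptions chosen for the example, and a production sweep would also need to handle abbreviations, units, and coordinate proximity.

```python
from collections import defaultdict

def normalize(record):
    """Build a comparison key: lowercase text and collapse whitespace.

    Field names are hypothetical and chosen for illustration only.
    """
    number = record["number"].strip().lower()
    street = " ".join(record["street"].lower().split())
    postcode = record["postcode"].replace(" ", "").lower()
    return (number, street, postcode)

def find_duplicates(records):
    """Return lists of record ids that share the same normalized key."""
    groups = defaultdict(list)
    for rec in records:
        groups[normalize(rec)].append(rec["id"])
    # Only keys seen more than once count as duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]

records = [
    {"id": "a1", "number": "350", "street": "5th  Ave", "postcode": "10118"},
    {"id": "a2", "number": "350", "street": "5th ave", "postcode": "10118"},
    {"id": "a3", "number": "11", "street": "Wall St", "postcode": "10005"},
]
duplicate_groups = find_duplicates(records)
print(duplicate_groups)  # [['a1', 'a2']]
```

Grouping by a normalized key keeps the check linear in the number of records, which matters at the scale of hundreds of millions of rows.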
Road Connectors That Connect to Nothing
We also found 439,522 “floating connectors” in Overture’s “Transportation” dataset, about 0.18% of total records. These are road nodes that don’t actually link to any road segment. Disconnected connectors confuse navigation engines and cause inaccurate location-based insights from AI models.
Aspen GIS flagged thousands of orphan connectors in NYC.
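A floating-connector check can be sketched as a set difference: collect every connector id that some segment references, then report the connectors that never appear. This is a simplified illustration, not Aspen's implementation. The `connector_ids` field on each segment is an assumed, flattened stand-in for however the real schema stores segment-to-connector references.

```python
def find_floating_connectors(connector_ids, segments):
    """Return connector ids that no segment references.

    `segments` is a list of dicts with a hypothetical `connector_ids` field.
    """
    referenced = set()
    for seg in segments:
        referenced.update(seg["connector_ids"])
    # Connectors present in the dataset but absent from every segment.
    return sorted(set(connector_ids) - referenced)

connector_ids = ["c1", "c2", "c3", "c4"]
segments = [
    {"id": "s1", "connector_ids": ["c1", "c2"]},
    {"id": "s2", "connector_ids": ["c2", "c3"]},
]
floating = find_floating_connectors(connector_ids, segments)
print(floating)  # ['c4']
```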
Technical Breakdown
Joined spatial data at scale using S2 indexing.
Filtered using true geospatial containment, not just bounding boxes.
Exported actionable reports in minutes.
All checks were run on Overture drops from March 19 and April 23, 2025.
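The difference between a bounding-box filter and true containment, mentioned in the breakdown above, can be shown with a small example. This sketch uses a plain ray-casting test on flat x/y coordinates rather than S2 or geodesic geometry, so it is a stand-in for the concept and not the method Aspen used. The L-shaped polygon and the test points are made up for illustration.

```python
def in_bounding_box(point, polygon):
    """Cheap test: is the point inside the polygon's bounding rectangle?"""
    x, y = point
    xs = [px for px, _ in polygon]
    ys = [py for _, py in polygon]
    return min(xs) <= x <= max(xs) and min(ys) <= y <= max(ys)

def in_polygon(point, polygon):
    """Ray casting: count how many edges a ray in the +x direction crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical L-shaped "land" area; the notch at the upper right is "water".
land = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
point = (3, 3)

print(in_bounding_box(point, land))  # True: the box alone says "on land"
print(in_polygon(point, land))       # False: containment says "in the water"
```

A bounding box is still useful as a fast pre-filter, but only the containment test separates points in the notch from points on land.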


