7. Spatial Queries and Relationships#
7.1. Introduction#
7.2. Learning Objectives#
7.3. Sample Datasets#
7.4. Installation and Setup#
# %pip install duckdb leafmap
7.4.1. Library Import and Configuration#
import duckdb
import leafmap
print(f"DuckDB version: {duckdb.__version__}")
7.4.2. Downloading the Sample Database#
url = "https://data.gishub.org/duckdb/nyc_data.db.zip"
leafmap.download_file(url, unzip=True)
7.5. Connecting to DuckDB and Loading Extensions#
con = duckdb.connect("nyc_data.db")
7.5.1. Installing and Loading the Spatial Extension#
con.install_extension("spatial")
con.load_extension("spatial")
7.5.2. Exploring the Database Structure#
con.sql("SHOW TABLES;")
con.sql("SELECT * from nyc_subway_stations LIMIT 5")
7.6. Understanding Spatial Relationships#
7.7. Testing Geometric Identity#
7.7.1. Finding an Exact Match#
con.sql(
"""
SELECT NAME, geom, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)
con.sql(
"""
SELECT name
FROM nyc_subway_stations
WHERE ST_Equals(geom, ST_GeomFromText('POINT (583571.9059213118 4506714.341192182)'));
"""
)
7.7.2. Practical Considerations for ST_Equals#
7.8. Topological Relationships#
7.8.1. Testing Spatial Intersection#
7.8.2. Testing Spatial Separation#
7.8.3. Testing Containment#
7.8.4. When to Use ST_Contains vs ST_Within#
7.8.5. Important Distinction from ST_Intersects#
7.8.6. Testing Boundary Contact#
7.8.7. Understanding Boundary vs Interior Contact#
7.8.8. When to Use ST_Touches#
7.8.9. ST_Touches vs ST_Intersects#
7.8.10. Detecting Linear Crossings#
7.8.11. Testing Partial Overlap#
7.8.12. Practical Applications with NYC Data#
con.sql(
"""
SELECT name, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)
con.sql(
"""
SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(geom, ST_GeomFromText('POINT(583571 4506714)'));
"""
)
7.9. Distance-Based Relationships#
7.9.1. Understanding Distance Calculation#
7.9.2. Basic Distance Example#
con.sql(
"""
SELECT ST_Distance(
ST_GeomFromText('POINT(0 5)'),
ST_GeomFromText('LINESTRING(-2 2, 2 2)')) as distance;
"""
)
7.9.3. Distance with Real Data#
con.sql(
"""
SELECT
a.name AS from_station,
b.name AS to_station,
ROUND(ST_Distance(a.geom, b.geom)) AS distance_meters
FROM nyc_subway_stations a, nyc_subway_stations b
WHERE a.name = 'Broad St'
AND b.name IN ('Wall St', 'Bowling Green', 'Fulton St')
ORDER BY distance_meters;
"""
)
7.9.4. Critical: Coordinate System Units Matter#
7.9.5. When to Use ST_Distance#
7.10. Filtering by Proximity Threshold#
7.10.1. Finding Features Within a Distance#
con.sql("SELECT name FROM nyc_streets WHERE name IS NOT NULL LIMIT 5")
con.sql(
"""
SELECT name
FROM nyc_streets
WHERE ST_DWithin(
geom,
ST_GeomFromText('POINT(583571 4506714)'),
10
)
AND name IS NOT NULL
ORDER BY name;
"""
)
7.10.2. When to Use ST_DWithin#
7.11. Nearest Neighbor Queries#
con.sql(
"""
SELECT name,
ST_Distance(geom, ST_GeomFromText('POINT(583571 4506714)')) AS distance_m
FROM nyc_subway_stations
ORDER BY distance_m
LIMIT 3;
"""
)