Introduction¶
Learning Objectives¶
Sample Datasets¶
Installation and Setup¶
# %pip install duckdb leafmapLibrary Import and Configuration¶
import duckdb
import leafmapprint(f"DuckDB version: {duckdb.__version__}")Downloading the Sample Database¶
url = "https://data.gishub.org/duckdb/nyc_data.db.zip"
leafmap.download_file(url, unzip=True)Connecting to DuckDB and Loading Extensions¶
con = duckdb.connect("nyc_data.db")Installing and Loading the Spatial Extension¶
con.install_extension("spatial")
con.load_extension("spatial")Exploring the Database Structure¶
con.sql("SHOW TABLES;")con.sql("SELECT * from nyc_subway_stations LIMIT 5")Understanding Spatial Relationships¶
Testing Geometric Identity¶
Finding an Exact Match¶
con.sql(
"""
SELECT NAME, geom, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)con.sql(
"""
SELECT name
FROM nyc_subway_stations
WHERE ST_Equals(geom, ST_GeomFromText('POINT (583571.9059213118 4506714.341192182)'));
"""
)Practical Considerations for ST_Equals¶
Topological Relationships¶
Testing Spatial Intersection¶
Testing Spatial Separation¶
Testing Containment¶
When to Use ST_Contains vs ST_Within¶
Important Distinction from ST_Intersects¶
Testing Boundary Contact¶
Understanding Boundary vs Interior Contact¶
When to Use ST_Touches¶
ST_Touches vs ST_Intersects¶
Detecting Linear Crossings¶
Testing Partial Overlap¶
Practical Applications with NYC Data¶
con.sql(
"""
SELECT name, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)con.sql(
"""
SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(geom, ST_GeomFromText('POINT(583571 4506714)'));
"""
)Distance-Based Relationships¶
Understanding Distance Calculation¶
Basic Distance Example¶
con.sql(
"""
SELECT ST_Distance(
ST_GeomFromText('POINT(0 5)'),
ST_GeomFromText('LINESTRING(-2 2, 2 2)')) as distance;
"""
)Distance with Real Data¶
con.sql(
"""
SELECT
a.name AS from_station,
b.name AS to_station,
ROUND(ST_Distance(a.geom, b.geom)) AS distance_meters
FROM nyc_subway_stations a, nyc_subway_stations b
WHERE a.name = 'Broad St'
AND b.name IN ('Wall St', 'Bowling Green', 'Fulton St')
ORDER BY distance_meters;
"""
)Critical: Coordinate System Units Matter¶
When to Use ST_Distance¶
Filtering by Proximity Threshold¶
Finding Features Within a Distance¶
con.sql("SELECT name FROM nyc_streets WHERE name IS NOT NULL LIMIT 5")con.sql(
"""
SELECT name
FROM nyc_streets
WHERE ST_DWithin(
geom,
ST_GeomFromText('POINT(583571 4506714)'),
10
)
AND name IS NOT NULL
ORDER BY name;
"""
)When to Use ST_DWithin¶
Nearest Neighbor Queries¶
con.sql(
"""
SELECT name,
ST_Distance(geom, ST_GeomFromText('POINT(583571 4506714)')) AS distance_m
FROM nyc_subway_stations
ORDER BY distance_m
LIMIT 3;
"""
)Key Takeaways¶
Exercises¶
Exercise 1: Testing Geometric Equality¶
Exercise 2: Point-in-Polygon Analysis¶
Exercise 3: Containment vs. Intersection¶
Exercise 4: Distance Calculations¶
Exercise 5: Proximity Filtering¶
Exercise 6: Combining Multiple Spatial Criteria¶