Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Spatial Queries and Relationships

Introduction

Learning Objectives

Sample Datasets

Installation and Setup

# %pip install duckdb leafmap

Library Import and Configuration

import duckdb
import leafmap
print(f"DuckDB version: {duckdb.__version__}")

Downloading the Sample Database

url = "https://data.gishub.org/duckdb/nyc_data.db.zip"
leafmap.download_file(url, unzip=True)

Connecting to DuckDB and Loading Extensions

con = duckdb.connect("nyc_data.db")

Installing and Loading the Spatial Extension

con.install_extension("spatial")
con.load_extension("spatial")

Exploring the Database Structure

con.sql("SHOW TABLES;")
con.sql("SELECT * from nyc_subway_stations LIMIT 5")

Understanding Spatial Relationships

Testing Geometric Identity

Finding an Exact Match

con.sql(
    """
SELECT NAME, geom, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)
con.sql(
    """
SELECT name
FROM nyc_subway_stations
WHERE ST_Equals(geom, ST_GeomFromText('POINT (583571.9059213118 4506714.341192182)'));
"""
)

Practical Considerations for ST_Equals

Topological Relationships

Testing Spatial Intersection

Testing Spatial Separation

Testing Containment

When to Use ST_Contains vs ST_Within

Important Distinction from ST_Intersects

Testing Boundary Contact

Understanding Boundary vs Interior Contact

When to Use ST_Touches

ST_Touches vs ST_Intersects

Detecting Linear Crossings

Testing Partial Overlap

Practical Applications with NYC Data

con.sql(
    """
SELECT name, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)
con.sql(
    """
SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(geom, ST_GeomFromText('POINT(583571 4506714)'));
"""
)

Distance-Based Relationships

Understanding Distance Calculation

Basic Distance Example

con.sql(
    """
SELECT ST_Distance(
  ST_GeomFromText('POINT(0 5)'),
  ST_GeomFromText('LINESTRING(-2 2, 2 2)')) as distance;
"""
)

Distance with Real Data

con.sql(
    """
SELECT
  a.name AS from_station,
  b.name AS to_station,
  ROUND(ST_Distance(a.geom, b.geom)) AS distance_meters
FROM nyc_subway_stations a, nyc_subway_stations b
WHERE a.name = 'Broad St'
  AND b.name IN ('Wall St', 'Bowling Green', 'Fulton St')
ORDER BY distance_meters;
"""
)

Critical: Coordinate System Units Matter

When to Use ST_Distance

Filtering by Proximity Threshold

Finding Features Within a Distance

con.sql("SELECT name FROM nyc_streets WHERE name IS NOT NULL LIMIT 5")
con.sql(
    """
SELECT name
FROM nyc_streets
WHERE ST_DWithin(
        geom,
        ST_GeomFromText('POINT(583571 4506714)'),
        10
      )
AND name IS NOT NULL
ORDER BY name;
"""
)

When to Use ST_DWithin

Nearest Neighbor Queries

con.sql(
    """
SELECT name,
       ST_Distance(geom, ST_GeomFromText('POINT(583571 4506714)')) AS distance_m
FROM nyc_subway_stations
ORDER BY distance_m
LIMIT 3;
"""
)

Key Takeaways

Exercises

Exercise 1: Testing Geometric Equality

Exercise 2: Point-in-Polygon Analysis

Exercise 3: Containment vs. Intersection

Exercise 4: Distance Calculations

Exercise 5: Proximity Filtering

Exercise 6: Combining Multiple Spatial Criteria