Spatial Queries and Relationships

Contents

7. Spatial Queries and Relationships#

7.1. Introduction#

7.2. Learning Objectives#

7.3. Sample Datasets#

7.4. Installation and Setup#

# %pip install duckdb leafmap

7.4.1. Library Import and Configuration#

import duckdb
import leafmap
print(f"DuckDB version: {duckdb.__version__}")

7.4.2. Downloading the Sample Database#

url = "https://data.gishub.org/duckdb/nyc_data.db.zip"
leafmap.download_file(url, unzip=True)

7.5. Connecting to DuckDB and Loading Extensions#

con = duckdb.connect("nyc_data.db")

7.5.1. Installing and Loading the Spatial Extension#

con.install_extension("spatial")
con.load_extension("spatial")

7.5.2. Exploring the Database Structure#

con.sql("SHOW TABLES;")
con.sql("SELECT * from nyc_subway_stations LIMIT 5")

7.6. Understanding Spatial Relationships#

7.7. Testing Geometric Identity#

7.7.1. Finding an Exact Match#

con.sql(
    """
SELECT NAME, geom, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)
con.sql(
    """
SELECT name
FROM nyc_subway_stations
WHERE ST_Equals(geom, ST_GeomFromText('POINT (583571.9059213118 4506714.341192182)'));
"""
)

7.7.2. Practical Considerations for ST_Equals#

7.8. Topological Relationships#

7.8.1. Testing Spatial Intersection#

7.8.2. Testing Spatial Separation#

7.8.3. Testing Containment#

7.8.4. When to Use ST_Contains vs ST_Within#

7.8.5. Important Distinction from ST_Intersects#

7.8.6. Testing Boundary Contact#

7.8.7. Understanding Boundary vs Interior Contact#

7.8.8. When to Use ST_Touches#

7.8.9. ST_Touches vs ST_Intersects#

7.8.10. Detecting Linear Crossings#

7.8.11. Testing Partial Overlap#

7.8.12. Practical Applications with NYC Data#

con.sql(
    """
SELECT name, ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
"""
)
con.sql(
    """
SELECT name, boroname
FROM nyc_neighborhoods
WHERE ST_Intersects(geom, ST_GeomFromText('POINT(583571 4506714)'));
"""
)

7.9. Distance-Based Relationships#

7.9.1. Understanding Distance Calculation#

7.9.2. Basic Distance Example#

con.sql(
    """
SELECT ST_Distance(
  ST_GeomFromText('POINT(0 5)'),
  ST_GeomFromText('LINESTRING(-2 2, 2 2)')) as distance;
"""
)

7.9.3. Distance with Real Data#

con.sql(
    """
SELECT
  a.name AS from_station,
  b.name AS to_station,
  ROUND(ST_Distance(a.geom, b.geom)) AS distance_meters
FROM nyc_subway_stations a, nyc_subway_stations b
WHERE a.name = 'Broad St'
  AND b.name IN ('Wall St', 'Bowling Green', 'Fulton St')
ORDER BY distance_meters;
"""
)

7.9.4. Critical: Coordinate System Units Matter#

7.9.5. When to Use ST_Distance#

7.10. Filtering by Proximity Threshold#

7.10.1. Finding Features Within a Distance#

con.sql("SELECT name FROM nyc_streets WHERE name IS NOT NULL LIMIT 5")
con.sql(
    """
SELECT name
FROM nyc_streets
WHERE ST_DWithin(
        geom,
        ST_GeomFromText('POINT(583571 4506714)'),
        10
      )
AND name IS NOT NULL
ORDER BY name;
"""
)

7.10.2. When to Use ST_DWithin#

7.11. Nearest Neighbor Queries#

con.sql(
    """
SELECT name,
       ST_Distance(geom, ST_GeomFromText('POINT(583571 4506714)')) AS distance_m
FROM nyc_subway_stations
ORDER BY distance_m
LIMIT 3;
"""
)

7.12. Key Takeaways#

7.13. Exercises#

7.13.1. Exercise 1: Testing Geometric Equality#

7.13.2. Exercise 2: Point-in-Polygon Analysis#

7.13.3. Exercise 3: Containment vs. Intersection#

7.13.4. Exercise 4: Distance Calculations#

7.13.5. Exercise 5: Proximity Filtering#

7.13.6. Exercise 6: Combining Multiple Spatial Criteria#