Getting Started with Geospatial Rust

// 2026-05-07 · Updated 2026-05-11 · 10 min read

If you’ve ever looked at a satellite image of farmland, grasslands, forests, or cities, you’ve seen remote sensing data. Each pixel in those images contains measured values — stored as digital numbers (DNs) in raw data, or as surface reflectance in analysis-ready products — across different spectral bands: blue, green, red, near-infrared, and more. These bands tell us about vegetation health, water content, urban development, and land cover.

This post introduces the core concepts of geospatial raster processing for Rust programmers who’ve never worked with satellite imagery. We’ll cover what the satellites measure, why their data looks the way it does, and what you can actually do with it.

Already know remote sensing? Jump ahead to End-to-End Geospatial Processing with EORST — a code-heavy walkthrough building a complete pipeline.

Why Satellite Imagery?

Earth observation satellites carry sensors that measure sunlight reflected from Earth’s surface. Unlike a regular camera, these sensors don’t just capture red, green, and blue — they capture many more “bands” across the electromagnetic spectrum. Each band measures a different wavelength of light.

These bands reveal things our eyes can’t see:

By combining these bands, we can map vegetation health, detect water stress, identify burn scars, classify land cover, and track changes over time.

The Two Main Satellite Platforms

Two satellite programs provide the backbone of open Earth observation data:

Landsat (NASA)

The Landsat program has been collecting imagery since 1972 — that’s over 50 years of continuous data. Landsat 8 and Landsat 9 carry the Operational Land Imager (OLI) sensor. Each scene covers 185×185 km at 30-meter resolution, revisiting the same location every 16 days.

Band Wavelength (µm) Resolution What it shows
Coastal Aerosol 0.43–0.45 30m Coastal waters, atmospheric correction
Blue 0.45–0.51 30m Water penetration, soil/vegetation contrast
Green 0.53–0.59 30m Healthy vegetation (peaks here)
Red 0.64–0.67 30m Vegetation biomass, chlorophyll absorption
NIR 0.85–0.88 30m Healthy vegetation (strongly reflects)
SWIR 1 1.57–1.65 30m Soil/vegetation moisture
SWIR 2 2.11–2.29 30m Soil moisture, vegetation stress
Pan 0.50–0.68 15m Image sharpening
Thermal 1 10.6–11.2 100m Surface temperature
Thermal 2 11.5–12.5 100m Surface temperature

Sentinel-2 (ESA)

The European Sentinel-2 satellites provide higher-resolution imagery (10-20m) with more frequent revisits (every 5 days with both 2A and 2B). They include unique “red-edge” bands sensitive to vegetation stress.

Band Wavelength (µm) Resolution What it shows
Coastal 0.44 60m Coastal aerosols
Blue 0.49 10m Water, soils
Green 0.56 10m Healthy vegetation
Red 0.665 10m Vegetation chlorophyll
Red-Edge 1 0.705 20m Early vegetation stress
Red-Edge 2 0.74 20m Mid-vegetation stress
Red-Edge 3 0.783 20m Late vegetation stress
NIR 0.842 10m Biomass, vegetation
NIR Narrow 0.865 20m Vegetation moisture
SWIR 1 1.61 20m Soil/vegetation moisture
SWIR 2 2.19 20m Soil/vegetation moisture

Both platforms provide free, open data. Combining them gives you more frequent observations and redundant coverage.

Raw Data vs. Analysis-Ready Products

When you download satellite imagery, you have a choice: raw scenes or analysis-ready products.

Raw Data

The raw “Level-1” data (called L1G or L1TP) is the sensor’s measurement at the top of atmosphere. Problems:

  1. Atmospheric scattering — blue light scatters off air molecules, making everything bluer than reality
  2. Atmospheric absorption — water vapor absorbs SWIR bands
  3. Sun angle differences — same location looks different at 9am vs noon
  4. Sensor degradation — the sensor changes over time
  5. Geolocation errors — pixels might not line up perfectly

To use raw data, you’d need to apply your own atmospheric correction algorithm — a PhD’s worth of work.

Who Does All This Correction?

Thankfully, agencies like Australia’s DEA (Digital Earth Australia), the USGS, and ESA invest significant compute and expertise to process raw data into analysis-ready products. They:

  1. Download raw scenes from NASA/ESA archives
  2. Apply atmospheric correction (the complex physics)
  3. Orthorectify for terrain
  4. Calibrate across sensors and time
  5. Package as Cloud-Optimized GeoTIFFs (COGs)
  6. Serve via STAC (Spatio-Temporal Asset Catalog) APIs

STAC is a standard API specification — think of it as “REST for rasters.” You query by location, time, and cloud cover; the catalog returns URLs to cloud-hosted GeoTIFFs. No manual download needed.

The key providers:

Analysis-Ready Products

Most users don’t start with raw data. Instead, they use analysis-ready products that have been corrected:

DEA’s NBART (National Broadband Analysis-Ready Tool) corrects for:

The bands are renamed: nbart_blue, nbart_green, nbart_red, nbart_nir, nbart_swir_1, nbart_swir_2

Microsoft Planetary Computer provides similar ARD (Analysis-Ready Data) with consistent Band 1-7 naming.

This is why rss_core uses .canonical_bands(["red", "nir"]) — it translates your semantic names to the provider-specific band names (nbart_red vs band4 vs B04) automatically.

Why Use Indices?

Once you have analysis-ready products, you face a new problem: how do you see what’s happening when you have 10-13 bands?

Human eyes can’t visualize 13-dimensional data. It’s hard to spot “that field is stressed” when you’re looking at a stack of grey rasters. That’s where spectral indices come in.

Over decades, remote sensing scientists discovered that specific materials behave differently at specific wavelengths:

By exploiting these patterns, indices condense multiple bands into a single value that highlights a phenomenon. Instead of checking 4 bands visually, you look at one “vegetation health” map.

What Are Spectral Indices?

A spectral index is a mathematical combination of bands that highlights a specific phenomenon. Indices let you condense multi-band data into single-value “vegetation health” or “water presence” maps.

Here are five commonly used indices:

1. NDVI — Normalized Difference Vegetation Index

The most widely used index. Measures vegetation greenness by comparing red (absorbed) to NIR (reflected):

NDVI = (NIR - Red) / (NIR + Red)

2. NDWI — Normalized Difference Water Index

Highlights open water:

NDWI = (Green - NIR) / (Green + NIR)

3. NBR — Normalized Burn Ratio

Detects burn scars:

NBR = (NIR - SWIR) / (NIR + SWIR)

4. EVI — Enhanced Vegetation Index

Similar to NDVI but corrects for atmospheric and canopy background:

EVI = 2.5 × (NIR - Red) / (NIR + 6×Red - 7.5×Blue + 1)

5. SAVI — Soil Adjusted Vegetation Index

Works better in sparse vegetation areas:

SAVI = (NIR - Red) / (NIR + Red + L) × (1 + L)

The Cloud Problem

Here’s the catch: satellites look through the sky. Clouds obscure the ground, and their shadows darken pixels. Without accounting for clouds, your analysis is wrong.

Cloud cover varies by location and season. In tropical regions like northern Australia, the “wet season” (November to April) has daily afternoon thunderstorms — many scenes are completely clouded over.

Cloud Detection Algorithms

Several algorithms detect clouds and shadows:

These analyze spectral signatures and spatial patterns to classify each pixel as clear, cloud, shadow, snow, or water. The results come as QA bands alongside your data.

For most workflows, you query the STAC catalog with a cloud cover threshold (e.g., < 10%) and use the QA band to mask cloudy pixels before analysis. See DEA’s masking guide for practical examples.

Python: The De Facto Standard

Python is the standard language for geospatial analysis. Libraries like rasterio, xarray, and datacube (Open Data Cube) power most research and operational systems worldwide. It’s accessible, expressive, and fast enough for most tasks.

For smaller analyses — a few scenes, a single year — Python with Dask works well:

But there’s a catch: as your data grows, so do the problems.

When Python Struggles

For large-scale processing — thousands of scenes, continental coverage, multi-decade time series — Python starts to strain:

These are all Mitigable — mypy, pytest, ruff, typed libraries — but they’re opt-in. There’s no mandatory type checker or linter. If you don’t write tests, nothing fails.

Why Rust?

Rust flips this around:

Note: eorst currently runs on a single machine (multicore, not multinode). Distributed processing across multiple nodes is a planned feature.

You still write your algorithms in the same conceptual style — apply functions to arrays, combine bands, compute indices. But the compiler is your teammate, catching bugs before you run.

The Rust Libraries

Two Rust crates make up the core workflow:

rss_core handles the data access layer:

Earth Observation and Remote Sensing Toolkit (eorst) handles the processing layer:

The Typical Pipeline

A geospatial processing pipeline follows four steps:

  1. Query — Define your area of interest, time range, and cloud cover threshold
  2. Download — Fetch the data (or stream directly from cloud storage)
  3. Process — Apply your analysis (indices, classification, change detection)
  4. Export — Write results to GeoTIFF for use in GIS or Python

In the next post, I’ll build a complete pipeline: query, download, and process satellite imagery — End-to-End Geospatial Processing with EORST.

Benchmarking eorst vs Python+Dask? See Benchmarking Rust vs Python+Dask for NDVI: 23× Faster, and Why That Matters for Carbon — comparing eorst/Rust against Python+rioxarray on real satellite imagery. Simple NDVI is 2.9× faster in Rust; add FMask cloud masking and it’s 23× faster.

Further Reading

Type to search...