Let me tell you about the gauge that looked alive but wasn’t.
I was building BETA — a climbing conditions tool for Cascade crags — and I wanted river data for the Skykomish corridor. Index Town Wall and Miller River Boulders both sit in that drainage. When the Skykomish is running high and angry, the approaches are a mess regardless of what the weather’s doing. That’s useful signal.
USGS publishes real-time streamflow data through their National Water Information System, free, no API key. I found a gauge right near Miller River — 12132000 — that showed up on the map with an active status. Perfect location, exactly what I needed.
Wired it up, ran the pipeline, got nothing back.
Not an error. Not a timeout. Just… no data. Crickets.
Turns out that gauge has been inactive for years. It’s still on the map. It still shows a green “active” indicator. It returns a valid JSON response with the right structure — just empty values arrays inside. The USGS equivalent of a store that’s been closed for years but never took down the “Open” sign.
```python
values = data["value"]["timeSeries"][0]["values"][0]["value"]
# returns [] — perfectly valid, completely useless
```
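A defensive check that would have caught this up front: treat a response with no actual readings as a dead gauge, no matter how well-formed the JSON is. A minimal sketch — the function name and structure are mine, not from the BETA pipeline:

```python
def has_readings(data: dict) -> bool:
    """Return True only if the USGS IV response contains at least one reading."""
    try:
        values = data["value"]["timeSeries"][0]["values"][0]["value"]
    except (KeyError, IndexError):
        return False
    return len(values) > 0

# An inactive gauge returns the right structure with an empty values array:
dead = {"value": {"timeSeries": [{"values": [{"value": []}]}]}}
live = {"value": {"timeSeries": [{"values": [{"value": [{"value": "1540"}]}]}]}}
```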
This is the kind of thing that makes you want to flip a table. But okay. I fell back to gauge 12134500 — Skykomish River near Gold Bar — about 10 miles downstream from Miller River. Not ideal, but close enough given the scale of what I’m measuring.
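One way to generalize that fallback — a hypothetical helper, not how the pipeline is actually written — is to try gauges in priority order and take the first one that returns data:

```python
from typing import Callable, Optional

def fetch_with_fallback(gauge_ids: list[str],
                        fetch: Callable[[str], Optional[dict]]) -> Optional[dict]:
    """Try each gauge in priority order; return the first non-None result."""
    for gauge_id in gauge_ids:
        result = fetch(gauge_id)
        if result is not None:
            return result
    return None

# e.g. prefer the (dead) Miller River gauge, fall back to Gold Bar:
# fetch_with_fallback(["12132000", "12134500"], fetch_streamflow)
```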
The pipeline
The USGS IV (instantaneous values) API is dead simple:
```
GET https://waterservices.usgs.gov/nwis/iv/?sites=12134500&parameterCd=00060&period=PT2H&format=json
```
parameterCd=00060 is discharge in cubic feet per second. period=PT2H gives me the last 2 hours of 15-minute interval readings — about 8 values. I only need the last two to determine trend.
```python
import json
import urllib.parse
import urllib.request

USGS_URL = "https://waterservices.usgs.gov/nwis/iv/"

def fetch_streamflow(gauge_id: str) -> dict | None:
    params = {
        "sites": gauge_id,
        "parameterCd": "00060",  # discharge, cubic feet per second
        "period": "PT2H",
        "format": "json",
    }
    url = USGS_URL + "?" + urllib.parse.urlencode(params)
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            data = json.loads(response.read().decode())
    except Exception as e:
        print(f"  ⚠️ USGS API error for gauge {gauge_id}: {e}")
        return None
    try:
        values = data["value"]["timeSeries"][0]["values"][0]["value"]
    except (KeyError, IndexError):
        return None
    # Filter out USGS sentinel value for missing data
    valid = [v for v in values if float(v["value"]) >= 0]
    if len(valid) < 2:
        return None
    current_cfs = float(valid[-1]["value"])
    previous_cfs = float(valid[-2]["value"])
    delta = current_cfs - previous_cfs
    if delta > 5:
        trend = "rising"
    elif delta < -5:
        trend = "falling"
    else:
        trend = "steady"
    return {
        "cfs": round(current_cfs),
        "trend": trend,
        "gauge_id": gauge_id,
    }
```
A few things worth noting:
The -999999 sentinel. USGS uses this as a “no data” marker for individual readings within an otherwise valid response. If I don’t filter those out, I get wildly wrong delta calculations. I filter for value >= 0, which catches the sentinel without hardcoding it.
5 CFS trend threshold. On a calm river, you get minor fluctuations in consecutive 15-minute readings that aren’t meaningful. 5 CFS eliminates the noise without masking real movement on a river that swings thousands of CFS during a storm.
Null-safe integration. In crags.json, only crags with a gauge_id field get streamflow fetched. The main loop checks if crag.get("gauge_id"): so non-gauged crags are completely unaffected. Adding a new gauged crag is a one-line config change.
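To see why the sentinel filter matters, here’s the delta calculation with and without it — the readings are made up for illustration:

```python
readings = [
    {"value": "1540"},
    {"value": "-999999"},  # USGS "no data" sentinel mid-series
    {"value": "1552"},
]

# Without filtering, the sentinel lands in the delta:
raw_delta = float(readings[-1]["value"]) - float(readings[-2]["value"])
# a phantom "rising" of over a million CFS

# With the >= 0 filter, the delta is the real 12 CFS:
valid = [r for r in readings if float(r["value"]) >= 0]
delta = float(valid[-1]["value"]) - float(valid[-2]["value"])
```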
What I learned
The USGS data infrastructure is genuinely impressive — real-time data from thousands of gauges nationwide, free, reliable, well-documented. But the map UI doesn’t clearly distinguish inactive gauges from active ones, which cost me maybe an hour of debugging before I figured out what was happening.
The fix was simple once I understood the problem: validate that you actually got readings, not just a valid response shape. A response with the right JSON structure but empty values arrays will pass most basic error checks. You have to go one level deeper.
Also: PT2H is ISO 8601 duration format. Took me longer than I’d like to admit to remember that.
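The Python stdlib won’t format ISO 8601 durations for you, but a minimal helper — my own sketch, covering only the hour/minute cases this API needs — keeps the magic string out of the config:

```python
from datetime import timedelta

def iso_duration(td: timedelta) -> str:
    """Format a timedelta as a simple ISO 8601 duration (hours/minutes only)."""
    total_minutes = int(td.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    parts = (f"{hours}H" if hours else "") + (f"{minutes}M" if minutes else "")
    return "PT" + (parts or "0M")
```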
The pipeline runs on GitHub Actions every 6 hours. That cadence is well within USGS’s guidance — they recommend not hitting the API more than once per minute per gauge, and 4 fetches a day across 2 gauges is nowhere near that limit.
Full pipeline code is in the BETA repo. The tool itself is at beta.trenigma.dev.
Part 2 of 3 in BETA: Building a Climbing Conditions Pipeline
← Part 1 — Why I Built It
Part 3 — Adding PurpleAir AQI (and the Wood Stove Surprise) →