What happened?
I was following the workflow to compute hydrologic signatures in the River Discharge example of the HyRiver docs, working in a Jupyter notebook in VS Code. I used the example code verbatim except for my own bounding box coordinates and dates (the lines I changed and the failing call are shown below; the unmodified example code is omitted here).
dates = ("2000-10-01", "2011-09-30")
bbox = (-115.63, 43.94, -114.96, 44.35)
qobs = nwis.get_streamflow(stations, dates, mmd=True)
plot.signatures(qobs)
The nwis.get_streamflow() function fails and returns this error: ValueError: invalid literal for int() with base 10: '1990:2017'
What did you expect to happen?
I expected a plot of hydrologic signatures for the specified station and date range.
Minimal Complete Verifiable Example
from pygeohydro import NWIS, plot
dates = ("2000-10-01", "2011-09-30")
bbox = (-115.63, 43.94, -114.96, 44.35)
nwis = NWIS()
query = {
"bBox": ",".join(f"{b:.06f}" for b in bbox),
"hasDataTypeCd": "dv",
"outputDataTypeCd": "dv",
}
info_box = nwis.get_info(query)
stations = info_box[
(info_box.begin_date <= dates[0]) & (info_box.end_date >= dates[1])
].site_no.tolist()
query = {
"site": ",".join(stations),
"hasDataTypeCd": "dv",
"outputDataTypeCd": "dv",
}
info = nwis.get_info(query, expanded=True)
info.set_index("site_no").hcdn_2009
qobs = nwis.get_streamflow(stations, dates, mmd=True)
plot.signatures(qobs)
Relevant log output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[26], line 1
----> 1 qobs = nwis.get_streamflow(stations, dates, mmd=True)
2 plot.signatures(qobs)
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pygeohydro/nwis.py:759, in NWIS.get_streamflow(cls, station_ids, dates, freq, mmd, to_xarray)
757 siteinfo = siteinfo[siteinfo.site_no.isin(sids)]
758 if mmd:
--> 759 area_sqm = cls._drainage_area_sqm(siteinfo, freq)
760 ms2mmd = 1000.0 * 24.0 * 3600.0
761 try:
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pygeohydro/nwis.py:537, in NWIS._drainage_area_sqm(cls, siteinfo, freq)
535 """Get drainage area of the stations."""
536 if "nhd_areasqkm" not in siteinfo:
--> 537 area = cls._nhd_info(siteinfo["site_no"].to_list())
538 area = area[["site_no", "nhd_areasqkm"]].copy()
539 else:
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pygeohydro/nwis.py:301, in NWIS._nhd_info(site_ids)
299 except (TypeError, IntCastingNaNError):
300 area["comid"] = area["comid"].astype("Int32")
--> 301 nhd_area = pynhd.streamcat("fert", comids=area["comid"].dropna().to_list(), area_sqkm=True)
302 area = area.merge(
303 nhd_area[["comid", "wsareasqkm"]], left_on="comid", right_on="comid", how="left"
304 )
305 area["identifier"] = area["identifier"].str.replace("USGS-", "")
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pynhd/nhdplus_derived.py:726, in streamcat(metric_names, metric_areas, comids, regions, states, counties, conus, percent_full, area_sqkm, lakes_only)
724 if metric_names is None:
725 return StreamCat().metrics_df
--> 726 sc = StreamCatValidator(lakes_only)
727 names = [metric_names] if isinstance(metric_names, str) else metric_names
728 sc.validate(name=names)
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pynhd/nhdplus_derived.py:586, in StreamCatValidator.__init__(self, lakes_only)
585 def __init__(self, lakes_only: bool = False) -> None:
--> 586 super().__init__(lakes_only)
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pynhd/nhdplus_derived.py:576, in StreamCat.__init__(self, lakes_only)
573 self.metrics_df = names
575 years = names.set_index("METRIC_NAME").YEAR.dropna()
--> 576 self.valid_years = {
577 str(v): list(range(*(int(y) for y in yrs.split("-"))))
578 if "-" in yrs
579 else [int(y) for y in yrs.split(",")]
580 for v, yrs in years.items()
581 }
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pynhd/nhdplus_derived.py:579, in <dictcomp>(.0)
573 self.metrics_df = names
575 years = names.set_index("METRIC_NAME").YEAR.dropna()
576 self.valid_years = {
577 str(v): list(range(*(int(y) for y in yrs.split("-"))))
578 if "-" in yrs
--> 579 else [int(y) for y in yrs.split(",")]
580 for v, yrs in years.items()
581 }
File ~/miniconda/envs/geos505/lib/python3.11/site-packages/pynhd/nhdplus_derived.py:579, in <listcomp>(.0)
573 self.metrics_df = names
575 years = names.set_index("METRIC_NAME").YEAR.dropna()
576 self.valid_years = {
577 str(v): list(range(*(int(y) for y in yrs.split("-"))))
578 if "-" in yrs
--> 579 else [int(y) for y in yrs.split(",")]
580 for v, yrs in years.items()
581 }
ValueError: invalid literal for int() with base 10: '1990:2017'
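A possible workaround, sketched under assumptions and not verified against the library: the frame at `nwis.py:759` above suggests the StreamCat lookup only runs inside the `if mmd:` branch, so calling `nwis.get_streamflow(stations, dates, mmd=False)` and converting to mm/day by hand might sidestep the failure. The helper name `cms_to_mmd` and the example numbers are hypothetical; the factor is the `ms2mmd = 1000.0 * 24.0 * 3600.0` visible in the traceback, and the drainage area would have to come from another source.

```python
def cms_to_mmd(q_cms: float, area_sqkm: float) -> float:
    """Convert discharge in m^3/s to runoff depth in mm/day.

    Hypothetical helper: divides by drainage area (converted to m^2)
    and applies the same m/s -> mm/day factor shown in the traceback.
    """
    if area_sqkm <= 0:
        raise ValueError("drainage area must be positive")
    ms2mmd = 1000.0 * 24.0 * 3600.0  # m/s -> mm/day
    area_sqm = area_sqkm * 1.0e6     # km^2 -> m^2
    return q_cms / area_sqm * ms2mmd


# Made-up numbers for illustration: 10 m^3/s over an 864 km^2 basin
print(cms_to_mmd(10.0, 864.0))
```

With these illustrative inputs the result is about 1 mm/day, because 864 km² times 1 mm is 864,000 m³, which is what 10 m³/s delivers in 86,400 seconds.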
Anything else we need to know?
This is the explanation of the error given by Copilot:
The error ValueError: invalid literal for int() with base 10: '1990:2017' occurs because the code in the pynhd library (specifically in nhdplus_derived.py) is trying to parse a year string from StreamCat data, but the string '1990:2017' contains a colon (:) instead of the expected dash (-) for ranges or comma (,) for lists.
Here's what's happening step-by-step:
When you call nwis.get_streamflow(stations, dates, mmd=True), the mmd=True parameter triggers conversion of the discharge from cms to mm/day, which requires drainage area data.
To get the drainage area, the code calls pynhd.streamcat("fert", ...) to retrieve StreamCat metrics.
Inside StreamCat.__init__(), the code parses the YEAR column from the metrics data to build valid_years. It expects year ranges to use dashes (e.g., "1990-2017") or comma-separated lists (e.g., "1990,2017").
For the string '1990:2017', it doesn't contain a dash, so it falls back to splitting by commas and converting each part to an integer. Since '1990:2017' splits to ['1990:2017'], int('1990:2017') fails because the colon isn't a valid character for an integer.
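The steps above can be reproduced in isolation. The function below copies the branching of the dict comprehension shown in the traceback (`nhdplus_derived.py:576-581`); the function name `parse_years_like_traceback` is made up for this sketch, and no pynhd import or network access is needed.

```python
def parse_years_like_traceback(yrs: str) -> list[int]:
    """Mirror the branching of the valid_years comprehension in the traceback."""
    if "-" in yrs:
        # Dash-delimited range; range() excludes the end year
        return list(range(*(int(y) for y in yrs.split("-"))))
    # Otherwise treat the string as a comma-separated list of years
    return [int(y) for y in yrs.split(",")]


print(parse_years_like_traceback("1990-2017")[:3])  # first three years of the range
print(parse_years_like_traceback("2001,2006"))

try:
    parse_years_like_traceback("1990:2017")
except ValueError as err:
    # Same message as in the log output of this report
    print(err)
```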
This appears to be a data formatting issue in the StreamCat dataset where a year range is incorrectly delimited with a colon instead of a dash. The pynhd library doesn't handle this case.
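One way the parsing could be made tolerant of the colon delimiter is sketched below. This is only an illustration of the idea, not the maintainers' fix: the name `parse_years_tolerant` is hypothetical, and treating `:` as a synonym for `-` is an assumption about what the StreamCat data means.

```python
def parse_years_tolerant(yrs: str) -> list[int]:
    """Parse a StreamCat-style YEAR string, accepting ':' as a range delimiter.

    Assumption: 'A:B' means the same as 'A-B'. The range keeps the
    behavior of the original code, which excludes the end year.
    """
    normalized = yrs.strip().replace(":", "-")
    if "-" in normalized:
        return list(range(*(int(y) for y in normalized.split("-"))))
    return [int(y) for y in normalized.split(",")]


# The colon form now parses to the same result as the dash form
assert parse_years_tolerant("1990:2017") == parse_years_tolerant("1990-2017")
print(parse_years_tolerant("2001,2006"))
```

Whether the end year should be included in the range is a separate question that this sketch deliberately leaves unchanged.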
Environment
SYS INFO
commit: None
python: 3.11.14 | packaged by conda-forge | (main, Oct 22 2025, 22:53:07) [Clang 19.1.7 ]
python-bits: 64
OS: Darwin
OS-release: 21.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
PACKAGE VERSION
async-retriever 0.19.3
pygeoogc 0.19.4
pygeoutils 0.19.5
py3dep 0.19.0
pynhd 0.19.4
pygridmet N/A
pydaymet N/A
hydrosignatures 0.19.3
pynldas2 N/A
pygeohydro 0.19.4
tiny-retriever N/A
aiodns 3.0.0
aiofiles 25.1.0
aiohttp 3.13.2
aiohttp-client-cache 0.14.1
aiosqlite 0.21.0
brotli 1.1.0
cytoolz 1.1.0
orjson 3.11.4
numpy 2.3.5
pandas 2.3.3
scipy 1.16.3
xarray 2025.12.0
numba N/A
numbagg N/A
click 8.3.0
geopandas 1.1.1
rasterio 1.4.3
rioxarray 0.19.0
shapely 2.1.2
netcdf4 1.7.3
pyproj 3.7.2
defusedxml 0.7.1
folium 0.20.0
h5netcdf 1.7.2
matplotlib 3.10.8
planetary-computer N/A
pystac-client N/A
joblib 1.5.2
multidict 6.6.3
owslib 0.34.1
requests 2.32.5
requests-cache 1.2.1
typing-extensions 4.15.0
url-normalize 2.2.1
urllib3 2.5.0
yarl 1.22.0
networkx 3.5
pyarrow 21.0.0
py7zr N/A
flox N/A
opt-einsum N/A
None