bulum.io.idx_io module
IO functions for IDX files and IQQM .OUT files.
This module provides functions for reading and writing IDX files (both the text index files and their corresponding binary .OUT data files) used by the IQQM hydrological model. Functions include both native Python implementations and utilities that rely on external tools like csvidx.exe.
- read_idx(filename: str | Path, skip_header_bytes: bool | None = None) → TimeseriesDataframe
Read IDX file and corresponding IQQM .OUT binary file.
This function reads an IQQM .IDX index file and its corresponding .OUT binary data file, returning the data as a DataFrame. Currently only supports daily data (date_flag=0).
- Parameters:
filename (str or Path) – Path to the IDX file. The corresponding .OUT file is expected to be in the same directory with the same base name.
skip_header_bytes (bool, optional) – Whether to skip header bytes in the corresponding .OUT file. Some versions of IQQM compiled with older compilers include metadata/junk data as a header row. If None (default), attempts automatic detection based on file structure.
- Returns:
DataFrame with datetime index and columns named as “{num}>{source_file}>{description}”.
- Return type:
utils.TimeseriesDataframe
- Raises:
FileNotFoundError – If the IDX file or corresponding OUT file does not exist.
NotImplementedError – If the file contains monthly (date_flag=1) or annual (date_flag=3) data.
ValueError – If the date_flag in the file is not 0, 1, or 3.
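The column naming scheme described under Returns can be taken apart with plain string handling. The helper below is a minimal sketch, not part of bulum: `split_idx_column` and `IdxColumn` are hypothetical names, the sample column name is invented, and converting `num` to an integer assumes that field is always numeric.

```python
from typing import NamedTuple


class IdxColumn(NamedTuple):
    """Parts of a column name shaped like "{num}>{source_file}>{description}"."""
    num: int
    source_file: str
    description: str


def split_idx_column(name: str) -> IdxColumn:
    # Split on the first two '>' characters only, so that a '>' appearing
    # inside the description is preserved.
    num, source_file, description = name.split(">", 2)
    # Assumption: the leading field is an integer series number.
    return IdxColumn(int(num), source_file.strip(), description.strip())


# Hypothetical column name, for illustration only.
col = split_idx_column("1>rain.csv>Rainfall at gauge A")
print(col.source_file, "|", col.description)
```

Splitting with a maximum of two splits keeps the description intact even when it contains the separator character.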
- write_area_ts_csv(df: DataFrame, filename: str | Path, units: str = '(mm.d^-1)') → None
Write timeseries data to area-weighted CSV format for use with csvidx.
This function writes a DataFrame to a CSV file in a specific format used by the csvidx tool, with column names truncated to 12 characters and a header row containing catchment area information (defaulting to 1.0 km^2 for all columns).
- Parameters:
- Raises:
ValueError – If column names clash when truncated to 12 characters.
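To see in advance whether a set of column names would trigger this ValueError, the clash condition can be checked separately. This is a sketch under the assumption that truncation simply keeps the first 12 characters; `find_truncation_clashes` is a hypothetical helper, not a bulum function, and the column names are made up.

```python
from collections import defaultdict


def find_truncation_clashes(columns, width=12):
    """Group column names by their first `width` characters and
    return only the groups that contain more than one name."""
    groups = defaultdict(list)
    for name in columns:
        # Assumption: truncation keeps the leading `width` characters.
        groups[name[:width]].append(name)
    return {short: names for short, names in groups.items() if len(names) > 1}


# Hypothetical column names: the first two share the same 12-character prefix.
clashes = find_truncation_clashes(
    ["Catchment_North_1", "Catchment_North_2", "Gauge_A"]
)
print(clashes)
```

Renaming the clashing columns so that they differ within the first 12 characters should avoid the error.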
- write_idx(df: DataFrame, filename: str | Path, cleanup_tempfile: bool = True, *, exist_ok: bool = True) → None
Write IDX file from dataframe using csvidx.exe.
This function creates both an .IDX index file and a corresponding .OUT binary file by first writing a temporary CSV file and then calling the external csvidx.exe utility.
- Parameters:
df (DataFrame) – DataFrame with datetime index to write.
filename (str or Path) – Path to the IDX file to write. Will overwrite any existing file if exist_ok is True.
cleanup_tempfile (bool, default=True) – Whether to remove the temporary CSV file after conversion.
exist_ok (bool, default=True) – If False, raise FileExistsError if the file already exists. If True, allow overwriting existing files.
- Raises:
FileExistsError – If exist_ok is False and filename already exists.
FileNotFoundError – If csvidx.exe is not found on path.
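The two error conditions listed under Raises can be checked before any data is written. The following is a rough sketch of such a pre-flight check using only the standard library; `preflight_write_idx` is a hypothetical name, and how write_idx itself locates csvidx.exe is not stated here, so the use of `shutil.which` is an assumption.

```python
import shutil
from pathlib import Path


def preflight_write_idx(filename, exist_ok=True, tool="csvidx.exe"):
    """Raise the documented errors early, before doing any work."""
    path = Path(filename)
    # Mirrors the documented exist_ok behaviour.
    if not exist_ok and path.exists():
        raise FileExistsError(f"{path} already exists")
    # Assumption: the tool is discovered via the PATH environment variable.
    if shutil.which(tool) is None:
        raise FileNotFoundError(f"{tool} was not found on the path")
    return path


try:
    preflight_write_idx("results.IDX")
except FileNotFoundError as err:
    print(err)
```

On systems without csvidx.exe, write_idx_native() is the documented alternative that needs no external tool.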
- write_idx_native(df: DataFrame, filepath: str | Path, type: str = 'None', units: str = 'None') → None
Write IDX and OUT binary files using pure Python (no external tools).
This function writes both an .IDX index file and a corresponding .OUT binary file using native Python, without requiring external tools like csvidx.exe. Currently only supports daily data (date_flag=0), matching the capabilities of read_idx().
The function assumes all columns in the DataFrame share the same units and data type (e.g., all Precipitation in mm, or all Flow in ML/d).
- Parameters:
df (pd.DataFrame) – DataFrame with datetime index containing the data to write. Should follow the same format as output from read_idx().
filepath (str or Path) – Path to the IDX file to write (including .IDX extension). The corresponding .OUT file will be created with the same base name.
type (str, default="None") – Data type specifier for all columns in df, e.g., “Gauged Flow”, “Precipitation”, “Evaporation”, etc.
units (str, default="None") – Units for all data in df, e.g., “mm”, “ML/d”, “mm/day”, etc.
Notes
The first line of the IDX file contains version/timestamp metadata (currently a placeholder copied from reference files).
Column names in the output are truncated/padded to fit the IDX format specifications (12 chars for source, 40 for description, etc.).
The .OUT file is written in binary format as 32-bit floats (float32).
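Because values are stored as float32, numbers that pass through a write and read cycle are rounded to single precision. The sketch below illustrates this with the standard `struct` module. It is not the actual .OUT layout: little-endian byte order and one record per timestep are assumptions made for illustration, and `pack_rows` and `unpack_rows` are hypothetical helpers.

```python
import struct


def pack_rows(rows):
    """Pack each row of numbers as consecutive 32-bit floats.
    Assumption: little-endian, one record per timestep."""
    return b"".join(struct.pack(f"<{len(row)}f", *row) for row in rows)


def unpack_rows(blob, ncols):
    """Inverse of pack_rows for a known number of columns."""
    row_size = 4 * ncols  # each float32 occupies 4 bytes
    return [
        list(struct.unpack_from(f"<{ncols}f", blob, offset))
        for offset in range(0, len(blob), row_size)
    ]


blob = pack_rows([[1.5, 2.25], [0.1, 3.0]])
rows = unpack_rows(blob, ncols=2)
# 1.5 and 2.25 are exactly representable; 0.1 is not, so it comes back
# slightly different from the double-precision value that was written.
print(rows)
```

When comparing data that has passed through a write and read cycle, a tolerance-based comparison is therefore safer than exact equality.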