bulum.stats.storage_level_assessment module

Storage level assessment functionality for water resource analysis.

This module provides the StorageLevelAssessment class for analyzing storage levels against trigger thresholds, including event detection, duration analysis, and statistical summaries by water year.

class StorageLevelAssessment(df: Series, triggers: list[float], wy_month: int = 7, allow_part_years: bool = False, trigger_names: list[str] | None = None)

Bases: object

Analyze storage levels against trigger thresholds for water resource management.

This class provides comprehensive analysis of storage time series data, including event detection when storage falls below specified trigger levels, duration analysis, and statistical summaries organized by water year.

Examples

Basic usage with numeric triggers:

>>> import numpy as np
>>> import pandas as pd
>>> from bulum.stats import StorageLevelAssessment
>>>
>>> # Create sample storage data
>>> dates = pd.date_range('2020-01-01', '2022-12-31', freq='D')
>>> storage = pd.Series(np.random.uniform(20, 120, len(dates)), index=dates, name='Storage')
>>>
>>> # Initialize assessment with trigger levels
>>> sla = StorageLevelAssessment(storage, triggers=[100, 75, 50, 25])
>>>
>>> # Get comprehensive summary
>>> summary = sla.Summary()
>>> print(summary)

Usage with named triggers for better readability:

>>> # Initialize with meaningful trigger names
>>> trigger_names = ["Full Supply", "Level 1", "Level 2", "Critical"]
>>> sla = StorageLevelAssessment(storage, triggers=[100, 75, 50, 25],
...                            trigger_names=trigger_names)
>>>
>>> summary = sla.Summary()  # Shows "Trigger Name" column
>>>
>>> # Add additional trigger dynamically
>>> sla.add_trigger(10.0, name="Emergency")

Advanced analysis and visualization:

>>> # Get event statistics for specific trigger
>>> events_count = sla.EventsBelowTriggerCount(min_length=7)  # Events >= 7 days
>>> max_events = sla.EventsBelowTriggerMax()
>>>
>>> # Annual analysis
>>> annual_days = sla.AnnualDaysBelow()
>>> percent_years = sla.PercentWaterYearsBelow()
>>>
>>> # Create visualizations
>>> chart = sla.plot_events_ranked(50, interactive=True)
>>> freq_chart = sla.plot_event_length_frequency(25)

Water year analysis with custom parameters:

>>> # Use calendar year (wy_month=1) and allow partial years
>>> sla_cal = StorageLevelAssessment(storage, triggers=[50, 25],
...                                wy_month=1, allow_part_years=True)
>>>
>>> # Get summary for single trigger
>>> critical_summary = sla_cal.Summary(trigger=25)

Custom aggregation of event data:

>>> # Use custom function to analyze events
>>> import numpy as np
>>> mean_events = sla.EventsBelowTriggerAggregate(np.mean)
>>> median_events = sla.EventsBelowTriggerAggregate(np.median)

See also

bulum.utils.crop_to_wy

Crop data to complete water years

bulum.utils.get_wy

Get water year for dates

AnnualDaysBelow() → dict

Calculate total days at or below trigger threshold by water year.

This method counts the number of days in each water year where storage was at or below each trigger threshold.

Returns:

Dictionary where keys are trigger threshold values (float) and values are pandas.Series with water years as index and day counts as values.

Return type:

dict of {float: pandas.Series}

Examples

>>> annual_days = sla.AnnualDaysBelow()
>>> print(annual_days[50])  # Days at or below 50 ML by water year
>>> # Example output:
>>> # 2020    45
>>> # 2021    12
>>> # 2022    78
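
The per-water-year count described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names water_year and annual_days_below are hypothetical, plain lists stand in for the pandas Series, and the sketch assumes a water year is labelled by the calendar year in which it starts (bulum.utils.get_wy may use a different convention).

```python
from collections import Counter
from datetime import date, timedelta


def water_year(day, wy_month=7):
    # Assumed convention: label by the calendar year in which the water year starts.
    return day.year if day.month >= wy_month else day.year - 1


def annual_days_below(days, values, trigger, wy_month=7):
    # Count days with value <= trigger in each water year.
    counts = Counter()
    for day, value in zip(days, values):
        # Adding 0 still creates the key, so years with no days below appear as 0.
        counts[water_year(day, wy_month)] += int(value <= trigger)
    return dict(counts)


# Four daily values straddling a 1 July water-year boundary.
days = [date(2020, 6, 29) + timedelta(days=i) for i in range(4)]
values = [40, 60, 45, 50]
print(annual_days_below(days, values, trigger=50))  # {2019: 1, 2020: 2}
```

Note that the value exactly equal to the trigger (50) is counted, matching the "at or below" wording.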

AnnualDaysBelowSummary(trigger: float | None = None, annualdaysbelow: dict | None = None)

Generate summary of total days at or below trigger threshold by water year.

Parameters:
  • trigger (float, optional) – Single trigger threshold to assess. Default is None, which assesses all triggers.

  • annualdaysbelow (dict, optional) – Optionally provide output from AnnualDaysBelow, otherwise recalculate. Default is None.

Returns:

DataFrame of total days at or below threshold by water year, grouped by trigger threshold. If trigger is specified, returns Series for that trigger.

Return type:

pandas.DataFrame or pandas.Series

EventsBelowTrigger(min_length: int = 1) → dict

Get event length arrays for each trigger threshold with minimum length filter.

Parameters:

min_length (int, optional) – Minimum event length to return. Default is 1.

Returns:

Dictionary of event length arrays, grouped by trigger threshold.

Return type:

dict of {float: list of int}

EventsBelowTriggerAggregate(function: Callable, *, min_length: int = 1) → dict

Aggregate event lengths using a custom function for each trigger threshold with minimum length filter.

Only events with duration >= min_length days are considered in the analysis. This allows filtering out short-duration events before applying custom aggregation functions such as median, standard deviation, or percentiles.

Parameters:
  • function (typing.Callable) – Function that acts on arrays/iterables and returns a single value (e.g., float).

  • min_length (int, optional) – Minimum event length (in days). Only events with duration >= this value are included in the analysis. Default is 1.

Returns:

Dictionary of aggregated event values, grouped by trigger threshold. Returns NaN for triggers with no events meeting the minimum length criteria.

Return type:

dict of {float: float}

Examples

>>> import numpy as np
>>> # Median length of events lasting 30 days or longer
>>> median_long_events = sla.EventsBelowTriggerAggregate(np.median, min_length=30)
>>>
>>> # 95th percentile length of events lasting 7 days or longer
>>> p95_events = sla.EventsBelowTriggerAggregate(
...     lambda x: np.percentile(x, 95), min_length=7)
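
The filter-then-aggregate behaviour described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: aggregate_events is a hypothetical helper, and a plain dict of event-length lists stands in for the output of EventsBelowTrigger.

```python
import math
from statistics import median


def aggregate_events(events_by_trigger, function, min_length=1):
    # Apply `function` to the event lengths >= min_length for each trigger.
    result = {}
    for trigger, lengths in events_by_trigger.items():
        kept = [n for n in lengths if n >= min_length]
        # NaN marks triggers with no events meeting the minimum length.
        result[trigger] = function(kept) if kept else math.nan
    return result


events = {50.0: [5, 12, 3, 45], 25.0: [2, 4]}
medians = aggregate_events(events, median, min_length=5)
print(medians[50.0])              # 12 (median of [5, 12, 45])
print(math.isnan(medians[25.0]))  # True (no events of 5 days or longer)
```

Filtering before aggregation matters: the median of all four events at the 50.0 trigger would be 8.5, not 12.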

EventsBelowTriggerAlgorithm(trigger: float) → list[int]

Calculate array of event lengths for a specific trigger threshold.

This is the core algorithm that detects continuous periods where storage is at or below the trigger threshold. An event starts when storage falls to or below the trigger and ends when storage rises above the trigger.

Parameters:

trigger (float) – Trigger threshold against which daily storage data is assessed. Uses <= comparison (storage at or below trigger).

Returns:

List where each element represents the length (in days) of a single continuous event below the trigger threshold. Empty list if no events occurred.

Return type:

list of int

Examples

>>> import numpy as np
>>> # Get event lengths for 50 ML trigger
>>> events = sla.EventsBelowTriggerAlgorithm(50.0)
>>> print(events)  # e.g., [5, 12, 3, 45] - four events of different lengths
>>>
>>> # Analyze the events
>>> print(f"Number of events: {len(events)}")
>>> print(f"Longest event: {max(events)} days" if events else "No events")
>>> print(f"Average event length: {np.mean(events):.1f} days" if events else "No events")

Notes

This algorithm handles edge cases including:

  • Events that start at the beginning of the time series

  • Events that end at the end of the time series

  • Single-day events

  • No events (returns empty list)
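
The detection rule described above (an event runs while storage <= trigger and closes when storage rises above it) can be sketched as a single pass over the data. This is a minimal sketch, not the library's implementation: event_lengths is a hypothetical helper that works on a plain list of daily values.

```python
def event_lengths(values, trigger):
    # Return the length of each consecutive run where value <= trigger.
    lengths = []
    run = 0
    for value in values:
        if value <= trigger:
            run += 1             # inside an event (or just entered one)
        elif run:
            lengths.append(run)  # storage rose above the trigger: close the event
            run = 0
    if run:
        lengths.append(run)      # event still open at the end of the series
    return lengths


storage = [40, 60, 50, 45, 70, 80, 30]
print(event_lengths(storage, 50))  # [1, 2, 1]
```

The example exercises three of the listed edge cases: an event at the start of the series, an event that is still open at the end, and single-day events.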

EventsBelowTriggerCount(min_length: int = 1) → dict

Count events for each trigger threshold with minimum length filter.

Parameters:

min_length (int, optional) – Minimum event length to count. Default is 1.

Returns:

Dictionary of event counts, grouped by trigger threshold.

Return type:

dict of {float: int}

EventsBelowTriggerMax(*, min_length: int = 1) → dict

Find maximum event length for each trigger threshold with minimum length filter.

Only events with duration >= min_length days are considered in the analysis. This allows filtering out short-duration events that may not be operationally significant.

Parameters:

min_length (int, optional) – Minimum event length (in days). Only events with duration >= this value are included in the analysis. Default is 1.

Returns:

Dictionary of maximum event lengths, grouped by trigger threshold. Returns NaN for triggers with no events meeting the minimum length criteria.

Return type:

dict of {float: int}

EventsBelowTriggerMean(*, min_length: int = 1) → dict

Calculate mean event length for each trigger threshold with minimum length filter.

Only events with duration >= min_length days are considered in the analysis. This allows filtering out short-duration events that may not be operationally significant when calculating average event durations.

Parameters:

min_length (int, optional) – Minimum event length (in days). Only events with duration >= this value are included in the analysis. Default is 1.

Returns:

Dictionary of mean event lengths, grouped by trigger threshold. Returns NaN for triggers with no events meeting the minimum length criteria.

Return type:

dict of {float: float}

Examples

>>> mean_events = sla.EventsBelowTriggerMean()
>>> print(mean_events[50])  # Average event length for 50 ML trigger
>>> # Example output: 12.5 (average of all events below 50 ML)
>>>
>>> # Only consider events 7 days or longer when averaging
>>> mean_long_events = sla.EventsBelowTriggerMean(min_length=7)
>>> print(mean_long_events[50])  # Average of events >= 7 days only

See also

EventsBelowTriggerMax

Maximum event lengths

EventsBelowTriggerAggregate

Custom aggregation functions

NumberWaterYearsBelow(annualdaysbelow: dict | None = None, *, min_days_per_year: int = 1)

Calculate the number of water years with at least min_days_per_year days (by default, one day) at or below each trigger threshold.

Parameters:
  • annualdaysbelow (dict, optional) – Optionally provide output from AnnualDaysBelow, otherwise recalculate. Default is None.

  • min_days_per_year (int, optional) – Minimum number of days in a water year before counting it. Only water years with >= this many days below the trigger are counted. Default is 1.

Returns:

Dictionary of total water years grouped by trigger threshold.

Return type:

dict of {float: int}

PercentWaterYearsBelow(numberyears: dict | None = None, *, min_days_per_year: int = 1)

Calculate the percentage of water years with at least min_days_per_year days (by default, one day) at or below each trigger threshold.

Parameters:
  • numberyears (dict, optional) – Optionally provide output from NumberWaterYearsBelow, otherwise recalculate. Default is None.

  • min_days_per_year (int, optional) – Minimum number of days in a water year before counting it. Only water years with >= this many days below the trigger are counted. Default is 1.

Returns:

Dictionary of percentage years grouped by trigger threshold.

Return type:

dict of {float: float}
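
The relationship between annual day counts, the water-year count, and the percentage can be sketched as below. This is a hedged illustration: number_years_below and percent_years_below are hypothetical helpers, nested dicts stand in for the output of AnnualDaysBelow, and the sketch assumes the percentage is expressed on a 0 to 100 scale relative to the number of water years in the record.

```python
import math


def number_years_below(annual_days, min_days_per_year=1):
    # Count water years with at least min_days_per_year days at or below each trigger.
    return {
        trigger: sum(1 for days in by_year.values() if days >= min_days_per_year)
        for trigger, by_year in annual_days.items()
    }


def percent_years_below(annual_days, min_days_per_year=1):
    # Express the count as a percentage of all water years (assumed denominator).
    counts = number_years_below(annual_days, min_days_per_year)
    return {
        trigger: 100 * counts[trigger] / len(by_year) if by_year else math.nan
        for trigger, by_year in annual_days.items()
    }


annual_days = {
    50.0: {2019: 0, 2020: 45, 2021: 12, 2022: 0},
    25.0: {2019: 0, 2020: 3, 2021: 0, 2022: 0},
}
print(number_years_below(annual_days))                        # {50.0: 2, 25.0: 1}
print(percent_years_below(annual_days))                       # {50.0: 50.0, 25.0: 25.0}
print(number_years_below(annual_days, min_days_per_year=30))  # {50.0: 1, 25.0: 0}
```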

Summary(trigger: float | None = None, include_mean: bool = False) → DataFrame | Series

Generate comprehensive summary table of storage level assessment outputs.

Parameters:
  • trigger (float, optional) – Single trigger threshold to assess. Default is None, which assesses all triggers.

  • include_mean (bool, optional) – Include average event length column in the summary. Default is False.

Returns:

Comprehensive summary including start/end dates, water year statistics, event counts for various durations, and maximum event lengths. If trigger is specified, returns Series for that trigger only. When trigger names are provided, they are displayed in the summary. When include_mean is True, adds “Average period at or below trigger (days)” column.

Return type:

pandas.DataFrame or pandas.Series

add_trigger(trigger: float, name: str | None = None) → None

Add an additional trigger level to the assessment.

Parameters:
  • trigger (float) – New trigger threshold to be assessed. Must not already exist in the current trigger list. Value should be in same units as storage data.

  • name (str, optional) – Descriptive name for the new trigger level. Required if the assessment was initialized with trigger_names, otherwise must be None to maintain consistency. Default is None.

Raises:

ValueError – If trigger already exists in the assessment, if name is required but not provided (when trigger_names exist), or if name is provided but no trigger_names exist.

Examples

>>> # Assessment without names - add trigger without name
>>> sla = StorageLevelAssessment(storage_data, [100, 50])
>>> sla.add_trigger(25.0)  # ✓ Valid
>>>
>>> # Assessment with names - name is required
>>> sla_named = StorageLevelAssessment(storage_data, [100, 50],
...                                  trigger_names=["High", "Low"])
>>> sla_named.add_trigger(25.0, name="Critical")  # ✓ Valid
>>>
>>> # Immediate availability for analysis
>>> summary = sla_named.Summary()  # Includes new trigger
>>> events = sla_named.EventsBelowTriggerCount()  # Includes new trigger
>>> chart = sla_named.plot_events_ranked(25.0)  # Can plot new trigger

See also

EventsBelowTriggerAlgorithm

Algorithm used for new trigger analysis

Summary

Method that includes newly added triggers
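
The naming rules listed under Raises can be sketched with a small stand-in class. TriggerRegistry is hypothetical and models only the validation described above; the error messages and internal storage are assumptions, not the library's.

```python
class TriggerRegistry:
    # Hypothetical stand-in that models only the trigger/name consistency rules.

    def __init__(self, triggers, trigger_names=None):
        self.triggers = list(triggers)
        self.names = (
            dict(zip(self.triggers, trigger_names))
            if trigger_names is not None
            else None
        )

    def add_trigger(self, trigger, name=None):
        if trigger in self.triggers:
            raise ValueError(f"Trigger {trigger} already exists")
        if self.names is not None and name is None:
            raise ValueError("A name is required when trigger names are in use")
        if self.names is None and name is not None:
            raise ValueError("A name was given but no trigger names are in use")
        self.triggers.append(trigger)
        if self.names is not None:
            self.names[trigger] = name


registry = TriggerRegistry([100, 50], trigger_names=["High", "Low"])
registry.add_trigger(25.0, name="Critical")
print(registry.names)  # {100: 'High', 50: 'Low', 25.0: 'Critical'}

try:
    registry.add_trigger(10.0)  # name missing although names are in use
except ValueError as err:
    print(f"Rejected: {err}")
```

Requiring names to be either always present or always absent keeps the trigger-to-name mapping complete, so a summary never has to show a partly named set of triggers.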

plot_event_length_frequency(trigger: float, *, width=600, height=400, xmax: int | None = None, interactive=False, bind_y=True, mark: Literal['bar', 'rect'] = 'bar') → Chart

Create an Altair chart showing frequency distribution of event lengths.

Parameters:
  • trigger (float) – Trigger level for which to plot the frequency distribution.

  • width (int, optional) – Plot width. Default is 600.

  • height (int, optional) – Plot height. Default is 400.

  • xmax (int, optional) – Maximum event length to plot. Default is None, indicating all data.

  • interactive (bool, optional) – Set to True to enable pan and zoom functionality. Default is False.

  • bind_y (bool, optional) – When True (default), zooming will only be horizontal; vertical values remain fixed.

  • mark ({"bar", "rect"}, optional) – Controls plot style. “rect” fixes gaps between data but may be harder to read. Default is “bar”.

Returns:

Altair chart showing frequency distribution of event lengths with tooltips.

Return type:

altair.Chart

Raises:
  • KeyError – Provided trigger has not been evaluated previously for this assessment.

  • ValueError – Invalid keyword argument supplied.
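
The data behind a frequency chart of this kind can be sketched without Altair. This is an illustration only: event_length_frequency is a hypothetical helper, and treating xmax as an inclusive upper bound on event length is an assumption.

```python
from collections import Counter


def event_length_frequency(lengths, xmax=None):
    # Return sorted (event length, number of events) pairs.
    kept = [n for n in lengths if xmax is None or n <= xmax]
    return sorted(Counter(kept).items())


lengths = [5, 12, 3, 5, 45, 3, 3]
print(event_length_frequency(lengths))           # [(3, 3), (5, 2), (12, 1), (45, 1)]
print(event_length_frequency(lengths, xmax=10))  # [(3, 3), (5, 2)]
```

Each pair corresponds to one bar: the event length on the x axis and the number of events of that length on the y axis.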

plot_events_ranked(trigger: float, *, width=600, height=400, xmax: int | None = None, interactive=False, bind_y=True, mark: Literal['bar', 'rect'] = 'bar') → Chart

Create an Altair chart of ranked event durations below the trigger threshold.

Parameters:
  • trigger (float) – Trigger level for which to plot the ranking.

  • width (int, optional) – Plot width. Default is 600.

  • height (int, optional) – Plot height. Default is 400.

  • xmax (int, optional) – Controls how many data points to plot. Default is None, indicating all data.

  • interactive (bool, optional) – Set to True to enable pan and zoom functionality. Default is False.

  • bind_y (bool, optional) – When True (default), zooming will only be horizontal; vertical values remain fixed.

  • mark ({"bar", "rect"}, optional) – Controls plot style. “rect” fixes gaps between data but may be harder to read. Default is “bar”.

Returns:

Altair chart showing ranked event durations with tooltips and interactive features.

Return type:

altair.Chart

Raises:
  • KeyError – Provided trigger has not been evaluated previously for this assessment.

  • ValueError – Invalid keyword argument supplied.
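
The ranking behind a chart of this kind can be sketched as a sort followed by numbering. rank_events is a hypothetical helper, and both the descending order (longest event ranked first) and the use of xmax as a cap on the number of ranked points are assumptions based on the parameter descriptions above.

```python
def rank_events(lengths, xmax=None):
    # Return (rank, event length) pairs, longest event first.
    ranked = sorted(lengths, reverse=True)
    if xmax is not None:
        ranked = ranked[:xmax]  # keep only the first xmax ranked events
    return list(enumerate(ranked, start=1))


print(rank_events([5, 12, 3, 45]))          # [(1, 45), (2, 12), (3, 5), (4, 3)]
print(rank_events([5, 12, 3, 45], xmax=2))  # [(1, 45), (2, 12)]
```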

property trigger_names: dict[float, str] | None

Get trigger names as a dictionary mapping trigger levels to names.