bulum.stats.negflo module

Bulum implementation of Negflo.

Warning

This implementation is experimental: it was written entirely from the Qld Hydrology page, which served as the “spec”. If you find bugs or unexpected behaviour, please let us know!

See also

Spec

class ContiguousIndexTracker

Bases: object

Convenience class to track contiguous blocks of indices.

add(idx: int, v) → None: Adds current index/val pair to tracker.

force_add(idx: int, v) → None: Be careful with this, as it may invalidate any computations to do with the indices of the values.

get() → list[int]: Return the list of indices which are currently being tracked.

indices(): Returns a range of indices of the associated collection for which values were tracked.

is_member_of_block(idx): Check if idx is adjacent to the currently tracked block of indices.

is_tracking(): Checks if the tracker is active.

Note

If start_idx is not None, then last_idx must also not be None.

offset(offset_val: Any | Callable[[Any], Any]) → list[Any]

reset(idx: int | None = None, val: list | None = None): Resets to default or resets to current index/value (list).

sum_and_reset(): Returns the sum of the underlying accumulator and resets the trackers.

class Negflo(df_residual: DataFrame | TimeseriesDataframe, flow_limit: float = 0.0, num_segments: int = 0, segments: MutableSequence[tuple[Timestamp, Timestamp]] | None = None)

Bases: object

Bulum implementation of NEGFLO.

When there are negative overflows from the smoothing algorithm, they will be noted in self.neg_overflows.

cl1() → DataFrame

Clip all negative flows to zero.

Returns: The clipped residual dataframe.
Return type: pd.DataFrame
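As a rough illustration of what clipping does to the water balance, the sketch below works on a plain list of floats rather than a dataframe. The function name clip_negative is hypothetical and this is not the cl1() implementation; it only shows that clipping removes the negative volume and therefore increases the total flow.

```python
def clip_negative(flows):
    """Replace every negative flow with zero (illustrative helper only)."""
    return [max(v, 0.0) for v in flows]


residual = [3.0, -1.5, 0.0, 2.0]
clipped = clip_negative(residual)

print(clipped)                      # [3.0, 0.0, 0.0, 2.0]
print(sum(residual), sum(clipped))  # 3.5 5.0
```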

property df_residual: DataFrame: The current working residual dataframe (read-only).

log() → None

Not yet implemented.

Input_file_name.LOG

A file is also created that gives the total of the positive and negative flows and the total of the positive flows above the flow limit. It also gives the start and end of each period of flows above the flow limit, together with the total of the preceding negative flows and the total of the positive flow above the flow limit for that period.

run_all(folder: str = '.') → None: Runs all analyses on the residual, saving each to folder.

rw1() → DataFrame

Compute the raw residual i.e. downstream-upstream flows.

Internally, resets the residual to that stored on initialisation.

Returns: The raw residual dataframe.
Return type: pd.DataFrame

sm1() → DataFrame

Redistribute negative flows across all positive flow events.

The negative flows are set to zero and the excess positive flows are scaled by a factor of:

1 - abs(Total of the negative flows)/(Total of the positive flows)

Returns: The smoothed residual dataframe.
Return type: pd.DataFrame
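The scaling factor can be illustrated with a minimal sketch on a plain list of floats. This is not the sm1() implementation: the function name smooth_global is hypothetical, the flow limit is assumed to be zero, and the clamp applied when the negatives exceed the positives is an assumption made only to keep the example well defined.

```python
def smooth_global(flows):
    """Zero the negative flows and scale the positive flows to absorb them.

    Illustrative only; assumes a flow limit of zero.
    """
    neg_total = sum(v for v in flows if v < 0)
    pos_total = sum(v for v in flows if v > 0)
    if pos_total <= 0:
        # No positive volume to absorb the negatives.
        return [0.0 for _ in flows]
    # factor = 1 - abs(total negative) / (total positive)
    factor = 1 - abs(neg_total) / pos_total
    # Assumption: never scale below zero if negatives exceed positives.
    factor = max(factor, 0.0)
    return [v * factor if v > 0 else 0.0 for v in flows]


residual = [6.0, -1.0, 2.0, -1.0]
smoothed = smooth_global(residual)

print(smoothed)                      # [4.5, 0.0, 1.5, 0.0]
print(sum(residual), sum(smoothed))  # 6.0 6.0
```

Unlike clipping, the total volume is preserved as long as the positive flows are large enough to absorb the negative flows.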

sm2() → DataFrame

Redistribute negative flows into future positive flow events, with carry-over.

Accumulated negative flows are factored into positive flow events (defined as periods above the flow limit) using the formula from before, namely:

1 - abs(Total of the accumulated negative flows)/(Total of the positive flow period)

Note that this will not reduce flows below the specified flow limit (self.flow_limit).

This method accumulates negative flows such that if the first encountered positive flow period is not sufficiently large, it will load the remaining balance into the next positive flow period.

As before, if the flow limit is set to zero, the modelled flows will have a mean that is close to the mean of the measured flows. However, this can eliminate small flow peaks if there are a lot of negative flows. Setting the flow limit to a high flow preserves these peaks, but can severely reduce the high flows. It can also give a ranked flow plot with a notch at the flow limit.

Returns: The smoothed residual dataframe.
Return type: pd.DataFrame
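One possible reading of the forward redistribution is sketched below on a plain list of floats. It is not the sm2() implementation. The function name smooth_forward and the carry_over flag are hypothetical, the flow limit is assumed to be non-negative, and the sketch assumes that only the portion of each event above the flow limit is scaled, so that flows are never reduced below the limit.

```python
def smooth_forward(flows, flow_limit=0.0, carry_over=True):
    """Push accumulated negative flow into later events above flow_limit.

    Returns (smoothed flows, unabsorbed negative volume). Illustrative only.
    """
    out = list(flows)
    debt = 0.0  # accumulated negative volume, stored as a positive number
    i, n = 0, len(out)
    while i < n:
        if out[i] < 0:
            # Accumulate the negative flow and zero it out.
            debt += -out[i]
            out[i] = 0.0
            i += 1
        elif out[i] > flow_limit:
            # Find the end of this contiguous event above the flow limit.
            j = i
            while j < n and out[j] > flow_limit:
                j += 1
            # Volume above the flow limit available to absorb the debt.
            excess = sum(out[k] - flow_limit for k in range(i, j))
            if debt > 0:
                factor = max(1 - debt / excess, 0.0)
                for k in range(i, j):
                    out[k] = flow_limit + (out[k] - flow_limit) * factor
                # Either carry the remaining balance forward or drop it.
                debt = max(debt - excess, 0.0) if carry_over else 0.0
            i = j
        else:
            i += 1
    return out, debt


print(smooth_forward([-2.0, 1.0, 3.0, 0.0, -1.0, 8.0]))
# ([0.0, 0.5, 1.5, 0.0, 0.0, 7.0], 0.0)

# The first event is too small, so the balance moves to the next event.
print(smooth_forward([-5.0, 2.0, 0.0, 6.0]))
# ([0.0, 0.0, 0.0, 3.0], 0.0)
```

Calling the same function with carry_over=False drops the remaining balance after the first event, which is one way to picture the difference between the carry-over and no-carry-over variants.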

sm3() → DataFrame

Redistribute negative flows into future positive flow events, without carry-over.

Returns: The smoothed residual dataframe.
Return type: pd.DataFrame

See also

sm2()

sm4() → DataFrame

Redistribute negative flows into past positive flow events, carrying forward negative flow into the future. Refer to sm2().

Returns: The smoothed residual dataframe.
Return type: pd.DataFrame

sm5() → DataFrame

Redistribute negative flows into past positive flow events, without carrying negative flows into the future.

Returns: The smoothed residual dataframe.
Return type: pd.DataFrame

See also

sm2()
sm4()

sm6(*, use_predefined_segments=True, sampling_frequency: DateOffset | None = None, sampling_start_date: Timestamp | None = None) → DataFrame

Smooths over the specified segments.

Warning

Not fully tested/implemented. If this does not work, please let us know. As a workaround, you should be able to chunk the data manually and apply sm1() to each chunk. See e.g. utils.get_wy().

Applies the SM1 smoothing algorithm (i.e. global smoothing) to the flows within each of the specified periods. If no segments are defined or method is set to sample, then it will partition the full period on an annual (default) basis.

Unlike the reference documentation, this function does not set the flow limit to zero while smoothing.

Assumes the indices of the underlying dataframe are datetimes.

Parameters:

use_predefined_segments (bool, default True) – Use the stored segments (self.sm6_segment_boundaries) if they exist. Otherwise the segments will be computed when this method is called.
sampling_frequency (pd.DateOffset, optional) – Specifies the time interval for smoothing periods. Defaults to one year.
sampling_start_date (pd.Timestamp, optional) – Specifies the start of the first period for sampling. Defaults to the start of the data period.

Returns:

The smoothed residual dataframe.

Return type:

pd.DataFrame

See also

sm1()
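The manual workaround of chunking the data and smoothing each chunk can be sketched as follows. This is not the sm6() implementation: it works on a plain list of (date, flow) pairs rather than a dataframe, the names smooth_global and smooth_by_year are hypothetical, the flow limit is assumed to be zero, and calendar years are used as the segments for simplicity, whereas the method described above starts its periods at the start of the data.

```python
from collections import defaultdict
from datetime import date


def smooth_global(flows):
    """Zero the negatives and scale the positives (flow limit assumed zero)."""
    neg_total = sum(v for v in flows if v < 0)
    pos_total = sum(v for v in flows if v > 0)
    if pos_total <= 0:
        return [0.0 for _ in flows]
    factor = max(1 - abs(neg_total) / pos_total, 0.0)
    return [v * factor if v > 0 else 0.0 for v in flows]


def smooth_by_year(series):
    """Apply global smoothing separately within each calendar year.

    series is a list of (date, flow) pairs in date order.
    """
    by_year = defaultdict(list)
    for day, flow in series:
        by_year[day.year].append((day, flow))
    out = []
    for year in sorted(by_year):
        days, flows = zip(*by_year[year])
        out.extend(zip(days, smooth_global(list(flows))))
    return out


series = [
    (date(2000, 1, 1), 6.0),
    (date(2000, 6, 1), -1.0),
    (date(2000, 7, 1), 2.0),
    (date(2000, 12, 1), -1.0),
    (date(2001, 1, 1), -2.0),
    (date(2001, 3, 1), 4.0),
]

print([flow for _, flow in smooth_by_year(series)])
# [4.5, 0.0, 1.5, 0.0, 0.0, 2.0]
```

Negative flows in one segment never affect the positive flows of another, which is the main difference from smoothing the whole record at once.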

sm7() → DataFrame

Smooths negative flows over the largest adjacent positive flow event.

Note

Unlike the reference document, this program does not require the flow limit to be negative.

Returns: The smoothed residual dataframe.
Return type: pd.DataFrame

See also

sm2()

to_file(*, out_filename: str | None = None, folder: str | None = None) → None

Saves the result dataframe to the output file.

Parameters:

out_filename (str, optional) – Explicit output path. If omitted, the filename is derived from df_name and the current analysis type extension.
folder (str, optional) – Directory in which to place the auto-named file. Ignored when out_filename is supplied.

class NegfloAnalysisType(*values)

Bases: Enum

The Negflo class keeps track of the most recent analysis performed.

CLIPPED = 0

RAW = -1

SMOOTHED_ALL = 1

SMOOTHED_BACKWARD = 4

SMOOTHED_BACKWARD_NO_CARRY = 5

SMOOTHED_FORWARD = 2

SMOOTHED_FORWARD_NO_CARRY = 3

SMOOTHED_NEG_LIM = 7

SMOOTHED_SEGMENTS = 6

to_file_extension() → str: Gives the corresponding file extension for the analysis type.

class NegfloFileType(*values)

Bases: Enum

Negflo file types. For use in config file setup.

IQQM = 0

IQQM_GUI = 1

SOURCE_INPUT = 2

SOURCE_OUTPUT = 3