bulum.utils.datetime_functions module
- get_date_format(date_str: str) str
Determine the date format of a date string using trial and error.
This function tries several common date formats and returns the first one that successfully parses the input string.
- Parameters:
date_str (str) – A date string to analyze for format detection.
- Returns:
The date format string (e.g., '%Y-%m-%d', '%d/%m/%Y') that matches the input string.
- Return type:
str
- Raises:
ValueError – If none of the supported date formats can parse the input string.
Examples
>>> get_date_format("2023-12-25")
'%Y-%m-%d'
>>> get_date_format("25/12/2023")
'%d/%m/%Y'
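The trial-and-error approach described above can be sketched with datetime.strptime. This is a minimal illustration rather than bulum's implementation: the helper name detect_date_format and the list CANDIDATE_FORMATS are assumptions, and the real function may try different formats in a different order.

```python
from datetime import datetime

# Hypothetical candidate list; the formats bulum actually tries are not stated here.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d", "%d-%m-%Y"]


def detect_date_format(date_str: str) -> str:
    """Return the first candidate format that parses date_str."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(date_str, fmt)
        except ValueError:
            continue  # this format does not match; try the next one
        return fmt
    raise ValueError(f"No supported date format matches {date_str!r}")


print(detect_date_format("2023-12-25"))  # %Y-%m-%d
print(detect_date_format("25/12/2023"))  # %d/%m/%Y
```

With this approach the order of the candidates matters: an ambiguous string such as "01/02/2023" is reported under whichever matching format is tried first.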
- get_dates(start_date: str, end_date: str | None = None, days: int = 0, years: int = 1, include_end_date: bool = False, str_format: str | None = None) list[str]
- get_dates(start_date: datetime, end_date: datetime | None = None, days: int = 0, years: int = 1, include_end_date: bool = False, str_format: str | None = None) list[str] | list[datetime]
Generates a list of daily datetime values from a given start date.
The length may be defined by an end_date, or a number of days, or a number of years. This function is useful for working with daily datasets and models. Defaults to 1 year after start_date if end_date, days, and years are not specified.
- Parameters:
start_date (datetime | str) – The starting date for the sequence.
end_date (Optional[datetime | str], default None) – The ending date for the sequence. If provided, takes precedence over days and years.
days (int, default 0) – Number of days to generate. If > 0, takes precedence over years parameter.
years (int, default 1) – Number of years to generate if neither end_date nor days are specified.
include_end_date (bool, default False) – Whether to include the end_date in the generated sequence.
str_format (Optional[str], default None) – If provided, returns string dates in this format instead of datetime objects.
- Returns:
A list of datetime objects or formatted date strings covering the specified range.
- Return type:
list[str] | list[datetime]
- Raises:
ValueError – If years <= 0 when using years parameter for date generation.
Examples
>>> get_dates(datetime(2023, 1, 1), days=3)
[datetime.datetime(2023, 1, 1, 0, 0), datetime.datetime(2023, 1, 2, 0, 0), datetime.datetime(2023, 1, 3, 0, 0)]
>>> get_dates('2023-01-01', '2023-01-03', str_format='%Y-%m-%d')
['2023-01-01', '2023-01-02']
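The precedence among end_date, days, and years can be sketched as follows. This is a stdlib-only illustration under stated assumptions, not bulum's code: the name daily_dates is hypothetical, include_end is assumed to apply only to an explicit end date, and the 29 February fallback is a guess at reasonable behaviour.

```python
from datetime import datetime, timedelta
from typing import Optional


def daily_dates(start: datetime, end: Optional[datetime] = None, days: int = 0,
                years: int = 1, include_end: bool = False) -> list[datetime]:
    """Illustrative only: resolve the range end using end > days > years."""
    if end is None:
        if days > 0:
            end = start + timedelta(days=days)
        else:
            if years <= 0:
                raise ValueError("years must be positive")
            try:
                end = start.replace(year=start.year + years)
            except ValueError:
                # Assumed fallback: 29 February has no match in a non-leap year
                end = start.replace(year=start.year + years, day=28)
    elif include_end:
        end = end + timedelta(days=1)  # make the explicit end date inclusive
    n = (end - start).days
    return [start + timedelta(days=i) for i in range(n)]


print(len(daily_dates(datetime(2023, 1, 1), days=3)))                # 3
print(len(daily_dates(datetime(2023, 1, 1), datetime(2023, 1, 3))))  # 2
```

Treating the resolved end as exclusive by default matches the second documented example, where a range ending on '2023-01-03' yields two dates.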
- get_month(dates: Iterable[str]) list[int]
Extract month numbers from a list of date strings.
- Parameters:
dates (Iterable[str]) – Iterable of date strings in YYYY-MM-DD format. Assumes consecutive dates.
- Returns:
List of month numbers (1-12) corresponding to the input dates.
- Return type:
list[int]
Examples
>>> get_month(['2023-01-15', '2023-01-16'])
[1, 1]
>>> get_month(['2023-12-31'])
[12]
- get_next_month_start(stringdate: str) str
Get the first day of the next month for a given date.
- Parameters:
stringdate (str) – Date string in YYYY-MM-DD format.
- Returns:
Date string in YYYY-MM-DD format representing the first day of the next month.
- Return type:
str
Examples
>>> get_next_month_start("2023-02-15")
'2023-03-01'
>>> get_next_month_start("2023-12-15")
'2024-01-01'
- get_prev_month_end(stringdate: str) str
Get the last day of the previous month for a given date.
- Parameters:
stringdate (str) – Date string in YYYY-MM-DD format.
- Returns:
Date string in YYYY-MM-DD format representing the last day of the previous month.
- Return type:
str
Examples
>>> get_prev_month_end("2023-03-15")
'2023-02-28'
>>> get_prev_month_end("2024-03-15")  # Leap year
'2024-02-29'
- get_this_month_end(stringdate: str) str
Get the last day of the current month for a given date.
- Parameters:
stringdate (str) – Date string in YYYY-MM-DD format.
- Returns:
Date string in YYYY-MM-DD format representing the last day of the current month.
- Return type:
str
Examples
>>> get_this_month_end("2023-02-15")
'2023-02-28'
>>> get_this_month_end("2024-02-15")  # Leap year
'2024-02-29'
>>> get_this_month_end("2023-04-15")
'2023-04-30'
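The three month-boundary helpers above can each be expressed in a few lines of standard-library date arithmetic. The sketch below is an assumption about one reasonable way to compute them, not the library's source; the function names next_month_start, prev_month_end, and this_month_end are hypothetical stand-ins.

```python
import calendar
from datetime import date, timedelta


def next_month_start(s: str) -> str:
    d = date.fromisoformat(s)
    # Roll December over into January of the following year
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return date(year, month, 1).isoformat()


def prev_month_end(s: str) -> str:
    d = date.fromisoformat(s)
    # The day before the first of this month is the last day of the previous month
    return (d.replace(day=1) - timedelta(days=1)).isoformat()


def this_month_end(s: str) -> str:
    d = date.fromisoformat(s)
    # monthrange returns (weekday of the first day, number of days in the month)
    last_day = calendar.monthrange(d.year, d.month)[1]
    return d.replace(day=last_day).isoformat()


print(next_month_start("2023-12-15"))  # 2024-01-01
print(prev_month_end("2024-03-15"))    # 2024-02-29
print(this_month_end("2023-04-15"))    # 2023-04-30
```

Leap years need no special handling here because both timedelta subtraction and calendar.monthrange account for them.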
- get_wy(dates: Index | list[str] | list[datetime64], wy_month: int = 7, using_end_year: bool = False) list[int]
Returns water years for a given array of dates.
Use this function to add water year information into a pandas DataFrame. Assumes consecutive dates for efficiency.
- Parameters:
dates (pd.Index | list[str] | list[np.datetime64]) – Array of dates. Assumes consecutive dates.
wy_month (int, default 7) – Water year start month (1=January, 7=July, etc.).
using_end_year (bool, default False) –
Water year labeling convention:
False: Labels each water year by the calendar year in which it starts, aligning water years with the primary water allocation at the start of the water year.
True: Follows the fiscal year convention, whereby water years are labeled by the year in which they end. Under this convention, the 2022 water year runs from 2021-07-01 to 2022-06-30 inclusive.
- Returns:
The water years corresponding to the given dates.
- Return type:
list[int]
Examples
Basic usage with default July start:
>>> get_wy(['2023-06-30', '2023-07-01'])
[2022, 2023]
Using fiscal year convention:
>>> get_wy(['2023-06-30', '2023-07-01'], using_end_year=True)
[2023, 2024]
Integration with pandas for aggregation:
>>> df.groupby(get_wy(df.index, wy_month=7)).sum().median()
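The two labeling conventions can be illustrated for a single date. This is a sketch under assumptions, not the library's algorithm (which works on whole arrays and assumes consecutive dates): the name water_year is hypothetical, and the treatment of wy_month=1 under using_end_year=True is a guess.

```python
from datetime import date


def water_year(d: date, wy_month: int = 7, using_end_year: bool = False) -> int:
    # Calendar year in which the water year containing d starts
    start_year = d.year if d.month >= wy_month else d.year - 1
    if using_end_year and wy_month > 1:
        # Assumption: a water year starting after January ends in the next calendar year
        return start_year + 1
    return start_year


print(water_year(date(2023, 6, 30)))                       # 2022
print(water_year(date(2023, 7, 1)))                        # 2023
print(water_year(date(2023, 6, 30), using_end_year=True))  # 2023
print(water_year(date(2023, 7, 1), using_end_year=True))   # 2024
```

The outputs agree with the documented examples, and the fiscal case places 2021-07-01 through 2022-06-30 in water year 2022.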
- get_wy_end_date(df: DataFrame, wy_month: int = 7) datetime
Returns an appropriate water year end date based on data frame dates and the water year start month.
- Parameters:
df (pd.DataFrame) – Dataframe with date as index
wy_month (int, default 7) – Water year start month.
- Returns:
Water year end date.
- Return type:
datetime
- get_wy_start_date(df: Series | DataFrame, wy_month: int = 7) datetime
Returns an appropriate water year start date based on data frame dates and the water year start month.
- Parameters:
df (pd.Series | pd.DataFrame) – Series or DataFrame with dates as the index.
wy_month (int, default 7) – Water year start month.
- Returns:
Water year start date.
- Return type:
datetime
- get_year_and_month(v: list[str] | list[datetime]) list[str]
Extract year and month strings from a list of dates.
Returns year and month strings in YYYY-MM format for aggregation by month.
- Parameters:
v (list[str] | list[datetime]) – List of date strings in YYYY-MM-DD format or datetime objects.
- Returns:
List of year-month strings in YYYY-MM format.
- Return type:
list[str]
Examples
>>> get_year_and_month(['2023-01-15', '2023-02-20'])
['2023-01', '2023-02']
>>> from datetime import datetime
>>> get_year_and_month([datetime(2023, 1, 15), datetime(2023, 2, 20)])
['2023-01', '2023-02']
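To show how YYYY-MM keys support aggregation by month, the sketch below sums some daily values per month using only the standard library. The data and the names dates, flows, and monthly_totals are hypothetical, and slicing ISO strings is a stand-in for calling get_year_and_month.

```python
from collections import defaultdict

dates = ["2023-01-15", "2023-01-16", "2023-02-20"]
flows = [1.0, 2.0, 4.0]  # hypothetical daily values

# For ISO date strings, the first seven characters are the YYYY-MM key
keys = [d[:7] for d in dates]

monthly_totals: dict[str, float] = defaultdict(float)
for key, flow in zip(keys, flows):
    monthly_totals[key] += flow

print(dict(monthly_totals))  # {'2023-01': 3.0, '2023-02': 4.0}
```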
- standardise_datestring_format(values)
Australian spelling version of standardize_datestring_format().
- standardize_datestring_format(values: list[str]) list[str]
Converts a list of date strings into a list of date strings in the format YYYY-MM-DD.
This function automatically detects the input date format and converts all dates to the standard ISO 8601 format (YYYY-MM-DD). Uses numpy datetime64 for efficient processing. Tested over the range 0001-01-01 to 9999-12-31.
- Parameters:
values (list[str]) – List of date strings in any supported format.
- Returns:
List of date strings in YYYY-MM-DD format.
- Return type:
list[str]
Examples
>>> standardize_datestring_format(["25/12/2023", "26/12/2023"])
['2023-12-25', '2023-12-26']
- to_np_datetimes64d(values: list[str], date_fmt: str = '%Y-%m-%d', *, check_length: bool = True) ndarray[tuple[Any, ...], dtype[datetime64]]
Convert a list of date strings to numpy datetime64[D] array.
This function efficiently converts date strings to numpy datetime64 arrays with day precision. It generates all dates between the first and last date in the input. Handles edge cases at the end of the representable date range (9999-12-31).
- Parameters:
values (list[str]) – List of date strings to convert. Can also accept pandas Series. Generates all dates from first to last date (inclusive).
date_fmt (str, default '%Y-%m-%d') – The date format string for parsing the input dates.
check_length (bool, default True) – Whether to validate that the number of generated dates matches the input length. If True and lengths don’t match, issues a warning but still returns all dates between start and end. Set to False to suppress the warning.
- Returns:
Numpy array of datetime64[D] values from first to last date (inclusive). Returns all dates in the range, regardless of input length.
- Return type:
np.typing.NDArray[np.datetime64]
- Warns:
UserWarning – If check_length is True and the number of generated dates doesn’t match the input length, indicating non-consecutive dates or gaps.
Examples
>>> dates = to_np_datetimes64d(['2023-01-01', '2023-01-02', '2023-01-03'])
>>> dates.dtype
dtype('<M8[D]')
>>> len(to_np_datetimes64d(['2023-01-01', '2023-01-03'], check_length=False))
3
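The range-filling and length-check behaviour described above can be sketched without numpy. This is an illustration under assumptions, not the library's implementation: fill_daily_range is a hypothetical name, it returns datetime.date objects rather than a datetime64[D] array, and it only accepts ISO-formatted strings.

```python
import warnings
from datetime import date, timedelta


def fill_daily_range(values: list[str], check_length: bool = True) -> list[date]:
    """Return every day from the first to the last input date, inclusive."""
    first = date.fromisoformat(values[0])
    last = date.fromisoformat(values[-1])
    n = (last - first).days + 1  # inclusive of both endpoints
    filled = [first + timedelta(days=i) for i in range(n)]
    if check_length and len(filled) != len(values):
        # A mismatch suggests gaps or non-consecutive dates in the input
        warnings.warn(
            f"Input has {len(values)} dates but the range spans {len(filled)} days",
            UserWarning,
        )
    return filled


print(len(fill_daily_range(["2023-01-01", "2023-01-03"], check_length=False)))  # 3
```

Only the first and last values are parsed, which is why consecutive input is assumed and why a length mismatch is the signal that the assumption was violated.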