Method: Polars::Expr#rolling_var_by

Defined in:
lib/polars/expr.rb

#rolling_var_by(by, window_size, min_periods: 1, closed: "right", ddof: 1, warn_if_unsorted: nil) ⇒ Expr

Note:

If you want to compute multiple aggregation statistics over the same dynamic window, consider using rolling instead; it can cache the window size computation.

Compute a rolling variance based on another column.

Examples:

Create a DataFrame with a datetime column and a row number column

start = DateTime.new(2001, 1, 1)
stop = DateTime.new(2001, 1, 2)
df_temporal = Polars::DataFrame.new(
    {"date" => Polars.datetime_range(start, stop, "1h", eager: true)}
).with_row_index
# =>
# shape: (25, 2)
# ┌───────┬─────────────────────┐
# │ index ┆ date                │
# │ ---   ┆ ---                 │
# │ u32   ┆ datetime[ns]        │
# ╞═══════╪═════════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 │
# │ 1     ┆ 2001-01-01 01:00:00 │
# │ 2     ┆ 2001-01-01 02:00:00 │
# │ 3     ┆ 2001-01-01 03:00:00 │
# │ 4     ┆ 2001-01-01 04:00:00 │
# │ …     ┆ …                   │
# │ 20    ┆ 2001-01-01 20:00:00 │
# │ 21    ┆ 2001-01-01 21:00:00 │
# │ 22    ┆ 2001-01-01 22:00:00 │
# │ 23    ┆ 2001-01-01 23:00:00 │
# │ 24    ┆ 2001-01-02 00:00:00 │
# └───────┴─────────────────────┘

Compute the rolling variance with the temporal windows closed on the right (default)

df_temporal.with_columns(
  rolling_row_var: Polars.col("index").rolling_var_by("date", "2h")
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_var │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ f64             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ null            │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5             │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 0.5             │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 0.5             │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 0.5             │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 0.5             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 0.5             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 0.5             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 0.5             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 0.5             │
# └───────┴─────────────────────┴─────────────────┘

Compute the rolling variance with the temporal windows closed on both sides

df_temporal.with_columns(
  rolling_row_var: Polars.col("index").rolling_var_by(
    "date", "2h", closed: "both"
  )
)
# =>
# shape: (25, 3)
# ┌───────┬─────────────────────┬─────────────────┐
# │ index ┆ date                ┆ rolling_row_var │
# │ ---   ┆ ---                 ┆ ---             │
# │ u32   ┆ datetime[ns]        ┆ f64             │
# ╞═══════╪═════════════════════╪═════════════════╡
# │ 0     ┆ 2001-01-01 00:00:00 ┆ null            │
# │ 1     ┆ 2001-01-01 01:00:00 ┆ 0.5             │
# │ 2     ┆ 2001-01-01 02:00:00 ┆ 1.0             │
# │ 3     ┆ 2001-01-01 03:00:00 ┆ 1.0             │
# │ 4     ┆ 2001-01-01 04:00:00 ┆ 1.0             │
# │ …     ┆ …                   ┆ …               │
# │ 20    ┆ 2001-01-01 20:00:00 ┆ 1.0             │
# │ 21    ┆ 2001-01-01 21:00:00 ┆ 1.0             │
# │ 22    ┆ 2001-01-01 22:00:00 ┆ 1.0             │
# │ 23    ┆ 2001-01-01 23:00:00 ┆ 1.0             │
# │ 24    ┆ 2001-01-02 00:00:00 ┆ 1.0             │
# └───────┴─────────────────────┴─────────────────┘
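The values in the two tables above can be checked by hand. The following plain-Ruby sketch is only a reference for the window semantics, not the Polars implementation: rolling_var_by_reference is a hypothetical helper, the window is assumed to span from (timestamp - window_size) to timestamp, and a window with too few values for the given ddof is assumed to yield nil (shown as null above).

```ruby
# Plain-Ruby reference for a time-based rolling variance.
# Hypothetical helper for illustration; it does not use Polars.
def rolling_var_by_reference(values, times, window_seconds,
                             closed: "right", ddof: 1, min_periods: 1)
  values.each_index.map do |i|
    upper = times[i]
    lower = upper - window_seconds

    # Collect the values whose timestamps fall inside the window,
    # honouring which sides of the interval are inclusive.
    window = values.each_index.select do |j|
      t = times[j]
      above_lower = %w[left both].include?(closed) ? t >= lower : t > lower
      below_upper = %w[right both].include?(closed) ? t <= upper : t < upper
      above_lower && below_upper
    end.map { |j| values[j] }

    n = window.size
    # Assumption: no result when the divisor N - ddof is not positive.
    next nil if n < min_periods || n - ddof <= 0

    mean = window.sum.to_f / n
    window.sum { |v| (v - mean)**2 } / (n - ddof)
  end
end

# Same data as df_temporal: 25 hourly timestamps and a row index.
start = Time.utc(2001, 1, 1)
times = (0..24).map { |h| start + h * 3600 }
values = (0..24).to_a

right = rolling_var_by_reference(values, times, 2 * 3600)
both = rolling_var_by_reference(values, times, 2 * 3600, closed: "both")
```

With closed: "right" each full window holds two consecutive integers (variance 0.5); with closed: "both" it holds three (variance 1.0), which matches the tables.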

Parameters:

  • by (String)

    This column must be of dtype Datetime or Date.

  • window_size (String)

    The length of the window. Can be a dynamic temporal size indicated by a timedelta or the following string language:

    • 1ns (1 nanosecond)
    • 1us (1 microsecond)
    • 1ms (1 millisecond)
    • 1s (1 second)
    • 1m (1 minute)
    • 1h (1 hour)
    • 1d (1 calendar day)
    • 1w (1 calendar week)
    • 1mo (1 calendar month)
    • 1q (1 calendar quarter)
    • 1y (1 calendar year)

    By "calendar day", we mean the corresponding time on the next day (which may not be 24 hours, due to daylight saving time). Similarly for "calendar week", "calendar month", "calendar quarter", and "calendar year".

  • min_periods (Integer) (defaults to: 1)

    The number of values in the window that should be non-null before computing a result.

  • closed ('left', 'right', 'both', 'none') (defaults to: "right")

    Define which sides of the temporal interval are closed (inclusive).

  • ddof (Integer) (defaults to: 1)

    "Delta Degrees of Freedom": the divisor for a window of length N is N - ddof.

  • warn_if_unsorted (Boolean) (defaults to: nil)

    Warn if the data is not known to be sorted by the "by" column.
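To make the ddof divisor concrete, here is a minimal plain-Ruby sketch for a single window. The variance_with_ddof helper is hypothetical and is not part of Polars; it only illustrates the N - ddof rule described above.

```ruby
# Variance of one window with a configurable "Delta Degrees of Freedom".
# Hypothetical helper for illustration; it does not use Polars.
def variance_with_ddof(window, ddof: 1)
  n = window.size
  # Assumption: no result when the divisor N - ddof is not positive.
  return nil if n - ddof <= 0

  mean = window.sum.to_f / n
  window.sum { |v| (v - mean)**2 } / (n - ddof)
end

window = [2, 4, 6]
sample_var = variance_with_ddof(window)               # divisor is 3 - 1 = 2
population_var = variance_with_ddof(window, ddof: 0)  # divisor is 3 - 0 = 3
```

The default ddof: 1 gives the sample variance; ddof: 0 gives the population variance of the values in the window.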

Returns:

  • (Expr)

# File 'lib/polars/expr.rb', line 4749

def rolling_var_by(
  by,
  window_size,
  min_periods: 1,
  closed: "right",
  ddof: 1,
  warn_if_unsorted: nil
)
  window_size = _prepare_rolling_by_window_args(window_size)
  by = Utils.parse_into_expression(by)
  _from_rbexpr(
    _rbexpr.rolling_var_by(
      by,
      window_size,
      min_periods,
      closed,
      ddof
    )
  )
end