Method: Polars::SQLContext#execute

Defined in:
lib/polars/sql_context.rb

#execute(query, eager: nil) ⇒ Object

Parse the given SQL query and execute it against the registered frame data.

Examples:

Execute a SQL query against the registered frame data:

df = Polars::DataFrame.new(
  [
    ["The Godfather", 1972, 6_000_000, 134_821_952, 9.2],
    ["The Dark Knight", 2008, 185_000_000, 533_316_061, 9.0],
    ["Schindler's List", 1993, 22_000_000, 96_067_179, 8.9],
    ["Pulp Fiction", 1994, 8_000_000, 107_930_000, 8.9],
    ["The Shawshank Redemption", 1994, 25_000_000, 28_341_469, 9.3],
  ],
  schema: ["title", "release_year", "budget", "gross", "imdb_score"]
)
ctx = Polars::SQLContext.new(films: df)
ctx.execute(
  "
  SELECT title, release_year, imdb_score
  FROM films
  WHERE release_year > 1990
  ORDER BY imdb_score DESC
  ",
  eager: true
)
# =>
# shape: (4, 3)
# ┌──────────────────────────┬──────────────┬────────────┐
# │ title                    ┆ release_year ┆ imdb_score │
# │ ---                      ┆ ---          ┆ ---        │
# │ str                      ┆ i64          ┆ f64        │
# ╞══════════════════════════╪══════════════╪════════════╡
# │ The Shawshank Redemption ┆ 1994         ┆ 9.3        │
# │ The Dark Knight          ┆ 2008         ┆ 9.0        │
# │ Schindler's List         ┆ 1993         ┆ 8.9        │
# │ Pulp Fiction             ┆ 1994         ┆ 8.9        │
# └──────────────────────────┴──────────────┴────────────┘

Execute a GROUP BY query:

ctx.execute(
  "
  SELECT
      MAX(release_year / 10) * 10 AS decade,
      SUM(gross) AS total_gross,
      COUNT(title) AS n_films,
  FROM films
  GROUP BY (release_year / 10) -- decade
  ORDER BY total_gross DESC
  ",
  eager: true
)
# =>
# shape: (3, 3)
# ┌────────┬─────────────┬─────────┐
# │ decade ┆ total_gross ┆ n_films │
# │ ---    ┆ ---         ┆ ---     │
# │ i64    ┆ i64         ┆ u32     │
# ╞════════╪═════════════╪═════════╡
# │ 2000   ┆ 533316061   ┆ 1       │
# │ 1990   ┆ 232338648   ┆ 3       │
# │ 1970   ┆ 134821952   ┆ 1       │
# └────────┴─────────────┴─────────┘

Parameters:

  • query (String)

    A valid string SQL query.

  • eager (Boolean) (defaults to: nil)

    Apply the query eagerly, returning DataFrame instead of LazyFrame. If unset, the value of the init-time parameter "eager_execution" will be used. (Note that the query itself is always executed in lazy-mode; this parameter only impacts the type of the returned frame).

Returns:


89
90
91
92
# File 'lib/polars/sql_context.rb', line 89

def execute(query, eager: nil)
  res = Utils.wrap_ldf(_ctxt.execute(query))
  eager || _eager_execution ? res.collect : res
end