Red Parquet - Apache Parquet Ruby

Red Parquet provides the Ruby bindings for Apache Parquet. It is based on GObject Introspection.

Apache Parquet is a columnar storage format.

GObject Introspection is middleware for language bindings of C libraries. It can generate language bindings automatically at runtime.

Red Parquet uses Apache Parquet GLib and the gobject-introspection gem to generate the Ruby bindings for Apache Parquet.

Apache Parquet GLib is a C wrapper for Apache Parquet C++. GObject Introspection can't use Apache Parquet C++ directly, so Apache Parquet GLib serves as the bridge between Apache Parquet C++ and GObject Introspection.

The gobject-introspection gem provides the Ruby bindings for GObject Introspection. Red Parquet uses GObject Introspection via the gobject-introspection gem.

Install

Install Apache Parquet GLib before installing Red Parquet. See the Apache Arrow install document for details.

Install Red Parquet after you install Apache Parquet GLib:

% gem install red-parquet

Usage

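# Loading "parquet" adds Parquet support to Arrow::Table.load and Arrow::Table#save.
require "parquet"

# The format is detected from the ".parquet" file extension.
table = Arrow::Table.load("/dev/shm/data.parquet")
# Process data in table
table.save("/dev/shm/data-processed.parquet")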