Sequel::Packer
Sequel::Packer
is a Ruby serialization library to be used with the Sequel
ORM with the following qualities:
- Declarative: Define the shape of your serialized data with a simple, straightforward DSL.
- Flexible: Certain contexts require different data. Packers provide an easy way to opt-in to serializing certain data only when you need it. The library also provides convenient escape hatches when you need to do something not explicitly supported by the API.
- Reusable: The Packer library naturally composes well with itself. Nested data can be serialized in the same way no matter what endpoint it's fetched from.
- Efficient: When not using Sequel's
TacticalEagerLoading
plugin, the Packer library will intelligently determine which associations and nested associations it needs to eager load in order to avoid any N+1 query issues.
Example
Sequel::Packer
uses your existing Sequel::Model
declarations and leverages
the use of associations to efficiently serialize data.
class User < Sequel::Model(:users)
one_to_many :posts
end
class Post < Sequel::Model(:posts); end
Packer definitions use a simple domain-specific language (DSL) to declare which fields to serialize:
class PostPacker < Sequel::Packer
model Post
field :id
field :title
trait :truncated_content do
field :truncated_content do |post|
post.content[0..Post::PREVIEW_LENGTH]
end
end
end
class UserPacker < Sequel::Packer
model User
field :id
field :name
trait :posts do
field :posts, PostPacker, :truncated_content
end
end
Once defined, Packers are easy to use; just call .pack
and pass in a Sequel
dataset, an array of models, or a single model, and get back Ruby hashes.
From there you can simply call to_json
on the result!
UserPacker.pack(User.dataset)
=> [
{id: 1, name: 'Paul'},
{id: 2, name: 'Julius'},
...
]
UserPacker.pack(User[1], :posts)
=> {
id: 1,
name: 'Paul',
posts: [
{
id: 15,
title: 'Announcing Sequel::Packer!',
truncated_content: 'Sequel::Packer is a new gem...',
},
{
id: 21,
title: 'Postgres Internals',
truncated_content: 'I never quite understood autovacuum...',
},
...
],
}
Contents
- Example
- Getting Started
- API Reference
- Using a Packer
- Defining a Packer
self.model(sequel_model_class)
self.field(column_name)
(orself.field(method_name)
)self.field(key, &block)
self.field(association, subpacker, *traits)
self.field(&block)
self.trait(trait_name, &block)
self.eager(*associations)
self.set_association_packer(association, subpacker, *traits)
self.pack_association(association, models)
self.precompute(&block)
- Context
self.with_context(&block)
- Potential Future Functionality
- Contributing
- Attribution
- License
Getting Started
This section will explain the basic use of Sequel::Packer
. Check out the API
Reference for an exhaustive coverage of the API and more
detailed documentation.
Installation
Add this line to your application's Gemfile:
gem 'sequel-packer'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install sequel-packer
Example Schema
Most of the following examples will use the following database schema:
DB.create_table(:users) do
primary_key :id
String :name
end
DB.create_table(:posts) do
primary_key :id
foreign_key :author_id, :users
String :title
String :content
end
DB.create_table(:comments) do
primary_key :id
foreign_key :author_id, :users
foreign_key :post_id, :posts
String :content
end
class User < Sequel::Model(:users)
one_to_many :posts, key: :author_id, class: :Post
end
class Post < Sequel::Model(:posts)
one_to_many :comments, key: :post_id, class: :Comment
end
class Comment < Sequel::Model(:comments)
many_to_one :author, key: :author_id, class: :User
end
Basic Fields
Suppose an endpoint wants to fetch all the ten most recent comments by a user. After validating the user id, we end up with the Sequel dataset representing the data we want to return:
recent_comments = Comment
.where(author_id: user_id)
.order(:id.desc)
.limit(10)
We can define a Packer class to serialize just fields we want to, using a custom DSL:
class CommentPacker < Sequel::Packer
model Comment
field :id
field :content
end
This can then be used as follows:
CommentPacker.pack(recent_comments)
=> [
{id: 536, content: "Great post, man!"},
{id: 436, content: "lol"},
{id: 413, content: "What a story..."},
]
Packing Associations by Nesting Packers
Now, suppose that we want to fetch a post and all of its comments. We can do
this by defining another packer for Post
that uses the CommentPacker
:
class PostPacker < Sequel::Packer
model Post
field :id
field :title
field :content
field :comments, CommentPacker
end
Since post.comments
is an array of Sequel::Models
and not a primitive value,
we must tell the Packer how to serialize them using another packer. The second
argument in field :comments, CommentPacker
tells the PostPacker
to use the
pack those comments using the CommentPacker
.
We can then use this as follows:
PostPacker.pack(Post[validated_id])
=> [
{
id: 682,
title: "Announcing sequel-packer",
content: "I've written a new gem...",
comments: [
{id: 536, content: "Great post, man!"},
{id: 541, content: "Incredible, this solves my EXACT problem!"},
...
],
}
]
Traits
But suppose we want to be able to show who authored each comment on the post. We first have to define a packer for users:
class UserPacker < Sequel::Packer
model User
field :id
field :name
end
We could now define a new packer, CommentWithAuthorPacker
, and use that in the
PostPacker
instead, but then we'd have to redeclare all the other fields we
want on a packed Comment
:
class CommentWithAuthorPacker < Sequel::Packer
model Comment
field :author, UserPacker
# Also defined in CommentPacker!
field :id
field :content
end
class PostPacker < Sequel::Packer
...
# Eww!
- field :comments, CommentPacker
+ field :comments, CommentWithAuthorPacker
end
Declaring these fields in two places could cause them to get out of sync
as more fields are added. Instead, we will use a trait. A trait is a
way to define a set of fields that we only want to pack sometimes. Instead
of defining a totally new packer, we can extend the CommentPacker
as follows:
class CommentPacker < Sequel::Packer
model Comment
field :id
field :content
+ trait :author do
+ field :author, UserPacker
+ end
end
To use a trait, simply pass it in when calling pack
:
# Without the trait
CommentPacker.pack(Comment.dataset)
=> [
{id: 536, content: "Great post, man!"},
...
]
# With the trait
CommentPacker.pack(Comment.dataset, :author)
=> [
{
id: 536,
content: "Great post, man!",
author: {id: 1, name: "Paul Martinez"},
},
...
]
To use a trait when packing an association in another packer, simply include
the name of the trait as additional argument to field
. Thus, to modify our
PostPacker to pack comments with their authors we make the following change:
class PostPacker < Sequel::Packer
model Post
field :id
field :title
field :content
- field :comments, CommentPacker
+ field :comments, CommentPacker, :author
end
While the basic Packer DSL is convenient, traits are the things that make Packers so powerful. Each packer should define a small set of fields that every endpoint needs, but then traits can be used to pack additional data only when it's needed.
API Reference
Custom packers are written by creating subclasses of Sequel::Packer
. This
class defines a DSL for declaring how a Sequel Model will be converted into a
plain Ruby hash.
Using a Packer
Using a Packer is dead simple. There's a single class method:
self.pack(data, *traits, **context)
data
can be in the form of a Sequel dataset, an array of Sequel models, or
a single Sequel model. No matter which form the data is passed in, the Packer
class will ensure nested data is efficiently loaded.
To pack additional fields defined in a trait, pass the name of the trait as an
additional argument, e.g., UserPacker.pack(users, :recent_posts)
to include
recent posts with each user.
Finally, additional context can be provided to the Packer by passing additional
keyword arguments to pack
. This context is handled opaquely by the Packer, but
it can be accessed in the blocks passed to field
declarations. Common uses of
context
include passing in the current user making a request, or passing in
additional precomputed data.
The implementation of pack
is very simple. It creates an instance of a Packer,
by passing in the traits and the context, then calls pack
on that instance,
and passes in the data:
def self.pack(data, *traits, **context)
return nil if !data # small easy optimization to avoid unnecessary work
new(*traits, **context).pack(data)
end
It simply combines a constructor and single exposed instance method:
initialize(*traits, **context)
pack(data)
One instantiated, the same Packer could be used to pack data multiple times. This is unlikely to be needed, but the functionality is there.
Defining a Packer
self.model(sequel_model_class)
The beginning of each Packer class must begin with model MySequelModel
, which
specifies which Sequel Model this Packer class will serialize. This is mostly
to catch certain errors at load time, rather than at run time:
class UserPacker < Sequel::Packer
model User
...
end
self.field(column_name)
(or self.field(method_name)
)
Defining the shape of the outputted data is done using the field
method, which
exists in four different variants. This first variant is the simplest. It simply
fetches the value of the column from the model and stores it in the outputted
hash under a key of the same name. Essentially field :my_column
eventually
results in hash[:my_column] = model.my_column
.
Sequel Models define accessor methods for each column in the underlying table,
so technically underneath the hood Packer is actually calling the sending the
method column_name
to the model: hash[:my_column] = model.send(:my_column)
.
This means that the result of any method can be serialized using
field :method_name
. For example, suppose a User model has a first_name
and
last_name
column, and a helper method full_name
:
class User < Sequel::Model(:users)
def full_name
"#{first_name} #{last_name}"
end
end
Then when User.create(first_name: "Paul", last_name: "Martinez")
gets packed
with field :full_name
specified, the outputted hash will contain
full_name: "Paul Martinez"
.
self.field(key, &block)
A block can be passed to field
to perform arbitrary computation and store the
result under the specified key
. The block will be passed the model as a single
argument. Use this to call methods on the model that may take additional
arguments, or to "rename" a column.
Examples:
class MyPacker < Sequel::Packer
model MyModel
field :friendly_public_name do |model|
model.unfriendly_internal_name
end
# Shorthand for above
field :friendly_public_name, &:unfriendly_internal_name
field :foo do |model|
model.(baz, quux)
end
end
self.field(association, subpacker, *traits)
A Sequel association (defined in the model file using one_to_many
, or
many_to_one
, etc.), can be packed using another Packer class, possibly with
multiple traits specified. A similar output could be generated by doing:
field :association do |model|
subpacker.pack(model.association_dataset, *traits)
end
This form is very inefficient though, because it would result in a new subpacker getting instantiated for every packed model. Additionally, unless the subpacker is declared up-front, the Packer won't know to eager load that association, potentially resulting in many unnecessary database queries.
self.field(&block)
Passing a block but no key
to field
allows for arbitrary manipulation of the
packed hash. The block will be passed the model and the partially packed hash.
One potential usage is for dynamic keys that cannot be determined at load time,
but otherwise it's meant as a general escape hatch.
field do |model, hash|
hash[model.compute_dynamic_key] = model.dynamic_value
end
self.trait(trait_name, &block)
Define optional serialization behavior by defining additional fields within a
trait
block. Traits can be opted into when initializing a packer by passing
the name of the trait as an argument:
class MyPacker < Sequel::Packer
model MyObj
field :id
trait :my_trait do
field :trait_field
end
end
# packed objects don't have trait_field
MyPacker.pack(dataset)
=> [{id: 1}, {id: 2}, ...]
# packed objects do have trait_field
MyPacker.pack(dataset, :my_trait)
=> [{id: 1, trait_field: 'foo'}, {id: 2, trait_field: 'bar'}, ...]
Traits can also be used when packing associations by passing the name of the traits after the packer class:
class MyOtherPacker < Sequel::Packer
model MyObj
field :my_packers, MyPacker, :my_trait
end
self.eager(*associations)
When packing an association, a Packer will automatically ensure that association
is eager loaded, but there may be cases when an association will be accessed
that the Packer doesn't know about. In these cases you can tell the Packer to
eager load that data by calling eager(*associations)
, passing in arguments
the exact same way you would to Sequel::Dataset.eager
.
One case where this may be useful is for a "count" field, that just lists the number of associated objects, but doesn't actually return them:
class UserPacker < Sequel::Packer
model User
field :id
eager(:posts)
field(:num_posts) do |user|
user.posts.count
end
end
UserPacker.pack(User.dataset)
=> [
{id: 123, num_posts: 7},
{id: 456, num_posts: 3},
...
]
Using eager
can help prevent N+1 query problems when not using Sequel's
TacticalEagerLoading
plugin.
Another use of eager
, even when using TacticalEagerLoading
, is to modify or
limit which records gets fetched from the database by using an eager proc. For
example, to only pack recent posts, published in the past month, we might do:
class UserPacker < Sequel::Packer
model User
field :id
trait :recent_posts do
eager posts: (proc {|ds| ds.where {created_at > Time.now - 1.month}})
field :posts, PostIdPacker
end
end
IMPORTANT NOTE: Eager procs are not guaranteed to be executed when passing
in models, rather than a dataset, to pack
. Specifically, if the models already
have fetched the association, the Packer won't refetch it. Because of this, it's
good practice to use set_association_packer
and pack_association
(see next
section) in a field
block and duplicate the filtering action.
Also keep in mind that this limits the association that gets used by ALL fields,
so if another field actually needs access to all the users posts, it might not
make sense to use eager
.
Additionally, it's important to note that if eager
is called multiple times,
with multiple procs, each proc will get applied to the dataset, likely resulting
in overly restrictive filtering.
self.set_association_packer(association, subpacker, *traits)
See self.pack_association(association, models)
below.
self.pack_association(association, models)
The simplest way to pack an association is to use
self.field(association, subpacker, *traits)
, but sometimes this doesn't do
exactly what we want. We may want to pack the association under a different key
than the name of the association. Or we may only want to pack some of the
associated models (and it may be difficult or impossible to express which subset
we want to pack using eager
). Or perhaps we have a one_to_many
association
and instead of packing an array, we want to pack a single associated object
under a key. The two methods, set_association_packer
and pack_association
are designed to handle these cases.
First, we'll note that following are exactly equivalent:
field :my_assoc, MyAssocPacker, :trait1, :trait2
and
set_association_packer :my_assoc, MyAssocPacker, :trait1, :trait2
field :my_assoc do |model|
pack_association(:my_assoc, model.my_assoc)
end
set_association_packer
tells the Packer class that we will want to pack models
from a particular association using the designated Packer with the specified
traits. Declaring this ahead of time allows the Packer to ensure that the
association is eager loaded, as well as any nested associations used when using
the designated Packer with the specified traits.
pack_association
can then be used in a field
block to use that Packer after
the data has been fetched and we are actually packing the data. The key things
here are that we don't need to use the name of the association as the name of
the field, and that we can choose which models get serialized. If
pack_association
is passed an array, it will return an array of packed models,
but if it is passed a single model, it will return just that packed model.
Examples:
Use a different field name than the name of the association
set_association_packer :ugly_internal_names, InternalPacker
field :nice_external_names do |model|
pack_association(:ugly_internal_names, model.ugly_internal_names)
end
Pack a single instance of a one_to_many
association
class PostPacker < Sequel::Packer
set_association_packer :comments, CommentPacker
field :top_comment do |model|
pack_association(:comments, model.comments.max_by(&:num_likes))
end
end
self.precompute(&block)
Occasionally packing a model may require a computation that doesn't fit in with
the rest of the Packer paradigm. This may be a Sequel query that is particularly
difficult to express as an association, or even a call to an external service.
If such a computation can be performed in bulk, then the precompute
method can
be used as an entry point for that operation.
The precompute
method will execute a given block and pass it all of the models
that will be packed using that packer. This block will be executed a single
time, even when called by a deeply nested packer.
The precompute
block is instance_exec
ed in the context of the packer
instance, the result of any computation can be saved in a simple instance
variable (@precomputed_result
) and later referenced inside the blocks that are
passed to field
methods.
As an example, suppose a video uploading platform performs additional video
processing on every uploaded video and exposes the status of that processing as
a separate service over the network, rather than directly with the upload
metadata in the database. precompute
could be used as follows:
class VideoUploadPacker < Sequel::Packer
model VideoUpload
precompute do |video_uploads|
@processing_statuses = ResolutionService
.get_status_bulk(ids: video_uploads.map(&:id))
end
field :id
field :filename
field :processing_status do |video_upload|
@processing_statuses[video_upload.id]
end
end
Instance method versions
In addition to the class method versions of field
, eager
,
set_association_packer
, and precompute
, there are also regular instance method
versions which take the exact same arguments. When writing a trait
block, the
block is evaulated in the context of a new Packer instance and actually calls the
instance method versions instead.
Context
In addition to the data to be packed, and a set of traits, the pack
method
also accepts arbitrary keyword arguments. This is referred to as context
is
handled opaquely by the Packer. The data passed in here is saved as the
@context
instance variable, which is then accessible from within the blocks
passed to field
, trait
, and precompute
, for whatever purpose. It is also
automatically passed to any nested subpackers.
The most common usage for context would be to pass in the current user making a request. It could then be used to pack permission levels about records, for example.
class PostPacker < Sequel::Packer
model Post
eager :permissions
field :access_level do |post|
= post..find do |perm|
perm.user_id == @context[:user].id
end
.access_level
end
end
You might notice something inefficient about the above code. Even though we only
want to look at the user's permission record, we fetch ALL of the permission
records for each Post. Ideally we would filter the permissions
association
dataset when we call eager
, but we don't have access to @context
at that
point. This leads to the final DSL method available when writing a Packer:
self.with_context(&block)
You can pass a block to with_context
that will be executed as soon as a Packer
instance is constructed. The block can access @context
and can also call the
standard Packer DSL methods, field
, eager
, etc.
The above example could then be made more efficient as follows:
class PostPacker < Sequel::Packer
model Post
- eager :permissions
+ with_context do
+ eager permissions: (proc {|ds| ds.where(user_id: @context[:user].id)})
+ end
end
A very tricky usage of with_context
(and not recommended...) would be to
control the traits used on subpackers:
class UserPacker < Sequel::Packer
model User
with_context do
field :comments, CommentPacker, *@context[:comment_traits]
end
end
UserPacker.pack(User.dataset, comment_traits: [])
=> [{comments: [{id: 7}, ...]}]
UserPacker.pack(User.dataset, comment_traits: [:author])
=> [{comments: [{id: 7, author: {id: 1, ...}}, ...]}]
UserPacker.pack(User.dataset, comment_traits: [:num_likes])
=> [{comments: [{id: 7, likes: 53}, ...]}]
Potential Future Functionality
The 1.0.0 version of the Packer library is flexible to support many use cases. That said, of course there are ways to improve it! There are three main improvements I can imagine adding:
Automatically Generated Type Declarations
It would be fairly easy to add generate type definitions by adding arguments to
field
. Packers could produce TypeScript
interface
declarations, and adding a simple build step to a CI pipeline could enforce type
safety across the frontend and backend. Or they could produce
OpenAPI specifications, which could then
be used to automatically generate clients using something like
Swagger.
Lifecycle Hooks
It should be fairly easy to extend the Packer library using standard Ruby
features like subclassing, mixins, via include
, or even monkey-patching. It
may be beneficial to have explicit hooks for common operations however, like
before_fetch
, or around_pack
. It's more likely that these hooks are needed
for logging and tracing capabilities, than for actual functionality, so I'd
like to see some real-world usage before committing to a specific style of
integration.
Less Data Fetching
Sequel by default fetches every column in a table, but a Packer knows (roughly) what data is going to be used so it could only select the columns neede for actual serialization, and limit how much data is actually fetched from the database. I haven't done any benchmarking on this, so I'm not sure how much of a benefit could be gained by this, but it would be interesting!
This could work roughly as follows:
- Start by fetching all columns that appear in simple
field(:column_name)
declarations - Add any columns need to fetch nested associations, or to re-asscociate fetched
records with their "parent" models, using the
left_key
andright_key
fields of theAssociationReflections
- Add a
column(*columns)
DSL method to explicitly fetch additional columns
Other Enhancements
Here are some other potential enhancements, though these are less fleshed out.
- Support not including a key in a hash if the associated value is nil, to reduce size of outputted data.
- Support different casing of the outputted hashes, i.e.,
snake_case
vs.camelCase
. - Explicitly support different output formats, rather than just plain Ruby hashes, such as Protocol Buffers or Cap'n Proto.
- When using nested
precompute
blocks, the Packer has to flatten the associations of a model, which may be expensive, but has not been benchmarked. These flattened arrays already exist internally in Sequel when the eager loading occurs, but those aren't exposed. The code in Sequel could be re-implemented as part of the library to avoid re-constructing those arrays.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/PaulJuliusMartinez/sequel-packer.
Development
After checking out the repo, run bin/setup
to install dependencies. Then, run
rake test
to run the tests. You can also run bin/console
for an interactive
prompt that will allow you to experiment.
Releases
To release a new version, update the version number in
lib/sequel/packer/version.rb
, update the CHANGELOG.md
with new changes, then
run rake release
, which which will create a git tag for the version, push git
commits and tags, and push the .gem
file to
rubygems.org.
Attribution
Karthik Viswanathan designed the original API of the Packer library while at Affinity. This library is a ground up rewrite which defines a very similar API, but shares no code with the original implementation.
License
The gem is available as open source under the terms of the MIT License.