Sequel::Packer

Sequel::Packer is a Ruby serialization library to be used with the Sequel ORM with the following qualities:

  • Declarative: Define the shape of your serialized data with a simple, straightforward DSL.
  • Flexible: Certain contexts require different data. Packers provide an easy way to opt-in to serializing certain data only when you need it. The library also provides convenient escape hatches when you need to do something not explicitly supported by the API.
  • Reusable: The Packer library naturally composes well with itself. Nested data can be serialized in the same way no matter what endpoint it's fetched from.
  • Efficient: When not using Sequel's TacticalEagerLoading plugin, the Packer library will intelligently determine which associations and nested associations it needs to eager load in order to avoid any N+1 query issues.

Example

Sequel::Packer uses your existing Sequel::Model declarations and leverages the use of associations to efficiently serialize data.

class User < Sequel::Model(:users)
  one_to_many :posts
end
class Post < Sequel::Model(:posts); end

Packer definitions use a simple domain-specific language (DSL) to declare which fields to serialize:

class PostPacker < Sequel::Packer
  model Post

  field :id
  field :title

  trait :truncated_content do
    field :truncated_content do |post|
      post.content[0..Post::PREVIEW_LENGTH]
    end
  end
end

class UserPacker < Sequel::Packer
  model User

  field :id
  field :name

  trait :posts do
    field :posts, PostPacker, :truncated_content
  end
end

Once defined, Packers are easy to use; just call .pack and pass in a Sequel dataset, an array of models, or a single model, and get back Ruby hashes. From there you can simply call to_json on the result!

UserPacker.pack(User.dataset)
=> [
  {id: 1, name: 'Paul'},
  {id: 2, name: 'Julius'},
  ...
]

UserPacker.pack(User[1], :posts)
=> {
  id: 1,
  name: 'Paul',
  posts: [
    {
      id: 15,
      title: 'Announcing Sequel::Packer!',
      truncated_content: 'Sequel::Packer is a new gem...',
    },
    {
      id: 21,
      title: 'Postgres Internals',
      truncated_content: 'I never quite understood autovacuum...',
    },
    ...
  ],
}
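
As a small illustration of that last step, the sketch below serializes a hand-written array of hashes shaped like the output above. It uses only Ruby's standard json library; the sample data is made up and no packer is involved.

```ruby
require 'json'

# Hand-written stand-in for the result of UserPacker.pack(User.dataset).
SAMPLE_PACKED_USERS = [
  {id: 1, name: 'Paul'},
  {id: 2, name: 'Julius'},
].freeze

# to_json is added to Array and Hash by the json library.
SAMPLE_JSON = SAMPLE_PACKED_USERS.to_json
puts SAMPLE_JSON
```

Because the packed result is made of plain hashes, arrays, and primitive values, any JSON encoder that accepts those types should work the same way.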

Getting Started

This section will explain the basic use of Sequel::Packer. Check out the API Reference for an exhaustive coverage of the API and more detailed documentation.

Installation

Add this line to your application's Gemfile:

gem 'sequel-packer'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install sequel-packer

Example Schema

Most of the following examples will use the following database schema:

DB.create_table(:users) do
  primary_key :id
  String :name
end

DB.create_table(:posts) do
  primary_key :id
  foreign_key :author_id, :users
  String :title
  String :content
end

DB.create_table(:comments) do
  primary_key :id
  foreign_key :author_id, :users
  foreign_key :post_id, :posts
  String :content
end

class User < Sequel::Model(:users)
  one_to_many :posts, key: :author_id, class: :Post
end
class Post < Sequel::Model(:posts)
  one_to_many :comments, key: :post_id, class: :Comment
end
class Comment < Sequel::Model(:comments)
  many_to_one :author, key: :author_id, class: :User
end

Basic Fields

Suppose an endpoint wants to fetch the ten most recent comments by a user. After validating the user id, we end up with a Sequel dataset representing the data we want to return:

recent_comments = Comment
  .where(author_id: user_id)
  .order(Sequel.desc(:id))
  .limit(10)

We can define a Packer class to serialize just the fields we want, using a custom DSL:

class CommentPacker < Sequel::Packer
  model Comment

  field :id
  field :content
end

This can then be used as follows:

CommentPacker.pack(recent_comments)
=> [
  {id: 536, content: "Great post, man!"},
  {id: 436, content: "lol"},
  {id: 413, content: "What a story..."},
]

Packing Associations by Nesting Packers

Now, suppose that we want to fetch a post and all of its comments. We can do this by defining another packer for Post that uses the CommentPacker:

class PostPacker < Sequel::Packer
  model Post

  field :id
  field :title
  field :content
  field :comments, CommentPacker
end

Since post.comments is an array of Sequel::Models and not a primitive value, we must tell the Packer how to serialize them using another packer. The second argument in field :comments, CommentPacker tells the PostPacker to pack those comments using the CommentPacker.

We can then use this as follows:

PostPacker.pack(Post[validated_id])
=> {
  id: 682,
  title: "Announcing sequel-packer",
  content: "I've written a new gem...",
  comments: [
    {id: 536, content: "Great post, man!"},
    {id: 541, content: "Incredible, this solves my EXACT problem!"},
    ...
  ],
}

Traits

But suppose we want to be able to show who authored each comment on the post. We first have to define a packer for users:

class UserPacker < Sequel::Packer
  model User

  field :id
  field :name
end

We could now define a new packer, CommentWithAuthorPacker, and use that in the PostPacker instead, but then we'd have to redeclare all the other fields we want on a packed Comment:

class CommentWithAuthorPacker < Sequel::Packer
  model Comment

  field :author, UserPacker

  # Also defined in CommentPacker!
  field :id
  field :content
end

class PostPacker < Sequel::Packer
  ...

  # Eww!
- field :comments, CommentPacker
+ field :comments, CommentWithAuthorPacker
end

Declaring these fields in two places could cause them to get out of sync as more fields are added. Instead, we will use a trait. A trait is a way to define a set of fields that we only want to pack sometimes. Instead of defining a totally new packer, we can extend the CommentPacker as follows:

class CommentPacker < Sequel::Packer
  model Comment

  field :id
  field :content

+ trait :author do
+   field :author, UserPacker
+ end
end

To use a trait, simply pass it in when calling pack:

# Without the trait
CommentPacker.pack(Comment.dataset)
=> [
  {id: 536, content: "Great post, man!"},
  ...
]

# With the trait
CommentPacker.pack(Comment.dataset, :author)
=> [
  {
    id: 536,
    content: "Great post, man!",
    author: {id: 1, name: "Paul Martinez"},
  },
  ...
]

To use a trait when packing an association in another packer, simply include the name of the trait as an additional argument to field. Thus, to modify our PostPacker to pack comments with their authors we make the following change:

class PostPacker < Sequel::Packer
  model Post

  field :id
  field :title
  field :content

- field :comments, CommentPacker
+ field :comments, CommentPacker, :author
end

While the basic Packer DSL is convenient, traits are the things that make Packers so powerful. Each packer should define a small set of fields that every endpoint needs, but then traits can be used to pack additional data only when it's needed.

API Reference

Custom packers are written by creating subclasses of Sequel::Packer. This class defines a DSL for declaring how a Sequel Model will be converted into a plain Ruby hash.

Using a Packer

Using a Packer is dead simple. There's a single class method:

self.pack(data, *traits, **context)

data can be in the form of a Sequel dataset, an array of Sequel models, or a single Sequel model. No matter which form the data is passed in, the Packer class will ensure nested data is efficiently loaded.

To pack additional fields defined in a trait, pass the name of the trait as an additional argument, e.g., UserPacker.pack(users, :recent_posts) to include recent posts with each user.

Finally, additional context can be provided to the Packer by passing additional keyword arguments to pack. This context is handled opaquely by the Packer, but it can be accessed in the blocks passed to field declarations. Common uses of context include passing in the current user making a request, or passing in additional precomputed data.
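
To make the split between traits and context concrete, here is a minimal sketch of how Ruby distributes arguments for a method with the same parameter list as pack. The helper name split_pack_arguments is hypothetical and does no packing; it only reports what lands in each parameter.

```ruby
# Hypothetical helper with the same parameter list as pack. It packs nothing;
# it only reports how Ruby distributes the arguments it receives.
def split_pack_arguments(data, *traits, **context)
  {data: data, traits: traits, context: context}
end

# Positional arguments after the data become traits; keywords become context.
p split_pack_arguments([:user_a, :user_b], :recent_posts, current_user_id: 7)
```

Positional arguments after the first are collected into traits, and every keyword argument ends up in the context hash, which is empty when no keywords are given.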

The implementation of pack is very simple. It creates an instance of a Packer, by passing in the traits and the context, then calls pack on that instance, and passes in the data:

def self.pack(data, *traits, **context)
  return nil if !data # small easy optimization to avoid unnecessary work
  new(*traits, **context).pack(data)
end

It simply combines a constructor and a single exposed instance method:

initialize(*traits, **context)

pack(data)

Once instantiated, the same Packer could be used to pack data multiple times. This is unlikely to be needed, but the functionality is there.

Defining a Packer

self.model(sequel_model_class)

Each Packer class must begin with model MySequelModel, which specifies which Sequel Model this Packer class will serialize. This is mostly to catch certain errors at load time, rather than at run time:

class UserPacker < Sequel::Packer
  model User
  ...
end

self.field(column_name) (or self.field(method_name))

Defining the shape of the outputted data is done using the field method, which exists in four different variants. This first variant is the simplest. It simply fetches the value of the column from the model and stores it in the outputted hash under a key of the same name. Essentially field :my_column eventually results in hash[:my_column] = model.my_column.

Sequel Models define accessor methods for each column in the underlying table, so under the hood Packer is actually sending the method column_name to the model: hash[:my_column] = model.send(:my_column).
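
As a rough illustration of that dispatch, the sketch below uses a plain Struct in place of a Sequel model. The names FakeUser and pack_columns are made up for this example, and this is not the library's actual implementation.

```ruby
# Stand-in for a Sequel model; a real model would get these accessors from
# the columns of its table.
FakeUser = Struct.new(:id, :name)

# Hypothetical helper: copies each named attribute into a hash via send,
# which looks the method up by name at run time.
def pack_columns(model, column_names)
  column_names.each_with_object({}) do |column, hash|
    hash[column] = model.send(column)
  end
end

p pack_columns(FakeUser.new(1, 'Paul'), [:id, :name])
```

Since send does not care whether the name belongs to a column accessor or an ordinary method, the same loop would pick up a helper method defined on the model.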

This means that the result of any method can be serialized using field :method_name. For example, suppose a User model has a first_name and last_name column, and a helper method full_name:

class User < Sequel::Model(:users)
  def full_name
    "#{first_name} #{last_name}"
  end
end

Then when User.create(first_name: "Paul", last_name: "Martinez") gets packed with field :full_name specified, the outputted hash will contain full_name: "Paul Martinez".

self.field(key, &block)

A block can be passed to field to perform arbitrary computation and store the result under the specified key. The block will be passed the model as a single argument. Use this to call methods on the model that may take additional arguments, or to "rename" a column.

Examples:

class MyPacker < Sequel::Packer
  model MyModel

  field :friendly_public_name do |model|
    model.unfriendly_internal_name
  end

  # Shorthand for above
  field :friendly_public_name, &:unfriendly_internal_name

  field :foo do |model|
    model.bar(baz, quux)
  end
end
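
The shorthand relies on Ruby's Symbol#to_proc rather than anything specific to the Packer DSL. A small sketch, with a Struct standing in for the model (ToyRecord and the constant names are made up here):

```ruby
# Stand-in for a model with one awkwardly named accessor.
ToyRecord = Struct.new(:unfriendly_internal_name)

# The explicit block form...
LONG_FORM = proc { |model| model.unfriendly_internal_name }
# ...and the proc that &:unfriendly_internal_name is converted to.
SHORT_FORM = :unfriendly_internal_name.to_proc

toy_record = ToyRecord.new('widget-42')
puts LONG_FORM.call(toy_record) == SHORT_FORM.call(toy_record)
```

Both procs call the named method on their first argument, so the two field declarations above should behave the same way.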

self.field(association, subpacker, *traits)

A Sequel association (defined in the model file using one_to_many, many_to_one, etc.) can be packed using another Packer class, possibly with multiple traits specified. A similar output could be generated by doing:

field :association do |model|
  subpacker.pack(model.association_dataset, *traits)
end

This form is very inefficient though, because it would result in a new subpacker getting instantiated for every packed model. Additionally, unless the subpacker is declared up-front, the Packer won't know to eager load that association, potentially resulting in many unnecessary database queries.
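
To see why the number of queries matters, here is a toy model of the difference between per-model lookups and a single eager-style lookup. FakeCommentTable is an in-memory stand-in invented for this sketch; it counts method calls as if each were one SQL query and does not involve Sequel.

```ruby
# In-memory stand-in for a comments table. Every lookup method bumps
# query_count, as if it had issued one SQL query.
class FakeCommentTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Per-model lookup: one "query" for each post.
  def for_post(post_id)
    @query_count += 1
    @rows.select { |row| row[:post_id] == post_id }
  end

  # Eager-style lookup: one "query" for all posts, grouped in memory.
  def for_posts(post_ids)
    @query_count += 1
    @rows.select { |row| post_ids.include?(row[:post_id]) }
         .group_by { |row| row[:post_id] }
  end
end

FAKE_COMMENT_ROWS = [
  {id: 1, post_id: 10},
  {id: 2, post_id: 10},
  {id: 3, post_id: 11},
].freeze

lazy_table = FakeCommentTable.new(FAKE_COMMENT_ROWS)
[10, 11, 12].each { |post_id| lazy_table.for_post(post_id) }

eager_table = FakeCommentTable.new(FAKE_COMMENT_ROWS)
eager_table.for_posts([10, 11, 12])

puts "per-post lookups: #{lazy_table.query_count}, eager lookup: #{eager_table.query_count}"
```

The per-post count grows with the number of posts, while the eager-style count stays at one, which is the gap that declaring the subpacker up front is meant to close.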

self.field(&block)

Passing a block but no key to field allows for arbitrary manipulation of the packed hash. The block will be passed the model and the partially packed hash. One potential usage is for dynamic keys that cannot be determined at load time, but otherwise it's meant as a general escape hatch.

field do |model, hash|
  hash[model.compute_dynamic_key] = model.dynamic_value
end
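
The following sketch imitates that calling convention with plain lambdas: each step receives the model and the hash built so far. ToySetting, pack_with_steps, and TOY_STEPS are hypothetical names, and the loop is only an approximation of what a packer might do internally.

```ruby
# Stand-in for a model whose output key is only known at run time.
ToySetting = Struct.new(:id, :key, :value)

# Hypothetical packing loop: every step gets the model and the partially
# built hash, and may write any key it likes.
def pack_with_steps(model, steps)
  steps.each_with_object({}) { |step, hash| step.call(model, hash) }
end

TOY_STEPS = [
  ->(model, hash) { hash[:id] = model.id },
  ->(model, hash) { hash[model.key.to_sym] = model.value },
].freeze

p pack_with_steps(ToySetting.new(3, 'theme', 'dark'), TOY_STEPS)
```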

self.trait(trait_name, &block)

Define optional serialization behavior by defining additional fields within a trait block. Traits can be opted into when initializing a packer by passing the name of the trait as an argument:

class MyPacker < Sequel::Packer
  model MyObj
  field :id

  trait :my_trait do
    field :trait_field
  end
end

# packed objects don't have trait_field
MyPacker.pack(dataset)
=> [{id: 1}, {id: 2}, ...]
# packed objects do have trait_field
MyPacker.pack(dataset, :my_trait)
=> [{id: 1, trait_field: 'foo'}, {id: 2, trait_field: 'bar'}, ...]

Traits can also be used when packing associations by passing the name of the traits after the packer class:

class MyOtherPacker < Sequel::Packer
  model MyObj
  field :my_packers, MyPacker, :my_trait
end

self.eager(*associations)

When packing an association, a Packer will automatically ensure that association is eager loaded, but there may be cases when an association will be accessed that the Packer doesn't know about. In these cases you can tell the Packer to eager load that data by calling eager(*associations), passing in arguments the exact same way you would to Sequel::Dataset.eager.

One case where this may be useful is for a "count" field, that just lists the number of associated objects, but doesn't actually return them:

class UserPacker < Sequel::Packer
  model User

  field :id

  eager(:posts)
  field(:num_posts) do |user|
    user.posts.count
  end
end

UserPacker.pack(User.dataset)
=> [
  {id: 123, num_posts: 7},
  {id: 456, num_posts: 3},
  ...
]

Using eager can help prevent N+1 query problems when not using Sequel's TacticalEagerLoading plugin.

Another use of eager, even when using TacticalEagerLoading, is to modify or limit which records get fetched from the database by using an eager proc. For example, to only pack recent posts, published in the past month, we might do:

class UserPacker < Sequel::Packer
  model User

  field :id

  trait :recent_posts do
    eager posts: (proc {|ds| ds.where {created_at > Time.now - 1.month}})
    field :posts, PostIdPacker
  end
end

IMPORTANT NOTE: Eager procs are not guaranteed to be executed when passing in models, rather than a dataset, to pack. Specifically, if the models already have fetched the association, the Packer won't refetch it. Because of this, it's good practice to use set_association_packer and pack_association (see next section) in a field block and duplicate the filtering action.

Also keep in mind that this limits the association that gets used by ALL fields, so if another field actually needs access to all the user's posts, it might not make sense to use eager.

Additionally, it's important to note that if eager is called multiple times, with multiple procs, each proc will get applied to the dataset, likely resulting in overly restrictive filtering.
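
The stacking effect can be pictured with ordinary arrays and procs. This is a toy model, assuming each proc narrows whatever the previous one returned; the data and names are invented, and no Sequel dataset is involved.

```ruby
TOY_POSTS = [
  {id: 1, likes: 50, published: true},
  {id: 2, likes: 3, published: true},
  {id: 3, likes: 80, published: false},
].freeze

# Two independent filters, in the style of eager procs: each takes a
# collection and returns a narrower one.
POPULAR_ONLY = proc { |posts| posts.select { |post| post[:likes] > 10 } }
PUBLISHED_ONLY = proc { |posts| posts.select { |post| post[:published] } }

# Applying every proc in turn means a post must satisfy all of them.
def apply_all(posts, filters)
  filters.reduce(posts) { |remaining, filter| filter.call(remaining) }
end

p apply_all(TOY_POSTS, [POPULAR_ONLY, PUBLISHED_ONLY]).map { |post| post[:id] }
```

Each filter on its own keeps two posts, but together they keep only one, so a field that expected "all popular posts" would silently lose the unpublished one.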

self.set_association_packer(association, subpacker, *traits)

See self.pack_association(association, models) below.

self.pack_association(association, models)

The simplest way to pack an association is to use self.field(association, subpacker, *traits), but sometimes this doesn't do exactly what we want. We may want to pack the association under a different key than the name of the association. Or we may only want to pack some of the associated models (and it may be difficult or impossible to express which subset we want to pack using eager). Or perhaps we have a one_to_many association and instead of packing an array, we want to pack a single associated object under a key. The two methods, set_association_packer and pack_association are designed to handle these cases.

First, we'll note that the following are exactly equivalent:

field :my_assoc, MyAssocPacker, :trait1, :trait2

and

set_association_packer :my_assoc, MyAssocPacker, :trait1, :trait2
field :my_assoc do |model|
  pack_association(:my_assoc, model.my_assoc)
end

set_association_packer tells the Packer class that we will want to pack models from a particular association using the designated Packer with the specified traits. Declaring this ahead of time allows the Packer to ensure that the association is eager loaded, as well as any nested associations used when using the designated Packer with the specified traits.

pack_association can then be used in a field block to use that Packer after the data has been fetched and we are actually packing the data. The key things here are that we don't need to use the name of the association as the name of the field, and that we can choose which models get serialized. If pack_association is passed an array, it will return an array of packed models, but if it is passed a single model, it will return just that packed model.
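
The array-versus-single-model behavior can be sketched in a few lines. The helper pack_one_or_many is hypothetical, and returning nil for a missing model is an assumption of this sketch rather than documented behavior of pack_association.

```ruby
# Hypothetical helper mirroring the described behavior: arrays are packed
# element by element, a single model is packed on its own.
def pack_one_or_many(value, &pack_one)
  return nil if value.nil? # assumption: nothing to pack yields nil
  return value.map(&pack_one) if value.is_a?(Array)

  pack_one.call(value)
end

ToyLikedComment = Struct.new(:id, :num_likes)
TOY_COMMENTS = [ToyLikedComment.new(1, 4), ToyLikedComment.new(2, 9)].freeze
ID_ONLY = proc { |comment| {id: comment.id} }

# An array in gives an array out.
p pack_one_or_many(TOY_COMMENTS, &ID_ONLY)
# A single model in gives a single hash out.
p pack_one_or_many(TOY_COMMENTS.max_by(&:num_likes), &ID_ONLY)
```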

Examples:

Use a different field name than the name of the association:

set_association_packer :ugly_internal_names, InternalPacker
field :nice_external_names do |model|
  pack_association(:ugly_internal_names, model.ugly_internal_names)
end

Pack a single instance of a one_to_many association:

class PostPacker < Sequel::Packer
  model Post

  set_association_packer :comments, CommentPacker
  field :top_comment do |model|
    pack_association(:comments, model.comments.max_by(&:num_likes))
  end
end

self.precompute(&block)

Occasionally packing a model may require a computation that doesn't fit in with the rest of the Packer paradigm. This may be a Sequel query that is particularly difficult to express as an association, or even a call to an external service. If such a computation can be performed in bulk, then the precompute method can be used as an entry point for that operation.

The precompute method will execute a given block and pass it all of the models that will be packed using that packer. This block will be executed a single time, even when called by a deeply nested packer.

The precompute block is instance_execed in the context of the packer instance, so the result of any computation can be saved in a simple instance variable (@precomputed_result) and later referenced inside the blocks that are passed to field methods.

As an example, suppose a video uploading platform performs additional video processing on every uploaded video and exposes the status of that processing as a separate service over the network, rather than directly with the upload metadata in the database. precompute could be used as follows:

class VideoUploadPacker < Sequel::Packer
  model VideoUpload

  precompute do |video_uploads|
    @processing_statuses = ResolutionService
      .get_status_bulk(ids: video_uploads.map(&:id))
  end

  field :id
  field :filename
  field :processing_status do |video_upload|
    @processing_statuses[video_upload.id]
  end
end
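
To show why an instance variable assigned in one block is visible in another, here is a stripped-down sketch of the instance_exec idea. ToyPrecomputePacker and the constants are invented for illustration, and the real library does considerably more than this.

```ruby
# Toy packer: not the library's implementation, just the instance_exec idea.
class ToyPrecomputePacker
  def initialize(precompute_block, field_block)
    @precompute_block = precompute_block
    @field_block = field_block
  end

  def pack(models)
    # Runs once for the whole batch with self set to this packer, so any
    # @variables the block assigns are stored on the packer.
    instance_exec(models, &@precompute_block)
    # Runs once per model with the same self, so those @variables are visible.
    models.map { |model| instance_exec(model, &@field_block) }
  end
end

# Records each batch handed to the fake bulk lookup.
BULK_CALLS = []

TOY_PRECOMPUTE = proc do |ids|
  BULK_CALLS << ids
  @statuses = ids.map { |id| [id, "status-#{id}"] }.to_h
end

TOY_FIELD = proc { |id| {id: id, processing_status: @statuses[id]} }

p ToyPrecomputePacker.new(TOY_PRECOMPUTE, TOY_FIELD).pack([1, 2])
```

The bulk block is called once with the whole batch, and each per-model block reads from the table it left behind.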

Instance method versions

In addition to the class method versions of field, eager, set_association_packer, and precompute, there are also regular instance method versions which take the exact same arguments. When writing a trait block, the block is evaluated in the context of a new Packer instance and actually calls the instance method versions instead.
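
The interplay between class-level and instance-level DSL methods can be sketched with a toy class. Everything below (ToyPacker, ToyComment, ToyCommentPacker) is hypothetical and only approximates the mechanism described above; it supports nothing but simple fields.

```ruby
# A toy stand-in for a packer; none of these names come from sequel-packer.
class ToyPacker
  # Class-level storage for default fields and named trait blocks.
  def self.fields
    @fields ||= []
  end

  def self.traits
    @traits ||= {}
  end

  # Class method version of field: used directly in the class body.
  def self.field(name)
    fields << name
  end

  def self.trait(name, &block)
    traits[name] = block
  end

  def initialize(*trait_names)
    # Start from a copy of the class-level fields.
    @fields = self.class.fields.dup
    # Run each requested trait block with self set to this instance, so a
    # bare field call inside the block reaches the instance method below.
    trait_names.each { |name| instance_exec(&self.class.traits.fetch(name)) }
  end

  # Instance method version of field, taking the same argument.
  def field(name)
    @fields << name
  end

  def pack(model)
    @fields.each_with_object({}) { |name, hash| hash[name] = model.send(name) }
  end
end

ToyComment = Struct.new(:id, :content, :author_name)

class ToyCommentPacker < ToyPacker
  field :id
  field :content

  trait :author do
    field :author_name
  end
end

toy_comment = ToyComment.new(7, 'lol', 'Paul')
p ToyCommentPacker.new.pack(toy_comment)
p ToyCommentPacker.new(:author).pack(toy_comment)
```

Because the trait block only touches the instance's copy of the field list, opting into a trait for one packer instance leaves the class-level definition unchanged.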

Context

In addition to the data to be packed, and a set of traits, the pack method also accepts arbitrary keyword arguments. This is referred to as context and is handled opaquely by the Packer. The data passed in here is saved as the @context instance variable, which is then accessible from within the blocks passed to field, trait, and precompute, for whatever purpose. It is also automatically passed to any nested subpackers.

The most common usage for context would be to pass in the current user making a request. It could then be used to pack permission levels about records, for example.

class PostPacker < Sequel::Packer
  model Post

  eager :permissions
  field :access_level do |post|
    user_permission = post.permissions.find do |perm|
      perm.user_id == @context[:user].id
    end

    user_permission.access_level
  end
end

You might notice something inefficient about the above code. Even though we only want to look at the user's permission record, we fetch ALL of the permission records for each Post. Ideally we would filter the permissions association dataset when we call eager, but we don't have access to @context at that point. This leads to the final DSL method available when writing a Packer:

self.with_context(&block)

You can pass a block to with_context that will be executed as soon as a Packer instance is constructed. The block can access @context and can also call the standard Packer DSL methods, field, eager, etc.
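
A toy version of that mechanism, assuming blocks are stored on the class and run with instance_exec once @context has been set. The class names and the keep_if method are hypothetical; keep_if merely stands in for a DSL method such as eager.

```ruby
# Toy packer showing construction-time blocks; all names are illustrative.
class ToyContextPacker
  def self.context_blocks
    @context_blocks ||= []
  end

  def self.with_context(&block)
    context_blocks << block
  end

  def initialize(**context)
    @context = context
    @filters = []
    # Each block runs as soon as the instance exists, with self set to the
    # instance, so it can read @context and call instance methods.
    self.class.context_blocks.each { |block| instance_exec(&block) }
  end

  # Hypothetical stand-in for a DSL method such as eager.
  def keep_if(&predicate)
    @filters << predicate
  end

  def pack(rows)
    rows.select { |row| @filters.all? { |predicate| predicate.call(row) } }
  end
end

class ToyPermissionPacker < ToyContextPacker
  with_context do
    user_id = @context[:user_id]
    keep_if { |permission| permission[:user_id] == user_id }
  end
end

TOY_PERMISSIONS = [
  {user_id: 1, access_level: 'admin'},
  {user_id: 2, access_level: 'read'},
].freeze

p ToyPermissionPacker.new(user_id: 2).pack(TOY_PERMISSIONS)
```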

The above example could then be made more efficient as follows:

class PostPacker < Sequel::Packer
  model Post

- eager :permissions
+ with_context do
+   eager permissions: (proc {|ds| ds.where(user_id: @context[:user].id)})
+ end
end

A very tricky usage of with_context (and not recommended...) would be to control the traits used on subpackers:

class UserPacker < Sequel::Packer
  model User

  with_context do
    field :comments, CommentPacker, *@context[:comment_traits]
  end
end

UserPacker.pack(User.dataset, comment_traits: [])
=> [{comments: [{id: 7}, ...]}]
UserPacker.pack(User.dataset, comment_traits: [:author])
=> [{comments: [{id: 7, author: {id: 1, ...}}, ...]}]
UserPacker.pack(User.dataset, comment_traits: [:num_likes])
=> [{comments: [{id: 7, likes: 53}, ...]}]

Potential Future Functionality

The 1.0.0 version of the Packer library is flexible enough to support many use cases. That said, of course there are ways to improve it! There are three main improvements I can imagine adding:

Automatically Generated Type Declarations

It would be fairly easy to generate type definitions by adding arguments to field. Packers could produce TypeScript interface declarations, and adding a simple build step to a CI pipeline could enforce type safety across the frontend and backend. Or they could produce OpenAPI specifications, which could then be used to automatically generate clients using something like Swagger.

Lifecycle Hooks

It should be fairly easy to extend the Packer library using standard Ruby features like subclassing, mixins via include, or even monkey-patching. It may be beneficial to have explicit hooks for common operations however, like before_fetch, or around_pack. It's more likely that these hooks are needed for logging and tracing capabilities than for actual functionality, so I'd like to see some real-world usage before committing to a specific style of integration.

Less Data Fetching

Sequel by default fetches every column in a table, but a Packer knows (roughly) what data is going to be used, so it could select only the columns needed for actual serialization and limit how much data is actually fetched from the database. I haven't done any benchmarking on this, so I'm not sure how much of a benefit could be gained by this, but it would be interesting!

This could work roughly as follows:

  • Start by fetching all columns that appear in simple field(:column_name) declarations
  • Add any columns needed to fetch nested associations, or to re-associate fetched records with their "parent" models, using the left_key and right_key fields of the AssociationReflections
  • Add a column(*columns) DSL method to explicitly fetch additional columns

Other Enhancements

Here are some other potential enhancements, though these are less fleshed out.

  • Support not including a key in a hash if the associated value is nil, to reduce size of outputted data.
  • Support different casing of the outputted hashes, i.e., snake_case vs. camelCase.
  • Explicitly support different output formats, rather than just plain Ruby hashes, such as Protocol Buffers or Cap'n Proto.
  • When using nested precompute blocks, the Packer has to flatten the associations of a model, which may be expensive, but has not been benchmarked. These flattened arrays already exist internally in Sequel when the eager loading occurs, but those aren't exposed. The code in Sequel could be re-implemented as part of the library to avoid re-constructing those arrays.
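
Until something like the casing option exists, the same effect can be approximated after packing, since the output is plain hashes and arrays. The sketch below is one possible post-processing step, not a feature of the library; camelize_key and camelize_keys are made-up helper names.

```ruby
# Converts a snake_case key such as :truncated_content to :truncatedContent.
def camelize_key(key)
  key.to_s.gsub(/_([a-z\d])/) { Regexp.last_match(1).upcase }.to_sym
end

# Recursively rewrites the keys of nested hashes and arrays, leaving all
# other values untouched.
def camelize_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, nested), result|
      result[camelize_key(key)] = camelize_keys(nested)
    end
  when Array
    value.map { |element| camelize_keys(element) }
  else
    value
  end
end

p camelize_keys({id: 1, truncated_content: 'Sequel...', posts: [{author_id: 2}]})
```

Doing this inside the packer would avoid the extra pass over the data, which is the appeal of supporting it directly.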

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/PaulJuliusMartinez/sequel-packer.

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake test to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

Releases

To release a new version, update the version number in lib/sequel/packer/version.rb, update the CHANGELOG.md with new changes, then run rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Attribution

Karthik Viswanathan designed the original API of the Packer library while at Affinity. This library is a ground up rewrite which defines a very similar API, but shares no code with the original implementation.

License

The gem is available as open source under the terms of the MIT License.