Module: Sequel::Plugins::ColumnEncryption
- Defined in:
- lib/sequel/plugins/column_encryption.rb
Overview
The column_encryption plugin adds support for encrypting the content of individual columns in a table.
Column values are encrypted with AES-256-GCM using a per-value cipher key derived from a key provided in the configuration using HMAC-SHA256.
Usage
If you would like to support encryption of columns in more than one model, you should probably load the plugin into the parent class of your models and specify the keys:
Sequel::Model.plugin :column_encryption do |enc|
enc.key 0, ENV["SEQUEL_COLUMN_ENCRYPTION_KEY"]
end
This specifies a single master encryption key. Unless you are actively rotating keys, it is best to use a single master key. Rotation of encryption keys will be discussed in a later section.
In the above call, 0
is the id of the key, and the ENV["SEQUEL_COLUMN_ENCRYPTION_KEY"]
is the content of the key, which must be a string with exactly 32 bytes. As indicated, this key should not be hardcoded or otherwise committed to the source control repository.
For models that need encrypted columns, you load the plugin again, but specify the columns to encrypt:
ConfidentialModel.plugin :column_encryption do |enc|
enc.column :encrypted_column_name
enc.column :searchable_column_name, searchable: true
enc.column :ci_searchable_column_name, searchable: :case_insensitive
end
With this, all three specified columns (encrypted_column_name
, searchable_column_name
, and ci_searchable_column_name
) will be marked as encrypted columns. When you run the following code:
ConfidentialModel.create(
encrypted_column_name: 'These',
searchable_column_name: 'will be',
ci_searchable_column_name: 'Encrypted'
)
It will save encrypted versions to the database. encrypted_column_name
will not be searchable, searchable_column_name
will be searchable with an exact match, and ci_searchable_column_name
will be searchable with a case insensitive match. See section below for details on searching.
It is possible to have model-specific keys by specifying both the key
and column
methods in the model:
ConfidentialModel.plugin :column_encryption do |enc|
enc.key 0, ENV["SEQUEL_MODEL_SPECIFIC_ENCRYPTION_KEY"]
enc.column :encrypted_column_name
enc.column :searchable_column_name, searchable: true
enc.column :ci_searchable_column_name, searchable: :case_insensitive
end
When the key
method is called inside the plugin block, previous keys are ignored, and only the new keys specified will be used. This approach would allow the ConfidentialModel
to use the model specific encryption keys, and other models to use the default keys specified in the parent class.
The key
and column
methods inside the plugin block support additional options. The key
method supports the following options:
- :auth_data
-
The authentication data to use for the AES-256-GCM cipher. Defaults to the empty string.
- :padding
-
The number of padding bytes to use. For security, data is padded so that a database administrator cannot determine the exact size of the unencrypted data. By default, this value is 8, which means that unencrypted data will be padded to a multiple of 8 bytes. Up to twice as much padding as specified will be used, as the number of padding bytes is partially randomized.
The column
method supports the following options:
- :searchable
-
Whether the column is searchable. This should not be used unless searchability is needed, as it can allow the database administrator to determine whether two distinct rows have the same unencrypted data (but not what that data is). This can be set to
true
to allow searching with an exact match, or:case_insensitive
for a case insensitive match. - :search_both
-
This should only be used if you have previously switched the
:searchable
option fromtrue
to:case_insensitive
or vice-versa, and would like the search to return values that have not yet been reencrypted. Note that switching fromtrue
to:case_insensitive
isn’t a problem, but switching from:case_insensitive
totrue
and using this option can cause the search to return values that are not an exact match. You should manually filter those objects after decrypting if you want to ensure an exact match. - :format
-
The format of the column, if you want to perform serialization before encryption and deserialization after decryption. Can be either a symbol registered with the serialization plugin or an array of two callables, the first for serialization and the second for deserialization.
The column
method also supports a block for column-specific keys:
ConfidentialModel.plugin :column_encryption do |enc|
enc.column :encrypted_column_name do |cenc|
cenc.key 0, ENV["SEQUEL_COLUMN_SPECIFIC_ENCRYPTION_KEY"]
end
enc.column :searchable_column_name, searchable: true
enc.column :ci_searchable_column_name, searchable: :case_insensitive
end
In this case, the ENV["SEQUEL_COLUMN_SPECIFIC_ENCRYPTION_KEY"]
key will only be used for the :encrypted_column_name
column, and not the other columns.
Note that there isn’t a security reason to prefer either model-specific or column-specific keys, as the actual cipher key used is unique per column value.
Note that changing the key_id, key string, or auth_data for an existing key will break decryption of values encrypted with that key. If you would like to change any aspect of the key, add a new key, rotate to the new encryption key, and then remove the previous key, as described in the section below on key rotation.
Searching Encrypted Values
To search searchable encrypted columns, use with_encrypted_value
. This example code will return the model instance created in the code example in the previous section:
ConfidentialModel.
with_encrypted_value(:searchable_column_name, "will be")
with_encrypted_value(:ci_searchable_column_name, "encrypted").
first
Encryption Key Rotation
To rotate encryption keys, add a new key above the existing key, with a new key ID:
Sequel::Model.plugin :column_encryption do |enc|
enc.key 1, ENV["SEQUEL_COLUMN_ENCRYPTION_KEY"]
enc.key 0, ENV["SEQUEL_OLD_COLUMN_ENCRYPTION_KEY"]
end
Newly encrypted data will then use the new key. Records encrypted with the older key will still be decrypted correctly.
To force reencryption for existing records that are using the older key, you can use the needing_reencryption
dataset method and the reencrypt
instance method. For a small number of records, you can probably do:
ConfidentialModel.needing_reencryption.all(&:reencrypt)
With more than a small number of records, you’ll want to do this in batches. It’s possible you could use an approach such as:
ds = ConfidentialModel.needing_reencryption.limit(100)
true until ds.all(&:reencrypt).empty?
After all values have been reencrypted for all models, and no models use the older encryption key, you can remove it from the configuration:
Sequel::Model.plugin :column_encryption do |enc|
enc.key 1, ENV["SEQUEL_COLUMN_ENCRYPTION_KEY"]
end
Once an encryption key has been removed, after no data uses it, it is safe to reuse the same key id for a new key. This approach allows for up to 256 concurrent keys in the same configuration.
Encrypting Additional Formats
By default, the column_encryption plugin assumes that the decrypted data should be returned as a string, and a string will be passed to encrypt. However, using the :format
option, you can specify an alternate format. For example, if you want to encrypt a JSON representation of the object, so that you can deal with an array/hash and automatically have it serialized with JSON and then encrypted when saving, and then deserialized with JSON after decryption when it is retrieved:
require 'json'
ConfidentialModel.plugin :column_encryption do |enc|
enc.key 0, ENV["SEQUEL_MODEL_SPECIFIC_ENCRYPTION_KEY"]
enc.column :encrypted_column_name
enc.column :searchable_column_name, searchable: true
enc.column :ci_searchable_column_name, searchable: :case_insensitive
enc.column :encrypted_json_column_name, format: :json
end
The values of the :format
are the same values you can pass as the first argument to serialize_attributes
(in the serialization plugin). You can pass an array with the serializer and deserializer for custom support.
You can use both :searchable
and :format
together for searchable encrypted serialized columns. However, note that this allows only exact searches of the serialized version of the data. So for JSON, a search for {'a'=>1, 'b'=>2}
would not match {'b'=>2, 'a'=>1}
even though the objects are considered equal. If this is an issue, make sure you use a serialization format where all equal objects are serialized to the same string.
Enforcing Uniqueness
You cannot enforce uniqueness of unencrypted data at the database level if you also want to support key rotation. However, absent key rotation, a unique index on the first 48 characters of the encrypted column can enforce uniqueness, as long as the column is searchable. If the encrypted column is case-insensitive searchable, the uniqueness is case insensitive as well.
Column Value Cryptography/Format
Column values used by this plugin use the following format (key
is specified in the plugin configuration and must be exactly 32 bytes):
- column_value
-
urlsafe_base64(flags + NUL + key_id + NUL + search_data + key_data + cipher_iv + cipher_auth_tag + encrypted_data)
- flags
-
1 byte, the type of record (0: not searchable, 1: searchable, 2: lowercase searchable)
- NUL
-
1 byte, ASCII NUL
- key_id
-
1 byte, the key id, supporting 256 concurrently active keys (0 - 255)
- search_data
-
0 bytes if flags is 0, 32 bytes if flags is 1 or 2. Format is HMAC-SHA256(key, unencrypted_data). Ignored on decryption, only used for searching.
- key_data
-
32 bytes random data used to construct cipher key
- cipher_iv
-
12 bytes, AES-256-GCM cipher random initialization vector
- cipher_auth_tag
-
16 bytes, AES-256-GCM cipher authentication tag
- encrypted_data
-
AES-256-GCM(HMAC-SHA256(key, key_data), padding_size + padding + unencrypted_data)
- padding_size
-
1 byte, with the amount of padding (0-255 bytes of padding allowed)
- padding
-
number of bytes specified by padding size, ignored on decryption
- unencrypted_data
-
actual column value
The reason for flags + NUL + key_id + NUL
(4 bytes) as the header is to allow for an easy way to search for values needing reencryption using a database index. It takes the first three bytes and converts them to base64, and looks for values less than that value or greater than that value with ‘B’ appended. The NUL byte in the fourth byte of the header ensures that after base64 encoding, the fifth byte in the column will be ‘A’.
The reason for search_data
(32 bytes) directly after is that for searchable values, after base64 encoding of the header and search data, it is 48 bytes and can be used directly as a prefix search on the column, which can be supported by the same database index. This is more efficient than a full column value search for large values, and allows for case-insensitive searching without a separate column, by having the search_data be based on the lowercase value while the unencrypted data is original case.
The reason for the padding is so that a database administrator cannot be sure exactly how many bytes are in the column. It is stored encrypted because otherwise the database administrator could calculate it by decoding the base64 data.
Unsupported Features
The following features are delibrately not supported:
Compression
Allowing compression with encryption is inviting security issues later. While padding can reduce the risk of compression with encryption, it does not eliminate it entirely. Users that must have compression with encryption can use the :format
option with a serializer that compresses and a deserializer that decompresses.
Mixing Encrypted/Unencrypted Data
Mixing encrypted and unencrypted data increases the complexity and security risk, since there is a chance unencrypted data could look like encrypted data in the pathologic case. If you have existing unencrypted data that would like to encrypt, create a new column for the encrypted data, and then migrate the data from the unencrypted column to the encrypted column. After all unencrypted values have been migrated, drop the unencrypted column.
Arbitrary Encryption Schemes
Supporting arbitrary encryption schemes increases the complexity risk. If in the future AES-256-GCM is not considered a secure enough cipher, it is possible to extend the current format using the reserved values in the first two bytes of the header.
Caveats
As column_encryption is a model plugin, it only works with using model instance methods. If you directly modify the database using a dataset or an external program that modifies the contents of the encrypted columns, you will probably corrupt the data. To make data corruption less likely, it is best to have a CHECK constraints on the encrypted column with a basic format and length check:
DB.alter_table(:table_name) do
c = Sequel[:encrypted_column_name]
add_constraint(:encrypted_column_name_format,
c.like('AA__A%') | c.like('Ag__A%') | c.like('AQ__A%'))
add_constraint(:encrypted_column_name_length, Sequel.char_length(c) >= 88)
end
If possible, it’s also best to check that the column is valid urlsafe base64 data of sufficient length. This can be done on PostgreSQL using a combination of octet_length, decode, and regexp_replace:
DB.alter_table(:ce_test) do
c = Sequel[:encrypted_column_name]
add_constraint(:enc_base64) do
octet_length(decode(regexp_replace(regexp_replace(c, '_', '/', 'g'), '-', '+', 'g'), 'base64')) >= 65
end
end
Such constraints will probably be sufficient to protect against most unintentional corruption of encrypted columns.
If the database supports transparent data encryption and you trust the database administrator, using the database support is probably a better approach.
The column_encryption plugin is only supported on Ruby 2.3+ and when the Ruby openssl standard library supports the AES-256-GCM cipher.
Defined Under Namespace
Modules: ClassMethods, DatasetMethods, InstanceMethods Classes: ColumnDSL, ColumnEncryptionMetadata, Cryptor, DSL
Class Method Summary collapse
Class Method Details
.apply(model, opts = OPTS, &_) ⇒ Object
589 590 591 |
# File 'lib/sequel/plugins/column_encryption.rb', line 589 def self.apply(model, opts=OPTS, &_) model.plugin :serialization end |
.configure(model) {|dsl| ... } ⇒ Object
593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 |
# File 'lib/sequel/plugins/column_encryption.rb', line 593 def self.configure(model) dsl = DSL.new yield dsl model.instance_exec do unless dsl.keys.empty? @column_encryption_keys = dsl.keys.freeze @column_encryption_cryptor = nil end @column_encryption_metadata = Hash[@column_encryption_metadata || {}] dsl.columns.each do |column, opts, block| _encrypt_column(column, opts, &block) end @column_encryption_metadata.freeze end end |