GDPR Admin
Rails engine for handling GDPR requests. GDPR Admin offers a simple interface for defining strategies that automate data access and data removal as required by data privacy regulations like GDPR.
GDPR Admin uses simple Ruby classes, so you're free to code your Data Policies as you see fit. A Swiss Army knife of helper methods is available to make the processes even simpler. The focus of the gem is to easily implement:
- Right of access: export a subject's data on request;
- Right to erasure: remove a subject's personal data on request;
- Storage limitation: hold data only for as long as required (data retention policies and offboarding tenants).
Installation
Add this line to your application's Gemfile:
gem "gdpr_admin"
And then execute:
$ bundle
Or install it yourself as:
$ gem install gdpr_admin
Then install the migrations:
$ rails gdpr_admin:install:migrations
Usage
Create your data policy files within app/gdpr (configurable) and inherit from GdprAdmin::ApplicationDataPolicy.
Implement the methods #scope, #export and #erase for the new data policy. Within the data policy, you can access the GdprAdmin::Request object in any method by calling request, so you can have different scopes and different removal behaviors depending on the request.
Optionally, you may declare a #subject_scope method with logic for scoping subject data. If this method is not present, the policy will fall back to the #scope method.
class UserDataPolicy < GdprAdmin::ApplicationDataPolicy
  def scope
    User.with_deleted.where(updated_at: ...request.data_older_than)
  end
  # Optional
  def subject_scope
    scope.where(email: request.subject)
  end
  def erase(user)
    user.update_columns(
      first_name: 'Anonymous',
      last_name: "User #{user.id}",
      email: "anonymous.user#{user.id}@company.co",
      anonymized_at: Time.zone.now,
    )
  end
end
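The fallback from #subject_scope to #scope can be pictured with plain Ruby objects. This is only a sketch under stated assumptions: the classes and the resolve_scope helper below are hypothetical and are not the gem's implementation.

```ruby
# Hypothetical stand-in for a data policy; not the gem's base class.
class SketchPolicy
  def scope
    %w[alice@example.com bob@example.com]
  end

  # For subject requests, prefer #subject_scope when the policy defines it;
  # otherwise fall back to #scope.
  def resolve_scope(subject_request)
    return scope unless subject_request

    respond_to?(:subject_scope) ? subject_scope : scope
  end
end

# Hypothetical policy that narrows the scope to a single subject.
class SketchSubjectPolicy < SketchPolicy
  def initialize(subject)
    @subject = subject
  end

  def subject_scope
    scope.select { |email| email == @subject }
  end
end
```

A policy without #subject_scope returns its full scope for a subject request, while SketchSubjectPolicy returns only the matching subject.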
Once you have all your data policies defined, create a GdprAdmin::Request to process a new request:
GdprAdmin::Request.create!(
  tenant: current_tenant,
  requester: current_admin_user,
  request_type: 'erase_data',
  data_older_than: 1.month.ago, # Optional: by default, it will be today's date
)
Creating a request will automatically enqueue the request to be processed in 4 hours - this gives time to cancel an accidental request. You can configure this grace period as desired.
Scope Helpers
Helpers to be used within the #scope method.
scope_by_date
scope_by_date(scope, field = :updated_at)
Automatically scopes the data using the updated_at column to match the GDPR Request. You can use a different column by providing a second argument.
class ContactDataPolicy < GdprAdmin::ApplicationDataPolicy
  def scope
    scope_by_date(Contact)
  end
end
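To make the date filter concrete, here is a minimal in-memory sketch of the idea. It assumes the helper keeps records whose date field is strictly before the request's data_older_than, mirroring the exclusive range used in the UserDataPolicy example. SketchRecord and scope_by_date_sketch are hypothetical names, and a real policy would operate on an ActiveRecord relation instead of an array.

```ruby
# Hypothetical in-memory record; not an ActiveRecord model.
SketchRecord = Struct.new(:id, :updated_at, :created_at, keyword_init: true)

# Keep only records whose date field is strictly before the cutoff.
def scope_by_date_sketch(records, data_older_than, field = :updated_at)
  records.select { |record| record.public_send(field) < data_older_than }
end

now = Time.utc(2024, 6, 1)
day = 24 * 60 * 60
records = [
  SketchRecord.new(id: 1, updated_at: now - 40 * day, created_at: now - 90 * day),
  SketchRecord.new(id: 2, updated_at: now - 5 * day, created_at: now - 60 * day),
]

scope_by_date_sketch(records, now - 30 * day).map(&:id)              # => [1]
scope_by_date_sketch(records, now - 30 * day, :created_at).map(&:id) # => [1, 2]
```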
Anonymization Helpers
A set of helper methods is available to make anonymization even simpler. These are not mandatory, but they can help you keep your code cleaner and, better still, write less of it.
erase_fields
erase_fields(record, fields, base_changes = {})
The method erase_fields is available in the Data Policy class. It expects an array of field anonymization options. It will automatically process those fields and update the record in the database using update_columns (so validations are skipped). The last optional argument (base_changes) is a hash of attributes that should be updated when the record is updated. See the example below:
class ContactDataPolicy < GdprAdmin::ApplicationDataPolicy
  def fields
    [
      { field: :first_name, method: :anonymize_first_name },
      { field: :last_name, method: :anonymize_last_name },
      { field: :gender, method: :skip },
      {
        field: :email,
        method: lambda { |contact|
          domain = contact.email[/@.*/]
          "anonymous.contact#{contact.id}#{domain}"
        },
      },
      { field: :street_address, method: :nilify },
      { field: :city, method: :anonymize_city, seed: :id },
    ]
  end
  def erase(contact)
    erase_fields(contact, fields, { anonymized_at: Time.zone.now })
  end
end
The anonymizers used above (e.g. anonymize_first_name) will, by default, use the value of the field being updated as the seed. That means that, when anonymizing with the same anonymizer function, equal values will always yield the same anonymized value. (Note: different values may also yield the same value.)
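The seeding behavior can be illustrated without faker. The sketch below is not the gem's code: it assumes a small hypothetical name pool (SKETCH_FIRST_NAMES) and derives a stable seed with Zlib.crc32, simply to show why equal inputs map to equal outputs and why collisions are possible.

```ruby
require 'zlib'

# Hypothetical replacement pool; the gem's built-in anonymizers rely on faker.
SKETCH_FIRST_NAMES = %w[Alex Sam Robin Jamie Morgan].freeze

# Derive a stable integer seed from the original value, then pick a replacement.
def seeded_first_name(value)
  seed = Zlib.crc32(value.to_s)
  SKETCH_FIRST_NAMES[Random.new(seed).rand(SKETCH_FIRST_NAMES.size)]
end
```

Calling seeded_first_name('Alice') twice returns the same name. With only five names in the pool, any six distinct inputs must produce at least one collision.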
To use the built-in anonymizer functions, you need to install the faker gem.
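As a rough mental model of how a field list could be turned into a set of attribute changes, consider the sketch below. It is an assumption-laden illustration, not the gem's implementation: SketchContact and build_changes_sketch are hypothetical, and named anonymizers are replaced by a placeholder string instead of calling faker.

```ruby
# Hypothetical in-memory record standing in for an ActiveRecord model.
SketchContact = Struct.new(:id, :first_name, :gender, :email, :street_address,
                           :anonymized_at, keyword_init: true)

# Build the attribute changes for a record from a list of field options.
def build_changes_sketch(record, fields, base_changes = {})
  fields.each_with_object(base_changes.dup) do |options, changes|
    field = options[:field]
    anonymizer = options[:method]
    case anonymizer
    when :skip then next                                    # leave the field untouched
    when :nilify then changes[field] = nil                  # blank the field out
    when Proc then changes[field] = anonymizer.call(record) # custom lambda
    else changes[field] = "anonymized_#{field}"             # placeholder for named anonymizers
    end
  end
end
```

The resulting hash is what a single update_columns call would receive, which is why base_changes such as anonymized_at are written together with the anonymized fields.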
Data Policy Hooks
For advanced use cases, you may install hooks that are run at different stages of the data policy process.
before_process
Will be run before the policy is executed. Calling skip_data_policy! will raise GdprAdmin::SkipDataPolicyError and stop the execution of the data policy.
class ContactDataPolicy < GdprAdmin::ApplicationDataPolicy
  before_process :skip_internal_contacts!
  private
  def skip_internal_contacts!
    skip_data_policy! if contact.email =~ /.*@company\.com/
  end
end
before_process_record
Called before processing a record (either erasing or exporting). Calling skip_record! will raise GdprAdmin::SkipRecordError and skip processing that particular record.
class UserDataPolicy < GdprAdmin::ApplicationDataPolicy
  before_process_record :skip_super_admins!
  private
  def skip_super_admins!(user)
    skip_record! if user.role == 'super_admin'
  end
end
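The skip mechanism described above is exception-based, and the control flow can be sketched in plain Ruby. This is a simplified illustration, not the gem's code: SketchSkipRecordError and process_records_sketch are hypothetical stand-ins.

```ruby
# Hypothetical error class mirroring the role of GdprAdmin::SkipRecordError.
class SketchSkipRecordError < StandardError; end

# Run the hook for each record; a raised skip error moves on to the next record.
def process_records_sketch(records, hook)
  records.each_with_object([]) do |record, processed|
    begin
      hook.call(record)
      processed << record
    rescue SketchSkipRecordError
      next
    end
  end
end

skip_super_admins = lambda do |user|
  raise SketchSkipRecordError if user[:role] == 'super_admin'
end

users = [{ id: 1, role: 'member' }, { id: 2, role: 'super_admin' }, { id: 3, role: 'admin' }]
process_records_sketch(users, skip_super_admins).map { |user| user[:id] } # => [1, 3]
```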
GDPR Request
A GDPR Request (GdprAdmin::Request) represents a request to remove a subject's data, tenant's data, or export subject data.
Model GdprAdmin::Request
GdprAdmin::Request.create!(
  tenant: current_tenant,
  requester: current_admin_user,
  request_type: 'erase_data',
  data_older_than: 30.days.ago,
)
#tenant
Defines which tenant this request is attributed to. All requests must be assigned to a tenant.
#request_type
Represents what type of request is being performed. Must be one of:
- erase_data: erase a tenant's data;
- erase_subject: erase a single subject's data in a tenant;
- export_subject: export data about a subject within a tenant.
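One way to think about the three request types is as a pairing of a scope with an action. The mapping below is an assumption inferred from the policy methods described earlier (#scope, #subject_scope, #erase, #export); it is not taken from the gem's source, and plan_for_request_type is a hypothetical helper.

```ruby
SKETCH_REQUEST_TYPES = %w[erase_data erase_subject export_subject].freeze

# Return [scope_name, action] for a request type; raise on anything else.
def plan_for_request_type(request_type)
  case request_type
  when 'erase_data' then %i[scope erase]
  when 'erase_subject' then %i[subject_scope erase]
  when 'export_subject' then %i[subject_scope export]
  else raise ArgumentError, "unknown request_type: #{request_type.inspect}"
  end
end
```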
#data_older_than
The request only affects data created before the date defined here (e.g. erase anything older than 30 days).
#process!
This method executes the request. This is automatically called by the GdprAdmin::RequestProcessorJob.
Data Retention Policies
Some tenants may have different data retention policies (e.g. some hold personally identifiable information for 3 months, others for 1 month).
Setup Job
Run the job GdprAdmin::DataRetentionPoliciesRunnerJob periodically to execute your data retention policies (running it at least once a day is recommended).
Sidekiq
You can define a repeating job with sidekiq-cron:
Sidekiq::Cron::Job.create(
  class: 'GdprAdmin::DataRetentionPoliciesRunnerJob',
  cron: '0 2 * * *',
  name: 'Run Data Retention Policies',
  queue: 'cron',
  active_job: true,
)
DelayedJobs
GdprAdmin::DataRetentionPoliciesRunnerJob.set(cron: '0 2 * * *', queue: :cron).perform_later
Model GdprAdmin::DataRetentionPolicy
This model allows you to create a custom data policy for each tenant and automatically run the policy periodically.
GdprAdmin::DataRetentionPolicy.create!(
  tenant: acme_inc,
  period_in_days: 30,
)
If you want custom logic for running the data retention policy (e.g. do not run if the organization is deleted), you can add it to the method #should_process?:
module GdprAdmin
  class DataRetentionPolicy < GdprAdmin::ApplicationRecord
    include GdprAdmin::DataRetentionPolicyConcern
    def should_process?
      tenant.deleted_at.nil?
    end
  end
end
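The relationship between period_in_days, #should_process? and the cutoff date of the resulting request can be sketched with plain structs. How the runner job actually derives data_older_than is an assumption here; SketchTenant, SketchRetentionPolicy and the data_older_than helper are hypothetical.

```ruby
SKETCH_SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical stand-ins for a tenant and its retention policy.
SketchTenant = Struct.new(:name, :deleted_at, keyword_init: true)

SketchRetentionPolicy = Struct.new(:tenant, :period_in_days, keyword_init: true) do
  # Skip tenants that have been deleted.
  def should_process?
    tenant.deleted_at.nil?
  end

  # Data older than this moment falls outside the retention period.
  def data_older_than(now)
    now - period_in_days * SKETCH_SECONDS_PER_DAY
  end
end
```

With a 30-day period evaluated on 2024-06-30 (UTC), the cutoff is 2024-05-31, and a policy whose tenant has a deleted_at timestamp is skipped entirely.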
PaperTrail
GDPR Admin provides a set of tools to keep your PaperTrail versions GDPR-compliant.
PaperTrail Data Privacy
By default, PaperTrail versions will not be anonymized. You may extend the default GdprAdmin::PaperTrail::VersionDataPolicy with your own scope. If you track custom fields with your versions (e.g. ip), then you can also define an anonymizer for those here:
# app/gdpr/paper_trail/version_data_policy.rb
module PaperTrail
  class VersionDataPolicy < GdprAdmin::PaperTrail::VersionDataPolicy
    def fields
      [
        { field: 'ip', method: :anonymize_ip },
      ]
    end
    def scope
      return PaperTrail::Version.where(updated_at: ...request.data_older_than) if request.erase_data?
      PaperTrail::Version.none
    end
  end
end
PaperTrail::VersionDataPolicy#erase
NOTE: this method only supports the JSON format for object and, optionally, object_changes. If you need a different format, you will need to re-implement this method as desired.
erase(version, item_fields = nil)
The erase method will, by default, anonymize the data within object and object_changes (and whichever fields are defined in the #fields method). To choose which fields to anonymize within object and object_changes, and which anonymization methods to use, it finds the item's data policy and loads the fields from its fields method. If item_fields is provided, it is used instead.
For example, if you have an item_type set to User, it will try to find the UserDataPolicy. If you want to use a different class for an item_type, you must define a data_policy_class in the model.
class User < ApplicationRecord
  def data_policy_class
    PersonDataPolicy
  end
end
If you'd like to just namespace all policies, then you can define data_policy_prefix in the ApplicationRecord:
class ApplicationRecord < ActiveRecord::Base
  def data_policy_prefix
    'Gdpr::'
  end
end
# Now, user should be defined in `Gdpr::UserDataPolicy`
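The lookup rules above (explicit data_policy_class first, otherwise prefix plus model name plus DataPolicy) can be sketched as a small constant lookup. This is an illustration under assumptions, not the gem's resolver: the ResolutionSketch module, its nested classes and the policy_for helper are all hypothetical.

```ruby
module ResolutionSketch
  # Hypothetical policy classes used only for this sketch.
  class UserDataPolicy; end
  class PersonDataPolicy; end

  module Gdpr
    class AccountDataPolicy; end
  end

  # Resolve a policy class: explicit override first, then prefix + model name.
  def self.policy_for(model_name, prefix: '', override: nil)
    return override if override

    const_get("#{prefix}#{model_name}DataPolicy")
  end
end

ResolutionSketch.policy_for('User')                     # => ResolutionSketch::UserDataPolicy
ResolutionSketch.policy_for('Account', prefix: 'Gdpr::') # => ResolutionSketch::Gdpr::AccountDataPolicy
```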
PaperTrail Helpers
When using the method erase_fields, no PaperTrail versions will be created in the database. GDPR Admin offers other helper methods to deal with PaperTrail. (If you are not using paper_trail, this section may not be relevant.)
without_paper_trail
Given a block, this method will execute it in a context where PaperTrail is disabled, so no versions are created:
def erase(contact)
  without_paper_trail do
    contact.update!(first_name: 'John', last_name: 'Doe')
  end
end
As mentioned above, this is not required when using erase_fields as it is the default behavior.
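The general pattern behind a helper like without_paper_trail is a block that switches a flag off and restores it afterwards, even if the block raises. The sketch below shows that pattern only; SketchVersioning is a hypothetical module and PaperTrail itself is not involved.

```ruby
# Hypothetical versioning switch; PaperTrail itself is not loaded here.
module SketchVersioning
  @enabled = true
  @versions = []

  class << self
    attr_reader :versions

    def enabled?
      @enabled
    end

    # Record a version only while versioning is enabled.
    def track(change)
      @versions << change if @enabled
    end

    # Disable versioning for the block and always restore the previous state.
    def without_versioning
      previous = @enabled
      @enabled = false
      yield
    ensure
      @enabled = previous
    end
  end
end
```

Changes tracked inside without_versioning are dropped, and the ensure clause guarantees that versioning is re-enabled when the block fails.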
Configuration
Configure GDPR Admin in an initializer file, config/initializers/gdpr_admin.rb. The configuration should be done within the block of GdprAdmin.configure(&block):
# config/initializers/gdpr_admin.rb
GdprAdmin.configure do |config|
  # GDPR Admin configuration here...
end
Multi-Tenancy
GDPR Admin is built mainly for B2B SaaS services and assumes that the service may have multiple tenants. The GdprAdmin::Request object always expects a tenant to be provided. When processing the request, data will be automatically segregated using the tenant adapter.
Tenant Class
By default, GDPR Admin will assume that the tenant class is Organization. You can change that by setting the tenant_class in the config.
GdprAdmin.configure do |config|
  config.tenant_class = 'Tenant'
end
ActsAsTenant Adapter
You can segregate processing by tenant using the ActsAsTenant gem:
GdprAdmin.configure do |config|
  config.tenant_adapter = :acts_as_tenant
end
Rollback on Failure
By default, GDPR Admin will attempt to roll back all changes made while processing a Request if the processing fails. This can be turned off with the flag rollback_on_failure. In that case, all changes are committed as soon as they are made.
GdprAdmin.configure do |config|
  config.rollback_on_failure = false
end
Jobs
Requests are processed asynchronously using ActiveJob.
Custom Queue
You can set which queue should be used to schedule the job for processing the request using the config default_job_queue:
GdprAdmin.configure do |config|
  config.default_job_queue = :gdpr_tasks
end
Grace Periods
To allow for cancelling any accidental erasure request, the jobs are scheduled with a configurable grace period. By default, erasure requests will wait 4 hours before being executed, while export requests will be executed immediately.
GdprAdmin.configure do |config|
  config.erasure_grace_period = 1.day
  config.export_grace_period = 2.minutes
end
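The effect of a grace period on scheduling is simple arithmetic, sketched below with the default values stated above (4 hours for erasure, none for export). The sketch assumes that both erase_data and erase_subject count as erasure requests; SKETCH_GRACE_PERIODS and run_at_sketch are hypothetical names, not part of the gem.

```ruby
# Default grace periods in seconds; the gem lets you configure both.
SKETCH_GRACE_PERIODS = {
  'erase_data' => 4 * 60 * 60,
  'erase_subject' => 4 * 60 * 60,
  'export_subject' => 0,
}.freeze

# Earliest time at which a request created at `created_at` would run.
def run_at_sketch(request_type, created_at, grace_periods = SKETCH_GRACE_PERIODS)
  created_at + grace_periods.fetch(request_type)
end

created_at = Time.utc(2024, 1, 1, 12, 0, 0)
run_at_sketch('erase_data', created_at)     # => 2024-01-01 16:00:00 UTC
run_at_sketch('export_subject', created_at) # => 2024-01-01 12:00:00 UTC
```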
Other Configurations
Data Policies Directory
By default, GDPR Admin will assume that your data policies are defined in app/gdpr. If you wish to have them in a different place, you can change the option data_policies_path:
GdprAdmin.configure do |config|
  # Change data policies path to be within the models directory (app/models/gdpr)
  config.data_policies_path = Rails.root.join('app', 'models', 'gdpr')
end
License
The gem is available as open source under the terms of the MIT License.