Faulty
Fault-tolerance tools for Ruby based on circuit breakers.
users = Faulty.circuit(:api).try_run do
api.users
end.or_default([])
Installation
Add it to your Gemfile:
gem 'faulty'
Or install it manually:
gem install faulty
During your app startup, call Faulty.init. For Rails, you would do this in config/initializers/faulty.rb. See Setup for details.
API Docs
API docs can be read on rubydoc.info, inline in the source code, or you can generate them yourself with Ruby yard:
bin/yardoc
Then open doc/index.html in your browser.
Setup
Use the default configuration options:
Faulty.init
Or specify your own configuration:
Faulty.init do |config|
config.storage = Faulty::Storage::Redis.new
config.listeners << Faulty::Events::CallbackListener.new do |events|
events.circuit_opened do |payload|
puts 'Circuit was opened'
end
end
end
For a full list of configuration options, see the Global Configuration section.
What is this for?
Circuit breakers are a fault-tolerance tool for creating separation between your application and external dependencies. For example, your application may call an external API to send a text message:
TextApi.send(message)
In normal operation, this API call is very fast. However, what if the texting service started hanging? Your application would quickly use up a lot of resources waiting for requests to return from the service. You could consider adding a timeout to your request:
TextApi.send(message, timeout: 5)
Now your application will terminate requests after 5 seconds, but that could still add up to a lot of resources if you call this thousands of times. Circuit breakers solve this problem.
Faulty.circuit(:text_api).run do
TextApi.send(message, timeout: 5)
end
Now, when the text API hangs, the first few calls will run and start timing out. This will trip the circuit. After the circuit trips (see How it Works), calls to the text API will be paused for the configured cool-down period. Your application resources are not overwhelmed.
You are free to implement a fallback or error handling however you wish. For example, in this case, you might add the text message to a failure queue:
begin
  Faulty.circuit(:text_api).run do
    TextApi.send(message, timeout: 5)
  end
rescue Faulty::CircuitError
  FailureQueue.enqueue(message)
end
Basic Usage
To create a circuit, call Faulty.circuit. This can be done as you use the circuit, or you can set it up beforehand. Any options passed to the circuit method are synchronized across threads and saved as long as the process is alive.
circuit1 = Faulty.circuit(:api, cache_refreshes_after: 1800)
# The options from above are also used when called here
circuit2 = Faulty.circuit(:api)
circuit2.options.cache_refreshes_after == 1800 # => true
# The same circuit is returned on each consecutive call
circuit1.equal?(circuit2) # => true
To run a circuit, call the run method:
Faulty.circuit(:api).run do
api.users
end
See How it Works for more details about how Faulty handles circuit failures.
If the run block above fails, a Faulty::CircuitError will be raised. It is up to your application to handle that error however necessary, or to crash. Often though, you don't want to crash your application when a circuit fails, but instead apply a fallback or default behavior. For this, Faulty provides the try_run method:
result = Faulty.circuit(:api).try_run do
api.users
end
users = if result.ok?
result.get
else
[]
end
The try_run method returns a result type instead of raising errors. See the API docs for Result for more information. Here we use it to check whether the result is ok? (not an error). If it is, we set the users variable; otherwise, we set a default of an empty array. This pattern is so common that Result also implements a helper method, or_default, to do the same thing:
users = Faulty.circuit(:api).try_run do
api.users
end.or_default([])
How it Works
Faulty implements a version of circuit breakers inspired by "Release It!: Design and Deploy Production-Ready Software" by Michael T. Nygard and Martin Fowler's post on the subject. A few notable features of Faulty's implementation are:
- Rate-based failure thresholds
- Integrated caching inspired by Netflix's Hystrix with automatic cache jitter and error fallback.
- Event-based monitoring
- Flexible fault-tolerant storage with optional fallbacks
Following the principles of the circuit-breaker pattern, the block given to run or try_run will always be executed as long as it never raises an error. If the block does raise an error, then the circuit keeps track of the number of runs and the failure rate.
Once both thresholds are breached, the circuit is opened. Once open, the circuit starts the cool-down period. Any executions within that cool-down are skipped, and a Faulty::OpenCircuitError will be raised.
After the cool-down has elapsed, the circuit enters the half-open state. In this state, Faulty allows a single execution of the block as a test run. If the test run succeeds, the circuit is closed and the circuit state is reset. If the test run fails, the circuit is re-opened and the cool-down is reset.
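These thresholds and the cool-down map directly to circuit options (see Circuit Options below). As a rough sketch, assuming a hypothetical PaymentApi client:
# Trips after at least 3 recorded runs with a failure rate of 50% or more,
# then skips calls for 300 seconds before allowing a half-open test run.
Faulty.circuit(:payments, sample_threshold: 3, rate_threshold: 0.5, cool_down: 300).run do
  PaymentApi.charge(order) # hypothetical call protected by the circuit
end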
Each time the circuit changes state or executes the block, events are raised that are sent to the Faulty event notifier. The notifier should be used to track circuit failure rates, open circuits, etc.
In addition to the classic circuit breaker design, Faulty implements caching that is integrated with the circuit state. See Caching for more detail.
Configuration
Faulty can be configured with the options shown below; the examples illustrate the default values. The first example configures Faulty globally. The second shows the same configuration using an instance of Faulty instead of the global configuration.
Faulty.init do |config|
# The cache backend to use. By default, Faulty looks for a Rails cache. If
# that's not available, it uses an ActiveSupport::Cache::MemoryStore instance.
# Otherwise, it uses Faulty::Cache::Null and caching is disabled.
# Whatever backend is given here is automatically wrapped in
# Faulty::Cache::AutoWire. This adds fault-tolerance features, see the
# AutoWire API docs for more details.
config.cache = Faulty::Cache::Default.new
# The storage backend. By default, Faulty uses an in-memory store. For most
# production applications, you'll want a more robust backend. Faulty also
# provides Faulty::Storage::Redis for this.
# Whatever backend is given here is automatically wrapped in
# Faulty::Storage::AutoWire. This adds fault-tolerance features, see the
# AutoWire API docs for more details. If an array of storage backends is
# given, each one will be tried in order until one succeeds.
config.storage = Faulty::Storage::Memory.new
# An array of event listeners. Each object in the array should implement
# Faulty::Events::ListenerInterface. For ad-hoc custom listeners, Faulty
# provides Faulty::Events::CallbackListener.
config.listeners = [Faulty::Events::LogListener.new]
# The event notifier. For most use-cases, you don't need to change this.
# However, Faulty allows substituting your own notifier if necessary.
# If overridden, config.listeners will be ignored.
config.notifier = Faulty::Events::Notifier.new(config.listeners)
end
Here is the same configuration using an instance of Faulty. This is a more object-oriented approach.
faulty = Faulty.new do |config|
config.cache = Faulty::Cache::Default.new
config.storage = Faulty::Storage::Memory.new
config.listeners = [Faulty::Events::LogListener.new]
config.notifier = Faulty::Events::Notifier.new(config.listeners)
end
Most of the examples in this README use the global Faulty class methods, but they work the same way when using an instance. Just substitute your instance instead of Faulty. There is no preferred way to use Faulty. Choose whichever configuration mechanism works best for your application. Also see Multiple Configurations if your application needs to set different options in different scenarios.
For all Faulty APIs that have configuration, you can also pass in an options hash. For example, Faulty.init could be called like this:
Faulty.init(cache: Faulty::Cache::Null.new)
Circuit Options
A circuit can be created with the following configuration options. Those options are only set once, synchronized across threads, and will persist in memory until the process exits. If you're using multiple configurations, the options are retained within the context of each instance. All options given after the first call to Faulty.circuit (or Faulty#circuit) are ignored. This is because the circuit objects themselves are internally memoized and are read-only once created.
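For example, assuming the :api circuit has not been created elsewhere, a sketch of this behavior:
Faulty.circuit(:api, cool_down: 60)
# Options passed on later calls are ignored; the memoized circuit from the
# first call is returned unchanged.
Faulty.circuit(:api, cool_down: 120).options.cool_down # => 60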
The following example represents the defaults for a new circuit:
Faulty.circuit(:api) do |config|
# The cache backend for this circuit. Inherits the global cache by default.
config.cache = Faulty.options.cache
# The number of seconds before a cache entry is expired. After this time, the
# cache entry may be fully deleted. If set to nil, the cache will not expire.
config.cache_expires_in = 86400
# The number of seconds before a cache entry should be refreshed. See the
# Caching section for more detail. A value of nil disables cache refreshing.
config.cache_refreshes_after = 900
# The number of seconds to add or subtract from cache_refreshes_after
# when determining whether a cache entry should be refreshed. Helps mitigate
# the "thundering herd" effect
config.cache_refresh_jitter = 0.2 * config.cache_refreshes_after
# After a circuit is opened, the number of seconds to wait before moving the
# circuit to half-open.
config.cool_down = 300
# The errors that will be captured by Faulty and used to trigger circuit
# state changes.
config.errors = [StandardError]
# Errors that should be ignored by Faulty and not captured.
config.exclude = []
# The event notifier. Inherits the global notifier by default
config.notifier = Faulty.options.notifier
# The minimum failure rate required to trip a circuit
config.rate_threshold = 0.5
# The minimum number of runs required before a circuit can trip
config.sample_threshold = 3
# The storage backend for this circuit. Inherits the global storage by default
config.storage = Faulty.options.storage
end
Following the same convention as Faulty.init, circuits can also be created with an options hash:
Faulty.circuit(:api, cache_expires_in: 1800)
Caching
Faulty integrates caching into its circuits in a way that is particularly suited to fault tolerance. To make use of caching, you must specify the cache configuration option when initializing Faulty or creating a new Faulty instance. If you're using Rails, this is automatically set to the Rails cache.
Once your cache is configured, you can use the cache parameter when running a circuit to specify a cache key:
feed = Faulty.circuit(:rss_feeds)
.try_run(cache: "rss_feeds/#{feed}") do
fetch_feed(feed)
end.or_default([])
By default, a circuit has the following cache options:
- cache_expires_in: 86400 (1 day). This is sent to the cache backend and defines how long the cache entry should be stored. After this time elapses, queries will result in a cache miss.
- cache_refreshes_after: 900 (15 minutes). This is used internally by Faulty to indicate when a cache should be refreshed. It does not affect how long the cache entry is stored.
- cache_refresh_jitter: 180 (3 minutes = 20% of cache_refreshes_after). The maximum number of seconds to randomly add or subtract from cache_refreshes_after when determining whether to refresh a cache entry. This mitigates the "thundering herd" effect caused by many processes simultaneously refreshing the cache.
This code will attempt to fetch an RSS feed protected by a circuit. If the feed is within the cache refresh period, then the result will be returned from the cache and the block will not be executed regardless of the circuit state.
If the cache is hit, but outside its refresh period, then Faulty will check the circuit state. If the circuit is closed or half-open, then it will run the block. If the block is successful, then it will update the circuit, write to the cache and return the new value.
However, if the cache is hit and the block fails, then that failure is noted in the circuit and Faulty returns the cached value.
If the circuit is open and the cache is hit, then Faulty will always return the cached value.
If the cache query results in a miss, then Faulty operates as normal. In the code above, if the circuit is closed, the block will be executed. If the block succeeds, the cache is refreshed. If the block fails, the default of [] will be returned.
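If these defaults don't suit a particular use, the cache timing options above can be overridden per circuit (see Circuit Options). A minimal sketch, reusing the hypothetical fetch_feed from the example above:
# Cache entries expire after 1 hour and are considered refreshable after 5 minutes
feed = Faulty.circuit(:rss_feeds, cache_expires_in: 3600, cache_refreshes_after: 300)
  .try_run(cache: "rss_feeds/#{feed}") { fetch_feed(feed) }
  .or_default([])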
Fault Tolerance
Faulty backends are fault-tolerant by default. Any StandardError raised by the storage or cache backends is captured and suppressed. Failure events for these errors are sent to the notifier.
In case of a flaky storage or cache backend, Faulty also uses independent in-memory circuits to track failures so that we don't keep calling a backend that is failing. See the API docs for Cache::AutoWire and Storage::AutoWire for more details.
If the storage backend fails, circuits will default to closed. If the cache backend fails, all cache queries will miss.
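Because backend failures are suppressed rather than raised, you may want to monitor them through the notifier. A minimal sketch using the CallbackListener and the cache_failure and storage_failure events documented in the next section:
Faulty.init do |config|
  config.listeners << Faulty::Events::CallbackListener.new do |events|
    events.storage_failure do |payload|
      warn "Faulty storage failure during #{payload[:action]}: #{payload[:error].class}"
    end
    events.cache_failure do |payload|
      warn "Faulty cache failure for #{payload[:key]}: #{payload[:error].class}"
    end
  end
end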
Event Handling
Faulty uses an event-dispatching model to deliver notifications of internal events. The full list of events is available from Faulty::Events::EVENTS.
- cache_failure - A cache backend raised an error. Payload: key, action, error
- circuit_cache_hit - A circuit hit the cache. Payload: circuit, key
- circuit_cache_miss - A circuit missed the cache. Payload: circuit, key
- circuit_cache_write - A circuit wrote to the cache. Payload: circuit, key
- circuit_closed - A circuit closed. Payload: circuit
- circuit_failure - A circuit execution raised an error. Payload: circuit, status, error
- circuit_opened - A circuit execution caused the circuit to open. Payload: circuit, error
- circuit_reopened - A circuit execution caused the circuit to reopen from half-open. Payload: circuit, error
- circuit_skipped - A circuit execution was skipped because the circuit is open. Payload: circuit
- circuit_success - A circuit execution was successful. Payload: circuit, status
- storage_failure - A storage backend raised an error. Payload: circuit (can be nil), action, error
By default, events are logged using Faulty::Events::LogListener, but that can be replaced, or additional listeners can be added.
CallbackListener
The callback listener is useful for ad-hoc handling of events. You can specify a handler for an event by calling a method of the same name on the callback listener.
Faulty.init do |config|
# Replace the default listener with a custom callback listener
listener = Faulty::Events::CallbackListener.new do |events|
events.circuit_opened do |payload|
MyNotifier.alert("Circuit #{payload[:circuit].name} opened: #{payload[:error].}")
end
end
config.listeners = [listener]
end
Other Built-in Listeners
In addition to the log and callback listeners, Faulty intends to implement built-in service-specific handlers to make it easy to integrate with monitoring and reporting software.
- Faulty::Events::HoneybadgerListener: Reports circuit and backend errors to the Honeybadger error reporting service.
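To enable it, add an instance to the listeners array (a sketch; this assumes the listener takes no constructor arguments and that the honeybadger gem is already configured):
Faulty.init do |config|
  config.listeners << Faulty::Events::HoneybadgerListener.new
end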
Custom Listeners
You can implement your own listener by following the documentation in Faulty::Events::ListenerInterface. For example:
class MyFaultyListener
def handle(event, payload)
MyNotifier.alert(event, payload)
end
end
Faulty.init do |config|
config.listeners = [MyFaultyListener.new]
end
Configuring the Storage Backend
A storage backend is required to use Faulty. By default, it uses in-memory storage, but Redis is also available, along with a number of wrappers used to improve resiliency and fault-tolerance.
Memory
The Faulty::Storage::Memory backend is the default storage backend. You may prefer this implementation if you want to avoid the complexity and potential failure modes of cross-network circuit storage. The trade-off is that circuit state is only contained within a single process and will not be saved across application restarts. Locks will also be cleared on restart.
The default configuration:
Faulty.init do |config|
config.storage = Faulty::Storage::Memory.new do |storage|
# The maximum number of circuit runs that will be stored
storage.max_sample_size = 100
end
end
Redis
The Faulty::Storage::Redis backend provides distributed circuit storage using Redis. Although Faulty takes steps to reduce risk (see Fault Tolerance), using cross-network storage does introduce some additional failure modes. To reduce this risk, be sure to set conservative timeouts for your Redis connection. Setting high timeouts will print warnings to stderr.
The default configuration:
Faulty.init do |config|
config.storage = Faulty::Storage::Redis.new do |storage|
# The Redis client. Accepts either a Redis instance, or a ConnectionPool
# of Redis instances. A low timeout is highly recommended to prevent
# cascading failures when evaluating circuits.
storage.client = ::Redis.new(timeout: 1)
# The prefix to prepend to all redis keys used by Faulty circuits
storage.key_prefix = 'faulty'
# A string to separate the parts of the redis key
storage.key_separator = ':'
# The maximum number of circuit runs that will be stored
storage.max_sample_size = 100
# The maximum number of seconds that a circuit run will be stored
storage.sample_ttl = 1800
# The maximum number of seconds to store a circuit. Does not apply to
# locks, which are indefinite.
storage.circuit_ttl = 604_800 # 1 Week
# The number of seconds between circuit expirations. Changing this setting
# is not recommended. See API docs for more implementation details.
storage.list_granularity = 3600
# If true, disables warnings about recommended client settings like timeouts
storage.disable_warnings = false
end
end
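As the comments above note, the client option also accepts a ConnectionPool of Redis connections, which can be useful in multi-threaded applications. A sketch, assuming the connection_pool gem is available:
require 'connection_pool'

Faulty.init do |config|
  config.storage = Faulty::Storage::Redis.new do |storage|
    # Keep timeouts low to avoid cascading failures when evaluating circuits
    storage.client = ConnectionPool.new(size: 5, timeout: 1) do
      ::Redis.new(timeout: 1)
    end
  end
end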
FallbackChain
The Faulty::Storage::FallbackChain backend is a wrapper for multiple prioritized storage backends. If the first backend in the chain fails, consecutive backends are tried until one succeeds. The recommended use case for this is to fall back on reliable storage if a networked storage backend fails.
For example, you may configure Redis as your primary storage backend, with an in-memory storage backend as a fallback:
Faulty.init do |config|
config.storage = Faulty::Storage::FallbackChain.new([
Faulty::Storage::Redis.new,
Faulty::Storage::Memory.new
])
end
Faulty instances will automatically use a fallback chain if an array is given to the storage option, so this example is equivalent to the above:
Faulty.init do |config|
config.storage = [
Faulty::Storage::Redis.new,
Faulty::Storage::Memory.new
]
end
If the fallback chain fails over to backup storage, circuit states will not carry over, so failover could be temporarily disruptive to your application. However, any calls to #lock or #unlock will always be persisted to all backends so that locks are maintained during failover.
Storage::FaultTolerantProxy
This wrapper is applied to all non-fault-tolerant storage backends by default (see the API docs for Faulty::Storage::AutoWire). Faulty::Storage::FaultTolerantProxy is a wrapper that suppresses storage errors and returns sensible defaults during failures. If a storage backend is failing, all circuits will be treated as closed regardless of locks or previous history.
If you wish your application to use a secondary storage backend instead of failing closed, use FallbackChain.
Storage::CircuitProxy
This wrapper is applied to all non-fault-tolerant storage backends by default (see the API docs for Faulty::Storage::AutoWire). Faulty::Storage::CircuitProxy is a wrapper that uses an independent in-memory circuit to track failures to storage backends. If a storage backend fails continuously, it will be temporarily disabled and will raise Faulty::CircuitError.
Typically this is used inside a FaultTolerantProxy or FallbackChain so that these storage failures are handled gracefully.
Configuring the Cache Backend
Null
The Faulty::Cache::Null cache disables caching. It is the default if Rails and ActiveSupport are not present.
Rails
Faulty::Cache::Rails is the default cache if Rails or ActiveSupport are present. If Rails is present, it uses Rails.cache as the backend. If ActiveSupport is present, but Rails is not, it creates a new ActiveSupport::Cache::MemoryStore by default. This backend can be used with any ActiveSupport::Cache.
Faulty.init do |config|
config.cache = Faulty::Cache::Rails.new(
ActiveSupport::Cache::RedisCacheStore.new
)
end
Cache::FaultTolerantProxy
This wrapper is applied to all non-fault-tolerant cache backends by default (see the API docs for Faulty::Cache::AutoWire). Faulty::Cache::FaultTolerantProxy is a wrapper that suppresses cache errors and acts like a null cache during failures. Reads always return nil, and writes are no-ops.
Cache::CircuitProxy
This wrapper is applied to all non-fault-tolerant cache backends by default (see the API docs for Faulty::Cache::AutoWire). Faulty::Cache::CircuitProxy is a wrapper that uses an independent in-memory circuit to track failures to cache backends. If a cache backend fails continuously, it will be temporarily disabled and will raise Faulty::CircuitError. Typically this is used inside a FaultTolerantProxy so that these cache failures are handled gracefully.
Listing Circuits
For monitoring or debugging, you may need to retrieve a list of all circuit names. This is possible with Faulty.list_circuits (or Faulty#list_circuits if you're using an instance).
You can get a list of all circuit statuses by mapping those names to their status objects. Be careful though, since this could cause performance issues for very large numbers of circuits.
statuses = Faulty.list_circuits.map do |name|
Faulty.circuit(name).status
end
Locking Circuits
It is possible to lock a circuit open or closed. A circuit that is locked open will never execute its block and will always raise a Faulty::OpenCircuitError. This is useful in cases where you need to manually disable a dependency entirely. If a cached value is available, it will be returned from the circuit until it expires, even outside its refresh period.
Faulty.circuit(:broken_api).lock_open!
A circuit that is locked closed will never trip. This is useful in cases where a circuit is continuously tripping incorrectly. If a cached value is available, it will have the same behavior as an unlocked circuit.
Faulty.circuit(:false_positive).lock_closed!
To remove a lock of either type:
Faulty.circuit(:fixed).unlock!
Locking or unlocking a circuit has no concurrency guarantees, so it's not recommended to lock or unlock circuits from production code. Instead, locks are intended as an emergency tool for troubleshooting and debugging.
Multiple Configurations
It is possible to have multiple configurations of Faulty running within the same process. The most common setup is to simply use Faulty.init to configure Faulty globally; however, it is possible to have additional configurations.
The default instance
When you call Faulty.init, you are actually creating the default instance of Faulty. You can access this instance directly by calling Faulty.default.
# We create the default instance
Faulty.init
# Access the default instance
faulty = Faulty.default
# Alternatively, access the instance by name
faulty = Faulty[:default]
You can rename the default instance if desired:
Faulty.init(:custom_default)
instance = Faulty.default
instance = Faulty[:custom_default]
Multiple Instances
If you want multiple instances, but want global, thread-safe access to them, you can use Faulty.register:
api_faulty = Faulty.new do |config|
# This accepts the same options as Faulty.init
end
Faulty.register(:api, api_faulty)
# Now access the instance globally
Faulty[:api]
When you call Faulty.circuit, that's the same as calling Faulty.default.circuit, so you can apply the same principle to any other registered Faulty instance:
Faulty[:api].circuit(:api_circuit).run { 'ok' }
Standalone Instances
If you choose, you can use Faulty instances without registering them globally. This is more object-oriented and is necessary if you use dependency injection.
faulty = Faulty.new
faulty.circuit(:standalone_circuit)
Calling #circuit on the instance still has the same memoization behavior as Faulty.circuit, so subsequent calls to the same circuit will return a memoized circuit object.
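For example, a dependency-injected service object might accept a Faulty instance instead of relying on the global registry. A sketch, where UserClient and fetch_users are hypothetical:
class UserClient
  def initialize(faulty: Faulty.new)
    @faulty = faulty
  end

  def users
    @faulty.circuit(:users_api).try_run { fetch_users }.or_default([])
  end
end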
Implementing a Cache Backend
You can implement your own cache backend by following the documentation in Faulty::Cache::Interface. It is a fairly simple API, with only get/set methods. For example:
class MyFaultyCache
def initialize(my_cache)
@cache = my_cache
end
def read(key)
@cache.read(key)
end
def write(key, value, expires_in: nil)
@cache.write(key, value, expires_in: expires_in)
end
# Set this to false unless your cache never raises errors
def fault_tolerant?
false
end
end
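Once implemented, the custom backend can be wired in like any other cache (a sketch; MyCacheLibrary is hypothetical):
Faulty.init do |config|
  config.cache = MyFaultyCache.new(MyCacheLibrary.new)
end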
Feel free to open a pull request if your cache backend would be useful for other users.
Implementing a Storage Backend
You can implement your own storage backend by following the documentation in Faulty::Storage::Interface. Since the storage has some tricky requirements regarding concurrency, Faulty::Storage::Memory can be used as a reference implementation. Feel free to open a pull request if your storage backend would be useful for other users.
Alternatives
Faulty has its own opinions about how to implement a circuit breaker in Ruby, but there are and have been many other options:
Currently Active
- semian: A resiliency toolkit that includes circuit breakers. It uses adapters to auto-wire circuits, and it has only host-local storage by design.
- circuitbox: Similar in design to Faulty, but with a different API. It uses Moneta to abstract circuit storage to allow any key-value store.
Previous Work
- circuit_breaker-ruby (no recent activity)
- stoplight (unmaintained)
- circuit_breaker (archived)
- simple_circuit_breaker (unmaintained)
- breaker (unmaintained)
- circuit_b (unmaintained)
Faulty's Unique Features
- Simple API but configurable for advanced users
- Pluggable storage backends (circuitbox also has this)
- Protected storage access with fallback to safe storage
- Global, or object-oriented configuration with multiple instances
- Integrated caching support tailored for fault-tolerance
- Manually lock circuits open or closed