PerfectSched
A highly available distributed cron built on PerfectQueue.
It provides exactly-once semantics unless the backend database fails. Every registered schedule is pushed, in order, to a queue provided by PerfectQueue each time it comes due.
You can register, modify and delete schedules using the command line utility or the library API.
The backend database is pluggable. PerfectSched currently supports RDBMS and Amazon SimpleDB.
Architecture
PerfectSched uses the following database schema:
(
id:string -- unique identifier of the schedule
data:blob -- additional attributes to be pushed to PerfectQueue
next_time:int -- unix time of the next schedule
cron:string -- description of the schedule
delay:int -- delay time before running a schedule
timeout:int -- unix time after which the schedule can be picked up; also used as the lock
)
Each scheduler process repeats these steps:
- list: lists tasks whose timeout column is old enough.
- lock: updates the timeout column of the first task.
- push: pushes a message to PerfectQueue.
- update: if the push succeeded, updates the next_time and timeout columns.
- or leave: if the push failed, leaves the row as it is so that it is retried.
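The steps above can be sketched in memory as follows. This is only an illustration, not PerfectSched's actual code: `Row`, `run_once`, `LOCK_SECONDS`, the fixed `interval` that stands in for cron parsing, and the rule `timeout = next_time + delay` are all assumptions made for the example.

```ruby
# In-memory sketch of one scheduler pass. Row, run_once, LOCK_SECONDS and the
# fixed interval are hypothetical; this is not PerfectSched's actual code.
Row = Struct.new(:id, :data, :next_time, :cron, :delay, :timeout)

LOCK_SECONDS = 60 # assumed lock duration

# `queue` is any callable taking (task_id, data); `interval` stands in for
# real cron parsing and means "the schedule fires every `interval` seconds".
def run_once(rows, queue, now, interval)
  # list: rows whose timeout column is old enough; take the oldest one
  row = rows.select { |r| r.timeout <= now }.min_by(&:timeout)
  return nil unless row

  # lock: move the timeout forward so other schedulers skip this row for now
  row.timeout = now + LOCK_SECONDS

  begin
    # push: the task ID is "<id of the schedule>.<unix time of the schedule>"
    queue.call("#{row.id}.#{row.next_time}", row.data)
  rescue StandardError
    # leave: next_time is untouched, so the row is retried after the lock expires
    return nil
  end

  # update: advance next_time and timeout to the following run
  pushed_time = row.next_time
  row.next_time += interval
  row.timeout = row.next_time + row.delay # assumption: timeout = next_time + delay
  pushed_time
end

t = 1314662400 # 2011-08-30 00:00:00 UTC
rows = [Row.new('my-sched', '{"any":"data"}', t, '* * * * *', 0, t)]
pushed = []
queue = ->(task_id, data) { pushed << [task_id, data] }

run_once(rows, queue, t, 60)
p pushed
```

In a real backend the lock step would have to be atomic (for example a conditional update in the database) so that two schedulers cannot lock the same row; the sketch ignores concurrency.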
Cooperation with PerfectQueue
PerfectSched pushes a task to PerfectQueue each time a schedule comes due. The ID of the task is “<id of the schedule>.<unix time of the schedule>”. For example, if the identifier of the schedule is “my-sched” and it runs at “2011-08-30 00:00:00 UTC” (1314662400 in unix time), the ID of the task is “my-sched.1314662400”. The data of the task is the same as the data of the schedule.
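A consumer that needs to predict or parse task IDs can rebuild them from the rule above. The helper `task_id` below is hypothetical and not part of the PerfectSched API; it only illustrates the naming scheme.

```ruby
# Hypothetical helper, not part of the PerfectSched API: builds the task ID
# "<id of the schedule>.<unix time of the schedule>".
def task_id(sched_id, scheduled_at)
  "#{sched_id}.#{scheduled_at.to_i}"
end

run_at = Time.utc(2011, 8, 30, 0, 0, 0)
puts task_id('my-sched', run_at) # prints my-sched.1314662400
```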
Library usage
Adding a schedule
require 'perfectsched'
# RDBMS backend
require 'perfectsched/backend/rdb'
sched = PerfectSched::Backend::RDBBackend.new(
  'mysql://user:password@localhost/mydb', 'perfectsched')  # database URI, table name
# Or the SimpleDB backend (use one of the two)
require 'perfectsched/backend/simpledb'
sched = PerfectSched::Backend::SimpleDBBackend.new(
  'AWS_KEY_ID', 'AWS_SECRET_KEY', 'your-simpledb-domain-name')
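id = 'unique-key-id'
cron = "* * * * *"        # cron expression: every minute
delay = 0                 # seconds to wait before running a schedule
data = '{"any":"data"}'   # attributes pushed to PerfectQueue
start = Time.now.to_i     # unix time from which the schedule starts
sched.add(id, cron, delay, data, start)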
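The examples in this document pass JSON text as `data`. The schema only says that `data` is a blob, so JSON is a convention here, not a requirement. Assuming you follow that convention, the string can be built with Ruby's standard `json` library instead of being written by hand:

```ruby
require 'json'

# Build the data string from a Hash instead of writing JSON by hand.
attrs = { 'any' => 'data' }
data = JSON.generate(attrs)

# The consumer side can turn the string back into a Hash.
restored = JSON.parse(data)
puts data
```

The resulting `data` can then be passed to `sched.add` exactly as in the example above.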
Deleting a schedule
sched.delete(id)
Modifying a schedule
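# Change the schedule and delay only
cron = "* * * * 0"
delay = 10
sched.modify_sched(id, cron, delay)
# Change the data only
data = '{"user":1}'
sched.modify_data(id, data)
# Change the schedule, delay and data at once
sched.modify(id, cron, delay, data)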
Command line usage
Usage: perfectsched [options]
        --setup PATH.yaml            Write example configuration file
    -f, --file PATH.yaml             Set path to the configuration file
        --list                       Show registered schedule
        --delete ID                  Delete a registered schedule
        --add <ID> <CRON> <DATA>     Register a schedule
    -d, --delay SEC                  Delay time before running a schedule (default: 0)
    -s, --start UNIXTIME             Start time to run a schedule (default: now)
    -S, --modify-sched <ID> <CRON>   Modify schedule of a registered schedule
    -D, --modify-delay <ID> <DELAY>  Modify delay of a registered schedule
    -J, --modify-data <ID> <DATA>    Modify data of a registered schedule
    -b, --daemon PIDFILE             Daemonize (default: foreground)
    -o, --log PATH                   log file path
    -v, --verbose                    verbose mode
Configuration
First of all, create a configuration file:
$ perfectsched --setup config.yaml
$ edit config.yaml
Adding a schedule
$ perfectsched -f config.yaml --add unique-key-id "* * * * *" '{"any":"data"}'
Deleting a schedule
$ perfectsched -f config.yaml --delete unique-key-id
Modifying a schedule
$ perfectsched -f config.yaml --modify-sched unique-key-id "* * * * 0"
$ perfectsched -f config.yaml --modify-delay unique-key-id 10
$ perfectsched -f config.yaml --modify-data unique-key-id '{"user":1}'
Listing registered schedules
$ perfectsched -f config.yaml --list
id     schedule   delay  next time                  next run                   data
test1  * * * * *  0      2011-08-30 01:29:42 +0900  2011-08-30 01:29:42 +0900  {"attr1":"val1","attr":"val2"}
1 entries.
Running a scheduler
$ perfectsched -f config.yaml
It’s recommended to run the scheduler on several servers for availability.