boatman

Boatman is a simple Ruby DSL for polling directories and ferrying around or manipulating new files that appear in them. It was created as a more elegant alternative to maintaining numerous scripts that all perform very similar file transfer and manipulation tasks.

Install

Install the boatman gem (assuming you have Ruby and RubyGems):

gem install boatman

Example

Get a quick feel for what boatman does with this example.

Create a YAML file to define the task scripts and directories they’ll operate on. Let’s call it demo.yml:

tasks:
- demo.rb
directories:
  fresh_text_folder: txt_output
  text_storage_folder: storage

Now create two folders, “txt_output” and “storage”, in the same folder as demo.yml.

Make a task file demo.rb:

fresh_text_folder.check_every 5.seconds do
  age :greater_than => 1.second

  files_ending_with "txt" do |file|
    move file, :to => text_storage_folder
  end
end

Now run your task:

boatman demo.yml

While boatman is running, open another terminal or a file browser and create a file called “demo.txt” in the txt_output folder. After a few seconds it should be moved to the “storage” folder.

Press Ctrl-C to exit boatman. The boatman.log file it creates records the operations that were performed.

Creating tasks

Polling folders

The top level of a boatman task will usually be a directory polling loop. This is accomplished by running the check_every method on a directory specified in your YAML configuration file. The check_every method takes a time interval as its only argument other than a block. Using the example above, running

fresh_text_folder.check_every 5.seconds do
  ...
end

will run the provided do..end block every 5 seconds in the context of the fresh_text_folder directory.
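
To make the polling behaviour concrete, here is a minimal plain-Ruby sketch of a loop that runs a block at a fixed interval. It is not boatman's implementation: poll_every and its max_runs parameter are hypothetical names, and the interval is a plain number of seconds rather than a value like 5.seconds.

```ruby
# Hypothetical helper, not part of boatman: runs the block once per
# interval and stops after max_runs so the example terminates.
def poll_every(interval_seconds, max_runs:)
  max_runs.times do |run|
    yield run
    # no need to wait after the final run
    sleep interval_seconds unless run == max_runs - 1
  end
end

runs = []
poll_every(0.01, max_runs: 3) { |run| runs << run }
```

A real poller would loop until interrupted; the run limit is only there to keep the sketch finite.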

Specifying file/folder criteria

It is often desirable to consider just a subset of the files in the folder being polled. Files/folders can be selected by age and by name. Age criteria are specified inside the check_every do..end block:

# age of the file must be greater than 1 minute
age :greater_than => 1.minute

# age of the file must be less than 24 hours
age :less_than => 24.hours
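
As a rough illustration of what an age criterion means, the sketch below checks a file's age in plain Ruby. It assumes age is measured from the file's modification time, which this README does not state, and age_matches? is a hypothetical helper rather than part of boatman.

```ruby
require "tmpdir"

# Hypothetical helper, not boatman's API. Assumes "age" means seconds
# elapsed since the file was last modified.
def age_matches?(path, greater_than: nil, less_than: nil, now: Time.now)
  age = now - File.mtime(path)
  return false if greater_than && age <= greater_than
  return false if less_than && age >= less_than
  true
end

older_than_a_minute = nil
newer_than_a_minute = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, "demo.txt")
  File.write(path, "hello")

  # backdate the file so it appears two minutes old
  two_minutes_ago = Time.now - 120
  File.utime(two_minutes_ago, two_minutes_ago, path)

  older_than_a_minute = age_matches?(path, greater_than: 60)
  newer_than_a_minute = age_matches?(path, less_than: 60)
end
```

A minimum age is a common way to avoid picking up files that are still being written.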

There are a few ways to select based on file name. Each of these methods accepts a block to run on each selected file/folder:

# select files by a string or regular expression
files_matching /\d+\.tif/ do |file|
  ...
end

# select files whose names end with a string or regular expression
files_ending_with "txt" do |file|
  ...
end

# select folders by a string or regular expression
folders_matching /\d+/ do |folder|
  ...
end

# select folders whose names end with a string or regular expression
folders_ending_with "log" do |folder|
  ...
end
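
The difference between the "matching" and "ending_with" selectors can be sketched in plain Ruby as follows. This is only an illustration: name_matches? and name_ends_with? are hypothetical helpers, and treating a string pattern as a substring match is an assumption, not something this README specifies.

```ruby
# Hypothetical helpers, not boatman's API.

# A Regexp may match anywhere in the name; a String is assumed to be
# a substring match.
def name_matches?(name, pattern)
  pattern.is_a?(Regexp) ? name.match?(pattern) : name.include?(pattern)
end

# The pattern must match at the very end of the name.
def name_ends_with?(name, pattern)
  if pattern.is_a?(Regexp)
    # interpolating a Regexp keeps its options; \z anchors at the end
    name.match?(/#{pattern}\z/)
  else
    name.end_with?(pattern)
  end
end

names = ["0001.tif", "notes.txt", "readme.md", "scan_12.tif"]

tif_files = names.select { |name| name_matches?(name, /\d+\.tif/) }
txt_files = names.select { |name| name_ends_with?(name, "txt") }
```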

Copying and moving files/folders

Files/folders can be copied or moved inside the block provided to the file/folder matching methods:

# move selected files to destination_folder, which needs to be defined in the configuration YAML file.
files_ending_with "txt" do |file|
  move file, :to => destination_folder
end

# same thing but copy the file instead of moving it
files_ending_with "txt" do |file|
  copy file, :to => destination_folder
end

boatman will perform a checksum verification by default in order to catch errors in file transfers. This can be disabled with the disable_checksum_verification directive inside the file/folder matching block:

files_ending_with "txt" do |file|
  disable_checksum_verification

  move file, :to => destination_folder
end
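
For readers unfamiliar with checksum verification, the following plain-Ruby sketch shows the general idea: copy a file, then compare digests of the source and the copy. It is not boatman's code. copy_with_checksum is a hypothetical helper, and SHA-256 is an arbitrary choice, since this README does not say which algorithm boatman uses.

```ruby
require "digest/sha2"
require "fileutils"
require "tmpdir"

# Hypothetical helper, not boatman's API: copies a file into
# destination_dir, then raises if the digests of the two files differ.
def copy_with_checksum(source, destination_dir)
  destination = File.join(destination_dir, File.basename(source))
  FileUtils.cp(source, destination)

  source_sum = Digest::SHA256.file(source).hexdigest
  copied_sum = Digest::SHA256.file(destination).hexdigest
  raise IOError, "checksum mismatch copying #{source}" unless source_sum == copied_sum

  destination
end

copied_text = nil
Dir.mktmpdir do |dir|
  source_dir = File.join(dir, "txt_output")
  storage_dir = File.join(dir, "storage")
  FileUtils.mkdir_p([source_dir, storage_dir])

  source = File.join(source_dir, "demo.txt")
  File.write(source, "hello boatman\n")

  copied_text = File.read(copy_with_checksum(source, storage_dir))
end
```

Verification reads both files in full after the copy, which is the cost that disabling it avoids for large files.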

Files can optionally be renamed using the :rename parameter for move or copy:

files_ending_with "txt" do |file|
  # use the path method on the file
  new_name = "renamed_" + File.basename(file.path)

  # file will be renamed, e.g. test.txt becomes renamed_test.txt
  move file, :to => destination_folder, :rename => new_name
end

The file being copied/moved can also be modified by passing a block to the copy or move methods. The block receives two parameters: the path to read the original file from and the path to write the modified file to:

files_ending_with "txt" do |file|
  move file, :to => destination_folder do |old_file_name, new_file_name|
    # the block form of File.open flushes and closes the new file when done
    File.open(new_file_name, "w") do |new_file|
      # prefix every line of the original file with "# "
      File.foreach(old_file_name) do |line|
        new_file << "# #{line}"
      end
    end
  end
end
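
Because the transformation only deals with two paths, it can be tried on its own before wiring it into a task. The sketch below runs the same line-prefixing logic against temporary files in plain Ruby; comment_out is a hypothetical name, and nothing here calls boatman.

```ruby
require "tmpdir"

# Same shape as a block passed to move/copy: read from one path and
# write a modified copy to another.
comment_out = lambda do |old_file_name, new_file_name|
  File.open(new_file_name, "w") do |new_file|
    File.foreach(old_file_name) do |line|
      new_file << "# #{line}"
    end
  end
end

result = nil
Dir.mktmpdir do |dir|
  original = File.join(dir, "test.txt")
  modified = File.join(dir, "modified.txt")
  File.write(original, "first\nsecond\n")

  comment_out.call(original, modified)
  result = File.read(modified)
end
```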

Configuration Files

YAML configuration files for boatman consist of two parts, tasks and directories.

Under tasks you can specify any number of task files to be loaded and run together. Boatman takes care of running each task on the interval it specifies. However, tasks are run serially, so a long-running task will prevent subsequent tasks from running until it completes:

# config.yml
tasks:
- text_file_reformatter.rb
- raw_data_transfer.rb
directories:
...

Under directories you can name the directories you’d like to make available to your tasks. It is possible to specify both Windows- and POSIX-style paths:

# Windows-style
my_shared_folder: //mycomputer/myshare

# POSIX-style
my_shared_folder: /home/bmarzolf/share
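
To check that a configuration file has the two expected parts before running it, the YAML can be parsed with Ruby's standard library. This is a minimal sketch of the file's structure only, not a description of how boatman loads it; the config text is assembled from the fragments above.

```ruby
require "yaml"

# Example configuration assembled from the fragments above.
config_text = <<~CONFIG
  tasks:
  - text_file_reformatter.rb
  - raw_data_transfer.rb
  directories:
    my_shared_folder: /home/bmarzolf/share
CONFIG

config = YAML.safe_load(config_text)

# fetch raises KeyError if either top-level section is missing
task_files = config.fetch("tasks")
directories = config.fetch("directories")
```

Here task_files is an array of task script names and directories is a hash mapping each directory name to its path.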

Note on Patches/Pull Requests

  • Fork the project.

  • Make your feature addition or bug fix.

  • Add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Commit, do not mess with rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)

  • Send me a pull request. Bonus points for topic branches.

Copyright © 2009 Bruz Marzolf. See LICENSE for details.