🖼 GsImgFetcher
gs_img_fetcher is a tool that downloads images from remote hosts and saves them to your local storage.
Installation
Add this line to your application's Gemfile:
gem 'gs_img_fetcher'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install gs_img_fetcher
Features
Concurrency
gs_img_fetcher is designed with concurrency in mind. It can be configured to fetch images either asynchronously or synchronously.
By default, it runs asynchronously and the maximum number of threads depends on what your machine allows.
For a relatively small input, it is better to pass the --no-async option.
See the --async and --max_threads options.
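To illustrate what capping the thread count means, here is a minimal sketch of a bounded worker pool in plain Ruby. It is a generic pattern, not gs_img_fetcher's actual implementation; the method name each_concurrently and its max_threads keyword are hypothetical and only loosely mirror the --max_threads option.

```ruby
# Generic bounded worker pool (an illustration, not the gem's internals).
# `each_concurrently` and `max_threads` are hypothetical names.
# Note: items must not be nil or false, since those end the worker loop.
def each_concurrently(items, max_threads: 4, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # lets workers stop once the queue is drained

  results = Queue.new
  worker_count = [max_threads, items.size].min
  workers = Array.new(worker_count) do
    Thread.new do
      # Queue#pop returns nil once a closed queue is empty.
      while (item = queue.pop)
        results << block.call(item)
      end
    end
  end
  workers.each(&:join)

  # Results arrive in completion order, not input order.
  Array.new(results.size) { results.pop }
end

squares = each_concurrently([1, 2, 3, 4, 5], max_threads: 2) { |n| n * n }
puts squares.sort.inspect # prints [1, 4, 9, 16, 25]
```

With only a handful of items, starting and joining threads can cost more than it saves, which is in line with the advice to go synchronous for small inputs.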
File size limit
You can set a limit on the maximum size of each downloaded image to avoid downloading unexpectedly large files and filling up your storage.
By default, it runs without a limit. Use the --max_size option to set one.
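One common way to enforce such a limit is to read the response in chunks and stop as soon as the running total passes the cap. The sketch below shows that idea on any IO-like object. It is an assumption about the general technique, not a description of how gs_img_fetcher does it; copy_with_limit, SizeLimitExceeded and max_bytes are hypothetical names, and the unit accepted by --max_size is not specified here.

```ruby
require 'stringio'

# Hypothetical error class used only in this sketch.
class SizeLimitExceeded < StandardError; end

# Copies `input` to `output` chunk by chunk and aborts once more than
# `max_bytes` have been read. Returns the number of bytes written.
def copy_with_limit(input, output, max_bytes:, chunk_size: 16 * 1024)
  written = 0
  # IO#read(n) returns nil at end of stream.
  while (chunk = input.read(chunk_size))
    written += chunk.bytesize
    raise SizeLimitExceeded, "exceeded #{max_bytes} bytes" if written > max_bytes

    output.write(chunk)
  end
  written
end

sink = StringIO.new
puts copy_with_limit(StringIO.new('a' * 10), sink, max_bytes: 100) # prints 10
```

Checking the size while streaming means an oversized file is abandoned early instead of being downloaded in full and discarded afterwards.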
Usage
CLI
Let's say your current directory contains a text file named urls.txt with a list of image URLs, one per line.
$ cat urls.txt
http://example.com/image1.jpg
http://example.com/image1.png
http://example.com/image1.svg
$ gs_img_fetcher run urls.txt output --max_size=5
I, [2020-05-17T13:09:01.420214 #87392] INFO -- : Processing 3 URLs (3 valid, 0 invalid)
...
I, [2020-05-17T13:09:02.709097 #87392] INFO -- : Fetch complete (3 successful, 0 failed)
$ ls output
1e8256aa-5cb7-4545-9109-65aaa550deac.jpg 49d4f436-110f-4206-a2d6-07cc6156fc56.png a5b4ce07-1fc3-49e3-b558-44f8c4afaaab.svg
Running gs_img_fetcher run urls.txt output takes the URLs from urls.txt, downloads the images, and saves them in the directory output.
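The listing above suggests that each saved file gets a UUID-style name and keeps the extension from its URL. A minimal sketch of that naming scheme follows. It is inferred from the sample output only; random_filename_for is a hypothetical helper and not part of the gem's API.

```ruby
require 'securerandom'
require 'uri'

# Hypothetical helper: builds a random file name that keeps the
# extension found in the URL path (empty if the path has none).
def random_filename_for(url)
  extension = File.extname(URI.parse(url).path)
  "#{SecureRandom.uuid}#{extension}"
end

puts random_filename_for('http://example.com/image1.jpg')
```

Random names avoid collisions when several URLs end in the same file name, as image1.jpg, image1.png and image1.svg nearly do in the example.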
Run gs_img_fetcher --help to show the usage guide.
Set the environment variable NOLOG to a truthy value to suppress logs.
Hooking GsImgFetcher into your own application
Fetching a single image
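require 'gs_img_fetcher'

# Wrap the URL in an Entry and tell the Fetcher where to save the file.
fetcher = GsImgFetcher::Fetcher.new(
  GsImgFetcher::Entry.new('http://example.com/image.png'),
  'output'
)
fetcher.fetch       # download the image
fetcher.save        # write it to the output directory
fetcher.successful? # check whether the operation succeeded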
Fetching multiple images
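require 'gs_img_fetcher'

# Build the entry set from a file with one URL per line...
entry_set = GsImgFetcher::EntrySet.from_file('urls.txt')
# ...or from URLs you already have in memory.
urls = ['http://example.com/image.png', 'http://example.com/image2.png']
entries = urls.map { |url| GsImgFetcher::Entry.new(url) }
entry_set = GsImgFetcher::EntrySet.new(entries)

# async: false fetches the images synchronously.
manager = GsImgFetcher::Manager.new(entry_set, output_dir: 'output', async: false)
manager.setup.fetch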
Manager controls the entire process of handling the input and fetching and saving the images.
EntrySet is responsible for finding the input file and parsing, sanitizing and validating the list of URLs.
Fetcher is responsible for downloading and saving images.
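To make the parse, sanitize and validate step concrete, here is a small sketch that splits raw input into valid and invalid URLs, much like the "3 valid, 0 invalid" summary in the CLI log. The exact rules EntrySet applies are not documented here, so this version assumes that sanitizing means trimming whitespace and skipping blank lines, and that a valid URL is an http or https URL with a host. valid_image_url? and partition_urls are hypothetical names.

```ruby
require 'uri'

# Assumption: a URL is valid when it parses as http(s) and has a host.
# URI::HTTPS is a subclass of URI::HTTP, so one check covers both.
def valid_image_url?(line)
  uri = URI.parse(line)
  uri.is_a?(URI::HTTP) && !uri.host.nil? && !uri.host.empty?
rescue URI::InvalidURIError
  false
end

# Returns [valid, invalid] after trimming whitespace and dropping blank lines.
def partition_urls(text)
  lines = text.each_line.map(&:strip).reject(&:empty?)
  lines.partition { |line| valid_image_url?(line) }
end

input = <<~URLS
  http://example.com/image1.jpg

  ftp://example.com/image1.png
  not a url
URLS

valid, invalid = partition_urls(input)
puts "Processing #{valid.size + invalid.size} URLs (#{valid.size} valid, #{invalid.size} invalid)"
# prints: Processing 3 URLs (1 valid, 2 invalid)
```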
Development
After checking out the repo, run bin/setup to install dependencies. Then, run bundle exec rspec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
License
The gem is available as open source under the terms of the MIT License.