url_expander

The idea is simple, One expander to expand them all. I’ve been working on an analytical tool for twitter.com and i wanted a way to expand all the shortened urls. Url_expander is a good start, it covers (13,231 bit.ly domains + 28 different services). I tried to make it as simple as possible to extend & add new services.

Install

gem install url_expander

How to use

Download the config used to authenticate with some of the shortening services

curl -s https://raw.github.com/gist/1086114/c5c401e811effc3e99ed37ae9503f267877d8e68/url_expander_credentials.yml -o ~/url_expander_credentials.yml

Require and Use

require 'rubygems' # if ruby < 1.9
require 'url_expander'
UrlExpander::Client.expand('http://tinyurl.com/66sekq5')

Options

  • :nested_shortening – Nested shortening is turned on by default (:nested_shortening => true)

    UrlExpander::Client.expand("http://t.co/ZGEGdas")  => "http://www.blink182.com/upallnight/"
    UrlExpander::Client.expand("http://t.co/ZGEGdas", :nested_shortening => false)  => "http://bit.ly/oi029o"
    
  • :limit – If the nested_shortening is turned on, then :limit controls the max number of redirection allowed. Default value is 10

    UrlExpander::Client.expand('http://tinyurl.com/66sekq5', :limit => 4)
    
  • :config_file – Some services Require credentials to access the api (such as bitly). You need to provide YAML file with the access credentials. Default is “~/url_expander_credentials.yml”.

    UrlExpander::Client.expand('http://bit.ly/qpshuI', :config_file => '/Users/moski/url_expander_credentials.yml')
    

Supported Services:

  • bit.ly and bitly domains(13,231 domains as of July 19-2011)

    UrlExpander::Client.expand("http://bit.ly/qpshuI")
    UrlExpander::Client.expand("http://4sq.com/pQkuZk")
    UrlExpander::Client.expand("http://fxn.ws/pBewvL")
    UrlExpander::Client.expand("http://tcrn.ch/oe50JN")
    UrlExpander::Client.expand("http://nyti.ms/dzy2b7")
    ....
    

Adding a new service

url_expander has 3 main classes

  • UrlExpander::Expanders::Basic: For services that provides 301 redirect.

  • UrlExpander::Expanders::API: For services that provide expand api.

  • UrlExpander::Expanders::Scrape: No api, no 301 redirect, manual scraping.

Each service must subclass one of these classes.

Subclass UrlExpander::Expanders::Basic

class Tinyurl < UrlExpander::Expanders::Basic
  # A pattern is used by the basic class to match the url and extract the 
  # the shortning key.
  PATTERN = %r'(http://tinyurl\.com(/[\w/]+))'

  # Reference to the class used by the base class to access Request class within
  attr_reader :parent_klass

  # initialize function to do custom initialization
  # make sure to set the @parent_kass and call super
  # Calling super with will extract the key and calls fetch_url
  def initialize(short_url="", options={})
    @parent_klass = self
    super(short_url, options)
  end

  # Request class will include httpartty and set the base_uri for that
  # service
  class Request
    include HTTParty
    base_uri 'http://tinyurl.com'
  end
end

All you have to do now is register you class. Simple enough :)

module UrlExpander
  module Expanders
    autoload :Tinyurl, 'basic/tinyurl' 
  end
end

Subclass UrlExpander::Expanders::API

If the service provides an api why not use it. UrlExpander::Expanders::API gives you a little more flexibly that the basic one but requires more code.

class Googl < UrlExpander::Expanders::API
  # A pattern is used by the basic class to match the url and extract the the shortening key.
  # NOTICE: We ignored the / before the key
  # http://goo.gl/DRppM => 'DRppM' without /
  PATTERN = %r'(http://goo\.gl/([\w/]+))'

  # Reference to the class used by the base class to access Request class within  
  attr_reader :parent_klass

  # initialize function to do custom initialization
  # make sure to set the @parent_kass and call super
  # calling super will only exract the shortening for the url using the pattern defined above.
  # You need to manually call fetch_url.
  def initialize(short_url, options={})
    @parent_klass = self
    super(short_url,options)
    fetch_url
  end

  # Request class will include httpartty and set the base_uri for that service.
  class Request
    include HTTParty
    base_uri 'https://www.googleapis.com'
    headers 'Content-Type' => 'application/json'
    headers "Content-length" => "0"
  end

  private

  # A custom fetcher function expand the url using the api.  
  def fetch_url
    response = Request.get("/urlshortener/v1/url?shortUrl=http://goo.gl/#{@shortner_key}")
    if response.code == 200
      @long_url  = response['longUrl']
    else
      error = (JSON.parse response.body)['error']
      raise UrlExpander::Error.new(error['message'],response.code)
    end
  end
end

Thats about it, don’t forget to register your class.

Contributing to url_expander

  • Check out the latest master to make sure the feature hasn’t been implemented or the bug hasn’t been fixed yet

  • Check out the issue tracker to make sure someone already hasn’t requested it and/or contributed it

  • Fork the project

  • Start a feature/bugfix branch

  • Commit and push until you are happy with your contribution

  • Make sure to add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.

Copyright © 2011 Moski. See LICENSE.txt for further details.