Class: Cartographer

Inherits:
Object
Defined in:
lib/cartographer.rb

Overview

Cartographer is a sitemap manipulation library.

sm = Cartographer.new("http://example.org")
sm.add(Cartographer::URL.new(:location => URI.parse("http://example.org/ethereal")))
sm.add_tree("/some/doc/root") #=> adds all the files in the docroot
puts sm.to_xml #=> outputs your sitemap.

# filtering and remapping
sm = Cartographer.new("http://example.org")
# pass a block to receive each path found and return a rewritten path.
# return nil from the block to leave that path out.
sm.add_tree("/some/doc/root") { |x| x.sub(/\.html$/, '/') }
File.open('sitemap.xml', 'w') { |f| f << sm.to_xml }

See Cartographer and Cartographer::URL RDoc for more information.

Defined Under Namespace

Classes: URL

Constant Summary

VERSION = '1.2.0'
XML_TEMPLATE = File.expand_path("xml/sitemap.haml", File.dirname(File.expand_path(__FILE__)))

Constructor Details

#initialize(root_uri, default_changefreq = nil) ⇒ Cartographer

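Creates a sitemap rooted at root_uri.

root_uri may be a URI::HTTP, or any object whose to_s yields a string that URI.parse accepts.

default_changefreq is optional. It is a string naming a sitemap change frequency (such as "daily" or "weekly"), and #add_tree applies it to every URL it creates.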



# File 'lib/cartographer.rb', line 50

def initialize(root_uri, default_changefreq=nil)
  @urls = []
  @root_uri = root_uri.kind_of?(URI::HTTP) ? root_uri : URI.parse(root_uri.to_s)
  @default_changefreq = default_changefreq
end
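
As a rough illustration of the normalization step in the constructor above, the sketch below applies the same expression outside the class. normalize_root is a hypothetical helper name used only for this example; it is not part of Cartographer.

```ruby
require 'uri'

# Hypothetical helper: same expression the constructor uses for @root_uri.
def normalize_root(root_uri)
  root_uri.kind_of?(URI::HTTP) ? root_uri : URI.parse(root_uri.to_s)
end

# A string with an http or https scheme is parsed into a URI::HTTP
# (URI::HTTPS is a subclass of URI::HTTP).
from_string = normalize_root("http://example.org")

# An existing URI::HTTP is returned as-is, not copied.
existing = URI.parse("https://example.org/")
from_uri = normalize_root(existing)

# Note: a string without a scheme still parses, but as a URI::Generic,
# so this expression performs no validation on its own.
scheme_less = normalize_root("example.org")
```

Since nothing here rejects a scheme-less string, callers may want to pass a full "http://..." or "https://..." URL.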

Instance Attribute Details

#default_changefreq ⇒ Object

The default change frequency for all URLs.



# File 'lib/cartographer.rb', line 39

def default_changefreq
  @default_changefreq
end

#root_uri ⇒ Object (readonly)

Your site’s root URI, as parsed by the constructor.



# File 'lib/cartographer.rb', line 35

def root_uri
  @root_uri
end

#urls ⇒ Object (readonly)

The list of Cartographer::URL objects added so far.



# File 'lib/cartographer.rb', line 31

def urls
  @urls
end

Instance Method Details

#add(url) ⇒ Object

Adds a Cartographer::URL to the list of URLs for processing.

Raises:

  • (ArgumentError)


# File 'lib/cartographer.rb', line 60

def add(url)
  raise ArgumentError, "Please pass Cartographer::URL objects to this method" unless url.kind_of?(URL)
  @urls.push(url)
end

#add_tree(base_path) ⇒ Object

Given a path, walks it and turns every file found into a Cartographer::URL object. Directories are traversed but do not produce URLs of their own.

If given a block, each path (relative to base_path) is yielded to it as the first argument, and the block's return value is used as the URL path, so paths can be rewritten. If the block returns nil, that path is left out. As the source below shows, directories are skipped before the block is called, so the block only ever receives file paths.

See the Overview or README.txt for examples.



# File 'lib/cartographer.rb', line 77

def add_tree(base_path)
  Find.find(base_path) do |path|
    next if File.directory?(path)

    new_path = path.sub %r!^#{Regexp.escape base_path}!, ''
    uri = root_uri.dup

    stat = File.stat(path)

    if block_given?
      new_path = yield new_path
      unless new_path
        Find.prune
      end
    end

    uri.path = Cartographer.escape_path(new_path)
    add URL.new(:location => uri, :lastmod => stat.mtime, :changefreq => @default_changefreq)
  end
end
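
To make the block contract concrete, here is a minimal sketch of the same walk using only the standard library. rewritten_paths and demo_rewritten_paths are hypothetical names; the sketch returns plain path strings rather than Cartographer::URL objects, and it omits the Cartographer.escape_path step because that method's behavior is not shown here.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: walks base_path like add_tree does, but collects
# the (optionally rewritten) relative paths instead of building URLs.
def rewritten_paths(base_path)
  results = []
  Find.find(base_path) do |path|
    next if File.directory?(path)            # only files are reported
    relative = path.sub(/\A#{Regexp.escape(base_path)}/, '')
    relative = yield(relative) if block_given?
    next unless relative                     # nil from the block omits the path
    results << relative
  end
  results.sort
end

# Builds a throwaway document root and keeps only the .html files,
# rewriting "/page.html" to "/page/".
def demo_rewritten_paths
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, 'docs'))
    FileUtils.touch(File.join(root, 'index.html'))
    FileUtils.touch(File.join(root, 'docs', 'guide.html'))
    FileUtils.touch(File.join(root, 'docs', 'notes.txt'))

    rewritten_paths(root) do |p|
      next nil unless p.end_with?('.html')
      p.sub(/\.html\z/, '/')
    end
  end
end

p demo_rewritten_paths
```

In this sketch notes.txt is dropped because the block returns nil for it, while the two HTML files come back with their extension replaced by a trailing slash.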

#from_xml(xml) ⇒ Object

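Given a string containing sitemap.xml content, parses it and adds each item to the URL list. A missing or unparsable lastmod falls back to the current time; a missing changefreq or priority becomes nil.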



# File 'lib/cartographer.rb', line 114

def from_xml(xml)
  xml = Nokogiri::XML.parse(xml)
  xml.root.xpath('xmlns:url').each do |url|
    add Cartographer::URL.new(
      :location   => URI.parse(url.xpath('xmlns:loc').first.content),
      :lastmod    => (Time.parse(url.xpath('xmlns:lastmod').first.content) rescue Time.now),
      :changefreq => (url.xpath('xmlns:changefreq').first.content rescue nil),
      :priority   => (url.xpath('xmlns:priority').first.content rescue nil)
    )
  end
end
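
The lastmod fallback above relies on the rescue modifier. The sketch below isolates that idea with an explicit rescue clause; lastmod_or is a hypothetical helper, and it takes the fallback as a parameter (from_xml itself uses Time.now) so the behavior is easy to check.

```ruby
require 'time'

# Hypothetical helper: parse a lastmod string, or return the fallback
# when the text is missing (nil) or cannot be parsed as a time.
def lastmod_or(text, fallback)
  Time.parse(text)
rescue ArgumentError, TypeError
  fallback
end

fallback = Time.at(0)
parsed   = lastmod_or("2011-05-01T12:00:00Z", fallback)  # a real timestamp
empty    = lastmod_or("", fallback)                      # unparsable text
missing  = lastmod_or(nil, fallback)                     # no text at all
```

Naming the exception classes is a deliberate difference from the one-line rescue modifier, which swallows every StandardError, including a NoMethodError from calling content on a missing node.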

#to_xml ⇒ Object

Renders the current Cartographer::URL list in the sitemap XML format.

Be aware that this does not rewrite URL locations to match root_uri, so the output may mix URLs from different roots unless you rewrite them yourself.



# File 'lib/cartographer.rb', line 105

def to_xml
  Haml::Engine.new(File.read(XML_TEMPLATE)).render(self)
end
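
One reading of the warning above is that locations added via #add or #from_xml may point at a different host than root_uri. Under that assumption, a check along these lines could flag them before rendering. foreign_locations is a hypothetical helper that works on plain URI objects; it is not part of Cartographer.

```ruby
require 'uri'

# Hypothetical helper: return the locations whose scheme or host differ
# from the sitemap root.
def foreign_locations(root_uri, locations)
  root = URI.parse(root_uri.to_s)
  locations.reject do |loc|
    loc.scheme == root.scheme && loc.host == root.host
  end
end

locations = [
  URI.parse("http://example.org/ethereal"),
  URI.parse("http://other.example/page")
]

foreign_locations("http://example.org", locations).each do |loc|
  warn "location outside the sitemap root: #{loc}"
end
```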

#uniq! ⇒ Object

Removes duplicate entries from the URL list. This operation is destructive: it modifies the list in place.



# File 'lib/cartographer.rb', line 130

def uniq!
  urls.uniq!
end
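
Because this method returns whatever Array#uniq! returns, its return value follows Ruby's usual convention: the array when duplicates were removed, nil when there was nothing to remove. The sketch below shows that with plain strings standing in for Cartographer::URL objects; uniq_report is a hypothetical helper. Whether two Cartographer::URL objects count as duplicates depends on how that class defines equality, which is not shown here.

```ruby
# Hypothetical helper: runs uniq! and reports both its return value
# and the state of the list afterwards.
def uniq_report(list)
  returned = list.uniq!
  [returned, list]
end

paths = %w[/a /b /a]

first  = uniq_report(paths)  # duplicates removed: return value is the array
second = uniq_report(paths)  # already unique: return value is nil

p first   # → [["/a", "/b"], ["/a", "/b"]]
p second  # → [nil, ["/a", "/b"]]
```

In practice this means the return value should not be used to read the list afterwards; call #urls for that.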