Class: Gort::RobotsTxt

Inherits:
Object
Defined in:
lib/gort/robots_txt.rb

Overview

Represents a robots.txt file.

Instance Attribute Summary

Formatting Methods

Instance Method Summary

Constructor Details

#initialize(rules) ⇒ RobotsTxt

Returns a new instance of RobotsTxt.



# File 'lib/gort/robots_txt.rb', line 9

def initialize(rules)
  @rules = rules
end

Instance Attribute Details

#rules ⇒ Array<Rule, Group, InvalidLine> (readonly)

Returns:

  • (Array<Rule, Group, InvalidLine>)



# File 'lib/gort/robots_txt.rb', line 14

def rules
  @rules
end

Instance Method Details

#allow?(user_agent, path_and_query) ⇒ Boolean

Is this path allowed for the given user agent?

Parameters:

  • user_agent (String)
  • path_and_query (String)

Returns:

  • (Boolean)

See Also:



# File 'lib/gort/robots_txt.rb', line 23

def allow?(user_agent, path_and_query)
  return true if path_and_query == ROBOTS_TXT_PATH

  top_match =
    matches(user_agent, path_and_query)
    .compact
    # This is an arcane bit.
    # The rules are reverse sorted by match length (i.e. longest first),
    # and then by class name using the fact that allow goes before disallow.
    # This is the rule precedence order defined in the RFC.
    .min_by { |(match_length, rule)| [-match_length, rule.class.name] }

  # Allow if there is no match or the top match is an allow rule.
  top_match.nil? || top_match.last.is_a?(AllowRule)
end
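The precedence logic above can be tried in isolation. The following is a minimal sketch, not the library's own code: `AllowRule` and `DisallowRule` here are hypothetical stand-in structs, and the `top_match` and `allowed?` helpers are invented for illustration. It assumes, as the `.compact` call suggests, that each candidate is either `nil` or a `[match_length, rule]` pair.

```ruby
# Hypothetical stand-ins for the library's rule classes. Only the class
# names matter here, because ties are broken by comparing them.
AllowRule = Struct.new(:pattern)
DisallowRule = Struct.new(:pattern)

# Pick the winning [match_length, rule] pair: longest match first, and
# on equal length the class whose name sorts first ("AllowRule").
def top_match(candidates)
  candidates.compact.min_by { |(match_length, rule)| [-match_length, rule.class.name] }
end

# Allowed when nothing matched or when the winning rule is an allow rule.
def allowed?(candidates)
  top = top_match(candidates)
  top.nil? || top.last.is_a?(AllowRule)
end

# Longest match wins: the 9-character disallow beats the 1-character allow.
puts allowed?([[1, AllowRule.new("/")], [9, DisallowRule.new("/private/")]])

# Equal lengths: "AllowRule" sorts before "DisallowRule", so allow wins.
puts allowed?([[5, DisallowRule.new("/page")], [5, AllowRule.new("/page")]])

# No matching rule at all means the path is allowed.
puts allowed?([nil, nil])
```

Negating the match length lets a single `min_by` express "longest first", and the class name acts as the tie-breaker because arrays compare element by element.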

#disallow?(user_agent, path_and_query) ⇒ Boolean

Is this path disallowed for the given user agent?

Parameters:

  • user_agent (String)
  • path_and_query (String)

Returns:

  • (Boolean)

See Also:



# File 'lib/gort/robots_txt.rb', line 46

def disallow?(user_agent, path_and_query)
  !allow?(user_agent, path_and_query)
end

#inspect ⇒ String

A human-readable representation of the robots.txt.

Returns:

  • (String)


# File 'lib/gort/robots_txt.rb', line 57

def inspect
  "#<#{self.class.name}:#{object_id} #{rules.inspect}>"
end

#pretty_print(pp) ⇒ void

This method returns an undefined value.

Produces a pretty-printed, human-readable representation of the robots.txt.

Parameters:

  • pp (PrettyPrint)

    pretty printer



# File 'lib/gort/robots_txt.rb', line 68

def pretty_print(pp)
  pp.text("#{self.class.name}/#{object_id}")
  pp.group(1, "[", "]") do
    pp.breakable("")
    pp.seplist(rules) do |rule|
      pp.pp(rule)
    end
    pp.breakable("")
  end
end
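To see how this hook interacts with Ruby's standard `pp` library, here is a minimal sketch. `RuleList` is a hypothetical class used only for illustration (it is not part of Gort), and plain strings stand in for rule objects. Its `pretty_print` follows the same text, group, and seplist structure as the method above.

```ruby
require "pp"

# Hypothetical container used only for illustration; not part of Gort.
class RuleList
  attr_reader :rules

  def initialize(rules)
    @rules = rules
  end

  # A header, then a bracketed group whose items are separated by
  # commas, with line breaks inserted only if the group does not fit.
  def pretty_print(pp)
    pp.text("#{self.class.name}/#{object_id}")
    pp.group(1, "[", "]") do
      pp.breakable("")
      pp.seplist(rules) { |rule| pp.pp(rule) }
      pp.breakable("")
    end
  end
end

list = RuleList.new(["allow /", "disallow /private/"])

# PP.pp appends to the given output object and returns it; a mutable
# String is used here so the result can be inspected.
output = PP.pp(list, +"", 40)
puts output
```

Because `pp` calls `pretty_print` on any object that defines it, nested rule objects with their own `pretty_print` would be rendered through the `pp.pp(rule)` call in the same way.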