Class: CMSScanner::Target

Inherits: WebSite

Object → WebSite → CMSScanner::Target

Includes: Server::Generic

Defined in: lib/cms_scanner/target.rb,
lib/cms_scanner/target/scope.rb,
lib/cms_scanner/target/hashes.rb,
lib/cms_scanner/target/server/iis.rb,
lib/cms_scanner/target/platform/php.rb,
lib/cms_scanner/target/server/nginx.rb,
lib/cms_scanner/target/server/apache.rb,
lib/cms_scanner/target/server/generic.rb

Overview

Scope system logic

Defined Under Namespace

Modules: Platform, Server
Classes: Scope

Instance Attribute Summary

Attributes inherited from WebSite

#homepage_res, #opts, #uri

Class Method Summary

.page_hash(page) ⇒ String

The md5sum of the page.

Instance Method Summary

#comments_from_page(pattern, page = nil) {|MatchData, Nokogiri::XML::Comment| ... } ⇒ Array<Array<MatchData, Nokogiri::XML::Comment>>
#error_404_hash ⇒ String

The hash of a 404.
#homepage_hash ⇒ String

The hash of the homepage.
#homepage_or_404?(page) ⇒ Boolean

Whether or not the page is the homepage or a 404, based on its md5sum.
#in_scope?(url_or_uri) ⇒ Boolean

True if the url given is in scope.
#in_scope_uris(res, xpath = '//@href|//@src|//@data-src') {|Addressable::URI, Nokogiri::XML::Element| ... } ⇒ Array<Addressable::URI>

The in scope absolute URIs detected in the response’s body.
#initialize(url, opts = {}) ⇒ Target constructor

A new instance of Target.
#interesting_findings(opts = {}) ⇒ Findings
#javascripts_from_page(pattern, page = nil) {|MatchData, Nokogiri::XML::Element| ... } ⇒ Array<Array<MatchData, Nokogiri::XML::Element>>
#scope ⇒ Array<PublicSuffix::Domain, String>
#scope_url_pattern ⇒ Regexp

Similar to Target#url_pattern but considering the in scope domains as well.
#uris_from_page(page = nil, xpath = '//@href|//@src|//@data-src') {|Addressable::URI, Nokogiri::XML::Element| ... } ⇒ Array<Addressable::URI>

The absolute URIs detected in the response’s body from the HTML tags.
#url_pattern ⇒ Regexp

The pattern related to the target url, also matches escaped /, such as in JSON JS data: http:\/\/t.com\/.
#vulnerable? ⇒ Boolean

Whether or not vulnerabilities have been found.
#xpath_pattern_from_page(xpath, pattern, page = nil) {|MatchData, Nokogiri::XML::Element| ... } ⇒ Array<Array<MatchData, Nokogiri::XML::Element>>

Methods included from Server::Generic

#directory_listing?, #directory_listing_entries, #headers, #server

Methods inherited from WebSite

#access_forbidden?, #error_404_res, #error_404_url, #head_and_get, #head_or_get_params, #homepage_url, #http_auth?, #ip, #online?, #proxy_auth?, #redirection, #url, #url=

Constructor Details

#initialize(url, opts = {}) ⇒ `Target`

Returns a new instance of Target.

Parameters:

url (String)
opts (Hash) (defaults to: {})

Options Hash (opts):

:scope (Array<PublicSuffix::Domain, String>)

# File 'lib/cms_scanner/target.rb', line 17

def initialize(url, opts = {})
  super(url, opts)

  scope << uri.host
  Array(opts[:scope]).each { |s| scope << s }
end

Class Method Details

.page_hash(page) ⇒ `String`

Note:

Comments and script tags are removed before hashing, so that cache generation details do not change the result

Returns the md5sum of the page.

Parameters:

page (Typhoeus::Response, String)

Returns:

(String) —

The md5sum of the page

# File 'lib/cms_scanner/target/hashes.rb', line 11

def self.page_hash(page)
  page = NS::Browser.get(page, followlocation: true) unless page.is_a?(Typhoeus::Response)

  # Removes comments and script tags before computing the hash
  # to remove any potential cached stuff
  html = Nokogiri::HTML(page.body)
  html.xpath('//script|//comment()').each(&:remove)

  Digest::MD5.hexdigest(html)
end
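
The normalisation step above can be illustrated without any HTTP or Nokogiri dependency. The following is a minimal sketch, not the library's implementation: `normalized_page_hash` is a hypothetical helper, and it strips comments and script elements with regular expressions as a stand-in for the XPath removal shown above.

```ruby
require 'digest'

# Hypothetical stand-in for Target.page_hash: removes HTML comments and
# <script> elements with regular expressions, then hashes what is left.
def normalized_page_hash(body)
  stripped = body.gsub(/<!--.*?-->/m, '')
  stripped = stripped.gsub(%r{<script\b[^>]*>.*?</script>}mi, '')

  Digest::MD5.hexdigest(stripped)
end

first  = '<html><!-- cached at 10:00 --><body>Hello</body></html>'
second = '<html><!-- cached at 10:05 --><body>Hello</body></html>'
other  = '<html><body>Goodbye</body></html>'

# Only the comment differs between the first two pages, so their hashes agree.
puts normalized_page_hash(first) == normalized_page_hash(second)
# The visible content differs here, so the hashes do not agree.
puts normalized_page_hash(first) == normalized_page_hash(other)
```

Without the stripping step, a cache plugin that writes a timestamp into a comment would give every response a different hash.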

Instance Method Details

#comments_from_page(pattern, page = nil) {|MatchData, Nokogiri::XML::Comment| ... } ⇒ `Array<Array<MatchData, Nokogiri::XML::Comment>>`

Parameters:

pattern (Regexp)
page (Typhoeus::Response, String) (defaults to: nil)

Yields:

(MatchData, Nokogiri::XML::Comment)

Returns:

(Array<Array<MatchData, Nokogiri::XML::Comment>>)

# File 'lib/cms_scanner/target.rb', line 72

def comments_from_page(pattern, page = nil)
  xpath_pattern_from_page('//comment()', pattern, page) do |match, node|
    yield match, node if block_given?
  end
end

#error_404_hash ⇒ `String`

Note:

This is used to detect custom 404 pages that respond with a 200 status

Returns the hash of a 404.

Returns:

(String) —

The hash of a 404



# File 'lib/cms_scanner/target/hashes.rb', line 29

def error_404_hash
  @error_404_hash ||= self.class.page_hash(error_404_res)
end

#homepage_hash ⇒ `String`

Returns the hash of the homepage.

Returns:

(String) —

The hash of the homepage



# File 'lib/cms_scanner/target/hashes.rb', line 23

def homepage_hash
  @homepage_hash ||= self.class.page_hash(url)
end

#homepage_or_404?(page) ⇒ `Boolean`

Returns whether or not the page is the homepage or a 404, based on its md5sum.

Parameters:

page (Typhoeus::Response, String)

Returns:

(Boolean) —

Whether or not the page is the homepage or a 404, based on its md5sum



# File 'lib/cms_scanner/target/hashes.rb', line 35

def homepage_or_404?(page)
  homepage_and_404_hashes.include?(self.class.page_hash(page))
end
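
The three hashing methods work together to spot "soft" 404s, that is, missing pages served with a 200 status. The sketch below is a self-contained miniature of that idea. `HashedSite` is a hypothetical class, bodies are passed in directly rather than fetched, and it assumes that `homepage_and_404_hashes` (not shown on this page) returns both reference hashes.

```ruby
require 'digest'

# Hypothetical miniature of the hashing helpers. Bodies are supplied
# directly instead of being requested over HTTP.
class HashedSite
  def initialize(homepage_body, error_404_body)
    @homepage_body  = homepage_body
    @error_404_body = error_404_body
  end

  # Memoised, as in the methods documented above.
  def homepage_hash
    @homepage_hash ||= Digest::MD5.hexdigest(@homepage_body)
  end

  def error_404_hash
    @error_404_hash ||= Digest::MD5.hexdigest(@error_404_body)
  end

  # True when the body hashes to either reference page.
  def homepage_or_404?(body)
    [homepage_hash, error_404_hash].include?(Digest::MD5.hexdigest(body))
  end
end

site = HashedSite.new('<h1>Welcome</h1>', '<h1>Nothing here</h1>')

# A missing file that the server answers with its custom 404 body.
puts site.homepage_or_404?('<h1>Nothing here</h1>')
# A response with genuinely different content.
puts site.homepage_or_404?('<h1>readme.txt contents</h1>')
```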

#in_scope?(url_or_uri) ⇒ `Boolean`

Returns true if the url given is in scope.

Parameters:

url_or_uri (String, Addressable::URI) —

An absolute URL or URI

Returns:

(Boolean) —

true if the url given is in scope

# File 'lib/cms_scanner/target/scope.rb', line 14

def in_scope?(url_or_uri)
  url_or_uri = Addressable::URI.parse(url_or_uri.strip) unless url_or_uri.is_a?(Addressable::URI)

  scope.include?(url_or_uri.host)
rescue StandardError
  false
end
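
To show how the constructor and `#in_scope?` fit together, here is a minimal sketch using only the standard library. `MiniTarget` is a hypothetical class: it uses `URI` instead of `Addressable::URI` and a plain `Array` of hosts instead of the `Scope` class, so any extra matching that `Scope` may perform is not reproduced.

```ruby
require 'uri'

# Hypothetical miniature of Target: the scope is a plain Array of hosts.
class MiniTarget
  attr_reader :uri

  def initialize(url, opts = {})
    @uri = URI.parse(url)

    # The target's own host is always in scope, followed by any extras.
    scope << @uri.host
    Array(opts[:scope]).each { |s| scope << s }
  end

  def scope
    @scope ||= []
  end

  def in_scope?(url_or_uri)
    url_or_uri = URI.parse(url_or_uri.strip) unless url_or_uri.is_a?(URI::Generic)

    scope.include?(url_or_uri.host)
  rescue StandardError
    # Malformed URLs, nil and similar inputs are treated as out of scope.
    false
  end
end

target = MiniTarget.new('http://t.com/', scope: ['cdn.t.com'])

puts target.in_scope?('https://cdn.t.com/app.js')
puts target.in_scope?('http://evil.org/')
puts target.in_scope?('http://bad host/')
```

The method-level `rescue` means callers never need to validate a URL before asking whether it is in scope.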

#in_scope_uris(res, xpath = '//@href|//@src|//@data-src') {|Addressable::URI, Nokogiri::XML::Element| ... } ⇒ `Array<Addressable::URI>`

Note:

It is highly recommended to use the xpath parameter to focus on the URIs needed, as this method can be quite time consuming when there are a lot of URIs to check

Returns the in scope absolute URIs detected in the response’s body.

Parameters:

res (Typhoeus::Response)
xpath (String) (defaults to: '//@href|//@src|//@data-src')

Yields:

(Addressable::URI, Nokogiri::XML::Element) —

The in scope url and its associated tag

Returns:

(Array<Addressable::URI>) —

The in scope absolute URIs detected in the response’s body

# File 'lib/cms_scanner/target/scope.rb', line 31

def in_scope_uris(res, xpath = '//@href|//@src|//@data-src')
  found = []

  uris_from_page(res, xpath) do |uri, tag|
    next unless in_scope?(uri)

    yield uri, tag if block_given?

    found << uri
  end

  found
end

#interesting_findings(opts = {}) ⇒ `Findings`

Parameters:

opts (Hash) (defaults to: {})

Returns:

(Findings)



# File 'lib/cms_scanner/target.rb', line 27

def interesting_findings(opts = {})
  @interesting_findings ||= NS::Finders::InterestingFindings::Base.find(self, opts)
end

#javascripts_from_page(pattern, page = nil) {|MatchData, Nokogiri::XML::Element| ... } ⇒ `Array<Array<MatchData, Nokogiri::XML::Element>>`

Parameters:

pattern (Regexp)
page (Typhoeus::Response, String) (defaults to: nil)

Yields:

(MatchData, Nokogiri::XML::Element)

Returns:

(Array<Array<MatchData, Nokogiri::XML::Element>>)

# File 'lib/cms_scanner/target.rb', line 83

def javascripts_from_page(pattern, page = nil)
  xpath_pattern_from_page('//script', pattern, page) do |match, node|
    yield match, node if block_given?
  end
end

#scope ⇒ `Array<PublicSuffix::Domain, String>`

Returns:

(Array<PublicSuffix::Domain, String>)



# File 'lib/cms_scanner/target/scope.rb', line 7

def scope
  @scope ||= Scope.new
end

#scope_url_pattern ⇒ `Regexp`

Similar to Target#url_pattern, but also takes the in scope domains into account

Returns:

(Regexp) —

The pattern related to the target url and in scope domains, it also matches escaped /, such as in JSON JS data: http:\/\/t.com\/

# File 'lib/cms_scanner/target/scope.rb', line 50

def scope_url_pattern
  return @scope_url_pattern if @scope_url_pattern

  domains = [uri.host + uri.path]

  domains += if scope.domains.empty?
               Array(scope.invalid_domains[1..-1])
             else
               Array(scope.domains[1..-1]).map(&:to_s) + scope.invalid_domains
             end

  domains.map! { |d| Regexp.escape(d.delete_suffix('/')).gsub('\*', '.*').gsub('/', '\\\\\?/') }

  domains[0].gsub!(Regexp.escape(uri.host), "#{Regexp.escape(uri.host)}(?::\\d+)?") if uri.port

  @scope_url_pattern = %r{https?:\\?/\\?/(?:#{domains.join('|')})\\?/?}i
end
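
The domain handling above is dense, so here is a sketch of the same transformation on a fixed list. The `domains` values are hypothetical, and the port handling from the method is left out for brevity.

```ruby
# Hypothetical inputs: the target's host + path first, then an extra
# scope entry that uses a * wildcard.
domains = ['t.com/blog/', '*.cdn.t.com']

escaped = domains.map do |d|
  d = Regexp.escape(d.delete_suffix('/')) # escape dots, drop the trailing /
  d = d.gsub('\*', '.*')                  # turn the escaped wildcard into .*
  d.gsub('/', '\\\\\?/')                  # allow an optional backslash before /
end

pattern = %r{https?:\\?/\\?/(?:#{escaped.join('|')})\\?/?}i

[
  'https://t.com/blog/post',
  'http:\/\/t.com\/blog\/',
  'https://img.cdn.t.com/a.png',
  'http://evil.org/'
].each { |s| puts "#{s} => #{pattern.match?(s)}" }
```

Note that the wildcard becomes `.*`, which matches any run of characters, so a broad scope entry produces a correspondingly permissive pattern.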

#uris_from_page(page = nil, xpath = '//@href|//@src|//@data-src') {|Addressable::URI, Nokogiri::XML::Element| ... } ⇒ `Array<Addressable::URI>`

Note:

It is highly recommended to use the xpath parameter to focus on the URIs needed, as this method can be quite time consuming when there are a lot of URIs to check

Returns the absolute URIs detected in the response’s body from the HTML tags.

Parameters:

page (Typhoeus::Response, String) (defaults to: nil)
xpath (String) (defaults to: '//@href|//@src|//@data-src')

Yields:

(Addressable::URI, Nokogiri::XML::Element) —

The url and its associated tag

Returns:

(Array<Addressable::URI>) —

The absolute URIs detected in the response’s body from the HTML tags

# File 'lib/cms_scanner/target.rb', line 98

def uris_from_page(page = nil, xpath = '//@href|//@src|//@data-src')
  page    = NS::Browser.get(url(page)) unless page.is_a?(Typhoeus::Response)
  found   = []

  page.html.xpath(xpath).each do |node|
    attr_value = node.text.to_s

    next unless attr_value && !attr_value.empty?

    node_uri = begin
      uri.join(attr_value.strip)
    rescue StandardError
      # Skip potential malformed URLs etc.
      next
    end

    next unless node_uri.host

    yield node_uri, node.parent if block_given? && !found.include?(node_uri)

    found << node_uri
  end

  found.uniq
end
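
The core of the loop above is resolving each attribute value against the target URI, then discarding anything malformed or without a host. A minimal sketch of that step follows. It assumes the attribute values have already been extracted, uses the standard library's `URI` rather than `Addressable::URI`, and `absolute_uris` is a hypothetical name.

```ruby
require 'uri'

# Hypothetical helper: resolves raw attribute values against a base URL and
# returns the unique absolute URIs that have a host.
def absolute_uris(base_url, attr_values)
  found = []

  attr_values.each do |value|
    value = value.to_s.strip
    next if value.empty?

    node_uri = begin
      URI.join(base_url, value)
    rescue StandardError
      # Skip malformed values, as the method above does.
      next
    end

    # mailto: links and similar have no host and are ignored.
    next unless node_uri.host

    yield node_uri if block_given? && !found.include?(node_uri)

    found << node_uri
  end

  found.uniq(&:to_s)
end

values = ['img/a.png', '/js/app.js', ' /js/app.js ', '', 'http://bad host/',
          'mailto:a@b.c', '//cdn.t.com/x.css']

absolute_uris('http://t.com/blog/', values) { |uri| puts uri }
```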

#url_pattern ⇒ `Regexp`

Returns the pattern related to the target url, also matches escaped /, such as in JSON JS data: http:\/\/t.com\/.

Returns:

(Regexp) —

The pattern related to the target url, also matches escaped /, such as in JSON JS data: http:\/\/t.com\/



# File 'lib/cms_scanner/target.rb', line 42

def url_pattern
  @url_pattern ||= Regexp.new(Regexp.escape(url).gsub(/https?/i, 'https?').gsub('/', '\\\\\?/'), Regexp::IGNORECASE)
end
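
The expression above can be tried in isolation. In this sketch `url` is a hypothetical target URL; the rest mirrors the one-liner in the method body.

```ruby
url = 'http://t.com/' # hypothetical target URL

url_pattern = Regexp.new(
  Regexp.escape(url).gsub(/https?/i, 'https?').gsub('/', '\\\\\?/'),
  Regexp::IGNORECASE
)

samples = [
  'https://t.com/wp-content/a.js', # other scheme, plain slashes
  '{"home":"http:\/\/t.com\/"}',   # JSON-style escaped slashes
  'http://other.com/'              # different host
]

samples.each { |s| puts "#{s} => #{url_pattern.match?(s)}" }
```

The first `gsub` makes the scheme match both `http` and `https`; the second lets every `/` be preceded by an optional backslash.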

#vulnerable? ⇒ `Boolean`

Whether or not vulnerabilities have been found. Used to set the exit code of the scanner; it should be overridden in the implementation

Returns:

(Boolean)

Raises:

(NotImplementedError)



# File 'lib/cms_scanner/target.rb', line 36

def vulnerable?
  raise NotImplementedError
end
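
A subclass is expected to supply its own `vulnerable?`. The sketch below shows the override pattern with stand-in classes; `BaseTarget`, `ExampleTarget`, `exit_code_for` and the exit code values are all hypothetical and not part of the library.

```ruby
# Stand-in for the base class documented above.
class BaseTarget
  def vulnerable?
    raise NotImplementedError
  end
end

# Hypothetical implementation that tracks a list of vulnerabilities.
class ExampleTarget < BaseTarget
  def initialize(vulnerabilities)
    super()
    @vulnerabilities = vulnerabilities
  end

  def vulnerable?
    !@vulnerabilities.empty?
  end
end

# Hypothetical mapping from the result to a process exit code.
def exit_code_for(target)
  target.vulnerable? ? 1 : 0
end

puts exit_code_for(ExampleTarget.new(['outdated plugin']))
puts exit_code_for(ExampleTarget.new([]))

begin
  BaseTarget.new.vulnerable?
rescue NotImplementedError => e
  puts "base class raised #{e.class}"
end
```

`NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch it; it has to be named explicitly as above.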

#xpath_pattern_from_page(xpath, pattern, page = nil) {|MatchData, Nokogiri::XML::Element| ... } ⇒ `Array<Array<MatchData, Nokogiri::XML::Element>>`

Parameters:

xpath (String)
pattern (Regexp)
page (Typhoeus::Response, String) (defaults to: nil)

Yields:

(MatchData, Nokogiri::XML::Element)

Returns:

(Array<Array<MatchData, Nokogiri::XML::Element>>)

# File 'lib/cms_scanner/target.rb', line 52

def xpath_pattern_from_page(xpath, pattern, page = nil)
  page    = NS::Browser.get(url(page)) unless page.is_a?(Typhoeus::Response)
  matches = []

  page.html.xpath(xpath).each do |node|
    next unless node.text.strip =~ pattern

    yield Regexp.last_match, node if block_given?

    matches << [Regexp.last_match, node]
  end

  matches
end
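
`#comments_from_page` and `#javascripts_from_page` both delegate to this method, so its collect-and-yield loop is worth seeing on its own. In the sketch below each "node" is just a string holding the node's text, standing in for a Nokogiri node; `pattern_matches` and the sample comments are hypothetical.

```ruby
# Hypothetical stand-in: node_texts plays the role of the nodes selected
# by the XPath expression.
def pattern_matches(node_texts, pattern)
  matches = []

  node_texts.each do |text|
    next unless text.strip =~ pattern

    yield Regexp.last_match, text if block_given?

    matches << [Regexp.last_match, text]
  end

  matches
end

comments = [
  ' Generated by ExampleCache 1.2.3 ',
  ' navigation ',
  ' ExampleCache 2.0 '
]

found = pattern_matches(comments, /ExampleCache (?<version>[\d.]+)/)
found.each { |match, _text| puts match[:version] }
```

Each result pairs the `MatchData` with the node it came from, so callers can read capture groups and still inspect the surrounding markup.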
