Class: HTTParty::Parser

Inherits:
Object
Defined in:
lib/ap/parser.rb

Overview

This is a monkey patch to HTTParty. The AP API does not wrap the HTML inside the <content> tag in CDATA sections, so the response body is run through gsub to add them before it is parsed.

Instance Method Summary

Instance Method Details

#ap_xml ⇒ Object

Fixes and parses the XML returned by the AP. Why is it broken? The HTML content isn't wrapped in CDATA sections.



# File 'lib/ap/parser.rb', line 13

def ap_xml
  # The alternative is a single gsub with a negative lookahead,
  #   /<content?([A-Za-z "=]+)>(?!<\!\[CDATA\[)/
  # but CS theory says that isn't a good idea, and so do running-time tests.
  # Instead: strip any existing CDATA wrapper, then wrap every <content> body.
  fixed = body.gsub(/<content?([A-Za-z "=]+)><\!\[CDATA\[/, '<content>')
              .gsub(/\]\]><\/content>/, "</content>")
              .gsub(/<content?([A-Za-z "=]+)>/, "<content><![CDATA[")
              .gsub(/<\/content>/, "]]></content>")
  Crack::XML.parse(fixed)
  # Crack::XML.parse(body.gsub(/<content?([A-Za-z "=]+)>(?!<\!\[CDATA\[)/, "<content><![CDATA[").gsub(/<\/content>/, "]]></content>"))
end
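
To make the effect of the four gsub calls easier to see, here is a minimal sketch that applies the same patterns to a plain string, without HTTParty or Crack. The helper name wrap_content_in_cdata and both sample bodies are made up for illustration; they are not part of the gem or of a real AP response.

```ruby
# Hypothetical stand-alone version of the gsub chain in #ap_xml.
# It returns the repaired string instead of handing it to Crack::XML.parse.
def wrap_content_in_cdata(body)
  body
    .gsub(/<content?([A-Za-z "=]+)><\!\[CDATA\[/, '<content>') # drop an existing CDATA opener
    .gsub(/\]\]><\/content>/, '</content>')                     # drop the matching closer
    .gsub(/<content?([A-Za-z "=]+)>/, '<content><![CDATA[')    # open CDATA after every <content ...>
    .gsub(/<\/content>/, ']]></content>')                       # close CDATA before every </content>
end

# Made-up sample bodies, not real AP responses.
plain   = '<entry><content type="xhtml"><p>Hello <b>AP</b></p></content></entry>'
wrapped = '<entry><content type="xhtml"><![CDATA[<p>Hi</p>]]></content></entry>'

puts wrap_content_in_cdata(plain)
puts wrap_content_in_cdata(wrapped)
```

In both cases the HTML ends up wrapped in exactly one CDATA section, whether or not the input already had one. Note that the replacement string is a bare <content>, so any attributes on the opening tag (type="xhtml" in the samples) are dropped along the way.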