Class: MARC::UnsafeXMLWriter
- Defined in:
- lib/marc/unsafe_xmlwriter.rb
Overview
UnsafeXMLWriter bypasses real XML handlers such as REXML or Nokogiri and builds the XML document by concatenating strings. It gives no guarantee of validity if the MARC record you're encoding isn't valid, and it won't do things like entity expansion, but it does escape text using Ruby's String#encode(xml: :text), and it is much faster: 4-5 times faster than Nokogiri and 15-20 times faster than the REXML version.
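As a rough illustration of the escaping the overview mentions, the sketch below calls String#encode with the `xml:` option directly. The helper names `escape_text` and `escape_attr` are made up for this example and are not part of the library; the writer itself only uses the `xml: :text` form.

```ruby
# xml: :text escapes the characters that are special in XML text nodes:
# & becomes &amp;, < becomes &lt;, > becomes &gt;. Quotes are left alone.
def escape_text(value)
  value.encode(xml: :text)
end

# xml: :attr also escapes double quotes and wraps the result in double quotes.
# (Shown for comparison only; hypothetical helper, not used by the writer.)
def escape_attr(value)
  value.encode(xml: :attr)
end

puts escape_text("Smith & Sons <Publishers>")
puts escape_attr('say "hi"')
```

Because only text content goes through this escaping, anything interpolated into an attribute (tags, indicators, subfield codes) is trusted to be well-formed already.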
Constant Summary
- XML_HEADER =
'<?xml version="1.0" encoding="UTF-8"?>'
- NS_ATTRS =
%(xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.loc.gov/MARC21/slim" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd")
- NS_COLLECTION =
"<collection #{NS_ATTRS}>".freeze
- COLLECTION =
"<collection>".freeze
- NS_RECORD =
"<record #{NS_ATTRS}>".freeze
- RECORD =
"<record>".freeze
Constants inherited from XMLWriter
Class Method Summary
-
.encode(record, include_namespace: true) ⇒ String
Take a record and turn it into a valid MARC-XML string.
-
.open_collection(include_namespace: true) ⇒ Object
Open `collection` tag, with or without namespace.
- .open_controlfield(tag) ⇒ Object
- .open_datafield(tag, ind1, ind2) ⇒ Object
- .open_record(include_namespace: true) ⇒ Object
- .open_subfield(code) ⇒ Object
-
.single_record_document(record, include_namespace: true) ⇒ Object
Produce an XML string with a single record in a collection.
Instance Method Summary
-
#write(record) ⇒ Object
Write the record to the target.
Methods inherited from XMLWriter
#close, fix_leader, #initialize, #stylesheet_tag
Constructor Details
This class inherits a constructor from MARC::XMLWriter
Class Method Details
.encode(record, include_namespace: true) ⇒ String
Take a record and turn it into a valid MARC-XML string. Note that this is an XML snippet, without an XML header or <collection> enclosure.
# File 'lib/marc/unsafe_xmlwriter.rb', line 58
def encode(record, include_namespace: true)
  xml = open_record(include_namespace: include_namespace).dup

  # MARCXML only allows alphanumerics or spaces in the leader
  lead = fix_leader(record.leader)
  xml << "<leader>" << lead.encode(xml: :text) << "</leader>"

  record.each do |f|
    if f.instance_of?(MARC::DataField)
      xml << open_datafield(f.tag, f.indicator1, f.indicator2)
      f.each do |sf|
        xml << open_subfield(sf.code) << sf.value.encode(xml: :text) << "</subfield>"
      end
      xml << "</datafield>"
    elsif f.instance_of?(MARC::ControlField)
      xml << open_controlfield(f.tag) << f.value.encode(xml: :text) << "</controlfield>"
    end
  end

  xml << "</record>"
  xml.force_encoding("utf-8")
end
.open_collection(include_namespace: true) ⇒ Object
Open `collection` tag, with or without namespace.
# File 'lib/marc/unsafe_xmlwriter.rb', line 26
def open_collection(include_namespace: true)
  if include_namespace
    NS_COLLECTION
  else
    COLLECTION
  end
end
.open_controlfield(tag) ⇒ Object
# File 'lib/marc/unsafe_xmlwriter.rb', line 88
def open_controlfield(tag)
  "<controlfield tag=\"#{tag}\">"
end
.open_datafield(tag, ind1, ind2) ⇒ Object
# File 'lib/marc/unsafe_xmlwriter.rb', line 80
def open_datafield(tag, ind1, ind2)
  "<datafield tag=\"#{tag}\" ind1=\"#{ind1}\" ind2=\"#{ind2}\">"
end
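The opening-tag helpers interpolate their arguments without escaping, which is part of what makes this writer "unsafe". The sketch below copies open_datafield from the listing and compares it with a hardened variant. `safe_open_datafield` is a hypothetical name invented here and does not exist in MARC::UnsafeXMLWriter; it only shows one way a caller could guard against odd indicator values, at some cost in speed.

```ruby
# Copy of the documented helper: values are interpolated as-is.
def open_datafield(tag, ind1, ind2)
  "<datafield tag=\"#{tag}\" ind1=\"#{ind1}\" ind2=\"#{ind2}\">"
end

# Hypothetical variant (an assumption, not library code). encode(xml: :attr)
# escapes &, <, > and ", and supplies the surrounding double quotes itself.
def safe_open_datafield(tag, ind1, ind2)
  "<datafield tag=#{tag.to_s.encode(xml: :attr)} " \
    "ind1=#{ind1.to_s.encode(xml: :attr)} " \
    "ind2=#{ind2.to_s.encode(xml: :attr)}>"
end

# A stray double quote in an indicator breaks the unescaped form.
puts open_datafield("245", "1", '"')
puts safe_open_datafield("245", "1", '"')
```

For records whose tags and indicators are already known to be clean, the two forms produce the same output, so the unescaped helper is sufficient.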
.open_record(include_namespace: true) ⇒ Object
# File 'lib/marc/unsafe_xmlwriter.rb', line 34
def open_record(include_namespace: true)
  if include_namespace
    NS_RECORD
  else
    RECORD
  end
end
.open_subfield(code) ⇒ Object
# File 'lib/marc/unsafe_xmlwriter.rb', line 84
def open_subfield(code)
  "<subfield code=\"#{code}\">"
end
.single_record_document(record, include_namespace: true) ⇒ Object
Produce an XML string with a single record in a collection.
# File 'lib/marc/unsafe_xmlwriter.rb', line 45
def single_record_document(record, include_namespace: true)
  xml = XML_HEADER.dup
  xml << open_collection(include_namespace: include_namespace)
  xml << encode(record, include_namespace: false)
  xml << "</collection>".freeze
  xml
end
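Note that single_record_document always calls encode with include_namespace: false, so the namespace attributes appear at most once, on the enclosing collection element. The sketch below mimics that wrapping with stand-ins so it runs without the marc gem. `ns_attr`, `record_snippet` and `single_document` are hypothetical names, the namespace string is a shortened placeholder for NS_ATTRS, and the leader value is invented sample data.

```ruby
# Shortened stand-in for NS_ATTRS (assumption: the real constant is longer).
def ns_attr
  'xmlns="http://www.loc.gov/MARC21/slim"'
end

# Stand-in for .encode: returns a bare <record> snippet with a sample leader.
def record_snippet(include_namespace: true)
  open_tag = include_namespace ? "<record #{ns_attr}>" : "<record>"
  "#{open_tag}<leader>00000nam a2200000 a 4500</leader></record>"
end

# Stand-in for .single_record_document: header, collection, one record.
def single_document(include_namespace: true)
  xml = +'<?xml version="1.0" encoding="UTF-8"?>'
  xml << (include_namespace ? "<collection #{ns_attr}>" : "<collection>")
  # The record is always emitted without its own namespace declaration.
  xml << record_snippet(include_namespace: false)
  xml << "</collection>"
end

puts single_document
puts single_document(include_namespace: false)
```

Declaring the namespace on the collection alone keeps the output shorter and avoids repeating the same attributes on the nested record.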
Instance Method Details
#write(record) ⇒ Object
Write the record to the target
# File 'lib/marc/unsafe_xmlwriter.rb', line 20
def write(record)
  @fh.write(self.class.encode(record))
end