Class: Cacofonix::Reader

Inherits:
Object
Includes:
Enumerable
Defined in:
lib/cacofonix/core/reader.rb

Overview

This is the primary class for reading data from an ONIX file.

Each ONIX file should contain a single header and one or more products:

reader = Cacofonix::Reader.new("somefile.xml")

puts reader.header.inspect

reader.each do |product|
  puts product.inspect
end

The header is returned as a Cacofonix::Header object, and each product as a Cacofonix::Product.

The Cacofonix::Product class can be a bit of a hassle to work with, as data can be nested in it fairly deeply. To wrap every product returned by the reader in a shim that provides simple accessors for common attributes, pass the shim class as the second argument.

reader = Cacofonix::Reader.new("somefile.xml", Cacofonix::APAProduct)

puts reader.header.inspect

reader.each do |product|
  puts product.inspect
end

The APA in APAProduct stands for Australian Publishers Association. The class provides simple access to the ONIX attributes that are commonly used in the Australian market.
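To illustrate the general idea of a shim, here is a minimal sketch using Ruby's standard SimpleDelegator. NestedProduct, NestedTitle and FlatProduct are hypothetical stand-ins invented for this example; they are not Cacofonix classes, and this is not necessarily how APAProduct is implemented.

```ruby
require "delegate"

# Hypothetical stand-ins for a deeply nested product record.
# These are NOT the real Cacofonix classes.
NestedTitle   = Struct.new(:title_type, :title_text)
NestedProduct = Struct.new(:record_reference, :titles)

# A shim wraps a product and adds flat accessors for common fields,
# while delegating every other call to the wrapped object.
class FlatProduct < SimpleDelegator
  # Text of the first title, or nil when the product has no titles.
  def title
    first = __getobj__.titles.first
    first && first.title_text
  end
end

raw  = NestedProduct.new("ref-001", [NestedTitle.new(1, "An Example Book")])
shim = FlatProduct.new(raw)

puts shim.title             # flat accessor added by the shim
puts shim.record_reference  # delegated to the wrapped product
```

The caller only ever sees the flat accessor, so the nesting can change without touching code that reads the title.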

As well as the file header, there are a handful of other read-only attributes that might be useful:

reader = Cacofonix::Reader.new("somefile.xml", Cacofonix::APAProduct)

puts reader.xml_lang
puts reader.xml_version

Note that ONIX has 1500 valid named entities (such as &ndash;) that can cause the parser to throw an exception because it doesn’t recognise them. To work around this, the parser loads the ONIX DTD. It generally has to do this over the internet, which slows parsing considerably. To skip DTD loading (at the risk of parser exceptions), pass the :dtd option to the constructor:

reader = Cacofonix::Reader.new("somefile.xml", Cacofonix::Product, :dtd => false)

For more information, see is.gd/p7fHQq

Constructor Details

#initialize(input, product_klass = ::Cacofonix::Product, options = {}) ⇒ Reader

Options:

:dtd - if false, the DTD is not loaded before parsing
:interpret - a module (or an array of modules) that should extend
             each Product


# File 'lib/cacofonix/core/reader.rb', line 71

def initialize(input, product_klass = ::Cacofonix::Product, options = {})
  @input = input
  @product_klass = product_klass
  @options = options || {}

  create_parser

  @release = find_release
  @header = find_header

  @xml_lang    ||= @reader.lang
  @xml_version ||= @reader.xml_version.to_f
end
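The :interpret option passes a module (or an array of modules) to each product's interpret method. A minimal sketch of how such a method could work with Object#extend follows. DemoProduct and ShoutingTitle are hypothetical names made up for this example, and the body of the real Cacofonix::Product#interpret is an assumption here.

```ruby
# Hypothetical module of convenience methods; not part of Cacofonix.
module ShoutingTitle
  def loud_title
    title.upcase
  end
end

# Stand-in product with a minimal interpret method. How the real
# Cacofonix::Product#interpret is written is an assumption.
class DemoProduct
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Accepts nil, a single module, or an array of modules, and
  # extends only this instance with each of them.
  def interpret(mods)
    Array(mods).each { |mod| extend(mod) }
    self
  end
end

product = DemoProduct.new("quiet book")
product.interpret(ShoutingTitle)
puts product.loud_title
```

Because extend works per object, other instances of the same class are left untouched.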

Instance Attribute Details

#header ⇒ Object (readonly)

Returns the value of attribute header.



# File 'lib/cacofonix/core/reader.rb', line 62

def header
  @header
end

#release ⇒ Object (readonly)

Returns the value of attribute release.



# File 'lib/cacofonix/core/reader.rb', line 62

def release
  @release
end

#xml_lang ⇒ Object (readonly)

Returns the value of attribute xml_lang.



# File 'lib/cacofonix/core/reader.rb', line 63

def xml_lang
  @xml_lang
end

#xml_version ⇒ Object (readonly)

Returns the value of attribute xml_version.



# File 'lib/cacofonix/core/reader.rb', line 63

def xml_version
  @xml_version
end

Instance Method Details

#close ⇒ Object



# File 'lib/cacofonix/core/reader.rb', line 111

def close
  puts "Reader#close is deprecated."
end

#each(&block) ⇒ Object

Iterates over all the products in an ONIX file, yielding each one to the block.



# File 'lib/cacofonix/core/reader.rb', line 87

def each(&block)
  @reader.each do |node|
    if @reader.node_type == 1 && @reader.name == "Product"
      str = @reader.outer_xml
      product = str.nil? ? @product_klass.new : @product_klass.from_xml(str)
      product.interpret @options[:interpret]
      yield product
    end
  end
end
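Because the class includes Enumerable and defines each, the usual Enumerable methods become available. The sketch below shows this with a stand-in class; DemoReader is a hypothetical name and iterates over an in-memory array rather than an XML parser.

```ruby
# Stand-in reader: includes Enumerable and defines each, as Reader does.
# It walks a plain array instead of an ONIX file.
class DemoReader
  include Enumerable

  def initialize(titles)
    @titles = titles
  end

  # Yield every item to the block, one at a time.
  def each
    @titles.each { |title| yield title }
  end
end

reader = DemoReader.new(["Alpha", "Beta", "Gamma"])

# All of these are provided by Enumerable on top of each.
puts reader.first(2).inspect
puts reader.select { |title| title.start_with?("B") }.inspect
puts reader.map(&:length).inspect
```

Methods such as first stop early, which can matter when only the first few products of a large file are needed.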

#products ⇒ Object

Assembles all the products in the ONIX file into an array. The array is cached and holds every product in memory at once, so use with care on very large ONIX files.



# File 'lib/cacofonix/core/reader.rb', line 101

def products
  @products ||= [].tap { |prods| each { |prod| prods << prod } }
end
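The memoisation in products means the file is only walked once, at the cost of keeping every product in memory. The following sketch contrasts that with batch-wise iteration through Enumerable#each_slice. CountingReader is a hypothetical stand-in that counts how often each runs; only its products method is copied from the source above.

```ruby
# Stand-in reader that records how many full passes were made.
class CountingReader
  include Enumerable
  attr_reader :passes

  def initialize(items)
    @items = items
    @passes = 0
  end

  def each
    @passes += 1
    @items.each { |item| yield item }
  end

  # Same memoisation pattern as Reader#products.
  def products
    @products ||= [].tap { |prods| each { |prod| prods << prod } }
  end
end

reader = CountingReader.new(%w[a b c d e])
reader.products
reader.products        # served from the cached array, no second pass
puts reader.passes

# Lower-memory alternative: handle a few items at a time.
reader.each_slice(2) { |batch| puts batch.inspect }
```

Processing in slices keeps only one batch alive at a time, which is usually the safer choice for large inputs.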

#rewind ⇒ Object

Rewind the reader so that you can call ‘each’ again.



# File 'lib/cacofonix/core/reader.rb', line 107

def rewind
  create_parser
end
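The need for rewind follows from reading the input as a forward-only stream: once a pass finishes, nothing is left to yield until the parser is recreated. A minimal sketch of that behaviour, using StringIO as the stream, is shown below. LineReader is a hypothetical stand-in and reads lines of text, not ONIX XML.

```ruby
require "stringio"

# Stand-in for a forward-only reader. It streams lines from a string
# and, like Reader#rewind, starts over by recreating its parser.
class LineReader
  include Enumerable

  def initialize(text)
    @text = text
    create_parser
  end

  # Consume the stream from its current position to the end.
  def each
    while (line = @io.gets)
      yield line.chomp
    end
  end

  def rewind
    create_parser
  end

  private

  # Build a fresh stream positioned at the start of the input.
  def create_parser
    @io = StringIO.new(@text)
  end
end

reader = LineReader.new("one\ntwo\n")
puts reader.to_a.inspect   # first pass reads everything
puts reader.to_a.inspect   # stream is exhausted
reader.rewind
puts reader.to_a.inspect   # readable again after rewind
```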