Class: TaliaCore::DataTypes::XmlData
- Inherits:
-
FileRecord
- Object
- ActiveRecord::Base
- DataRecord
- FileRecord
- TaliaCore::DataTypes::XmlData
- Defined in:
- lib/talia_core/data_types/xml_data.rb
Overview
FileRecord class to store XML (or XHTML) files.
Instance Attribute Summary
Attributes inherited from DataRecord
Instance Method Summary
-
#create_from_data(location, data, options = {:tidy => true}) ⇒ Object
See the FileStore module for details on how creation of data file objects works.
-
#extract_mime_type(location) ⇒ Object
MIME type should be one of ‘text/html’ or ‘text/xml’ (‘text/hnml’ is supported for legacy reasons).
-
#get_content(options = {}) ⇒ Object
The content of this document.
-
#get_content_string(options = nil) ⇒ Object
Same as #get_content, but returns a string instead of the REXML elements.
-
#get_escaped_content_string(options = nil) ⇒ Object
Same as #get_content_string, but with the XML escape for inclusion in HTML documents.
-
#mime_subtype ⇒ Object
The MIME subtype of this record, that is, the part of its MIME type after the slash.
Methods inherited from FileRecord
#all_bytes, #get_byte, #position, #reset, #seek, #size
Methods included from PathHelpers::ClassMethods
Methods included from DataLoader::ClassMethods
Methods included from IipLoader
#convert_original?, #create_from_files, #create_from_stream, #create_iip, #open_original_image, #open_original_image_file, #open_original_image_stream, #orig_location, #prepare_image_from_existing!
Methods included from TaliaUtil::IoHelper
#base_for, #file_url, #open_from_url, #open_generic
Methods included from PathHelpers
#data_directory, #data_path, #extract_filename, #file_path, #full_filename, #static_path, #tempfile_path
Methods included from FileStore
#all_text, #assign_type, #create_from_file, #is_file_open?, #write_file_after_save
Methods inherited from DataRecord
#all_bytes, #content_string, find_by_type_and_location!, find_data_records, #get_byte, #mime_type, #position, #reset, #seek, #size
Instance Method Details
#create_from_data(location, data, options = {:tidy => true}) ⇒ Object
See the FileStore module for details on how creation of data file objects works. This version differs from the superclass version in that it will (optionally) clean the HTML using the “tidy” tool. Also see tidy.rubyforge.org/
Tidy will be used under the following circumstances:
-
The “tidy” option is given and
-
The library itself is available and
-
The file appears to be a (X)HTML file
Options:
- tidy
-
Use the “tidy” tool to clean up (X)HTML. Defaults to true if no options are given.
# File 'lib/talia_core/data_types/xml_data.rb', line 120
def create_from_data(location, data, options = {:tidy => true})
  # check tidy option
  if (((options[:tidy] == true) and (Tidy_enable == true)) and
      ((File.extname(location) == '.htm') or (File.extname(location) == '.html') or (File.extname(location) == '.xhtml')))
    # apply tidy on data
    data_to_write = Tidy.open(:show_warnings => false) do |tidy|
      tidy.options.output_xhtml = true
      tidy.options.tidy_mark = false
      xhtml = tidy.clean(data)
      xhtml
    end
  else
    data_to_write = data
  end
  # write data
  super(location, data_to_write, options)
end
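The three conditions listed above can be read as a single predicate. The following is a minimal sketch, not the library's code: `tidy_applicable?`, `HTML_EXTENSIONS`, and the `tidy_available` parameter are hypothetical names, with `tidy_available` standing in for the library's `Tidy_enable` flag.

```ruby
# Hypothetical helper illustrating when create_from_data would run Tidy.
# tidy_available is an assumption standing in for the Tidy_enable flag.
HTML_EXTENSIONS = ['.htm', '.html', '.xhtml'].freeze

def tidy_applicable?(location, options = { :tidy => true }, tidy_available = true)
  # All three conditions must hold: option set, library present, (X)HTML file.
  options[:tidy] == true &&
    tidy_available &&
    HTML_EXTENSIONS.include?(File.extname(location))
end

puts tidy_applicable?('page.html')                     # option defaults to true
puts tidy_applicable?('page.html', { :tidy => false }) # explicitly disabled
puts tidy_applicable?('data.xml')                      # not an (X)HTML extension
```

Because the default only applies when no options hash is passed, a caller who passes other options without a `:tidy` key would, under this reading, skip the Tidy step.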
#extract_mime_type(location) ⇒ Object
MIME type should be one of ‘text/html’ or ‘text/xml’ (‘text/hnml’ is supported for legacy reasons).
# File 'lib/talia_core/data_types/xml_data.rb', line 37
def extract_mime_type(location)
  # TODO: Could probably use the Mime classes to get the
  # type, or move to the superclass
  case File.extname(location).downcase
  when '.htm', '.html', '.xhtml'
    'text/html'
  when '.hnml'
    'text/hnml'
  when '.xml'
    'text/xml'
  end
end
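To see the lookup in isolation, the same extension mapping can be restated as a table. This is only an illustrative sketch: `MIME_BY_EXTENSION` and `mime_type_for` are hypothetical names and are not part of TaliaCore.

```ruby
# Standalone restatement of the extension lookup, for illustration only.
# mime_type_for is a hypothetical name, not a TaliaCore method.
MIME_BY_EXTENSION = {
  '.htm'   => 'text/html',
  '.html'  => 'text/html',
  '.xhtml' => 'text/html',
  '.hnml'  => 'text/hnml',
  '.xml'   => 'text/xml'
}.freeze

def mime_type_for(location)
  # The extension is lower-cased first, so 'INDEX.HTML' matches '.html'.
  MIME_BY_EXTENSION[File.extname(location).downcase]
end

puts mime_type_for('INDEX.HTML')
puts mime_type_for('legacy.hnml')
puts mime_type_for('notes.txt').inspect
```

As with the `case` statement in the method, an unrecognised extension yields `nil` rather than raising an error.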
#get_content(options = {}) ⇒ Object
The content of this document. This returns REXML elements for the document content. For plain XML files, this will return the children of the document root. For XHTML documents, this will return the children of the “body” tag.
Options:
- xsl_file
-
If given, the document will be transformed using this XSL file before the document is extracted
# File 'lib/talia_core/data_types/xml_data.rb', line 64
def get_content(options = {})
  # TODO: Maybe port this to hpricot/nokogiri too
  text_to_parse = all_text
  # if xsl_file option is specified, execute transformation
  if (options[:xsl_file])
    text_to_parse = xslt_transform(file_path, options[:xsl_file])
  end
  # create document object
  document = REXML::Document.new text_to_parse
  # get content
  if ((mime_subtype == "html") or ((mime_subtype == "xml") and (!options.nil?) and (!options[:xsl_file].nil?)))
    content = document.elements['//body'].elements
  elsif ((mime_subtype == "xml") or (mime_subtype == "hnml"))
    content = document.root.elements
  end
  # adjust/replace items path
  content.each { |i| wrapItem i }
  # return content
  return content
end
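The difference between the two extraction paths can be tried with REXML directly. The snippet below is a sketch using made-up sample documents; it leaves out the XSL transformation and the `wrapItem` step and only shows which child elements each branch selects.

```ruby
require 'rexml/document'

# Sample inputs invented for this illustration.
xhtml = '<html><head><title>t</title></head><body><p>One</p><p>Two</p></body></html>'
xml   = '<doc><item>1</item><item>2</item></doc>'

# XHTML branch: children of the "body" tag.
body_children = REXML::Document.new(xhtml).elements['//body'].elements
# Plain XML branch: children of the document root.
root_children = REXML::Document.new(xml).root.elements

body_names = body_children.map { |element| element.name }
root_names = root_children.map { |element| element.name }

puts body_names.inspect
puts root_names.inspect
```

In the XHTML case the `head` element is left out, since only the children of `body` are selected.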
#get_content_string(options = nil) ⇒ Object
Same as #get_content, but returns a string instead of the REXML elements.
# File 'lib/talia_core/data_types/xml_data.rb', line 92
def get_content_string(options = nil)
  xml_str = ''
  get_content(options).each do |element|
    xml_str << element.to_s
  end
  xml_str
end
#get_escaped_content_string(options = nil) ⇒ Object
Same as #get_content_string, but with the XML escape for inclusion in HTML documents.
# File 'lib/talia_core/data_types/xml_data.rb', line 102
def get_escaped_content_string(options = nil)
  get_content_string(options).gsub(/</, "&lt;").gsub(/>/, "&gt;")
end
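The escaping step replaces angle brackets with their HTML entities so that the markup is displayed rather than rendered. A minimal standalone sketch, where `escape_angle_brackets` is a hypothetical name for the two `gsub` calls:

```ruby
# Hypothetical helper mirroring the two gsub calls; not a TaliaCore method.
def escape_angle_brackets(markup)
  markup.gsub(/</, '&lt;').gsub(/>/, '&gt;')
end

escaped = escape_angle_brackets('<p>One</p>')
puts escaped
```

Only `<` and `>` are replaced by this approach; ampersands and quotes in the content pass through unchanged.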
#mime_subtype ⇒ Object
The MIME subtype of this record, that is, the part of its MIME type after the slash.
# File 'lib/talia_core/data_types/xml_data.rb', line 51
def mime_subtype
  mime_type.split(/\//)[1]
end
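For illustration, the split can be applied to a literal MIME type string. `subtype_of` is a hypothetical standalone function; in the class itself the value comes from the inherited `#mime_type`.

```ruby
# Hypothetical standalone version of the subtype extraction.
def subtype_of(mime_type)
  # 'text/html' splits into ['text', 'html']; index 1 is the subtype.
  mime_type.split(/\//)[1]
end

puts subtype_of('text/html')
puts subtype_of('text/xml')
```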