Class: XML::Digester

Inherits:
Object
Defined in:
lib/xml/digestr.rb

Overview

Processes XML input according to a series of rules that are configured before parsing begins. A Digester instance wraps an instance of XML::SaxParser and uses its own event callbacks to trigger the configured rules as appropriate.

This API is based on the Jakarta Commons Digester library (jakarta.apache.org/commons/digester), and is intended to provide similar semantics to that package in a pleasingly Rubyish manner.

Notes

  • It’s not yet as fast as I’d like.

  • There is currently no namespace support.

Defined Under Namespace

Classes: BlockRule, CallMethodRule, CallParamRule, Error, LinkRule, ObjectCreateRule, RulesBase, SetNextRule, SetPropertiesRule, SetPropertyRule, SetTopRule

Constant Summary

VERSION =
"0.0.1"

Constructor Details

#initialize(pedantic = false, parser = XML::SaxParser.new) ⇒ Digester

Create a new Digester. If a parser is supplied, be aware that some of its callbacks will be replaced:

  • on_start_document

  • on_start_element

  • on_characters

  • on_cdata_block

  • on_end_element

  • on_end_document



# File 'lib/xml/digestr.rb', line 536

def initialize(pedantic = false, parser = XML::SaxParser.new)
  @parser = parser
  @parser.on_start_document(&method(:cb_start_document))
  @parser.on_end_document(&method(:cb_end_document))
  @parser.on_start_element(&method(:cb_start_element))
  @parser.on_characters(&method(:cb_characters))
  @parser.on_cdata_block(&method(:cb_characters))
  @parser.on_end_element(&method(:cb_end_element))
  @pedantic = !!pedantic
  @userstack = []
end
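
The callback replacement described above can be illustrated with a small stand-in. This is only a sketch: FakeParser and Wrapper are hypothetical classes written for this example, not XML::SaxParser or the Digester itself. The one thing taken from the source is the `&method(:name)` idiom, which converts a bound method into the block that the parser stores.

```ruby
# Hypothetical stand-in for a SAX-style parser; NOT XML::SaxParser.
class FakeParser
  def initialize
    @callbacks = {}
  end

  # Registering a block replaces any block stored earlier for this event.
  def on_start_element(&blk)
    @callbacks[:start_element] = blk
  end

  # Test helper: invoke the stored callback as a parser would.
  def fire(event, *args)
    @callbacks.fetch(event).call(*args)
  end
end

# Hypothetical wrapper that installs its own callback, as the constructor does.
class Wrapper
  attr_reader :seen

  def initialize(parser)
    @seen = []
    # Same idiom as the constructor above: a bound Method converted to a block.
    parser.on_start_element(&method(:cb_start_element))
  end

  private

  def cb_start_element(name, attrs)
    @seen << [name, attrs]
  end
end

parser = FakeParser.new
parser.on_start_element { |name, _attrs| raise "old callback for #{name}" }
wrapper = Wrapper.new(parser) # the earlier callback is replaced here
parser.fire(:start_element, "a", { "id" => "1" })
```

The practical consequence is that callbacks set on a parser before it is handed to the Digester should not be expected to run afterwards.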

Instance Attribute Details

#current_path ⇒ Object (readonly)

Obtain the path to the current element in the form: /a/b/c



# File 'lib/xml/digestr.rb', line 512

def current_path
  @current_path
end

#parser ⇒ Object (readonly)

Obtain the XML::SaxParser used by this Digester



# File 'lib/xml/digestr.rb', line 505

def parser
  @parser
end

#pedantic ⇒ Object Also known as: pedantic?

Determines whether rules are pedantic (e.g. raise Errors when some attributes cannot be matched, or when a method call fails).



# File 'lib/xml/digestr.rb', line 516

def pedantic
  @pedantic
end

#rulestack ⇒ Object (readonly)

Obtain the rulestack - an auxiliary stack provided for rule-specific state storage.



# File 'lib/xml/digestr.rb', line 509

def rulestack
  @rulestack
end

#userstack ⇒ Object (readonly) Also known as: stack

Obtain the userstack. This is the main stack used by the digester and should be treated as read-only: use the mutators provided by this class (#push, #pop, etc.) to ensure the digester state remains consistent.



# File 'lib/xml/digestr.rb', line 523

def userstack
  @userstack
end

Instance Method Details

#add_block(pattern, &blk) ⇒ Object

call-seq:

add_block(pattern) { |*args| ... }

Add a new BlockRule with the supplied block. See BlockRule for details of the block’s arguments.



# File 'lib/xml/digestr.rb', line 618

def add_block(pattern, &blk)
  add_rule(BlockRule.new(pattern, &blk))
end

#add_call_method(pattern, msg = nil, target_ofs = 0, *args, &blk) ⇒ Object

call-seq:

add_call_method(pattern, msg, target_ofs = 0, *args)
add_call_method(pattern) { |target| ... }
add_call_method(pattern, nil, target_ofs, *args) { |target, *args| ... }

Add a new CallMethodRule that will call the given method on the object at the given offset from the top of the stack (positive only, increasing distance from the stack top).



# File 'lib/xml/digestr.rb', line 698

def add_call_method(pattern, msg = nil, target_ofs = 0,*args, &blk)
  add_rule(CallMethodRule.new(pattern,msg,target_ofs,*args,&blk))
end
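
The "offset from the top of the stack" wording can be pictured with a plain Array. This is a sketch only. The helper target_at is hypothetical, and the indexing it uses (offset 0 is the last array element) is an assumption drawn from the description above, not from the CallMethodRule source.

```ruby
# A plain Array standing in for the user stack; the last element is the top.
stack = [:root, "middle", [1, 2]]

# Assumed helper (not part of the library): offset 0 is the top of the
# stack, larger offsets reach further down.
def target_at(stack, ofs)
  stack[-1 - ofs]
end

top_size = target_at(stack, 0).send(:size)   # calls [1, 2].size
shouted  = target_at(stack, 1).send(:upcase) # calls "middle".upcase
```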

#add_call_param(pattern, param_idx = 0, source = nil, type = String) ⇒ Object

call-seq:

add_call_param(pattern, param_idx, attr_name, type = String)
add_call_param(pattern, param_idx, stack_index, type = String)
add_call_param(pattern, param_idx, nil, type = String)
add_call_param(pattern, param_idx = 0)

Add a new CallParamRule that will take its parameter value from the specified source, or the current element body if source is nil.



# File 'lib/xml/digestr.rb', line 711

def add_call_param(pattern, param_idx = 0, source = nil, type = String)
  add_rule(CallParamRule.new(pattern,param_idx,source,type))
end

#add_call_param_attribute(pattern, param_idx, attr_name, type = String) ⇒ Object

call-seq:

add_call_param_attribute(pattern, param_idx, attr_name, type = String)
add_call_param_attribute(pattern, param_idx, attr_name)

Add a new CallParamRule that will take its parameter value from the named attribute on the current element. This just calls through to add_call_param and is provided for xmldigester compatibility.



# File 'lib/xml/digestr.rb', line 733

def add_call_param_attribute(pattern, param_idx, attr_name, type = String)
  add_call_param(pattern, param_idx, attr_name, type)
end

#add_call_param_body(pattern, param_idx = 0, type = String) ⇒ Object

call-seq:

add_call_param_body(pattern, param_idx, type = String)
add_call_param_body(pattern, param_idx = 0)

Add a new CallParamRule that will take its parameter value from the current element body. This just calls through to add_call_param and is provided for xmldigester compatibility.



# File 'lib/xml/digestr.rb', line 722

def add_call_param_body(pattern, param_idx = 0, type = String)
  add_call_param(pattern,param_idx,nil,type)
end

#add_call_param_stack(pattern, param_idx, stack_ofs = 0, type = nil) ⇒ Object

call-seq:

add_call_param_stack(pattern, param_idx, stack_ofs, type = nil)
add_call_param_stack(pattern, param_idx, stack_ofs = 0)

Add a new CallParamRule that will take its parameter value from the specified stack element. The stack offset should be a non-negative integer indicating the depth of the target object; zero (the default) indicates the top of the stack.

This just calls through to add_call_param and is provided for xmldigester compatibility.



# File 'lib/xml/digestr.rb', line 748

def add_call_param_stack(pattern, param_idx, stack_ofs = 0, type = nil)
  add_call_param(pattern, param_idx, stack_ofs, type)
end

#add_link(pattern, &blk) ⇒ Object

call-seq:

add_link(pattern) { |parent, child| ... }

Add a new LinkRule that, when matched, will pass the top two stack elements to the supplied block, in the order given in the call-seq above (parent, then child).



# File 'lib/xml/digestr.rb', line 664

def add_link(pattern, &blk)
  add_rule(LinkRule.new(pattern,&blk))
end

#add_object_create(pattern, klass = Object, msg = :new, *args, &blk) ⇒ Object Also known as: add_create_object

call-seq:

add_object_create(pattern, klass)
add_object_create(pattern, obj, message, *args)
add_object_create(pattern) { ... }

Add a new ObjectCreateRule with the specified class, method call or block.



# File 'lib/xml/digestr.rb', line 629

def add_object_create(pattern, klass = Object, msg = :new, *args, &blk)
  add_rule(ObjectCreateRule.new(pattern,klass,msg,*args,&blk))
end

#add_rule(rule) ⇒ Object

Add the specified rule to this digester.



# File 'lib/xml/digestr.rb', line 602

def add_rule(rule)
  rule.digester = self

  if @first_rule
    @last_rule.next, rule.prev = rule, @last_rule
    @last_rule = rule
  else
    @first_rule = @last_rule = rule
  end
end
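
The source above keeps rules in a doubly linked chain in the order they were added. The sketch below reproduces that linking logic outside the library to show the resulting structure. Rule and RuleChain are hypothetical names used only here, and the Struct carries only the fields that add_rule touches.

```ruby
# Hypothetical rule type with just the fields add_rule assigns.
Rule = Struct.new(:name, :digester, :prev, :next)

# Hypothetical holder that repeats the linking logic shown above.
class RuleChain
  attr_reader :first_rule, :last_rule

  def add_rule(rule)
    rule.digester = self

    if @first_rule
      # Link the new rule after the current tail, then advance the tail.
      @last_rule.next, rule.prev = rule, @last_rule
      @last_rule = rule
    else
      @first_rule = @last_rule = rule
    end
  end

  # Walk the chain from head to tail and collect the rule names.
  def names
    out = []
    rule = @first_rule
    while rule
      out << rule.name
      rule = rule.next
    end
    out
  end
end

chain = RuleChain.new
a = Rule.new("a")
b = Rule.new("b")
c = Rule.new("c")
[a, b, c].each { |r| chain.add_rule(r) }
order = chain.names
```

Walking the chain from first_rule yields the rules in insertion order, so the order of add_* calls is the order in which the chain is stored.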

#add_set_next(pattern, msg, *args) ⇒ Object

call-seq:

add_set_next(pattern, msg, *additional_args)

Add a new SetNextRule that will send the specified message to the next-to-top stack object, passing in the top object as the initial parameter, followed by any additional arguments supplied to this method.



# File 'lib/xml/digestr.rb', line 675

def add_set_next(pattern, msg, *args)
  add_rule(SetNextRule.new(pattern,msg,*args))
end

#add_set_properties(pattern, mapping = Hash.new {|h,k| h[k] = k}, &blk) ⇒ Object

call-seq:

add_set_properties(pattern, [mapping])
add_set_properties(pattern, [mapping]) { |target, attr, value| ... }

Add a new SetPropertiesRule. See SetPropertiesRule for details of the optional mapping format.



# File 'lib/xml/digestr.rb', line 642

def add_set_properties(pattern, 
                       mapping = Hash.new {|h,k| h[k] = k}, 
                       &blk)
  add_rule(SetPropertiesRule.new(pattern,mapping,&blk))
end
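
The default mapping in the signature is an identity Hash: looking up any key returns that same key and stores it. The snippet below shows only that Ruby behaviour. How SetPropertiesRule consumes the mapping is not shown on this page, so reading the keys as XML attribute names and the values as Ruby attribute names is an assumption.

```ruby
# Same default as in the method signature above.
mapping = Hash.new { |h, k| h[k] = k }

# An explicit entry (assumed meaning: XML attribute "isbn" feeds a Ruby
# attribute called "identifier").
mapping["isbn"] = "identifier"

renamed   = mapping["isbn"]   # explicit entry wins
untouched = mapping["title"]  # unknown key maps to itself and is stored
```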

#add_set_property(pattern, name_attr, value_attr, type = String) ⇒ Object

call-seq:

add_set_property(pattern, name_attr, value_attr, type = String)

Add a new SetPropertyRule that will set the ruby attribute named by the ‘name_attr’ XML attribute to the value specified by the ‘value_attr’ XML attribute.



# File 'lib/xml/digestr.rb', line 655

def add_set_property(pattern, name_attr, value_attr, type = String)
  add_rule(SetPropertyRule.new(pattern,name_attr,value_attr,type))
end

#add_set_top(pattern, msg, *args) ⇒ Object

call-seq:

add_set_top(pattern, msg, *additional_args)

Add a new SetTopRule that will send the specified message to the top stack object, passing in the next-to-top object as the initial parameter, followed by any additional arguments supplied to this method.



# File 'lib/xml/digestr.rb', line 686

def add_set_top(pattern, msg, *args)
  add_rule(SetTopRule.new(pattern,msg,*args))
end
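
add_set_next and add_set_top differ only in which of the top two stack objects receives the message and which is passed as the first argument. The sketch below spells that out with ordinary objects. Node, add_child and attach_to are hypothetical names invented for this example, and the calls are made directly with send rather than through the rule classes.

```ruby
# Hypothetical object type used only for this illustration.
class Node
  attr_reader :name, :children, :parent

  def initialize(name)
    @name = name
    @children = []
  end

  def add_child(child, label = nil)
    @children << [child.name, label]
  end

  def attach_to(parent)
    @parent = parent.name
  end
end

stack = [Node.new("library"), Node.new("book")]
top, next_to_top = stack[-1], stack[-2]

# add_set_next style: the next-to-top object receives the message, the top
# object is the first argument, and any additional arguments follow.
next_to_top.send(:add_child, top, "fiction")

# add_set_top style: the top object receives the message and the
# next-to-top object is the first argument.
top.send(:attach_to, next_to_top)
```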

#clear ⇒ Object

Clear the user stack and reset the digester state.



# File 'lib/xml/digestr.rb', line 571

def clear
  @userstack.clear
  @rulestack = []
end

#current_element ⇒ Object

Obtain the name of the element currently being processed, i.e. the final segment of current_path.



# File 'lib/xml/digestr.rb', line 594

def current_element
  stk = @current_path
  stk.slice(stk.rindex('/')..-1)
end
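
Evaluating the slice expression from the source above on a sample path shows what is actually returned. The path "/a/b/c" is only an example value in the documented /a/b/c form. The second expression is a hypothetical variant that is not in the library, included for callers who want the bare name.

```ruby
current_path = "/a/b/c"

# Same expression as in current_element: slice from the last '/' to the end.
with_slash = current_path.slice(current_path.rindex('/')..-1)

# Hypothetical variant (not in the library): start one character later.
name_only = current_path.slice((current_path.rindex('/') + 1)..-1)
```

As the source is written, the result keeps the leading slash ("/c" here), so callers comparing against a bare element name may need to strip it.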

#parse_file(filename) ⇒ Object

call-seq:

parse_file(filename) -> first_object

Parse the specified XML file and trigger appropriate rules in this Digester. Returns the first element that was pushed onto the stack.



# File 'lib/xml/digestr.rb', line 565

def parse_file(filename)
  @parser.filename = filename
  do_parse
end

#parse_string(xml) ⇒ Object

call-seq:

parse_string(xml) -> first_object

Parse the specified XML string and trigger appropriate rules in this Digester. Returns the first element that was pushed onto the stack.



# File 'lib/xml/digestr.rb', line 554

def parse_string(xml)
  @parser.string = xml
  do_parse
end

#peek ⇒ Object

Retrieve a reference to the top user stack element, without actually popping it from the stack.



# File 'lib/xml/digestr.rb', line 578

def peek
  @userstack.last
end

#pop ⇒ Object

Remove and return the top user stack element.



# File 'lib/xml/digestr.rb', line 583

def pop
  @userstack.pop
end

#push(o) ⇒ Object

Push the supplied object o onto the user stack.



# File 'lib/xml/digestr.rb', line 588

def push(o)
  @first ||= o
  @userstack.push(o)
end
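
The three stack methods above can be exercised without a parser. MiniStack below is a hypothetical class that copies only the push, pop and peek bodies shown on this page, plus a reader for @first. That the parse methods return this remembered first object is suggested by their descriptions ("Returns the first element that was pushed onto the stack") but is not confirmed by any source shown here.

```ruby
# Hypothetical class repeating the stack logic shown above.
class MiniStack
  attr_reader :first

  def initialize
    @userstack = []
  end

  def push(o)
    @first ||= o # remembers only the first object ever pushed
    @userstack.push(o)
  end

  def pop
    @userstack.pop
  end

  def peek
    @userstack.last
  end
end

s = MiniStack.new
s.push(:root)
s.push(:child)
peeked  = s.peek # top element, stack unchanged
popped  = s.pop  # removes and returns the top element
new_top = s.peek
```

The first object stays available after it has been popped, which is how a result can outlive an emptied stack.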