Class: Infoboxer::Tree::Template
- Includes:
- Linkable
- Defined in:
- lib/infoboxer/tree/template.rb
Overview
Represents a MediaWiki template.
A template is basically a thing with a name, some variables and their values. When pages are displayed in a browser, the wiki engine renders templates into something different; yet, when extracting information with Infoboxer, you are working with the original templates.
This requires some practice and understanding, yet allows you to do very powerful things. There are many kinds of templates, from purely formatting-related ones (which are typically little more than bells and whistles for page appearance, and should be rendered as text) to very information-heavy ones, like infoboxes, from which Infoboxer borrows its name!
Basically, to extract information from a template you list its #variables, and then use the #fetch method (and its variants: #fetch_hash/#fetch_date) to extract their values.
On variable naming
MediaWiki templates can contain named and unnamed variables. Example:
{{birth date and age|1953|2|19|df=y}}
This is a template with the name "birth date and age", three unnamed variables with values "1953", "2" and "19", and one named variable with name "df" and value "y".
For consistency, Infoboxer treats unnamed variables exactly the same way MediaWiki does: they are considered to have numeric names, which start from 1 and are stored as strings. So, for the template shown above, the following is correct:
template.fetch('1').text == '1953'
template.fetch('2').text == '2'
template.fetch('3').text == '19'
template.fetch('df').text == 'y'
Note also that named variables with simple text values are duplicated in the template node's Node#params, so the following is also correct:
template.params['df'] == 'y'
template.params.has_key?('1') == false
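To make the naming rule concrete, here is a minimal, self-contained sketch that does not use Infoboxer at all. The `parse_template` helper is hypothetical and handles only flat templates (no nested templates or links); it merely illustrates how unnamed variables receive the string names "1", "2", "3" while named ones keep their own names.

```ruby
# Hypothetical helper, NOT part of Infoboxer: splits a flat template
# (no nested templates or links) into its name and variables.
def parse_template(source)
  name, *parts = source.delete_prefix('{{').delete_suffix('}}').split('|')
  position = 0
  variables = parts.map do |part|
    if part.include?('=')
      key, value = part.split('=', 2)
      [key.strip, value.strip]          # named: keeps its own name
    else
      position += 1
      [position.to_s, part.strip]       # unnamed: numbered from 1, as a String
    end
  end.to_h
  { name: name.strip, variables: variables }
end

template = parse_template('{{birth date and age|1953|2|19|df=y}}')
template[:name]        # => "birth date and age"
template[:variables]   # => {"1"=>"1953", "2"=>"2", "3"=>"19", "df"=>"y"}
```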
For more advanced topics, like subclassing templates by name and converting them to inline text, please read the Infoboxer::Templates module's documentation.
Direct Known Subclasses
Instance Attribute Summary
-
#name ⇒ String
readonly
Template name, designating its contents structure.
Attributes inherited from Compound
Attributes inherited from Node
Instance Method Summary
-
#fetch(*patterns) ⇒ Nodes<Var>
Fetches template variable(s) by name(s) or patterns.
-
#fetch_date(*patterns) ⇒ Date
Fetches date by list of variable names containing date components.
-
#fetch_hash(*patterns) ⇒ Hash<String => Var>
Fetches hash {name => variable}, by same patterns as #fetch.
-
#follow ⇒ MediaWiki::Page
Extracts template source and returns it parsed (or nil, if template not found).
-
#initialize(name, variables = Nodes[]) ⇒ Template
constructor
A new instance of Template.
-
#link ⇒ Object
Wikilink name of this template's source.
- #named_variables ⇒ Object
- #text ⇒ Object
-
#to_h ⇒ Hash{String => String}
Represents entire template as hash of String => String, where keys are variable names and values are text representation of variables contents.
-
#to_tree(level = 0) ⇒ Object
See Node#to_tree.
-
#unnamed_variables ⇒ Nodes<Var>
Returns list of template variables with numeric names (which are treated as "unnamed" variables by MediaWiki templates, see class docs for explanation).
- #unwrap ⇒ Object
Methods included from Linkable
Methods inherited from Compound
Methods inherited from Node
#==, #children, coder, def_readers, #first?, #index, #inspect, #next_siblings, #prev_siblings, #siblings, #text_, #to_s
Methods included from Navigation::Wikipath
Methods included from Navigation::Sections::Node
Methods included from Navigation::Shortcuts::Node
#bold?, #categories, #external_links, #heading?, #headings, #images, #infobox, #infoboxes, #italic?, #lists, #paragraphs, #tables, #templates, #wikilinks
Methods included from Navigation::Lookup::Node
#_lookup, #_lookup_children, #_lookup_next_siblings, #_lookup_parents, #_lookup_prev_sibling, #_lookup_prev_siblings, #_lookup_siblings, #_matches?, #lookup, #lookup_children, #lookup_next_siblings, #lookup_parents, #lookup_prev_sibling, #lookup_prev_siblings, #lookup_siblings, #matches?, #parent?
Constructor Details
Instance Attribute Details
#name ⇒ String (readonly)
Template name, designating its contents structure.
See also #url, which you can navigate to read template's definition (and, in Wikipedia and many other projects, its documentation).
# File 'lib/infoboxer/tree/template.rb', line 106
def name
  @name
end
Instance Method Details
#fetch(*patterns) ⇒ Nodes<Var>
Fetches template variable(s) by name(s) or patterns.
Usage:
argentina.infobox.fetch('leader_title_1') # => one Var node
argentina.infobox.fetch('leader_title_1',
'leader_name_1') # => two Var nodes
argentina.infobox.fetch(/leader_title_\d+/) # => several Var nodes
# File 'lib/infoboxer/tree/template.rb', line 170
def fetch(*patterns)
  Nodes[*patterns.map { |p| variables.find(name: p) }.flatten]
end
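The pattern matching behind #fetch can be pictured with a small stand-alone sketch. It assumes that a String pattern matches a variable name exactly and a Regexp matches by pattern (Ruby's `===`); the real method delegates to `variables.find(name: p)`, so treat the matching rule here as an assumption. `Var`, `fetch_vars` and all values are placeholders for illustration, not Infoboxer's own classes or real infobox data.

```ruby
# Stand-in for Infoboxer's Var node; only the fields the sketch needs.
Var = Struct.new(:name, :text)

# Assumed matching rule: `pattern === name`, so a String matches exactly
# and a Regexp matches by pattern. Results keep the order of the patterns.
def fetch_vars(variables, *patterns)
  patterns.flat_map do |pattern|
    variables.select { |var| pattern === var.name }
  end
end

# Placeholder values, not taken from any real page.
VARIABLES = [
  Var.new('leader_title_1', 'title one'),
  Var.new('leader_name_1', 'name one'),
  Var.new('leader_title_2', 'title two')
].freeze

fetch_vars(VARIABLES, 'leader_title_1').map(&:name)
# => ["leader_title_1"]
fetch_vars(VARIABLES, /leader_title_\d+/).map(&:name)
# => ["leader_title_1", "leader_title_2"]
```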
#fetch_date(*patterns) ⇒ Date
Fetches date by list of variable names containing date components.
(Experimental, subject to change or enhancement.)
Explanation: if you have a template like
{{birth date and age|1953|2|19|df=y}}
...there is a short way to obtain date from it:
template.fetch_date('1', '2', '3') # => Date.new(1953,2,19)
# File 'lib/infoboxer/tree/template.rb', line 195
def fetch_date(*patterns)
  components = fetch(*patterns)
  components.pop while components.last.nil? && !components.empty?

  if components.empty?
    nil
  else
    Date.new(*components.map { |v| v.to_s.to_i })
  end
end
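The same idea can be tried on plain strings, without any Infoboxer objects. The `date_from_components` helper below is hypothetical; it only mirrors the logic of the source shown above: trailing missing components are dropped, and the remaining ones are converted to integers and passed to `Date.new`.

```ruby
require 'date'

# Hypothetical helper mirroring the idea of #fetch_date on plain values.
def date_from_components(*components)
  # Drop trailing missing components (e.g. an absent day or month).
  components.pop while !components.empty? && components.last.nil?
  return nil if components.empty?

  Date.new(*components.map { |c| c.to_s.to_i })
end

date_from_components('1953', '2', '19')  # => Date.new(1953, 2, 19)
date_from_components('1953', nil, nil)   # => Date.new(1953, 1, 1)
date_from_components(nil)                # => nil
```

`Date.new` fills omitted month and day with 1, which is why a year alone still yields a valid date.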
#fetch_hash(*patterns) ⇒ Hash<String => Var>
Fetches hash {name => variable}, by same patterns as #fetch.
# File 'lib/infoboxer/tree/template.rb', line 177
def fetch_hash(*patterns)
  fetch(*patterns).map { |v| [v.name, v] }.to_h
end
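The body of #fetch_hash is the usual "pairs to hash" idiom. A minimal sketch with a stand-in `Var` struct (an assumption for illustration, not Infoboxer's class) and a hypothetical `index_by_name` helper shows the shape of the result: keys are variable names, values are the variable objects themselves rather than their text.

```ruby
# Stand-in for Infoboxer's Var node.
Var = Struct.new(:name, :text)

# Hypothetical helper: the same map-to-pairs-then-to_h step as #fetch_hash.
def index_by_name(vars)
  vars.map { |var| [var.name, var] }.to_h
end

by_name = index_by_name([Var.new('1', '1953'), Var.new('df', 'y')])
by_name.keys         # => ["1", "df"]
by_name['df'].text   # => "y"
```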
#follow ⇒ MediaWiki::Page
Extracts template source and returns it parsed (or nil, if template not found).
NB: Infoboxer does NO variable substitution or other template evaluation actions. Moreover, it will almost certainly NOT parse template definitions correctly. You should use this method ONLY for "transclusion" templates (parts of content, which are included into other pages "as is").
Look for example at this page's source: each subtable about some region is just a transclusion of template. This can be processed like:
Infoboxer.wp.get('Tropical and subtropical coniferous forests').
templates(name: /forests$/).
follow.tables #.and_so_on
See also Linkable#follow for general notes on the following links.
# File 'lib/infoboxer/tree/template.rb', line 208
#link ⇒ Object
Wikilink name of this template's source.
# File 'lib/infoboxer/tree/template.rb', line 234
def link
  # FIXME: super-naive for now, doesn't thinks about subpages and stuff.
  "Template:#{name}"
end
#named_variables ⇒ Object
# File 'lib/infoboxer/tree/template.rb', line 154
def named_variables
  variables.select(&:named?)
end
#text ⇒ Object
# File 'lib/infoboxer/tree/template.rb', line 121
def text
  res = unnamed_variables.map(&:text).join('|')
  res.empty? ? '' : "{#{name}:#{res}}"
end
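Since #text has no description, a small sketch may help show what the source above produces. The `template_text` helper is hypothetical and takes the already-extracted texts of the unnamed variables; it follows the quoted logic only, so subclasses that override #text are not covered.

```ruby
# Hypothetical stand-in mirroring the #text logic quoted above:
# only unnamed variables contribute, joined with "|".
def template_text(name, unnamed_values)
  res = unnamed_values.join('|')
  res.empty? ? '' : "{#{name}:#{res}}"
end

template_text('birth date and age', %w[1953 2 19])
# => "{birth date and age:1953|2|19}"
template_text('clear', [])
# => ""
```

Note that named variables such as `df=y` do not appear in the result, and a template with no unnamed variables renders as an empty string.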
#to_h ⇒ Hash{String => String}
Represents entire template as hash of String => String, where keys are variable names and values are text representation of variables contents.
# File 'lib/infoboxer/tree/template.rb', line 141
def to_h
  variables.map { |var| [var.name, var.text] }.to_h
end
#to_tree(level = 0) ⇒ Object
See Node#to_tree
# File 'lib/infoboxer/tree/template.rb', line 131
def to_tree(level = 0)
  ' ' * level + "<#{descr}>\n" +
    variables.map { |var| var.to_tree(level + 1) }.join
end
#unnamed_variables ⇒ Nodes<Var>
Returns list of template variables with numeric names (which are treated as "unnamed" variables by MediaWiki templates, see class docs for explanation).
# File 'lib/infoboxer/tree/template.rb', line 150
def unnamed_variables
  variables.reject(&:named?)
end
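The split between #named_variables and #unnamed_variables depends on `Var#named?`, whose definition is not shown on this page. The sketch below assumes that a variable counts as named unless its name consists only of digits; that rule and the `Var` struct are illustrative assumptions, not the library's confirmed implementation.

```ruby
# Stand-in for Infoboxer's Var node.
Var = Struct.new(:name, :text) do
  # Assumption: "named" means the name is not purely numeric.
  def named?
    !name.match?(/\A\d+\z/)
  end
end

variables = [
  Var.new('1', '1953'),
  Var.new('2', '2'),
  Var.new('3', '19'),
  Var.new('df', 'y')
]

named, unnamed = variables.partition(&:named?)
named.map(&:name)     # => ["df"]
unnamed.map(&:name)   # => ["1", "2", "3"]
```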
#unwrap ⇒ Object
# File 'lib/infoboxer/tree/template.rb', line 126
def unwrap
  unnamed_variables.flat_map(&:children).unwrap
end