Class: Myasorubka::MSD
- Inherits: Object
  - Object
    - Myasorubka::MSD
- Defined in: lib/myasorubka/msd.rb
Overview
MSD is a morphosyntactic descriptor model.
This representation, with the concrete applications which display and exemplify the attributes and values and provide their internal constraints and relationships, makes the proposal self-explanatory. Other groups can easily test the specifications on their language, simply by following the method of the applications. The possibility of incorporating idiosyncratic classes and distinctions after the common core features makes the proposal relatively adaptable and flexible, without compromising compatibility.
MSD implementation and documentation are based on MULTEXT-East Morphosyntactic Specifications, Version 4: nl.ijs.si/ME/V4/msd/html/msd.html
You may use Myasorubka::MSD as either a parser or a generator.
```ruby
msd = Myasorubka::MSD.new(Myasorubka::MSD::Russian)
msd[:pos] = :noun
msd[:type] = :common
msd[:number] = :plural
msd[:case] = :locative
msd.to_s # => "Nc-pl"
```
```ruby
msd = Myasorubka::MSD.new(Myasorubka::MSD::Russian, 'Vmps-snpfel')
msd[:pos] # => :verb
msd[:tense] # => :past
msd[:person] # => nil
msd.grammemes # => {:vform=>:participle, …}
```
Defined Under Namespace
Modules: English, Russian
Classes: InvalidDescriptor
Constant Summary
- EMPTY_DESCRIPTOR = '-'
  Empty descriptor character.
Instance Attribute Summary
- #grammemes ⇒ Object (readonly)
  Returns the value of attribute grammemes.
- #language ⇒ Object (readonly)
  Returns the value of attribute language.
- #pos ⇒ Object
  Returns the value of attribute pos.
Instance Method Summary
- #<=>(other) ⇒ Object
- #==(other) ⇒ Object
- #[](key) ⇒ Symbol
  Retrieves the morphosyntactic descriptor corresponding to `key`.
- #[]=(key, value) ⇒ Symbol
  Assigns the morphosyntactic descriptor given by `value` to the key given by `key`.
- #initialize(language, msd = '') ⇒ MSD (constructor)
  Creates a new morphosyntactic descriptor model instance.
- #inspect ⇒ Object
- #merge!(hash) ⇒ MSD
  Merges grammemes that are stored in `hash` into the MSD grammemes.
- #to_regexp ⇒ Regexp
  Generates a Regexp from the MSD, which is useful for database queries.
- #to_s ⇒ Object
- #valid? ⇒ true, false
  Validates the MSD instance.
Constructor Details
#initialize(language, msd = '') ⇒ MSD
Creates a new morphosyntactic descriptor model instance. The `language` argument must be a module that defines `CATEGORIES`; otherwise an `ArgumentError` is raised.
Optionally, pass an MSD string as the `msd` argument to have it parsed.
    # File 'lib/myasorubka/msd.rb', line 63
    def initialize(language, msd = '')
      @language, @pos, @grammemes = language, nil, {}

      unless language.const_defined? 'CATEGORIES'
        raise ArgumentError,
          'given language has no morphosyntactic descriptions'
      end

      parse! msd if msd && !msd.empty?
    end
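The guard in the constructor relies on `Module#const_defined?`. The following is a minimal sketch of that check in isolation; the two modules `WithCategories` and `WithoutCategories` are hypothetical stand-ins and not part of the library.

```ruby
# Hypothetical language module that would pass the constructor's guard.
module WithCategories
  CATEGORIES = {}.freeze
end

# Hypothetical module that would make the constructor raise ArgumentError.
module WithoutCategories; end

[WithCategories, WithoutCategories].each do |language|
  # Same test the constructor performs before accepting a language.
  puts "#{language.name}: #{language.const_defined?('CATEGORIES')}"
end
# prints:
# WithCategories: true
# WithoutCategories: false
```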
Instance Attribute Details
#grammemes ⇒ Object (readonly)
Returns the value of attribute grammemes.
    # File 'lib/myasorubka/msd.rb', line 50
    def grammemes
      @grammemes
    end
#language ⇒ Object (readonly)
Returns the value of attribute language.
    # File 'lib/myasorubka/msd.rb', line 50
    def language
      @language
    end
#pos ⇒ Object
Returns the value of attribute pos.
    # File 'lib/myasorubka/msd.rb', line 51
    def pos
      @pos
    end
Instance Method Details
#<=>(other) ⇒ Object
    # File 'lib/myasorubka/msd.rb', line 104
    def <=> other
      to_s <=> other.to_s
    end
#==(other) ⇒ Object
    # File 'lib/myasorubka/msd.rb', line 109
    def == other
      to_s == other.to_s
    end
#[](key) ⇒ Symbol
Retrieves the morphosyntactic descriptor corresponding to `key`. Returns `nil` if no such descriptor is set.
    # File 'lib/myasorubka/msd.rb', line 80
    def [] key
      return pos if :pos == key
      grammemes[key]
    end
#[]=(key, value) ⇒ Symbol
Assigns the morphosyntactic descriptor given by `value` to the key given by `key`. The `:pos` key must be assigned before any other key; otherwise `InvalidDescriptor` is raised.
    # File 'lib/myasorubka/msd.rb', line 92
    def []= key, value
      return @pos = value if :pos == key
      raise InvalidDescriptor, 'category is not set yet' unless pos
      grammemes[key] = value
    end
#inspect ⇒ Object
    # File 'lib/myasorubka/msd.rb', line 99
    def inspect
      '#<%s msd=%s>' % [language.name, to_s.inspect]
    end
#merge!(hash) ⇒ MSD
Merges grammemes that are stored in `hash` into the MSD grammemes.
    # File 'lib/myasorubka/msd.rb', line 140
    def merge! hash
      hash.each do |key, value|
        self[key.to_sym] = value.to_sym
      end

      self
    end
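Because `merge!` delegates to `[]=`, keys and values are converted to symbols, and the `:pos` entry has to come first in the hash when the descriptor has no category yet. The sketch below illustrates this with `TinyMSD`, a hypothetical stand-in class that copies only `[]=` and `merge!` from the listings above; it is not the library class.

```ruby
# Hypothetical stand-in that mirrors only []= and merge! of Myasorubka::MSD.
class TinyMSD
  class InvalidDescriptor < StandardError; end

  attr_reader :pos, :grammemes

  def initialize
    @pos, @grammemes = nil, {}
  end

  def []=(key, value)
    return @pos = value if :pos == key
    raise InvalidDescriptor, 'category is not set yet' unless pos
    grammemes[key] = value
  end

  def merge!(hash)
    hash.each { |key, value| self[key.to_sym] = value.to_sym }
    self
  end
end

# String keys and values (for example, from a database row) become symbols.
msd = TinyMSD.new.merge!('pos' => 'noun', 'number' => 'plural')
puts msd.pos.inspect                  # prints :noun
puts msd.grammemes[:number].inspect   # prints :plural

# A hash that lists a grammeme before :pos fails on the first entry.
begin
  TinyMSD.new.merge!('number' => 'plural', 'pos' => 'noun')
rescue TinyMSD::InvalidDescriptor => e
  puts e.message                      # prints category is not set yet
end
```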
#to_regexp ⇒ Regexp
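Generates a Regexp from the MSD, which is useful for database queries. Every empty descriptor position (`-`) becomes the wildcard `.`, and any trailing characters are accepted.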
```ruby
msd = Myasorubka::MSD.new(Myasorubka::MSD::Russian, 'Vm')
r = msd.to_regexp # => /^Vm.*$/
'Vmp' =~ r # 0
'Nc-pl' =~ r # nil
```
    # File 'lib/myasorubka/msd.rb', line 125
    def to_regexp
      Regexp.new([
        '^',
        self.to_s.gsub(EMPTY_DESCRIPTOR, '.'),
        '.*',
        '$'
      ].join)
    end
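The effect of the empty descriptor on the generated pattern can be tried without the gem. This is a minimal sketch that applies the same expression to a plain string; `msd_to_regexp` is a hypothetical helper name, not a library method.

```ruby
# Placeholder for an unset attribute, same value as Myasorubka::MSD::EMPTY_DESCRIPTOR.
EMPTY_DESCRIPTOR = '-'

# Hypothetical standalone helper that mirrors MSD#to_regexp for a plain string.
def msd_to_regexp(msd_string)
  Regexp.new(['^', msd_string.gsub(EMPTY_DESCRIPTOR, '.'), '.*', '$'].join)
end

pattern = msd_to_regexp('Nc-pl')
puts pattern.source        # prints ^Nc.pl.*$
p('Ncmpl' =~ pattern)      # prints 0 (any gender matches the wildcard)
p('Ncmsn' =~ pattern)      # prints nil (number and case differ)
```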
#to_s ⇒ Object
    # File 'lib/myasorubka/msd.rb', line 149
    def to_s
      return '' unless pos

      unless category = language::CATEGORIES[pos]
        raise InvalidDescriptor, "category is nil"
      end

      msd = [category[:code]]

      attrs = category[:attrs]
      grammemes.each do |attr_name, value|
        next unless value

        attr_index = attrs.index { |name, *values| name == attr_name }

        unless attr_index
          raise InvalidDescriptor, 'no such attribute "%s" of category "%s"' % [attr_name, pos]
        end

        attr_name, values = attrs[attr_index]

        unless attr_value = values[value]
          raise InvalidDescriptor, 'no such attribute "%s" ' \
            'for attribute "%s" of category "%s"' % [value, attr_name, pos]
        end

        msd[attr_index + 1] = attr_value
      end

      msd.map { |e| e || EMPTY_DESCRIPTOR }.join
    end
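The listing above implies the shape of a language table: each category has a `:code` and an ordered `:attrs` list of `[name, {value => character}]` pairs, and each attribute's position in that list is its position in the string. The sketch below restates the encoding under those assumptions. `ToyLanguage` and `encode` are hypothetical, the attribute table is illustrative rather than the library's Russian table, and errors are plain `ArgumentError` instead of `InvalidDescriptor`.

```ruby
# Hypothetical language table in the shape that MSD#to_s reads.
module ToyLanguage
  CATEGORIES = {
    noun: {
      code: 'N',
      attrs: [
        [:type,   { common: 'c', proper: 'p' }],
        [:gender, { masculine: 'm', feminine: 'f', neuter: 'n' }],
        [:number, { singular: 's', plural: 'p' }],
        [:case,   { nominative: 'n', locative: 'l' }]
      ]
    }
  }.freeze
end

# Standalone restatement of the positional encoding performed by MSD#to_s.
def encode(language, pos, grammemes)
  category = language::CATEGORIES.fetch(pos)
  msd = [category[:code]]

  grammemes.each do |name, value|
    next unless value
    index = category[:attrs].index { |attr_name, _| attr_name == name }
    raise ArgumentError, "no such attribute #{name}" unless index
    char = category[:attrs][index][1][value]
    raise ArgumentError, "no such value #{value} for #{name}" unless char
    msd[index + 1] = char   # slot 0 holds the category code
  end

  # Slots that were skipped stay nil and are rendered as '-'.
  msd.map { |e| e || '-' }.join
end

puts encode(ToyLanguage, :noun, { type: :common, number: :plural, case: :locative })
# prints Nc-pl
```

Gender is never assigned in this call, so its slot is filled with the empty descriptor character, which is why the result reads `Nc-pl` rather than `Ncpl`.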
#valid? ⇒ true, false
Validates the MSD instance.
    # File 'lib/myasorubka/msd.rb', line 185
    def valid?
      !!to_s
    rescue InvalidDescriptor
      false
    end