Class: TextRank::TokenFilter::PartOfSpeech
- Inherits:
- Object
- Defined in:
- lib/text_rank/token_filter/part_of_speech.rb
Overview
Token filter to keep only a selected set of parts of speech.
Example:
PartOfSpeech.new(parts_to_keep: %w[nn nns]).filter!(%w[all men are by nature free])
# => ["men", "nature"]
Instance Method Summary
- #filter!(tokens) ⇒ Array<String>
  Perform the filter.
- #initialize(parts_to_keep: %w[nn nnp nnps nns jj jjr jjs vb vbd vbg vbn vbp vbz], **_) ⇒ PartOfSpeech (constructor)
  A new instance of PartOfSpeech.
Constructor Details
#initialize(parts_to_keep: %w[nn nnp nnps nns jj jjr jjs vb vbd vbg vbn vbp vbz], **_) ⇒ PartOfSpeech
Returns a new instance of PartOfSpeech.
# File 'lib/text_rank/token_filter/part_of_speech.rb', line 19
def initialize(parts_to_keep: %w[nn nnp nnps nns jj jjr jjs vb vbd vbg vbn vbp vbz], **_)
  @parts_to_keep = Set.new(parts_to_keep)
  @eng_tagger = EngTagger.new
  @last_pos_tag = 'pp'
end
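The constructor wraps the supplied tags in a Set and swallows any other keyword arguments through `**_`. The sketch below illustrates just that behavior in isolation. `build_parts_to_keep` is a hypothetical helper that is not part of TextRank, and the EngTagger setup is deliberately left out.

```ruby
require 'set'

# Hypothetical helper (not part of TextRank): mirrors how the constructor
# turns the parts_to_keep list into a Set and ignores extra keywords.
def build_parts_to_keep(parts_to_keep: %w[nn nnp nnps nns jj jjr jjs vb vbd vbg vbn vbp vbz], **_)
  Set.new(parts_to_keep)
end

# Duplicate tags collapse because the result is a Set.
nouns_only = build_parts_to_keep(parts_to_keep: %w[nn nns nn])

# Unknown keywords are accepted and discarded; the default tag list applies.
defaults = build_parts_to_keep(ignored_option: true)
```

Storing the tags in a Set means the order of `parts_to_keep` does not matter and repeated tags are harmless.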
Instance Method Details
#filter!(tokens) ⇒ Array<String>
Perform the filter: tokens whose part-of-speech tag is not in parts_to_keep are removed from the array in place.
# File 'lib/text_rank/token_filter/part_of_speech.rb', line 28
def filter!(tokens)
  tokens.keep_if do |token|
    @parts_to_keep.include?(pos_tag(token))
  end
end
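Because `filter!` relies on `Array#keep_if`, it mutates the array it is given and returns that same array. The following is a minimal sketch of that behavior, not the library's implementation: `StubPartOfSpeech` and `STUB_TAGS` are hypothetical, and the lookup table stands in for the real `pos_tag` method (not shown on this page), which uses EngTagger. The tags in the table are illustrative assumptions, not actual tagger output.

```ruby
require 'set'

# Hypothetical tag table standing in for EngTagger; these tags are
# assumptions chosen for illustration only.
STUB_TAGS = {
  'all' => 'det', 'men' => 'nns', 'are' => 'vbp',
  'by' => 'in', 'nature' => 'nn', 'free' => 'jj'
}.freeze

# Hypothetical stand-in class with the same filter! shape as the source.
class StubPartOfSpeech
  def initialize(parts_to_keep:)
    @parts_to_keep = Set.new(parts_to_keep)
  end

  # keep_if deletes non-matching tokens from the receiver and returns it.
  def filter!(tokens)
    tokens.keep_if { |token| @parts_to_keep.include?(pos_tag(token)) }
  end

  private

  # Stand-in for the real pos_tag; unknown words get a tag that never matches.
  def pos_tag(token)
    STUB_TAGS.fetch(token, 'unknown')
  end
end

tokens = %w[all men are by nature free]
result = StubPartOfSpeech.new(parts_to_keep: %w[nn nns]).filter!(tokens)
```

After the call, `tokens` and `result` refer to the same, already filtered array, so callers that need the original list should pass in a copy (for example `tokens.dup`).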