Class: Spacy::Matcher

Inherits:
Object
Defined in:
lib/ruby-spacy.rb

Overview

See also the spaCy Python API documentation for [`Matcher`](spacy.io/api/matcher).

Instance Attribute Summary

Instance Method Summary

Constructor Details

#initialize(nlp) ⇒ Matcher

Creates a Spacy::Matcher instance.

Parameters:

  • nlp

    a language pipeline object whose `vocab` is passed to the Python `Matcher`

# File 'lib/ruby-spacy.rb', line 469

def initialize(nlp)
  @py_matcher = PyMatcher.call(nlp.vocab)
end

Instance Attribute Details

#py_matcher ⇒ Object (readonly)

Returns a Python `Matcher` instance accessible via `PyCall`.

Returns:

  • (Object)

    a Python `Matcher` instance accessible via `PyCall`



# File 'lib/ruby-spacy.rb', line 465

def py_matcher
  @py_matcher
end

Instance Method Details

#add(text, pattern) ⇒ Object

Adds a pattern to the matcher under the given label.

Parameters:

  • text (String)

    a label string given to the pattern

  • pattern (Array<Array<Hash>>)

    one or more token patterns, each an array of hashes; the patterns are alternatives to one another



# File 'lib/ruby-spacy.rb', line 476

def add(text, pattern)
  @py_matcher.add(text, pattern)
end
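
To make the `Array<Array<Hash>>` shape of `pattern` concrete, here is a minimal sketch in plain Ruby. The label "HelloWorld", the token attributes `LOWER` and `IS_PUNCT` (names taken from spaCy's token-pattern syntax), and the helper `pattern_shape_ok?` are illustrative assumptions and not part of this class. The `add` call itself is left as a comment because it needs ruby-spacy and a language model to be installed.

```ruby
# One token pattern: an array of hashes, one hash per token to match.
hello_world = [{ LOWER: "hello" }, { IS_PUNCT: true }, { LOWER: "world" }]

# A second, alternative pattern without the punctuation token.
hello_world_plain = [{ LOWER: "hello" }, { LOWER: "world" }]

# The `pattern` argument wraps the alternatives in an outer array.
pattern = [hello_world, hello_world_plain]

# Hypothetical helper: checks only the Array<Array<Hash>> shape,
# not whether the attribute names are valid for spaCy.
def pattern_shape_ok?(pattern)
  pattern.is_a?(Array) &&
    pattern.all? { |alt| alt.is_a?(Array) && alt.all? { |tok| tok.is_a?(Hash) } }
end

puts pattern_shape_ok?(pattern)

# With a Spacy::Matcher instance named `matcher` (assumed to exist):
# matcher.add("HelloWorld", pattern)
```

Passing a single flat array of hashes (without the outer array) would not fit the documented type, which is the mistake the shape check is meant to catch.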

#match(doc) ⇒ Array<Hash{:match_id => Integer, :start_index => Integer, :end_index => Integer}>

Executes the match on the given document.

Parameters:

  • doc (Doc)

    a Doc instance

Returns:

  • (Array<Hash{:match_id => Integer, :start_index => Integer, :end_index => Integer}>)

    for each match, the ID of the matched pattern, the index of the first matched token, and the index of the last matched token (inclusive)
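
Because the method body subtracts 1 from the end value it parses, `end_index` names the last matched token rather than one past it. A minimal sketch of consuming the result with an inclusive range follows. It uses a plain array of strings as a stand-in for a tokenized document; the tokens, the match values, and the helper `matched_tokens` are made up for illustration.

```ruby
# Stand-in for a tokenized document; a real Doc would come from ruby-spacy.
tokens = ["Hello", ",", "world", "!", "Hello", "world", "!"]

# Hypothetical results in the shape returned by #match.
matches = [
  { match_id: 1, start_index: 0, end_index: 2 },
  { match_id: 1, start_index: 4, end_index: 5 }
]

# Both indices are inclusive, so an inclusive range (..) selects the span.
def matched_tokens(tokens, match)
  tokens[match[:start_index]..match[:end_index]]
end

matches.each { |m| puts matched_tokens(tokens, m).join(" ") }
# prints "Hello , world" and then "Hello world"
```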



# File 'lib/ruby-spacy.rb', line 483

def match(doc)
  str_results = @py_matcher.call(doc.py_doc).to_s
  s = StringScanner.new(str_results[1..-2])
  results = []
  while s.scan_until(/(\d+), (\d+), (\d+)/)
    next unless s.matched

    triple = s.matched.split(", ")
    match_id = triple[0].to_i
    start_index = triple[1].to_i
    end_index = triple[2].to_i - 1
    results << { match_id: match_id, start_index: start_index, end_index: end_index }
  end
  results
end
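
The string-scanning step in `match` can be tried in isolation, without Python or spaCy. The sketch below assumes the Python result stringifies as a list of integer triples such as `[(42, 7, 8)]`. The sample strings and the method name `parse_match_triples` are hypothetical. It also reads the regex capture groups directly instead of splitting the matched text, which is a small variation on the listing above rather than the library's own code.

```ruby
require "strscan"

# Parses a stringified list of (match_id, start, end) triples into hashes.
# The end value is decremented so that end_index is inclusive.
def parse_match_triples(str_results)
  # Drop the surrounding "[" and "]" before scanning.
  scanner = StringScanner.new(str_results[1..-2])
  results = []
  while scanner.scan_until(/(\d+), (\d+), (\d+)/)
    results << {
      match_id: scanner[1].to_i,
      start_index: scanner[2].to_i,
      end_index: scanner[3].to_i - 1
    }
  end
  results
end

sample = "[(16571425990740197027, 3, 5), (42, 7, 8)]"
p parse_match_triples(sample)
```

With this sample, the first hash has `start_index` 3 and `end_index` 4, and an empty list (`"[]"`) yields an empty array.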