Class: Transformers::Distilbert::DistilBertTokenizer::BasicTokenizer
- Inherits: Object
  - Object
  - Transformers::Distilbert::DistilBertTokenizer::BasicTokenizer
- Defined in: lib/transformers/models/distilbert/tokenization_distilbert.rb
Instance Attribute Summary
- #do_lower_case ⇒ Object (readonly)
  Returns the value of attribute do_lower_case.
- #strip_accents ⇒ Object (readonly)
  Returns the value of attribute strip_accents.
- #tokenize_chinese_chars ⇒ Object (readonly)
  Returns the value of attribute tokenize_chinese_chars.
Instance Method Summary
- #initialize(do_lower_case: true, never_split: nil, tokenize_chinese_chars: true, strip_accents: nil, do_split_on_punc: true) ⇒ BasicTokenizer (constructor)
  A new instance of BasicTokenizer.
Constructor Details
#initialize(do_lower_case: true, never_split: nil, tokenize_chinese_chars: true, strip_accents: nil, do_split_on_punc: true) ⇒ BasicTokenizer
Returns a new instance of BasicTokenizer.
# File 'lib/transformers/models/distilbert/tokenization_distilbert.rb', line 26

def initialize(
  do_lower_case: true,
  never_split: nil,
  tokenize_chinese_chars: true,
  strip_accents: nil,
  do_split_on_punc: true
)
  if never_split.nil?
    never_split = []
  end
  @do_lower_case = do_lower_case
  @never_split = Set.new(never_split)
  @tokenize_chinese_chars = tokenize_chinese_chars
  @strip_accents = strip_accents
  @do_split_on_punc = do_split_on_punc
end
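The constructor's handling of defaults can be sketched in isolation. The class below, `SketchBasicTokenizer`, is a hypothetical stand-in (not part of the library) that copies the constructor body shown above, so the snippet runs without the gem installed. It illustrates two points: a `nil` value for `never_split` becomes an empty `Set`, and duplicate entries collapse because the list is stored as a `Set`. Reading `@never_split` through `instance_variable_get` is for inspection only; the documented class exposes no reader for it.

```ruby
require "set"

# Hypothetical stand-in mirroring the documented constructor.
# It is NOT the library class; it exists only to make the sketch runnable.
class SketchBasicTokenizer
  attr_reader :do_lower_case, :strip_accents, :tokenize_chinese_chars

  def initialize(
    do_lower_case: true,
    never_split: nil,
    tokenize_chinese_chars: true,
    strip_accents: nil,
    do_split_on_punc: true
  )
    # A missing list is normalized to an empty one before conversion.
    never_split = [] if never_split.nil?
    @do_lower_case = do_lower_case
    @never_split = Set.new(never_split)
    @tokenize_chinese_chars = tokenize_chinese_chars
    @strip_accents = strip_accents
    @do_split_on_punc = do_split_on_punc
  end
end

# All keyword arguments are optional.
defaults = SketchBasicTokenizer.new

# Override a few options; the duplicate "[CLS]" is stored only once.
custom = SketchBasicTokenizer.new(
  do_lower_case: false,
  never_split: ["[CLS]", "[CLS]", "[SEP]"],
  strip_accents: true
)

# Inspection only: the documented class has no public reader for this.
default_never_split = defaults.instance_variable_get(:@never_split)
custom_never_split = custom.instance_variable_get(:@never_split)
```

With the real class, the equivalent call would presumably be `Transformers::Distilbert::DistilBertTokenizer::BasicTokenizer.new(do_lower_case: false)`, assuming the gem has been loaded.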
Instance Attribute Details
#do_lower_case ⇒ Object (readonly)
Returns the value of attribute do_lower_case.
# File 'lib/transformers/models/distilbert/tokenization_distilbert.rb', line 24

def do_lower_case
  @do_lower_case
end
#strip_accents ⇒ Object (readonly)
Returns the value of attribute strip_accents.
# File 'lib/transformers/models/distilbert/tokenization_distilbert.rb', line 24

def strip_accents
  @strip_accents
end
#tokenize_chinese_chars ⇒ Object (readonly)
Returns the value of attribute tokenize_chinese_chars.
# File 'lib/transformers/models/distilbert/tokenization_distilbert.rb', line 24

def tokenize_chinese_chars
  @tokenize_chinese_chars
end
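All three attributes are marked readonly, which in Ruby usually means they are declared with `attr_reader` and have no setter. The sketch below uses a hypothetical stand-in class, `SketchTokenizerAttrs` (not part of the library), to show what that implies for callers: the value can be read, but assigning to it raises `NoMethodError`. To change an option, construct a new instance.

```ruby
# Hypothetical stand-in with the same three read-only attributes.
# It is NOT the library class; it only illustrates attr_reader behavior.
class SketchTokenizerAttrs
  attr_reader :do_lower_case, :strip_accents, :tokenize_chinese_chars

  def initialize(do_lower_case: true, tokenize_chinese_chars: true, strip_accents: nil)
    @do_lower_case = do_lower_case
    @tokenize_chinese_chars = tokenize_chinese_chars
    @strip_accents = strip_accents
  end
end

tok = SketchTokenizerAttrs.new(strip_accents: true)

# A reader exists, but no writer was defined.
readable = tok.respond_to?(:strip_accents)
writable = tok.respond_to?(:strip_accents=)

# Attempting assignment fails because there is no setter method.
setter_error = nil
begin
  tok.strip_accents = false
rescue NoMethodError => e
  setter_error = e.class
end

# To use a different setting, build a new object instead of mutating.
replacement = SketchTokenizerAttrs.new(strip_accents: false)
```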