Class: Scanf::FormatSpecifier
- Inherits: Object
- Inheritance chain: Object, Scanf::FormatSpecifier
- Defined in: lib/scanf.rb
Overview
Technical notes
Rationale behind scanf for Ruby
The impetus for a scanf implementation in Ruby comes chiefly from the fact that existing pattern-matching operations, such as Regexp#match and String#scan, return all results as strings. Those strings have to be converted explicitly whenever what is ultimately wanted are integer or float values.
Design of scanf for Ruby
scanf for Ruby is essentially a <format string>-to-<regular expression> converter.
When scanf is called, a FormatString object is generated from the format string (“%d%s…”) argument. The FormatString object breaks the format string down into atoms (“%d”, “%5f”, “blah”, etc.), and from each atom it creates a FormatSpecifier object, which it saves.
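The splitting step can be pictured with a short sketch. The pattern that FormatString actually uses is not shown in this section, so the regular expression and the method name below are assumptions made for illustration only; in particular, this simplified version discards whitespace between atoms.

```ruby
# ATOM_SKETCH and sketch_atoms are hypothetical names, not part of lib/scanf.rb.
# First alternative: a conversion such as %d, %5f, %*s, %[a-z] or %%.
# Second alternative: a run of literal, non-whitespace characters.
ATOM_SKETCH = /%\*?\d*(?:\[[^\]]*\]|[a-zA-Z%])|[^%\s]+/

def sketch_atoms(format)
  format.scan(ATOM_SKETCH)   # no capture groups, so scan returns whole matches
end

sketch_atoms("%d%5f blah")   # => ["%d", "%5f", "blah"]
```

Each element of the resulting array would then be handed to FormatSpecifier.new.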
Each FormatSpecifier has a regular expression fragment and a “handler” associated with it. For example, the regular expression fragment associated with the format “%d” is “([-+]?\d+)”, and the handler associated with it is a wrapper around String#to_i. scanf itself calls FormatString#match, passing in the input string. FormatString#match iterates through its FormatSpecifiers; for each one, it matches the corresponding regular expression fragment against the string. If there’s a match, it sends the matched string to the handler associated with the FormatSpecifier.
Thus, to follow up the “%d” example: if “123” occurs in the input string when a FormatSpecifier consisting of “%d” is reached, the “123” will be matched against “([-+]?\d+)”, and the matched string will be rendered into an integer by a call to to_i.
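The “%d” example can be reproduced with plain Regexp calls. This is a minimal sketch, not the library code: the constant and method names are hypothetical, and the handler is reduced to a bare to_i call as the text describes.

```ruby
# DECIMAL_FRAGMENT and scan_decimal are hypothetical names for illustration.
# The fragment is anchored with \A, as FormatSpecifier#initialize does.
DECIMAL_FRAGMENT = '\A([-+]?\d+)'

def scan_decimal(input)
  m = Regexp.new(DECIMAL_FRAGMENT).match(input)
  m && m[1].to_i   # handler step: render the matched text as an Integer
end

scan_decimal("123 apples")   # => 123
scan_decimal("-42")          # => -42
scan_decimal("apples")       # => nil
```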
The rendered match is then saved to an accumulator array, and the input string is reduced to the post-match substring. Thus the string is “eaten” from the left as the FormatSpecifiers are applied in sequence. (This is done to a duplicate string; the original string is not altered.)
As soon as a regular expression fragment fails to match the string, or when the FormatString object runs out of FormatSpecifiers, scanning stops and results accumulated so far are returned in an array.
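The accumulate-and-consume loop described above might look roughly like the following. It is a simplified sketch under stated assumptions: the table of specifiers and the method name are hypothetical stand-ins for FormatString and FormatSpecifier, and only “%d” and “%s” are supported.

```ruby
# SKETCH_SPECS and sketch_scan are hypothetical, simplified stand-ins.
# Each entry pairs an anchored regular expression with a conversion lambda.
SKETCH_SPECS = {
  "%d" => [/\A([-+]?\d+)/, ->(s) { s.to_i }],
  "%s" => [/\A(\S+)/,      ->(s) { s }],
}.freeze

def sketch_scan(input, atoms)
  rest = input.dup                 # work on a copy; the caller's string is untouched
  results = []
  atoms.each do |atom|
    re, handler = SKETCH_SPECS.fetch(atom)
    rest = rest.sub(/\A\s+/, '')   # skip leading whitespace before matching
    m = re.match(rest)
    break unless m                 # the first failed match stops the scan
    results << handler.call(m[1])  # save the converted value
    rest = m.post_match            # "eat" the matched prefix
  end
  results
end

sketch_scan("12 apples 7", %w[%d %s %d])   # => [12, "apples", 7]
sketch_scan("12 apples x", %w[%d %s %d])   # => [12, "apples"]
```

The second call shows the partial-result behaviour: the final “%d” fails on “x”, so only the first two conversions are returned.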
Instance Attribute Summary
- #conversion ⇒ Object (readonly): Returns the value of attribute conversion.
- #matched ⇒ Object (readonly): Returns the value of attribute matched.
- #matched_string ⇒ Object (readonly): Returns the value of attribute matched_string.
- #re_string ⇒ Object (readonly): Returns the value of attribute re_string.
Instance Method Summary
- #count_space? ⇒ Boolean
- #initialize(str) ⇒ FormatSpecifier (constructor): A new instance of FormatSpecifier.
- #letter ⇒ Object
- #match(str) ⇒ Object
- #mid_match? ⇒ Boolean
- #to_re ⇒ Object
- #to_s ⇒ Object
- #width ⇒ Object
Constructor Details
#initialize(str) ⇒ FormatSpecifier
Returns a new instance of FormatSpecifier.
    # File 'lib/scanf.rb', line 331
    def initialize(str)
      @spec_string = str
      h = '[A-Fa-f0-9]'

      @re_string, @handler =
        case @spec_string

        # %[[:...:]]
        when /%\*?(\[\[:[a-z]+:\]\])/
          [ "(#{$1}+)", :extract_plain ]

        # %5[[:...:]]
        when /%\*?(\d+)(\[\[:[a-z]+:\]\])/
          [ "(#{$2}{1,#{$1}})", :extract_plain ]

        # %[...]
        when /%\*?\[([^\]]*)\]/
          yes = $1
          if /^\^/.match(yes) then no = yes[1..-1] else no = '^' + yes end
          [ "([#{yes}]+)(?=[#{no}]|\\z)", :extract_plain ]

        # %5[...]
        when /%\*?(\d+)\[([^\]]*)\]/
          yes = $2
          w = $1
          [ "([#{yes}]{1,#{w}})", :extract_plain ]

        # %i
        when /%\*?i/
          [ "([-+]?(?:(?:0[0-7]+)|(?:0[Xx]#{h}+)|(?:[1-9]\\d*)))", :extract_integer ]

        # %5i
        when /%\*?(\d+)i/
          n = $1.to_i
          s = "("
          if n > 1 then s += "[1-9]\\d{1,#{n-1}}|" end
          if n > 1 then s += "0[0-7]{1,#{n-1}}|" end
          if n > 2 then s += "[-+]0[0-7]{1,#{n-2}}|" end
          if n > 2 then s += "[-+][1-9]\\d{1,#{n-2}}|" end
          if n > 2 then s += "0[Xx]#{h}{1,#{n-2}}|" end
          if n > 3 then s += "[-+]0[Xx]#{h}{1,#{n-3}}|" end
          s += "\\d"
          s += ")"
          [ s, :extract_integer ]

        # %d, %u
        when /%\*?[du]/
          [ '([-+]?\d+)', :extract_decimal ]

        # %5d, %5u
        when /%\*?(\d+)[du]/
          n = $1.to_i
          s = "("
          if n > 1 then s += "[-+]\\d{1,#{n-1}}|" end
          s += "\\d{1,#{$1}})"
          [ s, :extract_decimal ]

        # %x
        when /%\*?[Xx]/
          [ "([-+]?(?:0[Xx])?#{h}+)", :extract_hex ]

        # %5x
        when /%\*?(\d+)[Xx]/
          n = $1.to_i
          s = "("
          if n > 3 then s += "[-+]0[Xx]#{h}{1,#{n-3}}|" end
          if n > 2 then s += "0[Xx]#{h}{1,#{n-2}}|" end
          if n > 1 then s += "[-+]#{h}{1,#{n-1}}|" end
          s += "#{h}{1,#{n}}"
          s += ")"
          [ s, :extract_hex ]

        # %o
        when /%\*?o/
          [ '([-+]?[0-7]+)', :extract_octal ]

        # %5o
        when /%\*?(\d+)o/
          [ "([-+][0-7]{1,#{$1.to_i-1}}|[0-7]{1,#{$1}})", :extract_octal ]

        # %f
        when /%\*?[aefgAEFG]/
          [ '([-+]?(?:0[xX](?:\.\h+|\h+(?:\.\h*)?)[pP][-+]\d+|\d+(?![\d.])|\d*\.\d*(?:[eE][-+]?\d+)?))', :extract_float ]

        # %5f
        when /%\*?(\d+)[aefgAEFG]/
          [ '(?=[-+]?(?:0[xX](?:\.\h+|\h+(?:\.\h*)?)[pP][-+]\d+|\d+(?![\d.])|\d*\.\d*(?:[eE][-+]?\d+)?))' +
            "(\\S{1,#{$1}})", :extract_float ]

        # %5s
        when /%\*?(\d+)s/
          [ "(\\S{1,#{$1}})", :extract_plain ]

        # %s
        when /%\*?s/
          [ '(\S+)', :extract_plain ]

        # %c
        when /\s%\*?c/
          [ "\\s*(.)", :extract_plain ]

        # %c
        when /%\*?c/
          [ "(.)", :extract_plain ]

        # %5c (whitespace issues are handled by the count_*_space? methods)
        when /%\*?(\d+)c/
          [ "(.{1,#{$1}})", :extract_plain ]

        # %%
        when /%%/
          [ '(\s*%)', :nil_proc ]

        # literal characters
        else
          [ "(#{Regexp.escape(@spec_string)})", :nil_proc ]
        end

      @re_string = '\A' + @re_string
    end
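To make one branch of the constructor concrete, the sketch below rebuilds the fragment generated for a width-limited decimal such as “%5d”. The method name is hypothetical; the string-building logic follows the “%5d, %5u” branch above, with the \A anchor added as the last line of #initialize does.

```ruby
# sketch_width_decimal_re is a hypothetical helper, not part of lib/scanf.rb.
def sketch_width_decimal_re(width)
  n = width.to_i
  s = "("
  s += "[-+]\\d{1,#{n - 1}}|" if n > 1   # a sign uses up one column of the width
  s += "\\d{1,#{n}})"
  Regexp.new('\A' + s)
end

re = sketch_width_decimal_re(5)
re.source                  # => "\\A([-+]\\d{1,4}|\\d{1,5})"
re.match("-12345")[1]      # => "-1234"
re.match("1234567")[1]     # => "12345"
```

The first match shows why the signed alternative is listed separately: the sign counts toward the field width, so only four digits follow it.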
Instance Attribute Details
#conversion ⇒ Object (readonly)
Returns the value of attribute conversion.
    # File 'lib/scanf.rb', line 289
    def conversion
      @conversion
    end
#matched ⇒ Object (readonly)
Returns the value of attribute matched.
    # File 'lib/scanf.rb', line 289
    def matched
      @matched
    end
#matched_string ⇒ Object (readonly)
Returns the value of attribute matched_string.
    # File 'lib/scanf.rb', line 289
    def matched_string
      @matched_string
    end
#re_string ⇒ Object (readonly)
Returns the value of attribute re_string.
    # File 'lib/scanf.rb', line 289
    def re_string
      @re_string
    end
Instance Method Details
#count_space? ⇒ Boolean
    # File 'lib/scanf.rb', line 327
    def count_space?
      /(?:\A|\S)%\*?\d*c|%\d*\[/.match(@spec_string)
    end
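A quick way to see which specifiers treat leading whitespace as significant is to apply the same pattern to a few format atoms. The constant and method names below are hypothetical; the regular expression is copied from #count_space?, and the result is coerced to a Boolean for readability.

```ruby
# COUNT_SPACE_RE and sketch_count_space? are hypothetical names.
COUNT_SPACE_RE = /(?:\A|\S)%\*?\d*c|%\d*\[/

def sketch_count_space?(spec)
  !!COUNT_SPACE_RE.match(spec)
end

sketch_count_space?("%c")       # => true
sketch_count_space?("%5c")      # => true
sketch_count_space?("%[a-z]")   # => true
sketch_count_space?(" %c")      # => false (whitespace precedes the %)
sketch_count_space?("%d")       # => false
```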
#letter ⇒ Object
    # File 'lib/scanf.rb', line 469
    def letter
      @spec_string[/%\*?\d*([a-z\[])/, 1]
    end
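The extraction performed by #letter can be tried on standalone strings. The helper name below is hypothetical; the pattern is the one shown above. Note that the character class is lowercase-only, so an uppercase conversion letter yields nil.

```ruby
# sketch_letter is a hypothetical standalone version of #letter.
def sketch_letter(spec)
  spec[/%\*?\d*([a-z\[])/, 1]   # String#[] with a capture index returns group 1 or nil
end

sketch_letter("%5d")      # => "d"
sketch_letter("%*s")      # => "s"
sketch_letter("%[a-z]")   # => "["
sketch_letter("%X")       # => nil
sketch_letter("abc")      # => nil
```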
#match(str) ⇒ Object
    # File 'lib/scanf.rb', line 456
    def match(str)
      @matched = false
      s = str.dup
      s.sub!(/\A\s+/,'') unless count_space?
      res = to_re.match(s)
      if res
        @conversion = send(@handler, res[1])
        @matched_string = @conversion.to_s
        @matched = true
      end
      res
    end
#mid_match? ⇒ Boolean
    # File 'lib/scanf.rb', line 478
    def mid_match?
      return false unless @matched
      cc_no_width    = letter == '[' && !width
      c_or_cc_width  = (letter == 'c' || letter == '[') && width
      width_left     = c_or_cc_width && (matched_string.size < width)

      return width_left || cc_no_width
    end
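The source does not document the intent of #mid_match?; reading the code, it appears to report whether a “%c” or character-class specifier could still consume more input after a successful match. The standalone sketch below follows the same logic under that reading. The method name and its two-argument form are assumptions, and a successful match is presumed (the real method returns false when @matched is not set).

```ruby
# sketch_mid_match? is a hypothetical standalone version of #mid_match?.
# It assumes the specifier has already matched, yielding matched_string.
def sketch_mid_match?(spec, matched_string)
  letter = spec[/%\*?\d*([a-z\[])/, 1]
  w = spec[/%\*?(\d+)/, 1]
  width = w && w.to_i

  cc_no_width   = letter == '[' && !width
  c_or_cc_width = (letter == 'c' || letter == '[') && width
  width_left    = c_or_cc_width && (matched_string.size < width)
  !!(width_left || cc_no_width)
end

sketch_mid_match?("%[a-z]", "abc")   # => true  (character class without a width)
sketch_mid_match?("%5c", "abc")      # => true  (3 of 5 characters consumed)
sketch_mid_match?("%5c", "abcde")    # => false (width exhausted)
sketch_mid_match?("%d", "12")        # => false (not a %c or %[ specifier)
```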
#to_re ⇒ Object
    # File 'lib/scanf.rb', line 452
    def to_re
      Regexp.new(@re_string,Regexp::MULTILINE)
    end
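The Regexp::MULTILINE flag matters for fragments that use “.”, such as the one generated for “%c”: in Ruby this flag lets “.” match a newline. A small sketch (the helper name is hypothetical) shows the difference:

```ruby
# sketch_to_re is a hypothetical standalone version of #to_re.
def sketch_to_re(re_string)
  Regexp.new(re_string, Regexp::MULTILINE)
end

# '\A(.)' is the anchored fragment that #initialize builds for "%c".
sketch_to_re('\A(.)').match("\nabc")[1]   # => "\n"
Regexp.new('\A(.)').match("\nabc")        # => nil (without the flag, "." skips newlines)
```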
#to_s ⇒ Object
    # File 'lib/scanf.rb', line 323
    def to_s
      @spec_string
    end
#width ⇒ Object
    # File 'lib/scanf.rb', line 473
    def width
      w = @spec_string[/%\*?(\d+)/, 1]
      w && w.to_i
    end
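As with #letter, the width extraction can be exercised on its own. The helper name is hypothetical; the body is the same as the method above, applied to an argument instead of @spec_string.

```ruby
# sketch_width is a hypothetical standalone version of #width.
def sketch_width(spec)
  w = spec[/%\*?(\d+)/, 1]
  w && w.to_i   # nil when the specifier has no explicit width
end

sketch_width("%5d")     # => 5
sketch_width("%*12s")   # => 12
sketch_width("%d")      # => nil
```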