Method: Enumerable#chunk
Defined in: enum.c
#chunk {|array| ... } ⇒ Object
Each element in the returned enumerator is a 2-element array consisting of:
- A value returned by the block.
- An array (“chunk”) containing the element for which that value was returned, and all following elements for which the block returned the same value.
So that:
- Each block return value that is different from its predecessor begins a new chunk.
- Each block return value that is the same as its predecessor continues the same chunk.
Example:
e = (0..10).chunk {|i| (i / 3).floor } # => #<Enumerator: ...>
# The enumerator elements.
e.next # => [0, [0, 1, 2]]
e.next # => [1, [3, 4, 5]]
e.next # => [2, [6, 7, 8]]
e.next # => [3, [9, 10]]
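Chunks are formed only from adjacent elements, so a block value that reappears later starts a new chunk instead of rejoining an earlier one. A minimal sketch with a made-up array:

```ruby
# Equal block values that are not adjacent end up in separate chunks.
a = [1, 3, 2, 4, 5, 7]
chunks = a.chunk { |i| i.odd? }.to_a
p chunks
# => [[true, [1, 3]], [false, [2, 4]], [true, [5, 7]]]
```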
Method chunk is especially useful for an enumerable that is already sorted. This example counts words for each initial letter in a large array of words:
# Get sorted words from a web page.
url = 'https://raw.githubusercontent.com/eneko/data-repository/master/data/words.txt'
require 'open-uri' # Provides URI.open.
words = URI.open(url).readlines
# Make chunks, one for each letter.
e = words.chunk {|word| word.upcase[0] } # => #<Enumerator: ...>
# Display 'A' through 'F'.
e.each {|c, words| p [c, words.length]; break if c == 'F' }
Output:
["A", 17096]
["B", 11070]
["C", 19901]
["D", 10896]
["E", 8736]
["F", 6860]
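The same counting idea can be tried without a network connection. The sketch below uses a small, made-up word list (an assumption for illustration, not the data set above) and also shows what happens when the input is not sorted:

```ruby
# A tiny sorted word list standing in for the downloaded file.
sorted = %w[apple avocado banana blueberry cherry date]
counts = sorted.chunk { |word| word[0].upcase }
               .map { |letter, group| [letter, group.length] }
p counts # => [["A", 2], ["B", 2], ["C", 1], ["D", 1]]

# With unsorted input the same letter can appear in more than one chunk.
unsorted = %w[apple banana avocado]
split = unsorted.chunk { |word| word[0].upcase }
                .map { |letter, group| [letter, group.length] }
p split # => [["A", 1], ["B", 1], ["A", 1]]
```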
You can use the special symbol :_alone to force an element into its own separate chunk:
a = [0, 0, 1, 1]
e = a.chunk{|i| i.even? ? :_alone : true }
e.to_a # => [[:_alone, [0]], [:_alone, [0]], [true, [1, 1]]]
For example, you can put each line that contains a URL into its own chunk:
pattern = /http/
File.open(filename) { |f|
  f.chunk { |line| line =~ pattern ? :_alone : true }.each { |key, lines|
    pp lines
  }
}
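A self-contained variant of the URL example, assuming an in-memory array of lines (hypothetical sample text) in place of a file:

```ruby
# Sample lines; the URLs are placeholders chosen for illustration.
lines = [
  "intro\n",
  "see http://example.com/a\n",
  "see http://example.com/b\n",
  "outro\n",
  "the end\n",
]
pattern = /http/
chunks = lines.chunk { |line| line =~ pattern ? :_alone : true }.to_a
pp chunks
# Each URL line is alone in its chunk, even though the two are adjacent;
# the remaining lines are grouped as usual:
# => [[true, ["intro\n"]],
#     [:_alone, ["see http://example.com/a\n"]],
#     [:_alone, ["see http://example.com/b\n"]],
#     [true, ["outro\n", "the end\n"]]]
```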
You can use the special symbol :_separator or nil to force an element to be ignored (not included in any chunk):
a = [0, 0, -1, 1, 1]
e = a.chunk{|i| i < 0 ? :_separator : true }
e.to_a # => [[true, [0, 0]], [true, [1, 1]]]
Note that the separator does end the chunk:
a = [0, 0, -1, 1, -1, 1]
e = a.chunk{|i| i < 0 ? :_separator : true }
e.to_a # => [[true, [0, 0]], [true, [1]], [true, [1]]]
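As a quick check that nil and :_separator behave the same way, the sketch below runs the array above through both forms and compares the results:

```ruby
a = [0, 0, -1, 1, -1, 1]

# Separator expressed with the special symbol.
with_symbol = a.chunk { |i| i < 0 ? :_separator : true }.to_a

# Separator expressed with nil.
with_nil = a.chunk { |i| i < 0 ? nil : true }.to_a

p with_nil                # => [[true, [0, 0]], [true, [1]], [true, [1]]]
p with_symbol == with_nil # => true
```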
For example, the sequence of hyphens in svn log can be eliminated as follows:
sep = "-"*72 + "\n"
IO.popen("svn log README") { |f|
  f.chunk { |line|
    line != sep || nil
  }.each { |_, lines|
    pp lines
  }
}
#=> ["r20018 | knu | 2008-10-29 13:20:42 +0900 (Wed, 29 Oct 2008) | 2 lines\n",
# "\n",
# "* README, README.ja: Update the portability section.\n",
# "\n"]
# ["r16725 | knu | 2008-05-31 23:34:23 +0900 (Sat, 31 May 2008) | 2 lines\n",
# "\n",
# "* README, README.ja: Add a note about default C flags.\n",
# "\n"]
# ...
Paragraphs separated by empty lines can be parsed as follows:
File.foreach("README").chunk { |line|
  /\A\s*\z/ !~ line || nil
}.each { |_, lines|
  pp lines
}
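The `condition || nil` idiom in the two examples above yields true for lines to keep and nil (a separator) for lines to drop. A minimal sketch of the paragraph case, assuming an in-memory string (made-up sample text) instead of a README file:

```ruby
# Three paragraphs separated by one or more blank (whitespace-only) lines.
text = "first line\nsecond line\n\nthird line\n  \n\nfourth line\n"

paragraphs = text.each_line.chunk { |line|
  # true for a non-blank line; false || nil evaluates to nil for a blank one.
  /\A\s*\z/ !~ line || nil
}.map { |_, lines| lines }

pp paragraphs
# => [["first line\n", "second line\n"], ["third line\n"], ["fourth line\n"]]
```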
# File 'enum.c', line 3993
static VALUE
enum_chunk(VALUE enumerable)
{
    VALUE enumerator;

    RETURN_SIZED_ENUMERATOR(enumerable, 0, 0, enum_size);

    enumerator = rb_obj_alloc(rb_cEnumerator);
    rb_ivar_set(enumerator, id_chunk_enumerable, enumerable);
    rb_ivar_set(enumerator, id_chunk_categorize, rb_block_proc());
    rb_block_call(enumerator, idInitialize, 0, 0, chunk_i, enumerator);
    return enumerator;
}
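The C function above stores the receiver and the block on a new enumerator rather than iterating immediately. One observable consequence, sketched here in Ruby with a hypothetical counter variable `calls`, is that the block does not run until the enumerator is consumed:

```ruby
calls = 0
e = [1, 2, 3].chunk { |i| calls += 1; i.odd? }

# Creating the enumerator has not invoked the block yet.
calls_before = calls
p calls_before # => 0

# Consuming the enumerator calls the block once per element.
result = e.to_a
p result # => [[true, [1]], [false, [2]], [true, [3]]]
p calls  # => 3
```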