Method: Enumerable#chunk
- Defined in:
- enum.c
#chunk {|elt| ... } ⇒ Object #chunk(initial_state) {|elt, state| ... } ⇒ Object
Enumerates over the items, chunking them together based on the return value of the block.
Consecutive elements which return the same block value are chunked together.
For example, consecutive even numbers and odd numbers can be chunked as follows.
[3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5].chunk { |n|
n.even?
}.each { |even, ary|
p [even, ary]
}
#=> [false, [3, 1]]
# [true, [4]]
# [false, [1, 5, 9]]
# [true, [2, 6]]
# [false, [5, 3, 5]]
This method is especially useful for sorted series of elements. The following example counts words for each initial letter.
open("/usr/share/dict/words", "r:iso-8859-1") { |f|
f.chunk { |line| line.ord }.each { |ch, lines| p [ch.chr, lines.length] }
}
#=> ["\n", 1]
# ["A", 1327]
# ["B", 1372]
# ["C", 1507]
# ["D", 791]
# ...
The following key values have special meaning:
-
niland:_separatorspecifies that the elements should be dropped. -
:_alonespecifies that the element should be chunked by itself.
Any other symbols that begin with an underscore will raise an error:
items.chunk { |item| :_underscore }
#=> RuntimeError: symbols beginning with an underscore are reserved
nil and :_separator can be used to ignore some elements.
For example, the sequence of hyphens in svn log can be eliminated as follows:
sep = "-"*72 + "\n"
IO.popen("svn log README") { |f|
f.chunk { |line|
line != sep || nil
}.each { |_, lines|
pp lines
}
}
#=> ["r20018 | knu | 2008-10-29 13:20:42 +0900 (Wed, 29 Oct 2008) | 2 lines\n",
# "\n",
# "* README, README.ja: Update the portability section.\n",
# "\n"]
# ["r16725 | knu | 2008-05-31 23:34:23 +0900 (Sat, 31 May 2008) | 2 lines\n",
# "\n",
# "* README, README.ja: Add a note about default C flags.\n",
# "\n"]
# ...
Paragraphs separated by empty lines can be parsed as follows:
File.foreach("README").chunk { |line|
/\A\s*\z/ !~ line || nil
}.each { |_, lines|
pp lines
}
:_alone can be used to force items into their own chunk. For example, you can put lines that contain a URL by themselves, and chunk the rest of the lines together, like this:
pattern = /http/
open(filename) { |f|
f.chunk { |line| line =~ pattern ? :_alone : true }.each { |key, lines|
pp lines
}
}
If the block needs to maintain state over multiple elements, an initial_state argument can be used. If a non-nil value is given, a reference to it is passed as the 2nd argument of the block for the chunk method, so state-changes to it persist across block calls.
2541 2542 2543 2544 2545 2546 2547 2548 2549 2550 2551 2552 2553 2554 2555 2556 2557 |
# File 'enum.c', line 2541
static VALUE
enum_chunk(int argc, VALUE *argv, VALUE enumerable)
{
VALUE initial_state;
VALUE enumerator;
if (!rb_block_given_p())
rb_raise(rb_eArgError, "no block given");
rb_scan_args(argc, argv, "01", &initial_state);
enumerator = rb_obj_alloc(rb_cEnumerator);
rb_ivar_set(enumerator, rb_intern("chunk_enumerable"), enumerable);
rb_ivar_set(enumerator, rb_intern("chunk_categorize"), rb_block_proc());
rb_ivar_set(enumerator, rb_intern("chunk_initial_state"), initial_state);
rb_block_call(enumerator, idInitialize, 0, 0, chunk_i, enumerator);
return enumerator;
}
|