Class: Statsample::Multiset
Overview
Multiset joins multiple datasets that share the same fields and vectors but may differ in their number of cases. It is the base class for stratified and cluster sampling estimation.
Instance Attribute Summary
- #datasets ⇒ Object (readonly)
  Hash of Daru::DataFrame objects, keyed by dataset name.
- #fields ⇒ Object (readonly)
  Names of the fields.
Instance Method Summary
- #[](i) ⇒ Object
- #add_dataset(key, ds) ⇒ Object
- #collect_vector(field) ⇒ Object
- #datasets_names ⇒ Object
- #each(&block) ⇒ Object
- #each_vector(field) ⇒ Object
- #initialize(fields) ⇒ Multiset (constructor)
  Creates a multiset from its field names only, e.g. Multiset.new(%w{f1 f2 f3}).
- #n_datasets ⇒ Object
- #sum_field(field) ⇒ Object
- #union(&block) ⇒ Object
  Generates a new dataset as the union of the partial datasets. If a block is given, it is applied to each dataset before the union.
Constructor Details
#initialize(fields) ⇒ Multiset
To create a multiset, pass only the field names:
    Multiset.new(%w{f1 f2 f3}) # define only fields
    # File 'lib/statsample/multiset.rb', line 12
    def initialize(fields)
      @fields = fields
      @datasets = {}
    end
Instance Attribute Details
#datasets ⇒ Object (readonly)
Hash of Daru::DataFrame objects, keyed by dataset name.
    # File 'lib/statsample/multiset.rb', line 9
    def datasets
      @datasets
    end
#fields ⇒ Object (readonly)
Names of the fields.
    # File 'lib/statsample/multiset.rb', line 7
    def fields
      @fields
    end
Instance Method Details
#[](i) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 85
    def [](i)
      @datasets[i]
    end
#add_dataset(key, ds) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 62
    def add_dataset(key,ds)
      if ds.vectors.to_a != @fields
        raise ArgumentError, "Dataset(#{ds.vectors.to_a.to_s})must have the same fields of the Multiset(#{@fields})"
      else
        @datasets[key] = ds
      end
    end
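The field check in add_dataset can be illustrated without Daru. The following is a minimal sketch, not the library class: `MiniMultiset` is a hypothetical name, and each dataset is assumed to be a plain Hash of field name to Array, so `ds.keys` stands in for `ds.vectors.to_a`. It shows that the comparison is order-sensitive, because two Arrays are compared.

```ruby
# Hypothetical stand-in for Statsample::Multiset. Datasets are plain Hashes
# (field name => Array of values) instead of Daru::DataFrame objects.
class MiniMultiset
  attr_reader :fields, :datasets

  def initialize(fields)
    @fields = fields
    @datasets = {}
  end

  # Mirrors the add_dataset check: the dataset's field list must equal
  # the multiset's field list, including the order of the names.
  def add_dataset(key, ds)
    unless ds.keys == @fields
      raise ArgumentError,
            "Dataset(#{ds.keys}) must have the same fields as the Multiset(#{@fields})"
    end
    @datasets[key] = ds
  end
end

ms = MiniMultiset.new(%w[f1 f2])
ms.add_dataset("stratum_a", { "f1" => [1, 2], "f2" => [3, 4] })

# Same field names in a different order are rejected.
rejected = begin
  ms.add_dataset("stratum_b", { "f2" => [5], "f1" => [6] })
  false
rescue ArgumentError
  true
end
```

In this sketch a rejected dataset is never stored, so `ms.datasets` still holds only `"stratum_a"` afterwards.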
#collect_vector(field) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 77
    def collect_vector(field)
      @datasets.collect { |k,v| yield k, v[field] }
    end
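As a rough illustration of what collect_vector returns, the sketch below maps each dataset to one block result and gathers the results in an Array. It assumes plain Hashes in place of Daru::DataFrame objects, and `collect_vector_sketch` is a hypothetical helper, not part of Statsample.

```ruby
# Assumed stand-in data: dataset name => { field name => Array of values }.
datasets = {
  "a" => { "x" => [1, 2, 3] },
  "b" => { "x" => [10, 20] }
}

# Hypothetical helper mirroring collect_vector: yields each dataset's key
# and the requested field, and collects whatever the block returns.
def collect_vector_sketch(datasets, field)
  datasets.collect { |key, ds| yield key, ds[field] }
end

# One [name, mean] pair per dataset.
means = collect_vector_sketch(datasets, "x") do |key, values|
  [key, values.sum.to_f / values.size]
end
```

Here `means` has one entry per dataset, in insertion order of the Hash.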
#datasets_names ⇒ Object
    # File 'lib/statsample/multiset.rb', line 54
    def datasets_names
      @datasets.keys.sort
    end
#each(&block) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 89
    def each(&block)
      @datasets.each {|k,ds|
        next if ds.nrows == 0
        block.call(k,ds)
      }
    end
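Note that each skips datasets with zero rows. A minimal sketch of that behavior follows, assuming plain Hashes in place of Daru::DataFrame objects; `each_non_empty` is a hypothetical helper, and the row count is taken from the first field's Array rather than from `nrows`.

```ruby
# Assumed stand-in data: dataset name => { field name => Array of values }.
datasets = {
  "north" => { "f1" => [1, 2, 3] },
  "empty" => { "f1" => [] },
  "south" => { "f1" => [4] }
}

# Hypothetical helper mirroring each: datasets without rows are skipped
# and never reach the block.
def each_non_empty(datasets)
  datasets.each do |key, ds|
    n_rows = ds.values.first.to_a.size # stand-in for ds.nrows
    next if n_rows.zero?
    yield key, ds
  end
end

visited = []
each_non_empty(datasets) { |key, _ds| visited << key }
```

Because union iterates through each, empty datasets also contribute nothing to a union.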
#each_vector(field) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 81
    def each_vector(field)
      @datasets.each { |k,v| yield k, v[field] }
    end
#n_datasets ⇒ Object
    # File 'lib/statsample/multiset.rb', line 58
    def n_datasets
      @datasets.size
    end
#sum_field(field) ⇒ Object
    # File 'lib/statsample/multiset.rb', line 69
    def sum_field(field)
      @datasets.inject(0) {|a,da|
        stratum_name = da[0]
        vector = da[1][field]
        val = yield stratum_name,vector
        a + val
      }
    end
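The accumulation in sum_field can be sketched as follows: the block returns one number per dataset, and those numbers are added starting from 0. This is an illustration under assumptions, with plain Hashes standing in for Daru::DataFrame objects and `sum_field_sketch` as a hypothetical helper.

```ruby
# Assumed stand-in data: stratum name => { field name => Array of values }.
strata = {
  "a" => { "age" => [20, 30] },
  "b" => { "age" => [40, 50, 60] }
}

# Hypothetical helper mirroring sum_field: yields the stratum name and the
# requested field, then adds the block's return value to the running total.
def sum_field_sketch(datasets, field)
  datasets.inject(0) do |acc, (stratum_name, ds)|
    acc + yield(stratum_name, ds[field])
  end
end

# Total number of cases across strata.
total_cases = sum_field_sketch(strata, "age") { |_name, values| values.size }

# Total of the field itself across strata.
total_age = sum_field_sketch(strata, "age") { |_name, values| values.sum }
```

The block decides what is summed, so the same helper serves for counts, totals, or weighted terms.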
#union(&block) ⇒ Object
Generates a new dataset as the union of the partial datasets. If a block is given, it is applied to each dataset before the union.
    # File 'lib/statsample/multiset.rb', line 26
    def union(&block)
      union_field = {}
      types = {}
      names = {}
      labels = {}
      each do |k,ds|
        if block
          ds = ds.dup
          yield k,ds
        end
        @fields.each do |f|
          union_field[f] ||= Array.new
          union_field[f].concat(ds[f].to_a)
          types[f]  ||= ds[f].type
          names[f]  ||= ds[f].name
          labels[f] ||= ds[f].index.to_a
        end
      end

      @fields.each do |f|
        union_field[f] = Daru::Vector.new(union_field[f])
        union_field[f].rename names[f]
      end

      ds_union = Daru::DataFrame.new(union_field, order: @fields)
      ds_union
    end
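The core of union is a per-field concatenation across datasets, with the optional block applied to a copy of each dataset first. The sketch below illustrates that idea only; it assumes plain Hashes in place of Daru::DataFrame objects, returns a Hash rather than a DataFrame, and `union_sketch` is a hypothetical helper.

```ruby
# Assumed stand-in data: dataset name => { field name => Array of values }.
fields = %w[id score]
datasets = {
  "a" => { "id" => [1, 2], "score" => [10, 20] },
  "b" => { "id" => [3],    "score" => [30] }
}

# Hypothetical helper mirroring union: concatenates each field's values
# across datasets. If a block is given, it receives a copy of each dataset
# and may modify it before the values are appended.
def union_sketch(datasets, fields)
  merged = fields.each_with_object({}) { |f, h| h[f] = [] }
  datasets.each do |key, ds|
    next if ds.values.first.to_a.empty? # empty datasets are skipped, as in each
    if block_given?
      ds = ds.transform_values(&:dup) # copy, so the original stays untouched
      yield key, ds
    end
    fields.each { |f| merged[f].concat(ds[f]) }
  end
  merged
end

plain = union_sketch(datasets, fields)

# Double every score in the copy before the union.
doubled = union_sketch(datasets, fields) do |_key, ds|
  ds["score"].map! { |s| s * 2 }
end
```

Since the block works on a copy, the source datasets keep their original values after the union.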