Class: Core::Type::Instances
- Inherits: Object
  - Object
    - Core::Type::Instances
- Defined in: lib/ruby-band/core/type/instances.rb
Overview
Description: This is the main class from the Weka package for data handling. It is essentially a matrix: each row is an instance of the ‘Instance’ class, while each column is an instance of the ‘Attribute’ class. Here the ‘Instances’ class is extended with custom functionality.
Direct Known Subclasses
Defined Under Namespace
Classes: Base
Instance Method Summary
- #add_instance(instance) ⇒ Object
  Inserts an Instance object (one row) into the current Instances object.
  Args: instance -> an array of values of the correct data type (:nominal, :numeric, etc.).
- #add_nominal_attribute(attribute, list_values) ⇒ Object
  Inserts an Attribute object into the current Instances object.
  Args: attribute -> a name for the new attribute; list_values -> a Ruby Array with the nominal values.
  WARNING: this method only creates an empty attribute field.
- #add_numeric_attribute(attribute_name) ⇒ Object
  Inserts an Attribute object into the current Instances object.
  Args: attribute_name -> a name for the new attribute.
  WARNING: this method only creates an empty attribute field.
- #att(attr_type, name, *values) ⇒ Object
  Used to define attributes in uninitialized Instances-derived classes.
- #check_array(data) ⇒ Object
  Check function: intended to verify that the array is bidimensional and that all rows have the same length.
- #check_numeric_instance ⇒ Object
  Checks that all attributes of this Instances object are Numeric.
- #date(name) ⇒ Object
  Used to define Date attributes in uninitialized Instances-derived classes.
  Args: name -> attribute name, a String.
- #dim ⇒ Object
  Prints the dimensions of the dataset (for the current Instances object).
- #each_column ⇒ Object
- #each_column_with_index ⇒ Object
- #each_row ⇒ Object
- #each_row_with_index ⇒ Object
- #mean(attribute_name) ⇒ Object
  Returns the mean value of a single attribute (a column of the Instances object).
  Args: attribute_name -> a String, the name of the attribute.
- #merge_with(instances) ⇒ Object
  Merges two sets of Instances together.
- #n_col ⇒ Object
  Returns the number of columns (Attribute objects) in the dataset.
- #n_rows ⇒ Object
  Returns the number of rows (Instance objects) in the dataset.
- #nominal(name, values) ⇒ Object
  Used to define Nominal attributes in uninitialized Instances-derived classes.
  Args: name -> attribute name, a String; values -> an array of values for the nominal attribute.
- #numeric(name) ⇒ Object
  Used to define Numeric attributes in uninitialized Instances-derived classes.
  Args: name -> attribute name, a String.
- #populate_by_row(data) ⇒ Object
  Inserts an entire dataset ‘by row’ into the current Instances object, i.e. one Instance object is inserted at a time.
- #return_attr_data(att) ⇒ Object
  Returns the data for a single attribute (a column of the Instances object).
  Args: att -> a String, the name of the attribute.
- #string(name) ⇒ Object
  Used to define String attributes in uninitialized Instances-derived classes.
  Args: name -> attribute name, a String.
- #summary ⇒ Object
  Lists the attributes of the Instances object (with the corresponding types).
- #to_a2d ⇒ Object
  Converts an Instances object to a bidimensional Ruby array where each row corresponds to an Instance object.
- #to_Apache_matrix ⇒ Object
  Converts the present Instances object to an Apache matrix if every attribute is Numeric.
- #to_Apache_matrix_block ⇒ Object
  Converts the present Instances object to an Apache matrix (block) if every attribute is Numeric.
- #to_ARFF(out_file) ⇒ Object
  Writes the content of the current Instances object to a .arff file.
  Args: out_file -> a String, the name of the output file.
- #to_CSV(out_file) ⇒ Object
  Writes the content of the current Instances object to a .csv file.
  Args: out_file -> a String, the name of the output file.
- #to_json_format ⇒ Object
  Returns a JSON String for the current Instances object. The output is modeled on the ‘datatable’ format of the Google Charts APIs. More details at: ‘developers.google.com/chart/interactive/docs/reference#DataTable’.
- #variance(attribute_name) ⇒ Object
  Returns the variance of a single attribute (a column of the Instances object).
  Args: attribute_name -> a String, the name of the attribute.
Instance Method Details
#add_instance(instance) ⇒ Object
Inserts an Instance object (one row) into the current Instances object.
Args:
- instance -> an array of values of the correct data type (:nominal, :numeric, etc.)

    # File 'lib/ruby-band/core/type/instances.rb', line 199
    def add_instance(instance)
      data_ref = Array.new
      # Convert each value according to the attribute at the same position
      instance.each_with_index do |attribute, idx|
        data_ref << insert_attribute(attribute, idx)
      end
      double_array = data_ref.to_java :double
      single_row = Instance.new(1.0, double_array)
      self.add(single_row)
    end
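The body of `insert_attribute` is not shown on this page, so the following is only a plain-Ruby sketch of the kind of conversion `add_instance` relies on. It assumes that, as in Weka, every cell ends up as a double and that a nominal value is stored as the index of that value in the attribute's list. `AttributeSpec` and `encode_row` are hypothetical names used for illustration; they are not part of ruby-band.

```ruby
# Hypothetical stand-in for attribute metadata (not a ruby-band class).
AttributeSpec = Struct.new(:name, :type, :values)

# Encode one row of Ruby values into an array of Floats, one per attribute.
def encode_row(row, specs)
  unless row.size == specs.size
    raise ArgumentError, "expected #{specs.size} values, got #{row.size}"
  end

  row.each_with_index.map do |value, idx|
    spec = specs[idx]
    case spec.type
    when :numeric
      Float(value)
    when :nominal
      # Assumption: a nominal value is represented by its position in the list
      pos = spec.values.index(value)
      raise ArgumentError, "'#{value}' is not a valid value for '#{spec.name}'" if pos.nil?
      pos.to_f
    else
      raise ArgumentError, "unsupported attribute type: #{spec.type}"
    end
  end
end

specs = [
  AttributeSpec.new('height', :numeric, nil),
  AttributeSpec.new('colour', :nominal, %w[red green blue])
]
encoded = encode_row([1.82, 'green'], specs)
```

Rejecting unknown nominal values early is a design choice of this sketch; the real helper may behave differently.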
#add_nominal_attribute(attribute, list_values) ⇒ Object
Inserts an Attribute object into the current Instances object.
Args:
- attribute -> a name for the new attribute
- list_values -> a Ruby Array with the nominal values
WARNING: this method only creates an empty attribute field.

    # File 'lib/ruby-band/core/type/instances.rb', line 224
    def add_nominal_attribute(attribute, list_values)
      values = FastVector.new
      list_values.each do |val|
        values.addElement(val)
      end
      # Append the new attribute after the last existing column
      insertAttributeAt(Attribute.new(attribute, values), self.numAttributes)
    end
#add_numeric_attribute(attribute_name) ⇒ Object
Inserts an Attribute object into the current Instances object.
Args:
- attribute_name -> a name for the new attribute
WARNING: this method only creates an empty attribute field.

    # File 'lib/ruby-band/core/type/instances.rb', line 214
    def add_numeric_attribute(attribute_name)
      insertAttributeAt(Attribute.new(attribute_name), self.numAttributes)
    end
#att(attr_type, name, *values) ⇒ Object
Used to define attributes in uninitialized Instances-derived classes.

    # File 'lib/ruby-band/core/type/instances.rb', line 283
    def att(attr_type, name, *values)
      att = Core::Type.create_numeric_attr(name.to_java(:string)) if attr_type == :numeric
      att = Core::Type.create_nominal_attr(name.to_java(:string), values[0]) if attr_type == :nominal
      att = Core::Type.create_date_attr(name.to_java(:string), values[0]) if attr_type == :date
      att = Core::Type.create_string_attr(name.to_java(:string)) if attr_type == :string
      @positions << att
    end
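The `att` method and its wrappers (`numeric`, `nominal`, `string`, `date`) follow a small declarative pattern: each call appends one attribute definition to `@positions`. How ruby-band evaluates these calls inside a derived class is not shown here, so the sketch below only imitates the pattern in plain Ruby. `DatasetTemplate` is a hypothetical class, and it stores plain hashes where the real code stores Weka attribute objects.

```ruby
# Hypothetical plain-Ruby imitation of the attribute-definition pattern.
class DatasetTemplate
  attr_reader :positions

  def initialize(&block)
    @positions = []
    # Evaluate the block with self as receiver so that `numeric 'x'` works
    instance_eval(&block) if block
  end

  # Generic definition: record type, name and the optional value list
  def att(attr_type, name, *values)
    @positions << { type: attr_type, name: name, values: values[0] }
  end

  def numeric(name)
    att :numeric, name
  end

  def nominal(name, values)
    att :nominal, name, values
  end

  def string(name)
    att :string, name
  end

  def date(name)
    att :date, name
  end
end

template = DatasetTemplate.new do
  numeric 'age'
  nominal 'class', %w[yes no]
  string  'comment'
end
```

The order of the calls determines the column order, since each definition is appended to the end of `@positions`.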
#check_array(data) ⇒ Object
Check function: intended to verify that the array is bidimensional and that all rows have the same length. As the source shows, the check is not implemented yet and the method always returns true.

    # File 'lib/ruby-band/core/type/instances.rb', line 180
    def check_array(data)
      return true # still to be done
    end
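Since the check itself is still a placeholder, here is a minimal plain-Ruby sketch of what it could look like, based only on the description (bidimensional array, rows of equal length). The name `bidimensional_with_equal_rows?` is hypothetical, and treating an empty array as invalid is an assumption of this sketch rather than documented behaviour.

```ruby
# Returns true when data is an Array of Arrays whose rows all have the same length.
def bidimensional_with_equal_rows?(data)
  return false unless data.is_a?(Array) && !data.empty?
  return false unless data.all? { |row| row.is_a?(Array) }

  # A single distinct row length means the table is rectangular
  data.map(&:size).uniq.size == 1
end

bidimensional_with_equal_rows?([[1, 2], [3, 4]])  # rectangular table
bidimensional_with_equal_rows?([[1, 2], [3]])     # ragged rows
```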
#check_numeric_instance ⇒ Object
Checks that all attributes of this Instances object are Numeric; an ArgumentError is raised for the first attribute that is not.

    # File 'lib/ruby-band/core/type/instances.rb', line 77
    def check_numeric_instance
      enumerateAttributes.each do |att|
        unless att.isNumeric
          raise ArgumentError, "Sorry, attribute '#{att.name}' is not numeric!"
        end
      end
    end
#date(name) ⇒ Object
Used to define Date attributes in uninitialized Instances-derived classes.
Args:
- name -> attribute name, a String

    # File 'lib/ruby-band/core/type/instances.rb', line 309
    def date(name)
      att :date, name
    end
#dim ⇒ Object
Prints the dimensions of the dataset (number of rows and number of columns of the current Instances object) to STDOUT.

    # File 'lib/ruby-band/core/type/instances.rb', line 56
    def dim
      puts "Rows number:\t#{numInstances}\nColumns number:\t #{numAttributes}"
    end
#each_column ⇒ Object
Yields each attribute (column) to the given block.

    # File 'lib/ruby-band/core/type/instances.rb', line 68
    def each_column
      enumerate_attributes.each { |attribute| yield(attribute) }
    end

#each_column_with_index ⇒ Object
Yields each attribute (column) together with its index.

    # File 'lib/ruby-band/core/type/instances.rb', line 72
    def each_column_with_index
      enumerate_attributes.each_with_index { |attribute, id| yield(attribute, id) }
    end

#each_row ⇒ Object
Yields each instance (row) to the given block.

    # File 'lib/ruby-band/core/type/instances.rb', line 60
    def each_row
      enumerate_instances.each { |inst| yield(inst) }
    end

#each_row_with_index ⇒ Object
Yields each instance (row) together with its index.

    # File 'lib/ruby-band/core/type/instances.rb', line 64
    def each_row_with_index
      enumerate_instances.each_with_index { |inst, id| yield(inst, id) }
    end
#mean(attribute_name) ⇒ Object
Returns the mean value of a single attribute (a column of the Instances object).
Args:
- attribute_name -> a String, the name of the attribute

    # File 'lib/ruby-band/core/type/instances.rb', line 124
    def mean(attribute_name)
      # Sum the attribute's value over all rows, then divide by the row count
      sum = enumerateInstances.inject(0) do |s, x|
        s + x.value(attribute(attribute_name))
      end
      return sum / (numInstances * 1.0)
    end
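The same computation can be tried without Weka on a plain array of rows. This is only an illustrative sketch: `column_mean` is a hypothetical helper that takes a column index where the real method takes an attribute name, and raising on an empty dataset is a choice made here, not documented behaviour.

```ruby
# Mean of one column in an array-of-rows table.
def column_mean(rows, col)
  raise ArgumentError, 'dataset has no rows' if rows.empty?

  # Divide by a Float so that integer columns do not truncate
  rows.sum { |row| row[col] } / rows.size.to_f
end

rows = [[1, 10], [2, 20], [3, 30]]
column_mean(rows, 1) # => 20.0
```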
#merge_with(instances) ⇒ Object
Merges two sets of Instances together. The resulting set will have all the attributes of the first set plus all the attributes of the second set. The number of instances in both sets must be the same.
Args:
- instances -> an Instances class object

    # File 'lib/ruby-band/core/type/instances.rb', line 271
    def merge_with(instances)
      return Instances.mergeInstances(self, instances)
    end
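To make the column-wise nature of this merge concrete, the sketch below performs the equivalent operation on plain arrays of rows. `merge_columns` is a hypothetical helper, not a ruby-band or Weka method, and the error it raises for mismatched row counts is an assumption of the sketch.

```ruby
# Merge two tables side by side: row i of the result is row i of `left`
# followed by row i of `right`.
def merge_columns(left, right)
  unless left.size == right.size
    raise ArgumentError, "row counts differ: #{left.size} vs #{right.size}"
  end

  left.zip(right).map { |l, r| l + r }
end

measurements = [[1.0, 2.0], [3.0, 4.0]]
labels       = [['a'], ['b']]
merge_columns(measurements, labels) # => [[1.0, 2.0, "a"], [3.0, 4.0, "b"]]
```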
#n_col ⇒ Object
Returns the number of columns (Attribute objects) in the dataset.

    # File 'lib/ruby-band/core/type/instances.rb', line 51
    def n_col
      return numAttributes
    end
#n_rows ⇒ Object
Returns the number of rows (Instance objects) in the dataset.

    # File 'lib/ruby-band/core/type/instances.rb', line 46
    def n_rows
      return numInstances
    end
#nominal(name, values) ⇒ Object
Used to define Nominal attributes in uninitialized Instances-derived classes.
Args:
- name -> attribute name, a String
- values -> an array of values for the nominal attribute

    # File 'lib/ruby-band/core/type/instances.rb', line 295
    def nominal(name, values)
      att :nominal, name, values
    end
#numeric(name) ⇒ Object
Used to define Numeric attributes in uninitialized Instances-derived classes.
Args:
- name -> attribute name, a String

    # File 'lib/ruby-band/core/type/instances.rb', line 302
    def numeric(name)
      att :numeric, name
    end
#populate_by_row(data) ⇒ Object
Inserts an entire dataset ‘by row’ into the current Instances object, i.e. one Instance object is inserted at a time.
Args:
- data -> a bidimensional array

    # File 'lib/ruby-band/core/type/instances.rb', line 188
    def populate_by_row(data)
      unless check_array(data) == false
        data.each do |row|
          add_instance(row)
        end
      end
    end
#return_attr_data(att) ⇒ Object
Returns the data for a single attribute (a column of the Instances object).
Args:
- att -> a String, the name of the attribute

    # File 'lib/ruby-band/core/type/instances.rb', line 106
    def return_attr_data(att)
      attr_values = Array.new
      if attribute(att).isNumeric
        # Numeric attributes are read as numbers
        enumerateInstances.each do |i|
          attr_values << i.value(attribute(att))
        end
      else
        # Other attribute types are read as strings
        attr_index = attribute(att).index
        enumerateInstances.each do |inst|
          attr_values << inst.string_value(attr_index)
        end
      end
      return attr_values
    end
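As a rough plain-Ruby analogue, extracting a column by name from a table with a header row could look like the sketch below. `column_values` is a hypothetical helper; unlike the method above it does not distinguish numeric from string attributes, it simply returns the stored values.

```ruby
# Return every value of the column called `name`.
def column_values(header, rows, name)
  idx = header.index(name)
  raise ArgumentError, "unknown attribute '#{name}'" if idx.nil?

  rows.map { |row| row[idx] }
end

header = %w[age class]
rows   = [[31.0, 'yes'], [45.0, 'no']]
column_values(header, rows, 'class') # => ["yes", "no"]
```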
#string(name) ⇒ Object
Used to define String attributes in uninitialized Instances-derived classes.
Args:
- name -> attribute name, a String

    # File 'lib/ruby-band/core/type/instances.rb', line 316
    def string(name)
      att :string, name
    end
#summary ⇒ Object
Lists the attributes of the Instances object (with the corresponding types). As the source shows, the summary table and the row count are collected in an array, which is the method's return value.

    # File 'lib/ruby-band/core/type/instances.rb', line 233
    def summary
      summary = Ruport::Data::Table::new
      summary.add_column 'Attributes'
      enumerateAttributes.each_with_index do |att, idx|
        summary.add_column idx
      end

      att_names = ['Names']
      enumerateAttributes.each do |att|
        att_names << "'#{att.name}'"
      end
      summary << att_names

      att_types = ['Types']
      enumerateAttributes.each do |att|
        att_types << "Numeric" if att.isNumeric
        att_types << "Nominal" if att.isNominal
        att_types << "Date" if att.isDate
        att_types << "String" if att.isString
      end
      summary << att_types

      display = []
      display << summary

      unless enumerate_instances.nil?
        count = 0
        enumerateInstances.each { |inst| count = count + 1 }
        display << "\nNumber of rows: #{count}"
      end
      display
    end
#to_a2d ⇒ Object
Converts an Instances object to a bidimensional Ruby array where each row corresponds to an Instance object.

    # File 'lib/ruby-band/core/type/instances.rb', line 26
    def to_a2d
      matrix = Array.new
      att = Array.new
      # Collect the values column by column, then transpose to get rows
      self.enumerateAttributes.each_with_index do |a, idx|
        if a.isNumeric
          enumerate_instances.each { |s| att << s.value(s.attribute(idx)) }
          matrix << att
          att = Array.new
        else
          enumerateInstances.each do |inst|
            att << inst.string_value(idx)
          end
          matrix << att
          att = Array.new
        end
      end
      return matrix.transpose
    end
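The method gathers values one column at a time and relies on `Array#transpose` to turn the list of columns into a list of rows. The plain-Ruby sketch below isolates that last step; `columns_to_rows` is a hypothetical name used only for this illustration.

```ruby
# Turn a list of equally long columns into a list of rows.
# Array#transpose raises IndexError if the columns differ in length.
def columns_to_rows(columns)
  columns.transpose
end

columns = [
  [1.0, 2.0, 3.0],  # a numeric attribute
  %w[a b c]         # a nominal attribute, read as strings
]
columns_to_rows(columns) # => [[1.0, "a"], [2.0, "b"], [3.0, "c"]]
```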
#to_Apache_matrix ⇒ Object
Converts the present Instances object to an Apache matrix if every attribute is Numeric.

    # File 'lib/ruby-band/core/type/instances.rb', line 87
    def to_Apache_matrix
      check_numeric_instance
      ruby_array = to_a
      java_double_array = Core::Utils::bidimensional_to_double(ruby_array)
      return Core::Type::Apache_matrix.new(java_double_array)
    end
#to_Apache_matrix_block ⇒ Object
Converts the present Instances object to an Apache matrix (block) if every attribute is Numeric.

    # File 'lib/ruby-band/core/type/instances.rb', line 96
    def to_Apache_matrix_block
      check_numeric_instance
      ruby_array = to_a
      java_double_array = Core::Utils::bidimensional_to_double(ruby_array)
      return Core::Type::Apache_matrix_block.new(java_double_array)
    end
#to_ARFF(out_file) ⇒ Object
Writes the content of the current Instances object to a .arff file.
Args:
- out_file -> a String, the name of the output file

    # File 'lib/ruby-band/core/type/instances.rb', line 154
    def to_ARFF(out_file)
      saver = ArffSaver.new
      saver.setInstances(self)
      out_file = File.new out_file
      saver.setFile(out_file)
      saver.writeBatch
    end
#to_CSV(out_file) ⇒ Object
Writes the content of the current Instances object to a .csv file.
Args:
- out_file -> a String, the name of the output file

    # File 'lib/ruby-band/core/type/instances.rb', line 143
    def to_CSV(out_file)
      saver = CSVSaver.new
      saver.setInstances(self)
      out_file = File.new out_file
      saver.setFile(out_file)
      saver.writeBatch
    end
#to_json_format ⇒ Object
Returns a JSON String for the current Instances object. The output is modeled on the ‘datatable’ format of the Google Charts APIs. More details at: ‘developers.google.com/chart/interactive/docs/reference#DataTable’.

    # File 'lib/ruby-band/core/type/instances.rb', line 334
    def to_json_format
      dataset_hash = Hash.new
      dataset_hash[:cols] = enumerateAttributes.collect { |attribute| attribute.name }
      dataset_hash[:rows] = enumerateInstances.collect { |instance| instance.toString }
      return JSON.pretty_generate(dataset_hash)
    end
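The shape of the resulting JSON can be previewed with the standard `json` library alone. In this sketch `dataset_to_json` is a hypothetical helper, and each row is rendered as a comma-separated string on the assumption that this is roughly what `instance.toString` produces in Weka.

```ruby
require 'json'

# Build a JSON string with a list of column names and one string per row.
def dataset_to_json(col_names, rows)
  dataset_hash = {
    cols: col_names,
    # Assumption: each row is serialized as comma-separated values
    rows: rows.map { |row| row.join(',') }
  }
  JSON.pretty_generate(dataset_hash)
end

puts dataset_to_json(%w[age class], [[31.0, 'yes'], [45.0, 'no']])
```

Note that this is a flat structure: `cols` holds plain names and `rows` holds plain strings, so the string would likely need further processing before being handed to a Google Charts DataTable.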
#variance(attribute_name) ⇒ Object
Returns the variance of a single attribute (a column of the Instances object).
Args:
- attribute_name -> a String, the name of the attribute

    # File 'lib/ruby-band/core/type/instances.rb', line 134
    def variance(attribute_name)
      # Find the attribute's index by name, then ask for the variance at that index
      enumerateAttributes.each_with_index do |att, idx|
        return variance(idx) if att.name == attribute_name
      end
    end
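The numeric work is delegated to the index-based `variance` call, whose body is not shown on this page. As a point of reference, the sketch below computes a variance on a plain array of rows. It assumes the sample variance (division by n - 1); `column_variance` is a hypothetical helper, not part of ruby-band.

```ruby
# Sample variance of one column in an array-of-rows table.
def column_variance(rows, col)
  n = rows.size
  raise ArgumentError, 'need at least two rows' if n < 2

  values = rows.map { |row| row[col].to_f }
  mean = values.sum / n
  # Sum of squared deviations from the mean, divided by n - 1
  values.sum { |v| (v - mean)**2 } / (n - 1)
end

column_variance([[1], [2], [3]], 0) # => 1.0
```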