Module: RedAmber::DataFrameReshaping
- Included in: DataFrame
- Defined in: lib/red_amber/data_frame_reshaping.rb
Overview
Mix-in for the DataFrame class. It adds the reshaping methods #to_long, #to_wide, and #transpose.
Instance Method Summary
- #to_long(*keep_keys, name: :NAME, value: :VALUE) ⇒ DataFrame
  Create a ‘long’ (possibly tidy) DataFrame from a ‘wide’ DataFrame.
- #to_wide(name: :NAME, value: :VALUE) ⇒ DataFrame
  Create a ‘wide’ (possibly messy) DataFrame from a ‘long’ DataFrame.
- #transpose(key: keys.first, name: :NAME) ⇒ DataFrame
  Create a transposed DataFrame from a wide (possibly messy) DataFrame.
Instance Method Details
#to_long(*keep_keys, name: :NAME, value: :VALUE) ⇒ DataFrame
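Create a ‘long’ (possibly tidy) DataFrame from a ‘wide’ DataFrame. The columns named in keep_keys are kept and repeated once for every unpivoted column. Each remaining column is split into two columns: name holds the original column key as a string, and value holds the cell value. Raises DataFrameArgumentError if keep_keys contains a key that the DataFrame does not have, or if it contains name or value.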
# File 'lib/red_amber/data_frame_reshaping.rb', line 172

def to_long(*keep_keys, name: :NAME, value: :VALUE)
  warn('[Info] No key to keep is specified.') if keep_keys.empty?

  not_included = keep_keys - keys
  unless not_included.empty?
    raise DataFrameArgumentError, "Not have keys #{not_included}"
  end

  name = name.to_sym
  if keep_keys.include?(name)
    raise DataFrameArgumentError,
          "Can't specify the key: #{name} for the column from keys."
  end

  value = value.to_sym
  if keep_keys.include?(value)
    raise DataFrameArgumentError,
          "Can't specify the key: #{value} for the column from values."
  end

  hash = Hash.new { |h, k| h[k] = [] }
  l = keys.size - keep_keys.size
  each_row do |row|
    row.each do |k, v|
      if keep_keys.include?(k)
        hash[k].concat([v] * l)
      else
        hash[name] << k
        hash[value] << v
      end
    end
  end
  hash[name] = hash[name].map { |x| x&.to_s }
  DataFrame.new(hash)
end
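The wide-to-long loop in the listing can be tried without RedAmber installed. The following is a minimal sketch on plain Ruby Hashes, not the library's implementation: `wide_to_long` is a hypothetical helper, the rows are made-up sample data, and the result is an Array of row Hashes rather than a DataFrame.

```ruby
# Hypothetical helper (not part of RedAmber): unpivot an Array of row Hashes.
# Columns in keep_keys are repeated; every other column becomes one
# (name, value) pair per row, with the original key stored as a String.
def wide_to_long(rows, keep_keys, name: :NAME, value: :VALUE)
  rows.flat_map do |row|
    kept = row.slice(*keep_keys)
    spread = row.reject { |k, _| keep_keys.include?(k) }
    spread.map { |k, v| kept.merge({ name => k.to_s, value => v }) }
  end
end

# Made-up sample data for illustration only.
WIDE_ROWS = [
  { city: 'Tokyo', y2020: 10, y2021: 12 },
  { city: 'Osaka', y2020: 7,  y2021: 9 }
].freeze

LONG_ROWS = wide_to_long(WIDE_ROWS, [:city], name: :year, value: :count)
LONG_ROWS.each { |row| puts row.inspect }
```

Each input row yields one output row per unpivoted column, so two rows with two year columns give four long rows. This matches the `l = keys.size - keep_keys.size` repetition count in the listing.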
#to_wide(name: :NAME, value: :VALUE) ⇒ DataFrame
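Create a ‘wide’ (possibly messy) DataFrame from a ‘long’ DataFrame. The name column supplies the new column keys and the value column supplies their values. All other columns are kept, and each distinct combination of their values becomes one output row. Raises DataFrameArgumentError if the DataFrame has no column called name or value.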
# File 'lib/red_amber/data_frame_reshaping.rb', line 253

def to_wide(name: :NAME, value: :VALUE)
  name = name.to_sym
  unless keys.include?(name)
    raise DataFrameArgumentError,
          "You are going to keep the key: #{name}. " \
          'You may need to specify the column name ' \
          'that gives the new keys by `:name` option.'
  end

  value = value.to_sym
  unless keys.include?(value)
    raise DataFrameArgumentError,
          "You are going to keep the key: #{value}. " \
          'You may need to specify the column name ' \
          'that gives the new values by `:value` option.'
  end

  hash = Hash.new { |h, k| h[k] = {} }
  keep_keys = keys - [name, value]
  each_row do |row|
    keeps, converts = row.partition { |k, _| keep_keys.include?(k) }
    h = converts.to_h
    hash[keeps.to_h][h[name].to_s.to_sym] = h[value]
  end
  ks = hash.first[0].keys + hash.first[1].keys
  vs = hash.map { |k, v| k.values + v.values }.transpose
  DataFrame.new(ks.zip(vs))
end
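The grouping step in the listing, where rows that share the same kept values are collected under one Hash key, can be sketched in plain Ruby. This is an illustration only, not RedAmber's code: `long_to_wide` is a hypothetical helper, the data is made up, and it returns an Array of row Hashes instead of a DataFrame.

```ruby
# Hypothetical helper (not part of RedAmber): pivot long rows back to wide.
# Rows are grouped by their kept columns (a Hash used as a Hash key);
# within each group, the name column supplies new keys and the value
# column supplies their values.
def long_to_wide(rows, name: :NAME, value: :VALUE)
  groups = Hash.new { |h, k| h[k] = {} }
  rows.each do |row|
    kept = row.reject { |k, _| [name, value].include?(k) }
    groups[kept][row[name].to_s.to_sym] = row[value]
  end
  groups.map { |kept, spread| kept.merge(spread) }
end

# Made-up sample data for illustration only.
LONG_SAMPLE = [
  { city: 'Tokyo', year: 'y2020', count: 10 },
  { city: 'Tokyo', year: 'y2021', count: 12 },
  { city: 'Osaka', year: 'y2020', count: 7 },
  { city: 'Osaka', year: 'y2021', count: 9 }
].freeze

WIDE_RESULT = long_to_wide(LONG_SAMPLE, name: :year, value: :count)
WIDE_RESULT.each { |row| puts row.inspect }
```

Reading the listing, the output column keys appear to be taken from the first group only (`hash.first`), and `Array#transpose` raises an IndexError when the inner arrays differ in length. The method therefore seems to assume that every group carries the same set of name values in the same order. The sketch returns per-row Hashes, so it does not share that constraint.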
#transpose(key: keys.first, name: :NAME) ⇒ DataFrame
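Create a transposed DataFrame from a wide (possibly messy) DataFrame. The values in the key column become the new column keys, and the remaining original keys are stored as strings in a new column called name. If name collides with one of the new keys, the first unused key counting up from :NAME1 is used instead. Raises DataFrameArgumentError if the DataFrame has no column called key.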
# File 'lib/red_amber/data_frame_reshaping.rb', line 92

def transpose(key: keys.first, name: :NAME)
  unless keys.include?(key)
    raise DataFrameArgumentError, "Self does not include: #{key}"
  end

  # Find unused name
  new_keys = self[key].to_a.map { |e| e.to_s.to_sym }
  name = (:NAME1..).find { |k| !new_keys.include?(k) } if new_keys.include?(name)

  names = (keys - [key]).map { |x| x&.to_s }
  hash = { name => names }
  i = keys.index(key)
  each_row do |h|
    k = h.values[i]
    hash[k] = h.values - [k]
  end
  DataFrame.new(hash)
end
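The transposition can likewise be sketched on plain Ruby data. This is an illustration under stated assumptions, not the library's implementation: `transpose_rows` is a hypothetical helper, the data is made up, the new column keys are converted to Symbols, and the name-collision handling from the listing is omitted.

```ruby
# Hypothetical helper (not part of RedAmber): transpose an Array of row
# Hashes into a column-oriented Hash. Values of the key column become
# the new column keys; the remaining original keys go into the name column.
def transpose_rows(rows, key:, name: :NAME)
  other_keys = rows.first.keys - [key]
  columns = { name => other_keys.map(&:to_s) }
  rows.each do |row|
    # Select cells by key rather than by value (see the note below).
    columns[row[key].to_s.to_sym] = other_keys.map { |k| row[k] }
  end
  columns
end

# Made-up sample data for illustration only.
TRANSPOSE_INPUT = [
  { city: 'Tokyo', y2020: 10, y2021: 12 },
  { city: 'Osaka', y2020: 7,  y2021: 9 }
].freeze

TRANSPOSED = transpose_rows(TRANSPOSE_INPUT, key: :city)
TRANSPOSED.each { |column_key, cells| puts "#{column_key}: #{cells.inspect}" }
```

In the listing, `h.values - [k]` removes every value equal to the key value, because `Array#-` drops all matching elements. A row in which another column repeats the key value would therefore seem to produce a shorter column. The sketch selects cells by key to sidestep that case.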