Class: CsvFilter

Inherits:
Object
Defined in:
lib/csv_filter.rb

Overview

Author:

  • Kris Luminar

Constructor Details

#initialize(file_path, separator = "\t") ⇒ CsvFilter

Returns a new instance of CsvFilter.

Parameters:

  • file_path

    the full path to a file

  • separator

    the field separator; defaults to a tab character



# File 'lib/csv_filter.rb', line 4

def initialize file_path, separator = "\t"
  STDOUT.flush
  @file = File.open(file_path, 'r')
  @separator = separator
  @num_columns = count_columns
  @header = {}
  grab_header
end

Instance Method Details

#count_columns ⇒ Object

counts the number of columns in the input file. Already deprecated, always a bad sign.



# File 'lib/csv_filter.rb', line 77

def count_columns
  fields.size
end

#fields ⇒ Object

lists the fields from the header row of the input file



# File 'lib/csv_filter.rb', line 83

def fields
  return @fields if @fields
  @file.rewind
  @fields = @file.gets.split(@separator).map &:strip
end
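
As a minimal sketch of the header parsing above, the snippet below reads the first line from a StringIO standing in for the real file. The `read_fields` helper name and the sample data are assumptions made for illustration; they are not part of CsvFilter.

```ruby
require 'stringio'

# Hypothetical helper mirroring #fields: rewind, read the first line,
# split it on the separator, and strip whitespace from each name.
def read_fields(io, separator = "\t")
  io.rewind
  io.gets.split(separator).map(&:strip)
end

# Sample tab-separated input (assumed data, for illustration only).
sample = StringIO.new("name\tage\tcity\nAda\t36\tLondon\n")
header_fields = read_fields(sample)
```

Here `header_fields` is `["name", "age", "city"]`; the trailing newline on the last field is removed by `strip`.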

#filter(*columns) ⇒ Object

selects just the desired columns from a CSV file

Parameters:

  • columns

    list of columns to filter on

Raises:

  • (ArgumentError)


# File 'lib/csv_filter.rb', line 40

def filter(*columns)
  # columns = [*columns].flatten # columns should accept either an array of strings or a variable number of strings
  raise ArgumentError unless columns.respond_to?(:size) && columns.size < @num_columns
  positions = filtered_column_positions(columns) # look up once, not once per cell
  output = []
  @file.rewind # start from the top so repeated calls see every row
  @file.each_with_index do |line, i|
    # TODO: Decide whether to allow the user to specify if a header row exists. If so, this step will be conditional. Else, add a proviso to the README that the CSV file must include a header line.
    next if i == 0 # skip header row
    row = {}
    line.chomp.split(@separator).each_with_index do |value, j|
      row[@header[j]] = value if positions.include?(j)
    end
    output << row
  end
  output
end
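
To illustrate the shape of the result, here is a standalone sketch of the same filtering idea: one hash per data row, keyed by column name. The `select_columns` name and the sample data are hypothetical, and unlike #filter this version stores nil for a wanted column that is missing from a short row.

```ruby
require 'stringio'

# Hypothetical standalone version of the filtering idea (not CsvFilter itself):
# read the header, find the positions of the wanted columns, then build one
# hash per remaining line.
def select_columns(io, wanted, separator = "\t")
  io.rewind
  names = io.gets.chomp.split(separator).map(&:strip)
  positions = names.each_index.select { |i| wanted.include?(names[i]) }
  io.each_line.map do |line|
    values = line.chomp.split(separator)
    positions.each_with_object({}) { |pos, row| row[names[pos]] = values[pos] }
  end
end

# Assumed sample data, tab separated with a header line.
data = StringIO.new("name\tage\tcity\nAda\t36\tLondon\nAlan\t41\tWilmslow\n")
rows = select_columns(data, %w[name city])
```

With this input, `rows` holds two hashes, the first being `{"name" => "Ada", "city" => "London"}`; the age column is dropped.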

#filtered_column_positions(columns) ⇒ Object

the positions of all the columns that the user wants filtered on



# File 'lib/csv_filter.rb', line 26

def filtered_column_positions columns
  columns = columns.flatten
  # not memoized: a cached result would be reused even when a later call asks for different columns
  register.select { |field, _pos| columns.include?(field) }.values
end
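
The lookup can be sketched with plain hashes. The header below is assumed sample data in the position-to-name shape that #grab_header builds; inverting it gives the name-to-position map that #register returns.

```ruby
# Assumed sample header: position => column name.
header = { 0 => "name", 1 => "age", 2 => "city" }

# Hash#invert swaps keys and values: column name => position.
name_to_position = header.invert

# Keep only the requested columns and collect their positions.
wanted = %w[name city]
positions = name_to_position.select { |field, _pos| wanted.include?(field) }.values
```

For this header, `positions` is `[0, 2]`.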

#grab_header ⇒ Object

grabs the first row of the input CSV file



# File 'lib/csv_filter.rb', line 15

def grab_header
  return @header if (@header and !@header.empty?)
  @file.rewind
  fields.each_with_index do |field_name, i|
    @header[i]= field_name.strip
  end
  @header
end

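#print_filter(columns) ⇒ Object

command-line interface for #filter; prints each filtered row, numbered from 0, to standard output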

(see #filter)



# File 'lib/csv_filter.rb', line 61

def print_filter(columns)
  lines = filter(columns)
  output = []
  lines.each_with_index do |line, i|
    row = "#{i}.".ljust(6)
    line.each do |k,v|
      row << "#{v}\t"
    end
    output << row
  end
  puts output.join("\n")
end
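
The line format can be seen in isolation with a couple of hand-written rows. The rows below are assumed sample data in the shape #filter returns; the formatting logic follows the method above.

```ruby
# Assumed sample rows, shaped like the output of #filter.
rows = [
  { "name" => "Ada", "city" => "London" },
  { "name" => "Alan", "city" => "Wilmslow" }
]

# Prefix each row with its zero-based index padded to six characters,
# then append every value followed by a tab.
formatted = rows.each_with_index.map do |row, i|
  line = "#{i}.".ljust(6)
  row.each_value { |v| line << "#{v}\t" }
  line
end
```

Each line starts with the index and a period, padded with spaces to six characters, and every value, including the last, is followed by a tab.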

#register ⇒ Object

map column names to their relative positions (numbers)



# File 'lib/csv_filter.rb', line 33

def register
  grab_header.invert
end