Class: Fech::Filing
- Inherits: Object
- Defined in: lib/fech/filing.rb
Overview
Fech::Filing downloads an Electronic Filing given its ID, and will search rows by row type. Using a child Translator object, the data in each row is automatically mapped at runtime into a labeled Hash. Additional Translations may be added to change the way that data is mapped and cleaned.
Direct Known Subclasses
Constant Summary
- FIRST_V3_FILING = 11850
  First filing number using the version >= 3.00 format. Note that there are plenty of pre-v3 filings after this one, so readable? still needs to be checked.
Instance Attribute Summary
- #download_dir ⇒ Object
  Returns the value of attribute download_dir.
- #filing_id ⇒ Object
  Returns the value of attribute filing_id.
Class Method Summary
- .map_for(row_type, opts = {}) ⇒ Object
  Returns the column names for given row type and version in the order they appear in row data.
Instance Method Summary
- #amendment? ⇒ Boolean
  Whether this filing amends a previous filing or not.
- #amends ⇒ Object
  Returns the filing ID of the past filing this one amends, or nil if this is a first-draft filing.
- #custom_file_path ⇒ Object
  The file path where custom versions of a filing are to be saved.
- #delimiter ⇒ String
  The delimiter used in the filing’s version.
- #download ⇒ Object
  Saves the filing data from the FEC website into the default download directory.
- #each_row(opts = {}) {|Array| ... } ⇒ Object
  Iterates over and yields the Filing’s lines.
- #each_row_with_index(&block) ⇒ Object
  Wrapper around #each_row to include indexes.
- #file_contents ⇒ Object
  The raw contents of the Filing.
- #file_name ⇒ Object
- #file_path ⇒ Object
  The location of the Filing on the file system.
- #filing_url ⇒ Object
- #filing_version ⇒ Object
  The version of the FEC software used to generate this Filing.
- #fix_f99_contents ⇒ Object
  Handle the contents of F99s by removing the [BEGINTEXT] and [ENDTEXT] delimiters and putting the text content onto the same line as the summary.
- #form_type ⇒ Object
  Determine the form type of the filing before it’s been parsed.
- #hash_zip(keys, values) ⇒ Fech::Mapped, Hash
  Combines an array of keys and values into a Fech::Mapped object, a type of Hash.
- #header(opts = {}) ⇒ Hash
  Access the header (first) line of the filing, containing information about the filing’s version and metadata about the software used to file it.
- #initialize(filing_id, opts = {}) ⇒ Filing (constructor)
  Creates a new Filing object. The download directory defaults to the system’s temp folder.
- #map(row, opts = {}) ⇒ Object
  Maps a raw row to a labeled hash following any rules given in the filing’s Translator based on its version and row type.
- #map_for(row_type) ⇒ Object
  Returns the column names for given row type and the filing’s version in the order they appear in row data.
- #mappings ⇒ Object
  Gets or creates the Mappings instance for this filing_version.
- #parse_filing_version ⇒ Object
  Pulls out the version number from the header line.
- #parse_row?(row, opts = {}) ⇒ Boolean
  Decides what to do with a given row.
- #readable? ⇒ Boolean
  Only FEC format 3.00+ is supported.
- #resave_f99_contents ⇒ Object
  Resave the “fixed” version of an F99.
- #rows_like(row_type, opts = {}) {|Hash| ... } ⇒ Array
  Access all lines of the filing that match a given row type.
- #summary ⇒ Hash
  Access the summary (second) line of the filing, containing aggregate and top-level information about the filing.
- #translate {|t| ... } ⇒ Object
- #translator ⇒ Object
  Accessor for @translator.
Constructor Details
#initialize(filing_id, opts = {}) ⇒ Filing
Creates a new Filing object. The download directory defaults to the system’s temp folder.
# File 'lib/fech/filing.rb', line 23
def initialize(filing_id, opts={})
  @filing_id    = filing_id
  @download_dir = opts[:download_dir] || Dir.tmpdir
  @translator   = opts[:translate] ? Fech::Translator.new(:include => opts[:translate]) : nil
  @quote_char   = opts[:quote_char] || '"'
  @csv_parser   = opts[:csv_parser] || Fech::Csv
  @resaved      = false
  @customized   = false
  @encoding     = opts[:encoding] || 'iso-8859-1:utf-8'
end
Instance Attribute Details
#download_dir ⇒ Object
Returns the value of attribute download_dir.
# File 'lib/fech/filing.rb', line 16
def download_dir
  @download_dir
end
#filing_id ⇒ Object
Returns the value of attribute filing_id.
# File 'lib/fech/filing.rb', line 16
def filing_id
  @filing_id
end
Class Method Details
Instance Method Details
#amendment? ⇒ Boolean
Whether this filing amends a previous filing or not.
# File 'lib/fech/filing.rb', line 195
def amendment?
  !amends.nil?
end
#amends ⇒ Object
Returns the filing ID of the past filing this one amends, or nil if this is a first-draft filing. The :report_id field in the HDR line references the amended filing.
# File 'lib/fech/filing.rb', line 202
def amends
  header[:report_id]
end
#custom_file_path ⇒ Object
The file path where custom versions of a filing are to be saved.
# File 'lib/fech/filing.rb', line 271
def custom_file_path
  File.join(download_dir, "fech_#{file_name}")
end
#delimiter ⇒ String
Returns the delimiter used in the filing’s version.
# File 'lib/fech/filing.rb', line 346
def delimiter
  filing_version.to_f < 6 ? "," : "\034"
end
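The version rule can be tried in isolation. This is a minimal sketch rather than the library's own code path: `delimiter_for` is a hypothetical standalone helper, and the version strings are made-up samples.

```ruby
# Hypothetical helper mirroring the rule in #delimiter: versions below 6
# are comma-delimited; later versions use the ASCII file separator
# character (octal 034, hex 1C).
def delimiter_for(filing_version)
  filing_version.to_f < 6 ? "," : "\034"
end

old_style = delimiter_for("5.3") # a comma
new_style = delimiter_for("8.0") # the file separator character
puts old_style.inspect
puts new_style.inspect
```

Because the comparison goes through `to_f`, a version string that is not numeric would convert to 0.0 and be treated as comma-delimited in this sketch.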
#download ⇒ Object
Saves the filing data from the FEC website into the default download directory.
# File 'lib/fech/filing.rb', line 36
def download
  File.open(file_path, 'w') do |file|
    begin
      file << open(filing_url).read
    rescue
      file << open(filing_url).read.ensure_encoding('UTF-8', :external_encoding => Encoding::UTF_8, :invalid_characters => :drop)
    end
  end
  self
end
#each_row(opts = {}) {|Array| ... } ⇒ Object
Iterates over and yields the Filing’s lines.
# File 'lib/fech/filing.rb', line 321
def each_row(opts={}, &block)
  unless File.exists?(file_path)
    raise "File #{file_path} does not exist. Try invoking the .download method on this Filing object."
  end

  # If this is an F99, we need to parse it differently.
  resave_f99_contents if ['F99', '"F99"'].include? form_type

  c = 0
  @csv_parser.parse_row(@customized ? custom_file_path : file_path, opts.merge(:col_sep => delimiter, :quote_char => @quote_char, :skip_blanks => true, :encoding => @encoding)) do |row|
    if opts[:with_index]
      yield [row, c]
      c += 1
    else
      yield row
    end
  end
end
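The yield-with-optional-index pattern can be illustrated without a downloaded filing. The sketch below is an approximation under stated assumptions: `each_row_sketch` is a hypothetical name, it reads from any IO object, and it splits lines naively instead of delegating to a CSV parser, so quoted fields are not handled.

```ruby
require 'stringio'

# Hypothetical stand-in for #each_row. Yields each non-blank line as an
# Array of fields, or as [fields, index] when with_index is true.
def each_row_sketch(io, col_sep:, with_index: false)
  index = 0
  io.each_line do |line|
    next if line.strip.empty? # roughly what :skip_blanks does

    row = line.chomp.split(col_sep, -1)
    if with_index
      yield [row, index]
      index += 1
    else
      yield row
    end
  end
end

sample = StringIO.new("HDR,FEC,5.3\n\nF3,C00000001\n") # made-up content
each_row_sketch(sample, col_sep: ",", with_index: true) do |row, i|
  puts "#{i}: #{row.first}"
end
```

Yielding a two-element Array lets a block written as `|row, i|` destructure it, which is how #each_row_with_index can reuse the same method.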
#each_row_with_index(&block) ⇒ Object
Wrapper around #each_row that also yields each row’s index.
# File 'lib/fech/filing.rb', line 341
def each_row_with_index(&block)
  each_row(:with_index => true, &block)
end
#file_contents ⇒ Object
The raw contents of the Filing, returned as a File opened for reading in the configured encoding.
# File 'lib/fech/filing.rb', line 248
def file_contents
  File.open(file_path, "r:#{@encoding}")
end
#file_name ⇒ Object
# File 'lib/fech/filing.rb', line 309
def file_name
  "#{filing_id}.fec"
end
#file_path ⇒ Object
The location of the Filing on the file system.
# File 'lib/fech/filing.rb', line 243
def file_path
  File.join(download_dir, file_name)
end
#filing_url ⇒ Object
# File 'lib/fech/filing.rb', line 313
def filing_url
  "https://docquery.fec.gov/dcdev/posted/#{filing_id}.fec"
end
#filing_version ⇒ Object
The version of the FEC software used to generate this Filing.
# File 'lib/fech/filing.rb', line 216
def filing_version
  @filing_version ||= parse_filing_version
end
#fix_f99_contents ⇒ Object
Handle the contents of F99s by removing the [BEGINTEXT] and [ENDTEXT] delimiters and putting the text content onto the same line as the summary.
# File 'lib/fech/filing.rb', line 279
def fix_f99_contents
  @customized = true
  content = file_contents.read

  if RUBY_VERSION > "1.9.2"
    content.encode!('UTF-16', 'UTF-8', :invalid => :replace, :undef => :replace, :replace => '?')
    content.encode!('UTF-8', 'UTF-16')
  else
    require 'iconv'
    ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
    content = ic.iconv(content + ' ')[0..-2] # add valid byte before converting, then remove it
  end

  regex = /\n\[BEGINTEXT\]\n(.*?)\[ENDTEXT\]\n/mi # some use eg [EndText]
  match = content.match(regex)
  if match
    repl = match[1].gsub(/"/, '""')
    content.gsub(regex, "#{delimiter}\"#{repl}\"")
  else
    content
  end
end
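The text-inlining step can be exercised on a plain String. The following is a sketch under assumptions, not the method itself: `inline_f99_text` is a hypothetical helper, the sample filing content is invented, the encoding cleanup is omitted, and the block form of `gsub` is swapped in so that backslashes in the free text are inserted literally.

```ruby
# Same pattern as in #fix_f99_contents; /i covers variants such as [EndText].
F99_TEXT = /\n\[BEGINTEXT\]\n(.*?)\[ENDTEXT\]\n/mi

# Hypothetical helper: moves the free-text block onto the summary line as
# one quoted field, doubling any embedded double quotes.
def inline_f99_text(content, delimiter)
  match = content.match(F99_TEXT)
  return content unless match

  quoted = match[1].gsub('"', '""')
  content.gsub(F99_TEXT) { "#{delimiter}\"#{quoted}\"" }
end

sample = "HDR,FEC,5.3\nF99,C00000001,Sample Committee\n" \
         "[BEGINTEXT]\nHello \"world\"\n[EndText]\n" # made-up content
puts inline_f99_text(sample, ",")
```

Doubling the quotes follows the usual CSV escaping convention, so a parser configured with `"` as its quote character can read the text back as a single field even when it spans several lines.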
#form_type ⇒ Object
Determine the form type of the filing before it’s been parsed. This is needed for the F99 special case.
# File 'lib/fech/filing.rb', line 255
def form_type

  if RUBY_VERSION >= "2.0"
    lines = file_contents.each_line
  else
    lines = file_contents.lines
  end

  lines.each_with_index do |row, index|
    next if index == 0
    return row.split(delimiter).first
  end
end
#hash_zip(keys, values) ⇒ Fech::Mapped, Hash
Combines an array of keys and values into an Fech::Mapped object, a type of Hash.
# File 'lib/fech/filing.rb', line 211
def hash_zip(keys, values)
  Fech::Mapped.new(self, values.first).merge(Hash[*keys.zip(values).flatten])
end
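The key/value pairing at the core of this method can be shown with a plain Hash. This is only an illustration: `label_row` and the column names are hypothetical, the result is an ordinary Hash rather than a Fech::Mapped, and `to_h` is swapped in for the `Hash[*... .flatten]` idiom.

```ruby
# Hypothetical helper: pairs each column name with the value in the same
# position and builds a Hash from the pairs.
def label_row(keys, values)
  keys.zip(values).to_h
end

keys   = [:form_type, :filer_committee_id_number] # made-up column names
values = ["F3", "C00000001"]
p label_row(keys, values)
```

With `zip`, a row shorter than the key list leaves the trailing keys mapped to nil, and extra values beyond the last key are dropped.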
#header(opts = {}) ⇒ Hash
Access the header (first) line of the filing, containing information about the filing’s version and metadata about the software used to file it.
# File 'lib/fech/filing.rb', line 51
def header(opts={})
  each_row do |row|
    return parse_row?(row)
  end
end
#map(row, opts = {}) ⇒ Object
Maps a raw row to a labeled hash following any rules given in the filing’s Translator based on its version and row type. Finds the correct map for a given row, performs any matching Translations on the individual values, and returns either the entire dataset, or just those fields requested.
# File 'lib/fech/filing.rb', line 119
def map(row, opts={})
  data = Fech::Mapped.new(self, row.first)
  full_row_map = map_for(row.first)

  # If specific fields were asked for, return only those
  if opts[:include]
    row_map = full_row_map.select { |k| opts[:include].include?(k) }
  else
    row_map = full_row_map
  end

  # Inserts the row into data, performing any specified preprocessing
  # on individual cells along the way
  row_map.each_with_index do |field, index|
    value = row[full_row_map.index(field)]
    if translator
      translator.get_translations(:row => row.first,
          :version => filing_version, :action => :convert,
          :field => field).each do |translation|
        # User's Procs should be given each field's value as context
        value = translation[:proc].call(value)
      end
    end
    data[field] = value
  end

  # Performs any specified group preprocessing / combinations
  if translator
    combinations = translator.get_translations(:row => row.first,
          :version => filing_version, :action => :combine)
    row_hash = hash_zip(row_map, row) if combinations
    combinations.each do |translation|
      # User's Procs should be given the entire row as context
      value = translation[:proc].call(row_hash)
      field = translation[:field].source.gsub(/[\^\$]*/, "").to_sym
      data[field] = value
    end
  end

  data
end
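The first two stages of this method (field selection, then per-field conversion) can be sketched with plain Ruby objects. Everything named here is an assumption for illustration: `map_row_sketch`, the column list, and the `converters` hash stand in for the Mappings and Translator lookups, and the combine stage is left out.

```ruby
# Hypothetical stand-in for #map.
#   columns    - every column name for the row type, in row order
#   only       - optional subset of columns to keep (like opts[:include])
#   converters - column name => Array of Procs applied in order
def map_row_sketch(row, columns, only: nil, converters: {})
  wanted = only ? columns.select { |c| only.include?(c) } : columns

  wanted.each_with_object({}) do |field, data|
    # Look the value up by its position in the full column list, so that
    # selecting a subset does not shift the indexes.
    value = row[columns.index(field)]
    converters.fetch(field, []).each { |conv| value = conv.call(value) }
    data[field] = value
  end
end

columns = [:form_type, :committee_id, :amount]   # made-up column names
row     = ["SA11AI", "C00000001", "250.00"]      # made-up row
to_f    = ->(v) { v.to_f }
p map_row_sketch(row, columns, only: [:amount], converters: { amount: [to_f] })
```

Indexing into the full column list mirrors `row[full_row_map.index(field)]` in the source, which is what keeps a filtered field aligned with its original cell.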
#map_for(row_type) ⇒ Object
Returns the column names for given row type and the filing’s version in the order they appear in row data.
# File 'lib/fech/filing.rb', line 163
def map_for(row_type)
  mappings.for_row(row_type)
end
#mappings ⇒ Object
Gets or creates the Mappings instance for this filing_version.
# File 'lib/fech/filing.rb', line 238
def mappings
  @mapping ||= Fech::Mappings.new(filing_version)
end
#parse_filing_version ⇒ Object
Pulls out the version number from the header line. Must parse this line manually, since we don’t know the version yet, and thus the delimiter type is still a mystery.
# File 'lib/fech/filing.rb', line 223
def parse_filing_version
  first = File.open(file_path).first
  if first.index("\034").nil?
    @csv_parser.parse(first).flatten[2]
  else
    @csv_parser.parse(first, :col_sep => "\034").flatten[2]
  end
end
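The delimiter-sniffing idea can be shown on a bare header String. This sketch makes several assumptions: `version_from_header` is a hypothetical helper, the header lines are invented, and a naive `split` replaces the CSV parser, so a quoted field containing a comma would be split incorrectly.

```ruby
# Hypothetical helper: if the header contains the ASCII file separator
# (octal 034), split on that; otherwise fall back to commas. The version
# is taken from the third field, as in #parse_filing_version.
def version_from_header(line)
  col_sep = line.include?("\034") ? "\034" : ","
  line.chomp.split(col_sep)[2]
end

puts version_from_header("HDR\034FEC\0348.0\034SampleSoft\n") # made-up header
puts version_from_header("HDR,FEC,5.3,SampleSoft\n")          # made-up header
```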
#parse_row?(row, opts = {}) ⇒ Boolean
Decides what to do with a given row. If the row’s type matches the desired type, or if no type was specified, it will run the row through #map. If :raw was passed true, a flat, unmapped data array will be returned.
# File 'lib/fech/filing.rb', line 99
def parse_row?(row, opts={})
  return false if row.nil? || row.empty?

  # Always parse, unless :parse_if is given and does not match row
  if opts[:parse_if].nil? || \
      Fech.regexify(opts[:parse_if]).match(row.first.downcase)
    opts[:raw] ? row : map(row, opts)
  else
    false
  end
end
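The three possible outcomes (false, the raw row, or a mapped row) can be sketched as follows. The names are assumptions: `parse_row_sketch` is hypothetical, the block plays the role of #map, and `parse_if` is taken to already be a Regexp, since what Fech.regexify accepts is not shown here.

```ruby
# Hypothetical stand-in for #parse_row?.
def parse_row_sketch(row, parse_if: nil, raw: false)
  return false if row.nil? || row.empty?
  # The row type is downcased before matching, so patterns are lowercase.
  return false if parse_if && !parse_if.match?(row.first.downcase)

  raw ? row : yield(row)
end

row = ["SA11AI", "C00000001", "250.00"] # made-up row
p parse_row_sketch(row, parse_if: /^sa/) { |r| { type: r[0], amount: r[2] } }
p parse_row_sketch(row, parse_if: /^sb/) { |r| r }
p parse_row_sketch(row, parse_if: /^sa/, raw: true) { |r| r }
```

Despite the trailing question mark and the Boolean return type in the signature, the method returns data on success; only the rejected case is a literal false.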
#readable? ⇒ Boolean
Only FEC format 3.00+ is supported.
# File 'lib/fech/filing.rb', line 233
def readable?
  filing_version.to_i >= 3
end
#resave_f99_contents ⇒ Object
Resave the “fixed” version of an F99.
# File 'lib/fech/filing.rb', line 303
def resave_f99_contents
  return true if @resaved
  File.open(custom_file_path, 'w') { |f| f.write(fix_f99_contents) }
  @resaved = true
end
#rows_like(row_type, opts = {}) {|Hash| ... } ⇒ Array
Access all lines of the filing that match a given row type. Will return an Array of all available lines if called directly, or will yield the mapped rows one by one if a block is passed.
# File 'lib/fech/filing.rb', line 78
def rows_like(row_type, opts={}, &block)
  data = []
  each_row(:row_type => row_type) do |row|
    value = parse_row?(row, opts.merge(:parse_if => row_type))
    next if value == false
    if block_given?
      yield value
    else
      data << value if value
    end
  end
  block_given? ? nil : data
end
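The collect-or-yield behavior can be demonstrated on an in-memory list of rows. This is a sketch with assumed names and data: `rows_like_sketch` is hypothetical, it takes the rows as an argument instead of reading a filing, and it yields raw rows rather than mapped ones.

```ruby
# Hypothetical stand-in for #rows_like: returns an Array of matching rows
# when called without a block, or yields them one at a time and returns
# nil when a block is given.
def rows_like_sketch(rows, pattern)
  matches = []
  rows.each do |row|
    next unless pattern.match?(row.first.downcase)

    if block_given?
      yield row
    else
      matches << row
    end
  end
  block_given? ? nil : matches
end

rows = [["HDR", "FEC"], ["SA11AI", "250.00"], ["SB21", "75.00"]] # made-up rows
p rows_like_sketch(rows, /^sa/)
rows_like_sketch(rows, /^s[ab]/) { |row| puts row.last }
```

The block form avoids building the full Array, which can matter for filings with a very large number of itemized rows.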
#summary ⇒ Hash
Access the summary (second) line of the filing, containing aggregate and top-level information about the filing.
# File 'lib/fech/filing.rb', line 60
def summary
  each_row_with_index do |row, index|
    next if index == 0
    return parse_row?(row)
  end
end
#translate {|t| ... } ⇒ Object
# File 'lib/fech/filing.rb', line 186
def translate(&block)
  if block_given?
    yield translator
  else
    translator
  end
end
#translator ⇒ Object
Accessor for @translator. Returns the Translator built in Filing’s initializer if built-in translations were passed to it (:translate => [:foo, :bar]). Otherwise, creates and memoizes a new Translator without any default translations.
# File 'lib/fech/filing.rb', line 180
def translator
  @translator ||= Fech::Translator.new
end