Class: Reading::Parsing::Rows::Column

Inherits: Object

Defined in: lib/reading/parsing/rows/column.rb

Overview

The base class for all the columns in parsing/rows/compact_planned_columns and parsing/rows/regular_columns.

Direct Known Subclasses

Reading::Parsing::Rows::CompactPlanned::Head, Regular::EndDates, Regular::Genres, Regular::Head, Regular::History, Regular::Length, Regular::Notes, Regular::Rating, Regular::Sources, Regular::StartDates

Constant Summary

SHARED_REGEXES =

Regular expressions that are shared by more than one column, placed here to keep the code DRY.

{
  progress: %r{
    (DNF\s+)?(?<progress_percent>\d\d?)%
    |
    (DNF\s+)?p?(?<progress_pages>\d+)p?
    |
    (DNF\s+)?(?<progress_time>\d+:\d\d)
    |
    # just DNF
    (?<progress_dnf>DNF)
  }x,
  series_and_extra_info: [
    # just series
    %r{\A
      in\s(?<series_names>.+)
      # empty volume so that names and volumes have equal sizes when turned into arrays
      (?<series_volumes>)
    \z}x,
    # series and volume
    %r{\A
      (?<series_names>.+?)
      ,?\s*
      \#(?<series_volumes>\d+)
    \z}x,
    # extra info
    %r{\A
      (?<extra_info>.+)
    \z}x,
  ],
}.freeze
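
As a rough illustration of how the :progress pattern behaves, the sketch below copies it into a constant and wraps it in \A and \z anchors. The anchoring, the constant names, and the progress_captures helper are assumptions made for this example and are not part of the library. Because the page-count alternative is listed before the time alternative, an unanchored match on "2:30" would stop at the pages branch, which is why the sketch forces the whole string to match.

```ruby
# Copy of the :progress pattern from SHARED_REGEXES, repeated here so the
# sketch runs on its own.
PROGRESS = %r{
  (DNF\s+)?(?<progress_percent>\d\d?)%
  |
  (DNF\s+)?p?(?<progress_pages>\d+)p?
  |
  (DNF\s+)?(?<progress_time>\d+:\d\d)
  |
  # just DNF
  (?<progress_dnf>DNF)
}x

# Assumption for this sketch: anchor the pattern so the whole string must match.
ANCHORED_PROGRESS = /\A(?:#{PROGRESS})\z/

# Hypothetical helper: returns only the named captures that actually matched.
def progress_captures(string)
  match = ANCHORED_PROGRESS.match(string)
  match ? match.named_captures.compact : {}
end

progress_captures("50%")     # => { "progress_percent" => "50" }
progress_captures("DNF 50%") # => { "progress_percent" => "50" }
progress_captures("p120")    # => { "progress_pages" => "120" }
progress_captures("2:30")    # => { "progress_time" => "2:30" }
progress_captures("DNF")     # => { "progress_dnf" => "DNF" }
```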

Class Method Summary

.column_name ⇒ String

The class name changed into a string, e.g. StartDates => “Start Dates”.

.flatten_into_arrays ⇒ Array<Symbol>

Keys in the parsed output hash that should be converted to an array, even if only one value was in the input, as in { … extra_info: ["ed. Jane Doe"] }.

.regexes(segment_index) ⇒ Array<Regexp>

The regular expressions used to parse the column (except the part of the column before the first format emoji, which is in ::regexes_before_formats below).

.regexes_before_formats ⇒ Array<Regexp>

The regular expressions used to parse the part of the column before the first format emoji.

.segment_group_separator ⇒ Regexp?

The regular expression used to split segment groups (e.g. /\s*----\s*/), or nil if the column should not be split by segment group.

.segment_separator ⇒ Regexp?

The regular expression used to split segments (e.g. /\s*--\s*/), or nil if the column should not be split by segment.

.split_by_format? ⇒ Boolean

Whether the column can contain “chunks”, each set off by a format emoji.

.split_by_segment? ⇒ Boolean

Whether the column can contain multiple segments, e.g. "Cosmos -- 2013 paperback".

.split_by_segment_group? ⇒ Boolean

Whether the column can contain multiple segment groups, e.g. "2021/1/28..2/1 x4 -- ..2/3 x5 ---- 11/1 -- 11/2".

.to_sym ⇒ Symbol

The class name changed into a symbol, e.g. StartDates => :start_dates.

.tweaks ⇒ Hash{Symbol => Proc}

Adjustments that are made to captured values at the end of parsing the column.

Class Method Details

.column_name ⇒ `String`

The class name changed into a string, e.g. StartDates => “Start Dates”

Returns:

(String)

# File 'lib/reading/parsing/rows/column.rb', line 9

def self.column_name
  class_name = name.split("::").last
  class_name.gsub(/(.)([A-Z])/,'\1 \2')
end
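
To see the conversion in action, here is a minimal sketch that places the same method body on a stand-in class. The Example module and its StartDates class are hypothetical and exist only so the snippet runs by itself.

```ruby
# Hypothetical namespace and class standing in for a real column subclass.
module Example
  class StartDates
    # Same body as Column.column_name above.
    def self.column_name
      # Drop the namespace, keeping only the last constant name.
      class_name = name.split("::").last
      # Insert a space between any character and the capital that follows it.
      class_name.gsub(/(.)([A-Z])/, '\1 \2')
    end
  end
end

Example::StartDates.column_name # => "Start Dates"
```

Since the pattern consumes two characters per match, a name with consecutive capitals (a hypothetical ISBN class, say) would come out as "I SB N", so the method fits names written in ordinary CamelCase.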

.flatten_into_arrays ⇒ `Array<Symbol>`

Keys in the parsed output hash that should be converted to an array, even if only one value was in the input, as in { … extra_info: ["ed. Jane Doe"] }

Returns:

(Array<Symbol>)



# File 'lib/reading/parsing/rows/column.rb', line 73

def self.flatten_into_arrays
  []
end
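
The following is a sketch of how a subclass might override this method and how a caller could honor the returned keys. ExampleHead, its key list, and the wrap_in_arrays helper are all hypothetical; the library's own handling of these keys is not shown in this page and may differ.

```ruby
# Hypothetical column class: lists the keys whose values should always be arrays.
class ExampleHead
  def self.flatten_into_arrays
    %i[extra_info series_names series_volumes]
  end
end

# Hypothetical helper showing one way a parser could apply that list:
# values under the listed keys are wrapped in an array unless they already are one.
def wrap_in_arrays(parsed, keys)
  parsed.to_h do |key, value|
    [key, keys.include?(key) ? Array(value) : value]
  end
end

parsed = { title: "Cosmos", extra_info: "ed. Jane Doe" }
wrap_in_arrays(parsed, ExampleHead.flatten_into_arrays)
# => { title: "Cosmos", extra_info: ["ed. Jane Doe"] }
```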

.regexes(segment_index) ⇒ `Array<Regexp>`

The regular expressions used to parse the column (except the part of the column before the first format emoji, which is in ::regexes_before_formats below). An array because sometimes it’s simpler to try several smaller regular expressions in series, and because a regular expression might be applicable only for segments in a certain position. See parsing/rows/regular_columns/head.rb for an example.

Parameters:

segment_index (Integer): the position of the current segment.

Returns:

(Array<Regexp>)



# File 'lib/reading/parsing/rows/column.rb', line 85

def self.regexes(segment_index)
  []
end
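
As an illustration of position-dependent regular expressions tried in series, consider the sketch below. The stand-in Column class, the ExampleHead subclass, its patterns, and the parse_segment helper are assumptions for this example only; the real patterns live in parsing/rows/regular_columns/head.rb and are not reproduced here.

```ruby
# Minimal stand-in for the base class; only the method under discussion is copied.
class Column
  def self.regexes(segment_index)
    []
  end
end

# Hypothetical subclass: the first segment holds author and title,
# any later segment is treated as extra info.
class ExampleHead < Column
  def self.regexes(segment_index)
    if segment_index.zero?
      [
        /\A(?<author>[^-]+?)\s+-\s+(?<title>.+)\z/, # "Author - Title"
        /\A(?<title>.+)\z/,                         # title only
      ]
    else
      [/\A(?<extra_info>.+)\z/]
    end
  end
end

# Hypothetical helper: try each regex in order and return the first match's
# named captures with symbol keys, or an empty hash if nothing matches.
def parse_segment(column, segment, segment_index)
  column.regexes(segment_index).each do |regex|
    match = regex.match(segment)
    return match.named_captures.compact.transform_keys(&:to_sym) if match
  end
  {}
end

parse_segment(ExampleHead, "Carl Sagan - Cosmos", 0) # => { author: "Carl Sagan", title: "Cosmos" }
parse_segment(ExampleHead, "Cosmos", 0)              # => { title: "Cosmos" }
parse_segment(ExampleHead, "2013 paperback", 1)      # => { extra_info: "2013 paperback" }
```

Keeping several small patterns in an array lets each one stay readable, at the cost of making the order of the array significant.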

.regexes_before_formats ⇒ `Array<Regexp>`

The regular expressions used to parse the part of the column before the first format emoji.

Returns:

(Array<Regexp>)



# File 'lib/reading/parsing/rows/column.rb', line 92

def self.regexes_before_formats
  []
end

.segment_group_separator ⇒ `Regexp`?

The regular expression used to split segment groups (e.g. /\s*----\s*/), or nil if the column should not be split by segment group.

Returns:

(Regexp, nil)



# File 'lib/reading/parsing/rows/column.rb', line 57

def self.segment_group_separator
  nil
end

.segment_separator ⇒ `Regexp`?

The regular expression used to split segments (e.g. /\s*--\s*/), or nil if the column should not be split by segment.

Returns:

(Regexp, nil)



# File 'lib/reading/parsing/rows/column.rb', line 43

def self.segment_separator
  nil
end

.split_by_format? ⇒ `Boolean`

Whether the column can contain “chunks”, each set off by a format emoji. For example, the Head column of a compact planned row typically contains a list of multiple items. (The two other columns split this way are the Sources column, for multiple variants of an item, and the regular Head column, for multiple items.)

Returns:

(Boolean)



# File 'lib/reading/parsing/rows/column.rb', line 30

def self.split_by_format?
  false
end

.split_by_segment? ⇒ `Boolean`

Whether the column can contain multiple segments, e.g. "Cosmos -- 2013 paperback"

Returns:

(Boolean)



# File 'lib/reading/parsing/rows/column.rb', line 36

def self.split_by_segment?
  !!segment_separator
end

.split_by_segment_group? ⇒ `Boolean`

Whether the column can contain multiple segment groups, e.g. "2021/1/28..2/1 x4 -- ..2/3 x5 ---- 11/1 -- 11/2"

Returns:

(Boolean)



# File 'lib/reading/parsing/rows/column.rb', line 50

def self.split_by_segment_group?
  !!segment_group_separator
end
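
The two separators and the two predicates could work together roughly as follows. This sketch assumes the ASCII separators "--" (segments) and "----" (segment groups); the stand-in Column class, the ExampleDates subclass, and the split_column helper are hypothetical and not taken from the library.

```ruby
# Minimal stand-in for the base class, copying the four methods discussed above.
class Column
  def self.segment_separator
    nil
  end

  def self.segment_group_separator
    nil
  end

  def self.split_by_segment?
    !!segment_separator
  end

  def self.split_by_segment_group?
    !!segment_group_separator
  end
end

# Hypothetical subclass that opts in to both kinds of splitting.
class ExampleDates < Column
  def self.segment_separator
    /\s*--\s*/
  end

  def self.segment_group_separator
    /\s*----\s*/
  end
end

# Hypothetical helper: split into groups first, then each group into segments.
# Groups go first because "----" would otherwise be consumed as two "--" separators.
def split_column(column, string)
  groups = column.split_by_segment_group? ? string.split(column.segment_group_separator) : [string]
  groups.map do |group|
    column.split_by_segment? ? group.split(column.segment_separator) : [group]
  end
end

split_column(ExampleDates, "2021/1/28..2/1 x4 -- ..2/3 x5 ---- 11/1 -- 11/2")
# => [["2021/1/28..2/1 x4", "..2/3 x5"], ["11/1", "11/2"]]
split_column(Column, "Cosmos -- 2013 paperback")
# => [["Cosmos -- 2013 paperback"]]
```

Deriving each predicate from its separator with `!!` means a subclass only has to override the separator; the predicate follows automatically.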

.to_sym ⇒ `Symbol`

The class name changed into a symbol, e.g. StartDates => :start_dates

Returns:

(Symbol)

# File 'lib/reading/parsing/rows/column.rb', line 16

def self.to_sym
  class_name = name.split("::").last
  class_name
    .gsub(/(.)([A-Z])/,'\1_\2')
    .downcase
    .to_sym
end
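
The same chain can be tried on plain strings, as in this small sketch. The column_symbol helper is hypothetical; it applies the steps of to_sym to a class name passed in as a string so that each stage is easy to inspect.

```ruby
# Hypothetical helper mirroring the steps of Column.to_sym on a given name.
def column_symbol(full_class_name)
  full_class_name
    .split("::").last              # "StartDates"
    .gsub(/(.)([A-Z])/, '\1_\2')   # "Start_Dates"
    .downcase                      # "start_dates"
    .to_sym                        # :start_dates
end

column_symbol("Reading::Parsing::Rows::Regular::StartDates") # => :start_dates
column_symbol("Regular::Head")                               # => :head
```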

.tweaks ⇒ `Hash{Symbol => Proc}`

Adjustments that are made to captured values at the end of parsing the column. For example, if ::regexes includes a capture group named “sources” and it needs to be split by commas: { sources: -> { _1.split(/\s*,\s*/) } }

Returns:

(Hash{Symbol => Proc})



# File 'lib/reading/parsing/rows/column.rb', line 66

def self.tweaks
  {}
end
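
One way such a hash of procs could be applied to parsed captures is sketched below. The EXAMPLE_TWEAKS constant and the apply_tweaks helper are hypothetical; how the library itself invokes tweaks is not shown on this page. The numbered parameter `_1` requires Ruby 2.7 or later.

```ruby
# Hypothetical tweaks hash in the shape described above.
EXAMPLE_TWEAKS = {
  sources: -> { _1.split(/\s*,\s*/) },
}.freeze

# Hypothetical helper: run each captured value through its tweak, if one exists.
def apply_tweaks(parsed, tweaks)
  parsed.to_h do |key, value|
    tweak = tweaks[key]
    [key, tweak ? tweak.call(value) : value]
  end
end

apply_tweaks({ title: "Cosmos", sources: "Library, Hoopla" }, EXAMPLE_TWEAKS)
# => { title: "Cosmos", sources: ["Library", "Hoopla"] }
```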
