Class: SecApi::Extractor

Inherits:

Object


Defined in: lib/sec_api/extractor.rb

Overview

Extractor proxy for document extraction endpoints

All extractor methods return immutable ExtractedData objects (not raw hashes). This ensures thread safety and a consistent API surface.
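One way an immutable result object with per-section accessors could be built is sketched below. This is only an illustration: `FrozenExtract` is a hypothetical stand-in, and the real ExtractedData class may be implemented quite differently.

```ruby
# Hypothetical stand-in for ExtractedData; not the library's implementation.
class FrozenExtract
  attr_reader :text, :sections

  def initialize(text: nil, sections: {})
    @text = text&.dup&.freeze
    # Freeze each value and the hash itself so nothing can be mutated later.
    @sections = sections.transform_values { |value| value.dup.freeze }.freeze
    freeze
  end

  # Expose each extracted section as a reader, e.g. #risk_factors.
  def method_missing(name, *args)
    return @sections[name] if args.empty? && @sections.key?(name)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @sections.key?(name) || super
  end
end

extract = FrozenExtract.new(sections: {risk_factors: "Risk factor content..."})
extract.risk_factors # => "Risk factor content..."
extract.frozen?      # => true
```

Because the object and everything it holds are frozen, it can be shared between threads without locking; any attempt to modify it raises FrozenError.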

Examples:

Extract text from filing

extracted = client.extractor.extract(filing_url)
extracted.text              # => "Full extracted text..."
extracted.sections          # => { risk_factors: "...", financials: "..." }
extracted.metadata          # => { source_url: "...", form_type: "10-K" }

Extract specific sections

extracted = client.extractor.extract(filing_url, sections: [:risk_factors, :mda])
extracted.risk_factors      # => "Risk factor content..."
extracted.mda               # => "MD&A content..."

Constant Summary

SECTION_MAP =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

Maps Ruby symbols to SEC item identifiers for 10-K filings

API: private

{
  risk_factors: "1A",
  business: "1",
  mda: "7",
  financials: "8",
  legal_proceedings: "3",
  properties: "2",
  market_risk: "7A"
}.freeze
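As a rough illustration of how a lookup against this mapping could behave, the sketch below copies the hash locally and resolves a section name to its item identifier. `item_for` is a hypothetical helper, not part of SecApi, and the library's handling of unknown sections is an assumption here.

```ruby
# Local copy of the mapping above so the sketch is self-contained.
SECTION_MAP = {
  risk_factors: "1A",
  business: "1",
  mda: "7",
  financials: "8",
  legal_proceedings: "3",
  properties: "2",
  market_risk: "7A"
}.freeze

# Hypothetical helper: resolve a section name to its 10-K item identifier.
# Raising on unknown names is an assumption, not documented behavior.
def item_for(section)
  SECTION_MAP.fetch(section.to_sym) do
    raise ArgumentError, "unknown section: #{section.inspect}"
  end
end

item_for(:mda)          # => "7"
item_for("market_risk") # => "7A"
```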

Instance Method Summary

#extract(filing, sections: nil, **options) ⇒ ExtractedData

Extracts text and sections from an SEC filing.

#initialize(client) ⇒ SecApi::Extractor (constructor, private)

Creates a new Extractor proxy instance.

Constructor Details

#initialize(client) ⇒ `SecApi::Extractor`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Creates a new Extractor proxy instance.

Extractor instances are obtained via Client#extractor and cached for reuse. Direct instantiation is not recommended.
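The caching described here is commonly done with memoization. The sketch below shows that pattern with hypothetical stand-in classes (`SketchClient`, `SketchExtractor`); whether Client#extractor is implemented exactly this way is an assumption.

```ruby
# Hypothetical stand-ins; the real SecApi::Client and SecApi::Extractor may differ.
class SketchExtractor
  def initialize(client)
    @_client = client
  end
end

class SketchClient
  # Build the proxy on first use, then return the same instance on later calls.
  def extractor
    @extractor ||= SketchExtractor.new(self)
  end
end

client = SketchClient.new
client.extractor.equal?(client.extractor) # => true
```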

Parameters:

client: The parent client for API access

API: private




# File 'lib/sec_api/extractor.rb', line 38

def initialize(client)
  @_client = client
end

Instance Method Details

#extract(filing, sections: nil, **options) ⇒ `ExtractedData`

Note:

When extracting multiple sections, one API call is made per section, so the number of requests (and with it latency and API usage cost) grows with the number of sections requested.

Extracts text and sections from an SEC filing.

Examples:

Extract full filing

extracted = client.extractor.extract(filing_url)
extracted.text  # => "Full filing text..."

Extract specific section (dynamic accessor)

extracted = client.extractor.extract(filing_url, sections: [:risk_factors])
extracted.risk_factors  # => "Risk factors content..."

Extract multiple sections (dynamic accessors)

extracted = client.extractor.extract(filing_url, sections: [:risk_factors, :mda])
extracted.risk_factors  # => "Risk factors..."
extracted.mda           # => "MD&A analysis..."

Parameters:

filing: The filing URL string or Filing object
sections (defaults to: nil): Specific sections to extract (e.g., [:risk_factors, :mda]). When nil, omitted, or empty, extracts the full filing text. Supported sections: :risk_factors, :business, :mda, :financials, :legal_proceedings, :properties, :market_risk
options: Additional extraction options passed to the API

Returns:

ExtractedData: Immutable extracted data object

Raises:

when API key is invalid
when filing URL is not found
when connection fails

# File 'lib/sec_api/extractor.rb', line 68

def extract(filing, sections: nil, **options)
  url = filing.is_a?(String) ? filing : filing.url

  if sections.nil? || sections.empty?
    # Default behavior - extract full filing
    response = @_client.connection.post("/extractor", {url: url}.merge(options))
    ExtractedData.from_api(response.body)
  else
    # Extract specified sections
    section_contents = extract_sections(url, Array(sections), options)
    ExtractedData.from_api({sections: section_contents})
  end
end
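The private extract_sections helper is not shown on this page. A minimal sketch of what a one-request-per-section loop might look like follows, using a recording double in place of the real connection. Everything here is hypothetical: `RecordingConnection`, `extract_sections_sketch`, `SECTION_ITEMS`, and in particular the `:item` payload key are assumptions made for illustration, not documented API.

```ruby
# Hypothetical connection double that records requests instead of calling the API.
class RecordingConnection
  Response = Struct.new(:body)
  attr_reader :requests

  def initialize
    @requests = []
  end

  def post(path, payload)
    @requests << [path, payload]
    Response.new("content for item #{payload[:item]}")
  end
end

# Subset of the section mapping, copied locally for the sketch.
SECTION_ITEMS = {risk_factors: "1A", mda: "7"}.freeze

# Assumed shape of the loop: one POST per requested section.
# The :item key is an assumption, not a documented request parameter.
def extract_sections_sketch(connection, url, sections, options = {})
  sections.each_with_object({}) do |section, contents|
    item = SECTION_ITEMS.fetch(section)
    response = connection.post("/extractor", {url: url, item: item}.merge(options))
    contents[section] = response.body
  end
end

connection = RecordingConnection.new
result = extract_sections_sketch(connection, "https://example.com/filing.htm", [:risk_factors, :mda])
connection.requests.size # => 2
result.keys              # => [:risk_factors, :mda]
```

Two requested sections produce two recorded requests, which is the cost behavior the note on #extract describes.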
