Class: SecApi::Extractor
- Inherits: Object
  - Object
  - SecApi::Extractor
- Defined in: lib/sec_api/extractor.rb
Overview
Extractor proxy for document extraction endpoints
All extractor methods return immutable ExtractedData objects (not raw hashes). This ensures thread safety and a consistent API surface.
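To illustrate what an immutable result object means in practice, here is a minimal sketch. `SketchData` and its `text` and `sections` attributes are hypothetical stand-ins; the real ExtractedData class and its attributes are not shown on this page and may differ.

```ruby
# Hypothetical stand-in for an immutable result object.
# The attribute names are assumptions made for this sketch.
class SketchData
  attr_reader :text, :sections

  def initialize(text:, sections: {})
    # Freeze copies so later changes to the caller's objects cannot leak in.
    @text = text.dup.freeze
    @sections = sections.dup.freeze
    freeze # the object itself can no longer be modified
  end
end

data = SketchData.new(text: "Item 1A. Risk Factors", sections: {risk_factors: "sample text"})
puts data.frozen?

begin
  data.instance_variable_set(:@text, "changed")
rescue FrozenError => e
  puts "rejected: #{e.class}"
end
```

Because a frozen object cannot change after construction, it can be shared across threads without locking. Note that the freeze in this sketch is shallow: the values inside `sections` are not frozen.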
Constant Summary
- SECTION_MAP =
This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.
Maps Ruby symbols to SEC item identifiers for 10-K filings
{ risk_factors: "1A", business: "1", mda: "7", financials: "8", legal_proceedings: "3", properties: "2", market_risk: "7A" }.freeze
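The lookup this constant provides can be tried in isolation. The sketch below copies the documented value into a local constant; `item_for` is a hypothetical helper, not part of SecApi, and raising on unknown symbols is an assumption of the sketch rather than documented library behavior.

```ruby
# Local copy of the documented value, for illustration only.
SECTION_MAP = {
  risk_factors: "1A", business: "1", mda: "7", financials: "8",
  legal_proceedings: "3", properties: "2", market_risk: "7A"
}.freeze

# Hypothetical helper: translate a Ruby symbol into a 10-K item identifier.
def item_for(section)
  SECTION_MAP.fetch(section) do
    raise ArgumentError, "unknown section: #{section.inspect}"
  end
end

puts item_for(:risk_factors) # prints 1A
puts item_for(:market_risk)  # prints 7A
```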
Instance Method Summary
- #extract(filing, sections: nil, **options) ⇒ ExtractedData
  Extracts text and sections from an SEC filing.
- #initialize(client) ⇒ SecApi::Extractor (constructor, private)
  Creates a new Extractor proxy instance.
Constructor Details
#initialize(client) ⇒ SecApi::Extractor
This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.
Creates a new Extractor proxy instance.
Extractor instances are obtained via Client#extractor and cached for reuse. Direct instantiation is not recommended.
# File 'lib/sec_api/extractor.rb', line 38
def initialize(client)
  @_client = client
end
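A cached accessor of the kind described above is commonly written with `||=` memoization. The sketch below uses stand-in classes (`SketchClient`, `SketchExtractor`); how the real Client#extractor caches its proxy is not shown on this page, so the memoization is an assumption.

```ruby
# Stand-in for the proxy; it stores the client the same way the constructor above does.
class SketchExtractor
  def initialize(client)
    @_client = client
  end
end

# Stand-in for the client. Memoizing with ||= is an assumption of this sketch.
class SketchClient
  def extractor
    # Build the proxy on first use, then hand back the same instance.
    @extractor ||= SketchExtractor.new(self)
  end
end

client = SketchClient.new
first = client.extractor
second = client.extractor
puts first.equal?(second) # the same cached object is returned both times
```

Going through the accessor rather than calling `new` directly means every caller shares one proxy per client.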
Instance Method Details
#extract(filing, sections: nil, **options) ⇒ ExtractedData
When extracting multiple sections, one API call is made per section. This may impact latency and API usage costs for large section lists.
Extracts text and sections from an SEC filing.
# File 'lib/sec_api/extractor.rb', line 68
def extract(filing, sections: nil, **)
  url = filing.is_a?(String) ? filing : filing.url

  if sections.nil? || sections.empty?
    # Default behavior - extract full filing
    response = @_client.connection.post("/extractor", {url: url}.merge(**))
    ExtractedData.from_api(response.body)
  else
    # Extract specified sections
    section_contents = extract_sections(url, Array(sections), **)
    ExtractedData.from_api({sections: section_contents})
  end
end
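The branching in #extract and the one-call-per-section behavior can be sketched without touching the network. Everything below is a stand-in: `RecordingConnection`, `FilingStub`, and `sketch_extract` are hypothetical, and the `:item` request key is an invented parameter name, since the private `extract_sections` helper is not shown on this page.

```ruby
# Records requests instead of sending them; a stand-in for the real connection.
class RecordingConnection
  attr_reader :requests

  def initialize
    @requests = []
  end

  def post(path, body)
    @requests << [path, body]
  end
end

# Any object that responds to #url can be passed in place of a URL string.
FilingStub = Struct.new(:url)

# Mirrors the branching of #extract. The :item key is an assumption.
def sketch_extract(connection, filing, sections: nil, **options)
  url = filing.is_a?(String) ? filing : filing.url
  if sections.nil? || sections.empty?
    # No sections requested: a single request for the full filing.
    connection.post("/extractor", {url: url}.merge(options))
  else
    # One request per requested section.
    Array(sections).each do |section|
      connection.post("/extractor", {url: url, item: section}.merge(options))
    end
  end
  connection.requests.size
end

url = "https://example.com/filing.htm"

full = RecordingConnection.new
puts sketch_extract(full, url) # prints 1

partial = RecordingConnection.new
puts sketch_extract(partial, FilingStub.new(url), sections: [:risk_factors, :mda, :financials]) # prints 3
```

Since the request count grows linearly with the section list, asking only for the sections actually needed keeps latency and API usage down.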