Class: Google::Cloud::DocumentAI::V1beta3::OcrConfig

Inherits:
Object
  • Object
Extended by:
Protobuf::MessageExts::ClassMethods
Includes:
Protobuf::MessageExts
Defined in:
proto_docs/google/cloud/documentai/v1beta3/document_io.rb

Overview

Config for Document OCR.

Defined Under Namespace

Classes: Hints

Instance Attribute Details

#advanced_ocr_options ⇒ ::Array<::String>

A list of advanced OCR options to further fine-tune OCR behavior. Current valid values are:

  • legacy_layout: a heuristic-based layout detection algorithm that serves as an alternative to the current ML-based layout detection algorithm. Customers can choose the layout algorithm that best suits their situation.

Returns:

  • (::Array<::String>)
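As a minimal sketch of how the option list might be assembled, the snippet below checks option names against the single value this page lists as valid. The helper `advanced_ocr_options` and the constant `KNOWN_ADVANCED_OCR_OPTIONS` are hypothetical names introduced for illustration; they are not part of the library, and the service may accept values not listed here.

```ruby
# Hypothetical helper, not part of the library: checks option names against
# the values this page currently documents as valid.
KNOWN_ADVANCED_OCR_OPTIONS = ["legacy_layout"].freeze

def advanced_ocr_options(*names)
  names = names.map(&:to_s).uniq
  unknown = names - KNOWN_ADVANCED_OCR_OPTIONS
  unless unknown.empty?
    raise ArgumentError, "unknown advanced OCR option(s): #{unknown.join(', ')}"
  end
  names
end

# A plain Hash keyed by the attribute name documented above.
ADVANCED_FIELDS = { advanced_ocr_options: advanced_ocr_options(:legacy_layout) }.freeze

# Assumption: with the google-cloud-document_ai-v1beta3 gem loaded, the
# generated message should accept these fields as keyword arguments:
#   Google::Cloud::DocumentAI::V1beta3::OcrConfig.new(**ADVANCED_FIELDS)
```

Validating on the client side is a design choice rather than a requirement; it only catches typos early and would need updating if more options become valid.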


# File 'proto_docs/google/cloud/documentai/v1beta3/document_io.rb', line 146

class OcrConfig
  include ::Google::Protobuf::MessageExts
  extend ::Google::Protobuf::MessageExts::ClassMethods

  # Hints for OCR Engine
  # @!attribute [rw] language_hints
  #   @return [::Array<::String>]
  #     List of BCP-47 language codes to use for OCR. In most cases, not
  #     specifying it yields the best results since it enables automatic language
  #     detection. For languages based on the Latin alphabet, setting hints is
  #     not needed. In rare cases, when the language of the text in the
  #     image is known, setting a hint will help get better results (although it
  #     will be a significant hindrance if the hint is wrong).
  class Hints
    include ::Google::Protobuf::MessageExts
    extend ::Google::Protobuf::MessageExts::ClassMethods
  end
end
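The `language_hints` comment in the listing above recommends leaving hints unset unless the language is actually known. A small sketch of that rule follows; `ocr_hints` is a hypothetical helper, not a library method, and it returns a plain Hash shaped like the `Hints` fields rather than a real message.

```ruby
# Hypothetical helper, not part of the library. Returns an empty Hash when no
# language is known, so automatic language detection stays enabled.
def ocr_hints(known_languages = [])
  codes = Array(known_languages).map { |code| code.to_s.strip }
                                .reject(&:empty?)
                                .uniq
  codes.empty? ? {} : { language_hints: codes }
end

NO_HINTS = ocr_hints.freeze            # nothing known: leave hints unset
JA_HINTS = ocr_hints(["ja"]).freeze    # language known in advance

# Assumption: with the google-cloud-document_ai-v1beta3 gem loaded, a nested
# Hash should be accepted for the message-typed `hints` field:
#   Google::Cloud::DocumentAI::V1beta3::OcrConfig.new(hints: JA_HINTS)
```

The helper does not verify that a code is well-formed BCP-47; per the comment above, a wrong hint can significantly hurt results, so hints are best passed only when the language is certain.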

#compute_style_info ⇒ ::Boolean

Turns on the font identification model and returns font style information.

Returns:

  • (::Boolean)




#enable_image_quality_scores ⇒ ::Boolean

Enables intelligent document quality scores after OCR. These scores can help diagnose why OCR responses are of poor quality for a given input. Enabling them adds latency comparable to regular OCR to the process call.

Returns:

  • (::Boolean)




#enable_native_pdf_parsing ⇒ ::Boolean

Enables special handling for PDFs with existing text information, which results in better text extraction quality for such PDF inputs.

Returns:

  • (::Boolean)




#enable_symbol ⇒ ::Boolean

Includes symbol-level OCR information if set to true.

Returns:

  • (::Boolean)
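The boolean attributes documented on this page can be combined in one configuration. The sketch below builds a plain Hash of only the flags that are switched on; `ocr_flags` and its keyword names are hypothetical and exist only for illustration.

```ruby
# Hypothetical helper, not part of the library. Maps short keyword names to
# the OcrConfig boolean attribute names and keeps only the enabled ones.
def ocr_flags(native_pdf: false, quality_scores: false, symbols: false, style_info: false)
  {
    enable_native_pdf_parsing: native_pdf,
    enable_image_quality_scores: quality_scores, # adds latency comparable to regular OCR
    enable_symbol: symbols,
    compute_style_info: style_info
  }.select { |_name, enabled| enabled }
end

# Example: a PDF that already carries text, with symbol-level output requested.
PDF_FLAGS = ocr_flags(native_pdf: true, symbols: true).freeze

# Assumption: with the google-cloud-document_ai-v1beta3 gem loaded, the
# generated message should accept these fields as keyword arguments:
#   Google::Cloud::DocumentAI::V1beta3::OcrConfig.new(**PDF_FLAGS)
```

Omitting disabled flags relies on unset boolean fields defaulting to false, which is standard protobuf behavior.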




#hints ⇒ ::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints

Hints for the OCR model.

Returns:

  • (::Google::Cloud::DocumentAI::V1beta3::OcrConfig::Hints)


