Class: Google::Cloud::Language::Document
- Inherits: Object
  - Object
    - Google::Cloud::Language::Document
- Defined in: lib/google/cloud/language/document.rb
Overview
# Document
Represents a document for the Language service.
The Cloud Natural Language API supports UTF-8, UTF-16, and UTF-32 encodings. (Ruby uses UTF-8 natively, and UTF-8 is the default sent to the API, so unless you are working with text processed on a different platform, you should not need to set the encoding type.)
Be aware that only English, Spanish, and Japanese content is supported, and that sentiment analysis supports only English text.
See Project#document.
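As a rough illustration of the encoding note above, the following sketch uses only Ruby's core String API (it makes no Language service calls) to check a string's encoding and to transcode text that arrived from another platform into UTF-8. The `to_utf8` helper and the sample text are hypothetical and are not part of the library.

```ruby
# Hypothetical helper (not part of google-cloud-language): return the text
# unchanged when it is already UTF-8, otherwise return a UTF-8 copy.
def to_utf8(text)
  return text if text.encoding == Encoding::UTF_8
  text.encode(Encoding::UTF_8)
end

native = "Darth Vader is the best villain in Star Wars."
# Simulate text that was produced on a platform using UTF-16.
foreign = native.encode(Encoding::UTF_16LE)

puts native.encoding             # encoding of an ordinary string literal
puts foreign.encoding            # encoding of the simulated foreign text
puts to_utf8(foreign) == native  # true when the round trip preserved the text
```

Normalizing to UTF-8 up front is one way to avoid setting the `encoding` option at all; passing the option explicitly is the alternative when the original encoding has to be preserved.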
Instance Attribute Summary
Class Method Summary
- .from_grpc(grpc, service) ⇒ Object
- .from_source(source, service, format: nil, language: nil) ⇒ Object
Instance Method Summary
- #annotate(sentiment: false, entities: false, syntax: false, encoding: nil) ⇒ Annotation (also: #mark, #detect)
  Analyzes the document and returns sentiment, entity, and syntactic feature results, depending on the option flags.
- #content? ⇒ Boolean
- #entities(encoding: nil) ⇒ Annotation::Entities
  Entity analysis inspects the given text for known entities (proper nouns such as public figures, landmarks, etc.) and returns information about those entities.
- #format ⇒ Symbol
  The document’s format.
- #format=(new_format) ⇒ Object
  Sets the document’s format.
- #html! ⇒ Object
  Sets the document to the `HTML` format.
- #html? ⇒ Boolean
  Whether the document is the `HTML` format.
- #initialize ⇒ Document (constructor)
  A new instance of Document.
- #inspect ⇒ Object
- #language ⇒ String
  The document’s language.
- #language=(new_language) ⇒ Object
  Sets the document’s language.
- #sentiment ⇒ Annotation::Sentiment
  Sentiment analysis inspects the given text and identifies the prevailing emotional opinion within the text, especially to determine a writer’s attitude as positive, negative, or neutral.
- #source ⇒ Object
- #syntax(encoding: nil) ⇒ Annotation
  Syntactic analysis extracts linguistic information, breaking up the given text into a series of sentences and tokens (generally, word boundaries), providing further analysis on those tokens.
- #text! ⇒ Object
  Sets the document to the `TEXT` format.
- #text? ⇒ Boolean
  Whether the document is the `TEXT` format.
- #to_grpc ⇒ Object
- #url? ⇒ Boolean
Constructor Details
#initialize ⇒ Document
Returns a new instance of Document.
    # File 'lib/google/cloud/language/document.rb', line 59
    def initialize
      @grpc = nil
      @service = nil
    end
Instance Attribute Details
#service ⇒ Object
    # File 'lib/google/cloud/language/document.rb', line 55
    def service
      @service
    end
Class Method Details
.from_grpc(grpc, service) ⇒ Object
    # File 'lib/google/cloud/language/document.rb', line 331
    def self.from_grpc grpc, service
      new.tap do |i|
        i.instance_variable_set :@grpc, grpc
        i.instance_variable_set :@service, service
      end
    end
.from_source(source, service, format: nil, language: nil) ⇒ Object
    # File 'lib/google/cloud/language/document.rb', line 340
    def self.from_source source, service, format: nil, language: nil
      source = String source
      grpc = Google::Cloud::Language::V1beta1::Document.new
      if source.start_with? "gs://"
        grpc.gcs_content_uri = source
        format ||= :html if source.end_with? ".html"
      else
        grpc.content = source
      end
      if format.to_s == "html"
        grpc.type = :HTML
      else
        grpc.type = :PLAIN_TEXT
      end
      grpc.language = language.to_s unless language.nil?
      from_grpc grpc, service
    end
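The branching in `.from_source` can be tried without the gem or any credentials. The sketch below mirrors that logic against a plain `Struct`; `SketchDoc`, `sketch_from_source`, and the `gs://bucket-name/...` path are hypothetical stand-ins introduced for illustration, not library names.

```ruby
# Hypothetical stand-in for the gRPC document message; only the fields
# touched by from_source are modelled.
SketchDoc = Struct.new(:content, :gcs_content_uri, :type, :language)

# Mirrors the branching shown in from_source: a "gs://" source is stored as
# a Cloud Storage URI, anything else is stored as inline content.
def sketch_from_source(source, format: nil, language: nil)
  source = String(source)
  doc = SketchDoc.new
  if source.start_with?("gs://")
    doc.gcs_content_uri = source
    # Infer HTML from the file extension only when no format was given.
    format ||= :html if source.end_with?(".html")
  else
    doc.content = source
  end
  doc.type = format.to_s == "html" ? :HTML : :PLAIN_TEXT
  doc.language = language.to_s unless language.nil?
  doc
end

inline = sketch_from_source("Hello world", language: :en)
remote = sketch_from_source("gs://bucket-name/path/to/page.html")
p inline.type, inline.language
p remote.type, remote.gcs_content_uri
```

Two consequences of this logic are worth noting: an explicit `format:` always wins over the `.html` extension, and inline content is never inferred to be HTML, so HTML strings need `format: :html` to be treated as such.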
Instance Method Details
#annotate(sentiment: false, entities: false, syntax: false, encoding: nil) ⇒ Annotation
Also known as: mark, detect
Analyzes the document and returns sentiment, entity, and syntactic feature results, depending on the option flags. Calling `annotate` with no arguments will perform all analysis features. Each feature is priced separately. See [Pricing](cloud.google.com/natural-language/pricing) for details.
    # File 'lib/google/cloud/language/document.rb', line 219
    def annotate sentiment: false, entities: false, syntax: false,
                 encoding: nil
      ensure_service!
      grpc = service.annotate to_grpc, sentiment: sentiment,
                                       entities: entities,
                                       syntax: syntax,
                                       encoding: encoding
      Annotation.from_grpc grpc
    end
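All three feature flags default to `false`, yet the description says that calling `annotate` with no arguments performs every analysis. The listing does not show where that resolution happens; presumably it lives in the service layer. The sketch below is one way such a rule could be written, and `sketch_features` is a hypothetical helper rather than the library's actual code.

```ruby
# Hypothetical helper, not taken from the library: resolve the feature
# flags so that "no flag set" means "run every feature".
def sketch_features(sentiment: false, entities: false, syntax: false)
  requested = { sentiment: sentiment, entities: entities, syntax: syntax }
  # When nothing was requested explicitly, enable everything.
  return requested.transform_values { true } if requested.values.none?
  requested
end

p sketch_features                  # no arguments: every feature enabled
p sketch_features(entities: true)  # only the requested feature enabled
```

Because each feature is priced separately, requesting only the flags that are needed keeps a call from running analyses whose results would be discarded.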
#content? ⇒ Boolean
    # File 'lib/google/cloud/language/document.rb', line 67
    def content?
      @grpc.source == :content
    end
#entities(encoding: nil) ⇒ Annotation::Entities
Entity analysis inspects the given text for known entities (proper nouns such as public figures, landmarks, etc.) and returns information about those entities.
    # `language` is assumed to be a Language project object (see Project#document).
    content = "Darth Vader is the best villain in Star Wars."
    document = language.document content
    entities = document.entities # API call

    entities.count #=> 2
    entities.first.name #=> "Darth Vader"
    entities.first.type #=> :PERSON
    entities.last.name #=> "Star Wars"
    entities.last.type #=> :WORK_OF_ART
    # File 'lib/google/cloud/language/document.rb', line 282
    def entities encoding: nil
      ensure_service!
      grpc = service.entities to_grpc, encoding: encoding
      Annotation::Entities.from_grpc grpc
    end
#format ⇒ Symbol
The document’s format.
    # File 'lib/google/cloud/language/document.rb', line 91
    def format
      return :text if text?
      return :html if html?
    end
#format=(new_format) ⇒ Object
Sets the document’s format.
    # File 'lib/google/cloud/language/document.rb', line 106
    def format= new_format
      @grpc.type = :PLAIN_TEXT if new_format.to_s == "text"
      @grpc.type = :HTML if new_format.to_s == "html"
      @grpc.type
    end
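The reader and writer for the format map between the symbols `:text`/`:html` and the underlying `:PLAIN_TEXT`/`:HTML` type. A minimal sketch of that mapping, using a hypothetical `SketchFormatDoc` struct in place of the real gRPC-backed document:

```ruby
# Hypothetical stand-in: a struct with a single `type` field replaces the
# gRPC message used by the real Document class.
SketchFormatDoc = Struct.new(:type) do
  # Maps the stored type to :text or :html; returns nil for anything else.
  def format
    return :text if type == :PLAIN_TEXT
    return :html if type == :HTML
  end

  # Accepts "text"/"html" as a String or Symbol; other values are ignored.
  def format=(new_format)
    self.type = :PLAIN_TEXT if new_format.to_s == "text"
    self.type = :HTML if new_format.to_s == "html"
    type
  end
end

doc = SketchFormatDoc.new(:PLAIN_TEXT)
doc.format = "html"
p doc.format
doc.format = :markdown # unrecognized value, so the type is left unchanged
p doc.format
```

Under this logic an unrecognized format is silently ignored rather than rejected, so a misspelled value leaves the document in whatever format it already had.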
#html! ⇒ Object
Sets the document to the `HTML` format.
    # File 'lib/google/cloud/language/document.rb', line 140
    def html!
      @grpc.type = :HTML
    end
#html? ⇒ Boolean
Whether the document is the `HTML` format.
    # File 'lib/google/cloud/language/document.rb', line 133
    def html?
      @grpc.type == :HTML
    end
#inspect ⇒ Object
    # File 'lib/google/cloud/language/document.rb', line 317
    def inspect
      "#<#{self.class.name} (" \
        "#{(content? ? "\"#{source[0, 16]}...\"" : source)}, " \
        "format: #{format.inspect}, language: #{language.inspect})>"
    end
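To see what `#inspect` produces, the string-building step can be reproduced in isolation. In this sketch `sketch_inspect` is a hypothetical function that takes the document's values as arguments and hard-codes the class name; inline content is quoted and cut to its first 16 characters, while a Cloud Storage URI is shown in full.

```ruby
# Hypothetical re-creation of the string built by #inspect; the class name
# is hard-coded and the document state is passed in as arguments.
def sketch_inspect(source, inline:, format:, language:)
  shown = inline ? "\"#{source[0, 16]}...\"" : source
  "#<Google::Cloud::Language::Document (#{shown}, " \
    "format: #{format.inspect}, language: #{language.inspect})>"
end

puts sketch_inspect("Darth Vader is the best villain in Star Wars.",
                    inline: true, format: :text, language: "en")
# prints: #<Google::Cloud::Language::Document ("Darth Vader is t...", format: :text, language: "en")>
```

Note that the trailing `...` is appended even when the content is 16 characters or shorter, since the listing adds it unconditionally.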
#language ⇒ String
The document’s language. ISO and BCP-47 language codes are supported.
    # File 'lib/google/cloud/language/document.rb', line 149
    def language
      @grpc.language
    end
#language=(new_language) ⇒ Object
Sets the document’s language.
    # File 'lib/google/cloud/language/document.rb', line 163
    def language= new_language
      @grpc.language = new_language.to_s
    end
#sentiment ⇒ Annotation::Sentiment
Sentiment analysis inspects the given text and identifies the prevailing emotional opinion within the text, especially to determine a writer’s attitude as positive, negative, or neutral. Currently, only English is supported for sentiment analysis.
    content = "Darth Vader is the best villain in Star Wars."
    document = language.document content
    sentiment = document.sentiment # API call

    sentiment.polarity #=> 1.0
    sentiment.magnitude #=> 0.8999999761581421
    # File 'lib/google/cloud/language/document.rb', line 310
    def sentiment
      ensure_service!
      grpc = service.sentiment to_grpc
      Annotation::Sentiment.from_grpc grpc
    end
#source ⇒ Object
    # File 'lib/google/cloud/language/document.rb', line 81
    def source
      return @grpc.content if content?
      @grpc.gcs_content_uri
    end
#syntax(encoding: nil) ⇒ Annotation
Syntactic analysis extracts linguistic information, breaking up the given text into a series of sentences and tokens (generally, word boundaries), providing further analysis on those tokens.
    # File 'lib/google/cloud/language/document.rb', line 252
    def syntax encoding: nil
      annotate syntax: true, encoding: encoding
    end
#text! ⇒ Object
Sets the document to the `TEXT` format.
    # File 'lib/google/cloud/language/document.rb', line 124
    def text!
      @grpc.type = :PLAIN_TEXT
    end
#text? ⇒ Boolean
Whether the document is the `TEXT` format.
    # File 'lib/google/cloud/language/document.rb', line 117
    def text?
      @grpc.type == :PLAIN_TEXT
    end
#to_grpc ⇒ Object
    # File 'lib/google/cloud/language/document.rb', line 325
    def to_grpc
      @grpc
    end
#url? ⇒ Boolean
    # File 'lib/google/cloud/language/document.rb', line 74
    def url?
      @grpc.source == :gcs_content_uri
    end