Class: Google::Cloud::DiscoveryEngine::V1::TrainCustomModelRequest::GcsTrainingInput
- Inherits:
- Object
- Extended by:
- Protobuf::MessageExts::ClassMethods
- Includes:
- Protobuf::MessageExts
- Defined in:
- proto_docs/google/cloud/discoveryengine/v1/search_tuning_service.rb
Overview
Cloud Storage training data input.
Instance Attribute Summary
-
#corpus_data_path ⇒ ::String
The Cloud Storage corpus data that can be associated with the training data.
-
#query_data_path ⇒ ::String
The Cloud Storage query data that can be associated with the training data.
-
#test_data_path ⇒ ::String
Cloud Storage test data.
-
#train_data_path ⇒ ::String
Cloud Storage training data path, whose format should be gs://<bucket_to_data>/<tsv_file_name>.
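Since all four attributes are Cloud Storage paths, a small pre-flight check can catch malformed values before a request is built. The following is a minimal sketch, not part of the library: the helper name gcs_training_input_fields, the GCS_PATH pattern, and the choice of which keywords are mandatory are assumptions made for illustration. The resulting hash uses the attribute names listed above, so it should be usable as the field hash for a GcsTrainingInput message.

```ruby
# Hypothetical pattern: "gs://", a bucket name, then an object name.
GCS_PATH = %r{\Ags://[^/]+/.+\z}

# Hypothetical helper: collects the attribute values into a hash keyed by
# the attribute names documented above, rejecting anything that does not
# look like a gs:// path. test_data_path is left out when not given.
def gcs_training_input_fields(corpus:, query:, train:, test: nil)
  fields = {
    corpus_data_path: corpus,
    query_data_path: query,
    train_data_path: train
  }
  fields[:test_data_path] = test if test

  fields.each do |name, path|
    raise ArgumentError, "#{name} is not a gs:// path: #{path}" unless path.match?(GCS_PATH)
  end
  fields
end
```

The bucket and file names passed to the helper are up to the caller; none are implied by this reference page.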
Instance Attribute Details
#corpus_data_path ⇒ ::String
Returns the Cloud Storage corpus data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file must be newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id, title, and text fields. Example:
{"_id": "doc1", "title": "relevant doc", "text": "relevant text"}
# File 'proto_docs/google/cloud/discoveryengine/v1/search_tuning_service.rb', line 110
class GcsTrainingInput
  include ::Google::Protobuf::MessageExts
  extend ::Google::Protobuf::MessageExts::ClassMethods
end
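As an illustration of the corpus file format described for corpus_data_path, the sketch below serializes documents as one JSON object per line using Ruby's standard json library. The helper name corpus_jsonl and the sample documents are hypothetical; only the _id, title, and text field names come from the description above.

```ruby
require "json"

# Hypothetical helper: turns an array of document hashes into
# newline-delimited JSON, one document per line.
def corpus_jsonl(docs)
  docs.map { |doc| JSON.generate(doc) }.join("\n") + "\n"
end

# Sample documents, made up for illustration.
sample_docs = [
  { "_id" => "doc1", "title" => "relevant doc", "text" => "relevant text" },
  { "_id" => "doc2", "title" => "another doc", "text" => "more text" }
]

puts corpus_jsonl(sample_docs)
```

The output would then be uploaded to the Cloud Storage location named in corpus_data_path; the upload step is outside the scope of this sketch.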
#query_data_path ⇒ ::String
Returns the Cloud Storage query data that can be associated with the training data.
The data path format is gs://<bucket_to_data>/<jsonl_file_name>.
The file must be newline-delimited JSON (JSONL/NDJSON).
For a search-tuning model, each line should have the _id and text fields. Example: {"_id": "query1", "text": "example query"}.
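A query file can be checked locally against the format described for query_data_path before it is uploaded. This is a minimal sketch under the assumption that a line is acceptable when it parses as a JSON object containing both _id and text; the helper name invalid_query_lines is hypothetical, and the service may apply further checks not covered here.

```ruby
require "json"

REQUIRED_QUERY_KEYS = %w[_id text].freeze

# Hypothetical helper: returns the 1-based numbers of lines that are not
# JSON objects or that lack one of the required keys.
def invalid_query_lines(jsonl)
  bad = []
  jsonl.each_line.with_index(1) do |line, number|
    record = begin
      JSON.parse(line)
    rescue JSON::ParserError
      nil
    end
    valid = record.is_a?(Hash) && REQUIRED_QUERY_KEYS.all? { |key| record.key?(key) }
    bad << number unless valid
  end
  bad
end
```

An empty result means every line passed this local check.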
#test_data_path ⇒ ::String
Returns Cloud Storage test data. Same format as train_data_path. If not provided, a random 80/20 train/test split will be performed on train_data_path.
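When a reproducible split is preferred over the automatic one, the training TSV can be divided locally and the two parts supplied as train_data_path and test_data_path. The sketch below is one possible approach and is not a description of how the service performs its own split; the helper name split_train_test, the default seed, and the rounding rule are assumptions.

```ruby
# Hypothetical helper: splits TSV text (header line plus data rows) into
# [train_tsv, test_tsv]. Both parts keep the header. A fixed seed makes
# the shuffle repeatable.
def split_train_test(tsv, test_fraction: 0.2, seed: 42)
  header, *rows = tsv.lines(chomp: true)
  shuffled = rows.shuffle(random: Random.new(seed))
  test_size = (rows.size * test_fraction).round

  test_rows = shuffled.first(test_size)
  train_rows = shuffled.drop(test_size)

  [
    [header, *train_rows].join("\n") + "\n",
    [header, *test_rows].join("\n") + "\n"
  ]
end
```

With the default fraction, a file of ten data rows yields eight training rows and two test rows.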
#train_data_path ⇒ ::String
Returns the Cloud Storage training data path, whose format should be gs://<bucket_to_data>/<tsv_file_name>. The file should be in TSV format. Each line should have the doc_id, query_id, and score (a number).
For a search-tuning model, the file should have query-id, corpus-id, and score as its TSV header. The score should be a number in [0, +inf). The larger the number, the more relevant the pair. Example (\t denotes a tab character):
query-id\tcorpus-id\tscore
query1\tdoc1\t1
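To make the training file layout concrete, the sketch below builds TSV text with the query-id, corpus-id, score header and rejects scores outside [0, +inf). The helper name train_tsv and the sample IDs are hypothetical; the header names and the score range come from the description of train_data_path.

```ruby
TRAIN_HEADER = %w[query-id corpus-id score].freeze

# Hypothetical helper: builds tab-separated training data from
# [query_id, corpus_id, score] triples, header line first.
def train_tsv(pairs)
  rows = pairs.map do |query_id, corpus_id, score|
    unless score.is_a?(Numeric) && score.finite? && score >= 0
      raise ArgumentError, "score must be a finite, non-negative number: #{score.inspect}"
    end
    [query_id, corpus_id, score].join("\t")
  end
  ([TRAIN_HEADER.join("\t")] + rows).join("\n") + "\n"
end

# Sample relevance judgments, made up for illustration.
print train_tsv([["query1", "doc1", 1], ["query2", "doc7", 0.5]])
```

The query and corpus IDs are expected to match the _id values used in the query and corpus files.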