Class: Braintrust::Resources::Evals

Inherits:
Object
Defined in:
lib/braintrust/resources/evals.rb

Instance Method Summary collapse

Constructor Details

#initialize(client:) ⇒ Evals

Returns a new instance of Evals.



# File 'lib/braintrust/resources/evals.rb', line 6

def initialize(client:)
  @client = client
end

Instance Method Details

#create(params = {}, opts = {}) ⇒ Braintrust::Models::SummarizeExperimentResponse

Launch an evaluation. This is the API equivalent of the Eval function that is built into the Braintrust SDK. In the Eval API, you provide pointers to a dataset, task function, and scoring functions. The API will then run the evaluation, create an experiment, and return the results along with a link to the experiment. To learn more about evals, see the Evals guide.

Parameters:

  • params (Hash) (defaults to: {})

    Attributes to send in this request.

  • opts (Hash|RequestOptions) (defaults to: {})

    Options to specify HTTP behaviour for this request.

Options Hash (params):

  • :data (Data::UnnamedTypeWithunionParent38|Data::UnnamedTypeWithunionParent39)

    The dataset to use

  • :project_id (String)

    Unique identifier for the project to run the eval in

  • :scores (Array<Score::UnnamedTypeWithunionParent40|Score::UnnamedTypeWithunionParent41|Score::UnnamedTypeWithunionParent42|Score::UnnamedTypeWithunionParent43|Score::UnnamedTypeWithunionParent44|Score::UnnamedTypeWithunionParent45>)

    The functions to score the eval on

  • :task (Task::UnnamedTypeWithunionParent46|Task::UnnamedTypeWithunionParent47|Task::UnnamedTypeWithunionParent48|Task::UnnamedTypeWithunionParent49|Task::UnnamedTypeWithunionParent50|Task::UnnamedTypeWithunionParent51)

    The function to evaluate

  • :experiment_name (String)

    An optional name for the experiment created by this eval. If it conflicts with an existing experiment, it will be suffixed with a unique identifier.

  • :metadata (Hash)

    Optional experiment-level metadata to store about the evaluation. You can later use this to slice & dice across experiments.

  • :stream (Boolean)

    Whether to stream the results of the eval. If true, the request will return two events: one to indicate the experiment has started, and another upon completion. If false, the request will return the evaluation's summary upon completion.

Returns:

  • (Braintrust::Models::SummarizeExperimentResponse)

# File 'lib/braintrust/resources/evals.rb', line 33

def create(params = {}, opts = {})
  req = {}
  req[:method] = :post
  req[:path] = "/v1/eval"
  req[:body] = params
  req[:model] = Braintrust::Models::SummarizeExperimentResponse
  @client.request(req, opts)
end
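
A hypothetical usage sketch follows. The ID values and the hash shapes for `:data`, `:task`, and `:scores` are placeholders (the actual union types are the `UnnamedTypeWithunionParent*` variants listed above), and the client construction is assumed rather than documented here.

```ruby
# Sketch of a params hash for Evals#create. All IDs below are placeholders;
# the exact shapes accepted for :data, :task, and :scores are union types
# (see the Options Hash above) -- these single-key hashes are an assumption.
params = {
  project_id: "YOUR_PROJECT_ID",                  # project to run the eval in
  data: { dataset_id: "YOUR_DATASET_ID" },        # pointer to the dataset
  task: { function_id: "YOUR_TASK_FUNCTION_ID" }, # function to evaluate
  scores: [
    { function_id: "YOUR_SCORER_FUNCTION_ID" }    # scoring functions
  ],
  experiment_name: "nightly-eval",                # optional; suffixed on conflict
  metadata: { "git_sha" => "abc123" },            # optional experiment metadata
  stream: false                                   # return the summary on completion
}

# With a configured client (assumed to expose an `evals` resource):
#   summary = client.evals.create(params)
#   # => Braintrust::Models::SummarizeExperimentResponse
```

With `stream: false`, the call blocks until the evaluation completes and returns the experiment summary; with `stream: true`, per the description above, the request instead emits a start event and a completion event.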