Langfuse Ruby SDK

Ruby SDK for Langfuse - the open-source LLM engineering platform. This SDK provides comprehensive tracing, prompt management, and evaluation capabilities for LLM applications.

Features

  • 🔍 Tracing: Complete observability for LLM applications with traces, spans, and generations
  • 📝 Prompt Management: Version control and deployment of prompts with caching
  • 📊 Evaluation: Built-in evaluators and custom scoring capabilities
  • 🎯 Events: Generic event tracking for custom application events and logging
  • 🚀 Async Processing: Background event processing with automatic batching
  • 🔒 Error Handling: Comprehensive error classes and input validation
  • 🔌 Framework Integration: Easy integration with popular Ruby frameworks

Installation

Add this line to your application's Gemfile:

gem 'langfuse-ruby'

And then execute:

$ bundle install

Or install it yourself as:

$ gem install langfuse-ruby

Quick Start

1. Initialize the Client

require 'langfuse'

# Initialize with API keys
client = Langfuse.new(
  public_key: "pk-lf-...",
  secret_key: "sk-lf-...",
  host: "https://cloud.langfuse.com"  # Optional, defaults to cloud.langfuse.com
)

# Or configure globally
Langfuse.configure do |config|
  config.public_key = "pk-lf-..."
  config.secret_key = "sk-lf-..."
  config.host = "https://cloud.langfuse.com"
end

client = Langfuse.new

2. Basic Tracing

# Create a trace
trace = client.trace(
  name: "chat-completion",
  user_id: "user123",
  session_id: "session456",
  metadata: { environment: "production" }
)

# Add a generation (LLM call)
generation = trace.generation(
  name: "openai-completion",
  model: "gpt-3.5-turbo",
  input: [{ role: "user", content: "Hello, world!" }],
  model_parameters: { temperature: 0.7, max_tokens: 100 }
)

generation.end(output: 'Hello! How can I help you today?', usage: { prompt_tokens: 10, completion_tokens: 15, total_tokens: 25 })

trace.update(output: 'Hello! How can I help you today?')

# Flush events (optional - happens automatically)
client.flush

3. Nested Spans

trace = client.trace(name: "document-qa")

# Create a span for document retrieval
retrieval_span = trace.span(
  name: "document-retrieval",
  input: { query: "What is machine learning?" }
)

# Add a generation for embedding
embedding_gen = retrieval_span.generation(
  name: "embedding-generation",
  model: "text-embedding-ada-002",
  input: "What is machine learning?",
  output: [0.1, 0.2, 0.3], # embedding vector
  usage: { prompt_tokens: 5, total_tokens: 5 }
)

# End the retrieval span
retrieval_span.end(
  output: { documents: ["ML is...", "Machine learning involves..."] }
)

# Create a span for answer generation
answer_span = trace.span(
  name: "answer-generation",
  input: { 
    query: "What is machine learning?",
    context: ["ML is...", "Machine learning involves..."]
  }
)

# Add LLM generation
llm_gen = answer_span.generation(
  name: "openai-completion",
  model: "gpt-3.5-turbo",
  input: [
    { role: "system", content: "Answer based on context" },
    { role: "user", content: "What is machine learning?" }
  ]
)

llm_gen.end(
  output: { answer: "Machine learning is a subset of AI..." },
  usage: { prompt_tokens: 50, completion_tokens: 30, total_tokens: 80 }
)

# End the answer span
answer_span.end(output: { answer: "Machine learning is a subset of AI..." })

Events

Create generic events for custom application events and logging:

# Create events from trace
event = trace.event(
  name: "user_action",
  input: { action: "login", user_id: "123" },
  output: { success: true },
  metadata: { ip: "192.168.1.1" }
)

# Create events from a span or generation (span here is any span created via trace.span)
validation_event = span.event(
  name: "validation_check",
  input: { rules: ["required", "format"] },
  output: { valid: true, warnings: [] }
)

# Direct event creation
event = client.event(
  trace_id: trace.id,
  name: "audit_log",
  input: { operation: "data_export" },
  output: { status: "completed" },
  level: "INFO"
)

Prompt Management

Get and Use Prompts

# Get a prompt
prompt = client.get_prompt("greeting-prompt", version: 1)

# Compile prompt with variables
compiled = prompt.compile(
  user_name: "Alice",
  topic: "machine learning"
)

puts compiled
# Output: "Hello Alice! How can I help you with machine learning today?"

Create Prompts

# Create a text prompt
text_prompt = client.create_prompt(
  name: "greeting-prompt",
  prompt: "Hello {{user_name}}! How can I help you with {{topic}} today?",
  labels: ["greeting", "customer-service"],
  config: { temperature: 0.7 }
)

# Create a chat prompt
chat_prompt = client.create_prompt(
  name: "chat-prompt",
  prompt: [
    { role: "system", content: "You are a helpful assistant specialized in {{domain}}." },
    { role: "user", content: "{{user_message}}" }
  ],
  labels: ["chat", "assistant"]
)

Prompt Templates

# Create prompt templates for reuse
template = Langfuse::PromptTemplate.from_template(
  "Translate the following text to {{language}}: {{text}}"
)

translated = template.format(
  language: "Spanish",
  text: "Hello, world!"
)

# Chat prompt templates
chat_template = Langfuse::ChatPromptTemplate.from_messages([
  { role: "system", content: "You are a {{role}} assistant." },
  { role: "user", content: "{{user_input}}" }
])

messages = chat_template.format(
  role: "helpful",
  user_input: "What is Ruby?"
)
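
The formatted messages can be passed straight into a traced generation. A minimal sketch tying templates to tracing (the model name and output below are placeholders, not part of the template API):

# Trace a completion built from the templated messages
trace = client.trace(name: "templated-chat")

generation = trace.generation(
  name: "templated-completion",
  model: "gpt-3.5-turbo",
  input: messages  # messages formatted by chat_template above
)

# ... call your LLM provider here, then record its result
generation.end(output: "Ruby is a dynamic, object-oriented programming language...")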

Evaluation and Scoring

Built-in Evaluators

# Exact match evaluator
exact_match = Langfuse::Evaluators::ExactMatchEvaluator.new

result = exact_match.evaluate(
  input: "What is 2+2?",
  output: "4",
  expected: "4"
)
# => { name: "exact_match", value: 1, comment: "Exact match" }

# Similarity evaluator
similarity = Langfuse::Evaluators::SimilarityEvaluator.new

result = similarity.evaluate(
  input: "What is AI?",
  output: "Artificial Intelligence is...",
  expected: "AI is artificial intelligence..."
)
# => { name: "similarity", value: 0.85, comment: "Similarity: 85%" }

# Length evaluator
length = Langfuse::Evaluators::LengthEvaluator.new(min_length: 10, max_length: 100)

result = length.evaluate(
  input: "Explain AI",
  output: "AI is a field of computer science that focuses on creating intelligent machines."
)
# => { name: "length", value: 1, comment: "Length 80 within range" }

Custom Scoring

# Add scores to traces or observations
trace = client.trace(name: "qa-session")

# Score the entire trace
trace.score(
  name: "user-satisfaction",
  value: 0.9,
  comment: "User was very satisfied"
)

# Score specific generations
generation = trace.generation(
  name: "answer-generation",
  model: "gpt-3.5-turbo",
  output: { content: "The answer is 42." }
)

generation.score(
  name: "accuracy",
  value: 0.8,
  comment: "Mostly accurate answer"
)

generation.score(
  name: "helpfulness",
  value: 0.95,
  comment: "Very helpful response"
)
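
Evaluator results pair naturally with scoring: the hash returned by evaluate carries the same name/value/comment fields that score accepts. A minimal sketch combining the two:

# Record an evaluator result as a score on the generation above
result = exact_match.evaluate(
  input: "What is 2+2?",
  output: "4",
  expected: "4"
)

generation.score(
  name: result[:name],
  value: result[:value],
  comment: result[:comment]
)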

Advanced Usage

Error Handling

begin
  client = Langfuse.new(
    public_key: "invalid-key",
    secret_key: "invalid-secret"
  )

  trace = client.trace(name: "test")
  client.flush
rescue Langfuse::AuthenticationError => e
  puts "Authentication failed: #{e.message}"
rescue Langfuse::RateLimitError => e
  puts "Rate limit exceeded: #{e.message}"
rescue Langfuse::NetworkError => e
  puts "Network error: #{e.message}"
rescue Langfuse::APIError => e
  puts "API error: #{e.message}"
end

Configuration Options

client = Langfuse.new(
  public_key: "pk-lf-...",
  secret_key: "sk-lf-...",
  host: "https://your-instance.langfuse.com",
  debug: true,        # Enable debug logging
  timeout: 30,        # Request timeout in seconds
  retries: 3,         # Number of retry attempts
  flush_interval: 30, # Event flush interval in seconds (default: 5)
  auto_flush: true    # Enable automatic flushing (default: true)
)

Environment Variables

You can also configure the client using environment variables:

export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"
export LANGFUSE_FLUSH_INTERVAL=5
export LANGFUSE_AUTO_FLUSH=true
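
With these variables set, the client picks up its configuration automatically and can be created without arguments:

# Reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, etc. from the environment
client = Langfuse.new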

Automatic Flush Control

By default, the Langfuse client automatically flushes events to the server at regular intervals using a background thread. You can control this behavior:

Enable/Disable Auto Flush

# Enable automatic flushing (default)
client = Langfuse.new(
  public_key: "pk-lf-...",
  secret_key: "sk-lf-...",
  auto_flush: true,
  flush_interval: 5  # Flush every 5 seconds
)

# Disable automatic flushing for manual control
client = Langfuse.new(
  public_key: "pk-lf-...",
  secret_key: "sk-lf-...",
  auto_flush: false
)

# Manual flush when auto_flush is disabled
client.flush

Global Configuration

Langfuse.configure do |config|
  config.auto_flush = false  # Disable auto flush globally
  config.flush_interval = 10
end

Environment Variable

export LANGFUSE_AUTO_FLUSH=false

Use Cases

Auto Flush Enabled (Default)

  • Best for most applications
  • Events are sent automatically
  • No manual management required

Auto Flush Disabled

  • Better performance for batch operations
  • More control over when events are sent
  • Requires manual flush calls
  • Useful for high-frequency operations

# Example: Batch processing with manual flush
client = Langfuse.new(auto_flush: false)

# Process many items
1000.times do |i|
  trace = client.trace(name: "batch-item-#{i}")
  # ... process item
end

# Flush all events at once
client.flush

Shutdown

# Ensure all events are flushed before shutdown
client.shutdown
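
For long-running processes, you can tie this to process exit so queued events are not lost. A small sketch using Ruby's standard at_exit hook:

# Flush remaining events when the process terminates
at_exit { client.shutdown }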

Framework Integration

Rails Integration

# config/initializers/langfuse.rb
Langfuse.configure do |config|
  config.public_key = Rails.application.credentials.langfuse_public_key
  config.secret_key = Rails.application.credentials.langfuse_secret_key
  config.debug = Rails.env.development?
end

# In your controller or service
class ChatController < ApplicationController
  def create
    @client = Langfuse.new

    trace = @client.trace(
      name: "chat-request",
      user_id: current_user.id,
      session_id: session.id,
      input: params[:message],
      metadata: { 
        controller: self.class.name,
        action: action_name,
        ip: request.remote_ip
      }
    )

    # Your LLM logic here
    response = generate_response(params[:message])

    # Attach the final output to the trace
    trace.update(output: { response: response })

    render json: { response: response }
  end
end

Sidekiq Integration

# An ActiveJob class (backed by Sidekiq) that traces background LLM work
class LLMProcessingJob < ApplicationJob
  def perform(user_id, message)
    client = Langfuse.new

    trace = client.trace(
      name: "background-llm-processing",
      user_id: user_id,
      input: { message: message },
      metadata: { job_class: self.class.name }
    )

    # Process with LLM
    result = process_with_llm(message)
    trace.update(output: { result: result })
  ensure
    # Ensure events are flushed, even if the job raises
    client&.flush
  end
end

Examples

See the examples/ directory for more comprehensive usage examples.

Development

After checking out the repo, run:

bin/setup

to install dependencies. Then run:

rake spec

to run the tests. You can also run:

bin/console

for an interactive prompt that will allow you to experiment.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/ai-firstly/langfuse-ruby.

License

The gem is available as open source under the terms of the MIT License.