BetterTranslate
AI-powered YAML locale file translator for Rails and Ruby projects
BetterTranslate automatically translates your YAML locale files using cutting-edge AI providers (ChatGPT, Google Gemini, and Anthropic Claude). It's designed for Rails applications but works with any Ruby project that uses YAML-based internationalization.
Why BetterTranslate?
- Production-Ready: Tested with real APIs via VCR cassettes (18 cassettes, 260KB)
- Interactive Demo: Try it in 2 minutes with `ruby spec/dummy/demo_translation.rb`
- Variable Preservation: `%{name}` placeholders maintained in translations
- Nested YAML Support: Complex structures preserved perfectly
- Multiple Providers: Choose ChatGPT, Gemini, or Claude
| Provider | Model | Speed | Quality | Cost |
|---|---|---|---|---|
| ChatGPT | GPT-5-nano | Fast | Excellent | Medium |
| Gemini | gemini-2.0-flash-exp | Very Fast | Very Good | Low |
| Claude | Claude 3.5 | Medium | Excellent | High |
Features
Core Translation Features
- Multiple AI Providers: Support for ChatGPT (GPT-5-nano), Google Gemini (gemini-2.0-flash-exp), and Anthropic Claude
- Intelligent Caching: LRU cache with optional TTL reduces API costs and speeds up repeated translations
- Translation Modes: Choose between override (replace entire files) or incremental (merge with existing translations)
- Smart Strategies: Automatic selection between deep translation (< 50 strings) and batch translation (≥ 50 strings)
- Flexible Exclusions: Global exclusions for all languages + language-specific exclusions for fine-grained control
- Translation Context: Provide domain-specific context for medical, legal, financial, or technical terminology
- Similarity Analysis: Built-in Levenshtein distance analyzer to identify similar translations
- Orphan Key Analyzer: Find unused translation keys in your codebase with comprehensive reports (text, JSON, CSV)
New in v1.1.0
- Provider-Specific Options: Fine-tune AI behavior with `model`, `temperature`, and `max_tokens`
- Automatic Backups: Configurable backup rotation before overwriting files (`.bak`, `.bak.1`, `.bak.2`)
- JSON Support: Full support for JSON locale files (React, Vue, modern JS frameworks)
- Parallel Translation: Translate multiple languages concurrently with thread-based execution
- Multiple Files: Translate multiple files with arrays or glob patterns (`**/*.en.yml`)
Development & Quality
- Comprehensive Testing: Unit tests + integration tests with VCR cassettes (18 cassettes, 260KB)
- Rails Dummy App: Interactive demo with real translations (`ruby spec/dummy/demo_translation.rb`)
- VCR Integration: Record real API responses, test without API keys, CI/CD friendly
- Type-Safe Configuration: Comprehensive validation with detailed error messages
- YARD Documentation: Complete API documentation with examples
- Retry Logic: Exponential backoff for failed API calls (3 attempts, configurable)
- Rate Limiting: Thread-safe rate limiter prevents API overload
Quick Start
Try It Now (Interactive Demo)
Clone the repo and run the demo to see BetterTranslate in action:
git clone https://github.com/alessiobussolari/better_translate.git
cd better_translate
bundle install
# Set your OpenAI API key
export OPENAI_API_KEY=your_key_here
# Run the demo!
ruby spec/dummy/demo_translation.rb
What happens:
- Reads `en.yml` with 16 translation keys
- Translates to Italian and French using ChatGPT
- Generates `it.yml` and `fr.yml` files
- Shows progress, results, and sample translations
- Takes ~2 minutes (real API calls)
Sample Output:
# en.yml (input)
en:
hello: "Hello"
users:
greeting: "Hello %{name}"
# it.yml (generated)
it:
hello: "Ciao"
users:
greeting: "Ciao %{name}" # Variable preserved!
# fr.yml (generated)
fr:
hello: "Bonjour"
users:
greeting: "Bonjour %{name}" # Variable preserved!
See spec/dummy/USAGE_GUIDE.md for more examples.
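The placeholder preservation shown in the sample output can also be checked mechanically after a run. The following is a minimal sketch, not part of the BetterTranslate API: `placeholders` and `placeholders_preserved?` are hypothetical helper names, and the check assumes Rails-style `%{name}` interpolation only.

```ruby
# Hypothetical helpers (not part of BetterTranslate): verify that a
# translation kept the same %{...} placeholders as its source string.
PLACEHOLDER = /%\{\w+\}/

# Returns the sorted list of %{...} placeholders found in a string.
def placeholders(text)
  text.scan(PLACEHOLDER).sort
end

# True when the translation carries exactly the same placeholders as the source.
def placeholders_preserved?(source, translation)
  placeholders(source) == placeholders(translation)
end

puts placeholders_preserved?("Hello %{name}", "Ciao %{name}") # prints true
puts placeholders_preserved?("Hello %{name}", "Ciao %{nome}") # prints false
```

A check like this could run in CI over the generated locale files to catch a renamed or dropped variable before deployment.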
Rails Integration
# config/initializers/better_translate.rb
BetterTranslate.configure do |config|
config.provider = :chatgpt
config.openai_key = ENV["OPENAI_API_KEY"]
config.source_language = "en"
config.target_languages = [
{ short_name: "it", name: "Italian" },
{ short_name: "fr", name: "French" },
{ short_name: "es", name: "Spanish" }
]
config.input_file = "config/locales/en.yml"
config.output_folder = "config/locales"
# Optional: Provide context for better translations
config.translation_context = "E-commerce application with product catalog"
end
# Translate all files
BetterTranslate.translate_all
Installation
Add this line to your application's Gemfile:
gem "better_translate"
And then execute:
bundle install
Or install it yourself as:
gem install better_translate
Rails Integration
For Rails applications, generate the initializer:
rails generate better_translate:install
This creates config/initializers/better_translate.rb with example configuration for all supported providers.
Configuration
Provider Setup
ChatGPT (OpenAI)
BetterTranslate.configure do |config|
config.provider = :chatgpt
config.openai_key = ENV["OPENAI_API_KEY"]
# Optional: customize model settings (defaults shown)
config.request_timeout = 30 # seconds
config.max_retries = 3
config.retry_delay = 2.0 # seconds
# v1.1.0: Provider-specific options
config.model = "gpt-5-nano" # Specify model (optional)
config.temperature = 0.3 # Creativity (0.0-2.0, default: 0.3)
config.max_tokens = 2000 # Response length limit
end
Get your API key from OpenAI Platform.
Google Gemini
BetterTranslate.configure do |config|
config.provider = :gemini
config.google_gemini_key = ENV["GOOGLE_GEMINI_API_KEY"]
# Same optional settings as ChatGPT
config.request_timeout = 30
config.max_retries = 3
end
Get your API key from Google AI Studio.
Anthropic Claude
BetterTranslate.configure do |config|
config.provider = :anthropic
config.anthropic_key = ENV["ANTHROPIC_API_KEY"]
# Same optional settings
config.request_timeout = 30
config.max_retries = 3
end
Get your API key from Anthropic Console.
New Features (v1.1.0)
Automatic Backups
Protect your translation files with automatic backup creation:
config.create_backup = true # Enable backups (default: true)
config.max_backups = 5 # Keep up to 5 backup versions
Backup files are created with rotation:
- First backup: `it.yml.bak`
- Second backup: `it.yml.bak.1`
- Third backup: `it.yml.bak.2`
- Older backups are automatically deleted
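To make the rotation scheme concrete, here is a minimal sketch of how such a rotation could work. It is an illustration, not the gem's implementation: `backup_name` and `rotate_backup` are hypothetical names, and the sketch assumes that `.bak` always holds the newest backup while higher suffixes hold older ones.

```ruby
require "fileutils"

# Hypothetical helpers illustrating backup rotation; not the gem's code.
# Assumption: file.bak is the newest backup, file.bak.1 the next older, etc.
def backup_name(path, index)
  index.zero? ? "#{path}.bak" : "#{path}.bak.#{index}"
end

def rotate_backup(path, max_backups: 5)
  return if max_backups < 1 || !File.exist?(path)

  # Drop the oldest slot so the shift below never exceeds the limit.
  FileUtils.rm_f(backup_name(path, max_backups - 1))

  # Shift the remaining backups one slot older, starting from the oldest.
  (max_backups - 2).downto(0) do |i|
    current = backup_name(path, i)
    FileUtils.mv(current, backup_name(path, i + 1)) if File.exist?(current)
  end

  # Save the current file as the newest backup before it gets overwritten.
  FileUtils.cp(path, backup_name(path, 0))
end
```

Calling `rotate_backup("config/locales/it.yml", max_backups: 5)` before each write would keep at most five earlier versions on disk.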
JSON File Support
Translate JSON locale files for modern JavaScript frameworks:
# Automatically detects JSON format from file extension
config.input_file = "config/locales/en.json"
config.output_folder = "config/locales"
# All features work with JSON: backups, incremental mode, exclusions, etc.
Example JSON file:
{
"en": {
"common": {
"greeting": "Hello %{name}"
}
}
}
Parallel Translation
Translate multiple languages concurrently for faster processing:
config.target_languages = [
{ short_name: "it", name: "Italian" },
{ short_name: "fr", name: "French" },
{ short_name: "es", name: "Spanish" },
{ short_name: "de", name: "German" }
]
config.max_concurrent_requests = 4 # Translate 4 languages at once
Performance improvement: With 4 languages and max_concurrent_requests = 4, translation time can drop by up to ~75% compared to sequential processing, because all four languages are translated at the same time.
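The idea behind thread-based parallel translation can be sketched with a small worker pool. This is an assumption about the general technique, not the gem's internals: `translate_in_parallel` is a hypothetical name, and the block passed to it stands in for a real provider call.

```ruby
# Hypothetical sketch of a bounded worker pool; not BetterTranslate's code.
# The block stands in for a provider call that translates one language.
def translate_in_parallel(languages, max_concurrent: 4, &translator)
  queue = Queue.new
  languages.each { |lang| queue << lang }
  results = {}
  mutex = Mutex.new

  worker_count = [max_concurrent, languages.size].min
  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        lang = begin
          queue.pop(true) # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break
        end
        translated = translator.call(lang)
        mutex.synchronize { results[lang] = translated }
      end
    end
  end

  workers.each(&:join)
  results
end

results = translate_in_parallel(%w[it fr es de], max_concurrent: 2) do |lang|
  "translated for #{lang}"
end
puts results.keys.sort.inspect
```

Because the work is dominated by network waits, threads help even under Ruby's global interpreter lock; the actual speed-up still depends on provider rate limits.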
Multiple Files Support
Translate multiple files in a single run:
# Array of specific files
config.input_files = [
"config/locales/common.en.yml",
"config/locales/errors.en.yml",
"config/locales/admin.en.yml"
]
# Or use glob patterns (recommended)
config.input_files = "config/locales/**/*.en.yml"
# Or combine both approaches
config.input_files = [
"config/locales/**/*.en.yml",
"app/javascript/translations/*.en.json"
]
Output files preserve the original structure:
- `common.en.yml` → `common.it.yml`
- `errors.en.yml` → `errors.it.yml`
- `admin/settings.en.yml` → `admin/settings.it.yml`
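The naming rule above can be expressed as a small path transformation. The following sketch is illustrative only: `target_path` is a hypothetical helper, and it assumes the language code is the last dot-separated segment before the file extension.

```ruby
# Hypothetical helper (not the gem's API): derive the output file name by
# swapping the language segment that precedes the extension.
def target_path(source_path, source_lang, target_lang)
  dir  = File.dirname(source_path)
  ext  = File.extname(source_path)         # ".yml" or ".json"
  base = File.basename(source_path, ext)   # e.g. "common.en" or "en"

  # Replace a trailing "en" segment, whether it is the whole name or a suffix.
  new_base = base.sub(/(\A|\.)#{Regexp.escape(source_lang)}\z/) { "#{$1}#{target_lang}" }

  file = "#{new_base}#{ext}"
  dir == "." ? file : File.join(dir, file)
end

puts target_path("admin/settings.en.yml", "en", "it") # prints admin/settings.it.yml
```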
Language Configuration
config.source_language = "en" # ISO 639-1 code (2 letters)
config.target_languages = [
{ short_name: "it", name: "Italian" },
{ short_name: "fr", name: "French" },
{ short_name: "de", name: "German" },
{ short_name: "es", name: "Spanish" },
{ short_name: "pt", name: "Portuguese" },
{ short_name: "ja", name: "Japanese" },
{ short_name: "zh", name: "Chinese" }
]
File Paths
config.input_file = "config/locales/en.yml" # Source file
config.output_folder = "config/locales" # Output directory
Features in Detail
Translation Modes
Override Mode (Default)
Replaces the entire target file with fresh translations:
config.translation_mode = :override # default
Use when: Starting fresh or regenerating all translations.
Incremental Mode
Merges with existing translations, only translating missing keys:
config.translation_mode = :incremental
Use when: Preserving manual corrections or adding new keys to existing translations.
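Conceptually, incremental mode needs two steps: find the source entries that the target file lacks, then merge the new translations without touching existing values. The sketch below illustrates that idea under stated assumptions; `missing_entries` and `deep_merge_keep_existing` are hypothetical names and may not match how the gem implements it.

```ruby
# Hypothetical sketch of incremental merging; the gem's internals may differ.

# Returns the entries of `source` that have no counterpart in `existing`.
def missing_entries(source, existing)
  source.each_with_object({}) do |(key, value), missing|
    if value.is_a?(Hash)
      nested_existing = existing[key].is_a?(Hash) ? existing[key] : {}
      nested = missing_entries(value, nested_existing)
      missing[key] = nested unless nested.empty?
    elsif !existing.key?(key)
      missing[key] = value
    end
  end
end

# Recursively merges additions while keeping every existing value as-is.
def deep_merge_keep_existing(existing, additions)
  existing.merge(additions) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge_keep_existing(old, new) : old
  end
end

source   = { "hello" => "Hello", "users" => { "greeting" => "Hello %{name}", "bye" => "Goodbye" } }
existing = { "hello" => "Ciao!", "users" => { "greeting" => "Ciao %{name}" } }

to_translate = missing_entries(source, existing)
puts to_translate.inspect # only users.bye needs translating
```

Only `to_translate` would be sent to the provider, so manual corrections such as "Ciao!" survive the run.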
Caching System
The LRU (Least Recently Used) cache stores translations to reduce API costs:
config.cache_enabled = true # default: true
config.cache_size = 1000 # default: 1000 items
config.cache_ttl = 3600 # optional: 1 hour in seconds (nil = no expiration)
Cache key format: "#{text}:#{target_lang_code}"
Benefits:
- Reduces API costs for repeated translations
- Speeds up re-runs during development
- Thread-safe with Mutex protection
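For readers unfamiliar with LRU caching, the behavior described above can be sketched in a few lines. This is a simplified illustration, not BetterTranslate's cache class: `SketchCache` is a hypothetical name, and the only detail taken from the text is the `"#{text}:#{target_lang_code}"` key format, the optional TTL, and the Mutex.

```ruby
# Minimal LRU cache with optional TTL; a sketch, not the gem's Cache class.
class SketchCache
  def initialize(capacity: 1000, ttl: nil)
    @capacity = capacity
    @ttl = ttl
    @store = {} # Ruby hashes keep insertion order: first entry = least recent
    @mutex = Mutex.new
  end

  def get(text, lang)
    @mutex.synchronize do
      key = cache_key(text, lang)
      entry = @store.delete(key)
      return nil unless entry
      return nil if expired?(entry) # expired entries stay deleted

      @store[key] = entry # re-insert to mark as most recently used
      entry[:value]
    end
  end

  def set(text, lang, value)
    @mutex.synchronize do
      key = cache_key(text, lang)
      @store.delete(key)
      @store[key] = { value: value, stored_at: Time.now }
      @store.shift if @store.size > @capacity # evict the least recently used
      value
    end
  end

  private

  def cache_key(text, lang)
    "#{text}:#{lang}"
  end

  def expired?(entry)
    @ttl && (Time.now - entry[:stored_at]) > @ttl
  end
end
```

With `capacity: 2`, storing a third translation evicts whichever of the first two was read or written least recently.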
Rate Limiting
Prevent API overload with built-in rate limiting:
config.max_concurrent_requests = 3 # default: 3
The rate limiter enforces a 0.5-second delay between requests by default. This is handled automatically by the BaseHttpProvider.
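A limiter that enforces a minimum delay between requests can be sketched as follows. This is an illustration of the technique, not the gem's RateLimiter: the class name `SketchRateLimiter` and its `wait` method are assumptions, and only the 0.5-second default comes from the text above.

```ruby
# Sketch of a thread-safe limiter enforcing a minimum delay between requests.
# Hypothetical class; BetterTranslate's own RateLimiter may differ.
class SketchRateLimiter
  def initialize(delay: 0.5)
    @delay = delay
    @mutex = Mutex.new
    @last_request_at = nil
  end

  # Blocks until at least `delay` seconds have passed since the previous call.
  def wait
    @mutex.synchronize do
      if @last_request_at
        remaining = @delay - (monotonic_now - @last_request_at)
        sleep(remaining) if remaining.positive?
      end
      @last_request_at = monotonic_now
    end
  end

  private

  # A monotonic clock is not affected by system time adjustments.
  def monotonic_now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

A provider would call `wait` immediately before each HTTP request; holding the mutex while sleeping serializes concurrent callers so the spacing holds across threads.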
Exclusion System
Global Exclusions
Keys excluded from translation in all target languages (useful for brand names, product codes, etc.):
config.global_exclusions = [
"app.name", # "MyApp" should never be translated
"app.company", # "ACME Inc." stays the same
"product.sku" # "SKU-12345" is language-agnostic
]
Language-Specific Exclusions
Keys excluded only for specific languages (useful for manually translated legal text, locale-specific content, etc.):
config.exclusions_per_language = {
"it" => ["legal.terms", "legal.privacy"], # Italian legal text manually reviewed
"de" => ["legal.terms", "legal.privacy"], # German legal text manually reviewed
"fr" => ["marketing.slogan"] # French slogan crafted by marketing team
}
Example:
- `legal.terms` is translated for Spanish, Portuguese, etc.
- But excluded for Italian and German (already manually translated)
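The way the two exclusion lists combine can be sketched with a small filter over flattened keys. This is an illustration only: `keys_to_translate` is a hypothetical helper, and it assumes exclusions match full dotted keys exactly.

```ruby
# Hypothetical helper (not the gem's API): remove excluded keys for one
# target language from a flattened "dotted.key" => "text" hash.
def keys_to_translate(flat_strings, lang, global_exclusions:, exclusions_per_language:)
  excluded = global_exclusions + exclusions_per_language.fetch(lang, [])
  flat_strings.reject { |key, _text| excluded.include?(key) }
end

strings = {
  "app.name"    => "MyApp",
  "legal.terms" => "Terms of service",
  "home.title"  => "Welcome"
}
global       = ["app.name"]
per_language = { "it" => ["legal.terms"] }

puts keys_to_translate(strings, "it", global_exclusions: global, exclusions_per_language: per_language).keys.inspect
puts keys_to_translate(strings, "es", global_exclusions: global, exclusions_per_language: per_language).keys.inspect
```

For Italian only `home.title` remains, while Spanish also keeps `legal.terms`; `app.name` is skipped for both.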
Translation Context
Provide domain-specific context to improve translation accuracy:
config.translation_context = "Medical terminology for healthcare applications"
This context is included in the AI system prompt, helping with specialized terminology in fields like:
- Medical/Healthcare: "patient", "diagnosis", "treatment"
- Legal: "plaintiff", "defendant", "liability"
- Financial: "dividend", "amortization", "escrow"
- E-commerce: "checkout", "cart", "inventory"
- Technical: "API", "endpoint", "authentication"
Translation Strategies
BetterTranslate automatically selects the optimal strategy based on content size:
Deep Translation (< 50 strings)
- Translates each string individually
- Detailed progress tracking
- Best for small to medium files
Batch Translation (≥ 50 strings)
- Processes in batches of 10 strings
- Faster for large files
- Reduced API overhead
You don't need to configure this - it's automatic!
Rails Integration
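The selection rule can be summarized in a few lines. This sketch only mirrors the thresholds stated above (50 strings, batches of 10); `plan_translation` and the constant names are hypothetical and do not reflect the gem's class structure.

```ruby
# Hypothetical sketch of strategy selection based on the documented thresholds.
DEEP_THRESHOLD = 50 # fewer strings than this are translated one by one
BATCH_SIZE = 10     # strings per request in batch mode

def plan_translation(strings)
  if strings.size < DEEP_THRESHOLD
    { strategy: :deep, requests: strings.map { |string| [string] } }
  else
    { strategy: :batch, requests: strings.each_slice(BATCH_SIZE).to_a }
  end
end

plan = plan_translation(Array.new(55) { |i| "string #{i}" })
puts "#{plan[:strategy]}: #{plan[:requests].size} requests" # prints batch: 6 requests
```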
BetterTranslate provides three Rails generators:
1. Install Generator
Generate the initializer with example configuration:
rails generate better_translate:install
Creates: config/initializers/better_translate.rb
2. Translate Generator
Run the translation process:
rails generate better_translate:translate
This triggers the translation based on your configuration and displays progress messages.
3. Analyze Generator
Analyze translation similarities using Levenshtein distance:
rails generate better_translate:analyze
Output:
- Console summary with similar translation pairs
- Detailed JSON report: `tmp/translation_similarity_report.json`
- Human-readable summary: `tmp/translation_similarity_summary.txt`
Use cases:
- Identify potential translation inconsistencies
- Find duplicate or near-duplicate translations
- Quality assurance for translation output
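For reference, Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The sketch below shows the standard dynamic-programming computation; the `similarity` normalization is an assumption for illustration and may not be the score the analyzer reports.

```ruby
# Standard Levenshtein distance using two rows of the DP table.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution
      ].min
    end
    previous = current
  end
  previous.last
end

# Assumed normalization: 1.0 means identical, 0.0 means entirely different.
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?

  1.0 - levenshtein(a, b).to_f / longest
end

puts levenshtein("kitten", "sitting") # prints 3
```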
Advanced Usage
Programmatic Translation
Translate Multiple Texts to Multiple Languages
texts = ["Hello", "Goodbye", "Thank you"]
target_langs = [
{ short_name: "it", name: "Italian" },
{ short_name: "fr", name: "French" }
]
results = BetterTranslate::TranslationHelper.translate_texts_to_languages(texts, target_langs)
# Results structure:
# {
# "it" => ["Ciao", "Arrivederci", "Grazie"],
# "fr" => ["Bonjour", "Au revoir", "Merci"]
# }
Translate Single Text to Multiple Languages
text = "Welcome to our application"
target_langs = [
{ short_name: "it", name: "Italian" },
{ short_name: "es", name: "Spanish" }
]
results = BetterTranslate::TranslationHelper.translate_text_to_languages(text, target_langs)
# Results:
# {
# "it" => "Benvenuto nella nostra applicazione",
#   "es" => "Bienvenido a nuestra aplicación"
# }
Custom Configuration for Specific Tasks
# Separate configuration for different domains
medical_config = BetterTranslate::Configuration.new
medical_config.provider = :chatgpt
medical_config.openai_key = ENV["OPENAI_API_KEY"]
medical_config.translation_context = "Medical terminology for patient records"
medical_config.validate!
# Use the custom config...
Dry Run Mode
Test your configuration without writing files:
config.dry_run = true
This validates everything and simulates the translation process without creating output files.
Verbose Logging
Enable detailed logging for debugging:
config.verbose = true
Orphan Key Analyzer
The Orphan Key Analyzer helps you find unused translation keys in your codebase. It scans your YAML locale files and compares them against your actual code usage, generating comprehensive reports.
CLI Usage
Find orphan keys from the command line:
# Basic text report (default)
better_translate analyze \
--source config/locales/en.yml \
--scan-path app/
# JSON format (great for CI/CD)
better_translate analyze \
--source config/locales/en.yml \
--scan-path app/ \
--format json
# CSV format (easy to share with team)
better_translate analyze \
--source config/locales/en.yml \
--scan-path app/ \
--format csv
# Save to file
better_translate analyze \
--source config/locales/en.yml \
--scan-path app/ \
--output orphan_report.txt
Sample Output
Text format:
============================================================
Orphan Keys Analysis Report
============================================================
Statistics:
Total keys: 50
Used keys: 45
Orphan keys: 5
Usage: 90.0%
Orphan Keys (5):
------------------------------------------------------------
Key: users.
Value: This feature was removed
Key: products.deprecated_label
Value: Old Label
...
============================================================
JSON format:
{
"orphans": ["users.old_message", "products.deprecated_label"],
"orphan_details": {
"users.old_message": "This feature was removed",
"products.deprecated_label": "Old Label"
},
"orphan_count": 5,
"total_keys": 50,
"used_keys": 45,
"usage_percentage": 90.0
}
Programmatic Usage
Use the analyzer in your Ruby code:
# Scan YAML file
key_scanner = BetterTranslate::Analyzer::KeyScanner.new("config/locales/en.yml")
all_keys = key_scanner.scan # Returns Hash of all keys
# Scan code for used keys
code_scanner = BetterTranslate::Analyzer::CodeScanner.new("app/")
used_keys = code_scanner.scan # Returns Set of used keys
# Detect orphans
detector = BetterTranslate::Analyzer::OrphanDetector.new(all_keys, used_keys)
orphans = detector.detect
# Get statistics
puts "Orphan count: #{detector.orphan_count}"
puts "Usage: #{detector.usage_percentage}%"
# Generate report
reporter = BetterTranslate::Analyzer::Reporter.new(
orphans: orphans,
orphan_details: detector.orphan_details,
total_keys: all_keys.size,
used_keys: used_keys.size,
usage_percentage: detector.usage_percentage,
format: :text
)
puts reporter.generate
reporter.save_to_file("orphan_report.txt")
Supported Translation Patterns
The analyzer recognizes these i18n patterns:
- `t('key')` - Rails short form
- `t("key")` - Rails short form with double quotes
- `I18n.t(:key)` - Symbol syntax
- `I18n.t('key')` - String syntax
- `I18n.translate('key')` - Full method name
- `<%= t('key') %>` - ERB templates
- `I18n.t('key', param: value)` - With parameters
Nested keys:
en:
users:
profile:
title: "Profile" # Detected as: users.profile.title
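To show how pattern matching and nested-key flattening fit together, here is a simplified sketch of orphan detection. It is not the gem's scanner: the regular expression, `used_keys`, and `flatten_keys` are assumptions for illustration, and the pattern only handles literal string or symbol keys.

```ruby
require "set"

# Assumed pattern: matches t('key'), t("key"), I18n.t(:key), I18n.translate('key'),
# with or without extra arguments. The lookbehind skips calls like format('x').
I18N_CALL = /(?<![\w.])(?:I18n\.)?(?:t|translate)\(\s*(?:['"]([\w.]+)['"]|:(\w+))/

# Collects every literal translation key referenced in a piece of source code.
def used_keys(source)
  source.scan(I18N_CALL).map { |quoted, symbol| quoted || symbol }.to_set
end

# Turns nested locale data into dotted keys such as "users.profile.title".
def flatten_keys(hash, prefix = nil)
  hash.flat_map do |key, value|
    path = [prefix, key].compact.join(".")
    value.is_a?(Hash) ? flatten_keys(value, path) : [path]
  end
end

locale = { "users" => { "profile" => { "title" => "Profile" } }, "old_banner" => "Sale!" }
code   = %q(<%= t('users.profile.title') %> <%= I18n.t(:hello, name: user.name) %>)

orphans = flatten_keys(locale) - used_keys(code).to_a
puts orphans.inspect # prints ["old_banner"]
```

Dynamically built keys such as `t("errors.#{code}")` are invisible to a static pattern like this, so reported orphans are best reviewed before deletion.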
Use cases:
- Clean up unused translations before deployment
- Identify dead code after refactoring
- Reduce locale file size
- Improve translation maintenance
- Generate reports for translation teams
Development & Testing
BetterTranslate includes comprehensive testing infrastructure with unit tests, integration tests, and a Rails dummy app for realistic testing.
Test Structure
spec/
Running Tests
# Run all tests (unit + integration)
bundle exec rake spec
# or
bundle exec rspec
# Run only unit tests (fast, no API calls)
bundle exec rspec spec/better_translate/
# Run only integration tests (uses VCR cassettes)
bundle exec rspec spec/integration/
# Run specific test file
bundle exec rspec spec/better_translate/configuration_spec.rb
# Run tests with documentation-style output
bundle exec rspec --format documentation
VCR Cassettes & API Testing
BetterTranslate uses VCR (Video Cassette Recorder) to record real API interactions for integration tests. This allows:
- Realistic testing with actual provider responses
- No API keys needed after initial recording
- Fast test execution (no real API calls)
- CI/CD friendly (cassettes committed to repo)
- API keys anonymized (safe to commit)
Setup API Keys for Recording
# Copy environment template
cp .env.example .env
# Edit .env and add your API keys
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...
ANTHROPIC_API_KEY=sk-ant-...
Re-record Cassettes
# Delete and re-record all cassettes
rm -rf spec/vcr_cassettes/
bundle exec rspec spec/integration/
# Re-record specific provider
rm -rf spec/vcr_cassettes/chatgpt/
bundle exec rspec spec/integration/chatgpt_integration_spec.rb
Note: The .env file is gitignored. API keys in cassettes are automatically replaced with <OPENAI_API_KEY>, <GEMINI_API_KEY>, etc.
Rails Dummy App Demo
Test BetterTranslate with a realistic Rails app:
# Run interactive demo
ruby spec/dummy/demo_translation.rb
Output:
Generated files:
- `spec/dummy/config/locales/it.yml` - Italian translation
- `spec/dummy/config/locales/fr.yml` - French translation
See spec/dummy/USAGE_GUIDE.md for more examples.
Code Quality
# Run RuboCop linter
bundle exec rubocop
# Auto-fix violations
bundle exec rubocop -a
# Run both tests and linter
bundle exec rake
Documentation
# Generate YARD documentation
bundle exec yard doc
# Start documentation server (http://localhost:8808)
bundle exec yard server
# Check documentation coverage
bundle exec yard stats
Interactive Console
# Load the gem in an interactive console
bin/console
Security Audit
# Check for security vulnerabilities
bundle exec bundler-audit check --update
Architecture
Provider Architecture
All providers inherit from BaseHttpProvider:
BaseHttpProvider (abstract)
BaseHttpProvider responsibilities:
- HTTP communication via Faraday
- Retry logic with exponential backoff
- Rate limiting
- Timeout handling
- Error wrapping
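Retry with exponential backoff means the wait doubles after each failed attempt. The following sketch illustrates the pattern under stated assumptions: `with_retries` is a hypothetical helper, `max_retries` is treated as the total number of attempts, and the injectable `sleeper` exists only so the example can run without real delays.

```ruby
# Hypothetical sketch of retry with exponential backoff; not the gem's code.
# Assumption: max_retries counts total attempts (3 attempts by default).
def with_retries(max_retries: 3, retry_delay: 2.0, sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_retries

    # Delay doubles each time: retry_delay, 2 * retry_delay, 4 * retry_delay, ...
    sleeper.call(retry_delay * (2**(attempt - 1)))
    retry
  end
end

delays = []
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise "temporary failure" if attempt < 3

  "ok after #{attempt} attempts"
end
puts result         # prints ok after 3 attempts
puts delays.inspect # prints [2.0, 4.0]
```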
Core Components
- Configuration: Type-safe config with validation
- Cache: LRU cache with optional TTL
- RateLimiter: Thread-safe request throttling
- Validator: Input validation (language codes, text, paths, keys)
- HashFlattener: Converts nested YAML to a flat structure
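As an illustration of what flattening means here, the sketch below converts a nested locale hash into dotted keys and back. It is not the gem's HashFlattener: `flatten_hash` and `unflatten_hash` are hypothetical names, and the reverse direction is included on the assumption that translated strings must be written back in nested form.

```ruby
# Hypothetical sketch of hash flattening; the gem's HashFlattener may differ.

# { "users" => { "greeting" => "Hi" } } becomes { "users.greeting" => "Hi" }
def flatten_hash(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      flat.merge!(flatten_hash(value, path))
    else
      flat[path] = value
    end
  end
end

# Rebuilds the nested structure from dotted keys.
def unflatten_hash(flat)
  flat.each_with_object({}) do |(path, value), nested|
    *parents, leaf = path.split(".")
    node = parents.reduce(nested) { |hash, key| hash[key] ||= {} }
    node[leaf] = value
  end
end

nested = { "en" => { "hello" => "Hello", "users" => { "greeting" => "Hello %{name}" } } }
flat = flatten_hash(nested)
puts flat.keys.inspect # prints ["en.hello", "en.users.greeting"]
```

Working on the flat form makes it easy to apply exclusions by dotted key and to send strings to a provider as a simple list.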
Error Hierarchy
All errors inherit from BetterTranslate::Error:
BetterTranslate::Error
Documentation
- USAGE_GUIDE.md - Complete guide to dummy app and demos
- VCR Testing Guide - How to test with VCR cassettes
- CLAUDE.md - Developer guide for AI assistants (Claude Code)
- YARD Docs - Complete API documentation
Key Documentation Files
better_translate/
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/alessiobussolari/better_translate.
Development Guidelines
- TDD (Test-Driven Development): Always write tests before implementing features
- YARD Documentation: Document all public methods with `@param`, `@return`, `@raise`, and `@example`
- RuboCop Compliance: Ensure code passes `bundle exec rubocop` before committing
- Frozen String Literals: Include `# frozen_string_literal: true` at the top of all files
- HTTP Client: Use Faraday for all HTTP requests (never Net::HTTP or HTTParty)
- VCR Cassettes: Record integration tests with real API responses for CI/CD
Development Workflow
# 1. Clone and setup
git clone https://github.com/alessiobussolari/better_translate.git
cd better_translate
bundle install
# 2. Create a feature branch
git checkout -b my-feature
# 3. Write tests first (TDD)
# Edit spec/better_translate/my_feature_spec.rb
# 4. Implement the feature
# Edit lib/better_translate/my_feature.rb
# 5. Ensure tests pass and code is clean
bundle exec rspec
bundle exec rubocop
# 6. Commit and push
git add .
git commit -m "Add my feature"
git push origin my-feature
# 7. Create a Pull Request
Release Workflow
Releases are automated via GitHub Actions:
# 1. Update version
vim lib/better_translate/version.rb # VERSION = "1.0.1"
# 2. Update CHANGELOG
vim CHANGELOG.md
# 3. Commit and tag
git add -A
git commit -m "chore: Release v1.0.1"
git tag v1.0.1
git push origin main
git push origin v1.0.1
# 4. GitHub Actions automatically:
#    - Runs tests
#    - Builds gem
#    - Publishes to RubyGems.org
#    - Creates GitHub Release
Setup: See .github/RUBYGEMS_SETUP.md for configuring RubyGems trusted publishing (no API keys needed!).
License
The gem is available as open source under the terms of the MIT License.
Code of Conduct
Everyone interacting in the BetterTranslate project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.