Zenrows
Ruby client for ZenRows web scraping proxy. Multi-backend HTTP client with http.rb as primary adapter.
Installation
Add to your Gemfile:
gem 'zenrows'
Then run:
bundle install
Configuration
Zenrows.configure do |config|
config.api_key = 'YOUR_API_KEY'
config.host = 'superproxy.zenrows.com' # default
config.port = 1337 # default
config.connect_timeout = 5 # seconds
config.read_timeout = 180 # seconds
end
Usage
Basic Request
client = Zenrows::Client.new
http = client.http(js_render: true, premium_proxy: true)
response = http.get('https://example.com')
puts response.body
puts response.status
Note: SSL verification is disabled automatically for proxy connections (required by ZenRows).
With Options
http = client.http(
js_render: true, # Enable headless browser
premium_proxy: true, # Use residential IPs
proxy_country: 'us', # Geolocation
wait: 5000, # Wait 5 seconds after load
wait_for: '.content', # Wait for CSS selector
session_id: true # Sticky session
)
JavaScript Instructions
Automate browser interactions:
instructions = Zenrows::JsInstructions.build do
wait_for '.login-form'
fill '#email', '[email protected]'
fill '#password', 'secret123'
click '#submit'
wait 2000
scroll_to :bottom
wait_for '.results'
end
http = client.http(js_render: true, js_instructions: instructions)
response = http.get(url)
Available instructions:
click(selector)- Click elementwait(ms)- Wait durationwait_for(selector)- Wait for elementwait_event(event)- networkidle, load, domcontentloadedfill(selector, value)- Fill inputcheck(selector)/uncheck(selector)- Checkboxesselect_option(selector, value)- Dropdownsscroll_y(pixels)/scroll_x(pixels)- Scrollscroll_to(:bottom)/scroll_to(:top)- Scroll to positionevaluate(js_code)- Execute JavaScriptframe_*variants for iframe interactions
Screenshots
http = client.http(
js_render: true,
screenshot: true, # Take screenshot
screenshot_fullpage: true, # Full page
json_response: true # Get JSON with screenshot data
)
Block Resources
Speed up requests by blocking unnecessary resources:
http = client.http(
js_render: true,
block_resources: 'image,media,font'
)
Device & Antibot
http = client.http(
js_render: true,
device: 'mobile', # mobile/desktop emulation
antibot: true # enhanced antibot bypass
)
API Client (v0.2.0+)
For advanced extraction features, use the REST API client:
Autoparse
Extract structured data from known sites (Amazon, etc.):
api = Zenrows::ApiClient.new
response = api.get('https://amazon.com/dp/B01LD5GO7I', autoparse: true)
response.parsed # => { "title" => "...", "price" => "$29.99", ... }
CSS Extraction
Extract data using CSS selectors:
# Hash syntax
response = api.get(url, css_extractor: {
title: 'h1',
links: 'a @href',
prices: '.price'
})
response.extracted # => { "title" => "...", "links" => [...], "prices" => [...] }
# DSL syntax
extractor = Zenrows::CssExtractor.build do
extract :title, 'h1'
links :urls, 'a.product'
images :photos, 'img.gallery'
end
response = api.get(url, css_extractor: extractor)
Markdown Output
response = api.get(url, response_type: 'markdown')
response.markdown # => "# Page Title\n\nContent..."
Response Metadata
response = api.get(url)
response.status # => 200
response.success? # => true
response.final_url # => "https://example.com/redirected"
response.request_cost # => 0.001
response.concurrency_remaining # => 199
Options Reference
| Option | Type | Description |
|---|---|---|
js_render |
Boolean | Enable JavaScript rendering |
premium_proxy |
Boolean | Use residential proxies |
proxy_country |
String | Country code (us, gb, de, etc.) |
device |
String | Device emulation (mobile/desktop) |
antibot |
Boolean | Enhanced antibot bypass |
wait |
Integer/Boolean | Wait time in ms (true = 15000) |
wait_for |
String | CSS selector to wait for |
session_id |
Boolean/String | Session persistence |
session_ttl |
String | Session duration (1m, 10m, 30m) |
window_height |
Integer | Browser window height |
window_width |
Integer | Browser window width |
js_instructions |
Array/String | Browser automation |
json_response |
Boolean | Return JSON instead of HTML |
screenshot |
Boolean | Take screenshot |
screenshot_fullpage |
Boolean | Full page screenshot |
screenshot_selector |
String | Screenshot specific element |
block_resources |
String | Block resources (image,media,font) |
headers |
Hash | Custom HTTP headers |
API Client Options
| Option | Type | Description |
|---|---|---|
autoparse |
Boolean | Auto-extract structured data |
css_extractor |
Hash/Object | CSS selectors for extraction |
response_type |
String | Output format ('markdown') |
outputs |
String | Extract specific data (headings,links) |
Error Handling
begin
response = http.get(url, ssl_context: client.ssl_context)
rescue Zenrows::ConfigurationError => e
# Missing or invalid configuration
rescue Zenrows::RateLimitError => e
sleep(e.retry_after || 60)
retry
rescue Zenrows::BotDetectedError => e
# Try with premium proxy
http = client.http(premium_proxy: true, proxy_country: 'us')
retry
rescue Zenrows::WaitTimeError => e
# Wait time exceeded 3 minutes
rescue Zenrows::TimeoutError => e
# Request timed out
end
Rails Integration
The gem automatically integrates with Rails when detected:
- Uses Rails.logger by default
- Supports ActiveSupport::Duration for wait times
# In Rails, you can use duration objects
http = client.http(wait: 5.seconds)
Development
bundle install
bundle exec rake test # Run tests
bundle exec standardrb # Lint code
bundle exec yard doc # Generate docs
License
MIT License. See LICENSE.txt.