Smart Fetch (Auto)
You hand FourA a URL and a validate rule for what the real page should contain. FourA does the rest: it walks a cost-aware ladder, stops at the first rung that returns a response your rules accept, and remembers what worked per host so the next call on the same site is cheap.
This guide explains what auto does under the hood, when to use it, and how to read its response. For the parameter reference, see API Endpoints.
The Idea
Most scraping setups make you pick the engine up front. Single is fastest, Proxy adds rotation, Browser handles JavaScript. You guess wrong, you waste credits or you get blocked.
Auto flips it. You declare success (validate), not method. FourA climbs a ladder until one rung succeeds:
- Cheap probe (single, direct egress)
- Rotated proxy single
- Browser, with JavaScript and a solver if the site challenges
- Browser through proxy for the hardest targets
Auto stops as soon as a rung returns a response your validate rule accepts. Most calls finish on rung 1 or 2.
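The climb described above can be pictured as a loop over rungs ordered from cheapest to most expensive. The sketch below only illustrates that control flow and is not FourA's implementation: the Rung type, the stand-in fetchers, and the accepts check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rung:
    name: str                    # hypothetical label for this rung
    fetch: Callable[[str], str]  # stand-in for the real engine call


def climb(url: str, rungs: list[Rung], accepts: Callable[[str], bool]) -> tuple[Optional[str], int]:
    """Try rungs in order; return (winning rung name or None, attempts made)."""
    attempts = 0
    for rung in rungs:
        attempts += 1
        body = rung.fetch(url)
        if accepts(body):
            return rung.name, attempts  # stop at the first accepted response
    return None, attempts


# Stand-in fetchers: the first returns a challenge page, the others the real page.
ladder = [
    Rung("probe", lambda url: "<html>Checking your browser...</html>"),
    Rung("proxy", lambda url: "<html><button>Add to cart</button></html>"),
    Rung("browser", lambda url: "<html><button>Add to cart</button></html>"),
]

winner, attempts = climb(
    "https://example.com/product/42",
    ladder,
    accepts=lambda body: "Add to cart" in body,
)
print(winner, attempts)
```

The browser rung is never called here, because the loop returns as soon as the second rung's body passes the check.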
What You Send
The minimum is a URL plus a validate substring. Without validate.data.accept, auto can't tell a real page from a challenge interstitial returned with HTTP 200, and it may return the challenge as success.
curl -X POST https://eu.api.foura.ai/api/auto/ \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/42",
    "validate": {"data": {"accept": ["Add to cart"]}}
  }'
Optional knobs (see the endpoint reference for full details):
- returnSession (default true): return the winning { proxy, cookies, userAgent } so you can replay it.
- forceProxy (default true): skip direct-egress rungs. Set false only if you know the site is friendlier to a clean IP than to free rotating proxies.
- timeout_ms (default 120000): total budget for the whole call. The ladder portions it across rungs.
- ignoreProxies: proxy IDs to avoid on every sub-attempt.
- followRedirects (default 5): max redirects on the cheap rungs.
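One way to keep these knobs manageable is a small helper that sends only the ones you override, so everything else stays at the server default. This is a sketch: build_auto_payload is a hypothetical helper, and it assumes the optional knobs are top-level fields next to url, which the endpoint reference should confirm.

```python
from typing import Any, Optional


def build_auto_payload(
    url: str,
    accept: list[str],
    *,
    return_session: Optional[bool] = None,
    force_proxy: Optional[bool] = None,
    timeout_ms: Optional[int] = None,
    ignore_proxies: Optional[list[str]] = None,
    follow_redirects: Optional[int] = None,
) -> dict[str, Any]:
    """Build a JSON body for /api/auto/, omitting knobs that were not set."""
    payload: dict[str, Any] = {
        "url": url,
        "validate": {"data": {"accept": accept}},
    }
    # Assumption: optional knobs sit at the top level of the request body.
    optional = {
        "returnSession": return_session,
        "forceProxy": force_proxy,
        "timeout_ms": timeout_ms,
        "ignoreProxies": ignore_proxies,
        "followRedirects": follow_redirects,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_auto_payload(
    "https://example.com/product/42",
    ["Add to cart"],
    force_proxy=False,
    timeout_ms=60000,
)
print(payload)
```

Filtering on "is not None" rather than truthiness matters here, since false is a meaningful value for forceProxy and returnSession.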
What You Get Back
{
  "status": 200,
  "data": "<!doctype html>...",
  "headers": [{"content-type": "text/html"}],
  "meta": {
    "rung": "cache",
    "solved": false,
    "attempts": 1,
    "credits": 2
  },
  "session": {
    "proxy": "A1B2C3",
    "cookies": [{"name": "session", "value": "abc", "domain": "example.com"}],
    "userAgent": "Mozilla/5.0..."
  }
}
Three things to read:
- status and data: the same shape the underlying engine returned. For single and proxy rungs, headers is a per-hop array. For browser rungs, headers is a flat object.
- meta: which rung delivered the response, how many attempts it took, whether a bot challenge was solved, and the total credits spent. Always present.
- session: the { proxy, cookies, userAgent } triple that cracked the target. Use it to replay against the same host via /api/single/ or /api/browser/.
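Because headers changes shape between rungs, client code that reads it may want to normalize first. The helpers below are a hypothetical sketch, not part of any FourA SDK. They assume each hop in the per-hop array is a plain dict with lowercase header names, as in the sample response, and that later hops should override earlier ones.

```python
from typing import Any


def flatten_headers(headers: Any) -> dict[str, str]:
    """Return one flat dict, whether headers came as a per-hop list or a flat object."""
    if isinstance(headers, dict):
        return dict(headers)   # browser rungs: already flat
    flat: dict[str, str] = {}
    for hop in headers or []:  # single/proxy rungs: one dict per hop
        flat.update(hop)       # assumption: later hops override earlier ones
    return flat


def summarize(resp: dict[str, Any]) -> str:
    """One-line summary of an auto response for logs."""
    meta = resp["meta"]
    content_type = flatten_headers(resp.get("headers")).get("content-type", "unknown")
    return (
        f"status={resp['status']} rung={meta['rung']} attempts={meta['attempts']} "
        f"solved={meta.get('solved')} credits={meta['credits']} type={content_type}"
    )


resp = {
    "status": 200,
    "data": "<!doctype html>...",
    "headers": [{"content-type": "text/html"}],
    "meta": {"rung": "cache", "solved": False, "attempts": 1, "credits": 2},
}
print(summarize(resp))
```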
Replaying with the Session
After auto returns a session, you can drop straight into Single or Browser for follow-up pages on the same host. No new ladder climb, no new probe.
import requests

API = "https://eu.api.foura.ai"
KEY = "YOUR_API_KEY"
H = {"X-API-Key": KEY, "Content-Type": "application/json"}

# 1) First call: let auto figure it out.
r = requests.post(f"{API}/api/auto/", headers=H, json={
    "url": "https://example.com/product/42",
    "validate": {"data": {"accept": ["Add to cart"]}},
}).json()
session = r["session"]
proxy = session["proxy"]
user_agent = session["userAgent"]

# 2) Follow-up pages: replay through single with the same proxy + UA.
for sku in ("43", "44", "45"):
    r = requests.post(f"{API}/api/single/", headers=H, json={
        "method": "GET",
        "url": f"https://example.com/product/{sku}",
        "proxy": proxy,
        "headers": [["User-Agent", user_agent]],
    }).json()
    print(sku, r["status"])
The session is only as durable as the target makes it. Some sites bind clearance to the cookie jar for hours; others rotate every few minutes. If a replay starts returning challenges again, call /api/auto/ once more to refresh.
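The refresh step can be folded into the replay loop. The following is a sketch under stated assumptions: fetch_with_replay and the injected post callable are hypothetical, and "the replay is returning challenges" is approximated by checking the body for the same marker substring you would pass to validate.data.accept. The request bodies mirror the ones shown above.

```python
from typing import Any, Callable

# (path, json body) -> parsed JSON response; in real use this would wrap requests.post.
Post = Callable[[str, dict[str, Any]], dict[str, Any]]


def fetch_with_replay(post: Post, url: str, marker: str, session: dict[str, Any]):
    """Replay through /api/single/; fall back to /api/auto/ if the marker is missing."""
    replay = post("/api/single/", {
        "method": "GET",
        "url": url,
        "proxy": session["proxy"],
        "headers": [["User-Agent", session["userAgent"]]],
    })
    if replay.get("status") == 200 and marker in (replay.get("data") or ""):
        return replay, session
    # The session looks stale: let auto climb again and adopt the new session.
    fresh = post("/api/auto/", {"url": url, "validate": {"data": {"accept": [marker]}}})
    return fresh, fresh.get("session", session)


def fake_post(path: str, body: dict[str, Any]) -> dict[str, Any]:
    """Stand-in transport: the replay hits a challenge, auto returns the real page."""
    if path == "/api/single/":
        return {"status": 200, "data": "Checking your browser..."}
    return {
        "status": 200,
        "data": "<button>Add to cart</button>",
        "session": {"proxy": "NEW123", "userAgent": "Mozilla/5.0..."},
    }


old = {"proxy": "A1B2C3", "userAgent": "Mozilla/5.0..."}
resp, session = fetch_with_replay(fake_post, "https://example.com/product/43", "Add to cart", old)
print(session["proxy"])
```

Injecting the transport keeps the fallback logic testable without network access. A real post would prepend the API base URL and send the X-API-Key header.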
When to Use Auto
| Use auto | Use single, proxy, or browser by hand |
|---|---|
| You're targeting a new site and don't know what it needs | You already know the engine that works |
| You want one call that handles direct, proxy, and browser fallback for you | You want full control over per-call retries and timeouts |
| You're fine paying a few seconds of probing on the first call | First-call latency matters more than discovery |
| You want a learned session you can replay cheaply | You're optimizing a tight loop on a known-good target |
Auto isn't always the cheapest pick. If you know a target works with single + unblocker, calling Single directly is 2 credits with predictable latency. Auto on the same target costs whatever its ladder spends, which can be more if the site requires escalation.
Validate Tells Auto What "Success" Means
The single most important parameter is validate. Without it, auto can't distinguish a real 200 page from a 200 challenge interstitial dressed up as content.
Use validate.data.accept with a substring only the real page contains:
{
  "validate": {
    "data": {
      "accept": ["sku-42-add-to-cart", "Customer reviews"]
    }
  }
}
For JSON APIs, accept a field name you expect:
{
  "validate": {
    "data": { "accept": ["\"products\":["] },
    "status": { "accept": [200] }
  }
}
For sites that legitimately return non-200 (geo-blocks you want to ignore, intentional 403 on logged-out endpoints), allow them via validate.status.accept:
{
  "validate": {
    "status": { "accept": [200, 451] }
  }
}
Without validate, auto falls back to "HTTP 200 = success" and won't catch a Cloudflare challenge interstitial that the WAF returns with a 200.
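Picking a substring "only the real page contains" is easy to check offline if you have saved one copy of the real page and one copy of a challenge page. The helper below is a hypothetical sketch of that check, and the two sample pages are made up for illustration. It says nothing about how FourA evaluates accept internally.

```python
def usable_markers(candidates: list[str], real_page: str, challenge_page: str) -> list[str]:
    """Keep candidates that appear in the real page but not in the challenge page."""
    return [c for c in candidates if c in real_page and c not in challenge_page]


# Made-up sample pages for illustration.
real_page = "<html><title>Widget</title><button id='sku-42-add-to-cart'>Add to cart</button></html>"
challenge_page = "<html><title>Just a moment...</title><p>Checking your browser</p></html>"

markers = usable_markers(["<html>", "<title>", "sku-42-add-to-cart"], real_page, challenge_page)
print(markers)
```

Generic markup such as "<html>" or "<title>" is filtered out because the challenge page contains it too. Only the product-specific string survives.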
Reading meta to Understand What Happened
meta.rung is the most useful debug signal. Values:
- probe: solved on a cheap direct request. The cheapest path.
- proxy: needed proxy rotation to get through.
- browser: needed a full browser render, possibly with a challenge solve.
- cache: replayed a warm session from a prior auto call. Cheapest path on repeat calls.
- fail: no rung produced a response your rules accepted.
meta.solved: true means the browser detected and solved a bot challenge (Cloudflare clearance, similar gates). meta.attempts is the count of sub-call tries before success.
If a site keeps ending on browser when you expected probe, consider whether a stricter validate rule (or a less strict one) would let a cheaper rung pass.
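To spot hosts that keep landing on an expensive rung, it can help to tally meta.rung per host across your own logged responses. This is a minimal sketch: rungs_by_host is a hypothetical helper, and the (url, response) log format is an assumption about how you store results.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def rungs_by_host(calls):
    """calls: iterable of (url, auto_response) pairs. Returns {host: Counter of meta.rung}."""
    tally = defaultdict(Counter)
    for url, resp in calls:
        host = urlsplit(url).hostname or "unknown"
        tally[host][resp["meta"]["rung"]] += 1
    return dict(tally)


calls = [
    ("https://example.com/product/42", {"meta": {"rung": "browser"}}),
    ("https://example.com/product/43", {"meta": {"rung": "cache"}}),
    ("https://example.com/product/44", {"meta": {"rung": "browser"}}),
    ("https://example.org/item/1", {"meta": {"rung": "probe"}}),
]
tally = rungs_by_host(calls)
print(tally["example.com"].most_common(1))
```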
Errors and Edge Cases
When auto fails, the response carries status (usually the last failed rung's status) and an error string:
{ "status": 0, "error": "all attempts failed", "meta": { "rung": "fail", "attempts": 7, "credits": 47 } }
status: 0 means no rung produced a response at all (every attempt timed out or got rejected). A non-zero status plus error means the last attempt got a response, but auto rejected it (validate or otherwise).
Check meta to understand where the budget went. If attempts is high and rung is fail after the browser rung, the target may need a longer timeout_ms, a stricter validate rule, or simply isn't reachable via free rotating proxies right now.
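The status and error rules above translate into a small classifier for logging or alerting. This is a sketch: classify is a hypothetical helper, and it assumes that successful responses carry no error key, which the text implies but does not state.

```python
def classify(resp: dict) -> str:
    """Map an auto response to a coarse outcome, following the rules described above."""
    if "error" not in resp:   # assumption: success responses have no "error" key
        return "ok"
    if resp.get("status") == 0:
        return "no_response"  # every attempt timed out or got rejected
    return "rejected"         # a response arrived, but auto did not accept it


failed = {
    "status": 0,
    "error": "all attempts failed",
    "meta": {"rung": "fail", "attempts": 7, "credits": 47},
}
print(classify(failed), failed["meta"]["credits"])
```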
What Auto Does Not Do
- It does not bypass legal restrictions. If a site is geo-blocked and rejects every exit FourA can reach, auto returns the block.
- It does not cache content. Every call still hits the target. The "warm session" is the proxy and cookies, not the response.
- It does not write to the Activity Log as a separate row from the sub-calls. The Single / Proxy / Browser sub-calls auto made on your behalf appear in Activity; the outer /api/auto/ call is a coordinator.
Related
- API Endpoints: Full parameter reference
- Choosing the Right Endpoint: When to pick auto vs single, proxy, or browser
- Request Outcomes: Which outcomes are billable
- Anti-Bot Protection: What FourA does about Cloudflare, DataDome, and friends
- MCP Recipes: The same patterns as MCP tool calls