#What Bot-Detect solves
This API verifies whether an IP truly belongs to an official crawler (e.g., Googlebot, Bingbot) instead of someone claiming to be a bot via User-Agent. It’s designed for abuse prevention, SEO safety (don’t block real bots), and traffic classification (bill bots differently, slow them down, or allow them).
#Endpoints and when to use them
#POST /v1/bot/detect — Auto-detect
- Best for: General checks when you don’t know the vendor in advance.
- How it works: If a `ua` is provided, we try to infer a vendor from it; otherwise we test your IP against all supported vendors’ ranges.
- Typical use: Single integration point for all traffic (edge/middleware).
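The UA-based inference step can be pictured as a simple pattern table. A minimal sketch, assuming typical crawler UA strings; the `inferVendor` helper and its patterns are illustrative, not the API’s actual implementation:

```javascript
// Illustrative UA → vendor inference, mirroring what auto-detect does
// server-side before falling back to checking all vendors' IP ranges.
const UA_PATTERNS = {
  google: /Googlebot/i,
  bing: /bingbot/i,
  duck: /DuckDuckBot/i,
  meta: /facebookexternalhit/i,
  yandex: /YandexBot/i,
  openai: /GPTBot/i,
};

function inferVendor(ua) {
  if (!ua) return null; // no UA → caller must check all vendors' ranges
  for (const [vendor, pattern] of Object.entries(UA_PATTERNS)) {
    if (pattern.test(ua)) return vendor;
  }
  return null; // unrecognized UA → same fallback
}
```

Remember the inference is only a routing hint: whatever vendor the UA suggests, the verdict still comes from the IP check.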
#POST /v1/bot/detect/{vendor} — Vendor-specific
- Best for: When routing already knows a suspected vendor (e.g., URLs dedicated to Google).
- Vendors: `google`, `bing`, `duck`, `qwant`, `meta`, `yandex`, `seznam`, `openai`.
- Note: UA is informative; the final decision is IP-based (with optional reverse-DNS/ASN checks).
#Quick start
```shell
# Auto-detect
curl -X POST "https://api.yeb.to/v1/bot/detect" \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{ "ip": "66.249.66.1", "ua": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" }'

# Vendor-specific (Google)
curl -X POST "https://api.yeb.to/v1/bot/detect/google" \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{ "ip": "66.249.66.1" }'
```
```js
// JS fetch example
fetch('https://api.yeb.to/v1/bot/detect/bing', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <YOUR_API_KEY>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ ip: '40.77.167.129' })
})
  .then(r => r.json())
  .then(console.log)
  .catch(console.error);
```
#Parameters that actually matter
| Param | Required | What to pass in practice | Why it matters |
|---|---|---|---|
| `ip` | Yes | Client IPv4/IPv6 you’re checking (edge-extracted; not `X-Forwarded-For` if you’re not sure). | Final decision is IP-based. UA can be spoofed; IP ranges aren’t. |
| `ua` | No | Send it if you have it. We’ll infer the vendor and annotate `ua_match`, but the IP stays decisive. | Helps explain “why” (was the UA consistent with the result?). |
| `verify_rdns` | No | `true` if you want a reverse-DNS → forward match (Google/Bing). | Extra safety when you need to be 100% sure even if the IP is inside a large range. |
| `strict_rdns` | No | `true` to fail if the rDNS check doesn’t pass (only effective when `verify_rdns` is `true`). | Forces `ok = false` unless DNS proves it. |
| `verify_asn` | No | `true` to cross-check by ASN (if enabled server-side). | Useful for vendors without official JSON ranges (e.g., Meta). |
| `strict_asn` | No | `true` to fail if the ASN check doesn’t confirm. | Pairs with `verify_asn` for stricter policies. |
| `asn` | No | Override the vendor ASN when you know it (e.g., `32934` for Meta). | Handy for custom policies or fast vendor updates. |
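Putting the flags together, a stricter request body can be assembled from a small policy helper. A sketch only: `buildDetectPayload` is a hypothetical client-side helper (not part of the API), and `203.0.113.7` is a documentation placeholder IP:

```javascript
// Build a /v1/bot/detect request body from a simple policy object.
// Strict mode turns on rDNS + ASN verification and makes both decisive.
function buildDetectPayload(ip, { ua = null, strict = false, asn = null } = {}) {
  const payload = { ip };
  if (ua) payload.ua = ua;
  if (strict) {
    payload.verify_rdns = true;
    payload.strict_rdns = true;
    payload.verify_asn = true;
    payload.strict_asn = true;
  }
  if (asn !== null) payload.asn = asn; // e.g. 32934 to pin Meta's ASN
  return payload;
}

// Example: strict verification with an explicit ASN override.
const body = buildDetectPayload('203.0.113.7', { strict: true, asn: 32934 });
```

Keeping the flag logic in one place makes it easy to tighten or relax verification per route without touching call sites.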
#Reading and acting on responses
You mostly care about `result.ok`, `result.vendor`, and `result.reason`.
```jsonc
{
  "result": {
    "vendor": "google",
    "ok": true,
    "reason": "ip_match",        // or "ip_and_ua_match", etc.
    "ua_present": true,          // UA was sent explicitly as a param
    "ua_source": "param",        // "param" | "header" | null
    "ua_match": true,            // UA pattern fit the vendor
    "ip_match": true,            // IP fell inside vendor ranges
    "dns_verified": false,       // set true if verify_rdns passed
    "rdns_checked": false,
    "asn_verified": false,
    "asn_checked": false,
    "cidr_empty": false,         // true if no ranges were available
    "ip_kind": "search_bot",     // Google only: search_bot | special_crawler | user_triggered_google | user_triggered_user | cdn_proxy | unknown
    "ip_kind_source": "json",    // "json" | "dns_ptr" | null
    "ptr": "crawl-66-249-66-1.googlebot.com"
  }
}
```
#Typical reason values
- `ip_and_ua_match` — Best case: UA matched the vendor pattern and the IP belongs to the vendor.
- `ip_match` — Good: IP belongs to vendor ranges; UA was missing or didn’t match (still OK to treat as the vendor).
- `ip_match_but_ua_not_matched` — Likely a legitimate bot with a non-standard UA; don’t block.
- `ua_not_matched` — UA said “I’m X”, but we couldn’t confirm it; check `ip_match`.
- `ip_not_in_vendor_ranges` — Treat as non-vendor; consider bot throttling or block rules.
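The reason values map naturally onto handling policies. A minimal sketch; the `actionForReason` helper and its action names are ours, not part of the API:

```javascript
// Map a result.reason value to a suggested handling policy.
function actionForReason(reason) {
  switch (reason) {
    case 'ip_and_ua_match':
    case 'ip_match':
    case 'ip_match_but_ua_not_matched':
      return 'allow';             // IP confirmed the vendor; don't block
    case 'ua_not_matched':
      return 'inspect';           // UA claim unconfirmed; fall back to ip_match
    case 'ip_not_in_vendor_ranges':
      return 'throttle_or_block'; // not the vendor it claims to be
    default:
      return 'inspect';           // unknown reason: log and review
  }
}
```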
#Recommended actions
- ok = true → Allow, skip WAF challenges, crawl-budget friendly rate-limit.
- ok = false and UA claims vendor → 403 or serve simplified page; log for audit.
- Need absolute certainty → Call with `verify_rdns=true` (plus `strict_rdns=true` if you want a hard fail).
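In an edge middleware, the rules above collapse into one decision function. A sketch under our own assumptions: the `decide` helper and the claimed-vendor regex are illustrative, not prescribed by the API:

```javascript
// Decide what to do with a request given the API result and the raw UA.
// ok=true → allow and skip challenges; ok=false while the UA claims a
// known bot → 403 (spoof); otherwise fall through to normal handling.
const CLAIMS_BOT = /Googlebot|bingbot|DuckDuckBot|YandexBot|GPTBot/i;

function decide(result, ua) {
  if (result.ok) return { action: 'allow', skipChallenges: true };
  if (ua && CLAIMS_BOT.test(ua)) return { action: 'deny', status: 403 };
  return { action: 'continue' }; // regular visitor: normal WAF/rate-limit rules
}
```

Serving a simplified page instead of a hard 403 is equally valid; the point is that spoofed-UA traffic never gets the crawl-friendly fast path.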
#Troubleshooting & field notes
- “UA says Googlebot but ok=false” → You saw a spoofed UA. The key signal is `ip_match`. Consider returning 403 or a lightweight page.
- Rapid vendor changes → Prefer vendor JSON ranges (handled automatically). If you need hard proof, enable `verify_rdns`.
- IPv6 traffic → Fully supported. If your edge strips IPv6, fix that first.
- False negatives for Meta → Use `verify_asn=true` (and `asn=32934` if relevant) for stronger confirmation.
- Rate limits → Respect 429 with exponential backoff. Keep request IDs in your logs.
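The 429 advice can be implemented with a small retry wrapper. A sketch, assuming a fetch-like function; full-jitter backoff is our choice here, not something the API mandates:

```javascript
// Retry a fetch-like call on HTTP 429 with exponential backoff + jitter.
// doFetch: () => Promise<{ status: number, ... }>
async function fetchWithBackoff(doFetch, { retries = 4, baseMs = 250 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await doFetch();
    if (res.status !== 429 || attempt === retries) return res;
    const delay = Math.random() * baseMs * 2 ** attempt; // full jitter
    await new Promise(resolve => setTimeout(resolve, delay));
  }
}
```

If the response carries a `Retry-After` header, prefer honoring it over the computed delay.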
#API Changelog
- Added response fields (`ip_kind`, `ip_kind_source`, `ptr`) and improved vendor JSON ingestion (Google “special crawlers” and user-triggered fetchers).
- Added `verify_rdns`/`strict_rdns` and `verify_asn`/`strict_asn` flags for stricter verification flows.
- Added vendor-specific endpoints (`/google`, `/bing`, `/meta`, `/yandex`, etc.). Response schema unified across vendors.