Scrape a Dynamic Website

Dynamic websites load content using JavaScript after the initial page load. This guide shows how to collect data from these sites using FourA's browser endpoint.

The Problem

When you send a standard HTTP request to a JavaScript-heavy website, you get the HTML shell but not the actual content. The data you need (product listings, prices, search results) is loaded by JavaScript after the page renders in a browser.

This is increasingly common with modern frameworks like React, Vue, Angular, and Next.js.
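
To make the problem concrete, here is a minimal sketch of how the "shell without content" symptom can be detected. The helper name content_in_static_html and the sample HTML are hypothetical, made up for illustration; in practice the input would be whatever a plain HTTP request returned for your target page.

```python
# Hypothetical helper, not part of FourA's API.
def content_in_static_html(raw_html: str, marker: str) -> bool:
    """Return True if `marker` already appears in HTML fetched without a browser."""
    return marker.lower() in raw_html.lower()


# Made-up example of a single-page-app shell: an empty mount point plus a script.
shell_html = """<!DOCTYPE html>
<html>
  <head><title>Products</title></head>
  <body>
    <div id="root"></div>
    <script src="/static/bundle.js"></script>
  </body>
</html>"""

# If the marker is missing from the static HTML, the page likely needs a browser.
needs_browser = not content_in_static_html(shell_html, "product-grid")
print(needs_browser)  # True
```

If the marker is already present in the static HTML, a standard HTTP request may be enough and a browser request is probably unnecessary.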

The Solution: Browser Requests

FourA's browser endpoint (POST /api/browser/) opens your URL in a real Chrome browser that:

  1. Loads the page
  2. Executes all JavaScript
  3. Waits for the content to render
  4. Returns the fully rendered HTML

Step 1: Identify What You Need

Before making the request, visit the target page in your browser and use DevTools (F12) to find a piece of text or element that confirms the content has loaded. For example:

  • A product name that appears after JS renders
  • A CSS class like product-grid in the rendered HTML
  • A text string like "results" that only appears when data loads
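
One way to choose a marker is to compare the static source with the rendered DOM and keep a string that appears only after rendering. The sketch below assumes you have saved both versions of the page by hand (for example from "View Source" and from DevTools); pick_check_text and both HTML samples are hypothetical.

```python
from typing import Iterable, Optional


# Hypothetical helper, not part of FourA's API.
def pick_check_text(static_html: str, rendered_html: str,
                    candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate found in the rendered HTML but not in the static HTML."""
    for candidate in candidates:
        if candidate in rendered_html and candidate not in static_html:
            return candidate
    return None


# Made-up samples: what "View Source" shows vs. what DevTools shows after JS runs.
static_html = '<div id="app"></div><script src="/app.js"></script>'
rendered_html = (
    '<div id="app"><div class="product-grid"><p>12 results</p></div></div>'
    '<script src="/app.js"></script>'
)

marker = pick_check_text(static_html, rendered_html, ["app", "product-grid", "results"])
print(marker)  # product-grid
```

A string such as "app" is rejected here because it is present before any JavaScript runs, so it would not prove that the data loaded.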

Step 2: Send a Browser Request

curl -X POST https://eu.api.foura.ai/api/browser/ \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/products",
    "timeout_ms": 15000,
    "checkText": "product-grid"
  }'

The checkText option tells FourA to verify that the string "product-grid" appears in the rendered page. If the string does not appear before timeout_ms elapses (15 seconds in this example), the request fails, which tells you the content did not load.
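
A client-side double check can complement checkText. The sketch below assumes the response JSON carries the rendered HTML in a body field, as the parsing examples in this guide do. The helper name extract_rendered_html and the sample response are hypothetical, and the shape of a failed response is not specified here, so a missing body is simply treated as an error.

```python
# Hypothetical helper, not part of FourA's API.
def extract_rendered_html(response_json: dict, check_text: str) -> str:
    """Return the rendered HTML, raising if it is missing or lacks the marker."""
    html = response_json.get("body")
    if not isinstance(html, str) or not html:
        raise ValueError("response has no rendered HTML in 'body'")
    if check_text not in html:
        raise ValueError(f"marker {check_text!r} not found in rendered HTML")
    return html


# Made-up response, shaped like the JSON this guide parses.
sample = {"body": '<div class="product-grid"><div class="product-card"></div></div>'}
html = extract_rendered_html(sample, "product-grid")
print("product-grid" in html)  # True
```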

Step 3: Parse the HTML

The response contains the fully rendered HTML in the body field. Parse it with your preferred library:

Python (BeautifulSoup)

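import requests
from bs4 import BeautifulSoup

# Ask FourA to render the page in a browser; json= sets the Content-Type header.
resp = requests.post(
    "https://eu.api.foura.ai/api/browser/",
    headers={"X-API-Key": "YOUR_API_KEY"},
    json={
        "url": "https://example.com/products",
        "timeout_ms": 15000,
        "checkText": "product-grid",
    },
    timeout=30,  # client-side limit in seconds, longer than timeout_ms
)
resp.raise_for_status()  # stop here on an HTTP error status

html = resp.json()["body"]
soup = BeautifulSoup(html, "html.parser")

# Print name and price for each card, skipping cards that lack either element.
for product in soup.select(".product-card"):
    name = product.select_one(".product-name")
    price = product.select_one(".product-price")
    if name is not None and price is not None:
        print(f"{name.text.strip()}: {price.text.strip()}")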

Node.js (cheerio)

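import * as cheerio from 'cheerio';

// Request the rendered page from FourA's browser endpoint.
const resp = await fetch('https://eu.api.foura.ai/api/browser/', {
  method: 'POST',
  headers: { 'X-API-Key': 'YOUR_API_KEY', 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com/products',
    timeout_ms: 15000,
    checkText: 'product-grid'
  })
});
if (!resp.ok) {
  throw new Error(`Browser request failed with HTTP ${resp.status}`);
}

const { body: html } = await resp.json();
const $ = cheerio.load(html);

// Print name and price for each product card.
$('.product-card').each((i, el) => {
  const name = $(el).find('.product-name').text().trim();
  const price = $(el).find('.product-price').text().trim();
  console.log(`${name}: ${price}`);
});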

Troubleshooting

Still getting empty content?

  • Verify the page actually uses JavaScript rendering: compare "View Source" (the static HTML) with the DevTools Elements panel (the rendered DOM)
  • Increase timeout_ms: some pages load slowly
  • Check if the page requires authentication or cookies (use the cookies parameter)
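
Raising timeout_ms can be automated by retrying with progressively larger values. This is a minimal sketch, not a FourA feature: send is a hypothetical callable that posts the payload to the browser endpoint and returns the rendered HTML, or None when the request fails. The timeout ladder is an arbitrary choice, and fake_send only simulates a slow page so the example runs without network access.

```python
from typing import Callable, Optional, Sequence


# Hypothetical helper, not part of FourA's API.
def fetch_with_growing_timeout(
    send: Callable[[dict], Optional[str]],
    url: str,
    check_text: str,
    timeouts_ms: Sequence[int] = (15000, 30000, 60000),
) -> Optional[str]:
    """Try each timeout in order and return the first successful HTML, else None."""
    for timeout_ms in timeouts_ms:
        payload = {"url": url, "timeout_ms": timeout_ms, "checkText": check_text}
        html = send(payload)
        if html is not None:
            return html
    return None


attempts = []


def fake_send(payload: dict) -> Optional[str]:
    """Stand-in for a real request: succeeds only with at least 30 s of timeout."""
    attempts.append(payload["timeout_ms"])
    if payload["timeout_ms"] >= 30000:
        return '<div class="product-grid"></div>'
    return None


html = fetch_with_growing_timeout(fake_send, "https://example.com/products", "product-grid")
print(attempts)  # [15000, 30000]
```

In real use, send would wrap the requests.post call from Step 3 and return None when the request fails.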

Getting a captcha page?

  • For single/HTTP requests, switch to the proxy endpoint (POST /api/proxy/) for automatic IP rotation.
  • For browser requests, use the browser endpoint's proxy parameter rather than wrapping the request in the proxy endpoint. The proxy endpoint wraps only single/HTTP requests, not browser requests.

curl -X POST "https://eu.api.foura.ai/api/browser/" \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "proxy": "http://proxy-url:port"}'

Last updated: May 13, 2026