How to Scrape a Dynamic Website
Dynamic websites load content using JavaScript after the initial page load. This guide shows how to collect data from these sites using FourA browser tasks.
The Problem
When you send a standard HTTP request to a JavaScript-heavy website, you get the HTML shell but not the actual content. The data you need (product listings, prices, search results) is loaded by JavaScript after the page renders in a browser.
This is increasingly common with modern frameworks like React, Vue, Angular, and Next.js.
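To see the problem concretely, here is a minimal sketch. The shell HTML below is an invented example of what a plain GET typically returns from a client-rendered app: the data container exists but is empty until JavaScript runs.

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Invented example of a client-rendered app's raw HTML: the container
# is empty in the source; JavaScript fills it in after load.
shell_html = """
<html>
  <body>
    <div id="root"></div>
    <script src="/static/js/main.js"></script>
  </body>
</html>
"""

soup = BeautifulSoup(shell_html, "html.parser")
root = soup.select_one("#root")
print(repr(root.get_text(strip=True)))  # '' — no product data in the raw HTML
```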
The Solution: Browser Tasks
FourA browser tasks run a full headless Chrome instance that:
- Loads the page
- Executes all JavaScript
- Waits for the content to render
- Returns the fully rendered HTML
Step 1: Identify What You Need
Before making the request, visit the target page in your browser and use DevTools (F12) to find the CSS selector of the element that contains your data. For example:
- .product-grid for a product listing
- [data-testid="results"] for search results
- #price-display for a price element
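You can sanity-check a selector before sending a task by running it against markup copied from DevTools ("Copy > Copy outerHTML" on the element that holds your data). A small sketch; the fragment below is invented:

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

# Invented fragment standing in for markup copied out of DevTools.
fragment = """
<div class="product-grid">
  <div class="product-card" data-testid="results">
    <span id="price-display">$19.99</span>
  </div>
</div>
"""

soup = BeautifulSoup(fragment, "html.parser")

# Each candidate selector should match at least one element in the fragment.
for selector in [".product-grid", '[data-testid="results"]', "#price-display"]:
    print(selector, "->", "match" if soup.select(selector) else "no match")
```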
Step 2: Send a Browser Task
curl -X POST https://eu.api.foura.ai/api/v1/tasks \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/products",
    "type": "browser",
    "options": {
      "waitFor": ".product-grid",
      "timeout": 15000
    }
  }'
The waitFor option tells FourA to wait until the .product-grid element appears in the DOM before capturing the page, so the asynchronously loaded content is present in the returned HTML.
Step 3: Parse the HTML
The response contains the fully rendered HTML. Parse it with your preferred library:
Python (BeautifulSoup)
import requests
from bs4 import BeautifulSoup
resp = requests.post("https://eu.api.foura.ai/api/v1/tasks", headers={
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}, json={
    "url": "https://example.com/products",
    "type": "browser",
    "options": {"waitFor": ".product-grid"}
})
html = resp.json()["content"]
soup = BeautifulSoup(html, "html.parser")

for product in soup.select(".product-card"):
    name = product.select_one(".product-name").text.strip()
    price = product.select_one(".product-price").text.strip()
    print(f"{name}: {price}")
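Note that select_one returns None when a selector matches nothing, so .text raises AttributeError on any card that is missing a field. A defensive variant of the loop above (the sample snippet is invented):

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_products(soup):
    """Collect (name, price) pairs, skipping cards with missing fields
    instead of crashing on `.text` of None."""
    rows = []
    for card in soup.select(".product-card"):
        name = card.select_one(".product-name")
        price = card.select_one(".product-price")
        if name and price:
            rows.append((name.get_text(strip=True), price.get_text(strip=True)))
    return rows

# Quick check with an invented snippet: the second card lacks a price.
sample = ('<div class="product-card"><span class="product-name">Lamp</span>'
          '<span class="product-price">$20</span></div>'
          '<div class="product-card"><span class="product-name">Desk</span></div>')
print(extract_products(BeautifulSoup(sample, "html.parser")))  # [('Lamp', '$20')]
```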
Node.js (cheerio)
import * as cheerio from 'cheerio';
const resp = await fetch('https://eu.api.foura.ai/api/v1/tasks', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer YOUR_API_KEY', 'Content-Type': 'application/json' },
  body: JSON.stringify({ url: 'https://example.com/products', type: 'browser', options: { waitFor: '.product-grid' } })
});
const { content } = await resp.json();
const $ = cheerio.load(content);

$('.product-card').each((i, el) => {
  console.log($(el).find('.product-name').text(), $(el).find('.product-price').text());
});
Troubleshooting
Still getting empty content?
- Make sure the waitFor selector actually exists on the target page
- Increase the timeout: some pages load slowly
- Check if the page requires authentication or cookies
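For slow pages, one pragmatic pattern is to retry the task with progressively larger timeout values. A sketch; send_task is a placeholder for whatever function issues the Step 2 request and returns the rendered HTML (an empty string when nothing rendered):

```python
def fetch_with_growing_timeout(send_task, timeouts=(15000, 30000, 60000)):
    """Retry with larger `timeout` values until the page renders content.

    `send_task(timeout_ms)` is a placeholder for a function that posts the
    browser task from Step 2 and returns the rendered HTML ("" if empty).
    """
    for timeout_ms in timeouts:
        html = send_task(timeout_ms)
        if html.strip():
            return html
    raise RuntimeError("no content rendered within any timeout")

# Demo with a stand-in sender that only succeeds at the largest timeout:
fake_send = lambda t: '<div class="product-grid">...</div>' if t >= 60000 else ""
print(fetch_with_growing_timeout(fake_send))
```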
Getting a captcha page?
- Switch to the proxy type for automatic anti-bot handling
- Try combining browser and proxy behavior by using the proxy type with geo-targeting
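A proxy-type task body looks like the browser task with the type swapped. The "type": "proxy" value comes from the troubleshooting advice above, but the geo-targeting option name shown here ("country") is an assumption — check the API reference for the real field:

```python
# Hypothetical proxy-type payload: "type": "proxy" is documented above,
# but "country" is an assumed name for the geo-targeting parameter.
proxy_task = {
    "url": "https://example.com/products",
    "type": "proxy",
    "options": {"country": "de"},  # assumed geo-targeting field
}
print(proxy_task["type"])
```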
Next Steps
- Choosing the Right Task Type: When to use browser vs. single
- Monitor Competitor Prices: Full price tracking tutorial
- Anti-Bot Protection: Handle protected sites