Monitor Competitor Prices
Set up automated price monitoring across competitor websites using the FourA API.
What You'll Build
A Python script that:
- Fetches product pages from a list of competitor URLs
- Extracts price data from the HTML
- Logs results to a CSV file
- Runs on a schedule
Prerequisites
- A FourA API key (get one here)
- Python 3.8+
- The `requests` and `beautifulsoup4` packages:

```
pip install requests beautifulsoup4
```
Step 1: Define Your Targets
Create a list of product URLs to monitor:
```python
targets = [
    {"name": "Competitor A - Widget", "url": "https://competitor-a.com/widget", "selector": ".price"},
    {"name": "Competitor B - Widget", "url": "https://competitor-b.com/products/widget", "selector": "[data-price]"},
    {"name": "Competitor C - Widget", "url": "https://competitor-c.com/item/123", "selector": ".product-price span"},
]
```
Step 2: Fetch Pages via FourA
```python
import requests
import time

API_URL = "https://eu.api.foura.ai/api/v1/tasks"
API_KEY = "YOUR_API_KEY"

def fetch_page(url, task_type="single", retries=3):
    resp = requests.post(API_URL, headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    }, json={"url": url, "type": task_type})
    if resp.status_code == 429 and retries > 0:
        # Rate limited: wait, then retry a bounded number of times
        time.sleep(5)
        return fetch_page(url, task_type, retries - 1)
    resp.raise_for_status()
    data = resp.json()
    return data.get("content", "")
```
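The fixed five-second sleep above handles an occasional 429; if you see frequent rate limits or transient network errors, exponential backoff is gentler on the API. A minimal, generic sketch (the `with_backoff` helper is illustrative, not part of the FourA client):

```python
import time

def with_backoff(fn, retries=4, base_delay=1.0):
    """Call fn(); on exception, retry with exponentially growing delays."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

You could then wrap any fetch as `html = with_backoff(lambda: fetch_page(target["url"]))`.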
Step 3: Extract Prices
```python
from bs4 import BeautifulSoup
import re

def extract_price(html, selector):
    soup = BeautifulSoup(html, "html.parser")
    element = soup.select_one(selector)
    if not element:
        return None
    # Extract numeric price from text like "$49.99" or "49,99 EUR"
    text = element.get_text(strip=True)
    match = re.search(r'[\d,.]+', text)
    return float(match.group().replace(',', '.')) if match else None
```
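The one-liner above treats every comma as a decimal point, so prices with thousands separators (e.g. "$1,234.56") will fail to parse. If your competitors' listings use them, a hardened variant (a sketch; `parse_price` is a name introduced here, not from the API) can disambiguate by treating the last separator as the decimal point:

```python
import re

def parse_price(text):
    """Parse a price string where ',' and '.' may be either
    thousands or decimal separators."""
    match = re.search(r'[\d,.]+', text)
    if not match:
        return None
    raw = match.group()
    if ',' in raw and '.' in raw:
        # Both present: whichever comes last is the decimal separator
        if raw.rfind(',') > raw.rfind('.'):
            raw = raw.replace('.', '').replace(',', '.')  # "1.234,56" -> "1234.56"
        else:
            raw = raw.replace(',', '')                    # "1,234.56" -> "1234.56"
    else:
        # A lone comma is assumed to be a decimal separator
        raw = raw.replace(',', '.')                       # "49,99" -> "49.99"
    return float(raw)
```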
Step 4: Run and Log Results
```python
import csv
from datetime import datetime

def monitor_prices():
    timestamp = datetime.now().isoformat()
    results = []
    for target in targets:
        html = fetch_page(target["url"])
        price = extract_price(html, target["selector"])
        results.append({
            "timestamp": timestamp,
            "name": target["name"],
            "url": target["url"],
            "price": price
        })
        print(f"{target['name']}: {price}")
        time.sleep(1)  # Be polite between requests
    # Append to CSV, writing the header only if the file is new or empty
    with open("prices.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["timestamp", "name", "url", "price"])
        if f.tell() == 0:
            writer.writeheader()
        writer.writerows(results)

if __name__ == "__main__":
    monitor_prices()
```
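Once a few runs have accumulated in prices.csv, you can compare the last two observations per product to spot changes. A sketch (the `price_changes` helper is illustrative, not part of the script above):

```python
import csv

def price_changes(csv_path):
    """Return {name: (previous, latest)} for products with >= 2 logged prices."""
    history = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["price"]:  # skip failed extractions logged as empty
                history.setdefault(row["name"], []).append(float(row["price"]))
    return {name: (p[-2], p[-1]) for name, p in history.items() if len(p) >= 2}
```

For example, `price_changes("prices.csv")` might return `{"Competitor A - Widget": (49.99, 44.99)}`, which you could feed into an alerting step.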
Step 5: Schedule It
Run the script hourly with cron:
```
crontab -e
# Add this line:
0 * * * * cd /path/to/project && python3 monitor.py >> monitor.log 2>&1
```
Tips
- Start with `single`; switch to `browser` if pages use JavaScript rendering
- Add error handling: sites change layouts, so log failures separately
- Keep selectors updated: when a competitor redesigns, update the CSS selector
- Respect the sites: space out requests, avoid peak hours, follow robots.txt
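The error-handling tip can be sketched as a small wrapper that logs failures to a separate file instead of crashing the whole run (`safe_extract` and the stub extractors are illustrative names):

```python
import logging

# Failures go to their own file, separate from the price log
logging.basicConfig(filename="failures.log", level=logging.WARNING,
                    format="%(asctime)s %(message)s")

def safe_extract(name, html, selector, extract):
    """Run the extractor; on an exception or a missing price, log and return None."""
    try:
        price = extract(html, selector)
    except Exception as exc:
        logging.warning("%s: extraction error: %s", name, exc)
        return None
    if price is None:
        logging.warning("%s: selector %r matched nothing", name, selector)
    return price
```

In the monitoring loop, `price = safe_extract(target["name"], html, target["selector"], extract_price)` keeps one broken selector from aborting the other targets.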
Next Steps
- Choosing the Right Task Type: Pick the best approach
- Error Handling: Handle failures gracefully
- Scrape a Dynamic Website: Handle JavaScript-rendered pages