Competitor Price Monitoring

Set up automated price monitoring across competitor websites using the FourA API.

What you'll build

A Python script that:

Fetches product pages from a list of competitor URLs
Extracts price data from the HTML
Records the results in a CSV file
Runs on a schedule

Prerequisites

A FourA API key (get one here)
Python 3.8 or later
The requests and beautifulsoup4 packages

pip install requests beautifulsoup4

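Step 1: Define your targets

Create a list of the product URLs to monitor. Each entry pairs a URL with the CSS selector that locates the price on that page: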

targets = [
    {"name": "Competitor A - Widget", "url": "https://competitor-a.com/widget", "selector": ".price"},
    {"name": "Competitor B - Widget", "url": "https://competitor-b.com/products/widget", "selector": "[data-price]"},
    {"name": "Competitor C - Widget", "url": "https://competitor-c.com/item/123", "selector": ".product-price span"},
]
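Because every later step relies on each target carrying a name, a URL, and a selector, it can help to check the list before making any requests. The following is a minimal sketch of such a check; `validate_targets` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration and are not part of the FourA API.

```python
from urllib.parse import urlparse

REQUIRED_KEYS = ("name", "url", "selector")  # the keys used by the targets list above

def validate_targets(targets):
    """Return a list of human-readable problems; an empty list means all targets look usable."""
    problems = []
    for index, target in enumerate(targets):
        missing = [key for key in REQUIRED_KEYS if not target.get(key)]
        if missing:
            problems.append(f"target {index}: missing {', '.join(missing)}")
            continue
        parsed = urlparse(target["url"])
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"target {index}: invalid URL {target['url']!r}")
    return problems

example_targets = [
    {"name": "Competitor A - Widget", "url": "https://competitor-a.com/widget", "selector": ".price"},
    {"name": "Broken entry", "url": "competitor-b.com/widget", "selector": ".price"},
    {"name": "No selector", "url": "https://competitor-c.com/item/123"},
]
problems = validate_targets(example_targets)
for problem in problems:
    print(problem)
```

Running the check once at startup surfaces typos in the configuration before a scheduled run silently records empty prices.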

Step 2: Fetch pages through FourA

import requests
import time

SINGLE_URL = "https://eu.api.foura.ai/api/single/"
BROWSER_URL = "https://eu.api.foura.ai/api/browser/"
API_KEY = "YOUR_API_KEY"

HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def fetch_page(url, use_browser=False, max_retries=3):
    """Fetch a page through FourA and return its HTML as a string."""
    if use_browser:
        endpoint, field = BROWSER_URL, "body"
        payload = {"url": url, "timeout_ms": 15000}
    else:
        endpoint, field = SINGLE_URL, "data"
        payload = {"method": "GET", "url": url, "unblocker": True}

    for attempt in range(max_retries + 1):
        resp = requests.post(endpoint, headers=HEADERS, json=payload, timeout=60)
        if resp.status_code != 429 or attempt == max_retries:
            break
        # Rate limited: wait, then retry a bounded number of times
        time.sleep(5)

    # Raise on HTTP errors instead of silently parsing an error response
    resp.raise_for_status()
    return resp.json().get(field, "")
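The snippet above hardcodes the API key as a placeholder. One common alternative is to read the key from an environment variable so it never lands in version control. The sketch below assumes a variable named `FOURA_API_KEY`; that name, along with `load_api_key` and `build_headers`, is a hypothetical choice for this example rather than anything FourA prescribes.

```python
import os

def load_api_key(env=os.environ, name="FOURA_API_KEY"):
    """Read the API key from an environment mapping and fail early if it is absent."""
    key = env.get(name, "").strip()
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError(f"Set the {name} environment variable to your FourA API key")
    return key

def build_headers(api_key):
    """Build the same request headers used above, from a key supplied at runtime."""
    return {"X-API-Key": api_key, "Content-Type": "application/json"}

# Example with an explicit mapping; in the real script, call load_api_key() with no arguments
headers = build_headers(load_api_key({"FOURA_API_KEY": " abc123 "}))
print(headers["X-API-Key"])
```

Failing early with a clear message is usually easier to diagnose under cron than an authentication error buried in a log file.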

Step 3: Extract the price

from bs4 import BeautifulSoup
import re

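def extract_price(html, selector):
    soup = BeautifulSoup(html, "html.parser")
    element = soup.select_one(selector)
    if not element:
        return None
    # Extract numeric price from text like "$49.99", "49,99 EUR", or "$1,299.99"
    text = element.get_text(strip=True)
    match = re.search(r'\d[\d.,]*', text)
    if not match:
        return None
    number = match.group().rstrip('.,')
    last_sep = max(number.rfind('.'), number.rfind(','))
    # A final separator followed by one or two digits is the decimal mark; all others group thousands
    if last_sep != -1 and len(number) - last_sep - 1 <= 2:
        return float(re.sub(r'[.,]', '', number[:last_sep]) + '.' + number[last_sep + 1:])
    return float(re.sub(r'[.,]', '', number))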

Step 4: Run and record the results

import csv
from datetime import datetime

def monitor_prices():
    timestamp = datetime.now().isoformat()
    results = []

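    for target in targets:
        try:
            html = fetch_page(target["url"])
            price = extract_price(html, target["selector"])
        except (requests.RequestException, ValueError) as exc:
            # Log the failure and keep going so one bad site does not stop the run
            print(f"{target['name']}: failed ({exc})")
            price = None
        else:
            print(f"{target['name']}: {price}")
        results.append({
            "timestamp": timestamp,
            "name": target["name"],
            "url": target["url"],
            "price": price
        })
        time.sleep(1)  # Be polite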

    # Append to CSV
    with open("prices.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["timestamp", "name", "url", "price"])
        if f.tell() == 0:
            writer.writeheader()
        writer.writerows(results)

if __name__ == "__main__":
    monitor_prices()
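Once prices.csv has accumulated a few runs, the log can be used to spot price changes. The sketch below compares each product's two most recent recorded prices. It assumes the column layout written by the script above (timestamp, name, url, price) with rows in chronological order; `latest_changes` is a hypothetical helper name. The example writes a small sample file so it can run on its own.

```python
import csv
import os
import tempfile
from collections import defaultdict

def latest_changes(csv_path="prices.csv"):
    """Compare each product's two most recent recorded prices and return those that differ."""
    history = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["price"]:  # failed extractions are stored as empty cells
                history[row["name"]].append(float(row["price"]))

    changes = {}
    for name, prices in history.items():
        if len(prices) >= 2 and prices[-1] != prices[-2]:
            changes[name] = (prices[-2], prices[-1])  # (previous, current)
    return changes

# Build a small sample log in a temporary directory for demonstration
sample_path = os.path.join(tempfile.mkdtemp(), "prices.csv")
with open(sample_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "name", "url", "price"])
    writer.writerow(["2026-04-27T10:00:00", "Competitor A - Widget", "https://competitor-a.com/widget", 49.99])
    writer.writerow(["2026-04-27T10:00:00", "Competitor B - Widget", "https://competitor-b.com/products/widget", 52.0])
    writer.writerow(["2026-04-27T11:00:00", "Competitor A - Widget", "https://competitor-a.com/widget", 44.99])
    writer.writerow(["2026-04-27T11:00:00", "Competitor B - Widget", "https://competitor-b.com/products/widget", 52.0])
    writer.writerow(["2026-04-27T12:00:00", "Competitor B - Widget", "https://competitor-b.com/products/widget", None])

sample_changes = latest_changes(sample_path)
print(sample_changes)
```

Rows with an empty price are skipped, so a single failed extraction is not reported as a price change.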

Step 5: Set up a schedule

Use cron to run the script every hour:

crontab -e
# Add this line:
0 * * * * cd /path/to/project && python3 monitor.py >> monitor.log 2>&1
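On systems where cron is not available, a plain Python loop can serve as a rough substitute. The following is a minimal sketch, not a replacement for a proper scheduler; `run_every` is a hypothetical helper, and its `sleep` and `clock` parameters exist only so the loop can be exercised without waiting.

```python
import time

def run_every(interval_seconds, job, max_runs=None, sleep=time.sleep, clock=time.monotonic):
    """Call job() repeatedly, starting each run interval_seconds after the previous start."""
    runs = 0
    while max_runs is None or runs < max_runs:
        started = clock()
        try:
            job()
        except Exception as exc:  # keep the loop alive if one run fails
            print(f"run failed: {exc}")
        runs += 1
        if max_runs is not None and runs >= max_runs:
            break
        # Subtract the time the job took so runs do not drift later and later
        sleep(max(0.0, interval_seconds - (clock() - started)))
    return runs

# Demonstration with a fake clock and a recording sleep, so nothing actually waits
calls = []
sleeps = []
completed = run_every(3600, lambda: calls.append("ran"), max_runs=3,
                      sleep=sleeps.append, clock=lambda: 0.0)
print(completed, sleeps)
```

In the monitoring script this would be invoked as `run_every(3600, monitor_prices)`. Unlike cron, the loop stops when the process exits, so it needs to be kept running by some other means.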

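Tips

Start with the single endpoint, and switch to browser if the page relies on JavaScript rendering.
Add error handling: site layouts change, so log each failure individually.
Keep selectors current: update the CSS selectors when a competitor redesigns its pages.
Respect the sites: space out requests, avoid peak hours, and follow robots.txt.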
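The first tip, starting with the single endpoint and switching to browser only when needed, can be sketched as a small wrapper. This is one possible approach under stated assumptions: `price_with_fallback` is a hypothetical helper, and it takes the fetch and extract functions as parameters (in the script these would be `fetch_page` and `extract_price`), treating a missing price as a sign that the page may need JavaScript rendering. The stand-in functions below simulate that situation without network access.

```python
def price_with_fallback(target, fetch, extract):
    """Try the single endpoint first; retry through the browser endpoint if no price is found."""
    for use_browser in (False, True):
        html = fetch(target["url"], use_browser)
        price = extract(html, target["selector"])
        if price is not None:
            return price, use_browser
    return None, True

# Stand-ins that simulate a page whose price only appears after JavaScript runs
def fake_fetch(url, use_browser):
    return "<span class='price'>$49.99</span>" if use_browser else "<div id='app'></div>"

def fake_extract(html, selector):
    return 49.99 if "price" in html else None

example_target = {"name": "Competitor A - Widget", "url": "https://competitor-a.com/widget", "selector": ".price"}
price, used_browser = price_with_fallback(example_target, fake_fetch, fake_extract)
print(price, used_browser)
```

In the script, the call would be `price_with_fallback(target, fetch_page, extract_price)`. A missing price can also mean an outdated selector, so recording which endpoint succeeded helps tell the two cases apart.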

Next steps

Choosing the right endpoint: pick the best approach
Error handling: handle errors gracefully
Scraping dynamic websites: handle JavaScript-rendered pages

Last updated: April 27, 2026