Competitor Price Monitoring

Set up automated price monitoring of competitor websites using the FourA API.

What You'll Build

A Python script that does the following:

Fetches product pages from a list of competitor URLs
Extracts price data from the HTML
Writes the results to a CSV file
Runs on a schedule

Prerequisites

FourA API key (get one here)
Python 3.8+
The requests and beautifulsoup4 packages

pip install requests beautifulsoup4

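Step 1: Define Your Targets

Create a list of the product URLs to monitor. Each entry pairs a URL with the CSS selector that locates the price on that page.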

targets = [
    {"name": "Competitor A - Widget", "url": "https://competitor-a.com/widget", "selector": ".price"},
    {"name": "Competitor B - Widget", "url": "https://competitor-b.com/products/widget", "selector": "[data-price]"},
    {"name": "Competitor C - Widget", "url": "https://competitor-c.com/item/123", "selector": ".product-price span"},
]
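Because every target is a third-party site, it can be useful to check each URL against that site's robots.txt before adding it to the list. The sketch below uses only the standard library; `is_allowed`, `robots_url`, and `sample_robots` are hypothetical names introduced for illustration, and the robots.txt content is made up so the example runs offline.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

def robots_url(url):
    """Build the robots.txt location for the site that hosts url."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"

def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt (the file's text) lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Made-up robots.txt content, used here instead of a live download.
sample_robots = """\
User-agent: *
Disallow: /checkout/
"""

print(robots_url("https://competitor-a.com/widget"))
print(is_allowed(sample_robots, "https://competitor-a.com/widget"))
print(is_allowed(sample_robots, "https://competitor-a.com/checkout/cart"))
```

In a real run you would download the text at `robots_url(target["url"])` once per site and drop any target for which `is_allowed` returns False.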

Step 2: Fetch Pages via FourA

import requests
import time

SINGLE_URL = "https://eu.api.foura.ai/api/single/"
BROWSER_URL = "https://eu.api.foura.ai/api/browser/"
API_KEY = "YOUR_API_KEY"

HEADERS = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

def fetch_page(url, use_browser=False, max_retries=3):
    # The browser endpoint renders JavaScript; the single endpoint is a plain fetch.
    if use_browser:
        endpoint, key = BROWSER_URL, "body"
        payload = {"url": url, "timeout_ms": 15000}
    else:
        endpoint, key = SINGLE_URL, "data"
        payload = {"method": "GET", "url": url, "unblocker": True}

    # Retry a bounded number of times on HTTP 429 instead of recursing without limit.
    for _ in range(max_retries):
        # timeout=60 is an arbitrary client-side limit so a stalled request cannot hang the run
        resp = requests.post(endpoint, headers=HEADERS, json=payload, timeout=60)
        if resp.status_code == 429:
            time.sleep(5)
            continue
        return resp.json().get(key, "")
    return ""  # still rate limited after max_retries attempts
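Since `fetch_page` takes a `use_browser` flag, one way to use it is to try the single endpoint first and only fall back to the browser endpoint when the first response is not usable. The following is a minimal sketch under that assumption; `fetch_with_fallback`, `is_usable`, and `fake_fetch` are hypothetical names, and the stand-in fetcher exists only so the example runs without network access.

```python
def fetch_with_fallback(url, fetch, is_usable):
    """Try the single endpoint first; retry via the browser endpoint if needed.

    fetch:     callable(url, use_browser) -> HTML string, e.g. fetch_page
    is_usable: callable(html) -> bool, e.g. "did the price selector match?"
    Returns (html, used_browser).
    """
    html = fetch(url, False)
    if is_usable(html):
        return html, False
    return fetch(url, True), True

# Stand-in fetcher: the plain response is an empty app shell, and only the
# "browser" response contains the rendered price.
def fake_fetch(url, use_browser):
    if use_browser:
        return '<span class="price">$49.99</span>'
    return "<div id='app'></div>"

html, used_browser = fetch_with_fallback(
    "https://competitor-a.com/widget", fake_fetch, lambda h: "price" in h
)
print(used_browser)  # True: the plain fetch had no price, so the browser was used
```

With the real functions, the call would look like `fetch_with_fallback(url, fetch_page, lambda h: extract_price(h, selector) is not None)`.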

Step 3: Extract Prices

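from bs4 import BeautifulSoup
import re

def extract_price(html, selector):
    soup = BeautifulSoup(html, "html.parser")
    element = soup.select_one(selector)
    if not element:
        return None
    # Extract numeric price from text like "$49.99", "49,99 EUR", or "1,299.99"
    text = element.get_text(strip=True)
    match = re.search(r'\d[\d.,]*', text)
    if not match:
        return None
    number = match.group().rstrip('.,')
    # The last '.' or ',' is the decimal separator only if one or two digits follow it;
    # any other separator is a thousands separator and is dropped.
    sep = max(number.rfind('.'), number.rfind(','))
    if sep != -1 and len(number) - sep - 1 <= 2:
        return float(re.sub(r'[.,]', '', number[:sep]) + '.' + number[sep + 1:])
    return float(re.sub(r'[.,]', '', number))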

Step 4: Run and Log Results

import csv
from datetime import datetime

def monitor_prices():
    timestamp = datetime.now().isoformat()
    results = []

    for target in targets:
        html = fetch_page(target["url"])
        price = extract_price(html, target["selector"])
        results.append({
            "timestamp": timestamp,
            "name": target["name"],
            "url": target["url"],
            "price": price
        })
        print(f"{target['name']}: {price}")
        time.sleep(1)  # Be polite

    # Append to CSV
    with open("prices.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["timestamp", "name", "url", "price"])
        if f.tell() == 0:
            writer.writeheader()
        writer.writerows(results)

if __name__ == "__main__":
    monitor_prices()
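Once prices.csv has accumulated a few runs, the log can be read back to see which prices moved. The sketch below assumes the column layout written by `monitor_prices` (timestamp, name, url, price, with an empty price cell when extraction failed); `latest_changes` is a hypothetical helper, and the sample rows are made up so the example runs without a real log file.

```python
import csv
import io

def latest_changes(csv_file):
    """Return {name: (previous_price, current_price)} for targets whose price changed."""
    history = {}
    for row in csv.DictReader(csv_file):
        if row["price"]:  # skip rows where extraction failed (empty cell)
            history.setdefault(row["name"], []).append(float(row["price"]))
    # Rows are appended in chronological order, so the last two entries
    # per target are the two most recent successful readings.
    return {
        name: (prices[-2], prices[-1])
        for name, prices in history.items()
        if len(prices) >= 2 and prices[-2] != prices[-1]
    }

# Made-up sample data in the same layout as prices.csv
sample = io.StringIO(
    "timestamp,name,url,price\n"
    "2026-04-27T10:00:00,Competitor A - Widget,https://competitor-a.com/widget,49.99\n"
    "2026-04-27T10:00:00,Competitor B - Widget,https://competitor-b.com/products/widget,52.0\n"
    "2026-04-27T10:00:00,Competitor C - Widget,https://competitor-c.com/item/123,\n"
    "2026-04-27T11:00:00,Competitor A - Widget,https://competitor-a.com/widget,44.99\n"
    "2026-04-27T11:00:00,Competitor B - Widget,https://competitor-b.com/products/widget,52.0\n"
    "2026-04-27T11:00:00,Competitor C - Widget,https://competitor-c.com/item/123,60.0\n"
)
changes = latest_changes(sample)
print(changes)
```

Against the real log, the call would be `with open("prices.csv", newline="") as f: changes = latest_changes(f)`.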

Step 5: Schedule It

Use cron to run the script every hour. The entry below assumes the script is saved as monitor.py.

crontab -e
# Add this line:
0 * * * * cd /path/to/project && python3 monitor.py >> monitor.log 2>&1
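On systems without cron, a similar hourly schedule can be approximated from Python itself. This is a rough sketch rather than a replacement for a proper scheduler: `seconds_until_next_hour` and `run_hourly` are hypothetical helpers, the process has to stay running, and local clock changes such as daylight saving time are not handled.

```python
import time
from datetime import datetime, timedelta

def seconds_until_next_hour(now):
    """Seconds from now until the next top of the hour (like cron's `0 * * * *`)."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return (next_hour - now).total_seconds()

def run_hourly(job, max_runs=None):
    """Call job() at the top of every hour; max_runs=None loops forever."""
    runs = 0
    while max_runs is None or runs < max_runs:
        time.sleep(seconds_until_next_hour(datetime.now()))
        job()
        runs += 1

print(seconds_until_next_hour(datetime(2026, 4, 27, 10, 59, 30)))  # 30.0
```

Replacing the `monitor_prices()` call under `if __name__ == "__main__":` with `run_hourly(monitor_prices)` would keep the script running and trigger it once per hour.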

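Tips

Start with the single endpoint, and switch to browser if the page relies on JavaScript rendering
Add error handling: site layouts can change. Log failures separately.
Keep selectors up to date: when a competitor redesigns their site, update your CSS selectors
Respect the sites: space out your requests, avoid peak traffic hours, and honor robots.txt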
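The error-handling tip can be sketched as a small wrapper that turns every failure into a row in a separate log rather than a crash or a silent empty price. `check_target` and `log_failure` are hypothetical helpers, the failures.csv file name is an assumption, and the stub fetch and extract functions below stand in for `fetch_page` and `extract_price` so the example runs offline.

```python
import csv
import io
from datetime import datetime

def check_target(target, fetch, extract):
    """Return (price, error). Exactly one of the two is None."""
    try:
        html = fetch(target["url"])
        price = extract(html, target["selector"])
    except Exception as exc:  # network error, bad JSON, unparsable price, ...
        return None, f"{type(exc).__name__}: {exc}"
    if price is None:
        return None, "selector matched nothing (layout change?)"
    return price, None

def log_failure(out, target, error):
    """Append one failure row to an open file-like object."""
    csv.writer(out).writerow(
        [datetime.now().isoformat(), target["name"], target["url"], error]
    )

target = {"name": "Competitor A - Widget",
          "url": "https://competitor-a.com/widget",
          "selector": ".price"}

# In a real run this would be open("failures.csv", "a", newline="")
failures = io.StringIO()

# Stubs: the page loads, but the selector no longer matches anything.
price, error = check_target(target, lambda url: "<html></html>", lambda html, sel: None)
if error:
    log_failure(failures, target, error)
print(failures.getvalue())
```

Keeping failures in their own file makes it easier to spot a selector that stopped matching after a redesign, since the main price log stays free of error text.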

Next Steps

Choosing the Right Endpoint: pick the approach that best fits your pages
Error Handling: handle failures gracefully
Scraping Dynamic Websites: handle JavaScript-rendered pages

Last updated: April 27, 2026