fix: stop discovery from bursting dozens of yt-dlp calls inside one task

Each search/graph/trending task was calling _fetch_and_index_channel
inline for up to 10-15 newly discovered channels, each making up to 4
yt-dlp calls (1 channel metadata + 3 individual video fetches for
dateless entries). This bypassed the 30-90 s worker gap, producing
bursts of 40-60 calls in rapid succession and hammering YouTube.
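
The burst size follows directly from the figures above. As a rough check, a minimal sketch (the helper names are hypothetical and not part of the codebase) that multiplies channels by calls per channel and estimates how long the same calls would take if the 30-90 s gap were respected:

```python
# Hypothetical helpers for illustration only; the numbers come from the
# commit message (10-15 channels, up to 4 yt-dlp calls each, 30-90 s gap).
CALLS_PER_CHANNEL = 1 + 3  # 1 channel metadata fetch + up to 3 dateless-video fetches


def burst_size(channels: int, calls_per_channel: int = CALLS_PER_CHANNEL) -> int:
    """Worst-case number of yt-dlp calls made inline by one discovery task."""
    return channels * calls_per_channel


def paced_duration_s(calls: int, gap_s: tuple = (30, 90)) -> tuple:
    """Min/max seconds the same calls take with a gap between consecutive calls."""
    gaps = max(calls - 1, 0)  # n calls are separated by n - 1 gaps
    return gaps * gap_s[0], gaps * gap_s[1]


print(burst_size(10), burst_size(15))  # 40 60
print(paced_duration_s(40))            # (1170, 3510)
```

Under these assumptions, 40 calls that previously fired back to back would be spread over roughly 20 to 58 minutes once every call waits for the worker gap.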

Changes:
- _fetch_and_index_channel: removed the dateless-video individual
  fetch loop — one call per channel, videos without published_at are
  simply skipped at discovery time
- _search_and_store and _fetch_graph_for_channel: queue channel
  indexing as separate worker tasks (3 and 2 respectively) so the
  30-90 s gap applies between every yt-dlp call, including channel
  indexing
- update_trending_signal and update_graph_signal (old sync path):
  removed inline _fetch_and_index_channel loops (15 and 10 channels)
- _discovery_task in channels.py: replaced run_full_discovery (old
  synchronous path) with schedule_discovery so sync-all and
  follow-by-url go through the queue system

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 03:17:37 +02:00
parent 0c5b236b77
commit a3346c6e87
2 changed files with 23 additions and 50 deletions


@@ -173,15 +173,11 @@ def _index_channel_task(channel_id: int, user_id: int, max_videos: int = 30):
 def _discovery_task(user_id: int):
-    from ..database import SessionLocal
-    from ..services.discovery import run_full_discovery
-    db = SessionLocal()
+    from ..services.discovery import schedule_discovery
     try:
-        run_full_discovery(db, user_id)
+        schedule_discovery(user_id)
     except Exception:
         pass
-    finally:
-        db.close()
 def _enrich_missing_task(limit: int = 20):