Auto-schedule daily discovery + fix Find More UX + expand query diversity

Auto-discovery daemon:
- Runs every hour and triggers a full discovery for any user whose last run
  was >23 hours ago. The first check happens 5 minutes after startup.
- Tracks run time in user_settings.last_discovery_run (new column).
- Manual Find More also stamps last_discovery_run.
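
A minimal sketch of the "is this user due?" check described above. The function names and the in-memory mapping are hypothetical; the real daemon reads user_settings.last_discovery_run from the database. Treating a user with no recorded run as due is an assumption, not something the commit states.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Threshold taken from the commit message: a run is stale after 23 hours.
STALE_AFTER = timedelta(hours=23)


def is_due(last_run: Optional[datetime], now: datetime) -> bool:
    """Return True if discovery should be triggered for this user."""
    if last_run is None:
        # Assumption: a user who has never run discovery counts as due.
        return True
    return now - last_run > STALE_AFTER


def users_due(last_runs: dict[int, Optional[datetime]], now: datetime) -> list[int]:
    """Return the sorted user ids whose last run is stale on this hourly tick."""
    return sorted(uid for uid, last in last_runs.items() if is_due(last, now))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = {1: None, 2: now - timedelta(hours=2), 3: now - timedelta(hours=30)}
    print(users_due(sample, now))
```

With an hourly tick, a 23-hour threshold (rather than 24) plausibly keeps each user's run from drifting an hour later every day.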

Discovery status endpoint (GET /api/discovery/status):
- Returns pending_count (unseen queue size) and last_run timestamp.
- Shown in the Discover page header so users know queue state at a glance.
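
A sketch of the response body the status endpoint might return. Only the field names pending_count and last_run come from this commit; the helper name, the TypedDict, and the ISO 8601 serialization of the timestamp are assumptions.

```python
from datetime import datetime, timezone
from typing import Optional, TypedDict


class DiscoveryStatus(TypedDict):
    pending_count: int        # size of the unseen queue
    last_run: Optional[str]   # assumed ISO 8601 string, None if never run


def build_status(pending_count: int, last_run: Optional[datetime]) -> DiscoveryStatus:
    """Assemble the payload for GET /api/discovery/status (hypothetical helper)."""
    return {
        "pending_count": pending_count,
        "last_run": last_run.isoformat() if last_run is not None else None,
    }


if __name__ == "__main__":
    print(build_status(12, datetime.now(timezone.utc)))
```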

Find More UX fix:
- Was: kicked off a background task, waited 8 seconds, then refetched, even
  though the task takes minutes.
- Now: button shows "Queued ✓" on success with an explanatory banner
  telling the user it takes a few minutes and also runs daily automatically.

Query diversity:
- Added "best [category] channels" serendipity queries to crawl_by_search.
- Limit raised from 25 to 30 queries per run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 01:58:39 +02:00
parent 4d255647a1
commit 12f54ac5b0
5 changed files with 107 additions and 14 deletions

@@ -264,8 +264,12 @@ def crawl_by_search(db: Session, user_id: int):
     if followed_names:
         sampled_names = random.sample(followed_names, min(15, len(followed_names)))
-    # Combine: tags (most signal) + channel names (broad reach) + categories (fallback)
-    queries = list(dict.fromkeys(top_tags + sampled_names + top_cats))[:25]
+    # Serendipity queries: "best [category] channels" — surfaces curated list videos
+    # which then get their channel indexed; broadens discovery beyond direct tag matches.
+    serendipity = [f"best {cat} channels" for cat in top_cats[:3]]
+    # Combine: tags (most signal) + channel names (broad reach) + serendipity + categories
+    queries = list(dict.fromkeys(top_tags + sampled_names + serendipity + top_cats))[:30]
     if not queries:
         return
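
The query-building step in the hunk above can be exercised in isolation. This is a standalone sketch: build_queries is a hypothetical wrapper, and the sample inputs are made up; only the expression inside it mirrors the new lines of the diff.

```python
def build_queries(top_tags, sampled_names, top_cats, limit=30):
    """Combine query sources, drop duplicates in first-seen order, cap at limit."""
    # "best [category] channels" for the first three categories, as in the diff.
    serendipity = [f"best {cat} channels" for cat in top_cats[:3]]
    # dict.fromkeys keeps the first occurrence of each string and preserves order.
    combined = top_tags + sampled_names + serendipity + top_cats
    return list(dict.fromkeys(combined))[:limit]


if __name__ == "__main__":
    print(build_queries(["jazz", "piano"], ["piano", "Some Channel"], ["music", "jazz"]))
```

Because the cap is applied after concatenation, the serendipity and category queries are the first to be cut when tags and channel names alone reach the limit.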