youclonedl

Author	SHA1	Message	Date
Mattias Thall	c3290d33a7	Reduce parallel YouTube request workers to avoid cookie invalidation. 8 simultaneous yt-dlp processes hitting video pages look like a bot attack and cause YouTube to nuke the session cookies. Drop to: Popular fetch view_count enrichment 8→3 workers; Discovery search 8→4 workers; Graph signal (featured channels) 8→3 workers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 23:11:07 +02:00
Mattias Thall	be7319e96c	Sample videos randomly for view_count enrichment, not newest-first. Previously ORDER BY published_at DESC meant only the newest 200 videos ever got view counts. Now ORDER BY RANDOM() spreads the 200 slots across the full channel history; videos without a count are still prioritised, but among those they're drawn randomly. Each run of Fetch Popular covers a different slice, converging toward full coverage over time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 23:06:32 +02:00
Mattias Thall	6e455ed8ce	Fetch popular: flat-playlist crawl then parallel view_count enrichment. Phase 1: crawl the full channel with flat-playlist to store any videos not yet in DB (fast, no individual requests). Phase 2: fetch real view_count for up to 200 channel videos in parallel (8 workers), prioritising those missing a count. Popular tab sorts all channel videos by view_count DESC NULLS LAST. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 23:05:21 +02:00
Mattias Thall	ff4d8e4ab4	Popular tab: rank by real view_count, drop broken ?sort=p URL. yt-dlp's own test suite marks channel sort as 'Query for sorting no longer works'; YouTube blocked it. New approach: fetch view_count for up to 200 indexed videos in parallel (8 workers, prioritising those missing counts), then Popular tab sorts by view_count DESC WHERE view_count IS NOT NULL. Accurate for any channel once enrichment runs. Frontend refetch wait raised to 60s to cover ~200 parallel fetches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 23:02:03 +02:00
Mattias Thall	3e699d61b6	Fix popular task failing silently when table doesn't exist. The outer try had no except, so any exception (e.g. table missing) killed the whole background task with no error visible to the user. Now: CREATE TABLE IF NOT EXISTS inline so the task works even if the startup migration hasn't run (no server restart required); wrap DELETE in its own try/except; catch and print outer exceptions so failures appear in server logs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:52:30 +02:00
Mattias Thall	77cba81ef4	Popular: write Phase 1 immediately, enrich view_count in background. Previously the task waited for all 30 parallel metadata fetches before writing anything to the DB (~30s). Now Phase 1 (flat-playlist IDs + basic info) commits to channel_popular_videos immediately (~5s), so the tab populates fast. Phase 2 (view_count + dates) runs in a daemon thread while the user is already browsing. Also: catch table-not-found errors in the sort=popular query so a cold server returns [] instead of 500. Frontend refetch wait 35s→8s to match the faster Phase 1 commit time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:47:42 +02:00
Mattias Thall	112f87e764	Popular tab now shows only flagged popular videos in rank order. Add channel_popular_videos table (channel_id, video_id, rank). _fetch_popular_task clears and rewrites this table after each fetch. GET /channels/{id}/videos?sort=popular now JOINs this table and orders by rank instead of view_count, so the tab shows exactly the videos YouTube returned in popularity order, nothing more. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:38:53 +02:00
Mattias Thall	2f37072187	Fix popular fetch and improve date/view_count coverage. Popular fetch now does a two-phase approach: fast flat-playlist to get IDs in popularity order, then parallel full metadata fetch (8 workers) to get real view_count and published_at for each video. Previously flat-playlist mode returned timestamp/view_count as null. Enrich task now also backfills published_at and view_count (not just description). Startup limit 3→50, enrichment sleep 2s→0.5s. Raise all thread pool sizes to match 8-core machine: Discovery search 5→8 workers; Graph signal 4→8 workers; Popular fetch 5→8 workers; Download semaphore default 3→6, cap 10→16. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:36:18 +02:00
Mattias Thall	5b0cf27f07	Add playlists support and fix explore older videos. New playlists router: fetch channel playlists from YouTube, index playlist videos, browse by playlist with pagination; Playlist model gets video_ids column to store ordered video list; register playlists router in main.py with DB migration; add Playlists tab to Channel page: grid of playlist cards, click to browse videos, index/re-index per playlist; fix explore older videos skipping all entries without published_at (flat-playlist entries for older videos rarely include timestamp data). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:28:35 +02:00
Mattias Thall	d31fc1ef7f	Add Popular tab to channel page. YouTube sort=p fetch: indexes top 100 most-viewed videos from a channel, storing view_count in the DB; Popular tab on channel page shows videos sorted by view_count DESC; Videos/Popular tab switcher with context-appropriate fetch buttons; expose view_count in VideoOut and add 'popular' sort to channel videos endpoint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:22:10 +02:00
Mattias Thall	aa91156bbc	Add older content exploration: channel page + home feed Rediscover mode. Channel page: "Explore older videos" button fetches 100 videos at a time further back in the channel history using yt-dlp --playlist-start/--playlist-end; "Fetch entire history" still available for full crawl; backend: /channels/{id}/explore?page=N endpoint + playlist offset support in fetch_channel_metadata(start_video=N). Home feed: new "Rediscover" mode: older unwatched videos (90+ days old) from followed channels, randomly sampled then re-ranked by tag affinity. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:17:20 +02:00
Mattias Thall	0b482b5d49	Overhaul channel page: search, pagination, fetch all history. Search bar filters indexed videos server-side, and the "Search YouTube" button triggers a deep channel search and indexes matching results; server-side sort (newest/oldest/A-Z/unwatched) + infinite scroll (60/page); "Fetch recent" indexes last 30, "Fetch all" indexes full history; auto-reindex on page visit if stale (>1h), refetches at 8s; add /channels/{id}/index-full endpoint (max_videos=0). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:15:09 +02:00
Mattias Thall	50d61b5774	Fix crawled_at type error in get_channel. SQLite returns datetime columns as strings via raw text() queries. Parse crawled_at safely before comparing against utcnow(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:04:35 +02:00
Mattias Thall	d740fd5224	Auto-reindex channel on page visit if stale. GET /channels/{id} now fires a background _index_channel_task if the channel hasn't been crawled in the last hour. The frontend refetches channel + videos 8s after page load to pick up the updated data. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 22:02:59 +02:00
Mattias Thall	ea99b74ba8	Add scheduled sync, disk space awareness, and subtitle downloads. Auto-sync daemon: background thread checks every hour and syncs followed channels for users with sync_interval_hours set (6/12/24h options); disk stats: /api/stats now returns total/used/free/download bytes, and the Stats page shows a disk usage bar; subtitles: subtitle_langs setting (e.g. "en,sv") passed through all download paths, and yt-dlp writes .srt files alongside the video; Settings page: sync interval dropdown + subtitle languages input. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 20:36:50 +02:00
Mattias Tall	c00d5c7595	Optimise Following page: 4 aggregated queries, no correlated subqueries. Rewrite list_channels to run exactly 4 SQL queries regardless of channel count: channel rows, aggregated video stats (GROUP BY), new-video counts, and latest video (derived-table JOIN replaces per-row correlated subquery); remove dead _CHANNEL_STATS_SELECT (orphaned after the rewrite); fix upload_frequency_days: use pre-computed date_span_days from vstats instead of a broken per-channel db.execute() call; restrict new_counts query to id_csv so it uses idx_videos_channel_indexed; markChannelsSeen: optimistic setQueryData instead of invalidateQueries, eliminating a full channel-list re-fetch on every Following page visit; DownloadIndicator idle poll: 10s → 30s (no need to hit DB when idle). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 16:18:33 +02:00
Mattias Tall	1405acfaed	Revert channel stats to correlated subqueries (CTE had a param binding bug). The CTE approach returned 0 rows, likely a SQLite/SQLAlchemy interaction with :user_id appearing in multiple CTEs. Reverted to the original correlated-subquery form which is proven correct. The 4 indexes added in the previous commit still apply and will make the per-channel subqueries faster once the DB is indexed on startup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 16:10:24 +02:00
Mattias Tall	74e9a52096	Fix Following page: replace 9-subquery-per-channel stats with 2 CTEs + indexes. The old _CHANNEL_STATS_SELECT ran 9 correlated subqueries for each channel row. With 1266 channels that was ~11000 sub-executions per GET /channels request, causing multi-second (or timeout) delays. New approach: 2 CTEs (vinfo for counts/sums, nc for new_count) each do a single aggregated pass over all followed-channel videos, joined back to channels. Only 2 correlated LIMIT-1 subqueries remain for latest_video_id/title (fast with the new index). Also adds 4 indexes on startup (IF NOT EXISTS, safe to deploy): videos(channel_id, published_at DESC) for latest video lookups; videos(channel_id, indexed_at) for the new_count filter; user_videos(video_id, user_id) for watch/download aggregation; user_channels(user_id, status) for the followed channel filter. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 16:04:41 +02:00
Mattias Tall	1cd8645957	Fix YouTube hammering, sync rate limiting, and Following load time. Sync throttling: sync-all now skips channels crawled within the last 6 hours (prevents re-scraping 1266 channels on every button press); channels are queued into a single _index_channels_batch task that runs with 1.5s delay between each yt-dlp call instead of firing 1266 background tasks simultaneously; startup enrich task reduced from 10 to 3 videos (3 yt-dlp calls on each container restart); enrich task adds 2s sleep between metadata fetches. SQLite stability: busy_timeout=5000 prevents SQLITE_BUSY errors under concurrent load; synchronous=NORMAL speeds up writes without data loss risk (safe with WAL). Following page: staleTime: 60s on channels query so cached data is reused immediately on revisit; gcTime keeps it in memory for 5 min. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 16:00:37 +02:00
inputnoise	1827dd6c4e	Initial commit: YT Hub. Self-hosted personal YouTube management app. FastAPI + SQLite backend, React + Vite + Tailwind frontend. Dockerfiles and compose included for Portainer deployment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 20:09:04 +02:00
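
The sampling order described in commit be7319e96c (videos without a view count first, random order within each group, capped at 200) can be sketched with Python's built-in sqlite3 module. This is an illustration only: the `videos` table layout, the `pick_enrichment_batch` helper, and the exact ORDER BY expression are assumptions and are not taken from the repository.

```python
import sqlite3

def pick_enrichment_batch(conn, channel_id, limit=200):
    """Hypothetical helper: choose which videos get a view_count fetch.

    (view_count IS NOT NULL) is 0 for rows missing a count and 1 otherwise,
    so rows missing a count sort first; RANDOM() shuffles within each group.
    """
    rows = conn.execute(
        """
        SELECT id FROM videos
        WHERE channel_id = ?
        ORDER BY (view_count IS NOT NULL), RANDOM()
        LIMIT ?
        """,
        (channel_id, limit),
    ).fetchall()
    return [row[0] for row in rows]

# Demo data (assumed schema): 30 videos, every third one has no view count yet.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE videos (id TEXT PRIMARY KEY, channel_id TEXT, view_count INTEGER)"
)
videos = [(f"v{i}", "chan", None if i % 3 == 0 else i * 10) for i in range(30)]
conn.executemany("INSERT INTO videos VALUES (?, ?, ?)", videos)

batch = pick_enrichment_batch(conn, "chan", limit=15)
```

With a limit of 15 and 10 uncounted videos, the batch always starts with all 10 uncounted videos, followed by 5 randomly chosen videos that already have a count. Repeated runs pick a different random tail each time.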

20 Commits
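
The worker reduction in commit c3290d33a7 amounts to capping how many requests are in flight at once. A minimal sketch using Python's standard `concurrent.futures` is shown here. Only the limit of 3 comes from the commit message; `fake_fetch_view_count`, `enrich`, and the concurrency counter are hypothetical stand-ins, and no real yt-dlp call is made.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 3  # limit named in the commit message; the rest is illustrative

_active = 0       # requests currently in flight
peak_active = 0   # highest number of simultaneous requests observed
_lock = threading.Lock()

def fake_fetch_view_count(video_id):
    """Hypothetical stand-in for a single per-video metadata request."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate network latency
    with _lock:
        _active -= 1
    return video_id, len(video_id) * 1000  # made-up view count

def enrich(video_ids, max_workers=MAX_WORKERS):
    """Fetch view counts with at most `max_workers` requests in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fake_fetch_view_count, video_ids))

counts = enrich([f"vid{i}" for i in range(20)])
```

The pool size is the only knob: lowering `max_workers` trades total enrichment time for fewer simultaneous requests, which is the trade-off the commit message describes.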