Previously the lock was released before _run(), so multiple threads could launch yt-dlp processes simultaneously, which defeated the rate limiter. The lock is now held through the subprocess call and released in a finally block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>