Fix discovery to actually use negative affinity signals

Previously the engine was blind to dislikes/dismissals:
- _build_user_tag_profile only used liked/watched (positive only)
- dismiss_penalty was capped at 80% so hated content still surfaced
- _search_and_store had zero affinity filtering, any YouTube result entered the queue
- user_tag_affinity negative scores (written by dismiss/dislike) were never read

Now:
- _build_user_tag_profile reads directly from user_tag_affinity (positive + negative)
- _tag_relevance_score returns negative values, so disliked-tag channels score below zero and get dropped
- _search_and_store skips channels whose indexed videos match 3+ negatively-rated tags
- list_discovery post-filters channels already in the queue using the same neg-affinity check
- Removed the old _dismissed_channel_tags + dismiss_penalty (superseded)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
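
The body of _tag_relevance_score is not part of the hunk shown on this page, so the following is only an illustrative sketch of what "returns negative values" could look like. It assumes the tag profile is a plain dict mapping a lowercased tag to a signed affinity score, and that the score is the mean over a channel's tags; the function name, signature, and averaging rule are all hypothetical.

```python
from typing import Iterable, Mapping


def tag_relevance_score(channel_tags: Iterable[str],
                        tag_affinity: Mapping[str, float]) -> float:
    """Hypothetical signed relevance: mean affinity over a channel's tags.

    Tags missing from the profile count as 0.0, so a channel carrying mostly
    disliked tags ends up below zero and can be dropped by the caller.
    """
    # Normalise the same way the diff does: lowercase and strip whitespace.
    tags = {t.lower().strip() for t in channel_tags if isinstance(t, str)}
    if not tags:
        return 0.0
    return sum(tag_affinity.get(t, 0.0) for t in tags) / len(tags)


# Example profile (made-up values): one liked tag, one disliked tag.
affinity = {"cooking": 4.0, "asmr": -5.0}
mixed = tag_relevance_score(["ASMR", "cooking"], affinity)
liked = tag_relevance_score(["cooking"], affinity)
```

With a signed score like this, a caller can drop any channel whose score is below zero instead of applying a capped penalty.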
@@ -56,9 +56,32 @@ def list_discovery(
             ORDER BY dq.score DESC
             LIMIT :limit OFFSET :offset
         """),
-        {"user_id": current_user.id, "limit": limit, "offset": offset},
+        {"user_id": current_user.id, "limit": limit * 3, "offset": offset},
     ).mappings().all()
 
+    # Load negative affinity tags and use them to filter channels already in the queue
+    neg_affinity = {
+        r["tag"] for r in db.execute(
+            text("SELECT tag FROM user_tag_affinity WHERE user_id = :user_id AND score < -2"),
+            {"user_id": current_user.id},
+        ).mappings().all()
+    }
+    if neg_affinity and rows:
+        channel_ids_csv = ",".join(str(r["channel_id"]) for r in rows)
+        vtag_rows = db.execute(
+            text(f"SELECT channel_id, tags FROM videos WHERE channel_id IN ({channel_ids_csv}) AND tags IS NOT NULL LIMIT 1000")
+        ).mappings().all()
+        neg_hit: dict[int, int] = {}
+        for vr in vtag_rows:
+            try:
+                for tag in json.loads(vr["tags"] or "[]"):
+                    if isinstance(tag, str) and tag.lower().strip() in neg_affinity:
+                        neg_hit[vr["channel_id"]] = neg_hit.get(vr["channel_id"], 0) + 1
+            except (json.JSONDecodeError, TypeError):
+                pass
+        rows = [r for r in rows if neg_hit.get(r["channel_id"], 0) < 3]
+
+    rows = rows[:limit]
     items = []
     for row in rows:
         row = dict(row)
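
The post-filter added in this hunk can be read as a pure function, which makes it easy to check without a database. The sketch below is a minimal restatement under assumptions: rows and video rows are plain dicts with the same keys the hunk uses (channel_id, tags), and the helper names and the NEG_TAG_THRESHOLD constant are hypothetical. Like the hunk, it counts tag occurrences across a channel's videos rather than distinct tags.

```python
import json
from collections import Counter

# Hypothetical name for the literal 3 used in the hunk.
NEG_TAG_THRESHOLD = 3


def count_negative_hits(video_rows, neg_affinity):
    """Count, per channel, how many video tags fall in the negative-affinity set."""
    hits = Counter()
    for row in video_rows:
        try:
            tags = json.loads(row["tags"] or "[]")
        except (json.JSONDecodeError, TypeError):
            continue  # malformed tag payloads are ignored, as in the hunk
        if not isinstance(tags, list):
            continue
        for tag in tags:
            if isinstance(tag, str) and tag.lower().strip() in neg_affinity:
                hits[row["channel_id"]] += 1
    return hits


def filter_channels(rows, video_rows, neg_affinity, limit):
    """Drop channels at or above the threshold, then trim the over-fetched page."""
    hits = count_negative_hits(video_rows, neg_affinity)
    kept = [r for r in rows if hits[r["channel_id"]] < NEG_TAG_THRESHOLD]
    return kept[:limit]


# Example data (made up): channel 1 accumulates three negative hits.
rows = [{"channel_id": 1}, {"channel_id": 2}, {"channel_id": 3}]
video_rows = [
    {"channel_id": 1, "tags": '["ASMR", "prank"]'},
    {"channel_id": 1, "tags": '["asmr "]'},
    {"channel_id": 2, "tags": '["cooking", "asmr"]'},
    {"channel_id": 3, "tags": "not json"},
]
kept_ids = [r["channel_id"] for r in
            filter_channels(rows, video_rows, {"asmr", "prank"}, limit=2)]
```

Fetching limit * 3 rows before filtering, as the hunk does, leaves headroom so that a page can still fill up after some channels are dropped.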