4 Commits (main)
 

Author SHA1 Message Date
nick2day 44641f589d Append-only picks store with user-initiated dismiss
Pipeline:
- write_picks() no longer truncates; deduplicates by artist+album
  key on append so the same pick is never added twice
- Picks accumulate indefinitely; only the user can remove them
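
The append-with-dedup behaviour described above could look roughly like the sketch below. It is not the committed code: the JSON list layout, the `pick_key()` helper, and the trim-and-casefold normalisation are assumptions made for illustration.

```python
import json
from pathlib import Path


def pick_key(pick: dict) -> tuple[str, str]:
    # Hypothetical helper. Assumed normalisation: trim and case-fold both fields.
    return (pick["artist"].strip().casefold(), pick["album"].strip().casefold())


def write_picks(new_picks: list[dict], path: Path) -> int:
    """Append picks that are not already stored; return how many were added."""
    # Load the existing store instead of truncating it.
    existing = json.loads(path.read_text()) if path.exists() else []
    seen = {pick_key(p) for p in existing}
    added = 0
    for pick in new_picks:
        key = pick_key(pick)
        if key in seen:
            continue  # same artist+album already stored, skip it
        existing.append(pick)
        seen.add(key)
        added += 1
    path.write_text(json.dumps(existing, indent=2))
    return added
```

Because the key set is updated inside the loop, duplicates within a single batch are skipped as well as duplicates of already stored picks.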

New endpoint POST /api/fgs/picks/remove:
- Accepts {artist, album}, removes matching pick from store
- Also writes the removed pick to the dedup DB so it won't resurface
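
A minimal sketch of the removal logic behind such an endpoint, with the HTTP wiring left out. The `remove_pick()` name, the `seen` table, and its two-column schema are hypothetical; the real dedup DB layout is not shown in this commit.

```python
import sqlite3


def remove_pick(picks: list[dict], artist: str, album: str,
                db: sqlite3.Connection) -> list[dict]:
    """Return picks without the matching entry and record it as seen."""
    # Assumed normalisation: trim and case-fold, as for the append-side key.
    target = (artist.strip().casefold(), album.strip().casefold())
    kept = [
        p for p in picks
        if (p["artist"].strip().casefold(), p["album"].strip().casefold()) != target
    ]
    if len(kept) != len(picks):
        # Hypothetical schema: one row per artist+album that must not resurface.
        db.execute(
            "CREATE TABLE IF NOT EXISTS seen "
            "(artist TEXT, album TEXT, PRIMARY KEY (artist, album))"
        )
        db.execute(
            "INSERT OR IGNORE INTO seen (artist, album) VALUES (?, ?)", target
        )
        db.commit()
    return kept
```

Writing to the dedup DB only when something was actually removed keeps a mistyped request from blocking a pick that was never in the store.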

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
nick2day f6b84ee22f Fix pick quality: Bandcamp title parsing, sanitise pass, dashboard esc
Parsing:
- Handle "Album | Artist — Label - Bandcamp" title format (common
  Bandcamp search result pattern): the artist capture group now stops
  at the em-dash so the label name doesn't bleed into the artist field
- clean_name() strips label suffixes from parsed tokens
- artist_from_url() now title-cases Bandcamp slug
- looks_like_bad_pick() now also checks the album field for pipes and
  uses a broader regex that matches 'records'/'bandcamp' without
  requiring word boundaries
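
The title and slug handling above might be sketched as follows. The regex, the `parse_bandcamp_title()` name, and the assumption that the artist slug is the `*.bandcamp.com` subdomain are illustrative guesses, not the committed implementation.

```python
import re
from urllib.parse import urlparse

# Album, a pipe, then the artist. The artist group excludes the em-dash,
# so anything after it (the label) is left out of the capture.
TITLE_RE = re.compile(
    r"^(?P<album>[^|]+?)\s*\|\s*(?P<artist>[^—|]+?)\s*(?:—.*)?$"
)


def parse_bandcamp_title(title: str):
    """Return (artist, album), or None if the title does not fit the pattern."""
    # Drop the trailing " - Bandcamp" site suffix before matching.
    title = re.sub(r"\s*-\s*Bandcamp\s*$", "", title, flags=re.IGNORECASE)
    m = TITLE_RE.match(title)
    if m is None:
        return None
    return m.group("artist"), m.group("album")


def artist_from_url(url: str):
    """Title-case the subdomain slug of a Bandcamp URL (assumed slug location)."""
    host = urlparse(url).hostname or ""
    suffix = ".bandcamp.com"
    if not host.endswith(suffix):
        return None
    return host[: -len(suffix)].replace("-", " ").title()
```

With placeholder names, `parse_bandcamp_title("Some Album | Some Artist — Some Label - Bandcamp")` yields `("Some Artist", "Some Album")`, and a title without a pipe yields `None`.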

Sanitise pass (post-curator):
- Normalise obscurity to high/medium/low (dashboard badge values)
- Drop picks where artist field contains 'bandcamp'/'records'/pipe
- Detect when a review blog domain name was extracted as the artist;
  attempt recovery from original search result or drop the pick
- Review domain blocklist: metalinjection, cvltnation, angrymetalguy,
  nocleansinging, meatmeadmetal, decibelmag, and others
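
One way the sanitise pass could be structured is sketched below. The matching rules, the `"medium"` fallback, and the function names are assumptions, and the recovery-from-search-result step is omitted, so a blog-named artist is simply dropped here.

```python
import re

# Domains named in the commit message; the real list has more entries.
REVIEW_DOMAINS = {
    "metalinjection", "cvltnation", "angrymetalguy",
    "nocleansinging", "meatmeadmetal", "decibelmag",
}
BAD_ARTIST_RE = re.compile(r"bandcamp|records|\|", re.IGNORECASE)
OBSCURITY_LEVELS = ("high", "medium", "low")


def normalise_obscurity(value) -> str:
    """Map free-form model output onto the three dashboard badge values."""
    text = str(value or "").strip().lower()
    for level in OBSCURITY_LEVELS:
        if level in text:
            return level
    return "medium"  # assumed fallback when nothing recognisable is present


def sanitise(picks: list[dict]) -> list[dict]:
    kept = []
    for pick in picks:
        artist = pick.get("artist", "")
        if BAD_ARTIST_RE.search(artist):
            continue  # label or site residue in the artist field
        # Compare the artist, stripped to letters and digits, with the blocklist.
        squashed = re.sub(r"[^a-z0-9]", "", artist.lower())
        if squashed in REVIEW_DOMAINS:
            continue  # a review blog name was extracted as the artist
        kept.append({**pick, "obscurity": normalise_obscurity(pick.get("obscurity"))})
    return kept
```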

Dashboard fix:
- esc() now escapes single quotes (&#39;) to prevent broken onclick
  attributes when album/why fields contain apostrophes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
nick2day 39d6051a1f Fix pipeline yield: dedup, query expansion, parallel prefilter
- Dedup: mark only accepted picks as seen (not all prefiltered
  candidates); unselected items stay eligible for re-evaluation,
  which prevents pool exhaustion across runs
- Queries: expanded from 29 to 37+ with rotating 30-subgenre list,
  25 label targets, 14 review sites; Bandcamp/MA queries skip
  time_range for broader results; review sites use time_range:year
- Results per query: 15 → 25
- Prefilter: parallel batches of 35 (up to 3 concurrent), processes
  all fresh candidates instead of just top 80; be-inclusive prompt
- Curator: cap 20 → 30, score floor 60 → 50, URL prefix matching
  in provenance check instead of exact match
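
The parallel prefilter (batches of 35, up to 3 in flight) could be sketched with asyncio as below. `score_batch` stands in for the real model call and is a hypothetical async callable; whether the service uses asyncio or another concurrency mechanism is not stated in the commit.

```python
import asyncio

BATCH_SIZE = 35       # candidates per prefilter call
MAX_CONCURRENT = 3    # batches allowed in flight at once


def chunk(items: list, size: int = BATCH_SIZE) -> list[list]:
    """Split items into consecutive batches of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def prefilter_all(candidates: list, score_batch) -> list:
    """Run score_batch over every batch, at most MAX_CONCURRENT at a time."""
    sem = asyncio.Semaphore(MAX_CONCURRENT)

    async def run(batch):
        async with sem:  # wait here while three batches are already running
            return await score_batch(batch)

    # gather() returns results in batch order, so candidate order is kept.
    results = await asyncio.gather(*(run(b) for b in chunk(candidates)))
    return [item for batch in results for item in batch]
```

Every fresh candidate is processed, not a fixed top slice, while the semaphore bounds the load on the model backend.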

Result: 405 candidates/run vs 146 before; 88 passing prefilter vs 10;
pool stays at ~400 fresh on consecutive runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago
nick2day 05bb4193ad Initial commit: FGS metal discovery standalone agent
FastAPI service replacing the 77-node n8n pipeline. Implements full
discovery pipeline: 29 rotating SearXNG queries, nomic-embed-text
scoring against Last.fm taste centroid, Mistral-nemo prefilter and
curator with provenance validation, SQLite dedup, writes to
metal-picks.json for the existing FGS dashboard.

Runs as systemd service on port 8766 (fgs-agent.home via Caddy).
n8n reduced to a 2-node schedule trigger → HTTP POST.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 months ago