- Backend: POSTGRES (live DB)
- Total listings: 11.589 (across all sources)
- Active sources: 5 / 5 (see table)
- Schema · dict version: v1 · v1 (bump migrations on changes)
01 Source health overview
counts per source · last posted_at · budget disclosure
| Source | Listings | DE | EN | w/ Budget | Last posted | Archive | Status |
|---|---|---|---|---|---|---|---|
| GULP.de | 4.436 | 4.251 | 185 | 3.759 (85%) | 2026-05-04 | 18 mo | ✓ active |
| Freelance.de | 3.091 | 2.973 | 118 | 2.417 (78%) | 2026-05-04 | 6 mo | ✓ active |
| Twago.de | 1.046 | 1.004 | 42 | 765 (73%) | 2026-05-04 | 12 mo | ✓ active |
| Junico.de | 1.793 | 1.711 | 82 | 1.687 (94%) | 2026-05-04 | 6 mo | ✓ active |
| eVergabe / DTAD | 1.223 | 1.164 | 59 | 1.126 (92%) | 2026-05-04 | 60 mo | ✓ active |
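The per-source figures can be cross-checked against the headline total with a few lines of arithmetic. The following is a minimal sketch in Python, with the counts copied from the table above (thousands separators removed). The name `SOURCES`, the tuple layout, and the helper `budget_share` are illustrative only and are not part of the actual schema.

```python
# Counts copied from the source health table: (listings, DE, EN, with_budget).
# The dict layout is an assumption made for this sketch.
SOURCES = {
    "GULP.de": (4436, 4251, 185, 3759),
    "Freelance.de": (3091, 2973, 118, 2417),
    "Twago.de": (1046, 1004, 42, 765),
    "Junico.de": (1793, 1711, 82, 1687),
    "eVergabe / DTAD": (1223, 1164, 59, 1126),
}


def budget_share(listings: int, with_budget: int) -> int:
    """Share of listings that disclose a budget, as a rounded percentage."""
    return round(100 * with_budget / listings)


total = sum(listings for listings, _, _, _ in SOURCES.values())

for name, (listings, de, en, with_budget) in SOURCES.items():
    # In the table, the DE and EN columns add up to the listing count.
    assert de + en == listings, name
    print(f"{name}: {budget_share(listings, with_budget)}% with budget")

print(f"total listings: {total}")
```

Run against the table as shown, the language columns add up for every source and the sum matches the "Total listings" card.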
02 Operational notes
what runs autonomously
- Cron: worker container runs Sun 03:00 ingest, Sun 04:30 stats refresh.
- Idempotent: re-running ingest is safe — content_hash dedups raw_blob, listing.id dedups listings.
- Per-source isolation: one source failing does not block others; the failure is recorded in `source_health`.
- Watermarks: resume from last `posted_at` on restart.
- Quarantine: parse errors land in `quarantine`, retryable from this admin section in Phase 6.
- Backups: weekly `pg_dump` to mounted volume, with an optional offsite copy to S3-compatible storage.
- Health endpoint: worker `/health` on `:3001`; web `/health` on `:3000`; Coolify probes both.
- Source link preserved: every listing keeps its `source_url` for click-to-original drillthrough.
- raw_blob retention: indefinite, which enables dictionary-version replay over full history.
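The ingest guarantees listed above (idempotent re-runs, per-source isolation, watermarks, quarantine) can be illustrated with a small in-memory sketch. It is not the worker's actual code: the `Store` class stands in for the Postgres tables, SHA-256 is an assumed choice for `content_hash`, and `parse`, `ingest_source`, `run_ingest`, and the two fetchers are hypothetical names invented for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for the tables named in the notes (assumption)."""
    raw_blob: dict = field(default_factory=dict)       # content_hash -> raw payload
    listings: dict = field(default_factory=dict)       # listing id -> parsed listing
    quarantine: list = field(default_factory=list)     # (source, content_hash, error)
    source_health: dict = field(default_factory=dict)  # source -> status text
    watermark: dict = field(default_factory=dict)      # source -> last posted_at


def parse(raw: str) -> dict:
    """Hypothetical parser; expects 'id|posted_at|title', else ValueError."""
    listing_id, posted_at, title = raw.split("|")
    return {"id": listing_id, "posted_at": posted_at, "title": title}


def ingest_source(store: Store, source: str, fetch) -> None:
    since = store.watermark.get(source, "")  # resume from last posted_at
    for raw in fetch(since):
        # SHA-256 is an assumption; the notes only name a content_hash.
        content_hash = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        if content_hash in store.raw_blob:
            continue  # already ingested: re-runs are no-ops
        store.raw_blob[content_hash] = raw  # keep the raw payload for replay
        try:
            listing = parse(raw)
        except ValueError as exc:
            store.quarantine.append((source, content_hash, str(exc)))
            continue
        store.listings[listing["id"]] = listing  # keyed by id, so no duplicates
        # ISO dates compare correctly as strings.
        store.watermark[source] = max(
            store.watermark.get(source, ""), listing["posted_at"]
        )


def run_ingest(store: Store, fetchers: dict) -> None:
    for source, fetch in fetchers.items():
        try:
            ingest_source(store, source, fetch)
            store.source_health[source] = "ok"
        except Exception as exc:  # one failing source must not block the others
            store.source_health[source] = f"failed: {exc}"


def gulp(since: str) -> list:
    rows = [
        ("2026-05-03", "g1|2026-05-03|Data engineer"),
        ("2026-05-04", "g2|2026-05-04|Java developer"),
        ("2026-05-04", "row without separators"),  # will be quarantined
    ]
    return [raw for posted_at, raw in rows if posted_at >= since]


def twago(since: str) -> list:
    raise ConnectionError("timeout")


store = Store()
fetchers = {"GULP.de": gulp, "Twago.de": twago}
run_ingest(store, fetchers)
run_ingest(store, fetchers)  # second run leaves every table unchanged
```

In this sketch the raw payload is stored before parsing, so a quarantined row remains available for a later retry, and the failing `twago` fetcher only affects its own `source_health` entry.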
03 Phase 4: admin tools (in roadmap)
quarantine, replay, dictionary editor, audit · scaffolded next
- Quarantine browser: failed parses with raw_blob preview + retry button.
- Replay: pick raw_blob → re-run current parser → diff vs stored.
- Dictionary editor: add/remove terms; each change triggers a `term_hits` re-derivation.
- Term audit: TP/FP labelling, per-term precision metric, terms below 0.7 flagged.
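Two of the roadmap items above reduce to small computations, sketched here in Python under stated assumptions. The term audit is assumed to use the standard definition of precision, TP / (TP + FP), with the 0.7 threshold taken from the list above. The replay diff is assumed to compare parsed fields one by one. All function names, terms, and field names are hypothetical.

```python
from collections import Counter

PRECISION_THRESHOLD = 0.7  # from the roadmap item: terms below 0.7 are flagged


def term_precision(labels) -> dict:
    """Per-term precision from (term, verdict) pairs, verdict 'TP' or 'FP'."""
    tp, fp = Counter(), Counter()
    for term, verdict in labels:
        if verdict == "TP":
            tp[term] += 1
        elif verdict == "FP":
            fp[term] += 1
        else:
            raise ValueError(f"unknown verdict: {verdict!r}")
    # Every term seen has at least one label, so the denominator is never zero.
    return {t: tp[t] / (tp[t] + fp[t]) for t in tp.keys() | fp.keys()}


def flagged(precisions: dict, threshold: float = PRECISION_THRESHOLD) -> list:
    """Terms whose precision falls below the threshold, sorted by name."""
    return sorted(t for t, p in precisions.items() if p < threshold)


def replay_diff(stored: dict, reparsed: dict) -> dict:
    """Fields that differ between the stored listing and a fresh parse."""
    keys = stored.keys() | reparsed.keys()
    return {
        k: (stored.get(k), reparsed.get(k))
        for k in keys
        if stored.get(k) != reparsed.get(k)
    }


# Hypothetical audit labels: 'python' is right 3 times out of 4, 'java' 1 of 2.
labels = [
    ("python", "TP"), ("python", "TP"), ("python", "TP"), ("python", "FP"),
    ("java", "TP"), ("java", "FP"),
]
precisions = term_precision(labels)
low_precision_terms = flagged(precisions)

# Hypothetical replay: the current parser now extracts a budget field.
changes = replay_diff(
    {"id": "g1", "title": "Data engineer"},
    {"id": "g1", "title": "Data engineer", "budget": "90/h"},
)
```

With these labels, `python` has a precision of 0.75 and passes, while `java` has 0.5 and is flagged; the replay diff reports only the `budget` field, which is absent from the stored listing.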