Case Study
Dredge: A Multi-Source Ops-Awareness Board
Scale
- ~7,300 lines of Elixir (lib)
- 64 modules
- 4 source adapters
- 14 Ecto migrations
- 3 Anthropic client modes
- ~1,180 lines of tests
The Problem
Small dev teams watch their world through six tabs at once: GitHub notifications, Dependabot alerts, a database they need to keep an eye on, RSS security advisories, and a wall of webhook noise from Sentry, Datadog, and PagerDuty. Nothing is connected, nothing is prioritized, and triage happens by context-switching until something slips through.
Dredge collapses all of it onto one real-time kanban board. Items flow in from any connected source, get routed into columns by rule, and surface in priority order - so a team triages in one place instead of across six tabs. It was also, openly, a vehicle to learn Elixir, OTP, and Phoenix LiveView properly rather than from a tutorial.
What Was Built
Sources (the adapter layer)
- RSS / Atom - poll-based, with feed-URL auto-discovery; tuned for CVE and vendor security feeds
- GitHub - OAuth; notifications, assigned issues, commits, and Dependabot alerts with CVSS mapped onto card priority
- Database - a generic, SELECT-only poller for Postgres and MySQL that flags suspicious rows and auto-detects title/severity columns
- Webhook - push-based, HMAC-verified POST /webhooks/:id; understands GitHub, Sentry, Datadog, and PagerDuty payloads
- Every source implements one behaviour (fetch/3 + normalize/1), so the pipeline never needs to know what kind of source it is talking to
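The HMAC check on the webhook endpoint can be sketched roughly as follows. This is an illustration under stated assumptions, not Dredge's actual code: the module name WebhookSignature, the hex-encoded HMAC-SHA256 signature format, and the idea of one shared secret per source are all hypothetical.

```elixir
defmodule WebhookSignature do
  @moduledoc "Hypothetical helper: verifies an HMAC-SHA256 signature over a raw request body."
  import Bitwise

  # Compute the lowercase hex HMAC-SHA256 of the raw body using the source's secret.
  def sign(secret, body) when is_binary(secret) and is_binary(body) do
    :crypto.mac(:hmac, :sha256, secret, body) |> Base.encode16(case: :lower)
  end

  # Accept the payload only if the supplied signature matches the one we compute.
  def valid?(secret, body, signature)
      when is_binary(secret) and is_binary(body) and is_binary(signature) do
    secure_compare(sign(secret, body), String.downcase(signature))
  end

  def valid?(_secret, _body, _signature), do: false

  # Constant-time comparison so timing does not reveal how many bytes matched.
  defp secure_compare(a, b) when byte_size(a) == byte_size(b) do
    a
    |> :binary.bin_to_list()
    |> Enum.zip(:binary.bin_to_list(b))
    |> Enum.reduce(0, fn {x, y}, acc -> bor(acc, bxor(x, y)) end)
    |> Kernel.==(0)
  end

  defp secure_compare(_a, _b), do: false
end
```

The important detail is that the signature is computed over the raw request body, before any JSON decoding, and that a missing or malformed signature is rejected rather than raising.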
Board and classification
- Rule-based routing: send items to columns by field / operator / value, with an optional set_priority (CVSS >= 9 forces priority 10)
- First-match-wins classifier with an Inbox fallback; full-board reclassification when rules change
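A first-match-wins classifier with an Inbox fallback might look something like the sketch below. The module name RuleClassifier, the rule map shape, and the operator atoms (:eq, :gte, :contains) are assumptions made for illustration; only the field / operator / value idea, set_priority, and the Inbox fallback come from the description above.

```elixir
defmodule RuleClassifier do
  @moduledoc "Hypothetical first-match-wins router; rule shape and operators are illustrative."

  @inbox "Inbox"

  # Returns {column, priority}. Rules are tried in order and the first match wins.
  def classify(item, rules) do
    case Enum.find(rules, &matches?(&1, item)) do
      nil -> {@inbox, Map.get(item, :priority, 0)}
      rule -> {rule.column, rule[:set_priority] || Map.get(item, :priority, 0)}
    end
  end

  # A rule matches only if the item has the field and the comparison holds.
  defp matches?(%{field: field, op: op, value: value}, item) do
    case Map.fetch(item, field) do
      {:ok, actual} -> compare(op, actual, value)
      :error -> false
    end
  end

  defp compare(:eq, actual, value), do: actual == value
  defp compare(:gte, actual, value) when is_number(actual), do: actual >= value
  defp compare(:contains, actual, value) when is_binary(actual), do: String.contains?(actual, value)
  # Unknown operators or mismatched types never match.
  defp compare(_op, _actual, _value), do: false
end

rules = [
  %{field: :cvss, op: :gte, value: 9, column: "Critical", set_priority: 10},
  %{field: :source, op: :eq, value: "github", column: "GitHub"}
]

# Matches the CVSS rule first, so the GitHub rule is never consulted.
RuleClassifier.classify(%{source: "github", cvss: 9.8, priority: 3}, rules)
# => {"Critical", 10}
```

Because order decides the outcome, full-board reclassification amounts to re-running classify/2 over every item with the new rule list.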
Skiff (the AI layer)
- AI pipeline config - describe what you want to follow in plain language; Skiff returns a validated source + routing-rule config you preview and approve before anything is written
- AI board assistant - an agentic tool-use loop that answers questions over the whole board, can create cards, and can run a read-only SQL query against a connected database on demand
- Streamed column summaries via the Anthropic Messages API over SSE
Collaboration and real-time
- Invite members (viewer / editor), public read-only share links, live presence avatars, and per-card comment threads
- Phoenix PubSub + Presence: the board updates the instant an item arrives, with no polling on the client
Technical Architecture
What I Learned
This was a learning-first build, so these are the things the stack actually taught me - several of them the hard way, in the git history.
1. OTP reframes fault tolerance as a structural decision, not error handling
Coming from request/response backends, my instinct for "what if a feed is down" was try/catch and retry flags. OTP pushed the answer up a level: model each source as its own supervised process. Dredge runs one SourceWorker GenServer per source under a per-user DynamicSupervisor, registered by {user_id, source_id}. A flaky RSS feed can crash and restart on its own without touching a user's GitHub polling, and the worker uses restart: :transient, so a clean stop stays stopped while a crash gets restarted. On boot, a root supervisor reads every active source from the database and starts a worker for each. Failure isolation stopped being something I coded around and became something the shape of the system gives me for free.
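The supervision shape described here can be sketched in a few lines. This is a simplified illustration, not the project's code: it uses a single DynamicSupervisor rather than one per user, the registry name SourceRegistry and the SourceSupervisorDemo module are hypothetical, and the worker does no real polling.

```elixir
defmodule SourceWorker do
  @moduledoc "Hypothetical per-source worker; the real polling logic is omitted."
  # :transient => restarted after a crash, left alone after a normal stop.
  use GenServer, restart: :transient

  def start_link({user_id, source_id} = key) do
    GenServer.start_link(__MODULE__, key, name: via(user_id, source_id))
  end

  # Name each worker by {user_id, source_id} through a Registry (name assumed).
  def via(user_id, source_id), do: {:via, Registry, {SourceRegistry, {user_id, source_id}}}

  # Look up the current pid for a source, or nil if no worker is running.
  def whereis(user_id, source_id) do
    case Registry.lookup(SourceRegistry, {user_id, source_id}) do
      [{pid, _value}] -> pid
      [] -> nil
    end
  end

  @impl true
  def init(key), do: {:ok, %{key: key}}
end

defmodule SourceSupervisorDemo do
  # Starts the registry and a dynamic supervisor, then one worker per source key.
  def start(sources) do
    {:ok, _registry} = Registry.start_link(keys: :unique, name: SourceRegistry)
    {:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

    for key <- sources do
      {:ok, _pid} = DynamicSupervisor.start_child(sup, {SourceWorker, key})
    end

    sup
  end
end
```

Killing one worker with Process.exit(pid, :kill) causes only that worker to be restarted under the same registered name, while GenServer.stop/1 on a worker exits normally and, because of :transient, leaves it stopped.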
2. Behaviours make "add another integration" a closed problem
The first version threaded case source_type branches through the polling code. That gets worse with every source. Replacing it with an adapter behaviour - four callbacks, fetch/3 and normalize/1 being the core - meant the worker talks to an abstract source and never grows a new branch. Adding Reddit or Hacker News later is implementing one module, not editing the pipeline. It is the clearest payoff I have felt from designing to an interface rather than to the cases I happened to have on day one.
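As an illustration of the idea, a behaviour with the two core callbacks could look like the sketch below. The module names SourceAdapter and FakeFeedAdapter, the meaning of the three fetch/3 arguments, and the normalized item shape are all assumptions; the source only states that fetch/3 and normalize/1 exist.

```elixir
defmodule SourceAdapter do
  @moduledoc "Hypothetical adapter contract; only the two core callbacks are shown."

  # Argument meanings are assumed: source config, credentials, and fetch options.
  @callback fetch(config :: map(), credentials :: map(), opts :: map()) ::
              {:ok, [map()]} | {:error, term()}
  @callback normalize(raw :: map()) :: map()

  # The pipeline only ever calls the behaviour, never a concrete source type.
  def poll(adapter, config, credentials, opts \\ %{}) do
    with {:ok, raw_items} <- adapter.fetch(config, credentials, opts) do
      {:ok, Enum.map(raw_items, &adapter.normalize/1)}
    end
  end
end

defmodule FakeFeedAdapter do
  @moduledoc "Stand-in adapter that serves entries straight from its config."
  @behaviour SourceAdapter

  @impl true
  def fetch(%{entries: entries}, _credentials, _opts), do: {:ok, entries}
  def fetch(_config, _credentials, _opts), do: {:error, :bad_config}

  @impl true
  def normalize(%{"title" => title} = raw) do
    %{title: title, url: Map.get(raw, "link"), priority: 0}
  end
end
```

With this shape, SourceAdapter.poll/4 has no branch per source type: a new integration is a new module that implements the callbacks, and the compiler warns if one is missing.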
3. Managed infrastructure fails in ways no docs warn you about
The single most time-consuming stretch was getting the Database adapter to talk to Supabase, and none of it was Elixir's fault. There were three real walls, all in the git history. First, Supabase's transaction pooler rejects named prepared statements (fixed by forcing prepare: :unnamed on Postgrex). Second, the "direct" connection is IPv6-only and unreachable from hosts without IPv6 networking (fixed by steering users to the IPv4 pooler). Third, postgres:// URLs have to be torn apart into discrete host/port/user/pass options, because Postgrex takes keyword options rather than a connection URL. The fix that mattered most was not code at all - it was writing the connection-string guidance into the setup UI so the next person does not hit the same wall. Half of integration work is turning your own debugging into someone else's instructions.
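Splitting a postgres:// URL into discrete options is mostly a job for URI.parse/1. The sketch below is one way it could be done, assuming a hypothetical PgUrl module; the option keys (hostname, port, username, password, database, prepare) are Postgrex connection options, and 5432 is used as the fallback port when the URL omits one.

```elixir
defmodule PgUrl do
  @moduledoc "Hypothetical helper: splits a postgres:// URL into discrete connection options."

  # Returns {:ok, keyword_opts} or {:error, :invalid_url}.
  def to_opts(url) when is_binary(url) do
    uri = URI.parse(url)

    with true <- uri.scheme in ["postgres", "postgresql"],
         host when is_binary(host) and host != "" <- uri.host,
         [user, pass] <- split_userinfo(uri.userinfo) do
      {:ok,
       [
         hostname: host,
         port: uri.port || 5432,
         # Credentials are percent-encoded inside URLs, so decode them.
         username: URI.decode(user),
         password: URI.decode(pass),
         database: String.trim_leading(uri.path || "", "/"),
         # Transaction poolers cannot keep named prepared statements around.
         prepare: :unnamed
       ]}
    else
      _ -> {:error, :invalid_url}
    end
  end

  # This sketch requires "user:password"; anything else is rejected.
  defp split_userinfo(nil), do: :error
  defp split_userinfo(userinfo), do: String.split(userinfo, ":", parts: 2)
end
```

The resulting keyword list is in the shape Postgrex.start_link/1 expects. Decoding the userinfo matters in practice, since generated passwords often contain characters such as @ that must be percent-encoded in a URL.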
4. An AI feature needs a trust model and a cost model on day one
Two things I would have bolted on later if I had not been forced to think about them. Trust: the assistant can query a connected database, so the query path hard-enforces SELECT-only at the adapter, caps rows, and the connection string lives encrypted - the model gets read access, never write access. The same instinct shows up in the pipeline generator: AI-generated config is treated as an untrusted proposal, surfaced for review and only committed once the user approves, never written straight to the database. Cost: a shared API key is a shared bill, so usage runs through a quota resolver - admins are unlimited, a user who supplies their own Anthropic key is unlimited on their own dime, and everyone else gets a soft daily cap on the shared key, with the multi-call agentic loop counting as one request. Shipping an LLM into a product is as much about what it cannot do and what it cannot cost as what it can do.
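The quota rules described above map naturally onto pattern-matched function clauses. The following is a sketch only: the QuotaResolver module name, the :role and :anthropic_key fields, the default cap of 20, and the return values are assumptions, and how a "soft" cap is surfaced to the user is not specified in the source.

```elixir
defmodule QuotaResolver do
  @moduledoc "Hypothetical quota check; field names and the cap of 20 are assumptions."

  @default_daily_cap 20

  # Returns :unlimited, {:ok, remaining}, or {:error, :quota_exceeded}.
  def check(user, used_today, daily_cap \\ @default_daily_cap)

  # Admins are never limited.
  def check(%{role: :admin}, _used_today, _cap), do: :unlimited

  # Users who bring their own key pay their own bill, so no cap applies.
  def check(%{anthropic_key: key}, _used_today, _cap) when is_binary(key) and key != "",
    do: :unlimited

  # Everyone else shares the key and gets a daily allowance.
  def check(_user, used_today, cap) when used_today < cap, do: {:ok, cap - used_today}
  def check(_user, _used_today, _cap), do: {:error, :quota_exceeded}
end
```

To make the multi-call agentic loop count as one request, the caller would check and increment usage once per user request, before the loop starts, rather than once per API call.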
5. LiveView gives you real-time without a front-end framework
The board updates live - new items, presence avatars, comments - with no React, no client-side store, no polling. Phoenix PubSub broadcasts a :new_item, the LiveView process handles it, and only the changed parts of the page are sent to the browser as a diff over the websocket. The discipline it demands is LiveView streams for collections (so a long-lived board does not balloon server memory) and being deliberate about what state lives on the socket. Trading a SPA for server-driven rendering removed an entire category of client/server state-sync bugs I am used to fighting.
6. Naming the AI changed how the product reads
A small one with an outsized effect. The assistant started out as "Claude" in the UI and prompts. Renaming it to Skiff - giving it a product identity instead of a vendor name - made it read as a feature of Dredge rather than a chatbot bolted on, and quietly decoupled the product's voice from whichever model sits behind it. The branding move and the architecture move (keeping the Anthropic client behind one module) turned out to be the same instinct.
Security
- Credentials and database connection strings stored encrypted at rest via cloak_ecto
- Webhook ingestion is HMAC-verified per source before any payload is trusted
- The AI database tool and the Database adapter both hard-enforce SELECT-only; read-only DB users are recommended in the setup guidance
- AI usage is quota-limited per user to protect the shared API key
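One way a SELECT-only guard with a row cap could be approached is shown below. It is a deliberately conservative textual check, not a SQL parser and not Dredge's actual implementation: the ReadOnlySql module, the keyword list, and the wrapping-subquery row cap are assumptions. It will reject some legitimate queries (for example, one with a column named update), which is why a read-only database user remains the stronger control.

```elixir
defmodule ReadOnlySql do
  @moduledoc "Hypothetical guard: a conservative textual check, not a SQL parser."

  @forbidden ~w(insert update delete drop alter create truncate grant revoke into)

  # Returns {:ok, cleaned_sql} or {:error, reason}.
  def check(sql) when is_binary(sql) do
    # Drop trailing semicolons and whitespace so "SELECT 1;" is accepted.
    trimmed = sql |> String.trim() |> String.trim_trailing(";") |> String.trim()
    words = trimmed |> String.downcase() |> String.split(~r/[^a-z0-9_]+/, trim: true)

    cond do
      # Any remaining semicolon means more than one statement.
      String.contains?(trimmed, ";") -> {:error, :multiple_statements}
      List.first(words) != "select" -> {:error, :not_select}
      Enum.any?(words, &(&1 in @forbidden)) -> {:error, :forbidden_keyword}
      true -> {:ok, trimmed}
    end
  end

  # Wrap the vetted query so the cap holds even if the query has its own LIMIT.
  def with_row_cap(sql, max_rows) when is_integer(max_rows) and max_rows > 0 do
    with {:ok, safe} <- check(sql) do
      {:ok, "SELECT * FROM (#{safe}) AS capped LIMIT #{max_rows}"}
    end
  end
end
```

Erring on the side of false rejections is the intended trade-off here: a refused query costs the assistant one retry, while a write that slips through costs data.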