A multi-phase research pipeline that turns a topic into a long-form narrated audio episode — with chapters, notes, a cheat sheet, and a full source audit you can verify line-by-line. Built for the way you actually want to consume hard topics: while walking, driving, or doing the dishes.
educast.cc is the public face of an end-to-end research-to-audio pipeline. You hand it a subject and a few knobs — how dense, how long, how many modules, which language — and it goes off and does the work an analyst would do: maps the territory, gathers facts, cross-checks them against independent sources, plans a narrative, writes it, narrates it, and ships it as a self-contained shareable page.
Every share page carries the artifact and the audit trail. You can listen to the episode, read the notes, glance at the cheat sheet, and inspect every source the writer was allowed to draw from — including which facts were independently corroborated, by how many sources, and with what verdict.
Three numbers (density, target length, and module count) plus a language hint shape everything downstream.
From those, the pipeline computes its own internal budgets — how deep to recurse, how many parallel searches to fan out, how many facts to verify, how many sections to write, how long each section should be. Higher density means deeper recursion, broader fan-out, more verification calls, and a tighter fact floor per subtopic. Longer length means more sections at roughly 4,000 characters apiece, with each section getting a proportional slice of the verified fact pool.
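The knob-to-budget mapping can be sketched as a pure function. The constants below (the depth formula, the 4,000-character section size, the multipliers) are illustrative guesses consistent with the description above, not educast.cc's actual numbers.

```python
def compute_budgets(density: int, target_chars: int, max_depth: int = 5) -> dict:
    """Map the user-facing knobs to internal research budgets.

    density:      1-10, the "how dense" knob
    target_chars: target script length in characters
    """
    depth = min(max_depth, 1 + density // 2)       # higher density -> deeper recursion
    breadth = 2 + density                          # parallel searches at the root
    facts_per_subtopic = 3 + density               # fact floor per subtopic
    verification_budget = density * 10             # independent re-checks to run
    sections = max(1, round(target_chars / 4000))  # ~4,000 chars per section
    return {
        "depth": depth,
        "breadth": breadth,
        "facts_per_subtopic": facts_per_subtopic,
        "verification_budget": verification_budget,
        "sections": sections,
    }
```

Keeping this a pure function of the knobs means a run's budgets are reproducible from its inputs alone.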
The flow below is the real shape of what happens between your prompt and the published episode. Every step is implemented; the bullet points under each are what that step actually does, in order.
• If you bullet-point any topics in the prompt, an LLM extracts them as mandatory topics — coverage is enforced later in the writing step.
• In course mode, an outline pass produces N module titles + descriptions + summaries from your title and module count.
• Density (1–10) is converted into concrete budgets: research depth, research breadth, max subtopics, a fact demand per subtopic, and a verification budget.
• An LLM generates 8–10 broad search queries from the prompt and any mandatory topics.
• All queries fan out in parallel against a web search API; failed queries are logged and the rest of the pipeline keeps going.
• The result set is folded into a structured knowledge graph: subtopics, entities, initial facts (each tagged with importance and source indices), and typed relationships between entities.
• Per subtopic: an LLM does a gap analysis — what facts are still missing? — and emits targeted queries.
• Pages are fetched (up to 25 per subtopic), parsed, and facts are extracted with source attribution.
• Recursion is controlled by your depth knob (1–5). Each level halves the breadth, so the tree narrows as it deepens.
• Subtopics are researched in parallel under a concurrency limiter so the pipeline doesn’t hammer either the search API or your wallet.
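The recursion-with-halving-breadth scheme, run under a shared concurrency limiter, can be sketched with `asyncio`. The tree-expansion step here is a stand-in for the real gap-analysis-and-search call.

```python
import asyncio

async def research_subtopic(topic: str, depth: int, breadth: int,
                            sem: asyncio.Semaphore) -> list:
    """Recursively research a subtopic; each level halves the breadth."""
    async with sem:  # the limiter caps concurrent search-API calls tree-wide
        # Stand-in for: gap analysis -> targeted queries -> fetched subtopics.
        children = [f"{topic}/{i}" for i in range(breadth)]
    if depth <= 1 or breadth <= 1:
        return children
    results = await asyncio.gather(*[
        research_subtopic(c, depth - 1, breadth // 2, sem) for c in children
    ])
    return children + [t for sub in results for t in sub]
```

Because every recursive call shares one semaphore, total in-flight requests stay bounded no matter how wide the tree gets, which is the "API and wallet" protection described above.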
• Facts are ranked by priority (importance × provisional credibility) and the top N (your verification budget) are selected.
• For each, a fresh independent search is run — not the same query that surfaced it the first time.
• An LLM reads the new sources and issues a verdict: confirmed, partially_confirmed, contradicted, or insufficient_evidence, with an adjusted confidence score in [0, 1].
• The verdict, the adjusted score, and the corroborating source IDs are baked into the fact record. Contradicted facts are marked, not silently dropped.
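The selection-and-verdict steps can be sketched as two small functions; the dict keys and the `raise` policy are assumptions for illustration.

```python
VERDICTS = {"confirmed", "partially_confirmed",
            "contradicted", "insufficient_evidence"}

def select_for_verification(facts: list, budget: int) -> list:
    """Rank by importance x provisional credibility; take the top N."""
    ranked = sorted(facts,
                    key=lambda f: f["importance"] * f["credibility"],
                    reverse=True)
    return ranked[:budget]

def apply_verdict(fact: dict, verdict: str, confidence: float,
                  corroborating_ids: list) -> dict:
    """Bake the verdict into the fact record.

    Contradicted facts are kept and marked, never silently dropped.
    """
    if verdict not in VERDICTS or not 0.0 <= confidence <= 1.0:
        raise ValueError("bad verdict or confidence")
    return {**fact,
            "verdict": verdict,
            "confidence": confidence,
            "corroborated_by": corroborating_ids}
```

Returning a new record rather than mutating in place keeps the pre-verification state around for the audit trail.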
• Facts are allocated to subtopics by a weighted score: roughly importance × 0.5 + sources × 0.3 + verifications × 0.2, with a floor so no discovered subtopic gets starved.
• A research-enhanced writing prompt is assembled — verified facts inline, citations attached, mandatory topics flagged.
• A section outline is generated with roughly one section per ~4,000 characters of target length.
• Each section is written against the brief, the outline, and a deny-list of fact claims used in earlier sections so the same anecdote doesn’t turn up twice.
• Mandatory topics are explicitly required in the writing prompt; sections target their allocated character count to within roughly ±10%.
• Style is deadpan and direct, with banned-pattern enforcement — an audited list of overused tics that the writer is told not to use.
• The script is saved to disk after each section, so a crash mid-way only loses the in-flight section, not the whole run.
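The weighted allocation score uses the weights stated above (importance 0.5, source count 0.3, verification count 0.2); the proportional split with a floor is an illustrative sketch of how those scores might drive per-subtopic fact counts.

```python
def allocation_score(fact: dict) -> float:
    """Weighted score: importance x 0.5 + sources x 0.3 + verifications x 0.2."""
    return (fact["importance"] * 0.5
            + len(fact["source_ids"]) * 0.3
            + fact["verifications"] * 0.2)

def allocate_counts(scores: dict, total: int, floor: int = 2) -> dict:
    """Split a total fact budget across subtopics proportionally to their
    aggregate scores, guaranteeing each subtopic at least `floor` facts."""
    counts = {s: floor for s in scores}
    remaining = total - floor * len(scores)
    score_sum = sum(scores.values()) or 1.0
    for s, sc in scores.items():
        counts[s] += int(remaining * sc / score_sum)
    return counts
```

The floor is the "no starved subtopic" guarantee: even a weakly scored subtopic that the research phase discovered still gets enough facts to support a section.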
• SSML breaks, prosody slowdowns, and emphasis tags are inserted for comic timing and weight.
• Inline tone cues like [deadpan], [slowly], and [pause] are added for the voice persona to interpret, then stripped before TTS.
• The script is split at chapter boundaries; each chapter is narrated as its own MP3.
• Provider is chosen per voice: Gemini, ElevenLabs, or Cartesia. The narrator gets a persona prompt for character.
• Chunks are quality-checked for length drift and tail-truncation before being concatenated. Sample rates are normalized via ffmpeg if providers disagree.
• A peaks waveform is decoded and saved alongside the MP3 for the player UI.
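A cheap version of the length-drift check: compare each chunk's audio duration against what its text length predicts. The speaking rate (15 chars/sec) and tolerance (±30%) are illustrative assumptions, not the pipeline's real thresholds.

```python
def chunk_ok(text_chars: int, audio_seconds: float,
             chars_per_second: float = 15.0, tolerance: float = 0.3) -> bool:
    """Flag TTS chunks whose duration drifts too far from what the text
    length predicts; a large shortfall usually means tail truncation."""
    expected = text_chars / chars_per_second
    return abs(audio_seconds - expected) <= tolerance * expected
```

A chunk that fails this check gets re-synthesized before concatenation, which is much cheaper than discovering a clipped sentence in the published episode.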
• Notes: structured markdown — one section per chapter, 5–10 bullets each, key terms bolded, with a closing “Key Takeaways” block.
• Cheat sheet: a compressed mental model — the hook, the entities, a 4–6 beat narrative spine, typed connections between entities, and the most quotable anchor facts.
• The MP3, peaks file, notes, cheat sheet, and source audit are uploaded to Cloudflare R2.
• A vanity slug is reserved on educast.cc and the share is reachable at a clean URL.
• The landing page is fully self-contained — player, waveform, chapter nav, offline cache, tabs — no accounts, no tracking, no app to install.
The verification phase is the part that earns the “deep research” in the name. A fact that surfaces once in one source is treated as provisional. A fact that survives an independent re-check — with the LLM looking at a different set of sources than the ones that originally produced it — gets a higher confidence score and goes into the script with corroboration count attached.
Every share page exposes this audit trail: for each fact, its verdict, its adjusted confidence score, and the sources that corroborated it.
This isn’t a guarantee of truth — nothing on the open web is — but it’s a defensible epistemic floor. You can listen, then verify, then disagree, and the receipts are right there in the same page.
The writer detects language hints in the prompt itself. Say “in Hungarian”, “auf Deutsch”, “in Lithuanian”, or anything similar, and the entire script is written in that language. Supported languages are bounded by the underlying LLM (which speaks 100+) and by the TTS provider you pick — Gemini and ElevenLabs both ship multilingual voices. There is no hardcoded language list to fall off the edge of.
• Narrated MP3 with a precise peaks waveform, variable speed, chapter markers, and offline caching for road trips and flights.
• Structured markdown notes the audio was written from — readable as a standalone briefing.
• A compressed mental model: the hook, the entities, the spine of the argument, and the anchor facts.
• Every source the research touched, with a credibility audit trail you can verify yourself.
educast.cc is built by Ben Racz — engineer working on AI agent systems and tools that respect attention. Reach out, give feedback, or follow the work at benracz.com.