
Three weeks ago, I wrote about knowledge collapse - how our best technical insights are dying in private AI chats while Stack Overflow bleeds 78% of its traffic.
Hundreds of developers agreed we need a solution.
So I built one. And it's running in production right now.
Here's the live demo: https://chat-knowledge-api.fpl-test.workers.dev
Here's the source code: https://github.com/dannwaneri/chat-knowledge
Here's the complete build session behind this article: 107 chunks, 174 messages, imported via HTML
Let me show you how it works.
Knowledge collapse is happening right now:
We need "Stack Overflow for AI conversations" - but decentralized, privacy-first, and developer-owned.
The workflow is dead simple:
Ctrl+S (save as HTML)
node dist/cli/import-html.js chat.html
Tested on this article's build session:
No complex setup. No API keys. Just save and import.
Before any chat goes public, it runs through auto-detection:
🔴 CRITICAL (auto-block):
Real results from my build session scan:
Total issues detected: 599
Critical blocks: 3 (2 Bearer tokens, 1 API key)
High severity: Multiple localhost URLs
Safe to share: FALSE ✅ (exactly as designed)
This is the difference between "share everything" and "share safely." A single leaked API key can cost more than this entire system.
This Build Session:
That's faster than most companies decide what to build.
Not keyword matching - actual understanding.
Query: "how to handle vectorize embeddings"
Found: Content about "dimension reduction" and "optimization"
Relevance score: 0.78
The system understood WHAT I meant, not just what I typed.
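To make that relevance score less magical: it is a cosine similarity between the query embedding and each chunk embedding. In the deployed system Vectorize computes this for you, so the snippet below is only a minimal sketch of the idea. The function names (`cosineSimilarity`, `rankBySimilarity`) and the in-memory chunk shape are hypothetical, not taken from the repo.

```typescript
// Illustrative only: Vectorize does this server-side in the real system.
interface EmbeddedChunk {
  id: string;
  embedding: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity between two equal-length vectors, in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the best matches.
function rankBySimilarity(
  query: number[],
  chunks: EmbeddedChunk[],
  maxResults: number = 5
): ScoredChunk[] {
  return chunks
    .map((chunk) => ({ id: chunk.id, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, maxResults);
}
```

Two texts that use different words but point in a similar direction in embedding space still score high, which is why a query about "vectorize embeddings" can surface content about "dimension reduction".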
Tech stack:
Workers AI (@cf/baai/bge-base-en-v1.5) - generates embeddings
This isn't just personal knowledge management. It's designed to federate.
Live endpoints:
/api/federation/nodeinfo (200 OK ✅)
/api/federation/.well-known/webfinger
What federation means:
Exactly like Mastodon, but for developer knowledge.
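For readers who haven't touched the Fediverse plumbing: WebFinger is how one server asks another "who is this account, and where does its ActivityPub actor live?" Here's a minimal sketch of the JSON a WebFinger endpoint returns, following RFC 7033. The helper names and the `/users/<name>` actor path are assumptions for illustration, and the repo's actual routes may differ.

```typescript
// Shape of a WebFinger (RFC 7033) response, reduced to what ActivityPub discovery needs.
interface WebFingerLink {
  rel: string;
  type: string;
  href: string;
}

interface WebFingerResponse {
  subject: string;
  links: WebFingerLink[];
}

// Parse "acct:user@domain" from the ?resource= query parameter.
function parseAcctResource(resource: string): { username: string; domain: string } | null {
  const match = /^acct:([^@\s]+)@([^@\s]+)$/.exec(resource);
  return match ? { username: match[1], domain: match[2] } : null;
}

// Build the discovery document pointing at the account's ActivityPub actor.
// The /users/<username> path is a hypothetical URL layout.
function buildWebFinger(username: string, domain: string): WebFingerResponse {
  return {
    subject: `acct:${username}@${domain}`,
    links: [
      {
        rel: "self",
        type: "application/activity+json",
        href: `https://${domain}/users/${username}`,
      },
    ],
  };
}
```

A remote instance would call the endpoint with `?resource=acct:name@your-domain`, read the `self` link, and fetch the actor from there.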
Frontend: HTML → Parser → Chunks
Backend: Cloudflare Workers (edge-native)
Database: D1 (SQLite at the edge)
Vector Store: Vectorize (768-dim embeddings)
AI: Workers AI (BGE-base-en-v1.5)
Protocol: ActivityPub (federation standard)
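Here's roughly how the Workers AI and Vectorize pieces of that stack meet at query time. Treat it as a sketch under assumptions: the binding names `AI` and `VECTORIZE`, the `semanticSearch` function, and the trimmed-down interfaces are mine, written to mirror the shape of the Cloudflare bindings rather than copied from the repo.

```typescript
// Minimal stand-ins for the Cloudflare bindings, so the sketch is self-contained.
// The real types come from @cloudflare/workers-types.
interface EmbeddingModel {
  run(model: string, input: { text: string[] }): Promise<{ data: number[][] }>;
}

interface VectorIndex {
  query(
    vector: number[],
    options: { topK: number }
  ): Promise<{ matches: { id: string; score: number }[] }>;
}

// Binding names are assumptions; they are whatever wrangler config declares.
interface Env {
  AI: EmbeddingModel;
  VECTORIZE: VectorIndex;
}

interface SearchHit {
  chunkId: string;
  score: number;
}

async function semanticSearch(env: Env, query: string, maxResults: number = 5): Promise<SearchHit[]> {
  // 1. Turn the query text into a 768-dimension embedding.
  const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
  const vector = embedding.data[0];
  if (!vector) {
    throw new Error("embedding model returned no vector");
  }
  // 2. Ask the vector index for the nearest stored chunks.
  const result = await env.VECTORIZE.query(vector, { topK: maxResults });
  // 3. Return chunk ids and scores; metadata lookup in D1 would follow from here.
  return result.matches.map((m) => ({ chunkId: m.id, score: m.score }));
}
```

Keeping the bindings behind small interfaces like this also makes the search path easy to exercise locally with fakes.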
1. IMPORT
HTML file → Parse messages → Chunk content → Generate embeddings
2. SECURITY
Scan for secrets → Flag risks → Require review → Safe by default
3. STORAGE
Chunks → D1 (metadata)
Embeddings → Vectorize (semantic search)
4. SEARCH
Query → Generate embedding → Cosine similarity → Ranked results
5. FEDERATION (coming)
Public chats → ActivityPub → Federated timeline → Cross-instance search
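Step 1 is where 174 messages became 107 chunks, so it's worth showing what "chunk content" can look like. This is a minimal sketch of one common approach (greedily grouping consecutive messages under a character budget), not the repo's actual chunker. The `Message` and `Chunk` shapes, `chunkMessages`, and the 2000-character default are all hypothetical.

```typescript
interface Message {
  role: "human" | "assistant";
  text: string;
}

interface Chunk {
  index: number;
  content: string;
  messageCount: number;
}

// Group consecutive messages into chunks of at most maxChars characters.
// A single message longer than maxChars is kept whole in its own chunk.
function chunkMessages(messages: Message[], maxChars: number = 2000): Chunk[] {
  const chunks: Chunk[] = [];
  let buffer: string[] = [];
  let size = 0;

  const flush = (): void => {
    if (buffer.length === 0) return;
    chunks.push({
      index: chunks.length,
      content: buffer.join("\n\n"),
      messageCount: buffer.length,
    });
    buffer = [];
    size = 0;
  };

  for (const message of messages) {
    const line = `${message.role}: ${message.text}`;
    // Start a new chunk if adding this message would blow the budget.
    if (buffer.length > 0 && size + line.length > maxChars) {
      flush();
    }
    buffer.push(line);
    size += line.length;
  }
  flush();
  return chunks;
}
```

Each chunk's `content` is what would be embedded and stored, which is why keeping a question and its answer together in one chunk tends to help search quality.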
1. HTML Parsing
Claude's HTML export format isn't documented. Had to reverse-engineer:
2. Security Scanner
Can't just regex for "API key" - need to understand context:
3. Federation Protocol
ActivityPub is designed for social posts, not Q&A:
4. Edge-Native Architecture
Cloudflare Workers have constraints:
Working within constraints = better architecture.
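// Real security detection from the codebase
const patterns = {
  // "Bearer <token>" values, e.g. pasted Authorization headers
  bearerToken: /Bearer\s+[A-Za-z0-9\-._~+/]+=*/gi,
  // api_key / api-key / apiKey assignments followed by 20+ key characters
  apiKey: /['\"]?api[_-]?key['\"]?\s*[:=]\s*['\"]?[A-Za-z0-9_-]{20,}['\"]?/gi,
  // URLs pointing at the local machine
  localhost: /https?:\/\/(localhost|127\.0\.0\.1|::1)/gi,
  // Hostnames ending in suffixes often used for internal services
  internalDomain: /https?:\/\/[a-z0-9.-]+\.(local|internal|corp|dev)/gi
};
// Scan returns: { safe: boolean, issues: Issue[] }
// Auto-blocks if critical issues found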
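To show how patterns like these turn into the `{ safe, issues }` result, here's a minimal sketch of a scan function. It is an illustration, not the repo's scanner: `scanText`, the `rules` table, the `Issue` fields, and the severity assigned to each rule are my assumptions, with tokens and API keys treated as critical and localhost URLs as high to match the results reported earlier.

```typescript
type Severity = "critical" | "high";

interface Issue {
  type: string;
  severity: Severity;
  match: string;
  index: number;
}

interface ScanResult {
  safe: boolean;
  issues: Issue[];
}

// Hypothetical rule table built from the patterns shown above.
const rules: { type: string; severity: Severity; pattern: RegExp }[] = [
  { type: "bearerToken", severity: "critical", pattern: /Bearer\s+[A-Za-z0-9\-._~+\/]+=*/gi },
  {
    type: "apiKey",
    severity: "critical",
    pattern: /['"]?api[_-]?key['"]?\s*[:=]\s*['"]?[A-Za-z0-9_-]{20,}['"]?/gi,
  },
  { type: "localhost", severity: "high", pattern: /https?:\/\/(localhost|127\.0\.0\.1|::1)/gi },
];

function scanText(text: string): ScanResult {
  const issues: Issue[] = [];
  for (const rule of rules) {
    // Fresh RegExp per scan: global regexes carry lastIndex state between calls.
    const re = new RegExp(rule.pattern.source, rule.pattern.flags);
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      issues.push({ type: rule.type, severity: rule.severity, match: m[0], index: m.index });
    }
  }
  // Safe by default means: any critical finding blocks sharing.
  const safe = !issues.some((issue) => issue.severity === "critical");
  return { safe, issues };
}
```

With this shape, a chat containing only localhost URLs would be flagged for review but not auto-blocked, while a single Bearer token flips `safe` to `false`.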
chats -- Core chat storage
chunks -- Content chunks for search
pre_share_scans -- Security scanner results
chunk_redactions -- Auto-redaction tracking
share_approvals -- Sharing workflow
federated_instances -- Federation network
federated_knowledge -- Cross-instance content
federation_activities -- ActivityPub events
knowledge_analytics -- Usage tracking
collections -- Knowledge curation
collection_items -- Collection membership
This is production-grade infrastructure, not a proof of concept.
Timing matters.
Stack Overflow traffic is down 78% and still falling. Every day, thousands of valuable debugging sessions happen in private AI chats and disappear forever.
We're not just losing knowledge - we're losing the HABIT of knowledge sharing.
Building this now means:
The best time to rebuild the knowledge commons was before Stack Overflow collapsed.
The second-best time is now.
Cloudflare Workers handles massive scale:
The same tech stack I use for production apps serving thousands of users.
Three weeks ago, Richard Pascoe asked in the comments: "Could Mastodon servers like Fosstodon help foster a knowledge sharing platform?"
I said yes and built it.
This isn't theoretical infrastructure. It's built on ActivityPub, the same protocol that Mastodon, Fosstodon, and the rest of the Fediverse use, so it's designed to federate with them.
Richard's question became the bridge between diagnosis and solution.
@richardpascoe - your instance is ready when you are. 🚀
Two weeks ago, we created @the-foundation to preserve developer knowledge publicly.
Richard Pascoe published our first collaborative post on fundamentals.
This is our second: working infrastructure.
The Foundation isn't just writing about the problem. We're shipping solutions.
This isn't a solo project. It's infrastructure.
The knowledge commons doesn't rebuild itself. But we can build it together.
# Clone the repo
git clone https://github.com/dannwaneri/chat-knowledge.git
cd chat-knowledge
# Install dependencies
npm install
# Login to Cloudflare (free tier works)
wrangler login
# Create infrastructure
wrangler d1 create chat-knowledge-db
wrangler vectorize create chat-knowledge-embeddings --dimensions=768 --metric=cosine
# Run migrations
wrangler d1 execute chat-knowledge-db --remote --file=migrations/migration-federation.sql
wrangler d1 execute chat-knowledge-db --remote --file=migrations/migration-sanitizer.sql
# Deploy
npm run deploy
That's it. You now have your own federated knowledge instance.
Import a chat:
# Save any Claude conversation as HTML (Ctrl+S)
npm run build
node dist/cli/import-html.js path/to/chat.html "My First Import"
Search it:
curl -X POST https://your-worker.workers.dev/search \
-H "Content-Type: application/json" \
-d '{"query": "debugging tips", "maxResults": 5}'
Scan for secrets:
node dist/cli/safe-share.js <chat-id>
# Shows what would leak before you share
I wrote about the problem three weeks ago.
Now the solution is running in production.
From observation to shipped product in 21 days.
That's the power of:
Your move, Stack Overflow. 👊
GitHub: https://github.com/dannwaneri/chat-knowledge
Live Demo: https://chat-knowledge-api.fpl-test.workers.dev
Twitter: @dannwaneri
If you believe in preserving developer knowledge, star the repo ⭐ and let's make this real.
The foundation is laid. Now we need builders.
Are you in?