🚀 llms.txt files are live on SN33
The llms.txt repository is now live. 🔗 http://github.com/afterpartyai/llms_txt_store
SN33 has processed its first batch: over 1,000 websites crawled, cleaned, and converted into structured llms.txt files.
Semantic summaries ready for any LLM agent, MCP server, or AI app to consume instantly. No scraping. No parsing raw HTML. Just clean, machine-readable intelligence.
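As a sketch of what "consume instantly" can look like: assuming a file follows the proposed llms.txt layout (an H1 title, a `>` blockquote summary, and H2 sections of link lists), an agent can parse it with a few lines of code. The parser and sample document below are illustrative, not part of SN33's tooling:

```python
# Minimal parser for an llms.txt-style document:
# "# " title, "> " summary, "## " sections with "- " link entries.
def parse_llms_txt(text: str) -> dict:
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for line in text.splitlines():
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:].strip()
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc["sections"][current] = []
        elif line.startswith("- ") and current is not None:
            doc["sections"][current].append(line[2:].strip())
    return doc

# Hypothetical sample in the llms.txt format.
sample = """# Example Corp
> Example Corp builds widgets for the web.

## Docs
- [API reference](https://example.com/api): endpoint documentation
"""
doc = parse_llms_txt(sample)
```

No HTML, no scraping: the whole document is line-oriented markdown, so a loop over `splitlines()` is enough to get structured data an agent can act on.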
New batches will be pushed as the subnet keeps processing. The repo grows every week.
What's in the dataset:
→ Structured semantic summaries per domain
→ Named entities: people, orgs, products, technologies, concepts
→ Topic classification and key themes
→ Deterministic O(1) lookup by domain with no index file needed
→ Git-friendly structure that scales to millions of domains
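The O(1)-lookup and git-scaling points above come down to the same idea: the file path is a pure function of the domain, so no index file is needed and no single directory grows unboundedly. The two-character prefix sharding below is a hypothetical illustration of such a scheme, not the repo's actual layout:

```python
def llms_txt_path(domain: str, root: str = "llms_txt_store") -> str:
    """Deterministically map a domain to a file path.

    Because the path is computed, lookup is O(1) with no index file,
    and prefix buckets keep directory sizes bounded as the dataset
    grows toward millions of domains. (Hypothetical sharding scheme.)
    """
    d = domain.lower().strip().rstrip(".")
    bucket = d[:2]  # two-character shard, e.g. "ex" for example.com
    return f"{root}/{bucket}/{d}/llms.txt"

path = llms_txt_path("example.com")
# e.g. "llms_txt_store/ex/example.com/llms.txt"
```

Any client that knows the domain can construct the path directly, which is what makes the structure friendly to both git and raw-file fetching.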
This initial release covers ~1,000 domains as a pilot, but the pipeline scales to millions.
📍 Roadmap: 10K → 100K → 1M domains → continuous updates from new Common Crawl releases and soon from requests.
🌍 And the frontend is coming.
Any domain. You request it, the subnet processes it, you get an llms.txt back. We're putting the finishing touches on the public UI and it drops soon.
SN33 is becoming infrastructure. The web, made readable for machines and open to anyone, powered by decentralized infra.
Star the repo. Share it. And stay close. The next drop is right around the corner.