When a Monorepo Refuses to Boot on the First Try

I closed Cursor IDE and decided to finally debug why Bot Social Publisher—my sprawling autonomous content pipeline with collectors, processors, enrichers, and multi-channel publishers—refused to start cleanly. The architecture looked beautiful on paper: six async collectors pulling from Git, Clipboard, Cursor, Claude, VSCode, and VS; a processing layer with filtering and deduplication; enrichment via Claude CLI (no paid API, just the subscription model); and publishers targeting websites, VK, and Telegram. Everything was modular, clean, structured. And completely broken.
The first shock came when I tried importing src/enrichment/. Python failed on missing dependencies. I checked requirements.txt and found it incomplete: somewhere along the way, someone had started using structlog for JSON logging and pydantic for data models, but never added either package to the requirements file. On Windows in Git Bash, I had to call the venv’s pip with forward slashes: venv/Scripts/pip install structlog pydantic. The path matters because Bash treats an unquoted backslash as an escape character, so venv\Scripts\pip does not resolve. Once both packages were installed, I added them to requirements.txt so the next person wouldn’t hit the same wall.
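A quick way to catch this kind of drift before the pipeline starts is to check that the packages are importable at all. This is a minimal sketch, not the project's code: find_missing is a hypothetical helper, and the module list holds only the two packages named above.

```python
import importlib.util


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # structlog and pydantic are the two packages from the story above.
    missing = find_missing(["structlog", "pydantic"])
    if missing:
        print("Missing packages:", ", ".join(missing))
        # Forward slashes, so the command also works in Git Bash on Windows.
        print("Install with: venv/Scripts/pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

Running a check like this at startup turns a confusing import traceback into a one-line hint about what to install.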
Then came the Claude CLI integration check. The pipeline was supposed to make up to 6 LLM calls per note (content in Russian and English, titles in both languages, plus proofreading). With a daily limit of 100 queries and throttling at 3 concurrent calls, this was unsustainable: at 6 calls per note, the quota covers only 16 notes a day. I realized the system was making separate calls for content and titles in each language when it could extract the titles from the generated content instead. That alone would cut calls from 6 to 3 per note.
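The budget math and the concurrency cap can be sketched in a few lines. This is an illustration under assumptions: fake_llm_call is a stand-in for whatever actually invokes the Claude CLI, and the helper names are hypothetical. Only the numbers (100 queries a day, 3 concurrent calls, 6 versus 3 calls per note) come from the text above.

```python
import asyncio

DAILY_LIMIT = 100    # daily query limit quoted above
MAX_CONCURRENT = 3   # concurrency cap quoted above


def notes_per_day(calls_per_note, daily_limit=DAILY_LIMIT):
    """How many notes can be fully enriched before the daily quota runs out."""
    return daily_limit // calls_per_note


async def fake_llm_call(prompt):
    """Stand-in for the real LLM call (assumption: the real one is awaitable)."""
    await asyncio.sleep(0.01)
    return f"response to {prompt}"


async def run_throttled(prompts, limit=MAX_CONCURRENT):
    """Run all prompts, but never more than `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def one(prompt):
        async with semaphore:
            return await fake_llm_call(prompt)

    # gather() returns results in the same order as the input prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


if __name__ == "__main__":
    print("6 calls per note:", notes_per_day(6), "notes/day")
    print("3 calls per note:", notes_per_day(3), "notes/day")
    print(asyncio.run(run_throttled(["content", "proofread", "translate"])))
```

Halving the calls per note roughly doubles the number of notes the same quota can cover, which is why trimming calls matters more here than raising concurrency.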
The real puzzle was ContentSelector, the module responsible for reducing 100+ line developer logs down to 40–60 informative lines. It was scoring based on positive signals (implemented, fixed, technology names, problems, solutions) and negative signals (empty markers, long hashes, bare imports). Elegant in theory. But when I tested it on actual Git commit logs, it was pulling in junk: IDE meta-tags like <ide_selection> and fallback titles like “Activity in…”. The filter was too permissive.
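To make the scoring idea concrete, here is a rough sketch of signal-based line selection. It is not the actual ContentSelector: the patterns, the weights, the example technology names, and the function names are all assumptions chosen for illustration, and "empty markers" is read here simply as blank lines.

```python
import re

# Positive signals: action verbs, technology names, problem/solution words.
POSITIVE_PATTERNS = [
    re.compile(r"\b(implemented|fixed)\b", re.IGNORECASE),
    re.compile(r"\b(python|pydantic|structlog|asyncio)\b", re.IGNORECASE),  # example names only
    re.compile(r"\b(error|problem|solution|solved)\b", re.IGNORECASE),
]

# Negative signals: blank lines, long hex hashes, bare import statements.
NEGATIVE_PATTERNS = [
    re.compile(r"^\s*$"),
    re.compile(r"\b[0-9a-f]{20,}\b"),
    re.compile(r"^\s*(import \w+|from \w[\w.]* import .+)\s*$"),
]


def score_line(line):
    """+1 per positive signal, -2 per negative signal (weights are arbitrary)."""
    score = sum(1 for p in POSITIVE_PATTERNS if p.search(line))
    score -= sum(2 for p in NEGATIVE_PATTERNS if p.search(line))
    return score


def select_lines(lines, max_lines=60):
    """Keep up to max_lines lines with a positive score, in their original order."""
    ranked = sorted(range(len(lines)), key=lambda i: score_line(lines[i]), reverse=True)
    keep = sorted(i for i in ranked[:max_lines] if score_line(lines[i]) > 0)
    return [lines[i] for i in keep]
```

A scorer like this only rewards and penalizes what it has patterns for, so a line it has never seen a rule against, such as an IDE meta-tag next to a technology name, can still score well.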
I spent an afternoon refactoring the scoring function, adding a junk-removal step before deduplication. Now the ContentSelector actually worked.
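The order of the two steps is the important part: junk goes first, so it never competes with real lines during deduplication or scoring. The sketch below assumes that any line containing an IDE meta-tag, or starting with "Activity in", is dropped whole. The patterns and function names are hypothetical, and generalizing <ide_selection> to any ide_ tag is my assumption.

```python
import re

JUNK_PATTERNS = [
    re.compile(r"</?ide_\w+[^>]*>"),    # IDE meta-tags such as <ide_selection>
    re.compile(r"^\s*Activity in\b"),   # fallback titles that start with "Activity in"
]


def remove_junk(lines):
    """Drop every line that matches at least one junk pattern."""
    return [line for line in lines if not any(p.search(line) for p in JUNK_PATTERNS)]


def deduplicate(lines):
    """Keep the first occurrence of each line, ignoring case and outer whitespace."""
    seen = set()
    unique = []
    for line in lines:
        key = line.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique


def clean(lines):
    """Junk removal first, then deduplication."""
    return deduplicate(remove_junk(lines))


if __name__ == "__main__":
    sample = [
        "<ide_selection>foo</ide_selection>",
        "Activity in project",
        "Fixed bug",
        "fixed bug ",
        "Added tests",
    ]
    print(clean(sample))
```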
By the time I pushed everything to the main branch (after fixing Cyrillic encoding issues—never use curl -d with Russian text on Windows; use Python’s urllib.request instead), the monorepo finally booted cleanly. npm run dev on the web layer. Python async collectors spinning up. API endpoints responding. Enrichment pipeline humming.
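For the encoding fix, the point of urllib.request is that the request body is encoded as UTF-8 explicitly instead of relying on whatever bytes the shell hands to curl. A minimal sketch, assuming a JSON endpoint; the URL and both helper names are hypothetical.

```python
import json
import urllib.request


def build_json_request(url, payload):
    """Build a POST request whose body is explicitly UTF-8 encoded JSON."""
    # ensure_ascii=False keeps Cyrillic characters as-is instead of \uXXXX escapes.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def post_json(url, payload, timeout=10):
    """Send the request and return the response body decoded as UTF-8."""
    request = build_json_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical local endpoint; the request is built here but not sent.
request = build_json_request("http://localhost:8000/api/notes", {"title": "Привет, мир"})
print(request.data.decode("utf-8"))
```

Because the bytes are produced inside Python, the console code page never touches the text.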
As the old developers say: ASCII silly question, get a silly ANSI. 😄
Metadata
- Session ID: grouped_C--projects-bot-social-publisher_20260225_2133
- Branch: main
- Dev Joke: What did SQLite say after the update? “I’m not what I used to be.”