BorisovAI — Tools for the community. By the community.

I’ve been knee-deep in refactoring a voice-agent codebase—one of those projects that looks clean on the surface but hides architectural chaos underneath. The mission: consolidate 3,400+ lines of scattered handler code, untangle circular dependencies, and introduce proper dependency injection.

The story begins innocently. The handlers.py file had ballooned to 3,407 lines, with handlers reaching into a dozen global variables from legacy modules. Every handler touched _pending_restart, _user_sessions, _context_cache—you name it. The coupling was so tight that extracting even a single handler meant dragging half the codebase with it.

I started with the low-hanging fruit: moving UserSession and UserSessionManager into src/core/session.py, creating a real orchestrator layer that didn’t import from Telegram handlers, and fixing subprocess calls. The critical bug? A blocking subprocess.run() in the compaction logic was freezing the entire async event loop. Switching to asyncio.create_subprocess_exec() with a 60-second timeout was a no-brainer, but it revealed another issue: I had to ensure all imports were top-level, not inline, to avoid race conditions.
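To make the subprocess fix concrete, here is a minimal sketch of the non-blocking pattern. It is not the project's actual compaction code: the helper name run_with_timeout and the command in the demo are assumptions for illustration. Only asyncio.create_subprocess_exec() and the 60-second timeout come from the story above.

```python
import asyncio
import sys


async def run_with_timeout(*cmd: str, timeout: float = 60.0) -> tuple[int, str]:
    """Run a command without blocking the event loop (hypothetical helper)."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    try:
        # Other coroutines keep running while we wait for the child.
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the child so it does not linger as a zombie
        raise
    return proc.returncode, out.decode(errors="replace")


if __name__ == "__main__":
    # Placeholder command; the real compaction command is not shown in the post.
    code, output = asyncio.run(run_with_timeout(sys.executable, "-c", "print('ok')"))
    print(code, output.strip())
```

The difference from subprocess.run() is that the wait happens at an await point, so the bot can keep serving other users while the child process works. On timeout the child is killed and reaped before the error propagates.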

Then came the DI refactor—the real challenge. I designed a HandlerDeps dataclass to pass dependencies explicitly, added a DepsMiddleware to inject them, and started migrating handlers off globals. But here’s where reality hit: the voice and document handlers were so intertwined with legacy globals (especially _execute_restart) that extracting them would create more coupling, not less. Sometimes the best refactor is knowing when not to refactor.
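For readers who want to picture the DI pieces, here is a rough, framework-agnostic sketch. The names HandlerDeps and DepsMiddleware come from the refactor itself. Everything else is an assumption made for illustration: the fields on the dataclass, the SessionManager stand-in, the (handler, event, data) middleware signature, and the start_handler example. The real bot framework and field list are not shown in this post.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[[Any, Dict[str, Any]], Awaitable[Any]]


class SessionManager:
    """Hypothetical stand-in for UserSessionManager."""

    def __init__(self) -> None:
        self._sessions: Dict[int, Dict[str, Any]] = {}

    def get(self, user_id: int) -> Dict[str, Any]:
        # Create the session on first access.
        return self._sessions.setdefault(user_id, {})


@dataclass(frozen=True)
class HandlerDeps:
    """Everything a handler may touch, passed explicitly (fields are assumed)."""
    sessions: SessionManager
    context_cache: Dict[str, Any]


class DepsMiddleware:
    """Puts the shared HandlerDeps into each handler's data dict."""

    def __init__(self, deps: HandlerDeps) -> None:
        self._deps = deps

    async def __call__(self, handler: Handler, event: Any, data: Dict[str, Any]) -> Any:
        data["deps"] = self._deps
        return await handler(event, data)


async def start_handler(event: Dict[str, Any], data: Dict[str, Any]) -> str:
    # No module-level globals: state is reached only through deps.
    deps: HandlerDeps = data["deps"]
    session = deps.sessions.get(event["user_id"])
    session["seen"] = session.get("seen", 0) + 1
    return f"visit {session['seen']}"
```

The payoff shows up in tests: a handler can be exercised with a throwaway HandlerDeps instead of patching globals like _user_sessions or _context_cache.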

The breakthrough came when I recognized the pattern: not all handlers need DI. The Telegram bot handlers, the CLI routing layer—those could be decoupled. The legacy handlers? I’d leave them as-is for now, but isolate them behind clear boundaries. By step 5, I had 566 passing tests and zero failing ones.

The memory leak in RateLimitMiddleware was devilishly simple—stale user entries weren’t being cleaned up. A periodic cleanup loop fixed it. The undefined candidates variable in error handling? That’s what happens when code generation outpaces testing. Add a test, catch the bug.
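The rate-limiter leak and its fix can be sketched roughly like this. This is not the real RateLimitMiddleware: the class name RateLimiter, the min_interval and ttl parameters, and the injectable clock are all assumptions chosen to keep the example small and testable.

```python
import asyncio
import time
from typing import Callable, Dict


class RateLimiter:
    """Per-user rate limiter with periodic cleanup (illustrative sketch)."""

    def __init__(
        self,
        min_interval: float = 1.0,
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_interval = min_interval
        self.ttl = ttl
        self._clock = clock
        # Without cleanup, this dict grows by one entry per user, forever.
        self._last_seen: Dict[int, float] = {}

    def allow(self, user_id: int) -> bool:
        now = self._clock()
        last = self._last_seen.get(user_id)
        if last is not None and now - last < self.min_interval:
            return False
        self._last_seen[user_id] = now
        return True

    def purge_stale(self) -> int:
        """Drop users not seen for longer than ttl; return how many were removed."""
        cutoff = self._clock() - self.ttl
        stale = [uid for uid, ts in self._last_seen.items() if ts < cutoff]
        for uid in stale:
            del self._last_seen[uid]
        return len(stale)

    async def cleanup_loop(self, every: float = 60.0) -> None:
        """Run as a background task, e.g. asyncio.create_task(limiter.cleanup_loop())."""
        while True:
            await asyncio.sleep(every)
            self.purge_stale()
```

Collecting the stale keys into a list before deleting avoids mutating the dict while iterating over it. The cleanup task should be cancelled on shutdown so it does not outlive the bot.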

The lesson learned: refactoring legacy code isn’t about achieving perfect architecture in one go. It’s about strategic decoupling—fixing the leaks that matter, removing the globals that matter, and deferring the rest. Sometimes the best code is the code you don’t rewrite.

As a programmer, I learned long ago: we don’t worry about warnings—only errors 😄

Refactoring a Voice Agent: When Dependencies Fight Back
