Blog
Posts about the development process, solved problems and learned technologies
How I Caught the Best Seed in Neural Network Search
Got up from the couch, coffee in hand, and realized: I need to find the optimal seed for LLM Analysis. The project demanded a breakthrough: the current baseline was giving 72.86% accuracy, and that wasn't good enough for production.

The task seemed straightforward at first glance: test 20 different seeds, each generating its own model initialization. But beneath that simplicity lay an uncomfortable truth: each seed required roughly 100 minutes of computation. That adds up to about 33 hours of pure runtime for the search.

I launched *seed_search.py* and sent it to the background via nohup, leaving it to work on its own while I handled everything else. The first result surprised me: **seed 1 showed 76.5% at the 200th checkpoint**, a 3.64 percentage point improvement over the baseline. Not revolutionary, but movement in the right direction. The script ran stably, with results accumulating in *results_seed_search.json* and resume support: if the process crashed, I could just restart it and it would continue from where it left off.

While the seeds were computing, I got on with parallel work. I wrote *augment_problems.py*, which transformed 6,604 original problems into 39,582 variations, the foundation for model self-distillation. At the same time I prepared *majority_voting.py* for voting between Orchestra and baseline, and *dual_orchestra.py* for a two-stage architecture with intermediate layers.

The plan crystallized in my head. After seed search finishes (another three days), I will:

1. Analyze the distribution of 20 results and pick the best seed
2. Run majority voting on the best checkpoint
3. Build Dual Orchestra Stage 1, using the best seed as the foundation
4. Train self-distillation on 39K augmented problems

The technology behind all this is simple but stubborn. Claude as the primary LLM: fast, accurate enough for analysis. Python for process orchestration, JavaScript somewhere in the neighboring services. But the main thing is patience and systematicity.
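For the curious, here is a minimal sketch of how resume support like this can work. It is not the actual *seed_search.py*: the function names, the JSON layout (seed mapped to accuracy), and the `train_and_evaluate` stub are all assumptions made for illustration. Only the results file name and the 72.86% baseline come from the post.

```python
import json
import random
from pathlib import Path

RESULTS_PATH = Path("results_seed_search.json")  # file name from the post
BASELINE_ACC = 72.86  # baseline accuracy from the post, in percent


def train_and_evaluate(seed: int) -> float:
    """Hypothetical stand-in for the real ~100-minute training run.

    Returns a made-up accuracy so the sketch finishes instantly.
    """
    rng = random.Random(seed)
    return round(70.0 + rng.random() * 8.0, 2)


def load_results(path: Path) -> dict:
    """Load results saved by an earlier run, or start with an empty dict."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {}


def run_seed_search(seeds, path: Path = RESULTS_PATH, evaluate=train_and_evaluate) -> dict:
    """Evaluate every seed that has no saved result yet (assumed layout: seed -> accuracy)."""
    results = load_results(path)
    for seed in seeds:
        key = str(seed)  # JSON object keys are always strings
        if key in results:
            continue  # already finished in a previous run, skip on resume
        results[key] = evaluate(seed)
        # Persist after every seed so a crash loses at most one run.
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(results, indent=2), encoding="utf-8")
        tmp.replace(path)  # rename over the old file in one step
    return results


def best_seed(results: dict):
    """Return (seed, accuracy, gain over baseline in percentage points)."""
    seed, acc = max(results.items(), key=lambda kv: kv[1])
    return int(seed), acc, round(acc - BASELINE_ACC, 2)
```

Calling `run_seed_search(range(1, 21))` a second time after a crash would skip every seed already present in the file. Writing to a temporary file and renaming it is a design choice of this sketch, so that an interrupted write does not leave a half-written JSON file behind.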
In a month, if everything works out, this model will beat today's baseline. For now, I'm waiting for results, sipping cold coffee.

**Fun fact:** Kafka and my black cat have one thing in common: both do only what they want and actively ignore instructions. 😄
When Russian Abbreviations Break Your UI: A Cascade Debug Story
I was debugging the **Cascade** trend analysis frontend when a Slack message came in: *"The translated labels look wrong."* One glance at the API response confirmed it: "Финансирование инвестиций в ИИ" (AI Investment Financing) had arrived pristine from Claude, but somewhere between the backend and the DOM, "ИИ" had collapsed into "ии". Classic case of right data, wrong rendering.

The culprit was `formatClassName()`, a utility function that handles label capitalization for display. It was applying strict sentence-case logic (uppercase first character, lowercase everything else) indiscriminately to both English and Russian text. For English, this works fine because we maintain an `ABBREVIATIONS` set that preserves known acronyms like "LLM" and "API". But Russian abbreviations like "ИИ" (AI), "США" (USA), and "ЕС" (EU) had no such protection. The lowercase transformation was eating them alive.

The decision point came down to this: should I add a massive Russian abbreviations dictionary to the frontend, or should I detect when we're dealing with non-ASCII text and skip the aggressive sentence-casing altogether? The latter felt smarter. The backend's Claude LLM was already returning perfectly capitalized Russian text via `_enforce_sentence_case()`. I wasn't fixing translation quality; I was preventing the frontend from *breaking* it.

The fix was surgical: check if the input contains Cyrillic characters. If it does, preserve case entirely and only guarantee the first letter is uppercase. If it's pure ASCII (English), apply the original sentence-case logic with `ABBREVIATIONS` protection. A simple test against the Unicode range for Cyrillic (U+0400 to U+04FF), for example a regex like `/[\u0400-\u04FF]/`, solved it without bloating the codebase.

**Here's a fun fact:** Cyrillic is actually younger than the Latin script. It emerged in the late 9th century and is named after Saint Cyril, although it is generally attributed to his disciples rather than to Cyril himself. More than a thousand years later, we're still tripping over case handling in non-Latin alphabets.

The labels render correctly now. "ИИ" stays "ИИ". The branch (`fix/crawler-source-type`) is clean, the build passes, and Monday's code should behave exactly like Friday's, which is all we can ask for 😄
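The real `formatClassName()` lives in the JavaScript frontend and isn't shown in this post. As a rough illustration of the branching described above, here is the same idea sketched in Python. The function name `format_class_name`, the contents of `ABBREVIATIONS`, and the word-by-word English path are assumptions for the sake of the example, not the production code.

```python
import re

# Assumption: a small illustrative subset of the real ABBREVIATIONS set.
ABBREVIATIONS = {"LLM", "API", "AI"}

# Basic Cyrillic block, U+0400 to U+04FF, the range named in the post.
CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def format_class_name(label: str) -> str:
    """Sentence-case English labels; leave Cyrillic casing as received."""
    if not label:
        return label

    if CYRILLIC.search(label):
        # Trust the backend's casing: only force the first letter to uppercase.
        return label[0].upper() + label[1:]

    # English path (assumed shape): lowercase everything except the first
    # letter, but keep known abbreviations fully uppercase.
    words = []
    for i, word in enumerate(label.split(" ")):
        if word.upper() in ABBREVIATIONS:
            words.append(word.upper())
        elif i == 0:
            words.append(word[:1].upper() + word[1:].lower())
        else:
            words.append(word.lower())
    return " ".join(words)


print(format_class_name("Финансирование инвестиций в ИИ"))
print(format_class_name("large LLM api usage"))
```

With this branching, "ИИ" passes through untouched, while the English label still gets sentence case with "LLM" and "API" preserved. The sketch ignores punctuation attached to abbreviations (such as "API,"), which a real implementation would need to handle.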