Ghost in the Numbers: When Bug Hunting Reveals Design Debt

Debugging a Ghost in the Trend Analyzer: When Two Scores Told Different Stories
The trend-analysis project had an unexpected visitor in the data: two nearly identical analyses showing suspiciously different scores. One trend pulled a 7.0, the other landed at 7.6. On the surface, it looked like a calculation bug. But as often happens in real development work, the truth was messier and more interesting.
The task was straightforward—investigate why the scoring system seemed inconsistent. The project tracks Hacker News trends using Python backend APIs and frontend analytics pages, so getting the scoring right wasn’t just about pretty numbers. It was about users trusting the analysis.
I started where any good detective would: examining the actual data. The two analyses in question? Different stories entirely. One covered trend c91332df from hn:46934344 with a 7.0 rating. The other analyzed trend 7485d43e from hn:46922969, scoring 7.6. The scores weren’t bugs—they were correct measurements of different phenomena. That’s the moment the investigation pivoted from “find the bug” to “find what’s actually broken.”
What emerged was a classic case of technical debt intersecting with code clarity. The frontend’s formatScore function was doing unnecessary normalization that didn’t match the backend’s 0-10 scale calculation. Meanwhile, in api/routes.py, there was a subtle field-mapping issue: the code was looking for "strength" when it should have been reading "impact". Nothing catastrophic, but the kind of inconsistency that erodes confidence in a system.
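The post doesn't show the actual code, so here is a hypothetical sketch of what the field-mapping bug in api/routes.py might have looked like. Only the "strength" vs "impact" key names and the 0-10 scale come from the write-up; the function name, data shape, and averaging logic are assumptions for illustration:

```python
# Hypothetical sketch of the field-mapping bug (not the project's actual code).
# Only the "strength"/"impact" key names come from the post; everything else
# is invented to illustrate the failure mode.

def calculate_score(trend: dict) -> float:
    """Combine per-signal ratings into a single 0-10 score."""
    signals = trend.get("signals", [])
    if not signals:
        return 0.0
    # Bug: the code read "strength", but the analyzer emits "impact".
    # Because dict.get() fell back to 0 instead of raising, the wrong
    # key silently dragged scores down rather than crashing.
    # total = sum(s.get("strength", 0) for s in signals)   # wrong key
    total = sum(s.get("impact", 0) for s in signals)       # fixed
    return round(total / len(signals), 1)

print(calculate_score({"signals": [{"impact": 7}, {"impact": 8}]}))  # 7.5
```

The instructive part is the silent failure: `dict.get(key, 0)` on a misspelled or renamed field never errors, it just quietly skews the math, which is exactly the kind of bug that hides until someone compares two scores side by side.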
The fix required three separate commits, each surgical and focused. First came the API correction—swapping that field reference from strength to impact in the score calculation logic. Next, the frontend got cleaned up: removing the normalization layer and bumping precision to .toFixed(1) to match the backend’s intended scale. Finally, the CHANGELOG captured the investigation, turning a debugging session into project knowledge.
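The actual frontend fix was in JavaScript's formatScore, but the "format, don't rescale" principle translates directly. A minimal Python sketch of the same idea, with all names assumed:

```python
# Hypothetical sketch of the display-side fix (the real change was in the
# JavaScript formatScore function; this mirrors the idea in Python).

def format_score(score: float) -> str:
    """Render a backend score that is ALREADY on the 0-10 scale.

    The old code applied an extra normalization step before formatting,
    mangling a value that needed no rescaling. The fix: trust the backend's
    scale and only control precision, one decimal place to match it.
    """
    return f"{score:.1f}"

print(format_score(7.0))  # "7.0"
print(format_score(7.6))  # "7.6"
```

The design point is that exactly one layer should own the scale; everything downstream should format, never transform.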
Here’s something worth knowing about scoring systems in general: the 0-10 scale is deceptively tricky. It looks simple until you realize that normalized scoring, raw scoring, and percentile scoring all feel like the same thing but produce wildly different results. The real trap isn’t the math; it’s inconsistent expectations about what the scale represents. That’s why the backend and frontend must speak the same numeric language, and why a mismatch in field names can hide for months until someone notices the discrepancy.
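A toy example (not project code, with made-up numbers) makes the ambiguity concrete: the same value reads very differently depending on which interpretation of "out of 10" you pick.

```python
# Toy illustration: one value, three common readings of a "0-10 scale".
raw = [3.5, 7.0, 7.6, 9.2]  # invented example scores
value = 7.0

# 1. Raw: the value already lives on the 0-10 scale. Report it as-is.
raw_score = value  # 7.0

# 2. Min-max normalized: rescale relative to the observed range,
#    so the lowest score maps to 0 and the highest to 10.
lo, hi = min(raw), max(raw)
norm_score = 10 * (value - lo) / (hi - lo)  # ~6.1

# 3. Percentile: fraction of values at or below it, scaled to 10.
pct_score = 10 * sum(v <= value for v in raw) / len(raw)  # 5.0

print(raw_score, round(norm_score, 1), pct_score)
```

Three defensible answers, one input. If the backend computes one of these and the frontend silently applies another, the displayed numbers look like bugs even when every line of code is "working".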
By the time all three commits hit the repository, the scores made sense again. The 7.0 and 7.6 weren’t contradictions—they were two different trends being measured honestly. The system was working. It just needed to be reminded what it was measuring and how to say it clearly.
😄 Turns out the real bug wasn’t in the code at all. It was in my assumption that two similar analyses should produce identical scores, because I wasn’t reading the data carefully enough.
Metadata
- Session ID: grouped_trend-analisis_20260210_1726
- Branch: main
- Dev Joke: What happens if AWS becomes sentient? The first thing it will do is delete its own documentation.