When One Intelligence Is No Longer Enough

We were building a local voice assistant.

Not a cloud chatbot, but a real system running on Linux: a microphone capturing audio in real time, an offline transcription engine processing speech, and an asynchronous architecture that had to remain stable while everything happened at once. The goal sounded simple and turned out to be anything but — listen, understand and respond without relying on the internet, with low latency and without the system locking up.

The problem emerged right there, at the technical core of the project.

Audio flowed continuously through the ALSA driver while the transcription engine consumed CPU in heavy bursts. Whenever processing stretched a few milliseconds too far, the async loop lagged behind and the microphone began throwing input overflow errors. The symptoms were obvious: incomplete transcriptions, odd glitches, or a frozen system with no clear explanation. We kept tuning parameters, changing block sizes, adjusting configurations… yet the behaviour remained unpredictable.
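The article does not say which capture stack sat on top of ALSA; as a rough sketch, assuming a Python path built on the sounddevice library (a PortAudio wrapper, which is an assumption rather than a detail from the project), that kind of overflow surfaces as a status flag in the capture callback whenever the processing side stops draining the buffer in time:

```python
import sounddevice as sd  # assumed capture library; the article does not name one

def on_audio(indata, frames, time_info, status):
    # PortAudio sets this flag when the input ring buffer fills because
    # the consumer (here, the transcription side) is not keeping up.
    if status and status.input_overflow:
        print("input overflow: processing fell behind the microphone")

# Hypothetical capture settings; block size was one of the knobs being tuned.
stream = sd.InputStream(samplerate=16000, channels=1,
                        blocksize=1024, callback=on_audio)
```

Tuning block sizes or sample rates only changes how soon that flag trips; it does not remove the coupling between capture and processing.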

There came a moment when the system simply stopped moving forward.

It wasn’t a lack of ideas or effort. We had simply reached a real limit working from a single structural perspective: Hashomer AI (a custom ChatGPT) and I were trying to push from within one frame of thinking. Hashomer held the architectural axis: what could be touched without breaking the system, when to pause, when to stabilise before moving again.

What if the problem couldn’t be solved by pushing harder from the same angle?

Until then the work had been linear — one conversation, one model, one sequence of tests. That approach works for smaller tasks, but once the project began behaving like a living organism — real-time audio, CPU limits, decisions affecting the entire flow — linear thinking wasn’t enough anymore.

So we opened the problem from another direction. We brought the full diagnosis to Gemini AI, not to “fix everything”, but to look at the system at the runtime level. Gemini noticed what we hadn’t fully isolated: the collision between the audio thread and the async loop was the real bottleneck.
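The article shows no code, but a minimal sketch of that separation, assuming a Python stack with asyncio and an offline engine exposed as a blocking transcribe() call (all assumptions, not details from the project), keeps the audio callback trivial and pushes the heavy work onto a thread pool so the event loop never stalls:

```python
import asyncio
import queue

import sounddevice as sd  # assumed libraries; the article names none of them

audio_q = queue.Queue()  # thread-safe hand-off between audio thread and event loop

def on_audio(indata, frames, time_info, status):
    # The callback only copies the block; nothing CPU-heavy runs on the audio thread.
    audio_q.put(indata.copy())

def transcribe(block) -> str:
    # Placeholder for the offline engine's blocking call (hypothetical).
    return ""

async def consume() -> None:
    loop = asyncio.get_running_loop()
    while True:
        # Wait for the next block without blocking the event loop.
        block = await loop.run_in_executor(None, audio_q.get)
        # Run the CPU-heavy transcription on the default thread pool.
        text = await loop.run_in_executor(None, transcribe, block)
        if text:
            print(text)

async def main() -> None:
    with sd.InputStream(samplerate=16000, channels=1,
                        blocksize=1024, callback=on_audio):
        await consume()

asyncio.run(main())
```

The point of the pattern is that the callback only copies bytes; every heavy step happens off the loop, so the microphone buffer is always drained on time.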

The system moved forward… until a quieter friction surfaced. The code worked, yet the behaviour still felt unstable. That was when Claude AI entered with a different logic — less urgency, more method.

And in the middle of all this, something happened that no manual prepares you for.

At times the hardest part wasn’t fixing the bug, but translating between answers that were each correct yet didn’t fit together. Three valid perspectives could easily turn into noise if no one held the thread. That was the moment I realised the real task was no longer just programming — it was directing.

But there is something even more important than architecture, technique or coordination.

In the real world, decisions are not judged by how elegant they sound, but by what they set in motion afterwards. AI can suggest, structure or simulate scenarios, yet it does not sign the outcome or carry its consequences.

During this process I understood something I told my son Xavier today: words can be generated, models can simulate the world, but responsibility is never delegated. It is assumed.

The most interesting moment was not seeing the system finally stable — although that brought a kind of physical relief — but sensing that the way of working itself had changed.

Looking back now, I no longer remember three screens open or three models replying at once. I remember a single conversation held from angles I alone could not inhabit simultaneously — and only then did I realise it was not a collection of tools at all, but something that felt, almost dangerously, like a Dream Team.


© The Times of Israel (Blogs)