
Harness Engineering: The Discipline That Makes AI Agents Production-Ready
Why "just use a better model" is the wrong instinct — and how systematically engineering your agent's configuration surface is what actually makes them reliable.


How I built a persistent, self-improving AI system with semantic memory, procedural knowledge, quality feedback loops, and coherence measurement running 24/7.

Building a production memory system for conversational AI based on cognitive science principles: dual storage, temporal validity, uncertainty tracking, and entity deduplication.

Evolutionary search over AI agent orchestration patterns. 18 experiments across 6 models (7B–405B parameters) show that orchestration effectiveness depends heavily on model scale.

Building a production voice assistant with local GLM routing and semantic memory. 85% of queries handled at 700ms latency with smart home integration and Engram memory.

What started as porting Kyutai's Mimi neural audio codec to Apple Silicon became something bigger: a full-duplex, fully local voice AI system running on an M4 Max. No cloud. No round-trips. Sub-200ms first-audio latency.