Thinking — Soterra Labs

APRIL 10, 2026 · 11 MIN

GPU Infrastructure: The Five Calculations That Actually Matter

VRAM fit, quantization impact, multi-node thresholds, egress cost, and real TCO: the numbers that determine whether a GPU deployment works.

APRIL 8, 2026 · 6 MIN

The Trust Layer: What Separates Good RAG from Enterprise RAG

Four bugs found while hardening a RAG system for FSI and life sciences — and what they reveal about the gap between a working system and a trustworthy one.

APRIL 5, 2026 · 6 MIN

The AI PC Buying Problem Every Enterprise Needs to Solve

The vendor numbers are real. The benchmarks are valid. And the procurement question still does not have a clean answer — here is the gap I kept running into.

APRIL 1, 2026 · 6 MIN

MCP in Production, Part 1: Persistent Sessions, Pooling, and Fault Tolerance

Five transport-layer decisions (session pooling, eviction, cancel-scope isolation, timeouts, and heartbeat design), each driven by a real failure in a KYC onboarding system.

APRIL 1, 2026 · 5 MIN

MCP in Production, Part 2: Authentication, Observability, and Operational Design

Bearer token auth at the transport layer, correlation IDs across four servers, lazy session init, and clean shutdown — the system-level decisions that make an MCP client deployable.

MARCH 31, 2026 · 5 MIN

Designing a Professional Digital Twin: The Architecture

Most professional knowledge lives in people's heads. Here's what it looks like when you structure it as an agentic system — personas, tools, skills, rules, and memory.

MARCH 31, 2026 · 5 MIN

I Used MCP as a Service-to-Service Protocol. Here's What I Learned.

MCP was designed as an LLM-to-tool protocol. I used it as a service-to-service layer between a LangGraph orchestrator and independently deployable integration servers. It worked — with real tradeoffs.

MARCH 31, 2026 · 7 MIN

Why Your AI Agent Demo Looks Great and Your Production System Doesn't

Why ReAct agents struggle in production, why deterministic orchestration (LangGraph, Temporal) is the pattern that ships in regulated workflows (KYC, lending), and why the auditability argument outlives model improvements.