I haven't started building yet. I'm still in the part where I stare at the space, read everything I can find, and wait for the right problem to become obvious. This page is where I think out loud about it.
The agentic space is moving faster than the tooling, the mental models, and definitely the production track record. We have models that can reason, use tools, maintain context, and plan across steps—but most production agents are still brittle, over-scaffolded, and solving problems that don't actually need agents. The interesting work is figuring out where the leverage is.
At FedEx I've been experimenting with IDE-based AI agents to speed up documentation and improve code quality in an enterprise environment. The work has given me a ground-level view of the friction between what agents promise and what they actually deliver when deployed in real workflows. That gap is where I want to build.
“What does reliability mean for an agent that takes action on your behalf?”
Not accuracy — reliability. The bar is different when consequences are real.
“When should an agent ask for permission vs. just proceed?”
Trust calibration is an unsolved design problem. Most agents get it wrong in both directions.
“How do you debug a system whose reasoning is opaque?”
Observability for agents is where logging was for distributed systems in 2010. Wide open.
“What's the right granularity for a task?”
One big agent vs. many small ones is a real architectural question with real tradeoffs.
“What should agents remember, and who decides?”
Memory gives agents leverage. It also gives them drift. The design space here is enormous.
Something that operates—not just generates. An agent that handles a workflow end-to-end, makes real decisions with real consequences, and gets meaningfully better over time. I'm drawn to problems where humans are doing high-frequency, semi-structured work that could be delegated to a system that genuinely understands context.
Productivity, development tooling, and research workflows all feel like the right territory. But I'm more interested in finding the right problem than picking a category. The best projects I've built started with a real frustration, not a market thesis.
Full-stack engineering — I can build the product, not just the agent layer
Enterprise API experience at scale (FedEx logistics systems)
A strong bias toward shipping over theorizing
Enough curiosity about the underlying models to work productively at the frontier
Experience building complex multi-user systems with real performance constraints
If you're working on something in the agentic space—or have a problem you think needs an agent but aren't sure yet—I'd genuinely love to talk. Even early, unformed ideas.
Let's talk