About
I build production AI systems teams can test, observe, and trust.
I build production AI systems, agent runtimes, and harnesses that make AI features testable, observable, repeatable, and safe to operate. My edge is turning high-context AI ideas into production-ready systems, especially when the work demands agent behavior, backend boundaries, evaluation loops, and operational rigor all at once.

Approach
How I work
I work best on problems where engineering judgment creates compounding leverage. That usually means moving between agent behavior, backend systems, product constraints, and implementation details without losing sight of the outcome.
When AI is involved, I care less about demos and more about whether the system can be trusted. I build the surrounding harnesses, evaluation loops, observability, and engineering constraints that turn agent behavior into something teams can actually operate and improve.
Toolbox
Languages and frameworks
Web development
React, Next.js, TanStack, Svelte, HTMX, Tailwind CSS, Tailwind Variants, TypeScript, React Query, Three.js, Motion
Node.js, Bun, TypeScript, Express, Elysia, Hono, Python, Flask, FastAPI, Django
Postgres, SQLite, MySQL, MongoDB, OracleDB
AI development
Claude, OpenAI, Gemini, Llama, Mistral, and most other leading LLMs
OpenAI SDK, OpenAI Agents, LangChain, LangGraph, LangSmith, PydanticAI, Agno, Vercel AI SDK, custom agents built from scratch
pgvector, ChromaDB, Pinecone, Qdrant
Langfuse, DeepEval, LangSmith
DevOps
Docker, Docker Compose, Kubernetes, Terraform, AWS (EC2, S3, RDS, Lambda), GCP, Azure, Alibaba Cloud, Tencent Cloud, CI/CD (GitHub Actions, Jenkins), Ansible, Bash scripting, Nginx
Ubuntu, Debian, SSH, systemctl, package management, firewall management (ufw, iptables), monitoring and logging (systemd-journald, syslog)
Principles
What guides the work
Build systems that stay legible under scale, change, and operational pressure.
Treat AI agents like production systems: instrument them, evaluate them, and give them harnesses that earn trust.
Optimize for repeatability, observability, and clear backend boundaries over novelty.
Elsewhere