AI Platform Engineer
Building production-grade AI systems with LLM orchestration, autonomous agents, and intelligent safety controls
About Me
I'm an AI Platform Engineer with 10+ years of experience building scalable cloud infrastructure. Recently, I've focused on architecting production-grade AI systems featuring LLM orchestration, autonomous agent infrastructure, and AI safety systems.
I've built and deployed two full-stack AI platforms from the ground up: OnCallShift (AI-powered incident management) and WorkerMill (autonomous coding agent orchestration). Both demonstrate production-quality architecture with multi-tenant design, real-time monitoring, cost optimization, and safety controls.
My expertise spans Claude API integration, distributed systems, cost tracking for AI workloads, and building safety guardrails for autonomous systems. I bring deep cloud infrastructure knowledge (AWS, Azure, Terraform, Docker) combined with hands-on AI platform engineering experience.
Core Expertise
AI/LLM
- Claude API (Anthropic SDK)
- Autonomous Agent Orchestration
- AI Safety & Governance
- Prompt Engineering
- Cost Optimization
Cloud & Infrastructure
- AWS (ECS, Lambda, RDS, S3)
- Azure
- Terraform / CloudFormation
- Docker / Kubernetes
- CI/CD Pipelines
Development
- TypeScript / JavaScript
- Python
- React / Node.js
- PostgreSQL
- Real-time Systems
AI Projects
OnCallShift
Production | AI-Powered Incident Management Platform
Full-stack incident management platform featuring Claude-powered diagnosis, AI runbook automation, and cloud investigation capabilities. Includes iOS/Android mobile apps built with React Native.
Key Features:
- Claude-powered incident diagnosis with streaming responses
- AI runbook automation with sandboxed execution
- Multi-tenant API key management
- MCP server for AI assistant integration
- Real-time streaming with sub-second latency
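As an illustration of the multi-tenant API key management listed above, here is a minimal Python sketch. It is not OnCallShift's actual code: the `ApiKeyStore` class, the in-memory dictionary standing in for a database table, and the `ocs_` key prefix are all assumptions made for this example. The idea shown is that only a hash of each key is stored, and the hash maps back to the owning tenant.

```python
import hashlib
import secrets


class ApiKeyStore:
    """Hypothetical in-memory stand-in for a per-tenant API key table."""

    def __init__(self):
        # Maps the SHA-256 digest of a key to its tenant id.
        # The raw key itself is never stored.
        self._tenant_by_digest = {}

    @staticmethod
    def _digest(raw_key: str) -> str:
        return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()

    def issue_key(self, tenant_id: str) -> str:
        """Create a new random key for a tenant and return it once."""
        raw_key = "ocs_" + secrets.token_urlsafe(32)  # "ocs_" prefix is illustrative
        self._tenant_by_digest[self._digest(raw_key)] = tenant_id
        return raw_key

    def resolve_tenant(self, raw_key: str):
        """Return the tenant id that owns the key, or None if unknown."""
        return self._tenant_by_digest.get(self._digest(raw_key))

    def revoke(self, raw_key: str) -> bool:
        """Remove a key; returns True if it existed."""
        return self._tenant_by_digest.pop(self._digest(raw_key), None) is not None
```

A request handler would call `resolve_tenant` on the presented key and scope every later query to the returned tenant id.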
WorkerMill
Production | Autonomous AI Coding Agent Orchestration
Mission control platform for autonomous AI coding agents featuring real-time monitoring, cost tracking, and safety controls. Manages end-to-end workflow from Jira tickets to GitHub PRs.
Key Features:
- Real-time terminal streaming with sub-second observability
- PostgreSQL-based log streaming (2x faster than CloudWatch)
- Atomic task claiming with database-level locking
- Persona-based workers with role-specific directives
- Safety guardrails and approval workflows
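The atomic task claiming listed above can be sketched in a few lines. This is a simplified illustration, not WorkerMill's implementation: it uses Python's built-in `sqlite3` module so it runs without a server, and the `tasks` table, its columns, and the `claim_next_task` function are hypothetical names. On PostgreSQL, a common way to get the same guarantee is `SELECT ... FOR UPDATE SKIP LOCKED` inside a transaction. The sketch instead uses a guarded `UPDATE` so that only one worker can move a task out of the `pending` state.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical task table: one row per unit of work.
    conn.execute(
        "CREATE TABLE tasks ("
        "id INTEGER PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "worker_id TEXT)"
    )


def claim_next_task(conn, worker_id):
    """Claim the oldest pending task; return its id, or None if none was claimed."""
    with conn:  # commits on success, rolls back on an exception
        row = conn.execute(
            "SELECT id FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause is the guard: if another
        # worker claimed this row first, no rows match and rowcount is 0.
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', worker_id = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker_id, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None
```

A worker that gets `None` back simply retries or idles, so two workers never run the same task.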
Professional Experience
DevSecOps Manager
Eaton | Raleigh, NC
Architecting a cloud-native DevSecOps roadmap on Azure with infrastructure-as-code standards. Building scalable infrastructure automation that supports AI/ML workload provisioning across multiple business units.
Lead DevOps Engineer
ICF | Remote
Modernized cloud platforms for federal programs. Developed AWS Lambda functions for orchestration workflows, migrated Kubernetes workloads to ECS/EKS, and established centralized monitoring systems.
DevOps Engineer
Nextworld | Remote
Managed CloudFormation-based infrastructure for a multi-region ERP platform. Built CI/CD pipelines and infrastructure automation across AWS regions.
Site Reliability Engineer
State of Colorado Dept. of Revenue | Remote
Supported statewide rollout of DRIVES, Colorado's cloud-based motor vehicle platform.
Get in Touch
Interested in AI platform engineering, LLM orchestration, or autonomous systems? I'm currently exploring new opportunities.