Lessons Learned About AI Agents
A quick note about my journey before we begin
I have tried many things, and made many mistakes. The path I chose was full of researching, listening, learning, vibe coding, and building. Lots of trial and error, a few false starts, some promising prototypes, and at least three major triumphs.
Along the way, I have leveraged agents in coding platforms to interact with AI models and help me build. Based on our work together, I asked these agents to help me distill what I have learned to share with others who are just getting started. I hope you find our notes helpful.
Start here
"An AI agent is a smart teammate that follows our rules, uses our tools, and gets better because we measure it."
AI agents are digital teammates. They can see what's going on, think about next steps, and do work in your systems. The keys to success aren't in the model alone; the results come from the plumbing around it: clearly defined objectives, clean data, the right tools, clear guardrails, and steady measurement.
What agents have been great at (and not great at)
"If you can write a short checklist for a smart intern, an agent can probably help."
Great at (for me):
- Repetitive, multi-step online work (research, tagging, routing, summarizing, filling forms)
- Swivel-chair tasks across multiple systems
- Last-mile chores humans dislike but do often
- Prototyping, setting up tests, and analyzing results
Not great at (yet):
- Safety-critical tasks, fully hands-off autonomy
- Navigating complex web apps without oversight
- Work with no clear definition of "done"
What you actually need under the hood
"Start with the brain + 2 tools + logging. Add the rest as you scale."
Start only with what you need. Expand from there.
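The "brain + 2 tools + logging" starting point can be sketched in a few lines. This is a minimal sketch, not a real implementation: the `brain` function is a stub standing in for an LLM call, and the tool names, prompts, and outputs are all hypothetical.

```python
# A minimal sketch of "the brain + 2 tools + logging": a stub model (the
# brain) picks from exactly two registered tools, and every step is logged.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("starter-agent")

def brain(observation):
    """Stub for the model call: decide the next tool from the observation."""
    if observation.startswith("ticket:"):
        return ("lookup", observation.removeprefix("ticket:").strip())
    if not observation.startswith("summary:"):
        return ("summarize", observation)
    return ("done", observation)

TOOLS = {  # start with exactly two tools, add more only as you scale
    "lookup": lambda q: f"record for {q}: unresolved printer issue",
    "summarize": lambda text: f"summary: {text[:40]}",
}

def run(observation, max_steps=5):
    for step in range(max_steps):
        tool, arg = brain(observation)
        log.info("step %d: %s(%r)", step, tool, arg)  # log everything
        if tool == "done":
            return observation
        observation = TOOLS[tool](arg)
    return observation
```

The cap on `max_steps` is deliberate: a bounded loop is the simplest guardrail against an agent wandering, and the log line per step is what makes later auditing possible.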
The 5-Level Agent Ladder (don't skip steps)
"Advance a level only when the current one is boringly reliable."
- Copilot: Answers questions, drafts content, pulls info, assists with planning and analyzing; humans still drive activity
- Tool-using taskbot: Performs one task via an API (tag, summarize, enrich, route)
- Workflow agent: Handles a small process end-to-end (intake → triage → prep)
- Multi-agent team: Specialized agents (researcher, analyst, writer) orchestrate or hand work to each other
- Computer-using agent: Clicks through web apps when no API exists; start with tracing and close supervision, and never stop auditing
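To make the second rung concrete: a tool-using taskbot can be as small as one function behind an API, doing one narrow job (here, routing tickets). This is a sketch with hypothetical queue names and keyword rules standing in for a model call.

```python
# A sketch of a Level-2 tool-using taskbot: one narrow task (routing
# support tickets to a queue) performed reliably, nothing more.

ROUTES = {  # hypothetical queues and their trigger keywords
    "billing": ["invoice", "charge", "refund"],
    "outage": ["down", "error", "crash"],
}

def route_ticket(text):
    """Return the queue a ticket should go to, defaulting to triage."""
    lowered = text.lower()
    for queue, keywords in ROUTES.items():
        if any(k in lowered for k in keywords):
            return queue
    return "triage"
```

Making a taskbot like this "boringly reliable" before moving up the ladder is exactly the point of not skipping steps.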
A simple example: a 6-week starter plan
"If it's not measured, it didn't happen."
- Weeks 1-2: Pick one job and measure it; save real examples to test
- Week 3: Ship a tiny agent with 2-3 tools; log everything
- Week 4: Add measurement (dashboards + automatic checks)
- Week 5: Wrap it as a workflow with guided retries and timeouts
- Week 6: Pilot with a small group; track KPIs vs. baseline
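The "save real examples, then add automatic checks" steps above can be sketched as a tiny eval harness: replay the saved cases through the agent and report a pass rate you can track against the baseline. The agent function and cases here are hypothetical placeholders.

```python
# A sketch of automatic checks: replay saved real examples through the
# agent and compute a pass rate to put on the week-4 dashboard.

def agent(ticket):  # placeholder for the real agent under test
    return "billing" if "refund" in ticket else "triage"

CASES = [  # real examples saved during weeks 1-2
    {"input": "please refund me", "expected": "billing"},
    {"input": "how do I log in?", "expected": "triage"},
]

def pass_rate(agent_fn, cases):
    """Fraction of saved cases the agent handles correctly."""
    passed = sum(agent_fn(c["input"]) == c["expected"] for c in cases)
    return passed / len(cases)
```

Run this on every change; if the pass rate drops, you know before your pilot group does.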
Guardrails leaders care about
"Without guardrails, agents can get off track quick (believe me)"
- Least privilege: minimum access to systems (data, rights, tools)
- Human checkpoints: approvals for risky actions
- Trace everything: so you can audit and improve
- Plain-English policy: on what data agents can change, store, or erase
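Three of these guardrails fit in one small wrapper: an allowlist of tools (least privilege), a human approval callback for risky actions (checkpoints), and an append-only audit log (trace everything). The tool names and the approval mechanism are hypothetical; in practice `approve` would be a ticket or chat prompt to a person.

```python
# A sketch of guardrails around tool calls: least privilege, human
# checkpoints for risky actions, and an audit trail for every decision.

ALLOWED_TOOLS = {"read_record", "tag_record"}  # least privilege
RISKY_TOOLS = {"delete_record"}                # requires human approval

def call_tool(name, approve, audit_log):
    """Run a tool only if allowed, pausing risky ones for approval."""
    if name in RISKY_TOOLS:
        if not approve(name):                  # human checkpoint
            audit_log.append(("denied", name))
            return "blocked"
    elif name not in ALLOWED_TOOLS:
        audit_log.append(("rejected", name))
        return "blocked"
    audit_log.append(("ran", name))            # trace everything
    return "ok"
```

Note that even approved and rejected calls land in the audit log; that trail is what lets you answer "what did the agent change, and who signed off?" later.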
Build vs. Buy (60-second rubric)
"Buy plumbing, build your playbook."
- Buy: common problems with strong vendors
- Build: your secret sauce or many internal system touches
- Hybrid: buy the platform, build the business logic
One-minute summary
This isn't about replacing people — it's about removing or accelerating work and helping people leverage AI to increase potential, impact, and success. Start small. Pick a real job. Give the agent the right tools. Measure everything. Build trust. Scale from there.