Engineering capabilities, amplified

Running an army of AI coding agents on the CI

Overview

This post explores how to deploy and manage multiple AI coding agents with minimal infrastructure and maintenance overhead by leveraging continuous integration (CI) systems. I’ll demonstrate how to orchestrate agent workflows using GitHub Issues and Pull Requests as the primary interface for task assignment and coordination.

TL;DR

Background

Recently I’ve been coding exclusively with kodelet. It’s a CLI-based coding agent I built myself.

Before raising an army of kodelet agents I predominantly ran kodelet on my workstation. My typical workflow for small-to-medium features or bug fixes has become:

  • Navigate to the repository
  • Check out a new branch
  • Document my requirements in REQUIREMENTS.md
  • Kick start kodelet via cat REQUIREMENTS.md | kodelet run
  • Review kodelet’s work and iterate based on the feedback as needed
  • Polish the final 10% of code manually (rarely needed these days)
  • Use kodelet commit --short --no-confirm to commit changes
  • Run kodelet pr to create a pull request

Over the past month this approach has completely transformed my dev workflow. My productivity is no longer constrained by typing speed or cognitive overload, but by how effectively I can articulate clear, detailed requirements.

Climb The Complexity Ladder

While the agent is generating code, I’m free to think ahead or take breaks (admittedly, sometimes doom-scrolling my X feed!). Occasionally I’ll clone the repository into separate directories and run several kodelets in parallel on different branches, but I don’t do this often. Managing multiple concurrent branches carries significant mental overhead: cloning repos, checking out branches, making LLM-assisted changes, committing and resolving merge conflicts. I’ve attempted this workflow several times, but it demands too much micromanagement.

My ideal scenario would involve a swarm of autonomous AI coding agents working in parallel, with progress visible from my phone while I’m at the playground with my kids. Specifically, I envision:

  • On demand PR on issue - Creating GitHub issues that kodelet automatically picks up and works on
  • On demand update on PR comments - Providing feedback on pull requests through comments, with the agent updating the code based on the feedback
  • Flexibility - I still have the freedom to take over the work when needed, in case the AI goes off track

This sounds very promising, except for one significant barrier - setting up on-demand development environments. Each agent would need to run in its own VM or container so that agents work in isolation without interfering with one another. This creates a high barrier to entry.

Just Do It

Speaking for myself, the best thing about working with a coding agent is that it significantly lowers the activation energy needed to tackle challenging problems, and with it the procrastination that comes from feeling demoralised. Last Friday (May 30), I began to take on the problem with kodelet.

We began by brainstorming the technical choices and trade-offs, captured in an Architecture Decision Record (ADR). Specifically, we considered a few options for managing the infra:

  1. On demand VM-based “Cloud IDE” - It is a very exciting engineering challenge, but requires a lot of infra effort to set up and maintain. Besides, in my experience the reproducibility of remote dev environments typically decays over time. As a result I strongly believe it’s a dead end.

  2. Local workstation - It comes with little cost, but requires you to maintain a local dev tunnel (ngrok, Cloudflare Tunnel, Tailscale etc.) to receive GitHub webhooks, and the solution is extremely hard to standardise because everyone runs their own virtualisation setup on their workstation.

  3. Run on your CI (Chosen) - As I don’t aspire to build a Cloud IDE, both kodelet and I think running the AI agent’s dev environment on CI is a great idea, as it’s very low-key and low-maintenance. Essentially the infra for running the AI coding agent’s dev environment is outsourced to GitHub Actions, and the project management UI is offloaded to the existing GitHub Issues and Pull Requests pages, which is an established workflow developers already use. It also comes with great reproducibility, since the dev env setup for the AI coding agent is likely to be identical to your CI test env.

After discussing the architecture for an entire afternoon, we decided to go with the third option.

FWIW I am very impressed by the mermaid diagram kodelet generated in the ADR:

graph TB
    subgraph "Developer Interaction"
        Dev[👤 Developer]
        Issue[📋 GitHub Issue]
    end

    subgraph "GitHub Platform"
        GH[🐙 GitHub]
        API[📡 GitHub API]
        Repo[📁 Repository]
        PR[🔀 Pull Request]
        Events[⚡ GitHub Events]
    end

    subgraph "GitHub Actions Runner"
        Runner[🏃 Actions Runner]
        Env[🛠️ Environment Setup]
        Kodelet[🤖 Kodelet CLI]
        Tools[🔧 Development Tools]
    end

    subgraph "Configuration"
        Workflow[📄 GitHub Actions Workflow]
        Secrets[🔑 GitHub Secrets]
    end

    %% Interaction Flow
    Dev -->|"@kodelet work on this"| Issue
    Issue -->|issue_comment event| Events
    Events -->|Trigger workflow with if condition| Runner

    %% Workflow Execution (validation done in if condition)
    Runner -->|Validation passed| Env
    Env -->|Clone & setup| Repo
    Env --> Kodelet
    Kodelet -->|Access secrets| Secrets
    Kodelet --> Tools

    %% Output Generation
    Kodelet -->|Commit changes| Repo
    Runner -->|Create PR| PR
    Runner -->|Update status| Issue

    %% Error Handling (if validation fails)
    Events -->|Validation failed| API

    %% Styling
    classDef user fill:#e1f5fe
    classDef github fill:#f3e5f5
    classDef runner fill:#fff3e0
    classDef config fill:#fce4ec

    class Dev user
    class GH,API,Repo,PR,Events,Issue github
    class Runner,Env,Kodelet,Tools runner
    class Workflow,Secrets config

Implementation

The implementation is actually very simple. As someone who had never built a GitHub Action, I vibe-coded the workflow using kodelet - https://github.com/jingkaihe/kodelet-action.

Essentially, what it does is:

Issues and issue comments mentioning @kodelet trigger a GitHub workflow that summons kodelet to: look at the issue, create a branch, plan the work, write the code and tests, commit the changes, raise a PR and, last but not least, update the issue with the PR link.

The following issue demonstrates the issue workflow:

issue-workflow
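For a feel of the wiring, here is a minimal sketch of what the issue-side trigger could look like in a repository's workflow file. The file name, workflow name, job name, `permissions` block and the job-level `if` filter are assumptions for illustration, not taken from kodelet-action, which may well do its own filtering. The event names, context fields and `contains()` are standard GitHub Actions syntax.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/kodelet-issue.yml
name: Kodelet on issues  # assumed name

on:
  issues:
    types: [opened]
  issue_comment:
    types: [created]

# Assumed permissions: the agent pushes a branch, opens a PR and comments on the issue.
permissions:
  contents: write
  issues: write
  pull-requests: write

jobs:
  kodelet:
    runs-on: ubuntu-latest
    # Run only when @kodelet is mentioned. issue_comment also fires for
    # comments on pull requests, so those are excluded here.
    if: >-
      (github.event_name == 'issues' &&
      contains(github.event.issue.body, '@kodelet')) ||
      (github.event_name == 'issue_comment' &&
      !github.event.issue.pull_request &&
      contains(github.event.comment.body, '@kodelet'))
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Run Kodelet
        uses: jingkaihe/kodelet-action@v0.1.4-alpha
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
```

On a public repository you would probably also want to restrict who can trigger the job, for example by checking `github.event.comment.author_association` in the same `if` condition.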

PR reviews and comments mentioning @kodelet trigger a GitHub workflow that summons kodelet to: look at the PR, check out the PR branch, plan the work, write the code, tests and docs, commit the changes and update the PR with a comment about the changes.

The following PR demonstrates the PR workflow:

pr-workflow
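The PR side can be sketched in the same spirit. GitHub delivers PR feedback through three different events, so a filter has to look at a different payload field for each. As before, the file name, workflow name, job name and `if` condition are assumptions for illustration; only the two steps come from this post.

```yaml
# Hypothetical workflow file, e.g. .github/workflows/kodelet-pr.yml
name: Kodelet on pull requests  # assumed name

on:
  issue_comment:                 # top-level comments in the PR conversation
    types: [created]
  pull_request_review_comment:   # inline comments on the diff
    types: [created]
  pull_request_review:           # submitted reviews
    types: [submitted]

jobs:
  kodelet:
    runs-on: ubuntu-latest
    # issue_comment covers both issues and PRs; the pull_request field is
    # only present in the payload when the comment was made on a PR.
    if: >-
      (github.event_name == 'issue_comment' &&
      github.event.issue.pull_request &&
      contains(github.event.comment.body, '@kodelet')) ||
      (github.event_name == 'pull_request_review_comment' &&
      contains(github.event.comment.body, '@kodelet')) ||
      (github.event_name == 'pull_request_review' &&
      contains(github.event.review.body, '@kodelet'))
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Run Kodelet
        uses: jingkaihe/kodelet-action@v0.1.4-alpha
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
```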

Conclusion

With a straightforward workflow like this:

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Setup your dev env
        run: |
          # setup your dev env/CI here, e.g. install dependencies, start the dev database etc
      - name: Run Kodelet
        uses: jingkaihe/kodelet-action@v0.1.4-alpha
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}

You can harness the power of parallel AI coding agents working on your behalf with minimal setup overhead. The interface is unreasonably simple: just mention @kodelet in a comment on any issue or pull request to delegate tasks to your AI agents.
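If several mentions land on the same issue or PR in quick succession, two runs could end up pushing to the same branch. One possible guard, sketched here as an assumption rather than something kodelet-action is known to require, is a job-level `concurrency` group keyed on the issue or PR number. The job name `kodelet` and the group prefix are hypothetical.

```yaml
# Assumed fragment to merge into the job that runs kodelet.
jobs:
  kodelet:
    concurrency:
      # One group per issue or PR number: different issues and PRs still
      # run in parallel, while runs for the same thread do not overlap.
      group: kodelet-${{ github.event.issue.number || github.event.pull_request.number }}
      # Let the in-progress run finish instead of cancelling it.
      cancel-in-progress: false
```

Note that GitHub keeps at most one pending run per concurrency group, so a burst of comments on one thread results in the latest pending run winning rather than a full queue.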

This approach transforms your CI infra into a scalable AI-powered engineering workforce, enabling you to tackle multiple coding tasks simultaneously while maintaining the familiar GitHub workflow you already know.

Acknowledgement

During my research for this post, I discovered that GitHub Copilot also leverages GitHub Actions to power its coding agents:

“While working on a coding task, Copilot has access to its own ephemeral development environment, powered by GitHub Actions, where it can explore your code, make changes, execute automated tests and linters and more.”

It turns out this approach is more common than I initially realised. It’s a testament to the natural fit between AI agents and CI systems ;)