Architecture & Performance Review Process

Effective: April 2026
Team: MinuteMenu KidKare

Quick Summary

Dev

Expand the scope of regular PR code review: add an architecture and performance impact review, done by the developer together with a new agent (Agent 5 in code review). Every PR now includes an Infrastructure Impact Assessment.
Submit the agent implementation plan for review before executing major features (multi-week, multi-repo). Catch architecture concerns early, align on approach, and avoid costly rework. Keep Llewellyn and Harry in the loop so we can get their valuable feedback.
Provide a technical release note for every release. Dev lead prepares the note covering all tickets, infrastructure impact, deploy order, and rollback plan — reviewed by DevOps and architect before production deployment.

DevOps

Set up performance alerts so we identify issues sooner instead of hearing about them from client complaints, for example: CPU and RAM usage alerts, average API response time, database connection health, error rate spikes...
Maintain system headroom — infrastructure must absorb sudden spikes in a system with significant traffic variation throughout its claim cycle.

QA

Collaborate with dev on solution review for complex tickets. Share insight on performance from a business perspective — which screens are used most, which data volumes are realistic, what user patterns to expect...
Test with realistic data volumes — for features that interact with large datasets or change filtering and querying logic, QA tests with accounts that have significant data to make sure results are accurate and performance is acceptable under real conditions.

1. Why We Are Improving

We already have AI-assisted code review in place. It catches bugs, guideline violations, and ticket alignment issues. But a couple of recent production incidents — including a stored procedure performance regression and a login degradation after a major release — showed that we are missing architecture and infrastructure impact during review.

We are improving the process to:

Eliminate architecture and performance issues before they reach production.
Increase system reliability through better review at every stage.
Improve client satisfaction by preventing incidents rather than reacting to them.

 WHAT WE REVIEW TODAY              WHAT WE ARE ADDING
 ┌──────────────────────┐         ┌──────────────────────────────┐
 │ ✓ Code bugs          │         │ + Architecture changes       │
 │ ✓ Guideline rules    │         │ + Performance impact         │
 │ ✓ Git history context│         │ + Infrastructure stress      │
 │ ✓ Ticket alignment   │         │ + Cross-repo side effects    │
 │                      │         │ + Release-level impact       │
 └──────────────────────┘         └──────────────────────────────┘
  Existing Agents 1-4              Three new review gates

2. The Three Review Gates

Three review steps are added at different stages of the delivery process.

flowchart LR
    subgraph GATE 1
        A[Feature Plan Review]
    end
    subgraph GATE 2
        B[Code Review Agent 5]
    end
    subgraph GATE 3
        C[Release Review]
    end
    A -->|Plan approved| D[Dev builds tickets]
    D -->|PR created| B
    B -->|All tickets merged| E[QA testing]
    E -->|Testing passed| C
    C -->|DevOps + Architect approve| F[Production]

 WHEN EACH GATE RUNS
 ────────────────────────────────────────────────────────────────

 ┌─────────────┐    ┌─────────────┐    ┌──────────┐    ┌──────┐
 │ Plan Review │───>│ Code Review │───>│ Release  │───>│ Prod │
 │ (features)  │    │ (every PR)  │    │ Review   │    │      │
 └─────────────┘    └─────────────┘    └──────────┘    └──────┘
       │                   │                  │
  Major features      Every PR         Every release
  only (weeks/        Developer +       Dev lead provides
  months of work)     Agent review      release note

3. Gate 1: Feature Plan Review

What It Is

When developers work with AI coding assistants (Claude Code, GitHub Copilot, Antigravity), there is a plan mode where the agent works with the developer to create a comprehensive implementation plan. For major features, this plan must be reviewed and approved by stakeholders before the developer proceeds to execution.

The plan is committed to the docs repo as a PR so architect, dev lead, DevOps, and stakeholders can review it.

When Required

 REQUIRED                              NOT REQUIRED
 ┌─────────────────────────────┐      ┌─────────────────────────────┐
 │ Multi-week/month features   │      │ Individual tickets          │
 │ Changes to auth/payments    │      │ Bug fixes                   │
 │ Changes touching 3+ repos   │      │ Small enhancements          │
 │ Core flow changes (claims)  │      │ Config changes              │
 │                             │      │ UI text/style changes       │
 │ Examples:                   │      │                             │
 │ - SAML SSO integration      │      │ We deliver hundreds of      │
 │ - Adyen payment integration │      │ tickets per 2-week cycle.   │
 │ - Claims processor rewrite  │      │ Cannot review plans for all.│
 └─────────────────────────────┘      └─────────────────────────────┘
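The criteria in the table above can be sketched as a small check. This is only an illustration: the FeatureRequest fields, the area names, and the two-week cut-off for "multi-week" are assumptions, not part of the process definition.

```python
from dataclasses import dataclass, field

# Areas the table lists as always needing a plan review (names are assumed labels).
SENSITIVE_AREAS = {"auth", "payments", "claims"}


@dataclass
class FeatureRequest:
    # All fields are hypothetical; adapt them to however features are tracked.
    title: str
    estimated_weeks: float
    repos_touched: set[str] = field(default_factory=set)
    areas_touched: set[str] = field(default_factory=set)


def needs_plan_review(feature: FeatureRequest) -> bool:
    """Return True when any REQUIRED criterion from the table applies."""
    multi_week = feature.estimated_weeks >= 2  # assumed cut-off for "multi-week"
    many_repos = len(feature.repos_touched) >= 3
    sensitive = bool(feature.areas_touched & SENSITIVE_AREAS)
    return multi_week or many_repos or sensitive


sso = FeatureRequest("SAML SSO integration", 6, {"web", "api", "sso"}, {"auth"})
typo = FeatureRequest("Fix label typo", 0.1, {"web"}, {"ui"})
print(needs_plan_review(sso), needs_plan_review(typo))
```

In practice the dev lead makes this call. The sketch only shows that a single matching criterion is enough to require a plan review.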

How It Works

flowchart TD
    A[Feature assigned] --> B[Dev + Agent create plan in plan mode]
    B --> C[Commit plan to docs repo]
    C --> D[Open PR in docs repo]
    D --> E[Assign reviewers]
    E --> F{Approved?}
    F -->|No| G[Revise plan]
    G --> D
    F -->|Yes| H[Dev proceeds to execution]
    H --> I[Update plan with PR links as work progresses]

Reviewers: architect, dev lead, DevOps (if infra impact), client stakeholder (if needed).

4. Gate 2: Code Review — Architecture & Performance Agent

What It Is

A new agent (Agent 5) is added to the existing code review. It runs on every PR alongside the existing 4 agents. The developer reviews Agent 5's findings and triages them; we never rely on the agent alone.

How It Fits with Existing Code Review

 BEFORE (4 agents)                  NOW (5 agents)
 ┌──────────────────────────┐      ┌──────────────────────────────┐
 │ Agent 1: Guidelines      │      │ Agent 1: Guidelines          │
 │ Agent 2: Bug detection   │      │ Agent 2: Bug detection       │
 │ Agent 3: Git history     │      │ Agent 3: Git history         │
 │ Agent 4: Ticket alignment│      │ Agent 4: Ticket alignment    │
 │                          │      │ Agent 5: Architecture &      │
 │                          │      │          Performance ← NEW   │
 └──────────────────────────┘      └──────────────────────────────┘
                                          │
                                   Developer reviews findings
                                   and produces Infrastructure
                                   Impact Assessment to share
                                   with DevOps.

What Agent 5 Checks

Six categories of architecture and performance risk:

 CATEGORY                  WHAT IT LOOKS FOR
 ─────────────────────────────────────────────────────────────
 Scope Creep               Diff does significantly more than
                           the ticket asks for.

 Data Access Change        Materialized view → real-time query.
                           Indexed lookup → table scan.
                           Cached result → live query.

 Infrastructure Stress     Mass re-authentication. Session
                           invalidation. New background jobs
                           that hit the database.

 Timeout/Pool Changes      Connection timeout changes. Pool
                           size changes. Command timeout values.

 Removed Optimizations     Dropped indexes. Removed caching.
                           Batch → per-row operations.

 Cross-Repo Impact         Changes to SSO, shared DB, or
                           payment services used by other
                           products.
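Agent 5 is an AI reviewer and this document does not describe how it is implemented. Purely to illustrate the kind of signal two of these categories refer to, here is a minimal keyword pre-scan of a diff. The RISK_HINTS patterns, the sample diff, and the assumption of SQL Server style settings are all hypothetical.

```python
import re

# Hypothetical keyword hints per category. Real review relies on the agent
# plus developer judgement, not on string matching.
RISK_HINTS = {
    "Removed Optimizations": [r"\bDROP\s+INDEX\b"],
    "Timeout/Pool Changes": [
        r"\bCommandTimeout\b",
        r"\bMax Pool Size\b",
        r"\bConnect(ion)? Timeout\b",
    ],
}


def scan_diff(diff_text: str) -> dict[str, list[str]]:
    """Map each category to the changed lines that match one of its hints."""
    findings: dict[str, list[str]] = {}
    for line in diff_text.splitlines():
        # Only look at added or removed lines, not context or file headers.
        if not line.startswith(("+", "-")) or line.startswith(("+++", "---")):
            continue
        for category, patterns in RISK_HINTS.items():
            if any(re.search(p, line, re.IGNORECASE) for p in patterns):
                findings.setdefault(category, []).append(line)
    return findings


sample_diff = """\
--- a/migrations/0042.sql
+++ b/migrations/0042.sql
-CREATE INDEX ix_claims_date ON claims (claim_date);
+DROP INDEX ix_claims_date ON claims;
+cmd.CommandTimeout = 300;
"""
findings = scan_diff(sample_diff)
for category, lines in findings.items():
    print(category, lines)
```

A scan like this can only flag candidates. Categories such as Scope Creep or Cross-Repo Impact need the ticket and repository context that the agent and the developer bring.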

What It Produces

Every PR gets an Infrastructure Impact Assessment, even when no issues are found:

 INFRASTRUCTURE IMPACT ASSESSMENT
 ──────────────────────────────────────────────────
 Risk Level:           None | Low | Medium | High | Critical
 Summary:              One sentence.
 Details:              What changed, what tables/services affected.
 DevOps Action Needed: Yes / No
   If yes:             What specifically (VM, pool, index, etc.)

This section can be copied directly to share with DevOps.
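If the assessment needs to be produced or checked by tooling, its fields map onto a small record. The sketch below is one possible shape, not the format Agent 5 actually emits. The class and field names are assumptions, and an empty devops_action is taken to mean "No".

```python
from dataclasses import dataclass
from enum import IntEnum


class RiskLevel(IntEnum):
    NONE = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class ImpactAssessment:
    risk_level: RiskLevel
    summary: str
    details: str
    devops_action: str = ""  # assumption: empty string means no DevOps action

    def render(self) -> str:
        """Format the assessment as plain text for sharing with DevOps."""
        lines = [
            "INFRASTRUCTURE IMPACT ASSESSMENT",
            f"Risk Level:           {self.risk_level.name.title()}",
            f"Summary:              {self.summary}",
            f"Details:              {self.details}",
            f"DevOps Action Needed: {'Yes' if self.devops_action else 'No'}",
        ]
        if self.devops_action:
            lines.append(f"  If yes:             {self.devops_action}")
        return "\n".join(lines)


clean = ImpactAssessment(
    RiskLevel.NONE, "UI text change only.", "No tables or services affected."
)
print(clean.render())
```

Using an ordered enum for the risk level makes it easy to compare assessments across PRs later, for example when preparing the release note.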

Learning from Incidents

Agent 5 references a known-patterns.md file with example patterns from past incidents. These are starting examples, not a complete list — the agent uses them as guidance to recognize similar risks, not as a strict checklist. The file is updated whenever a new incident reveals a useful pattern.

5. Gate 3: Release Review

What It Is

After all tickets merge to the release branch and QA testing passes, the dev lead provides a technical release note. DevOps and architect review it before production deployment.

When Required

Every release. No exceptions.

How It Works

flowchart TD
    A[All tickets merged to release branch] --> B[QA testing passes]
    B --> C[Dev lead prepares technical release note]
    C --> D[Commit to docs repo, open PR]
    D --> E[DevOps reviews]
    D --> F[Architect reviews]
    E --> G{Both approve?}
    F --> G
    G -->|No| H[Address concerns]
    H --> D
    G -->|Yes| I[Deploy to production]
    I --> J[Merge release to master]

What the Release Note Covers

 RELEASE NOTE SECTIONS
 ──────────────────────────────────────────────────────────────

 1. TICKETS IN THIS RELEASE
    Table: ticket ID, title, type, infrastructure impact level.

 2. INFRASTRUCTURE IMPACT (AGGREGATE)
    ┌─────────────────────────────────────────────────────────┐
    │ Database Changes      Schema, stored procs, indexes,    │
    │                       migration scripts                 │
    │                                                         │
    │ Auth/Session Changes  Login flow, tokens, re-auth       │
    │                       impact                            │
    │                                                         │
    │ API Changes           New/changed/removed endpoints,    │
    │                       traffic pattern changes           │
    │                                                         │
    │ Performance           Large table queries, removed      │
    │                       optimizations, new jobs           │
    │                                                         │
    │ Cross-Repo            Repos involved, deploy order,     │
    │                       rollback order                    │
    └─────────────────────────────────────────────────────────┘

 3. WEB CONFIGURATION CHANGES
    Table: service, key, value, notes.

 4. RISK ASSESSMENT
    Worst case. Likelihood. Detection. Rollback plan.

 5. DEPLOYMENT CHECKLIST
    DevOps review        ☐
    Architect review     ☐
    SQL scripts verified ☐
    Web config confirmed ☐
    Rollback plan ready  ☐
    Deploy approved      ☐

Release note template: docs/releases/template.md
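Sections 1 and 5 of the release note lend themselves to a quick mechanical check before the reviewers are pinged. The sketch below assumes the release's overall level is simply the highest per-ticket level, which is a simplification because several low-impact changes can add up. The ticket IDs and titles are made up for illustration.

```python
from dataclasses import dataclass

# Ordered from lowest to highest, matching the assessment's risk levels.
RISK_ORDER = ["None", "Low", "Medium", "High", "Critical"]


@dataclass
class TicketEntry:
    ticket_id: str  # hypothetical IDs are used in the example below
    title: str
    impact: str     # one of RISK_ORDER


def release_risk(tickets: list[TicketEntry]) -> str:
    """Return the highest impact level among the tickets ("None" if empty)."""
    return max((t.impact for t in tickets), key=RISK_ORDER.index, default="None")


def missing_checklist_items(checklist: dict[str, bool]) -> list[str]:
    """List the deployment checklist items that are still unticked."""
    return [item for item, done in checklist.items() if not done]


tickets = [
    TicketEntry("KK-101", "Update report footer text", "None"),
    TicketEntry("KK-102", "Rewrite claims summary stored procedure", "High"),
    TicketEntry("KK-103", "Add index on attendance table", "Low"),
]
checklist = {
    "DevOps review": True,
    "Architect review": True,
    "SQL scripts verified": False,
    "Web config confirmed": True,
    "Rollback plan ready": True,
    "Deploy approved": False,
}
print(release_risk(tickets))
print(missing_checklist_items(checklist))
```

A release is ready for production only when the list of missing items is empty. The aggregate level is a prompt for the reviewers, not a replacement for reading the note.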

6. How the Three Gates Work Together

Each gate catches different types of problems at different stages.

 STAGE          GATE              CATCHES                    WHO REVIEWS
 ──────────────────────────────────────────────────────────────────────
 Before coding  Plan Review       Wrong approach             Architect
                (features only)   Missing infrastructure     Dev Lead
                                  Scope misalignment         DevOps
                                                             Stakeholder

 Per PR         Code Review       Architecture changes       Developer
                Agent 5           Performance regression     + Agent
                (every PR)        Infrastructure stress

 Before deploy  Release Review    Aggregate impact           DevOps
                (every release)   Cross-repo coordination    Architect
                                  Deploy/rollback order
                                  Config changes

flowchart TB
    subgraph "Gate 1: Plan Review (features only)"
        P1[Is the approach right?]
        P2[Infrastructure ready?]
        P3[Stakeholders aligned?]
    end
    subgraph "Gate 2: Code Review Agent 5 (every PR)"
        C1[Architecture changed?]
        C2[Performance impacted?]
        C3[Scope matches ticket?]
        C4[Infrastructure stress?]
    end
    subgraph "Gate 3: Release Review (every release)"
        R1[Aggregate infra impact?]
        R2[Deploy order correct?]
        R3[Rollback plan ready?]
        R4[DevOps + Architect OK?]
    end
    P1 --> D[Dev builds feature tickets]
    P2 --> D
    P3 --> D
    D --> C1
    D --> C2
    D --> C3
    D --> C4
    C1 --> T[QA Testing]
    C2 --> T
    C3 --> T
    C4 --> T
    T --> R1
    T --> R2
    T --> R3
    T --> R4
    R4 --> PROD[Production Deploy]

7. If an Issue Still Slips Through

Even with all three gates, an issue can still reach production. DevOps must ensure the following are in place so we detect and respond before clients are affected:

Performance alerts — API response time, database query duration, error rate spikes. The team should know about a problem within minutes, not hear about it from clients days later.
System headroom — infrastructure should handle sudden spikes without going down. If the system runs at capacity on a normal day, any spike becomes an outage with no time to respond.
Fail-fast timeouts — short command and connection timeouts so blocked requests release resources quickly instead of holding them for minutes.
Connection cleanup — automatic cleanup of idle connections so leaks do not accumulate silently.

DevOps will follow up with the specific alert thresholds, capacity targets, and monitoring setup.
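Until those numbers are published, the sketch below shows how an alert rule and a headroom figure could be evaluated. The metric name, the threshold, and the window are placeholders chosen for the example, not agreed values.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    metric: str
    threshold: float  # placeholder value; DevOps sets the real one
    window: int       # number of consecutive samples that must breach


def breached(rule: AlertRule, samples: list[float]) -> bool:
    """True when the last `window` samples all exceed the threshold."""
    recent = samples[-rule.window:]
    return len(recent) == rule.window and all(s > rule.threshold for s in recent)


def headroom(peak_usage: float, capacity: float) -> float:
    """Fraction of capacity still free at the observed peak."""
    return 1.0 - peak_usage / capacity


rule = AlertRule("api_response_ms", threshold=800.0, window=3)
print(breached(rule, [200.0, 900.0, 950.0, 1000.0]))  # sustained breach
print(breached(rule, [900.0, 950.0, 300.0]))          # recovered on last sample
print(headroom(peak_usage=75.0, capacity=100.0))
```

Requiring several consecutive breaches keeps a single slow request from paging anyone, while a sustained slowdown is still reported within a few sampling intervals.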

8. Summary

 ╔════════════════════════════════════════════════════════════════════════╗
 ║       ARCHITECTURE & PERFORMANCE REVIEW — SUMMARY                     ║
 ╠════════════════════════════════════════════════════════════════════════╣
 ║                                                                       ║
 ║  WHY:  Improve our review process to catch architecture and           ║
 ║        performance issues before production. Increase system          ║
 ║        reliability. Improve client satisfaction.                      ║
 ║                                                                       ║
 ║  DEV:                                                                 ║
 ║  - Expand PR code review scope with architecture + performance agent. ║
 ║  - Submit agent plan for review before executing major features.      ║
 ║  - Provide technical release note for every release.                  ║
 ║                                                                       ║
 ║  DEVOPS:                                                              ║
 ║  - Set up performance alerts — detect issues in minutes, not days.    ║
 ║  - Maintain system headroom for significant traffic variation.        ║
 ║                                                                       ║
 ║  QA:                                                                  ║
 ║  - Collaborate with dev on solution review for complex tickets.       ║
 ║  - Test with realistic data volumes for data-heavy features.          ║
 ║                                                                       ║
 ║  TEMPLATES:                                                           ║
 ║  - Plan template: docs/plans/template.md                              ║
 ║  - Release note template: docs/releases/template.md                   ║
 ║                                                                       ║
 ╚════════════════════════════════════════════════════════════════════════╝