JAMR Game Labs Developer Blog

JAMR Game Labs Developer Blog

Home
Archive
About

AI Enabled Engineering

A Senior Engineer's Day to Day

Mike Rogers's avatar
Mike Rogers
Jun 03, 2026
Cross-posted by JAMR Game Labs Developer Blog
"I wanted share a bit about what my day to day looks like using AI coding agents as a senior software engineer, because there's a lot more to it than "Here's a document, go write my code." This is a bit of a weird place to break the radio silence, but I'm having a lot of fun with this project, so I thought I'd share."
- Mike Rogers
black flat screen computer monitor
Photo by Fotis Fotopoulos on Unsplash

Finding stock images for code is a treat, because I find myself more worried about the code in the image than I probably should be. Like if I pick an image with terrible code, someone’s going to think I’m a terrible engineer.

I have enough trouble with impostor syndrome,1 I don’t need that stress on my plate too.

In all honesty, when I’m not spiraling inside my own head, I like to think I’m good at my job. I don’t have all the answers, but no one does.

When it comes to AI development, I was a huge skeptic. People started talking about vibe coding, and it made me more than a little uneasy.

Folks who have worked with me, or, honestly, spent any length of time with me, will all tell you that I am a very opinionated person. There are a lot of hills I’m willing to die on, and for a long stretch, AI development was one of them.

It couldn't possibly be that useful. There’s no way.

Around the start of 2026, the contract I was assigned to took an unexpected detour, and I found myself with a lot more time for professional development, so I decided to give AI a chance.

Maybe I just happened to pick it up at the right time, but the result was not what I expected.

Coding is not a Creative Endeavor

Coding isn’t creative. At least not in the same sense as books or art. Coding is pattern recognition and repetition. Creativity surfaces in how you stitch those patterns together to solve a particular problem.

Software engineers are constantly reusing other people’s code. Languages have entire ecosystems of libraries and frameworks that we use to do our job. It’s not just acceptable, it’s encouraged. Don’t reinvent the wheel. If someone else already solved the problem, you don’t need to solve it again.

If the AI agent is going to scan StackOverflow for an answer to my question and drop a code snippet into my file and I can count on that snippet being even mostly right, I’m good with that (at least as a starting point).

When it comes to coding, the AI agent just makes you faster. It's not a magic wand, and it doesn’t have all the answers.

Set Boundaries

The AI can’t do everything, and honestly, it shouldn’t. If you’re going to dive into AI enabled development, it’s a good idea to start by setting some ground rules. Decide up front (as best as you can) what types of tasks you’re going to offload to the AI, and then stick to that.

For example, I don’t let the agent make commits for me. I know a lot of people do, but in my experience, I tend to keep better track of changes when I write the commit messages myself, because I instinctively spot check diffs before I do it.


My Go To Tasks

I give the AI the boring stuff: Lint errors, dependency updates, stale test cleanup, etc. None of these tasks are hard, but they can be time consuming, so it makes perfect sense to hand them off to the tool that can do it quickly.

Here’s a good example:

On my project at work (the day job), I’m working on a couple python applications. During the initial pilot phase, GitLab reported a vulnerability in one of the project’s main dependencies.

I updated the app to the suggested fix version, and my local environment fell into a crash loop. It turned out that we had a few other dependencies in the project that required the older version of the one I updated.

I handed the list of dependencies off to Codex2, along with the security finding, and said “I need you to find compatible versions of these dependencies that resolve the finding.”

The agent spun up multiple parallel threads and tried running the app against multiple variations of the dependency versions simultaneously until it found one that worked. It followed the exact same workflow I would have, but it only took the agent 2 minutes to reach the desired result.

It ran in the background while I kept working on feature tickets.

This is where the tool really shines. I let the AI worry about test coverage, or small TODO tasks that it can execute while I’m working on other aspects of the code.

Focused Task Delegation

AI agents thrive when they have a narrow focus with clearly defined expectations. When I hand something off to the AI, I have an explicit roadmap for it to follow. Sometimes its a PRD with clearly defined implementation phases, and sometimes it's an inline TODO.

The key is to give the agent a clear starting point.

Example time:

One of the bigger things I’m working on with this project is AI opponents. I’m not talking about letting LLMs play the game (not yet, anyway). It’s more like a “decision tree.”

While working on Summits, I ran into a bug with one of the game’s components during the AI turn. Summits has special cards that trigger actions when drawn, and those actions need to be revealed to everyone at the table.

The initial component that the AI built wasn’t working properly, so I needed to change it.

Original Props:

type DrawnCardWindowProps = {
  cardRef: CardRef;
  /** When true, always shows card back regardless of card identity (AI draws). */
  isFaceDown?: boolean;
  /** When true, hides all interactive controls (AI turns — observe only). */
  isObserveOnly?: boolean;
  // ...
};

The DrawnCardWindow is a modal window used to reveal a card when drawn so that cards are easily discernable even on smaller screens. The two boolean values create some unnecessarily complex conditionals, so I asked the AI agent to tighten them up:

There is a bug with action cards during the AI turn. Action cards are not properly revealed
to the user. When an action card is drawn, that card is revealed publicly, because it's
played directly to the discard pile.
  
Due to screen size limitations, in this game, action cards are revealed by DrawnCardWindow 
(src\game-bundle\components\DrawnCardWindow.tsx). We can fix this by tuning a couple of the
props in that component.
  
* `isFaceDown` and `isObserveOnly` can be rolled into a single `isHumanTurn`.
* if isHumanTurn --> hide interactive controls
* if isHumanTurn || card.type is action card --> show card face -- else --> show card back 

This change simplified the logic, and made the meaning of the props more transparent. I engineered the fix, and then handed it off to the AI to make the changes.


Control the Context Window

red and white floppy disk on white surface
Photo by Fredy Jacob on Unsplash

When you're working with AI, you have to pay attention to the context window. The longer a thread goes on, the less precise the agent becomes. It takes longer to sift through the content, and creates a wider margin of error for misinterpretation.

If you stretch a thread too far, the agent will eventually compact the conversation to free up its context window. It might be easy to see that as a sort of clean slate, but compacted conversations are kind of lossy compression.

The AI chooses what to remember. It has to decide what details are important, and sometimes it chooses the wrong thing.

New Threads, Early & Often

As a general rule of thumb, I try to maintain 1 thread per task. Once a task is finished, I'll clear the context and spin up a new thread with an actual clean slate.

This is a little bit of a balancing act, though. Sometimes I'll run several small tasks in the same thread (especially for more iterative work). The real key is that I make a point to “finish" my work before the agent starts compacting anything. If I see the context window reaching its limit, I try to find a good stopping point as soon as possible.


Restrict Access

white and red do not enter signboard
Photo by Sanchez Amezcua on Unsplash

The AI agent will overstep at some point. It's inevitable, but vague prompts will make it worse. If you don't want the AI to touch something, be explicit about it.

When I first started using AI for coding tasks, I had to correct a lot of mistakes, because it misunderstood a prompt.

I'd be fiddling with some refactor trying to compare options, and when I asked the AI for input, it would go sprinkle a smattering of updates across the app in an attempt to finish what I started.

In some cases, it even undid my changes under the premise that it was “fixing bugs.”

That’s the important detail here. The AI likes to guess. AI coding agents are code completion tools. It’s just like IntelliJ suggesting method names as you type. AI just has a larger context window and a bigger data set to use for that predictive feature.

The (potential) problem is that if it's not sure about something, it’ll make an assumption based on the information it has available. That’s when you tumble down the rabbit hole into Wonderland.

Human engineers do the exact same thing. If you don’t know the answer, you research, you weigh options, and you make a decision based on the data. If a ticket is unclear, you reach out to the lead engineer or the PM, and you ask questions.

Coding in a vacuum leads to mistakes.

Senior engineers have all had conversations with junior team members about this.3 It’s a teaching moment. Guessing wrong creates more work for the team, so ask questions.

Crafting Prompts

Left to their own devices, AI agents don’t ask questions, so you have to be diligent about scoping the work and setting expectations for what the agent should do if it runs into missing information.

When I assign a task with any amount of complexity, I tend to end my prompt with something like the following:

Do not make assumptions, and do not guess. If any part of this request is unclear, stop and ask me clarifying questions. I am the source of truth for resolving ambiguity.

A few well placed constraints can go a long way.

Read Only Review:

For the sake of this task, I want you to treat {insert path} as READ ONLY. Do not make changes.

Filter White Noise:

This is an in process refactor. For the sake of this review, I am only concerned with gaps or bugs in the source. Ignore tests. We will address those in a future pass

Focused Test Updates:

I've made some changes to {insert process}, which left a number of stale tests. Please update tests to reflect the latest process changes. For the sake of this task, treat /src as READ ONLY

Do not paper over real failures. If a bug in the source code would cause a test to fail, write the failing test anyway. I'll review the fallout.

This last one is rooted in a very real experience using Codex.4 I had made some substantial changes to a data import process, and asked Codex to add some test coverage for the new logic.

I had a bug in the process that stemmed from a typo on a field reference. It was something small like a self._setings instead of self._settings kind of mistake. Easy to do, and easy to miss in a spot check, but a well written test will find it every time.

Codex ran into the error while writing a test, and instead of flagging the error, it chose to monkeypatch the class with a _setings attribute.

“_setings does not exist on {type}, so I'm going to inject the field in order to get the tests to pass."5

Tell the Agent to Stop

The agents I've worked with spit status updates out to the console while they're working, so things like stupid test resolutions can be easy to spot. If you see the agent doing something wonky, tell it to stop. Interrupt the command and clarify the instructions, or tell it to try a different approach.

You don't have to wait for a task to finish. Most agents are pretty good at picking up where they left off, so don't hesitate. If it looks like the agent is about to wander off the reservation, interrupt it.

Plan First

Codex and Claude both have a “plan mode.” When you give the agent a task with plan mode active, instead of running off to write code, it will write up an action plan for implementation.

That plan will call out what files it intends to create or modify, and provide a summary of the changes it intends to make. If you’re working with gspec (which, again, I strongly suggest you do), it will also call out which PRD capabilities it will check off once the plan is implemented.

Plan mode will help you spot issues early, because you’ll get a summary of what the agent intends to do before it actually does anything. If you spot any mistakes, you can ask for changes or provide more information. The agent will continue to iterate on the plan until you’re statisfied with it.


Not All Agents Are Created Equal

white and brown concrete house
Photo by Florian Schmetz on Unsplash

This is a matter of opinion, but I think its an opinion worth sharing:

In my experience, Claude is my agent of choice for coding tasks. The gap between Claude and Codex is large. Larger than you'd probably think.

I use both pretty extensively day to day (Codex at the day job, and Claude for JAMR dev projects), and it's a night and day comparison.

Codex makes much bigger mistakes, and generally requires a lot more hand holding to get the job done. With Claude, I have no problem letting it run (mostly) in the background.

I work from home, so I'll give Claude a prompt in the morning, and then I'll check in at lunch time to review the results. I still need to massage the code a bit, but Codex regularly chokes on tasks of an equivalent size/scope.

I can count on Claude to be reliable enough that any cleanup I have to do will be manageable. I’ve had multiple instances where Codex was so far off on a particular task that I basically needed to start over.

I’ve figured out how to adapt to some of the limitations of Codex, so I can avoid the big problems with better prompts, but that slows down the workflow a bit too much for my tastes.

Claude also has a better memory. If I tell Claude to “never do { } again,” it does a pretty good job of remembering that.6 Codex has the memory of a goldfish.

If you're in a scenario where you're required to use Codex, it can definitely get the job done, but if you have a choice, choose Claude.7


Follow the Project

If you want to keep tabs on the project, this is the perfect time to subscribe. There’s a button right here:

And I’d be remiss if I didn’t take this opportunity to shamelessly plug some of JAMR’s other ongoing projects.

Stellar Empire and Sci-Fi Camp should keep you entertained if my update cadence lags a little bit because I’m hyper fixated on the 1s and 0s.

1

Fun fact: I have a film degree.

2

Anthropic is frowned upon at the day job. Sort of.

3

The good ones, at least.

4

Codex is not my agent of choice.

5

I wish I was kidding.

6

Within the context of a project.

7

I am not getting paid for this. There's no kick back for JAMR or for me personally. Claude is just better.

No posts

© 2026 JAMR, LLC · Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture