An AWS instance family we had been using since 2016 finally hit end-of-life, and our account manager gave us a strict migration deadline. Codex helped me through the transition.
This was the moment our technical debt caught up with us. I decided to deal with a 100k+ LOC legacy C++ codebase that has been around for 10 years.
This wasn’t just changing compilation flags.
The project had:
- GCC 7-era assumptions;
- CPU-specific instruction paths;
- CUDA 8-specific GPU code;
- Hardcoded optimizations for architectures that don’t even make sense anymore.
The goal was simple: make it run on modern ARM to save infrastructure cost (we already have reserved ARM capacity running) and move the GPU parts to the current CUDA 12+ APIs, all without breaking runtime behavior for customers.
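To make "CPU-specific instruction paths" concrete, here is a minimal sketch of the general pattern, not code from the project. It assumes a hypothetical hot loop (a dot product) that was originally written only with x86 SSE intrinsics, and shows one way to give it an ARM NEON path plus a scalar reference for parity checks. The names `dot_scalar`, `dot_simd`, `HAVE_SSE2` and `HAVE_NEON` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pick the SIMD header for the target architecture at compile time.
#if defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>  // x86 SSE/SSE2 intrinsics
  #define HAVE_SSE2 1
#elif defined(__ARM_NEON) || defined(__aarch64__)
  #include <arm_neon.h>   // ARM NEON intrinsics
  #define HAVE_NEON 1
#endif

// Portable reference implementation, used as the behavioral baseline.
float dot_scalar(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) acc += a[i] * b[i];
  return acc;
}

// Same computation with a 4-wide SIMD main loop where one is available.
// Falls back to the plain scalar loop on any other architecture.
float dot_simd(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::size_t i = 0;
  float acc = 0.0f;
#if defined(HAVE_SSE2)
  __m128 sum = _mm_setzero_ps();
  for (; i + 4 <= n; i += 4) {
    sum = _mm_add_ps(sum, _mm_mul_ps(_mm_loadu_ps(&a[i]), _mm_loadu_ps(&b[i])));
  }
  float lanes[4];
  _mm_storeu_ps(lanes, sum);
  acc = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#elif defined(HAVE_NEON)
  float32x4_t sum = vdupq_n_f32(0.0f);
  for (; i + 4 <= n; i += 4) {
    sum = vmlaq_f32(sum, vld1q_f32(&a[i]), vld1q_f32(&b[i]));  // sum += a * b
  }
  float lanes[4];
  vst1q_f32(lanes, sum);
  acc = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
  // Scalar tail (or the whole loop when no SIMD path was compiled in).
  for (; i < n; ++i) acc += a[i] * b[i];
  return acc;
}
```

One caveat with this kind of change: the SIMD paths add in a different order than the scalar loop, so floating-point results can differ in the last bits. That is why a parity check against the scalar baseline usually needs an explicit tolerance rather than exact equality.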
I initially considered porting only the parts still used by legacy customers into the newer successor project. That didn’t work. The components were too tightly coupled to the old architecture. Untangling them would have been harder than modernizing the original codebase. It simply didn’t fit the newer system’s design.
So:
No rewrites (unless absolutely necessary, as with some old-school CUDA files), no “upgrade and pray”, no dependency explosion.
Just careful, incremental changes with validation at every step. Luckily, the modules I was interested in were covered by tests.
I used the latest GPT-5.3 Codex primarily in the Codex app, and occasionally via CLI, with separate skill-specialized sessions for high-level codebase mapping, build/toolchain fixes, CPU and ARM compatibility changes, CUDA/GPU adjustments, regression testing, performance and memory validation, and deployment.
Key takeaway:
If you don’t constrain it, it will happily try to modernize your entire dependency tree, which is usually the wrong and more costly (even token-wise) move in legacy systems.
I took my time to carefully plan the steps and define the guardrails:
- Fixed compiler versions;
- Scoped change and access boundaries (keeping it out of critical IP);
- Strict behavior parity via tests.
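The behavior-parity guardrail can be sketched as a small comparison helper. This is an illustrative assumption about how such a check might look, not the project's actual test harness: `ParityResult` and `check_parity` are hypothetical names, and the idea is to compare output from the ported build against values recorded from the old build.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Outcome of a parity check: ok == true means every element matched;
// otherwise index is the first position that deviated.
struct ParityResult {
  bool ok;
  std::size_t index;
};

// Compares output from the ported build against a recorded baseline.
// An element matches if |golden - ported| <= abs_tol + rel_tol * |golden|.
// A NaN on either side counts as a mismatch, and so does a length difference.
ParityResult check_parity(const std::vector<double>& golden,
                          const std::vector<double>& ported,
                          double abs_tol, double rel_tol) {
  assert(abs_tol >= 0.0 && rel_tol >= 0.0);
  const std::size_t n = std::min(golden.size(), ported.size());
  for (std::size_t i = 0; i < n; ++i) {
    const double diff = std::fabs(golden[i] - ported[i]);
    const double tol = abs_tol + rel_tol * std::fabs(golden[i]);
    if (!(diff <= tol)) return ParityResult{false, i};  // also catches NaN
  }
  if (golden.size() != ported.size()) return ParityResult{false, n};
  return ParityResult{true, 0};
}
```

Running a check like this after every incremental change keeps the feedback loop tight: a failing index points at the first diverging value instead of a vague "the output looks different".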
Result:
Around 5 business days of work from planning to actually running it in production.
Realistically, this would have been at least a month of manual work, mostly because of the cognitive load of navigating the codebase and validating the edge cases.
The bonus was that Codex also found and fixed a 10-year-old memory leak hiding in the shadows.
The codebase in question was something the team was afraid to even touch. At some point, I was the only one who remembered how it worked.
Legacy modernization is no longer something you postpone for years.
With discipline + AI as a constrained assistant, it becomes possible.
AI won’t replace engineering judgement. It amplifies it, provided you know what you are doing.
Next experiment: repeat the same task with our locally running Kimi K2.5, and why not even benchmark it against the new GLM 5?