
Agentic Legacy Batch Modernization

Confidential client · U.S. insurance and benefits administrator

Convective rebuilt 33 core operational systems as modern Ruby on Rails and Temporal applications and validated each one against the original before cutover, using one AI model to generate the code and an independent model from a second vendor to review and verify it.

Insurance & Benefits · Legacy Modernization · AI-Accelerated Engineering

The Challenge

A U.S. insurance and benefits administrator ran its operations on a large set of core operational systems: automated programs that produced the enrollment and cost reports the business relied on, moved data between its internal systems, and generated the files its outside partners required. Most were written in Perl, with some in Ruby and Bash, and they ran on a schedule against SQL Server and MySQL databases. Many processed regulated health and personal data subject to HIPAA. Several had been in continuous use for more than 25 years.

The systems were stable but increasingly difficult to maintain. The original developers had left the organization, and the one remaining engineer with working knowledge of the code was approaching retirement. No other staff member had the Perl experience to support these systems or could fully document what they did.

Rewriting them by hand carried significant risk. A manual rewrite of undocumented code that processes regulated data can introduce errors that are not caught until they affect downstream reporting. The objective was to rebuild each system on a modern, testable platform and to demonstrate, one at a time, that the new implementation produced the same results as the original before it was placed into production.

The Approach

Convective built an automated migration pipeline that rebuilds a single legacy system as a modern Ruby on Rails and Temporal application and verifies it before release. Each system moves through five stages, orchestrated as a durable Temporal workflow so that work resumes without loss if a process is interrupted. The pipeline uses two AI models from different vendors: one to generate the new code, and a second, independent model to review, security-check, and validate it.

Figure: the five-stage migration pipeline. Claude reverse-engineers and builds each legacy operational system; an independent OpenAI model reviews, security-checks, and validates it side by side against the original, with automatic rework loops, on durable Temporal orchestration.
  1. Reverse-engineer

    The pipeline analyzes the legacy code and produces a structured specification of the system's inputs, outputs, side effects, schedule, and business rules, along with a record of any ambiguities. This specification becomes the reference for every later stage.

  2. Build

    From the specification, the pipeline generates a new Ruby on Rails and Temporal application with an automated test suite and commits it to a repository.

  3. Code review

    An independent model from a second vendor reviews the generated code for correctness, adherence to engineering standards, and test coverage, using the original source, not the generated code alone, as its reference. Findings that block release return the system to the build stage automatically.

  4. Security review

    The independent model also checks the application against HIPAA-relevant requirements, including secret handling, encryption of connections that carry regulated data, audit logging, and safe handling of files that contain personal information. A critical finding halts the migration of that system.

  5. Parity validation

    The pipeline runs the legacy system and the new one against a live database and compares all observable output, including files, database writes, emails, log entries, and exit codes. Test data is backed up and restored around each run. Each difference is classified as a defect, an intended modernization, or an environmental factor, and the system is released only when every difference is accounted for.
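The control flow of stages 2 through 5 can be sketched as a loop. This is a minimal illustration in plain Python, not the Temporal workflow Convective built: the names `Verdict`, `StageResult`, and `run_pipeline`, the `max_rework` limit, and the `needs_human` outcome are all assumptions made for the example, and stage 1 is assumed to have already produced the specification.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Verdict(Enum):
    PASS = "pass"
    REWORK = "rework"  # blocking finding: return to the build stage
    STOP = "stop"      # critical finding: halt this system


@dataclass
class StageResult:
    verdict: Verdict
    notes: list[str] = field(default_factory=list)


Check = Callable[[dict], StageResult]


def run_pipeline(
    build: Callable[[int], dict],
    checks: list[tuple[str, Check]],
    max_rework: int = 3,  # assumed limit; the source only says "several" cycles
) -> tuple[str, list[str]]:
    """Build, then run each check in order; rebuild on REWORK, halt on STOP."""
    history: list[str] = []
    for attempt in range(1, max_rework + 2):  # first build plus max_rework retries
        artifact = build(attempt)
        history.append(f"build#{attempt}")
        outcome = Verdict.PASS
        for name, check in checks:
            result = check(artifact)
            history.append(f"{name}:{result.verdict.value}")
            if result.verdict is not Verdict.PASS:
                outcome = result.verdict
                break  # later stages do not run on a failed artifact
        if outcome is Verdict.PASS:
            return "released", history
        if outcome is Verdict.STOP:
            return "stopped", history
    return "needs_human", history  # hypothetical escalation once retries run out


# Example with stub stages: the first build fails code review, the second passes.
def demo_build(attempt: int) -> dict:
    return {"attempt": attempt}


def demo_code_review(artifact: dict) -> StageResult:
    ok = artifact["attempt"] >= 2
    return StageResult(Verdict.PASS if ok else Verdict.REWORK)


def demo_pass(artifact: dict) -> StageResult:
    return StageResult(Verdict.PASS)


status, history = run_pipeline(
    demo_build,
    [("code_review", demo_code_review),
     ("security_review", demo_pass),
     ("parity", demo_pass)],
)
print(status, history)
```

In a durable orchestrator each `build` and `check` call would typically be a separately recorded step, so an interrupted run can resume from the last completed stage rather than starting over.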

Independent verification across vendors

The pipeline separates code generation from code review by assigning them to AI models from different vendors. Models from the same family tend to share limitations and can repeat one another's errors. Having a model from a second vendor review and validate the generated code, and giving that reviewer the original source to check against, not just the new implementation, reduces the chance that a defect passes through unexamined.

On one system, the cross-vendor review identified five correctness issues in the generated code that an earlier same-vendor review had not flagged. That outcome is the basis for the pipeline's design.

Example: a monthly reporting system

A complete-looking rebuild that was missing three reports

One of the migrated systems produced a monthly reporting workbook used by leadership and finance to track enrollment and plan costs by plan, vendor, and region. The original was more than two thousand lines of Perl, with business logic embedded in hand-written SQL and no automated tests. No current staff member could fully explain its behavior.

The pipeline reverse-engineered the system, identified it as a sequence of five steps rather than a single script, and generated a modern implementation with a test suite. The generated version passed its own tests and produced a complete-looking workbook. Comparing against the original source, the validation stage found that three reports the legacy system produced were missing from the new output, including the enrollment summary and a per-employee cost worksheet. Because the omission did not cause an error, it would not have been apparent until a downstream review noticed the absent figures.

The pipeline returned the system for rework, regenerated the missing logic, and confirmed a match against the original, including a long-standing formatting irregularity in one report header that was retained to preserve identical output. The system was among the more complex in the set and required several rework cycles. It was released as validated, with two documented caveats: because it rebuilds a table in place, a destructive row-level comparison against live data was not run, and test coverage fell below the pipeline's standard threshold. The full test suite passed and the output matched the original.
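The omission in this example raised no error, so it only shows up when the two sets of outputs are compared by name as well as by content. The sketch below illustrates that idea under stated assumptions: `compare_outputs`, `release_allowed`, the worksheet names, and the rule that an unclassified difference or an open defect blocks release are all hypothetical, chosen to mirror the three classifications described under Parity validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    name: str
    kind: str  # "missing", "unexpected", or "changed"


def compare_outputs(legacy: dict[str, bytes], rebuilt: dict[str, bytes]) -> list[Difference]:
    """Compare two runs by output name first, then by content."""
    diffs: list[Difference] = []
    for name in sorted(legacy.keys() | rebuilt.keys()):
        if name not in rebuilt:
            diffs.append(Difference(name, "missing"))     # legacy produced it, rebuild did not
        elif name not in legacy:
            diffs.append(Difference(name, "unexpected"))  # rebuild produced something new
        elif legacy[name] != rebuilt[name]:
            diffs.append(Difference(name, "changed"))
    return diffs


def release_allowed(diffs: list[Difference], classifications: dict[str, str]) -> bool:
    """Allow release only if every difference is classified and none is an open defect.

    classifications maps an output name to "defect", "intended_modernization",
    or "environmental"; treating "defect" as blocking is an assumption.
    """
    for diff in diffs:
        label = classifications.get(diff.name)
        if label is None or label == "defect":
            return False
    return True


# Hypothetical worksheet names; the rebuilt run silently omits two of them.
legacy_run = {"cost_by_plan": b"a", "enrollment_summary": b"b", "per_employee_cost": b"c"}
rebuilt_run = {"cost_by_plan": b"a"}

diffs = compare_outputs(legacy_run, rebuilt_run)
for diff in diffs:
    print(diff.kind, diff.name)
print("release allowed:", release_allowed(diffs, {}))
```

Iterating over the union of names is the design choice that matters here: a comparison driven only by the rebuilt system's outputs would report a perfect match for the one worksheet it did produce.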

Additional findings

  • On a system that processed employee records, the generated parser read an identifier from the wrong position in a fixed-width record. The output would not have failed, but it would have associated data with the wrong records. The independent review identified the error against the specification.
  • On the same system, the generated code applied restrictive file permissions only to newly created files, which could have left existing files containing personal data more broadly readable than intended. The security review flagged the issue before release.
  • A regulated rate value entered with a single decimal was being reformatted to two decimals. The existing test data did not surface the issue; a boundary test case generated by the pipeline did, and the system was corrected before release.
  • A generated application was configured to point at the wrong database environment. The misconfiguration was caught before the application ran against that environment.
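Three of the findings above are small enough to sketch. The code below is an illustration only, not the generated applications' code (which are Ruby): the record layout, field names, sample values, and the `0o600` mode are all hypothetical, and the permission check assumes a POSIX file system.

```python
import os
import stat
import tempfile
from decimal import Decimal

# Hypothetical fixed-width layout: field name -> (start, end) offsets, end exclusive.
LAYOUT = {"record_type": (0, 2), "employee_id": (2, 11), "plan_code": (11, 15)}


def parse_record(line: str) -> dict[str, str]:
    """Slice every field from one declared layout, so offsets can be reviewed in one place."""
    return {name: line[start:end].strip() for name, (start, end) in LAYOUT.items()}


def format_rate(raw: str) -> str:
    """Return the rate with the same number of decimal places it was entered with.

    Decimal preserves the scale of its input; a non-numeric string raises
    decimal.InvalidOperation.
    """
    return str(Decimal(raw))


def restrict_permissions(directory: str, mode: int = 0o600) -> None:
    """Apply the restrictive mode to every regular file, not only newly created ones."""
    for entry in os.scandir(directory):
        if entry.is_file(follow_symlinks=False):
            os.chmod(entry.path, mode)


# Fixed-width parsing: the identifier occupies columns 2 through 10.
record = parse_record("EN000012345PPO1")

# Boundary case: a single-decimal rate must not gain a trailing zero.
one_decimal = format_rate("4.5")
naive = f"{float('4.5'):.2f}"  # the reformatting behavior the finding describes

# Permissions: a file that already exists must be tightened as well.
with tempfile.TemporaryDirectory() as workdir:
    existing = os.path.join(workdir, "existing.dat")
    with open(existing, "w") as handle:
        handle.write("sample")
    os.chmod(existing, 0o644)  # simulate a pre-existing, broadly readable file
    restrict_permissions(workdir)
    mode_after = stat.S_IMODE(os.stat(existing).st_mode)

print(record["employee_id"], one_decimal, naive, oct(mode_after))
```

None of these defects would raise an exception on its own, which is why each needs an explicit assertion or comparison rather than a run that merely completes.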

The pipeline also applied specific modernizations where appropriate, such as converting a plain-text report email to formatted HTML and refining a date comparison. Where existing behavior was relied on downstream, it was preserved deliberately so that output remained identical.

33 core operational systems across eight application families were rebuilt and validated against the originals. A single system moved through the pipeline in about an hour; the full set was completed in roughly twelve hours with systems running concurrently. Comparable manual work would have required months of specialized effort the organization no longer had on staff.

Results

  • All 33 core operational systems were rebuilt as modern Ruby on Rails and Temporal applications across eight consolidated families and placed into production in place of the originals.
  • Each system was validated by side-by-side comparison against the original before cutover. The largest family, 15 systems, was additionally validated through a full manual parity review.
  • Every generated system was reviewed, security-checked, and validated by an independent AI model from a second vendor.
  • Behavioral and HIPAA-related issues that a manual rewrite or a single-vendor process could have missed were identified and corrected before release.
  • Code that had been maintained by a single departing engineer, some of it more than 25 years old, is now documented, tested, and maintainable by the current team.

Built With

Ruby on Rails · Temporal · Claude (Anthropic) · OpenAI · Multi-Agent Orchestration · SQL Server · MySQL

Modernizing a system no one fully understands?

We rebuild legacy systems on modern platforms and verify the result against the original before anything goes live. Tell us what you are running and we will outline an approach.