
From Research Paper to Prototype: Using Generative AI to Automatically Generate Test Cases
About five years ago, I came across a research paper on Search-Based Software Testing (SBST) published in an IEEE venue. The idea was fascinating: instead of writing test cases manually, software testing could be treated as an optimization problem. Algorithms could explore the space of possible inputs and automatically discover test cases that maximize coverage and expose hidden defects. Conceptually, it felt like a glimpse into the future of testing. But there was a problem. ...
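As a rough sketch of that idea (not the paper's algorithm — the function under test, the fitness measure, and the mutation rate below are all illustrative assumptions), a minimal search-based test generator can repeatedly mutate a candidate test suite and keep any change that exercises more distinct behaviors of the code under test:

```python
import random

def triangle_type(a, b, c):
    """Toy function under test: classify a triangle by its side lengths."""
    if a <= 0 or b <= 0 or c <= 0:
        return "invalid"
    if a + b <= c or b + c <= a or a + c <= b:
        return "invalid"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

def fitness(suite):
    """Fitness = number of distinct behaviors the suite triggers.

    A real SBST tool would instrument the code and count covered
    branches; counting distinct return values is a stand-in here.
    """
    return len({triangle_type(*case) for case in suite})

def search(generations=300, suite_size=20, seed=0):
    """Random-mutation hill climbing over candidate test suites."""
    rng = random.Random(seed)
    rand_case = lambda: tuple(rng.randint(-2, 10) for _ in range(3))
    best = [rand_case() for _ in range(suite_size)]
    for _ in range(generations):
        # Mutate: replace ~30% of the test cases with fresh random inputs.
        candidate = [rand_case() if rng.random() < 0.3 else case
                     for case in best]
        # Keep the mutation if it covers at least as many behaviors.
        if fitness(candidate) >= fitness(best):
            best = candidate
    return best

suite = search()
```

With enough generations the search tends to discover inputs for all four outcomes, including the rare all-sides-equal case a human might forget to write by hand — which is the appeal of the optimization framing.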

Why Feature Parity Bugs Are Architectural, Not Testing Failures
This article is part of a series on behavioral consistency in software systems. Previously: The Doppelgänger Dilemma — Why Apps Drift. QA reports: “Android works. iOS fails.” The backend logs show success. Payloads look identical. Nothing crashes. Nothing obvious is broken. Yet the system behaves differently across platforms. The instinctive response is procedural: expand regression coverage, add cross-platform test matrices, increase release coordination, tighten QA cycles. These actions feel responsible. They feel disciplined. ...

The Doppelgänger Dilemma: Why Your Mobile Apps Look Alike but Act Like Strangers
Most mobile teams don’t ship one app. They ship two apps that slowly disagree. A validation rule changes on Android. iOS ships it two sprints later. Weeks afterward, users report “random failures” but nothing is actually broken. The platforms simply made different decisions. I call this the Doppelgänger Dilemma: apps that look identical in the store, yet behave like strangers in production. In mobile engineering, the hardest problem is not performance or UI. ...