so that happened.
the webmatrices admin dropped my piece on a developer forum. 545 upvotes. 103 comments. 3,800 readers here. average read time: 4 minutes 26 seconds.
i didn't expect that.
i also didn't expect some of the pushback. a few comments made me rethink parts of the argument. others missed the point entirely. and a handful pointed me toward research i hadn't seen — which made the problem worse than i originally thought.
let me address all of it.
"the 12x ratio assumes a worst-case scenario"
fair criticism.
one commenter wrote:
"Contributor produces slop (5m). Maintainer recognizes slop, hits reject (5m). This is such a ridiculous, fictional, worst case possible case study."
he's partially right. if maintainers rejected slop immediately, the ratio would collapse. 5 minutes vs 5 minutes. 1:1.
but here's what the criticism misses: maintainers don't want to reject good-faith contributions.
the whole point of open source is collaboration. when someone submits a PR, the maintainer's instinct is to help them improve it — not slam the door. that's how the ecosystem grew.
the 12x ratio isn't worst-case. it's what happens when maintainers treat AI-generated PRs the same way they'd treat a junior developer's first contribution. with patience. with feedback. with the assumption that the person will learn.
except they don't learn. they regenerate.
one maintainer on a small node.js library put it perfectly:
"i used to get maybe 1-2 PRs per 3 months. now my lib is having good times, someone thought of developing an AI agent to better my lib. i dunno what they get out of it. but it's fun rejecting them."
he's learned to reject fast. most maintainers haven't.
the fix: i should have been clearer that the 12x ratio assumes good faith on the maintainer's side. if you want to survive, reject faster. your time is worth more than their feelings.
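to make the arithmetic explicit: 12x on a roughly 5-minute PR means about an hour of maintainer time per submission. here's a toy model of the two strategies. the 5-minute and 12x figures come from the original piece; how the hour splits into review rounds is my assumption, not measured data.

```python
# toy model of the two maintainer strategies. GEN_MINUTES and the 12x
# outcome come from the original piece; the coaching breakdown is an
# assumed split of that hour, not measured data.
GEN_MINUTES = 5.0  # contributor time to generate the PR

def maintainer_minutes(strategy: str) -> float:
    if strategy == "reject_fast":
        return 5.0  # skim, recognize slop, close. 1:1.
    # good faith: review plus written feedback, twice, because the
    # reply to feedback is a regeneration, not a revision. then a
    # final pass to close or merge.
    review, feedback, rounds, final = 15.0, 10.0, 2, 10.0
    return (review + feedback) * rounds + final  # 60 minutes

for strategy in ("reject_fast", "good_faith"):
    print(f"{strategy}: {maintainer_minutes(strategy) / GEN_MINUTES:.0f}x")
```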
"this is just junior developer behavior with extra steps"
another commenter:
"This happened before AI. I would review the PR of a junior/apprentice, then the next PR is completely different because he thought of a better idea."
true. the pattern isn't new.
but there's a critical difference someone else pointed out:
"a junior dev can be taught to stop doing that. while on the other hand..."
a junior learns. they internalize feedback. they get better over time. the investment in reviewing their early PRs pays off.
an AI-assisted contributor who doesn't understand the code? they feed your feedback into the model and get a completely different output. there's no learning. there's no compounding improvement. just infinite regeneration.
the fix: nothing. the criticism is valid but the conclusion stands. the economics are different when the contributor can't learn from feedback.
"45% of AI-generated code contains security flaws"
this is the part where the problem got worse.
after the piece went out, multiple people pointed me to research i hadn't seen. i spent the last few days going through it.
veracode's 2025 GenAI Code Security report:
tested 100 leading LLMs across 80 curated tasks
approximately 45% of AI code generation tasks introduce a known security flaw
no real improvement across newer or larger models
tenzai's december 2025 assessment:
compared five vibe coding tools (claude code, openai codex, cursor, replit, devin)
built the same three test applications with each
69 vulnerabilities total across 15 applications. around 45 rated 'low-medium' in severity, with many of the remainder rated 'high' and around half a dozen 'critical'
escape's analysis of vibe-coded applications:
analyzed over 5,600 publicly available applications and identified more than 2,000 vulnerabilities, 400+ exposed secrets, and 175 instances of PII including medical records, IBANs, phone numbers, and emails
unit 42's state of cloud security report:
AI agents are now used in software development by 99% of organizations
insecure code is being generated faster than organizations can detect or remediate
the tenzai finding is particularly interesting. the tools are good at avoiding security flaws that have a generic fix, but they struggle when what separates safe from dangerous depends on application context.
in other words: AI doesn't make obvious mistakes. it makes subtle ones. the kind that pass code review.
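here's a hedged illustration of what "context-dependent" means. the endpoint below is hypothetical (the route, the g.db handle, and the schema are all invented for the example), but the pattern matches what the reports describe: nothing generic is wrong with it, and the flaw only exists because of a fact about the application that never appears in the prompt.

```python
# hypothetical flask endpoint of the kind the reports describe. a
# generic scanner passes it: the query is parameterized, there is no
# eval, no hardcoded secret. (g.db and the schema are invented here.)
from flask import Flask, g, jsonify

app = Flask(__name__)

@app.get("/invoices/<int:invoice_id>")
def get_invoice(invoice_id: int):
    row = g.db.execute(
        "SELECT * FROM invoices WHERE id = ?",  # parameterized, no injection
        (invoice_id,),
    ).fetchone()
    # the subtle flaw: any authenticated user can read any invoice by
    # iterating ids. the safe version needs context the model never saw,
    # namely that invoices belong to accounts:
    #   ... WHERE id = ? AND account_id = ?
    return jsonify(dict(row))
```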
"this is just DDoS with PRs"
someone called it exactly what it is:
"It's like a DDOS attack with PR's."
another commenter expanded:
"The 12x ratio also assumes 1 human person slowly submitting PRs, and not an army of vibe slop flooding your project. it could become a full-time job in of itself scanning & closing them for projects that are big enough."
this reframes the entire problem.
it's not about individual bad actors. it's about volume. when the cost of generating a PR approaches zero, the number of PRs approaches infinity. maintainer time is finite. the math doesn't work.
one maintainer proposed a solution:
"maybe github needs a way to flag people like this. Like if they get X% of their public PRs flagged by maintainers of that Repo, then they are marked, and repos can choose to block those people, or auto tag their PRs."
reputation systems. contributor scores. automated slop detection.
the tools don't exist yet. but they will have to.
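for what it's worth, the commenter's proposal is almost trivial to sketch. nothing below is a real github feature: the flagged-PR signal, the threshold, and the field names are all assumptions.

```python
# toy reputation check for the flagging proposal above. "flagged"
# counts are a hypothetical signal maintainers would have to supply;
# github exposes nothing like this today.
from dataclasses import dataclass

FLAG_THRESHOLD = 0.5  # assumed: mark contributors with >50% flagged PRs
MIN_HISTORY = 5       # assumed: don't score someone on their first PR

@dataclass
class Contributor:
    handle: str
    public_prs: int
    flagged_prs: int  # PRs maintainers marked as slop

def triage(c: Contributor) -> str:
    # repos could choose to block marked contributors outright, or
    # just auto-tag their PRs into a low-priority queue
    if c.public_prs >= MIN_HISTORY and c.flagged_prs / c.public_prs > FLAG_THRESHOLD:
        return "auto-tag"
    return "normal-queue"

print(triage(Contributor("slopbot", public_prs=40, flagged_prs=33)))  # auto-tag
```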
"AI is great for small-scale, non-maintainable code"
this was the most upvoted comment:
"From personal experience, AI is great for small scale code that is not designed to be maintainable, but poorly at bug fixing or following style guides. If I ask AI to make an Arduino project that drives NeoPixels in Christmas red/green colors, it works fine. If I ask AI to fix a bug in our work application without more context, it never works."
this is the nuance i should have included more of.
AI-assisted coding isn't binary. it's a spectrum:
works well:
prototypes
one-off scripts
throwaway code
isolated components with clear boundaries
works poorly:
bug fixes in existing codebases
contributions to projects you don't understand
anything that requires architectural context
anything someone else has to maintain
the commenter who nailed it:
"it makes code that I would use but wouldn't really maintain. it basically produces 'consumable code'."
consumable code. that's the term. code that works once, then gets thrown away. the problem is when consumable code gets committed to long-lived projects.
"we're turning into software janitors"
"We're really turning into software janitors aren't we..."
yeah. we are.
another commenter added:
"Juniors can generate vibe coded trash with lots of suspect tests and create a PR very quickly. Now the more skilled senior spends all afternoon discovering all the bad practices and useless tests and coaching the junior as to how to fix them. It's such a wasteful cycle."
and someone pointed out the difference:
"At least a human junior coder will learn from this. AI will quite happily do the same things wrong again in the next vibe coding session."
the senior-to-janitor pipeline is real. the question is whether it's temporary or permanent.
my bet: temporary. either the tools get better at generating maintainable code, or organizations build better filters. the current state — where skilled developers spend their time cleaning up AI output — is economically unstable.
something has to give.
"the article reads like it was written by AI"
"An obviously vibe written case study about vibe coded software. How much authority are we going to give this low effort case study?"
"The article reads like it's been heavily written by AI with little review, and the scenarios seem exaggerated and unrealistic."
i'll address this directly: no, i didn't use AI to write the piece.
but i understand why it reads that way. the style is deliberate — short sentences, lots of whitespace, direct language. it's optimized for scanning, not academic rigor.
if that reads as "AI-generated" to you, fair enough. but the irony of accusing someone of vibe coding their argument against vibe coding... i'll take that as a compliment to the clarity of the writing.
what i'd change
if i were rewriting the piece today:
acknowledge the spectrum. AI-assisted coding works well for throwaway code. the problem is specifically with maintained codebases and open source contributions.
emphasize the maintainer's choice. the 12x ratio assumes good faith. maintainers who reject fast avoid the trap. but that requires abandoning the collaborative ethos that built open source.
add the new security data. roughly 45% of AI code generation tasks introduce a known flaw. 2,000+ vulnerabilities across 5,600 vibe-coded apps. the problem is bigger than maintainer burnout.
propose solutions. reputation systems for contributors (sketched above). automated slop detection. human-in-the-loop requirements for PRs above a certain size; a minimal version of that gate is sketched below.
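here's that gate as code. the threshold and label name are assumptions, and wiring it into real CI is left out; the point is only that the rule fits in ten lines.

```python
# toy human-in-the-loop gate for large PRs. the cutoff and the label
# name are assumptions; plug it into CI however your forge allows.
MAX_UNREVIEWED_LINES = 300  # assumed cutoff for "needs a human"

def pr_gate(changed_lines: int, labels: set[str]) -> tuple[bool, str]:
    """return (passes, reason) for an incoming pull request."""
    if changed_lines <= MAX_UNREVIEWED_LINES:
        return True, "small enough for the normal review queue"
    if "human-reviewed" in labels:
        return True, "large, but a human has signed off"
    return False, f"{changed_lines} changed lines require a human-reviewed label"

print(pr_gate(1200, set()))               # blocked
print(pr_gate(1200, {"human-reviewed"}))  # passes
```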
what i'd keep
the core argument stands:
AI multiplies what you already know.
10 years of experience × AI = 10x output
0 years of experience × AI = 10x slop
multiple commenters quoted this line. one called it "the best summary i've seen." another pointed out that 0×10=0, which... fair, but you get the point.
the developers who thrive won't be the ones generating the most code. they'll be the ones who can tell the difference between code that compiles and code that belongs.
taste scales. slop doesn't.
what's next
the conversation isn't over.
i'm seeing patterns emerge:
maintainers developing "slop radar" — the ability to spot AI-generated PRs instantly
companies adding AI disclosure requirements to contribution guidelines
security tools starting to flag "hallucinated abstraction" patterns
junior developers learning to use AI as an assistant, not a replacement
the equilibrium hasn't been found yet. we're in the messy middle.
but the 12x problem? it's real. the security problem? worse than i thought. the solution? still being figured out.
if you've developed techniques for filtering AI-generated PRs, or if you're a maintainer who's found a sustainable workflow, drop a comment. i'm collecting approaches.
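to seed the pile, here's the cheapest filter i can imagine. every signal and weight below is an assumption based on tells commenters mentioned, not measured data, and real slop radar is mostly pattern recognition that doesn't compress into code.

```python
# toy slop-radar score for incoming PRs. signals and weights are
# assumptions, not measured data; tune or discard freely.
def slop_score(pr: dict) -> int:
    score = 0
    if pr.get("changed_lines", 0) > 500:   # huge diff out of nowhere
        score += 3
    if not pr.get("linked_issue"):         # no discussion preceded it
        score += 3
    if pr.get("first_time_contributor"):   # no history with the project
        score += 2
    if pr.get("touches_unrelated_files"):  # drive-by "improvements"
        score += 2
    return score  # e.g. 6+ out of 10: reject fast, don't coach

print(slop_score({"changed_lines": 900, "linked_issue": None,
                  "first_time_contributor": True}))  # 8
```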
#CaseStudy #ReplyBack #AfterMath #AI #vibecoding
sources:
veracode 2025 GenAI code security report
tenzai december 2025 vibe coding security assessment
escape research: 2,000+ vulnerabilities in vibe-coded apps
unit 42 / palo alto networks: state of cloud security 2025
103 comments from the original discussion
previous: vibe coding has a 12x cost problem. maintainers are done.

