Welcome to my company a few years ago. Except they were not programmers but PM and architects, and algos were subcontractors. Now we have real developers (I'm in one of those teams)
I'm pretty sure the generated code will be an incomprehensible mess that passes precisely those tests and nothing more, e.g. it would correctly sort arrays [1,3,2] and [3,2,1], but not [2,1,3]; and obviously not [1,4,2], because the specs didn't mention number 4.
It’s doing TDD!