The branch does, but you still have the comparison itself, plus the branch instruction that decides whether to skip (although as long as the branch is predicted correctly, your pipeline doesn’t get flushed).
If you have enough idle execution units, you might not see a difference in wall-clock time. But with many algorithms you can put those units to good use.
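
To make that concrete, here is a minimal sketch in C. The function names and the "skip zeros when summing" scenario are made up purely for illustration, not taken from the code under discussion. The guarded version skips the add for zero elements, but every iteration still executes a compare and a branch; the unguarded version just does the add unconditionally.

```c
#include <stddef.h>

/* Hypothetical example: the branch skips the add for zero elements,
   but every iteration still pays for the compare and the branch. */
long sum_guarded(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] != 0)      /* compare + conditional branch */
            sum += v[i];    /* the work the branch can skip */
    }
    return sum;
}

/* Hypothetical example: adding zero changes nothing, so the guard
   is dropped and the add runs on every iteration. */
long sum_unguarded(const int *v, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}
```

Both functions return the same result for any input. Which one is faster depends on the data, the CPU, and the compiler (an optimizer may well remove the guard on its own), so it is worth measuring rather than assuming.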