The whole point is that it's not about a particular language (CUDA C syntax is almost the same as C, you know). It's about a different way of thinking and completely different parallel algorithms, like the Blelloch scan.
There is no compiler for any language that can turn an arbitrary sequential algorithm into a parallel one, and there never will be.
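To make the "different algorithm" point concrete, here is a rough sketch in plain C (not CUDA) of an exclusive prefix sum written both ways. The function names `scan_sequential` and `scan_blelloch` are made up for illustration, and the Blelloch-style version assumes the length is a power of two. It still runs on one core; the thing to notice is that the iterations of each inner loop don't depend on each other, so on a GPU each one could be handled by its own thread, with a barrier between levels.

```c
#include <assert.h>

/* Sequential exclusive prefix sum: every step needs the result of the
   previous one, so there is nothing obvious to run in parallel. */
static void scan_sequential(const int *in, int *out, int n) {
    int running = 0;
    for (int i = 0; i < n; ++i) {
        out[i] = running;
        running += in[i];
    }
}

/* Blelloch-style exclusive scan, in place.
   Assumption for this sketch: n is a power of two. */
static void scan_blelloch(int *a, int n) {
    assert(n > 0 && (n & (n - 1)) == 0);

    /* Up-sweep (reduce): build partial sums up a balanced tree.
       Iterations of the inner loop touch disjoint elements. */
    for (int stride = 1; stride < n; stride *= 2) {
        for (int i = 2 * stride - 1; i < n; i += 2 * stride) {
            a[i] += a[i - stride];
        }
    }

    /* The root now holds the total; replace it with the identity. */
    a[n - 1] = 0;

    /* Down-sweep: push prefixes back down the tree.
       Again, inner-loop iterations are independent of each other. */
    for (int stride = n / 2; stride >= 1; stride /= 2) {
        for (int i = 2 * stride - 1; i < n; i += 2 * stride) {
            int left = a[i - stride];
            a[i - stride] = a[i];   /* left child gets the parent's prefix */
            a[i] += left;           /* right child gets prefix + left sum */
        }
    }
}
```

For the input `{3, 1, 7, 0, 4, 1, 6, 3}` both functions give `{0, 3, 4, 11, 11, 15, 16, 22}`. The second version does about twice as many additions, but only on the order of 2·log2(n) dependent levels instead of n dependent steps, and that restructuring is exactly the part you have to come up with yourself.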