GPUs are huge, TPUs etc. are also quite large, and CPUs are tiny by comparison. I'm no expert on any of these, but intuitively you lose something by cramming that functionality into a way smaller chip. It's probably something to do with bandwidth (where big helps) vs. latency (where small helps).