
In the coarse-graining code, you use an @parameter for loop. Doesn’t unrolling that lead to a pretty large code size? Or is that less of an issue on GPU?

Great write up! I learned a lot!



It doesn’t. The batch size is just 8. This is a very good trick and is often needed to achieve peak performance in memory-bound kernels. You can check out the equivalent code in CUDA as well :)
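For readers who have not seen the pattern, here is a rough CUDA sketch of what thread coarsening with a compile-time batch size can look like. It is not the code from the write-up: the kernel name `add_coarsened`, the vector-add workload, and the launch configuration are all made up for illustration, and only the batch size of 8 comes from the comment above. Because the loop bound is a compile-time constant, `#pragma unroll` lets the compiler expand the loop into 8 copies of a very small body, so the code growth stays modest.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Batch size per thread. The value 8 is taken from the thread;
// everything else in this file is an illustrative assumption.
constexpr int BATCH = 8;

// Thread coarsening: each thread processes BATCH elements instead of one.
__global__ void add_coarsened(const float* a, const float* b, float* out, int n) {
    // Start of this block's tile, offset by the thread index so that
    // neighbouring threads access neighbouring addresses (coalesced).
    int base = blockIdx.x * blockDim.x * BATCH + threadIdx.x;

    // BATCH is known at compile time, so this loop can be fully unrolled
    // into BATCH copies of the body.
    #pragma unroll
    for (int i = 0; i < BATCH; ++i) {
        int idx = base + i * blockDim.x;
        if (idx < n) {
            out[idx] = a[idx] + b[idx];
        }
    }
}

int main() {
    const int n = 1 << 20;
    const int threads = 256;
    // Each block covers threads * BATCH elements; round up.
    const int blocks = (n + threads * BATCH - 1) / (threads * BATCH);

    float *a, *b, *out;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));

    for (int i = 0; i < n; ++i) {
        a[i] = 1.0f;
        b[i] = 2.0f;
    }

    add_coarsened<<<blocks, threads>>>(a, b, out, n);
    cudaDeviceSynchronize();

    // Check every element on the host.
    int errors = 0;
    for (int i = 0; i < n; ++i) {
        if (out[i] != 3.0f) ++errors;
    }
    printf("errors: %d\n", errors);

    cudaFree(a);
    cudaFree(b);
    cudaFree(out);
    return errors == 0 ? 0 : 1;
}
```

The unrolled version issues several independent loads per thread, which is the usual motivation for doing this in memory-bound kernels. With a much larger batch size, instruction count and register pressure could start to matter, which is presumably the concern behind the question.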



