Hacker News

People are better understood intuitively. We understand how people fail and why. We can build trust with people with some degree of success. But machine models are new and can fail in unpredictable ways. They also get deployed to billions of users in a way that humans are not, and in applications where humans are not. So it's certainly useful to try to explain neural networks in as much detail as we can.


Or we can build trust using black box methods like we do with humans, e.g., extrapolating from past behavior, administering tests, and the like.


We can, but the nice thing about neural networks is the ability to apply all kinds of computational and mathematical manipulations to them, to pick them apart and really find out what's going on. This is important not just for safe deployment but also for research on new methods that could make them better. Plus, we need this ability to help detect neural networks with intentionally hidden features: models that appear to behave linearly in certain regimes but are designed with a strong nonlinear response when special inputs are applied. You could have all the tests you want for a self-driving car based on real-world conditions, but a bad actor with access to the training system could plant a special input that results in dangerous behavior.
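To make the "hidden trigger" failure mode concrete, here's a minimal toy sketch (all names and numbers are invented for illustration, not any real system): a controller that responds linearly over ordinary inputs but is constructed to spike on one secret input, which black-box testing on realistic conditions would almost never find.

```python
# Toy model of a planted backdoor: linear in the normal regime,
# anomalous only for one secret trigger input.

TRIGGER = (0.123, 0.456)  # hypothetical secret pattern planted by a bad actor

def steering_model(x, y):
    """Behaves linearly for ordinary inputs; spikes near the trigger."""
    normal_output = 0.5 * x + 0.3 * y  # well-behaved linear response
    # Sharp nonlinear bump that only fires extremely close to the trigger,
    # so random or realistic test inputs will essentially never hit it.
    dist2 = (x - TRIGGER[0]) ** 2 + (y - TRIGGER[1]) ** 2
    backdoor = 100.0 * (dist2 < 1e-6)
    return normal_output + backdoor

print(steering_model(0.1, 0.2))   # ordinary input: small, linear output
print(steering_model(*TRIGGER))   # trigger input: wildly different output
```

White-box inspection of the weights can reveal the anomalous component; black-box testing over realistic inputs almost certainly cannot.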


The more fundamental problem is the sheer size of them, and this is only going to get worse as models grow larger to become more capable. Being able to look at the state of individual neurons during inference is very convenient, but that does not by itself make it possible to really find out what's going on.
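For what "look at the state of individual neurons during inference" means in practice, here's a minimal sketch (toy weights and made-up names, not any real framework): a tiny two-layer network whose forward pass records every intermediate activation. The scaling problem in the comment above is that real models have billions of these values per forward pass, so recording them is easy but interpreting them is not.

```python
# Tiny two-layer network that captures per-neuron activations at inference.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward_with_trace(x, W1, W2):
    """Run inference while recording each layer's neuron states."""
    trace = {"input": x}
    h = relu(matvec(W1, x))   # hidden-layer neuron activations
    trace["hidden"] = h
    out = matvec(W2, h)
    trace["output"] = out
    return out, trace

# Hypothetical weights for illustration
W1 = [[1.0, -1.0], [0.5, 0.5]]
W2 = [[1.0, 1.0]]

out, trace = forward_with_trace([2.0, 1.0], W1, W2)
print(trace["hidden"])  # per-neuron hidden states: [1.0, 1.5]
print(out)              # [2.5]
```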



