
Having worked as a data scientist at multiple companies (from FANG to startups), the first thing I check when I get my hands on data is whether the Pareto principle holds.

I still haven’t found one company where this principle didn’t show up.



What does this prove? If you have lots of data and dimensions, I bet you would be just as likely to find distributions that are roughly 50/50, 60/40, 90/10, or 100/0 if you looked for them.


Agreed. It’s not necessarily 80/20, it’s just that power laws show up a LOT. 90/10, 99/1, etc.
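
For anyone who wants to check this on their own data, here is a minimal sketch of how the split could be measured. The function name `top_share` and both sample datasets are made up for illustration; nothing here comes from a real company's data.

```python
def top_share(values, fraction=0.2):
    """Share of the total contributed by the largest `fraction` of items.

    Hypothetical helper for illustration only.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(values, reverse=True)
    # Always count at least one item so tiny samples still work.
    k = max(1, round(len(ordered) * fraction))
    total = sum(ordered)
    if total == 0:
        raise ValueError("values must not sum to zero")
    return sum(ordered[:k]) / total


# Made-up samples: a Zipf-like series (value proportional to 1/rank)
# and a perfectly even one.
heavy_tailed = [1000 / rank for rank in range(1, 101)]
even = [10] * 100

print(top_share(heavy_tailed, 0.2))  # well above 0.2 for the heavy-tailed sample
print(top_share(even, 0.2))          # the top 20% holds exactly its "fair" 20%
print(top_share(heavy_tailed, 0.1))  # a different cut gives a different split
```

The even sample is the useful baseline: if the top 20% holds about 20% of the total, there is no concentration to speak of, and the further the number climbs above that, the heavier the tail.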


So you massage your features until they produce an 80/20 split?

Very scientific.


The 50/50 principle always shows up in my histogram with an interval of two.



