> But to simplify, instead I’ll just limit to stories that have only text bodies, instead of links.
This line implies that both pre- and post-2016 stories are text-only, so this change should not affect the data much.
It’s still outside the HN mainstream to use both a text body and a link in the same submission, so that might be biasing the model in strange ways.