It's funny to see this at #1 on HN. I have a PDF converter site[0] that I did a Show HN[1] for years back, and I'm currently pushing updates to it as I work on an entire site redesign, since the PDF niche is massive. I'm relieved to see that someone actually made a package for OCR on PDFs[2], and that they are using it[3]. It will finally make what I was doing less hacky.
[0] https://www.pdf.to
[1] https://news.ycombinator.com/item?id=23238862
[2] https://github.com/ocrmypdf/OCRmyPDF
[3] https://github.com/Frooodle/Stirling-PDF#technologies-used