Category: PDF scraping
“pdftohtml -xml” – only the Poppler suite supports it
See https://forum.xpdfreader.com/viewtopic.php?f=3&t=41211: only the Poppler toolset (a fork of Xpdf) has “pdftohtml -xml”.
Poppler links: https://en.wikipedia.org/wiki/Poppler_(software) https://poppler.freedesktop.org https://anongit.freedesktop.org/git/poppler/poppler.git
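For scraping, the XML output is mostly useful because each text fragment carries its page number and position. Below is a minimal sketch of driving Poppler's pdftohtml from Python and reading the result. It assumes pdftohtml is on the PATH, that it writes `<out_base>.xml`, and that the file contains `<page>` elements holding `<text top=… left=…>` children; the function names (`build_command`, `parse_pdf2xml`, `pdf_to_rows`) are made up for this example, and the exact XML layout may vary between Poppler versions.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path


def build_command(pdf_path, out_base):
    """Build the pdftohtml call: -xml selects XML output, -i skips images."""
    return ["pdftohtml", "-xml", "-i", str(pdf_path), str(out_base)]


def parse_pdf2xml(xml_data):
    """Turn pdftohtml XML (str or bytes) into a list of positioned text rows.

    Assumes the layout <page number=...><text top=... left=...>...</text></page>.
    """
    root = ET.fromstring(xml_data)
    rows = []
    for page in root.iter("page"):
        page_no = int(page.get("number"))
        for node in page.iter("text"):
            rows.append({
                "page": page_no,
                # Coordinates are normally integers; float() guards against
                # a version that emits decimals.
                "top": int(float(node.get("top"))),
                "left": int(float(node.get("left"))),
                # itertext() also collects text inside nested <b>/<i>/<a> tags.
                "text": "".join(node.itertext()),
            })
    return rows


def pdf_to_rows(pdf_path, out_base):
    """Run pdftohtml on a PDF and parse the XML file it is assumed to write."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml (Poppler) not found on PATH")
    subprocess.run(build_command(pdf_path, out_base), check=True)
    return parse_pdf2xml(Path(f"{out_base}.xml").read_bytes())
```

Something like `pdf_to_rows("report.pdf", "report")` would then give one dictionary per text fragment, which can be sorted by `page`, `top` and `left` to rebuild reading order (the file names here are placeholders).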