https://wp.jochen.hayek.name/blog-en/2011/04/27/pdf-harvesting-automatic-extraction-of-information-from-pdf-files/
PDF harvesting – automatic extraction of information from PDF files