Hi all,

I am looking to automate text extraction from a PDF document (roughly 2,000 pages). I am thinking it would be better to convert it into a structured document first, so that it can be parsed automatically.

Is there a tried-and-tested tool or method for converting PDF to XML or JSON?

Regards,