
On Tue, Mar 23, 2010 at 5:15 PM, James Wachira <jwaciira.lists@gmail.com> wrote:
I have a problem. I am working on a Moodle site, and the bulk of my work has been converting modules [Learning Material] from *.pdf into *.html files. The modules are divided into sections (About Course, About Author, Learning Activities, etc.), each of which goes into a different html file. I have resorted to converting these manually, which is tedious and repetitive, to say the least. Is there a way to automate this? Say, a way to parse the pdf file to produce the different html files that I currently produce by hand from each pdf?
See http://itextpdf.com/, which can be used from the command line and also has an API. But PDF to HTML is not always a good idea, since PDF is primarily meant for print media. How were the original PDF files created? If you can find the original source files from which the PDFs were created, you may find it easier to convert those to HTML.
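
Whichever tool does the extraction, the splitting step might look roughly like the sketch below. It is only an illustration under several assumptions: the PDF (or its source file) has already been turned into plain text, each section heading sits alone on its own line, and the headings are the three named in the question. The function names and the output file naming (about_course.html and so on) are made up for this example and are not part of Moodle or iText.

```python
import html
import re
from pathlib import Path

# Assumed section headings, taken from the examples in the question.
HEADINGS = ["About Course", "About Author", "Learning Activities"]


def split_sections(text, headings=HEADINGS):
    """Return {heading: body} for each heading found alone on a line."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in headings:
            # A new section starts here; later lines belong to it.
            current = stripped
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
        # Lines before the first known heading are ignored.
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def section_to_html(title, body):
    """Wrap a section in a minimal HTML page; blank lines separate paragraphs."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    parts = [f"<h1>{html.escape(title)}</h1>"]
    parts += [f"<p>{html.escape(' '.join(p.split()))}</p>" for p in paragraphs]
    return "<html><body>\n" + "\n".join(parts) + "\n</body></html>\n"


def write_sections(text, out_dir):
    """Write one HTML file per section (hypothetical naming); return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for title, body in split_sections(text).items():
        path = out / (title.lower().replace(" ", "_") + ".html")
        path.write_text(section_to_html(title, body), encoding="utf-8")
        written.append(path)
    return written
```

Calling write_sections(extracted_text, "module1") would then produce one file per section found. Text extracted from a PDF often has broken line wraps, page headers and lost formatting, so the heading match will probably need adjusting per document.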