An INCOMPLETE story from my “PDF to JasperReports” migration project

My “current” (as of 2011-01) project is actually rather interesting, challenging and well-paying,
but I assume it’s going to last no longer than 2 months.

From my customer’s point of view this migration project must be a horror.

I don’t really know how serious “they” were when they determined that this would be a 3-month project.

Migrating 98% of the documents from PDF to JasperReports’ “JRXML”

  http://en.wikipedia.org/wiki/JasperReports#JRXML

will take about 6 months (I started in December).
But the remaining 2% of the documents may take another 3 to 6 months.

If they don’t complete that project (I mean a true 100% of the documents that need to be migrated),
they cannot abandon the old software,
which was one of the main goals initially.

They had no realistic concept for migrating all these documents;
they did not even have a realistic approach to analyzing all the PDF documents.
We are talking about many hundreds of PDF documents, or rather pages, with form fields,
and these form fields are really rather “delicate” details.

The usual way to work on PDF form fields
is to load the file into Acrobat Pro and display the form fields.
You can “obviously” create a hard copy of each page,
but that’s tedious, and you still don’t get hold of all the more “atomic” details of the PDF form fields.

There was no software available for dealing with PDF forms easily in batch mode.
I created something myself around a PDF library from Perl’s CPAN (“CAM::PDF”).
Oh, wonderful CPAN!!!
In the meantime I contacted the developer of that PDF library,
because I needed more details about PDF document objects, such as the page number of a form field,
and “of course” he helped me out.
I was so happy.
Initially the page number was only “nice to have”,
as 99.5% of the documents that I had dealt with so far were 1-page only.

In the meantime I found out that I would have to deal with about 200 multi-page documents,
so it became a serious necessity to be able to extract the number of the page each form field is located on.
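
To illustrate what “the page number of a form field” means in practice, here is a minimal sketch. It is not the CAM::PDF-based tool described above (that one was written in Perl); it is Python, and it assumes the third-party pypdf library for reading the file. The function names `fields_by_page` and `load_annotations` are made up for this example. The idea is simply that each page of a PDF lists its annotations, and the form fields appear there as “widget” annotations, so walking the pages in order tells you which page a field sits on.

```python
from collections import defaultdict


def fields_by_page(pages):
    """Map 1-based page number -> sorted list of form-field names.

    `pages` is an iterable of pages; each page is an iterable of
    annotation dictionaries using PDF key names (/Subtype, /T, /Parent).
    """
    result = defaultdict(list)
    for page_number, annotations in enumerate(pages, start=1):
        for annot in annotations:
            if annot.get("/Subtype") != "/Widget":
                continue  # not a form-field widget (e.g. a link)
            name = annot.get("/T")
            if name is None:
                # A widget may leave the name to its parent field.
                parent = annot.get("/Parent")
                if hasattr(parent, "get_object"):
                    parent = parent.get_object()  # resolve a pypdf reference
                if parent is not None:
                    name = parent.get("/T")
            if name is not None:
                result[page_number].append(str(name))
    return {page: sorted(names) for page, names in result.items()}


def load_annotations(path):
    """Read per-page annotation dictionaries with pypdf (assumed installed)."""
    from pypdf import PdfReader  # third-party; imported lazily on purpose

    pages = []
    for page in PdfReader(path).pages:
        annots = page.get("/Annots") or []
        if hasattr(annots, "get_object"):
            annots = annots.get_object()  # the array itself may be a reference
        pages.append([a.get_object() for a in annots])
    return pages
```

With a hypothetical file name, `fields_by_page(load_annotations("some_form.pdf"))` would give one entry per page that carries form fields. Running that over a whole directory of files is the kind of batch overview that is tedious to get by clicking through Acrobat Pro.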

When I decided to contact the developer of CAM::PDF, I was already close to despair,
and I was overjoyed when we got that feature implemented within a couple of days, with just a few e-mails in both directions.

