[gs-devel] Converting to PDF/A using -dPDFACompatibilityPolicy=2 returns without error but produces invalid PDF/A files
Simon Stratmann
2016-01-27 14:36:45 UTC
Hi,

it's me again... To test the Ghostscript PDF/A conversion I downloaded the Isartor Test Suite from here: http://www.pdfa.org/2011/08/isartor-test-suite/
I converted all PDF files using the following command line:

gswin64c.exe -dPDFA=1 -dBATCH -dNOPAUSE -sProcessColorModel=DeviceRGB -dColorConversionStrategy=/DeviceRGB -dCompatibilityLevel=1.7 -sDEVICE=pdfwrite -dNOOUTERSAVE -dNumRenderingThreads=8 -dPDFACompatibilityPolicy=2 -sOutputFile="%{outputfilepath}" "%{inputfilepath}" PDFA_def.ps
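
For anyone wanting to repeat the batch run, here is a minimal sketch of how the command above could be assembled per file in Python. The switches and their order (including PDFA_def.ps after the input file) are copied from the command line in this post; the function names `build_command` and `plan_batch` and the directory layout are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Executable name as used in the post; on Unix-like systems it is usually "gs".
GS_EXECUTABLE = "gswin64c.exe"


def build_command(input_path, output_path, pdfa_def="PDFA_def.ps"):
    """Return the Ghostscript argument list for one file (switches as in the post)."""
    return [
        GS_EXECUTABLE,
        "-dPDFA=1",
        "-dBATCH",
        "-dNOPAUSE",
        "-sProcessColorModel=DeviceRGB",
        "-dColorConversionStrategy=/DeviceRGB",
        "-dCompatibilityLevel=1.7",
        "-sDEVICE=pdfwrite",
        "-dNOOUTERSAVE",
        "-dNumRenderingThreads=8",
        "-dPDFACompatibilityPolicy=2",
        f"-sOutputFile={output_path}",
        str(input_path),
        str(pdfa_def),
    ]


def plan_batch(input_dir, output_dir):
    """Return (input, output, command) for every *.pdf in input_dir, sorted by name."""
    plan = []
    for pdf in sorted(Path(input_dir).glob("*.pdf")):
        out = Path(output_dir) / pdf.name
        plan.append((pdf, out, build_command(pdf, out)))
    return plan
```

Each command list can then be handed to `subprocess.run(command)`; passing a list rather than a single string avoids quoting problems with paths that contain spaces.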

Of the 203 test files (I left out one with 10000 pages), all were processed without Ghostscript aborting, although two produced warnings during conversion.
But 19 of the resulting files are not valid PDF/A (tested with PDFBox Preflight 2.00 RC3 and veraPDF).

Here are the names of the files and the reported PDF/A violations:

isartor-6-2-3-3-t01-fail-a.pdf
Invalid Color space, The operator "k" can't be used with RGB Profile

isartor-6-2-3-3-t03-fail-a.pdf
Invalid Color space, The operator "k" can't be used with RGB Profile

isartor-6-2-3-3-t03-fail-b.pdf
Invalid Color space, The operator "k" can't be used with RGB Profile

isartor-6-2-3-3-t03-fail-c.pdf
Invalid Color space, DestOutputProfile isn't CMYK ColorSpace

isartor-6-2-3-3-t03-fail-d.pdf
Invalid Color space, DestOutputProfile isn't CMYK ColorSpace

isartor-6-2-3-3-t03-fail-e.pdf
Invalid Color space, DestOutputProfile isn't CMYK ColorSpace

isartor-6-3-3-1-t01-fail-a.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V

isartor-6-3-3-1-t01-fail-b.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V

isartor-6-3-3-3-t01-fail-a.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V

isartor-6-3-3-3-t02-fail-a.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V

isartor-6-3-4-t01-fail-c.pdf
Invalid Font definition, EHBDGX+DroidSansFallback: The CMap is a string but it isn't an Identity-H/V

isartor-6-3-4-t01-fail-g.pdf
Font damaged, The encoding 'null' doesn't exist

isartor-6-3-5-t01-fail-a.pdf
Glyph error, The character code 5 in the font program "ASEWMI+AdobeMingStd-Light" is missing from the Character Encoding

isartor-6-3-5-t01-fail-b.pdf
Glyph error, The character code 1674 in the font program "DZRERX+ArialMT" is missing from the Character Encoding

isartor-6-3-5-t01-fail-d.pdf
Glyph error, The character code 36 in the font program "DWDNFL+ArialMT" is missing from the Character Encoding

isartor-6-5-3-t03-fail-b.pdf
Invalid Color space, The operator "K" can't be used with RGB Profile

isartor-6-5-3-t03-fail-d.pdf
Invalid Color space, The operator "K" can't be used with RGB Profile

isartor-6-6-1-t01-fail-a.pdf
Action is forbidden, The action Launch is forbidden

isartor-6-6-1-t02-fail-a.pdf
Action is forbidden, The action Launch is forbidden

That seems to be seven different problems, probably in different variants.
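
The count can be cross-checked by tallying the messages listed above. The sketch below is only an illustration: the file names and messages are copied from this post, while the normalisation (replacing subset font names and character codes with placeholders) and treating the "k" and "K" operators as separate problems are assumptions about how the seven categories were arrived at.

```python
import re
from collections import Counter

# Alternating lines: file name, then the violation reported for it (from the post).
REPORT = """
isartor-6-2-3-3-t01-fail-a.pdf
Invalid Color space, The operator "k" can't be used with RGB Profile
isartor-6-2-3-3-t03-fail-a.pdf
Invalid Color space, The operator "k" can't be used with RGB Profile
isartor-6-2-3-3-t03-fail-b.pdf
Invalid Color space, The operator "k" can't be used with RGB Profile
isartor-6-2-3-3-t03-fail-c.pdf
Invalid Color space, DestOutputProfile isn't CMYK ColorSpace
isartor-6-2-3-3-t03-fail-d.pdf
Invalid Color space, DestOutputProfile isn't CMYK ColorSpace
isartor-6-2-3-3-t03-fail-e.pdf
Invalid Color space, DestOutputProfile isn't CMYK ColorSpace
isartor-6-3-3-1-t01-fail-a.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V
isartor-6-3-3-1-t01-fail-b.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V
isartor-6-3-3-3-t01-fail-a.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V
isartor-6-3-3-3-t02-fail-a.pdf
Invalid Font definition, TOQKJT+KozMinProVI-Regular: The CMap is a string but it isn't an Identity-H/V
isartor-6-3-4-t01-fail-c.pdf
Invalid Font definition, EHBDGX+DroidSansFallback: The CMap is a string but it isn't an Identity-H/V
isartor-6-3-4-t01-fail-g.pdf
Font damaged, The encoding 'null' doesn't exist
isartor-6-3-5-t01-fail-a.pdf
Glyph error, The character code 5 in the font program "ASEWMI+AdobeMingStd-Light" is missing from the Character Encoding
isartor-6-3-5-t01-fail-b.pdf
Glyph error, The character code 1674 in the font program "DZRERX+ArialMT" is missing from the Character Encoding
isartor-6-3-5-t01-fail-d.pdf
Glyph error, The character code 36 in the font program "DWDNFL+ArialMT" is missing from the Character Encoding
isartor-6-5-3-t03-fail-b.pdf
Invalid Color space, The operator "K" can't be used with RGB Profile
isartor-6-5-3-t03-fail-d.pdf
Invalid Color space, The operator "K" can't be used with RGB Profile
isartor-6-6-1-t01-fail-a.pdf
Action is forbidden, The action Launch is forbidden
isartor-6-6-1-t02-fail-a.pdf
Action is forbidden, The action Launch is forbidden
"""


def normalize(message):
    """Replace file-specific details so identical problems share one key."""
    # Subset font names look like six capitals, a plus sign, then the font name.
    message = re.sub(r"[A-Z]{6}\+[A-Za-z0-9-]+", "<font>", message)
    return re.sub(r"character code \d+", "character code <n>", message)


lines = REPORT.strip().splitlines()
pairs = list(zip(lines[0::2], lines[1::2]))  # (file name, message)
counts = Counter(normalize(message) for _, message in pairs)

for message, n in counts.most_common():
    print(f"{n:2d}  {message}")
```

The tally keeps "k" (non-stroking CMYK) and "K" (stroking CMYK) apart because the comparison is case-sensitive; folding them together would give six categories instead.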

I use the AdobeRGB1998 compatible ICC profile from http://www.freedesktop.org/wiki/OpenIcc/ and the "default" PDFA_def.ps with output intent AdobeRGB1998.

I realize that some of the test files use somewhat obscure corner cases, but some like the "Launch" action are a bit more common, I think. Even so, I'd expect Ghostscript to produce valid PDF/A (best case) or abort (worst case).
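
Until that is the case, the "valid or abort" behaviour can be approximated outside Ghostscript. The following is a rough sketch, not anything Ghostscript provides: `convert_strict` is a hypothetical helper, and `is_valid_pdfa` stands for whatever validator is available (for example a wrapper around PDFBox Preflight or veraPDF), supplied by the caller.

```python
import subprocess


def convert_strict(gs_command, output_path, is_valid_pdfa):
    """Run a conversion command and accept the result only if it validates.

    gs_command    -- argument list for the converter (e.g. a Ghostscript call)
    output_path   -- file the command is expected to write
    is_valid_pdfa -- caller-supplied callable(path) -> bool; hypothetical hook
    """
    result = subprocess.run(gs_command, capture_output=True, text=True)
    if result.returncode != 0:
        # The converter itself aborted (as -dPDFACompatibilityPolicy=2 is meant to do).
        raise RuntimeError(f"conversion aborted: {result.stderr.strip()}")
    if not is_valid_pdfa(output_path):
        # The converter reported success, but the validator disagrees.
        raise RuntimeError(f"{output_path} was written but is not valid PDF/A")
    return output_path
```

Since validators can disagree with each other, it may be worth running more than one inside `is_valid_pdfa` and logging which of them rejected the file.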

Thanks.


Kind Regards
--
Simon Stratmann
Manufacturing Execution Systems
Werum IT Solutions GmbH
Wulf-Werum-Strasse 3 | 21337 Lueneburg | Germany
Tel. +49 4131 8900-443 | Fax +49 4131 8900-20
***@werum.com | www.werum.com
Geschäftsführer / Managing Directors: Rüdiger Schlierenkämper, Richard Nagorny, Hans-Peter Subel
RG Lüneburg / Court of Jurisdiction: Lüneburg, Germany
Handelsregisternummer / Commercial Register: HRB 204984
USt.-IdNr. / VAT No.: DE 118 589 979
Ken Sharp
2016-01-27 14:54:35 UTC
Post by Simon Stratmann
> Of the 203 test files (I left out one with 10000 pages) all were processed
> without Ghostscript aborting, although two gave out warnings during conversion.
> But 19 of the resulting files are not valid PDF/A (tested with PDFBox
> Preflight 2.00 RC3 and veraPDF)

If you think you have found problems, please open bug report(s) but note
that PDF/A validators are not without their own problems. Also if you start
from a PDF it may not be possible to create a fully compliant PDF/A file,
and it may not even be possible to tell that it isn't possible.

gs-devel is not an appropriate place to report bugs.

NB if you are going to open bugs then please *attach* the relevant file to
the bug report, don't just put a URL in the report.

Ken
