This seems like it should be quick to do, but in practice there seems to be a problem. I have a bunch of PDF forms that include form fields and embedded JavaScript. I'd like to remove the JavaScript code safely but leave the PDF form fields intact.
So far I've been able to find lots of solutions, but all of them have either eliminated both the JavaScript and the form fields, or left both intact.
Here's solution A; it copies both the form fields and the JavaScript:
    var pdfReader = new PdfReader(inFilename);
    using (MemoryStream memoryStream = new MemoryStream())
    {
        PdfCopyFields copy = new PdfCopyFields(memoryStream);
        copy.AddDocument(pdfReader);
        copy.Close();
        File.WriteAllBytes(rawFilename, memoryStream.ToArray());
    }

Alternately, I have solution B, which strips out both the form fields and the JavaScript:
    Document document = new Document();
    using (MemoryStream memoryStream = new MemoryStream())
    {
        PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
        document.Open();
        document.AddDocListener(writer);
        for (int p = 1; p <= pdfReader.NumberOfPages; p++)
        {
            document.SetPageSize(pdfReader.GetPageSize(p));
            document.NewPage();
            PdfContentByte cb = writer.DirectContent;
            PdfImportedPage pageImport = writer.GetImportedPage(pdfReader, p);
            int rot = pdfReader.GetPageRotation(p);
            if (rot == 90 || rot == 270)
            {
                cb.AddTemplate(pageImport, 0, -1.0F, 1.0F, 0, 0, pdfReader.GetPageSizeWithRotation(p).Height);
            }
            else
            {
                cb.AddTemplate(pageImport, 1.0F, 0, 0, 1.0F, 0, 0);
            }
        }
        document.Close();
        File.WriteAllBytes(rawFile, memoryStream.ToArray());
    }

Does anyone know how to modify either solution A or B to eliminate the JavaScript but leave the form fields in place?
EDIT: The solution code is here!

    using (MemoryStream memoryStream = new MemoryStream())
    {
        PdfStamper stamper = new PdfStamper(pdfReader, memoryStream);
        for (int i = 0; i < pdfReader.XrefSize; i++)
        {
            object o = pdfReader.GetPdfObject(i);
            PdfDictionary pd = o as PdfDictionary;
            if (pd != null)
            {
                pd.Remove(PdfName.AA);
                pd.Remove(PdfName.JS);
                pd.Remove(PdfName.JAVASCRIPT);
            }
        }
        stamper.Close();
        pdfReader.Close();
        File.WriteAllBytes(rawFile, memoryStream.ToArray());
    }
To manipulate a single PDF you should use the class PdfStamper and manipulate its contents, in your case by iterating over the existing form fields and removing the JavaScript entries.
The iTextSharp sample AddJavaScriptToForm.cs, corresponding to AddJavaScriptToForm.java from chapter 13 of iText in Action, 2nd Edition, shows how JavaScript actions are added to fields, the central code being:
    PdfStamper stamper = new PdfStamper(reader, ms);
    AcroFields form = stamper.AcroFields;
    AcroFields.Item fd = form.GetFieldItem("married");
    PdfDictionary dictYes = (PdfDictionary) PdfReader.GetPdfObject(fd.GetWidgetRef(0));
    PdfDictionary yesAction = ...;
    dictYes.Put(PdfName.AA, yesAction);

Thus, to remove such JavaScript form field actions, you have to iterate over the PDF form fields and remove the /AA values in the associated dictionaries:
    dictXXX.Remove(PdfName.AA);

EDIT: (provided by Ted Spence) Here is the final code that removes the JavaScript while leaving the form fields intact:

    using (MemoryStream memoryStream = new MemoryStream())
    {
        PdfStamper stamper = new PdfStamper(pdfReader, memoryStream);
        for (int i = 0; i < pdfReader.XrefSize; i++)
        {
            PdfDictionary pd = pdfReader.GetPdfObject(i) as PdfDictionary;
            if (pd != null)
            {
                pd.Remove(PdfName.AA);         // Removes automatic execution objects
                pd.Remove(PdfName.JS);         // Removes JavaScript objects
                pd.Remove(PdfName.JAVASCRIPT); // Removes other JavaScript objects
            }
        }
        stamper.Close();
        pdfReader.Close();
        File.WriteAllBytes(rawFile, memoryStream.ToArray());
    }

EDIT: (by mkl) The solution above is somewhat overachieving because it touches each and every indirect dictionary object. On the other hand, it ignores inline dictionaries (I haven't checked the spec, though; maybe /AA, /JS, and /JavaScript entries only appear in dictionaries that have to be indirect objects, or that are at least de-referenced by the code).
If fulfilling this task were my job, I would try to access the objects possibly carrying JavaScript more specifically.
The advantage of the overachieving procedure might be, though, that it also inspects PDF objects which are not yet specified as carrying JavaScript but might be in later PDF versions.
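For illustration, here is a rough sketch of what the more specific access could look like. It is not tested against real-world files and assumes iTextSharp 5.x (where AcroFields.Fields is a dictionary of AcroFields.Item objects). The class name TargetedJavaScriptRemover, the helper RemoveIfJavaScriptAction, and the inFile/outFile parameters are made up for this example. It only looks at the document catalog and the form fields; page-level /AA entries, actions on non-widget annotations, and chained /Next actions are not handled, so the brute-force loop above still covers more ground.

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class TargetedJavaScriptRemover
{
    // Hypothetical helper: removes dict[key] only if the value is an
    // action dictionary whose /S entry is /JavaScript.
    private static void RemoveIfJavaScriptAction(PdfDictionary dict, PdfName key)
    {
        PdfDictionary action = dict.GetAsDict(key);
        if (action != null && PdfName.JAVASCRIPT.Equals(action.GetAsName(PdfName.S)))
        {
            dict.Remove(key);
        }
    }

    public static void Strip(string inFile, string outFile)
    {
        PdfReader reader = new PdfReader(inFile);
        using (MemoryStream ms = new MemoryStream())
        {
            PdfStamper stamper = new PdfStamper(reader, ms);

            // Document level: additional actions, a JavaScript open action,
            // and the name tree of document-level scripts.
            // Note: this drops ALL catalog /AA entries, JavaScript or not.
            PdfDictionary catalog = reader.Catalog;
            catalog.Remove(PdfName.AA);
            RemoveIfJavaScriptAction(catalog, PdfName.OPENACTION);
            PdfDictionary names = catalog.GetAsDict(PdfName.NAMES);
            if (names != null)
            {
                names.Remove(PdfName.JAVASCRIPT);
            }

            // Field level: additional actions on the field and widget
            // dictionaries, plus a JavaScript activation action on the widget.
            foreach (AcroFields.Item item in stamper.AcroFields.Fields.Values)
            {
                for (int i = 0; i < item.Size; i++)
                {
                    PdfDictionary widget = item.GetWidget(i);
                    widget.Remove(PdfName.AA);
                    RemoveIfJavaScriptAction(widget, PdfName.A);
                    item.GetValue(i).Remove(PdfName.AA);
                }
            }

            stamper.Close();
            reader.Close();
            File.WriteAllBytes(outFile, ms.ToArray());
        }
    }
}
```

Checking /S before removing /A and /OpenAction is a deliberate choice here: it should leave non-JavaScript actions such as submit buttons or an initial GoTo destination alone, which the blanket removal would not distinguish.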