
DocumentBurster User Guide

Copyright © 2006-2011 Virgil Trasca
Table of Contents
1. Introduction
    DocumentBurster
    DocumentBurster Server
2. Bursting and merging reports
    Bursting reports
    Merging reports
    Configuration
3. Distributing reports
    Distribute reports by email
        Server settings
        Distribute to a single email address
        Distribute reports to multiple email addresses
        Send plain text email messages
        Send HTML email messages
        HTML message sample
    Distribute reports by FTP
4. Variables
    Built-in variables
        Example
    User-defined variables
        Example - customizable burst file name
5. Automatic polling for incoming reports
    Watch a folder for incoming reports
6. DocumentBurster Server
    Installation
        Prerequisites
        Download DocumentBurster Server
    Basic Usage
    Scheduling
        Configuration
    Web console
    Windows - Run DocumentBurster Server at system startup
7. Using scripts to achieve more
    Scripting scenarios
        File related capabilities
        Execute external programs
        Publish reports to Microsoft SharePoint portal
        Distribute messages to SMS, Fax or print reports
        Mail, FTP, FTPs and SFTP
        Upload reports to a shared location
        Encrypt or stamp the output reports
    Introduction to the Burst Lifecycle
        Controller
        Bursting context
    Sample scripts
        zip.groovy
        encrypt.groovy
        overlay.groovy
        exec_pdftk_background.groovy
    Further reading
8. Command Line
    Bursting reports
    Merging reports
    Polling for incoming reports
9. Auditing & Tracing
    Logging
10. Trouble Shooting
    Restriction on maximum 25 burst reports
    Issues running basic features
    Old Java - UnsupportedClassVersionError exception
    Windows - DocumentBurster.exe GUI is failing
    Windows - DocumentBurster.exe GUI still fails
    Windows - DocumentBurster.exe GUI still fails
    Bursting issues 1
    Bursting issues 2
    Variable values are not parsed properly
    Email is failing
    Email still fails
    Email is still failing
    FTP issues
    Windows - DocumentBurster Server Console is failing to start
    Windows - DocumentBurster Server is not properly processing the submitted burst jobs
    Windows - DocumentBurster Web Console is failing to start
    Forum community help
    Professional support

Chapter 1. Introduction
DocumentBurster schedules, merges, splits and distributes reports. It complements an existing
business intelligence deployment by adding report distribution capabilities.

DocumentBurster can merge, split, burst and distribute reports generated by any reporting platform:
an in-house reporting system, well-known commercial report writers such as Crystal Reports, Cognos,
Microsoft Reporting Services, Microsoft Access and Web Intelligence, or open source reporting
engines such as JasperReports, BIRT and Pentaho.

DocumentBurster is a cross-platform application that runs on Windows and on UNIX-like systems such
as Linux and Mac OS X.

The software comes in two flavors:

• DocumentBurster - a desktop GUI application intended for a single user

• DocumentBurster Server - a web-based GUI that multiple people can access concurrently

DocumentBurster
DocumentBurster can merge, split and distribute any PDF report.

• Distribute reports to a wide range of destination types such as email, FTP, FTPS, SFTP, TFTP,
Windows shared drives, Unix Samba servers, WebDAV servers and document management systems

• Publish reports to enterprise portals such as Microsoft SharePoint Server, SAP NetWeaver, Oracle
Portal and IBM WebSphere Portal

• Dynamically generate good-looking HTML email messages based on email templates

DocumentBurster is distributed under the terms of the GNU Affero General Public License v3 or, at
your option, under a commercial license.

A commercial license is usually required if:

• Your organization cannot conform to the terms of the Affero General Public License v3 -
http://www.gnu.org/licenses/agpl.html

• Your organization requires professional DocumentBurster support

DocumentBurster Server
DocumentBurster Server adds features such as report scheduling and parallel report processing, and
can handle the most advanced report delivery scenarios.

DocumentBurster Server is a full-fledged report distribution solution that can be tailored to meet
complex report bursting and report distribution requirements. The following capabilities are all
achievable with DocumentBurster Server, either out of the box or by tailoring the software:

• Distribution server - The software can be deployed as a central report bursting and report
distribution platform that is concurrently accessible to multiple people or legacy systems

• Web-based GUI console available for IE, Firefox and Chrome

• Merge, burst and distribute any kind of report format

• Scheduling - define simple or complex schedules for executing nightly, weekly or monthly report
bursting and distribution jobs

• Advanced storing, indexing and searching capabilities for the distributed reports

• Parallel execution, which gives a high throughput of reports converted, merged, split and
distributed in a short period of time

• To support any report bursting and report distribution scenario, DocumentBurster can be deployed
as a standalone server or on application servers such as WebLogic, WebSphere, JBoss etc.

• DocumentBurster can be easily integrated with existing CRM and ERP applications

DocumentBurster Server web console

Chapter 2. Bursting and merging reports
Bursting reports
DocumentBurster splits reports with the help of burst tokens. An example of such a token is {doc1}.
Burst tokens tell DocumentBurster which pages of the report should be extracted into a separate
document. Take a look at the burst.pdf file available in the samples folder. This is a three-page
report which, after bursting, generates two separate output files: doc1.pdf, containing the first
and second pages, and doc2.pdf, containing only the third page of the initial document.

A burst token can be anything that uniquely identifies the document to be burst, such as the invoice
ID, the customer number or the email address to which the document should be distributed.
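
The grouping of pages by burst token can be pictured with a short Python sketch. It is not the
DocumentBurster implementation: it works on already extracted page text, the function name
group_pages_by_token and the sample page texts are made up for illustration, and it assumes that a
page without a token of its own continues the document started by the previous token.

```python
import re

# Token syntax as shown in this guide: text wrapped in curly braces, e.g. {doc1}.
TOKEN_PATTERN = re.compile(r"\{([^{}]+)\}")


def group_pages_by_token(page_texts):
    """Map each burst token to the 1-based page numbers that belong to it.

    Assumption of this sketch: a page without a token continues the
    document started by the most recent token.
    """
    groups = {}
    current = None
    for number, text in enumerate(page_texts, start=1):
        match = TOKEN_PATTERN.search(text)
        if match:
            current = match.group(1)
        if current is None:
            continue  # pages before the first token belong to no document
        groups.setdefault(current, []).append(number)
    return groups


# Hypothetical page texts mirroring the three-page burst.pdf sample.
pages = [
    "First invoice, page 1 {doc1}",
    "First invoice, page 2 {doc1}",
    "Second invoice, page 1 {doc2}",
]
for token, numbers in group_pages_by_token(pages).items():
    print(f"{token}.pdf <- pages {numbers}")
```

With the sample input, doc1 receives pages 1 and 2 and doc2 receives page 3, which matches the
burst.pdf description above.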

Note 1
The report generation software should fill the burst tokens into the pages of the reports, as the
business requirements dictate.

Note 2
Burst tokens are usually given a white font color so that the visual appearance and the layout of
the report are not affected.

Note 3
Out of the box, DocumentBurster supports bursting of PDF reports. If other report formats need to
be burst (Word, Excel or any other document type), then the DocumentBurster software can be
tailored to burst and distribute such reports.

In the menu go to Actions -> Merge, Burst and Trace... -> Burst

Merging reports
Sometimes, prior to bursting, a few reports need to be merged together and the merged result burst.
DocumentBurster can merge reports through both the command line interface and the GUI.

In the menu go to Actions -> Merge, Burst and Trace... -> Merge -> Burst

• By default the reports are merged in the order in which they were selected. The merge order can
be changed using the Up and Down buttons.

• Select multiple files - Hold the Ctrl key to select multiple files at once.

• Merged File Name - overrides the name of the generated merged file. The default value is
merged.pdf

• Burst Merged File - If checked, the generated merged file is also burst.

• View Generated Reports - browse the burst/merged reports.

While bursting or merging reports, the following properties are shown for information purposes
only. Their values are configured in the Config screen, which is discussed in the next section.

• Distribute reports to Email, FTP ... - If shown struck through, the generated reports will not be
distributed; otherwise they will be.

• Delete reports once they are distributed - If shown struck through, the generated reports will
not be deleted from the disk once they are distributed; otherwise they will be.

• Quarantine reports which fail to be distributed - If shown struck through, the reports which fail
to be distributed will not be saved to the quarantine folder; otherwise they will be.

Note
Out of the box, DocumentBurster can merge PDF reports. If other report formats need to be merged
(Word, Excel or any other document type), then DocumentBurster can be customized to process them.

Configuration
The following settings control how DocumentBurster merges and bursts reports:

In the menu go to Actions -> Config... -> General Settings

• Burst File Name - If specified, it is used to form the names of the generated files; otherwise
each file is named after the corresponding burst token. The default value is $burst_token$.pdf

• Default Merge File Name - If specified, it is used as the name of the merged file. It can be
overridden for each individual merge job.

• Output Folder - The folder where the generated files are placed. The default value is
output/$input_document_name$/$now; format="yyyy.MM.dd_HH.mm.ss"$

• Backup Folder - The folder where the input files are backed up. The default value is
backup/$input_document_name$/$now; format="yyyy.MM.dd_HH.mm.ss"$

• Quarantine Folder - The folder where files which fail to be distributed are quarantined. The
default value is quarantine/$input_document_name$/$now; format="yyyy.MM.dd_HH.mm.ss"$

• Poll Folder - The folder which is polled for incoming reports. The default value is poll

• Distribute reports to Email, FTP ... - If checked, the generated reports are distributed as part
of the bursting process; otherwise they are not.

• Delete reports once they are distributed - If checked, the generated reports are deleted from the
disk once they are distributed; otherwise they are kept.

• Quarantine reports which fail to be distributed - If checked, the reports which fail to be
distributed are saved to the quarantine folder; otherwise they are not.
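
How the three distribution options interact can be sketched in Python. This is an illustration of
the behaviour described above, not DocumentBurster code: the Settings class, the process_report
function and the send callable are hypothetical names, and the sketch assumes that a failed report
is copied (not moved) to the quarantine folder.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Settings:
    distribute: bool = True               # "Distribute reports to Email, FTP ..."
    delete_after_distribute: bool = False  # "Delete reports once they are distributed"
    quarantine_failed: bool = True         # "Quarantine reports which fail to be distributed"


def process_report(report: Path, settings: Settings, send, quarantine_dir: Path) -> str:
    """Apply the three options to one burst report and return the outcome."""
    if not settings.distribute:
        return "kept"
    try:
        send(report)  # hypothetical callable that emails or uploads the file
    except Exception:
        if settings.quarantine_failed:
            quarantine_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(report, quarantine_dir / report.name)
            return "quarantined"
        return "failed"
    if settings.delete_after_distribute:
        report.unlink()
        return "distributed and deleted"
    return "distributed"


def failing_send(report):
    """Stand-in for a delivery attempt that cannot reach the server."""
    raise ConnectionError("mail server unreachable")


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    report = work / "doc1.pdf"
    report.write_bytes(b"placeholder content")
    print(process_report(report, Settings(), failing_send, work / "quarantine"))
```

Deleting only after a successful delivery means a report that could not be sent is never lost,
whether or not quarantine is enabled.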

Note
$burst_token$, $input_document_name$ and $now; format="yyyy.MM.dd_HH.mm.ss"$ are variables. At run
time they are replaced with the value of the token used to burst the report, the name of the input
file and the current date formatted with the given pattern, respectively. For more details about
variables please read Chapter 4. Variables.

Chapter 3. Distributing reports
DocumentBurster can distribute the generated reports to many destination types, including email,
FTP, FTPS, SFTP, TFTP, Samba servers, Windows network shared drives and WebDAV. The WebDAV protocol
is used to distribute reports to enterprise portals such as Microsoft SharePoint, Oracle Portal or
SAP NetWeaver.

Email and FTP destinations are supported directly through the GUI, while the remaining destination
types require a small amount of scripting. This chapter describes how to set up the DocumentBurster
GUI for distributing reports by email and FTP.

For any report to be distributed, the Distribute reports to Email, FTP ... setting must be checked.
This setting was described in the previous section.

Distribute reports by email


Server settings
Emails can be sent through a Microsoft Exchange email server or through any other email server with
SMTP support (Gmail, Yahoo Mail, Hotmail and so on).

To distribute reports by email, the email server connection settings must be configured properly.

In the menu go to Actions -> Config… -> Email Settings -> Connection Settings

Most of the email settings are self-explanatory. The most important ones are the host, the user
name, the password and the port. If the email server requires SSL or TLS (as Gmail does, for
example), select the corresponding options.

If DocumentBurster is used with a Microsoft Exchange email server, configure it with the same email
settings as those already in use in the Microsoft Outlook email client.

For testing purposes, Gmail can be used as the email server for distributing a few reports. In this
case please read the Gmail settings documentation
[http://mail.google.com/support/bin/answer.py?hl=en&answer=13287].

Yahoo, Hotmail and other large email providers document the mail server settings for external
clients (often on their POP3 support pages), and these settings can be configured in
DocumentBurster. Please read the documentation of the email provider you plan to use. It is
advisable to use such public services for testing purposes only.

Note 1
If a firewall or antivirus software sits between DocumentBurster and the email server, it might
need to be configured to allow DocumentBurster to send emails.

Note 2
The email connection settings can be filled in dynamically at run time, while the reports are being
distributed, with the help of variables. For more details about variables please read Chapter 4.
Variables.

Note 3
If required, a network or IT administrator should be able to help with configuring the email server
settings.

Distribute to a single email address

The simplest email distribution scenario is sending each generated report to a single email address
of its own.

The To address is configured by default with the value of the $burst_token$ variable. If the burst
tokens are email addresses, for example {john.wayne@mail.de}, DocumentBurster sends the output
report to the corresponding address, in this case john.wayne@mail.de.

Distribute reports to multiple email addresses

Another scenario is to distribute each report to a group of people rather than to a single person.
Variables help here as well.

For example, DocumentBurster can be configured to send emails To $var0$, CC to $var1$ and BCC to
$var2$.

At run time the variables are filled with values from the report being distributed, so $var0$ might
be john.george@yahoo.com;joshua.robin@yahoo.com;mike.redford@gmail.com and DocumentBurster sends
the email To all of these addresses. The same is true for CC and BCC.

Multiple email addresses should be separated by either ; or ,
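
Splitting such a value into individual addresses can be sketched in a few lines of Python. The
function name split_addresses is made up for this example, and ignoring surrounding spaces and
empty entries is an assumption of the sketch rather than documented DocumentBurster behaviour.

```python
import re


def split_addresses(value):
    """Split a To/CC/BCC value on ';' or ',' and drop empty entries."""
    return [part.strip() for part in re.split(r"[;,]", value) if part.strip()]


to_value = "john.george@yahoo.com;joshua.robin@yahoo.com;mike.redford@gmail.com"
for address in split_addresses(to_value):
    print(address)
```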

Send plain text email messages


DocumentBurster can send configurable email messages that have the corresponding burst report
attached.

In the menu go to Actions -> Config… -> Email Settings -> Email Message

The subject and the text of the email messages can be configured dynamically with variables and
customized for each individual recipient.

Example - In the previous screenshot, when each individual report is distributed, the $var0$ and
$var1$ variables are replaced with values fetched from the burst report, such as John and July. The
email sent will have the message:

Hi John,

Attached you can find the invoice for the month of July.

Thank you.
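
The substitution behind this example can be sketched in Python. The template text is inferred from
the resulting message shown above, the render function is a hypothetical name, and leaving unknown
placeholders untouched is a choice made for this sketch only.

```python
import re

# A placeholder is a name between two dollar signs, e.g. $var0$.
PLACEHOLDER = re.compile(r"\$(\w+)\$")


def render(template, values):
    """Replace each $name$ placeholder; unknown names are left untouched."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)


template = (
    "Hi $var0$,\n\n"
    "Attached you can find the invoice for the month of $var1$.\n\n"
    "Thank you."
)
print(render(template, {"var0": "John", "var1": "July"}))
```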

The Save Template and Load Template buttons save message templates to, and load them from, external
text files.

Send HTML email messages

DocumentBurster can send email messages with rich formatting: color, images, headings, bulleted
lists, and emphasized, underlined or bold text.

To send emails with rich formatting, select the HTML email checkbox and define the HTML message as
valid HTML code containing the message to be distributed.

Note 1
It is advisable to provide an alternative plain text message for email clients that cannot display
HTML, such as text-based email clients.

Note 2
DocumentBurster resolves all image paths used in the HTML code relative to the ./templates parent
directory.

For example, the image sidebar-top-left.gif is referenced in sample.html with the relative path
src="html-sample/images/sidebar-top-left.gif", which is resolved starting from the ./templates
parent directory.

When DocumentBurster cannot resolve an image because the path defined in the HTML code is wrong,
the following happens:

• The output burst report is not distributed by email

• If quarantine is configured, the output burst report is copied to the quarantine folder

• An exception is logged in the logs/DocumentBurster.log log file. The log can be used to identify
and fix the problematic image path

Note 3
Variables can be used to define the part of the message that needs to be tailored for each
individual recipient.

Note 4
Before going live, it is advisable to test the HTML code that will be used, to avoid sending wrong
information to a client or customer.
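
To make Note 1 concrete, the sketch below builds a message with a plain text part, an HTML
alternative and a PDF attachment using the Python standard library. It shows the general structure
of such an email and makes no claim about how DocumentBurster assembles its messages; the
build_message function, the addresses and the file name are placeholders.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, plain_text, html, pdf_bytes, pdf_name):
    """Build an email with a plain text body, an HTML alternative and a PDF attachment."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(plain_text)                # shown by text-only clients
    message.add_alternative(html, subtype="html")  # shown by HTML-capable clients
    message.add_attachment(
        pdf_bytes, maintype="application", subtype="pdf", filename=pdf_name
    )
    return message


message = build_message(
    "reports@example.com",        # placeholder sender
    "john.wayne@mail.de",
    "Invoice for July",
    "Hi John,\n\nAttached you can find the invoice for the month of July.",
    "<html><body><p>Hi <b>John</b>,</p><p>Attached you can find the invoice.</p></body></html>",
    b"placeholder bytes standing in for a PDF file",
    "doc1.pdf",
)
print(message.get_content_type())
```

Sending the message (for example with smtplib) is left out so that the sketch runs without a mail
server.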

HTML message sample

DocumentBurster is packaged with a sample HTML email template located in templates/html-sample.
The template is called sample.html and contains a good-looking message that demonstrates the
capabilities of HTML emails.

The sample template has a layout complex enough to give an idea of what can be achieved with
HTML-formatted emails.

Distribute reports by FTP

DocumentBurster can distribute the generated reports by FTP. The following image shows the screen
where the FTP settings are configured.

The FTP address can be defined statically or formed at run time using variables. With the settings
defined in the above screen, DocumentBurster distributes a report by FTP whenever it is able to
generate a complete, valid FTP URL.

Read more about FTP URLs at http://www.cs.tut.fi/~jkorpela/ftpurl.html
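
As an illustration of what a complete FTP URL looks like, the Python sketch below assembles one
from its parts and checks that nothing is missing. The function names, host and credentials are
invented for the example, and treating a leftover $ as an unresolved variable is an assumption of
the sketch, not a documented DocumentBurster rule.

```python
from urllib.parse import quote, urlsplit


def build_ftp_url(user, password, host, path, port=21):
    """Assemble ftp://user:password@host:port/path, escaping reserved characters."""
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"ftp://{credentials}@{host}:{port}/{quote(path.lstrip('/'))}"


def is_complete_ftp_url(url):
    """Return True when the URL has the ftp scheme, a host and no unresolved $variable$."""
    parts = urlsplit(url)
    return parts.scheme == "ftp" and bool(parts.hostname) and "$" not in url


url = build_ftp_url("reports", "p@ss", "ftp.example.com", "invoices/doc1.pdf")
print(url)
print(is_complete_ftp_url(url))
```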

Chapter 4. Variables
DocumentBurster variables are pieces of information from the burst jobs which may change from
report to report and which can be used as data in the delivery of documents.

Variables can be used to define custom dynamic values for the following:

• Burst File Name

• Output Folder

• Backup Folder

• Quarantine Folder

• FTP destination URLs

• Email To, Cc and Bcc fields

• Email subject and email message text

• Email connection settings: From Name, From Email Address, Host, User Name, User Password and Port

Using variables, the values of the above are populated dynamically at run time with information
coming from the report being burst.

While it is possible to define static values for the output folders, it is not advisable. The
following are a few situations in which variables help:

• The same report is burst at different times. Bursting the same report to the same statically
defined output folders overwrites the files generated during previous runs.

• Several different reports use the same burst tokens (for example the email address of the same
client). With a common output folder, the generated reports overwrite each other between runs,
because the same burst token is found in multiple reports.

Variables with unique values generated at run time overcome these problems by giving each burst
session its own output folder names. Variables are also helpful for defining dynamic FTP or email
destination addresses and for defining personalized email messages that share the same email
message template.

DocumentBurster has two types of variables:

• Built-in variables

• User-defined variables

Built-in variables
Built-in variables contain information such as the name of the report being burst, the date (in
various formats) on which the bursting takes place and the burst token.

The following built-in variables are available in DocumentBurster:

• $input_document_name$ - the file name of the input report

• $burst_token$ - the burst token used for bursting the current file

• $burst_index$ - the index of the burst file. For example, the fourth file to be burst has the
value 4

• $now; format="yyyy.MM.dd_HH.mm.ss"$ - the current date and time in the default format. Custom
date formats can also be specified, for example to display the full date and time, or any
combination of year, month, week, day, hour, minute and second. yyyy.MM.dd_HH.mm.ss is the default
format provided with the software.

Note
Windows does not allow the character : in folder and file names.

Full date format documentation is available here
[http://download.oracle.com/javase/1.4.2/docs/api/java/text/SimpleDateFormat.html]

• $now_default_date$ - shortcut to the default date format in the computer's locale settings. U.S.
locale example: Jun 30, 2009

• $now_short_date$ - shortcut to the short date format in the computer's locale settings. U.S.
locale example: 6/30/09

• $now_medium_date$ - shortcut to the medium date format in the computer's locale settings. U.S.
locale example: Jun 30, 2009

• $now_long_date$ - shortcut to the long date format in the computer's locale settings. U.S. locale
example: June 30, 2009

• $now_full_date$ - shortcut to the full date format in the computer's locale settings. U.S. locale
example: Tuesday, June 30, 2009

Using built-in variables it is possible to build an advanced folder structure and archiving
solution for the output reports.

The following folder layouts are possible, alone or in combination:

• One output folder per input report

• One output folder per burst token

• Date-based folders - one output folder per year, financial quarter, month, week of the month or
day of the week, down to the level of hours, minutes and seconds.

Example
DocumentBurster comes with the following default settings:

• Burst File Name - $burst_token$.pdf

When bursting the sample burst.pdf, two files are generated, doc1.pdf and doc2.pdf, with doc1 being
the first burst token and doc2 the second.

• Output Folder, Backup Folder and Quarantine Folder - all use the same pattern
$input_document_name$/$now; format="yyyy.MM.dd_HH.mm.ss"$

When bursting the sample document burst.pdf, by default the output files are generated in a folder
similar to burst.pdf/2010.10.28_19.13.13
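
The expansion of these default patterns can be reproduced with a small Python sketch. It is an
approximation for illustration: DocumentBurster uses Java date patterns, so the sketch translates
only the six pattern fields of the default format into Python strftime codes, and the expand
function and its parameters are hypothetical names.

```python
import re
from datetime import datetime

# Partial translation of Java SimpleDateFormat fields to strftime codes.
# Only the fields used by the default pattern are covered in this sketch.
JAVA_TO_STRFTIME = {
    "yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H", "mm": "%M", "ss": "%S",
}
FIELD = re.compile("|".join(JAVA_TO_STRFTIME))
NOW = re.compile(r'\$now;\s*format="([^"]+)"\$')


def expand(template, input_document_name, burst_token, now):
    """Expand $now; format="..."$, $input_document_name$ and $burst_token$."""

    def format_now(match):
        pattern = FIELD.sub(lambda f: JAVA_TO_STRFTIME[f.group(0)], match.group(1))
        return now.strftime(pattern)

    text = NOW.sub(format_now, template)
    text = text.replace("$input_document_name$", input_document_name)
    return text.replace("$burst_token$", burst_token)


run_time = datetime(2010, 10, 28, 19, 13, 13)
folder = expand(
    '$input_document_name$/$now; format="yyyy.MM.dd_HH.mm.ss"$',
    "burst.pdf", "doc1", run_time,
)
print(folder)
print(expand("$burst_token$.pdf", "burst.pdf", "doc1", run_time))
```

Because the time stamp changes on every run, two bursts of the same report end up in different
folders and do not overwrite each other.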


User-defined variables
User defined variables can contain any text from the report which is being burst or distributed. Variables
might be used for sending emails with personalized subject and personalized text or for generating dynamic
file names and folder names for the output burst reports.

DocumentBurster has support for up to ten user defined variables $var0$, $var1$, $var2$, $var3$, $var4$,
$var5$, $var6$, $var7$, $var8$, $var9$ . While the variable names are not impressive they are for sure
handy to use.

For example, a requirement might be to generate the output file names using the following pattern Cus-
tomer Name-Invoice number-Invoice date.pdf . Using the user defined variables the requirement can be
achieved by defining the Burst File Name configuration like this $var0$-$var1$-$var2$.pdf . $var0$
will be replaced at runtime with the customer name, $var1$ with the invoice number and $var2$ with
the invoice date.

The values for the user defined variables are being populated with text content from the report which is
being splitted. This is not a mandatory requirement but usually the variables will have different values for
each different split token or for each different destination.

In order to populate the user defined variables with values, DocumentBurster engine is looking inside the
report for patterns like the following:

• <0> any text which should be assigned as a value to the first variable </0> or

• <1> any text which should be assigned as a value to the second variable </1>

The DocumentBurster engine supports up to ten different variables, so the last variable will look like <9> any text which should be assigned as a value to the 10th variable </9>.
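The tag convention described above can be illustrated with a minimal Groovy sketch. It is not the engine's actual parser: extractUserVariables is a hypothetical helper, and the regular expression is only an assumption about how <N>value</N> markers could be matched in the text of a page.

```groovy
// Hypothetical helper, not part of DocumentBurster: collects <N>value</N>
// markers (N = 0..9) from a page's text into a map keyed var0..var9.
Map<String, String> extractUserVariables(String pageText) {
    def values = [:]
    // (\d) captures the variable index, (.*?) lazily captures the value,
    // and \1 requires the closing tag to carry the same index.
    def matcher = pageText =~ '(?s)<(\\d)>(.*?)</\\1>'
    while (matcher.find()) {
        values["var${matcher.group(1)}".toString()] = matcher.group(2).trim()
    }
    return values
}

def page = 'Invoice <0>My Company, Inc</0> No. <1>001001</1> dated <2>09272002</2>'
def vars = extractUserVariables(page)
println vars
```

Values are trimmed in this sketch so that spaces placed between the tags and the text do not end up in file names; whether the engine does the same is not stated in this guide.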

User defined variables can be used to dynamically generate any of the following: Burst File Name, Output Folder, Backup Folder, Quarantine Folder, FTP destination URLs, Email To, Cc and Bcc fields, Email subject, email message text, From Name, From Email Address, Host, User Name, User Password and email server Port.

Note 1
The start and end tags of the variables, <0> and </0>, are usually colored white so that the visual appearance and the layout of the report are not affected.

Note 2
Before going live, it is advisable to practice the use of variables on a few sample reports. This avoids the unpleasant situation of sending wrong data to a client or customer.

Example - customizable burst file name


By default DocumentBurster generates the output file names using:

Burst File Name - $burst_token$.pdf, with $burst_token$ being the built-in system variable used to burst the separate files.

The requirement is to customize the output file names to follow the format Customer name-Invoice number-Invoice date.pdf. None of Customer name, Invoice number or Invoice date is used as the burst token.

This is achieved with user defined variables, so that:


• Customer name will be mapped to the first variable $var0$

• Invoice number will be mapped to the second variable $var1$

• Invoice date will be mapped to the third variable $var2$

Burst File Name will be defined as $var0$-$var1$-$var2$.pdf

$var0$, $var1$ and $var2$ variables will be populated at runtime with values fetched from each separate
report.


• Variables are fetched at runtime from each separate invoice. DocumentBurster is looking for
<N>value</N> patterns in each invoice.

• In the above example the name of the generated file will be My Company, Inc-001001-09272002.pdf
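How the Burst File Name pattern turns into the final file name can be sketched in a few lines of Groovy. This is only an illustration under stated assumptions: expandTemplate is a hypothetical helper, not a DocumentBurster API, and it handles plain $name$ placeholders only (not the $now; format="..."$ form).

```groovy
import java.util.regex.Matcher
import java.util.regex.Pattern

// Hypothetical helper: replaces each $name$ placeholder with its value
// and leaves placeholders without a known value untouched.
String expandTemplate(String template, Map<String, String> values) {
    Matcher m = Pattern.compile('\\$(\\w+)\\$').matcher(template)
    StringBuffer out = new StringBuffer()
    while (m.find()) {
        String name = m.group(1)
        String replacement = values.containsKey(name) ? values[name] : m.group(0)
        // quoteReplacement keeps any '$' or '\' in the value literal
        m.appendReplacement(out, Matcher.quoteReplacement(replacement))
    }
    m.appendTail(out)
    return out.toString()
}

def invoiceValues = [var0: 'My Company, Inc', var1: '001001', var2: '09272002']
def fileName = expandTemplate('$var0$-$var1$-$var2$.pdf', invoiceValues)
println fileName
```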

Chapter 5. Automatic polling for incoming reports
Watch a folder for incoming reports
DocumentBurster can poll a configurable folder and automatically pick up for processing all the reports found in this folder.

• Start Polling - Start polling of the selected folder. Once the polling is started, DocumentBurster will
automatically process all the reports which are dropped in the polled folder.

• End Polling - Stop the automatic polling of the reports.

• View Generated Reports - Browse the generated reports.

Chapter 6. DocumentBurster Server
DocumentBurster Server can be deployed as a central system to provide report bursting and report distribution services to multiple people or software applications within an organization.

DocumentBurster Server has the following additional capabilities:

• Server architecture in order to serve multiple concurrent clients

• Web based GUI console which is compatible with all the major browsers - Internet Explorer, Firefox,
Chrome and Safari

• Schedule nightly, monthly or custom schedules for the report bursting jobs

• Monitor the currently executing jobs and trace the status for previously submitted jobs

• High availability and high throughput can be achieved by starting multiple server instances

DocumentBurster Server is an extensible platform which lays the foundation for adding and tailoring other enterprise features which an organization might need:

• Burst, split and merge any report format (out of the box DocumentBurster can burst PDF reports)

• Full indexing and search capabilities for the reports which are being burst and distributed. In simple words, this feature makes it possible to quickly find a report which was distributed one year ago, even if the organization is distributing a large number of reports each month.

Installation
Prerequisites
Please read the DocumentBurster in 2 Minutes document with the prerequisites for running the software.
[http://www.pdfburst.com/reports-bursting-quickstart.pdf]

Download DocumentBurster Server


DocumentBurster Server can be downloaded from this link. [http://sourceforge.net/projects/documentburster/files/] Please make sure to download the latest available version of the documentburster-server-{version}.zip file.

Extract the zip archive to a folder like C:\DocumentBurster. The following two folders will be created:

• server - contains the binaries and the scripts for starting and stopping the report bursting server

• web-console - binaries and scripts for the DocumentBurster Server web console

Basic Usage
Server

• Configuration - the server is configured using the same GUI, server/DocumentBurster.exe, which was described in the previous chapters.

• Starting - once configured, the server can be started using the server/startServer.bat or server/startServer.sh script.

• Stopping - the server can be stopped using the server/shutServer.bat or server/shutServer.sh script.

• Processing reports - once started, the server listens for new reports to process in the server/poll directory. Any report which is dropped in the server/poll folder is automatically picked up and processed by the server.

Web console

The web console connects to the server, which needs to be started first.

• Starting - once the server is started, the web console can be started using the web-console/startConsole.bat or web-console/startConsole.sh script.

• Stopping - the console can be stopped using the web-console/shutConsole.bat or web-console/shutConsole.sh script.

Once started, the DocumentBurster Web Console application can be accessed by typing the following URL in the browser of choice - http://machine_name:8080/burst, for example http://localhost:8080/burst

Scheduling
DocumentBurster Server can handle scheduled report bursting and distribution jobs. By default the software handles jobs scheduled for nightly (midnight) execution. If this is what is required, there is nothing more to configure with regard to scheduling. On the other hand, the scheduling is easy to change and customize; familiarity with other cron-like schedulers will help in understanding the scheduling mechanism implemented in DocumentBurster. Yearly, monthly, weekly, daily, hourly or any other custom report bursting schedules are all easy to define.

For ad hoc, immediate report bursting jobs, DocumentBurster Server checks the server/poll folder, while scheduled reports should be placed in the server/input-files/scheduled directory. DocumentBurster will trigger the report bursting and report distribution jobs at the correct date and time, depending on how the scheduling was configured.

Configuration
DocumentBurster Server scheduling is configured using cron expressions. The default schedule triggers daily, at midnight.

At the end of the configuration file server/config/batch/internal/batch-context.xml there is an entry similar to

<task:scheduled-tasks scheduler="scheduler">
<task:scheduled ref="scheduled" method="run" cron="0 0 0 * * ?" />
</task:scheduled-tasks>

The text cron="0 0 0 * * ?" is the cron expression which configures the scheduling. 0 0 0 * * ? encodes the default daily (midnight) schedule. The default cron expression can be replaced with any other valid expression, based on the requirements, in order to schedule yearly, monthly, weekly, daily (at a different time) or hourly report processing jobs. Cron expression documentation is out of the scope of this user guide; more details about how to configure a cron expression can be found at:

• CRON expression - Wikipedia [http://en.wikipedia.org/wiki/CRON_expression]

• CRON expression - Quartz documentation [http://www.quartz-scheduler.org/docs/tutorials/crontrigger.html]

Note
In order for the new value to take effect, a server restart is required after the cron expression
configuration is changed.
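As a reading aid for expressions such as 0 0 0 * * ?, the following Groovy sketch splits a six-field cron expression into named fields. describeCron is a hypothetical helper and is not used by DocumentBurster; the field order (second, minute, hour, day of month, month, day of week) is the one used by Quartz-style expressions. The sketch does not validate the field values.

```groovy
// Hypothetical helper: names the six fields of a Quartz-style cron expression.
Map<String, String> describeCron(String expression) {
    def names = ['second', 'minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek']
    def parts = expression.trim().split('\\s+') as List
    if (parts.size() != names.size()) {
        throw new IllegalArgumentException(
            "Expected 6 fields but found ${parts.size()}: '${expression}'")
    }
    // pair each field name with its value, keeping the field order
    return [names, parts].transpose().collectEntries { name, value -> [(name): value] }
}

// default schedule from batch-context.xml: every day at midnight
println describeCron('0 0 0 * * ?')
// hourly, at minute 0 and second 0
println describeCron('0 0 * * * ?')
// first day of every month, at midnight
println describeCron('0 0 0 1 * ?')
```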

Web console
DocumentBurster Server comes with a web based console which can be accessed from any major web browser. The console can be used for triggering new ad hoc jobs, scheduling jobs for later execution, or viewing the status, history and detailed logs of previously submitted jobs.

The web console needs the server to be running. After the server is started, execute the web-console/startConsole script from the folder where the software was extracted. After a few seconds the application can be accessed by typing the following URL in the browser - http://machine_name:8080/burst, for example http://localhost:8080/burst.

Adhoc/immediate jobs - Reports can be immediately burst and distributed through the Files - Submit
Burst Jobs menu entry.

Scheduled jobs can be submitted through the Files - Schedule Burst Jobs menu entry. Uploaded report files will be placed in the server/input-files/scheduled folder and scheduled for execution at a later time.

DocumentBurster web console can be used for submitting new jobs for immediate execution, scheduling
jobs for later execution, viewing the currently running jobs and for checking the status, history and the
logs of the previously submitted jobs.

The following are some screenshots from the application:


DocumentBurster home page

Burst reports - Uploaded reports will be picked up and processed by the server

Schedule report bursting jobs - schedule documents distribution for executing at a later time


Jobs page

DocumentBurster Server jobs executions page - View status and history of distribution jobs


Detailed logging for the failed report bursting jobs

Windows - Run DocumentBurster Server at system startup
Being a server application, DocumentBurster Server can be configured to keep running in the background for as long as the operating system is running.

The following steps show how to schedule DocumentBurster Server to start automatically when the system starts. The screenshots are taken from Windows XP; the same can be achieved very similarly on any other Windows version.

The images are showing how to schedule the server/startServer.bat script in order to automat-
ically start the server component. The same can be done for web-console/startConsole.bat
script in order to automatically start the web console.

1. Please go to Windows -> Start -> Settings -> Control Panel -> Scheduled Tasks

2. Click Add Scheduled Task


3. Click Next and browse to the location of the server/startServer.bat script.


4. Click Next and select When my computer starts.


5. Click Next and give the domain, user name and the password to be used when the script is being executed.

6. Click Finish to get the task scheduled.


7. Done. server/startServer.bat script was scheduled to start when Windows is starting.

Chapter 7. Using scripts to achieve more
Scripts can help in squeezing more tailored functionality out of DocumentBurster. For example, there is no GUI command to archive the output burst reports in a single compressed file, while with a few lines of scripting it is easy to zip all the output files together.

DocumentBurster supports scripts written in Groovy, a scripting language for the Java platform. DocumentBurster Groovy scripts can make use of any existing Java code and library.

This chapter shows how to use the scripting capabilities of the software and how to customize DocumentBurster using the sample scripts which are provided with the package.

Scripting scenarios
DocumentBurster supports injecting tailored behavior during the normal bursting lifecycle. There is a set of predefined exit points in which custom logic can be implemented using scripting. For example, there is an endBursting lifecycle phase in which a few lines of code can zip together all the burst files, which would otherwise be left as separate files in the output folder.

The following list gives an idea of the kind of things which are possible using the DocumentBurster scripting capabilities:

File related capabilities


• Copy - Copy a file or a set of files to a new file or directory.

• Delete - Deletes a single file, all files and sub-directories in a specified directory, or a set of files specified with a wildcard (*) file pattern.

• Mkdir - Creates a directory. Non-existent parent directories are created, when necessary.

• Move - Moves a file to a new file or directory, or a set(s) of file(s) to a new directory.

• Archive - Zip, GZip, BZip2 or Tar the burst reports.

• Other file related capabilities - Change the permissions and/or attributes of a file or all files inside the
specified directories, generate or verify a checksum for a file or set of files and also touch the files.

Sample
For an example on how to zip or delete files, please see the existing scripts/burst/samples/zip.groovy sample script.

Execute external programs


When integrating DocumentBurster with existing software, the following capability will be of interest. It is possible to call any external executable at some predefined points during the report bursting and report distribution flow.

Exec - Execute a system command. When the OS attribute is specified, the command is only executed on
one of the specified operating systems.


Sample
The external program to be demonstrated is Pdftk - http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/

pdftk, or the pdf toolkit, is a cross-platform tool for manipulating PDF documents.

It is easy to integrate pdftk with DocumentBurster in order to achieve additional powerful capabilities.

pdftk is capable of splitting, merging, encrypting, decrypting, uncompressing, recompressing, and repairing PDFs. It can also be used to manipulate watermarks, metadata, and to fill PDF Forms with FDF Data (Forms Data Format) or XFDF Data (XML Form Data Format).

Install Pdftk

• Please download pdftk from this location - http://www.pdflabs.com/docs/install-pdftk/

• Make sure to download the binaries which are specific to the target operating system.

• Under Microsoft Windows, move pdftk.exe and libiconv2.dll to the folder where DocumentBurster was installed, next to the DocumentBurster.exe file.

For an example on how to execute pdftk during the report bursting life cycle, please see the
existing scripts/burst/samples/exec_pdftk_background.groovy sample script.
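To give an idea of what calling pdftk from a script involves, here is a minimal Groovy sketch which only assembles the command line for pdftk's background operation. It is not the content of the packaged sample script: pdftkBackgroundCommand and the file names are hypothetical, and running the command assumes that pdftk can be found by the operating system.

```groovy
// Hypothetical helper: builds the argument list for
//   pdftk <input> background <background> output <output>
List<String> pdftkBackgroundCommand(String inputPdf, String backgroundPdf, String outputPdf) {
    return ['pdftk', inputPdf, 'background', backgroundPdf, 'output', outputPdf]
}

def command = pdftkBackgroundCommand('doc1.pdf', 'letterhead.pdf', 'doc1-stamped.pdf')
println command.join(' ')

// To actually run it (requires pdftk to be installed), something like the
// following could be used:
//   def process = new ProcessBuilder(command).redirectErrorStream(true).start()
//   println process.inputStream.text
//   assert process.waitFor() == 0
```

Passing the arguments as a list, instead of one concatenated string, avoids quoting problems when file names contain spaces.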

Publish reports to Microsoft SharePoint portal


Using scripting, it is possible to publish reports directly to enterprise portals. Think of the use case where there are a few hundred or a few thousand customers and dealers and, with a single click, the relevant individual reports can be made available to each one of them on the portal.

DocumentBurster distributes the reports to portals using the WebDAV protocol. The following products all support WebDAV, so DocumentBurster is capable of distributing reports to them:

• Microsoft SharePoint

• IBM WebSphere Portal

• Oracle Portal

• SAP NetWeaver

• Tibco PortalBuilder

• Samsung ACUBE Portal

• Liferay Portal, Hippo portal, JBoss Enterprise Portal, eXo and Apache Portal

Distribute messages to SMS, Fax or print reports


• SMS messages can be delivered, via email, through an online SMS gateway service. In such a scenario DocumentBurster is configured to send an email to the SMS gateway in which the text of the message and the destination number are specified. The SMS gateway will transform the email message and deliver it, using SMS, to the specified number. Using scripting, DocumentBurster can send configured SMS messages to any gateway service. For a list of available online SMS gateways just Google for 'list of SMS gateways'. The SMS gateway which best fits the needs can be selected and DocumentBurster will distribute SMS messages using it.

• Fax the reports

There are various ways of sending documents by fax using the computer.

The simplest way is to use an existing fax online gateway to which the reports are sent as an email
attachment. The gateway will further forward the reports by fax to the specified number. For a list of
available online fax gateways just Google for 'list of fax gateways' .

As an alternative, it is possible to send faxes by configuring a dial-up modem to work with specialized fax software. Microsoft Fax can be used as a fax software service on Windows. For instructions on enabling Microsoft Fax, please consult the appropriate Microsoft knowledge base article on the Microsoft website. HylaFAX or AsterFax™ - Asterisk Fax are valid fax software solutions which can be used on UNIX/Linux systems. Using scripting it is possible to integrate DocumentBurster with any of the previously enumerated fax products, although this requires some customization effort to integrate with the specific fax vendor APIs.

• Printer - DocumentBurster can print the output burst reports directly to physical printers.

Mail, FTP, FTPs and SFTP


With a little bit of scripting it is possible to send reports by email, upload to FTP or FTPs and copy files
to SFTP using SSH.

While sending the burst reports by email is available through the GUI interface, sometimes more flexibility
can be achieved with the help of DocumentBurster scripting. One example is that using scripting it is
possible, if required, to send emails without attachments to any SMS gateway - by default, through the
GUI interface, all the emails which are sent will have a corresponding burst report attached.


Upload reports to a shared location


DocumentBurster can upload the generated reports to a network shared location.

Encrypt or stamp the output reports


Using scripting, DocumentBurster can encrypt the output reports. This feature is commonly used to prevent unauthorized viewing, printing, editing, annotating and copying of text from the document. It is also possible to require a password in order to view the report.

Sample
For an example on how to encrypt and password protect the burst reports, please see the existing
scripts/burst/samples/encrypt.groovy sample script.

DocumentBurster can stamp the distributed reports in much the same way that a rubber stamp is applied to a paper document. If required, it is possible to apply Bates stamping, page numbering, text stamping, logo insertion or add headers/footers and watermarks to the reports.

Sample
For an example on how to stamp the burst reports, please see the existing scripts/burst/samples/overlay.groovy sample script.

Introduction to the Burst Lifecycle


During report processing, DocumentBurster defines a set of exit points which can be used to customize the default software behavior. Before the bursting starts, the very first place which can be customized is the controller. Following the controller comes a list of sequentially ordered burst phases. The burst lifecycle has the following ordered phases:

• startBursting - event triggered when the burst is starting

• startParsePage - event triggered before a page text is parsed

• endParsePage - event triggered after a page text was parsed

• startExtractDocument - event triggered before a burst report is extracted

• endExtractDocument - event triggered after a burst report was just extracted

• startDistributeDocument - event triggered before a burst report is distributed (by either email or FTP)

• endDistributeDocument - event triggered after a burst report was just distributed

• quarantineDocument - event triggered whenever a report failed to be distributed and is being quarantined

• endBursting - event triggered when the burst is finishing

Controller
scripts/burst/controller.groovy is the first script which is executed when a report is being processed.

When different categories of reports need to be processed using different program settings, customizing scripts/burst/controller.groovy is the way to achieve it.

Use case example - there might be two different types of reports, marketing reports and financial reports. Marketing reports should be processed using one set of settings and the financial reports using another. Furthermore, financial reports should be published to SharePoint, while the marketing reports should be archived in a zip file.

In this case, the custom scripts/burst/controller.groovy will be similar to:

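// marketing reports: load their own settings and zip everything when the burst ends
if (ctx.inputDocumentFilePath.contains("marketing")) {
    ctx.settings.loadSettings("./config/marketing.xml")
    ctx.scripts.endBursting = "marketingZipAllBurstReports.groovy"
} else {
    // all the other (financial) reports: upload each report once it is extracted
    ctx.settings.loadSettings("./config/financial.xml")
    ctx.scripts.endExtractDocument = "financialSharePointUpload.groovy"
}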

The code is self explanatory. The intention is to:

• process the marketing reports using the settings defined in marketing.xml and also zip all the mar-
keting reports.

• process the financial reports using the settings defined in financial.xml and publish the reports
to SharePoint.

ctx is a special bursting context object available, throughout the bursting lifecycle, in all the scripts. The Bursting context section provides more details about it.

It is not mandatory to customize the controller. By default all the reports are processed using the existing config/burst/settings.xml configuration file. DocumentBurster is packaged with a set of empty script templates, one for each bursting lifecycle phase. Most of the time, filling the custom code into an existing script template will suffice, without any need to modify the controller.
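When the controller logic grows beyond two report categories, it can help to keep the routing decision in a small function which can be tried out on its own. The following Groovy sketch mirrors the marketing/financial use case; routeReport is a hypothetical helper, and the case-insensitive match is a deliberate variation on the example controller, which matches "marketing" case-sensitively.

```groovy
// Hypothetical helper mirroring the controller use case: it only decides
// which settings file and which script apply, without touching ctx.
Map<String, String> routeReport(String inputDocumentFilePath) {
    if (inputDocumentFilePath.toLowerCase().contains('marketing')) {
        return [settings: './config/marketing.xml',
                phase   : 'endBursting',
                script  : 'marketingZipAllBurstReports.groovy']
    }
    // everything else is treated as a financial report
    return [settings: './config/financial.xml',
            phase   : 'endExtractDocument',
            script  : 'financialSharePointUpload.groovy']
}

println routeReport('/reports/Marketing-2010-Q3.pdf')
println routeReport('/reports/finance-2010-Q3.pdf')

// Inside controller.groovy the decision could then be applied along these lines:
//   def route = routeReport(ctx.inputDocumentFilePath)
//   ctx.settings.loadSettings(route.settings)
//   if (route.phase == 'endBursting') ctx.scripts.endBursting = route.script
//   else ctx.scripts.endExtractDocument = route.script
```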

Bursting context
Bursting context is an object implicitly available for scripting throughout all the bursting lifecycle phases. The bursting context is available in scripts as a variable named ctx. The following information is available in the bursting context.

public String inputDocumentFilePath;
public int numberOfPages;

public Settings settings;
public Variables variables;
public Scripts scripts;

public int currentPageIndex;
public String currentPageText;
public String previousPageText;

public String token;

public String outputFolder;
public String backupFolder;
public String quarantineFolder;

public String extractFilePath;

• ctx.inputDocumentFilePath - file path of the report which is being processed.

Lifespan - ctx.inputDocumentFilePath is available for all of the bursting lifecycle phases.

• ctx.numberOfPages - number of pages for the report which is being processed.

Lifespan - ctx.numberOfPages is available for all the bursting lifecycle phases.

• ctx.settings - contains the settings used to process the current report. The following settings fields might be of interest while scripting: burstFileName, outputFolder, backupFolder, quarantineFolder, sendFiles, deleteFiles, quarantineFiles - with the last three fields being of type boolean.

Lifespan - ctx.settings is available throughout the bursting lifecycle, from the first startBursting phase up to the last endBursting phase.

• ctx.variables - Map<String, Object> which contains both the built-in and the user defined variables.

The built-in variables are accessible using the ctx.variables.get(variableName) syntax.

For instance, the syntax

ctx.variables.get("input_document_name")

will return the file name of the input report.

The values of the following built-in variables can be returned similarly:

input_document_name, burst_token, burst_index, now, now_default_date, now_short_date, now_medium_date, now_long_date, now_full_date, now_default_time, now_short_time, now_medium_time, now_long_time, now_full_time and now_quarter.

User defined variables are populated and available for each separate burst token. The syntax to access the user variables is ctx.variables.getUserVariables(ctx.token).get(variableName).

For example the code,

ctx.variables.getUserVariables("johnny-bravo@amazon.com").get("var0")

will return the first user variable for the token johnny-bravo@amazon.com.

While the code,

ctx.variables.getUserVariables(ctx.token).get("var0")


will return the first user variable for the current burst token.

Lifespan - Besides burst_token and burst_index, all the other built-in variables are available throughout the bursting lifecycle, from the first startBursting phase up to the last endBursting phase.

burst_token and burst_index are populated while the burst reports are being generated and are available in startExtractDocument, endExtractDocument, startDistributeDocument, endDistributeDocument and quarantineDocument.

User variables are progressively populated while the report pages are being parsed and they become fully available for the startExtractDocument, endExtractDocument, startDistributeDocument, endDistributeDocument, quarantineDocument and endBursting phases.

• ctx.scripts - keeps track of the Groovy scripts to be executed for each of the bursting phases. DocumentBurster comes with nine empty script templates found under the scripts/burst folder. The existing templates are suitable for most scripting situations. For example, to add some custom behavior when the bursting is finished, the simplest way is to write the tailored logic by editing the existing empty endBursting.groovy template script.

However, there might be cases in which totally new Groovy scripts need to be associated with some bursting events.

The syntax to specify a custom script is ctx.scripts.eventName = "scriptName.groovy"

For example

ctx.scripts.endExtractDocument = "myCustomScript.groovy"

will assign myCustomScript.groovy to be executed each time a report has just been extracted.

Following are all the phases/events for which custom scripts can be associated:

• ctx.scripts.startBursting

• ctx.scripts.endBursting

• ctx.scripts.startParsePage

• ctx.scripts.endParsePage

• ctx.scripts.startExtractDocument

• ctx.scripts.endExtractDocument

• ctx.scripts.startDistributeDocument

• ctx.scripts.endDistributeDocument

• ctx.scripts.quarantineDistributeDocument

If required, new scripts can be associated with the bursting lifecycle phases in the controller. For an example please see the Controller section.

Lifespan - ctx.scripts is available throughout all the bursting lifecycle phases/events.

• ctx.currentPageIndex, ctx.currentPageText, ctx.previousPageText - the index of the current page which is being parsed and the text of the current and of the previous pages.

Lifespan - ctx.currentPageIndex, ctx.currentPageText, ctx.previousPageText are available for the startParsePage and endParsePage phases/events.

• ctx.outputFolder, ctx.backupFolder, ctx.quarantineFolder - the output folder, backup folder and quarantine folder for the burst reports.

Lifespan - ctx.outputFolder, ctx.backupFolder, ctx.quarantineFolder are available for the startExtractDocument, endExtractDocument, startDistributeDocument, endDistributeDocument, quarantineDocument and endBursting phases/events.

• ctx.token - the token used to extract and process the current burst report

Lifespan - ctx.token is available for the startExtractDocument, endExtractDocument, startDistributeDocument, endDistributeDocument and quarantineDocument phases/events.

• ctx.extractFilePath - the path of the current file which is being extracted

Lifespan - ctx.extractFilePath is available for the startExtractDocument, endExtractDocument, startDistributeDocument, endDistributeDocument and quarantineDocument phases/events.
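The Lifespan statements above can be condensed into a small lookup table, which is convenient when deciding in which phase a script should run. The Groovy sketch below is only a restatement of this section as data; availability and availableIn are illustrative names, and ctx.variables is left out because its lifespan depends on the individual variable.

```groovy
// Ordered lifecycle phases, as listed in the burst lifecycle introduction
def phases = ['startBursting', 'startParsePage', 'endParsePage',
              'startExtractDocument', 'endExtractDocument',
              'startDistributeDocument', 'endDistributeDocument',
              'quarantineDocument', 'endBursting']

def pagePhases = ['startParsePage', 'endParsePage']
def documentPhases = phases[3..7]                    // startExtractDocument .. quarantineDocument
def folderPhases = documentPhases + ['endBursting']

// ctx field -> phases in which the field is documented as available
def availability = [
    inputDocumentFilePath: phases,
    numberOfPages        : phases,
    settings             : phases,
    scripts              : phases,
    currentPageIndex     : pagePhases,
    currentPageText      : pagePhases,
    previousPageText     : pagePhases,
    outputFolder         : folderPhases,
    backupFolder         : folderPhases,
    quarantineFolder     : folderPhases,
    token                : documentPhases,
    extractFilePath      : documentPhases
]

// Lists the ctx fields which can be relied on in a given phase
def availableIn = { String phase ->
    availability.findAll { field, where -> phase in where }.keySet().toList()
}

println availableIn('endBursting')
println availableIn('startParsePage')
```

For instance, the table makes it visible that a script running in endBursting can use ctx.outputFolder but not ctx.token.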

Sample scripts
DocumentBurster comes with a number of sample scripts which can be used as a starting point for implementing other custom requirements. All the sample scripts are available in the scripts/burst/samples folder.

zip.groovy
By default DocumentBurster does not archive the output burst reports. By running a few lines of script during the endBursting phase, it is possible to capture and zip together all the burst files in a single file.

Edit the script scripts/burst/endBursting.groovy with the content found in scripts/burst/samples/zip.groovy and then burst a new report. Now, every time a report is burst, the output files will be archived together in a single zip file.

Similarly, if required, the output files can be archived with different formats and algorithms such as gzip, bzip2 or tar. For a complete list and documentation of the available options please consult the help page of the Ant Archive Tasks - http://ant.apache.org/manual/tasksoverview.html#archive

The following code should be self explanatory. To customize the name of the zip output file, change the value of the zipFilePath variable as needed.

import com.smartwish.documentburster.variables.Variables

/*
*
* 1. This script should be used for zipping the output burst files
* in a single file.
*
* 2. The script should be executed during the endBursting report
* bursting lifecycle phase.
*
* 3. Please copy and paste the content
* of this script into the existing
* scripts/burst/endBursting.groovy script.
*
* 4. The script is doing basic archiving of all the output
* PDF files in a single zip file.
* Running multiple times the same input report will
* override the output zip file between the consecutive runs.
*
* 5. More complex archiving requirements can be achieved
* by modifying this starting script.
*
*/

//zipFilePath variable keeps the name of the zip file.
//When bursting a report burst.pdf
//the output zip file will be named burst.pdf.zip and will contain
//inside all the generated reports
def zipFilePath = ctx.outputFolder + "/" +
    ctx.variables.get(Variables.INPUT_DOCUMENT_NAME) + ".zip"

def ant = new AntBuilder()

//zip together all the individual PDF burst reports
ant.zip(destfile: zipFilePath, basedir: ctx.outputFolder, includes: "**/*.pdf")

//finally, delete the individual PDF burst reports
ant.delete { fileset(dir: ctx.outputFolder, includes: "**/*.pdf") }
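Note 4 in the script header points out that running the same input report multiple times overrides the zip file. One possible way around this, sketched below in Groovy, is to add a timestamp to the zip file name. timestampedZipPath is a hypothetical helper, not part of the packaged sample, and the date pattern is simply borrowed from the default output folder pattern.

```groovy
import java.text.SimpleDateFormat

// Hypothetical helper: builds a zip file path which is unique per run by
// appending a timestamp to the input document name.
String timestampedZipPath(String outputFolder, String inputDocumentName, Date when) {
    def stamp = new SimpleDateFormat('yyyy.MM.dd_HH.mm.ss').format(when)
    return "${outputFolder}/${inputDocumentName}_${stamp}.zip".toString()
}

def when = new GregorianCalendar(2010, Calendar.OCTOBER, 28, 19, 13, 13).time
println timestampedZipPath('output', 'burst.pdf', when)

// In endBursting.groovy the zipFilePath assignment could then become, for example:
//   def zipFilePath = timestampedZipPath(ctx.outputFolder,
//       ctx.variables.get(Variables.INPUT_DOCUMENT_NAME), new Date())
```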

encrypt.groovy
By default DocumentBurster does not encrypt or password protect the output burst reports. By placing a few lines of script in the endExtractDocument phase, it is possible to encrypt and password protect all the output files - http://en.wikipedia.org/wiki/Portable_Document_Format#Security_and_signatures

Edit the script scripts/burst/endExtractDocument.groovy with the content found in scripts/burst/samples/encrypt.groovy and then burst a new report. Now, every time a report is burst, the output files will be encrypted with both an owner and a user password.

The default user and owner passwords have the same value, which is the value of the $burst_token$ variable. For example, when bursting the sample report samples/burst.pdf, two output files will be generated, doc1.pdf and doc2.pdf. The password for the first report is doc1 and for the second one is doc2, with both passwords being generated from the $burst_token$ variable.

Similarly, if required, the output files can be encrypted with the following additional possibilities:

• Certification file

• Set the assemble permission

• Set the extraction permission

• Set the fill in form permission

• Set the modify permission

• Set the modify annots permission

• Set the print permission

• Set the print degraded permission

• The number of bits for the encryption key

For a complete list and documentation of the available encrypt options please consult the help page of the
PDFBox encrypt utility - http://pdfbox.apache.org/commandlineutilities/Encrypt.html
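
As a rough illustration of how such options could be combined, the sketch below assembles an argument string with two permissions switched off and a longer key. The helper name buildEncryptOptions is hypothetical, and the flag names (-canPrint, -canModify, -keyLength) are assumed from the PDFBox Encrypt help page, so please verify them against the PDFBox version in use.

```groovy
// Hypothetical helper: assembles the argument string for the PDFBox Encrypt
// utility. The flag names are assumptions taken from the PDFBox help page.
String buildEncryptOptions(String ownerPassword, String userPassword,
                           Map<String, Object> extraFlags, String inputFile) {
    // passwords and file path are quoted so that white space does not split them
    def parts = ["-O \"${ownerPassword}\"", "-U \"${userPassword}\""]
    extraFlags.each { flag, value -> parts << "-${flag} ${value}" }
    parts << "\"${inputFile}\""
    return parts.join(" ")
}

// Example values only; in the sample script the password defaults to the burst token.
def encryptOptions = buildEncryptOptions("owner-secret", "doc1",
        [canPrint: false, canModify: false, keyLength: 128], "output/doc1.pdf")
println encryptOptions
```

The quoting is deliberately simple: a password that itself contains a double quote would need extra escaping.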

The following code should be self-explanatory. To customize the passwords, use the following syntax
to access the value of a variable - ctx.variables.getUserVariables(ctx.token).get(variableName) .

import com.smartwish.documentburster.variables.Variables

/*
*
* 1. This script should be used for achieving PDF report
* encryption capabilities.
*
* 2. The script should be executed during the endExtractDocument
* report bursting lifecycle phase.
*
* 3. Please copy and paste the content of this sample script
* into the existing scripts/burst/endExtractDocument.groovy
* script.
*
* 4. Following PDF encryption scenarios are possible:
*
* 4.1 - Set the owner and user PDF passwords. Default is none.
* 4.2 - Protect the report with an X.509 certificate file.
* Default is none.
* 4.3 - Set the assemble permission. Default is true.
* 4.4 - Set the extraction permission. Default is true.
* 4.5 - Set the fill in form permission. Default is true.
* 4.6 - Set the modify permission. Default is true.
* 4.7 - Set the modify annots permission. Default is true.
* 4.8 - Set the print permission. Default is true.
* 4.9 - Set the print degraded permission. Default is true.
* 4.10 - Sets the number of bits for the encryption key.
* Default is 40.
*
* 5. For a full list and documentation of the various PDF encryption
* capabilities please see
* http://pdfbox.apache.org/commandlineutilities/Encrypt.html
*
*/

/*
*
* Warning:
*
* 1. Normally there should not be any need for you to modify
* the value of pdfBoxClassPath.
*
* 2. You should only double check that the values of
* the hard-coded jar paths/versions are still valid.
* With new releases of the software the jar paths/versions
* might become obsolete.
*
* 3. If required, modify the paths/versions with care.
* Having the pdfBoxClassPath wrong will cause the
* following ant.exec/pdfbox call to fail.
*
*/

//File.pathSeparator is ";" on Windows and ":" on Unix/Linux/MacOSX
def pdfBoxClassPath = [
    "lib/burst/pdfbox-1.0.0.jar",
    "lib/burst/commons-logging-1.1.1.jar",
    "lib/burst/jempbox-1.0.0.jar",
    "lib/burst/fontbox-1.0.0.jar",
    "lib/burst/bcmail-jdk15-1.44.jar",
    "lib/burst/bcprov-jdk15-1.44.jar"
].join(File.pathSeparator)

/*
*
* 1. encryptOptions are the arguments which are passed for
* PDF encryption.
*
* 2. By default, encryptOptions defines the
* owner (-O) and user (-U) passwords with the same
* value, that of the $burst_token$ system variable.
*
* 3. You can customize it to use different user and owner
* passwords, which can be fetched from the values
* of any variable such as var0, var1, etc.
*
*/

def burstToken = ctx.token

/*
*
* Following is an example to access the value of the first
* user defined variable var0.
*
* def password = ctx.variables.getUserVariables(ctx.token).get("var0")
*
*/

def password = burstToken

def inputFile = ctx.extractFilePath

/*
*
* 1. By changing the encryptOptions arguments you can
* achieve more PDF encryption features such as applying
* certification files, modifying the permissions on the report
* and modifying the length of the key which is used
* during encryption.
*
* 2. For a full list and documentation of the various
* PDF encryption capabilities please see
* http://pdfbox.apache.org/commandlineutilities/Encrypt.html
*
* 3. Gotchas: Take care if you want to pass an argument
* that contains white space - it will be split into
* multiple arguments. This is the reason why
* in encryptOptions all the string arguments are
* surrounded with the \" character.
*
* For more details please read
* http://groovy.codehaus.org/Executing+External+Processes+From+Groovy
*
*/

def encryptOptions = "-O \"$password\" -U \"$password\" \"$inputFile\""

log.info("encryptOptions = $encryptOptions")

def ant = new AntBuilder()

ant.exec(outputproperty:"cmdOut",
         errorproperty: "cmdErr",
         resultproperty:"cmdExit",
         failonerror: "false",
         executable: 'java') {
    arg(line:"-cp $pdfBoxClassPath org.apache.pdfbox.Encrypt $encryptOptions")
}

println "return code: ${ant.project.properties.cmdExit}"
println "stderr: ${ant.project.properties.cmdErr}"
println "stdout: ${ant.project.properties.cmdOut}"
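
As a variation on the password handling above, the following sketch prefers the value of a user variable and falls back to the burst token when that variable is missing or blank. The helper name choosePassword is hypothetical, and the plain maps stand in for what ctx.variables.getUserVariables(ctx.token) is assumed to return.

```groovy
// Hypothetical helper: prefer the value of a user variable (e.g. var0) as the
// password and fall back to the burst token when it is missing or blank.
String choosePassword(Map<String, String> userVariables, String variableName, String burstToken) {
    def candidate = userVariables?.get(variableName)?.trim()
    return candidate ? candidate : burstToken
}

// Stand-in data; in the real script the map would come from
// ctx.variables.getUserVariables(ctx.token) and the token from ctx.token.
println choosePassword([var0: " s3cret "], "var0", "doc1")
println choosePassword([:], "var0", "doc2")
```

Falling back to the token keeps the default behaviour of the sample script for reports that do not define the variable.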

overlay.groovy
Using this sample script, DocumentBurster can stamp the output burst reports. The script should be
executed during the endExtractDocument report bursting life cycle phase. The script uses the
samples/stamp.pdf file to overlay the output burst reports. It is easy to customize the overlay with a
different custom stamp.

Edit the script scripts/burst/endExtractDocument.groovy with the content found in
scripts/burst/samples/overlay.groovy and then burst a new report. Now, every time a report
is burst, the output files will be stamped with the samples/stamp.pdf file.

The following code should be self-explanatory. To customize the overlay document, please replace the
existing samples/stamp.pdf with a different file.

import com.smartwish.documentburster.variables.Variables

/*
*
* 1. This script should be used to overlay one document with the content
* of another document.
*
* 2. The script should be executed during the endExtractDocument
* report bursting lifecycle phase.
*
* 3. Please copy and paste the content of this sample script
* into the existing scripts/burst/endExtractDocument.groovy
* script.
*
* 4. For full documentation of the PDF overlay capability
* please see
* http://pdfbox.apache.org/commandlineutilities/Overlay.html
*
*/

/*
*
* Warning:
*
* 1. Normally there should not be any need for you to modify
* the value of pdfBoxClassPath.
*
* 2. You should only double check that the values of
* the hard-coded jar paths/versions are still valid.
* With new releases of the software the jar paths/versions
* might become obsolete.
*
* 3. If required, modify the paths/versions with care.
* Having the pdfBoxClassPath wrong will cause the
* following ant.exec/pdfbox call to fail.
*
*/

//File.pathSeparator is ";" on Windows and ":" on Unix/Linux/MacOSX
def pdfBoxClassPath = [
    "lib/burst/pdfbox-1.0.0.jar",
    "lib/burst/commons-logging-1.1.1.jar",
    "lib/burst/jempbox-1.0.0.jar",
    "lib/burst/fontbox-1.0.0.jar",
    "lib/burst/bcmail-jdk15-1.44.jar",
    "lib/burst/bcprov-jdk15-1.44.jar"
].join(File.pathSeparator)

//apply the samples/stamp.pdf as overlay
//for the extracted report
def inputFile = ctx.extractFilePath

def overlayOptions = "samples/stamp.pdf \"$inputFile\" \"$inputFile\""

log.info("overlayOptions = $overlayOptions")

def ant = new AntBuilder()

ant.exec(outputproperty:"cmdOut",
         errorproperty: "cmdErr",
         resultproperty:"cmdExit",
         failonerror: "false",
         executable: 'java') {
    arg(line:"-cp $pdfBoxClassPath org.apache.pdfbox.Overlay $overlayOptions")
}

println "return code: ${ant.project.properties.cmdExit}"
println "stderr: ${ant.project.properties.cmdErr}"
println "stdout: ${ant.project.properties.cmdOut}"
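
The sample scripts run the external command with failonerror set to false and only print the return code. If a failed command should not pass unnoticed, a small check along the following lines could be added after the ant.exec call. This is a sketch: the helper name assertCommandSucceeded is hypothetical, and how DocumentBurster reacts to an exception thrown from a script is not covered in this guide.

```groovy
// Hypothetical helper: fails loudly when an external command did not succeed.
// exitCode is the string Ant stores in the resultproperty of ant.exec; it is
// treated as a failure when it is missing or different from "0".
void assertCommandSucceeded(String exitCode, String stderr) {
    if (exitCode == null || exitCode.trim() != "0") {
        throw new IllegalStateException(
            "External command failed (exit code: " + exitCode + "): " + stderr)
    }
}

// Inside the sample script the call could look like this:
// assertCommandSucceeded(ant.project.properties.cmdExit, ant.project.properties.cmdErr)
assertCommandSucceeded("0", "")
println "command succeeded"
```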

exec_pdftk_background.groovy
Using this sample script, DocumentBurster can apply a PDF watermark to the background of the output
burst reports. The script should be executed during the endExtractDocument report bursting life cycle
phase. The script applies the samples/stamp.pdf file as a background to the output burst
reports. It is easy to customize the background operation with a different custom stamp.

Edit the script scripts/burst/endExtractDocument.groovy with the content found in
scripts/burst/samples/exec_pdftk_background.groovy and then burst a new report.
Now, every time a report is burst, the output files will be stamped with the samples/stamp.pdf file.

The following code should be self-explanatory. To customize the background stamp, please replace the
existing samples/stamp.pdf with a different custom file.

import com.smartwish.documentburster.variables.Variables

/*
*
* 1. This script should be used:
*
* 1.1 - As a sample script to call an external executable during the report burst
* life cycle.
* 1.2 - As a sample for applying a PDF watermark to the background
* of the burst reports.
*
* 2. The external program to be demonstrated is pdftk
* http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/
*
* 3. pdftk or the pdf toolkit is a cross-platform tool for manipulating
* PDF documents. pdftk is basically a front end to the iText library
* (compiled to native code using GCJ), capable of splitting, merging, encrypting,
* decrypting, uncompressing, recompressing, and repairing PDFs.
* It can also be used to manipulate watermarks, metadata, and to fill PDF Forms
* with FDF Data (Forms Data Format) or XFDF Data (XML Form Data Format).
*
* 4. The script should be executed during the endExtractDocument
* report bursting lifecycle phase.
*
* 5. Please copy and paste the content of this sample script
* into the existing scripts/burst/endExtractDocument.groovy
* script.
*
* 6. For full documentation of the PDF background capability
* please see
* http://www.pdflabs.com/docs/pdftk-man-page/#dest-op-background
*
*/

def extractFilePath = ctx.extractFilePath
def stampedFilePath = ctx.extractFilePath + "_stamped.pdf"

//apply the samples/stamp.pdf as a background
//to the extracted report
def execOptions = "\"$extractFilePath\" background samples/stamp.pdf output \"$stampedFilePath\""

/*
*
* 1. Please download and install pdftk from this location
* http://www.pdflabs.com/docs/install-pdftk/
*
* 2. Make sure to download the binaries which are
* specific to the target operating system.
*
* 3. Move the pdftk.exe and libiconv2.dll files into the folder
* where DocumentBurster was installed, next
* to the DocumentBurster.exe file.
*
*/

def ant = new AntBuilder()

log.info("Executing pdftk.exe $execOptions")

//http://groovy.codehaus.org/Executing+External+Processes+From+Groovy
ant.exec(outputproperty:"cmdOut",
errorproperty: "cmdErr",
resultproperty:"cmdExit",
failonerror: "false",
executable: 'pdftk.exe') {
arg(line:"$execOptions")
}

ant.move(file:"$stampedFilePath", tofile:"$extractFilePath")
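
The script above hard-codes pdftk.exe, which is the Windows binary. A minimal sketch for choosing the executable name per operating system follows. The helper name pdftkExecutable is hypothetical, and it assumes that on non-Windows systems pdftk is installed and reachable through the PATH.

```groovy
// Hypothetical helper: picks the pdftk executable name for the given OS name
// (as reported by the "os.name" system property).
String pdftkExecutable(String osName) {
    return osName.toLowerCase().contains("windows") ? "pdftk.exe" : "pdftk"
}

println pdftkExecutable(System.getProperty("os.name"))
```

The returned value could then be passed as the executable argument of ant.exec instead of the fixed 'pdftk.exe' string.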

Further reading
• Groovy documentation [http://groovy.codehaus.org/] - general Groovy docs which will help in writing
better DocumentBurster scripts.

• Ant documentation [http://ant.apache.org/manual/tasksoverview.html] - in case there is a need to copy,
mkdir, move or delete files and folders. Ant can also be used for sending emails from within scripts or to
FTP and SCP files using SSH.

• AntBuilder documentation [http://groovy.codehaus.org/Using+Ant+from+Groovy] - using Ant from
Groovy.

• Commons VFS documentation [http://commons.apache.org/vfs/filesystems.html] - WebDAV scripting,
in case there is a need to upload reports to Microsoft SharePoint or to another portal product.
Commons VFS can also be scripted to copy reports to a network shared drive or to upload the reports to
FTP and SFTP servers.

Chapter 8. Command Line
DocumentBurster can be executed in command line mode so that it can be integrated with, and automated
from, existing legacy software systems. All the features of the program are available through the command line.

Note
Before running DocumentBurster from the command line, the software should be properly configured.

Bursting reports
Windows

Following is the syntax for running the program:

documentburster.bat -f <filePathOfTheFileToBurst>

For example the command:

documentburster.bat -f samples\burst.pdf

will burst the burst.pdf file located in the samples folder.

Unix/Linux/MacOSX

Following is the syntax for running the shell script:

./documentburster.sh -f <filePathOfTheFileToBurst>

For example the command:

./documentburster.sh -f samples/burst.pdf

will burst the burst.pdf file located in the samples folder.
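
When the command line is called from another program rather than typed by hand, the same syntax can be assembled in code. The sketch below does this from Groovy with the standard ProcessBuilder class. The helper names burstCommand and runBurst are hypothetical, and whether the scripts return a meaningful exit code is not stated in this guide, so the returned value should be treated with care.

```groovy
// Hypothetical helper: builds the command line for bursting a single report.
List<String> burstCommand(String reportPath, boolean windows) {
    if (windows) {
        return ["cmd", "/c", "documentburster.bat", "-f", reportPath]
    }
    return ["./documentburster.sh", "-f", reportPath]
}

// Hypothetical helper: runs the command from the DocumentBurster installation
// folder, echoes the console output and returns the exit code of the process.
int runBurst(File installDir, String reportPath) {
    boolean windows = System.getProperty("os.name").toLowerCase().contains("windows")
    def builder = new ProcessBuilder(burstCommand(reportPath, windows))
    builder.directory(installDir)
    builder.redirectErrorStream(true)
    def process = builder.start()
    process.inputStream.eachLine { println it }
    return process.waitFor()
}

// Only prints the command here; call runBurst(new File("<installFolder>"), ...)
// to actually execute it.
println burstCommand("samples/burst.pdf", false).join(" ")
```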

Merging reports
Windows

Following is the syntax for running the program:

documentburster.bat -m <"filePathOfTheFileToMerge1#...#filePathOfTheFileToMergeN"> [-o <mergedFileName>] [-b]

• -m <"filePathOfTheFileToMerge1#...#filePathOfTheFileToMergeN"> - Mandatory argument. A #-separated
list of the PDF reports to merge.

• -o <mergedFileName> - Optional argument. The name of the output merged file. If it is not specified,
the file name merged.pdf is used by default.

• -b - Optional switch which specifies that the resulting merged file should also be burst.

For example the command:

documentburster.bat -m "samples\merge1.pdf#samples\merge2.pdf" -o testForMerge.pdf -b

will first concatenate the two files merge1.pdf and merge2.pdf into a file called
testForMerge.pdf (-o) and will then burst the resulting file (-b).

Unix/Linux/MacOSX

Following is the syntax for running the program:

./documentburster.sh -m <"filePathOfTheFileToMerge1#...#filePathOfTheFileToMergeN"> [-o <mergedFileName>] [-b]
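
The merge arguments can also be built programmatically, for instance when the list of files comes from another system. The following is a sketch only: the helper name mergeArguments is hypothetical, and rejecting paths that contain the # character is an assumption that follows from # being the separator.

```groovy
// Hypothetical helper: builds the argument list for a merge run.
List<String> mergeArguments(List<String> files, String mergedFileName = null, boolean burst = false) {
    if (!files) {
        throw new IllegalArgumentException("at least one file to merge is required")
    }
    if (files.any { it.contains("#") }) {
        throw new IllegalArgumentException("file paths must not contain the '#' separator")
    }
    def args = ["-m", files.join("#")]
    if (mergedFileName) {
        args << "-o" << mergedFileName
    }
    if (burst) {
        args << "-b"
    }
    return args
}

println mergeArguments(["samples/merge1.pdf", "samples/merge2.pdf"], "testForMerge.pdf", true).join(" ")
```

When the list is typed into a shell instead, the -m value still needs the surrounding double quotes shown in the syntax above.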

Polling for incoming reports
Windows

Following is the syntax for running the program:

documentburster.bat -p <pathOfTheFolderToPoll>

For example the command:

documentburster.bat -p poll

will start polling the folder poll for incoming reports to process.

Unix/Linux/MacOSX

Following is the syntax for running the shell script:

./documentburster.sh -p <pathOfTheFolderToPoll>

For example the command:

./documentburster.sh -p poll

will start polling the folder poll for incoming reports to process.

Chapter 9. Auditing & Tracing
Logging
When dealing with reports and financial documents it is important to have a good mechanism for auditing
and tracing the possible problems.

DocumentBurster has support for logging all of the activities and for tracing back the reports which failed
to be distributed.

In the menu please go to Actions -> Merge, Burst and Trace… -> Logging, Tracing...

• View Current Log File - Open the current active log file.

• View All Log Files - Browse all the log files.

• View Quarantined Files - Browse all the quarantined files.

• Current Running Jobs - List with the details of jobs which are currently executing.

• Queue status and quarantine status - At the bottom of the screen there is a status bar about the running
jobs and about the quarantined files. Green color means that there are no failing reports while red color
means that at least one report was quarantined.

By default the program logs all the exceptions and only some informational events, which keeps the
log files easy to read. If required, DocumentBurster can be configured to generate much more detailed
log files. To do this, please edit the log4j.properties file and change the following line:

log4j.category.com.smartwish.documentburster=info, stdout,R

so that the level is debug instead of info:

log4j.category.com.smartwish.documentburster=debug, stdout,R

This change will generate more detailed logs which can be used for tracing possible problems.

DocumentBurster can automatically send an email whenever a problem occurs.

To configure this, please edit log4j.properties and provide the email account details in the
following places:

log4j.appender.EMAIL=org.apache.log4j.net.SMTPAppender
log4j.appender.EMAIL.SMTPHost= smtp host
log4j.appender.EMAIL.SMTPUsername= user name
log4j.appender.EMAIL.SMTPPassword= user password
log4j.appender.EMAIL.From= from email address
log4j.appender.EMAIL.To= destination email address
log4j.appender.EMAIL.subject= email subject

DocumentBurster can quarantine the reports which failed to be delivered.

It is crucial for the software to distribute all the documents to the correct destinations. However,
the distribution might sometimes fail, for example because the email server connection details are not
correct, because the server itself is down or because the SSL settings are not accurate. Whatever the
reason for the failure, the program can be configured to quarantine the failed documents to a folder
with the same name. This way the failed documents can be traced at a later point in time in order to
take a decision (either to distribute them again or to do something else).
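
For a quick check outside the GUI, the quarantined reports could also be listed with a few lines of Groovy. This is a sketch under assumptions: the helper name listQuarantinedReports is hypothetical, the folder name "quarantine" is assumed, and only PDF files placed directly in that folder are considered.

```groovy
// Hypothetical helper: returns the names of the PDF files found directly
// inside the given folder, sorted alphabetically. A missing folder yields
// an empty list.
List<String> listQuarantinedReports(File quarantineDir) {
    def files = quarantineDir.listFiles() ?: []
    def pdfs = files.findAll { it.isFile() && it.name.toLowerCase().endsWith(".pdf") }
    def names = pdfs.collect { it.name }
    return names.sort()
}

// "quarantine" is an assumed folder name; adjust it to the configured location.
listQuarantinedReports(new File("quarantine")).each { println "quarantined: " + it }
```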

Note
For tracing and auditing purposes, DocumentBurster has the log files and keeps all the generated
reports in a configurable folder. This might be enough at the beginning or when distributing
a few reports a month on an ad hoc basis. On the other hand, if a distribution server is deployed
to be used by multiple employees, a more capable report archiving and indexing solution should
be implemented. DocumentBurster Server is capable of storing all the distributed reports in a
central database or, in more advanced scenarios, can be integrated with a specialized content
management platform for storing and archiving a large set of reports. By integrating
DocumentBurster Server with an advanced search platform it is possible to get powerful full-text
search, hit highlighting, ranked searching (best results returned first) and other indexing and
search capabilities on top of a big archive of reports.

Chapter 10. Trouble Shooting
Restriction on maximum 25 burst reports
The free version of DocumentBurster has a restriction of maximum 25 burst output reports being generated
and/or being distributed. For bursting and generating an unlimited number of reports please purchase
the commercial version of DocumentBurster [http://www.pdfburst.com/purchase.html] .

Please feel free to send an email to support@pdfburst.com with any additional DocumentBurster related
question which you might have.

Issues running basic features
If you don’t know where to start or you have problems running the basic features of the program,
please read the DocumentBurster in 2 Minutes - reports-bursting-quickstart.pdf tutorial guide.

Old Java - UnsupportedClassVersionError exception
If the console or the log file shows an exception similar to:

exception in thread "main" java.lang.UnsupportedClassVersionError:test (unsupported major.minor version 49.0)
at java.lang.ClassLoader.defineClass0(Native Method)
at java.lang.Clas

This happens when the program runs with an old Java version (<1.5). Please read the DocumentBurster
in 2 Minutes - reports-bursting-quickstart.pdf document and double check the version of Java under
which DocumentBurster is running. It is advisable to run DocumentBurster on the newer Java 1.6.

Sometimes the exception still appears after the latest Java is installed. This happens because
the old Java is still installed and active on the computer. The solution is to edit the last line of
documentburster.bat (or of the shell script under Linux/Unix), which looks like this:

java -Djava.endorsed.dirs=./lib/endorsed -cp ./lib/burst/ant-launcher.jar org.apache.tools.ant.launch.Launcher -buildfile ./config/burst/documentburster.xml -Darg1=%1 -Darg2=%2 -Darg3=%3 -Darg4=%4 -Darg5=%5 -Darg6=%6

and to replace java at the start of the line with the full path of the newer Java executable:

"C:\Program Files\Java\jre6\bin\java.exe" -Djava.endorsed.dirs=./lib/endorsed -cp ./lib/burst/ant-launcher.jar org.apache.tools.ant.launch.Launcher -buildfile ./config/burst/documentburster.xml -Darg1=%1 -Darg2=%2 -Darg3=%3 -Darg4=%4 -Darg5=%5 -Darg6=%6

The proper path to the location where the latest Java was installed should be provided. This change will
force DocumentBurster to run with the latest Java.
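
To interpret the number in the error message: the major version of a class file maps to the Java release that introduced it (45 is Java 1.1, 48 is Java 1.4, 49 is Java 5, 50 is Java 6). The sketch below turns such a message into the minimum Java release required. The helper names are hypothetical and the message format is assumed to match the example above.

```groovy
// Maps the major version of a class file to the Java release that introduced
// it (45 = 1.1, 46 = 1.2, 47 = 1.3, 48 = 1.4, 49 = 5, 50 = 6, ...).
int javaReleaseForClassVersion(int major) {
    if (major < 45) {
        throw new IllegalArgumentException("unknown class file version: " + major)
    }
    return major - 44
}

// Hypothetical helper: extracts the major number from a message such as
// "unsupported major.minor version 49.0" and names the minimum Java release.
String minimumJavaFor(String errorMessage) {
    def matcher = errorMessage =~ /version (\d+)\.\d+/
    if (!matcher.find()) {
        return "unknown"
    }
    int release = javaReleaseForClassVersion(Integer.parseInt(matcher.group(1)))
    return release < 5 ? "Java 1." + release : "Java " + release
}

println minimumJavaFor("unsupported major.minor version 49.0")
```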

Windows - DocumentBurster.exe GUI is failing
Was the gtk-sharp prerequisite installed before running DocumentBurster.exe? Please read the
DocumentBurster in 2 Minutes - reports-bursting-quickstart.pdf tutorial guide to check all the required
prerequisites.

Windows - DocumentBurster.exe GUI still fails
Was the gtk-sharp prerequisite installed using the default values presented by the wizard, as per the
DocumentBurster in 2 Minutes - reports-bursting-quickstart.pdf guidelines? The GTK runtime should be
properly exported and visible through the Windows %Path% variable. If required, please move the
GTK runtime location nearer the front of the %Path% variable.

Windows - DocumentBurster.exe GUI still fails
Is the software executed from a shared drive? It should not be. The DocumentBurster.exe GUI can run
only when it is extracted and executed on the local machine.

Bursting issues 1
If you have bursting problems please check that you have configured the burst tokens in-between brackets.
For example {doc1} is a valid token.

Bursting issues 2
If you configured all the tokens properly but the program is still not working as you expect, then
you can enable detailed logging. By looking at the detailed log file you can understand what is going
wrong. To enable detailed logging please read the previous Chapter 9. Auditing & Tracing.

Variable values are not parsed properly
Sometimes the values of variables defined like <0>some value</0> and up to <9>some other value</9>
might not be parsed properly. Following is an example of the issue as it appears with MS Access, while
similar behavior might also be observed with other report writers.

The issue - Example of the issue coming with MS Access

I am using various MS Access reports to grab variable data using <0> text </0> . If I use a label for the
text and key it into the text box as <0> report id 100 </0> it works fine but if I drop a field onto the report
and then put the <0> and </0> in front and back of the field, it does not work.

The solution - And here is the solution for the previous MS Access behavior

When you drop the fields into an MS Access report you need to define any field you use as a variable as a
single field by concatenation. For example, let's say I have a field named "date" and place it on the report
with a text box of <0> in front and then place a text box of </0> at the end. This will not work. You need
to create one field (object) as follows: =" <0> "text" </0> ". Now it will work.

More details

If the start and end tags (e.g. <0> and </0> ) are statically defined, while the content inside is a
dynamic variable-length field or report formula, then when the report is generated the
dynamic content will grow and will start to overlap with the static tags (e.g. <0> and </0> ). This might
cause problems when DocumentBurster is parsing the variable values. Please see the following screenshot
in which the "Tuesday" hidden text was generated by a date field/formula which expanded its length and
started to overlap the start <1> tag. In this case the text which is extracted by DocumentBurster is a messy
Tues<da1>y and as a result the variable value is not properly parsed. The solution to this problem was
described above.
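
The effect of the overlap can be reproduced with a small pattern match. The sketch below is illustrative only: DocumentBurster's real parser is not shown in this guide, the helper name extractVariables is hypothetical, and the mapping of the <0> tag to the name var0 is an assumption.

```groovy
// Illustrative only: extracts <N>value</N> pairs (N = 0..9) from the text of
// a report and stores them under the assumed names var0..var9.
Map<String, String> extractVariables(String text) {
    def result = [:]
    def matcher = text =~ /<(\d)>(.*?)<\/\1>/
    while (matcher.find()) {
        result["var" + matcher.group(1)] = matcher.group(2).trim()
    }
    return result
}

// A cleanly rendered tag pair is found ...
println extractVariables("<0> report id 100 </0>")
// ... while the overlapped text from the example above yields no variable.
println extractVariables("Tues<da1>y</1>")
```

In the second call the start tag has been broken up by the overlapping text, so no complete tag pair is left for the pattern to match.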

Email is failing
If you have problems getting the email working, please:

• Double check the email server account connection details.

• If your organization is using MS Exchange as the email server and Outlook as the email client,
you will need to provide the same account and server details which are already configured in your
Outlook email client.

• If your organization has any firewall or antivirus software in between DocumentBurster and the email
server, it needs to be configured to allow DocumentBurster to send emails as a trusted application.

• A network/IT administrator can assist if you have further difficulties in configuring the settings.

Email still fails
Check the logs on the email server side. If your organization is using MS Exchange as an email server,
please check the Exchange logs and see if the server was reached and what problem was encountered.

Email is still failing
DocumentBurster can be configured to log more details about the SMTP communication. Please edit
the file config/burst/settings.xml and change <emailserver><debug>false</debug></emailserver>
to <emailserver><debug>true</debug></emailserver>. Running the program again should give more
details related to the email communication in the DocumentBurster log file.

FTP issues
If you have problems in getting the FTP working, please make sure that you are giving the proper FTP url
details as described in the Chapter 3. Distributing reports.

Windows - DocumentBurster Server Console is failing to start
When the server/startServer.bat script is executed, the cmd box flashes up and then disappears.

Solution

• Are all the prerequisites in place? Please read the DocumentBurster in 2 Minutes -
reports-bursting-quickstart.pdf tutorial guide to check all the required prerequisites.

• The system should run on Java >=1.6 (the java -version DOS command should return >=1.6)

• Please try to start the server again. Was the server shut down properly after previous runs by using
the server/shutServer.bat script? In future, please stop the server properly by using the script
server/shutServer.bat .

Windows - DocumentBurster Server is not properly processing the submitted burst jobs
Jobs submitted through the desktop GUI are working properly.

Solution

• Are all the prerequisites in place? Please read the DocumentBurster in 2 Minutes -
reports-bursting-quickstart.pdf tutorial guide to check all the required prerequisites.

• The system should run on Java >=1.6 (the java -version DOS command should return >=1.6)

Windows - DocumentBurster Web Console is failing to start
The link http://localhost:8080/burst is not working on the local machine. When the
web-console/startConsole.bat script is executed, the cmd box flashes up and then disappears.

Solution

• Are all the prerequisites in place? Please read the DocumentBurster in 2 Minutes -
reports-bursting-quickstart.pdf tutorial guide to check all the required prerequisites.

• The system should run on Java >=1.6 (the java -version DOS command should return >=1.6)

• Was the DocumentBurster Server console started beforehand using the server/startServer.bat
script? The DocumentBurster Server console should be started before the web console is started.

• At least one of the %JAVA_HOME% or %JRE_HOME% variables should be properly defined on the system
(either the echo %JAVA_HOME% DOS command should return a proper JDK 1.6 installation path or the echo
%JRE_HOME% DOS command should return a proper JRE 1.6 installation path)

Forum community help
If you don’t find an answer to your questions in the documentation, there is also the DocumentBurster
Forum Community Help, available here [http://sourceforge.net/projects/documentburster/forums/forum/769838].
Please feel free to ask your questions on the forum.

Professional support
Purchasing DocumentBurster automatically gives access to quick support. If you are looking for
professional DocumentBurster support, please send an email describing your problem to support@pdfburst.com
