
Cyrion Technologies PCL Corner

Capturing printer data streams
Version 1.3 (22.05.2007)

Dipl. - Ing. Joachim Deußen

Overview
Windows
UNIX
Macintosh
Port redirection
Network capturing
Network tools

Copyright © 2005 by Dipl. - Ing. Joachim E. Deußen. All rights reserved.


All trademarks belong to their respective owners.

Joachim E. Deußen Page 1 22.05.2007



Overview
When you support printers or create a project involving printers, it becomes
necessary to know exactly which data is sent to a printer.

This data normally is called a printer data stream.

Sometimes people refer to this data as the print job, but as we will see later,
that is not always what we want. We are only interested in the data that
arrives at the printer's inputs, such as USB, parallel or network ports.

As discussed in another article, only this data stream allows a deep
analysis of any printing problem beyond bare connectivity. So we must
differentiate between two problems:

1. No data arrives at the printer's input ports → connectivity check
2. Wrong data arrives at the printer's input ports → data stream analysis

This article discusses the various possibilities to get such a printer data stream
for analysis. The analysis itself is discussed elsewhere.

[Figure: the printer data stream on its way from the source to the drain.
Source side: user, document, application and printer driver. In between: port
connectivity. Drain side: the printer.]

First let us set a basic workflow that is more or less used on all operating
systems:
1. Document
2. Application
3. Printer driver or similar operating system instance
4. Connectivity via a virtual port (USB, parallel, FireWire, Network etc.)
5. Printer


Or in words: A document is sent from an application to an operating system
instance that creates a printer data stream. This instance is normally
a printer driver. The data stream is then sent over some
kind of interconnection to the printer, where the actual output (the pages)
is produced.

Since we are interested in the data stream we have to look at (3) for that. This
is the point where the actual data stream comes into existence. It is
transferred using some inter-connectivity method to the printer, where it is
converted into printed pages.

So to get the data stream we have to intercept it between (3) and (4) or
between (4) and (5). Let us call the first point the source and the second
point the drain.

Capturing directly at the source is the most convenient way to get those
printer data streams. Often features of the generating application or printer
driver can be used to re-direct the data stream into a file that then can be
transported for analysis.


Windows

Print to file
Almost everybody knows that Microsoft Windows can print to a
file from certain applications. You will most often find such an option in the
print dialog of the application, for example in Microsoft Word.


When you select that option, the printer data stream is produced and
redirected to a file in the file system, regardless of the original port
the printer driver is connected to.

Print to FILE:
If you have an older or proprietary application that has no Print-to-file option
in its print dialog there is always another alternative to print to a file:

The virtual printer port called FILE:

[Figure: user/document, application, printer driver and port; the printer data
stream is taken at the port.]

If you look in the list of available ports (found on the <Ports> tab of the
printer driver's properties page), you will find in all versions of Windows a
port called FILE:

Upon selecting this port for a printer driver, you can print from any application
to this printer and it will ask you for a destination where the printer data
stream will be saved to.

If you redirect the printer data stream to the FILE: port, you must ensure that you
write printer data and not Windows data to the file. Windows print data, also
called EMF (enhanced metafile), is the default if you print to a network
printer that is attached not directly to your computer but to a printer server.

Please note that the EMF passing the printing sub-system is not the standard
EMF as defined in public documents by Microsoft:


You can probably locate the information about the EMF structs in the spool file
format. But note that one major caveat with this approach is that since the
spool file format is MS proprietary, it might change in future releases without
notice. So if your driver is dependent on a particular spool file format, it could
break on future releases. This is the reason for us recommending that you not
rely on the spool file format. //Ashwin, Microsoft

So if you grab EMF by accident (or on purpose), the chance of getting a
usable file is very, very low!

On Windows XP and up, you will always get RAW printer data if you print to
FILE:, regardless of whether the print processor is set to RAW or EMF.

So the following procedure (even though documented from within Windows XP)
applies only to Windows NT 4 and Windows 2000 based computers.

So check first if the print processor WINPRINT is set to RAW and not to any of
the EMF data types!
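
To double-check that a captured file contains printer data and not EMF, it can
help to look at its first bytes. The following Python sketch is only a
heuristic and not part of Windows: the function name guess_stream_type is
hypothetical, and the signatures it checks are common conventions (a PJL job
usually starts with the UEL sequence ESC%-12345X, PostScript with %!, PCL with
an ESC character, and a standard EMF file carries the signature " EMF" at byte
offset 40). Because the EMF passing through the spooler is not the standard
EMF, a result of "unknown" does not prove that the file is usable.

```python
import sys

ESC = b"\x1b"


def guess_stream_type(data: bytes) -> str:
    """Return a rough label for the first bytes of a captured print file."""
    head = data[:64]
    if head.startswith(ESC + b"%-12345X"):
        return "pjl"          # Universal Exit Language, usually followed by @PJL
    if head.startswith(b"%PDF"):
        return "pdf"
    if head.startswith(b"%!"):
        return "postscript"
    if head.startswith(ESC):
        return "pcl"          # any other escape sequence, e.g. ESC E (reset)
    # A standard EMF header has the ASCII signature " EMF" at offset 40.
    if len(head) >= 44 and head[40:44] == b" EMF":
        return "emf"
    return "unknown"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python sniff.py capture.prn
    with open(sys.argv[1], "rb") as handle:
        print(guess_stream_type(handle.read(64)))
```

If the result is "emf", the print processor or the connection type should be
checked again before the file is passed on for analysis.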


Point-and-Print redirection:
If you print in a corporate environment then Windows workstations normally
use a Windows printer server to spool their printer data streams. This method is
also called point-and-print.

To return control to the calling application faster, the Windows workstation
does not generate the target printer language but an intermediate language
called enhanced metafile (EMF); see the above notice on the true nature of
this file format.


This EMF file then is spooled to the printer server, where the actual conversion
from windows internal representation (EMF) to the target printer language
(RAW) is performed.

If we now want to capture a data stream from a Windows workstation, we can
use the same technique as described in the previous section and redirect the
data stream to a file using the virtual printer port FILE: on the Windows printer
server.

On true point-and-print connections (also called RPC connections,
established between Windows NT-based computers) the pop-up dialog for
the filename is shown on the client computer and not on the server. The
file is also saved in RAW format on the client and not on the server!

If you use SMB connections (from Windows 9x based computers or by using
the “Local Port” for connecting to the shared printer) the dialog box is shown
on the server and the file is also saved on the server.

Print Services for UNIX:


Most people do not know that Windows NT based operating
systems offer an LPD service. In Windows NT 4.0 this is called “TCP/IP Print
Service”, while newer versions (Windows 2000, Windows XP and Windows
2003) call it “Print Services for UNIX”.

The Windows print services are the first method discussed in this paper that
captures the printer data stream at the drain.


You can install the LPD-Service when you select the following:
• START
• Settings
• Control Panel
• Add or Remove Programs
• Add/Remove Windows Components

Navigate to the option “Other Network File and Print Services” and then select
[Details].

Now check “Print Services for UNIX”. Confirm the selection with [OK] and then
choose [Next >] to start the installation of the additional components.


After completion, the LPD service is installed but not started. To enable the
service, you must open the Computer Management Console (to do this,
choose “Manage” from the pop-up menu of “My Computer” and then select
“Services”).

Either start the service manually or select “Automatic” for the “Startup type”.

Now, how do you use the TCP/IP Print Server? By default the LPD server listens
on all local IP interfaces on port 515. This means that if you have more than one
network card, or more than one IP address assigned to a network card, the
LPD service is accessible from all these interfaces.
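
A quick way to verify from another machine that the LPD service is reachable
is to try a TCP connection to port 515. This is a minimal Python sketch; the
function name lpd_port_open is hypothetical, and a successful connection only
shows that something is listening on that port, not that it is the Windows
LPD service.

```python
import socket


def lpd_port_open(host: str, port: int = 515, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling lpd_port_open once per IP address of the server shows whether the
service really answers on every interface.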


If you have ever configured an LPR port, you know that you need an IP
address and a printer (or queue) name for this. Windows NT4 uses the names
of shared printers for this.

Imagine you have installed a printer “HP LaserJet 4100” in your printer-and-fax
folder.

So if you have shared a printer to the Windows network under the name
“HPLJ4100” and then enable the TCP/IP Print Service, you can also print
to that printer from any LPR port using the IP address of the sharing computer
and the printer's share name as the queue name.

On Windows 2000/XP/2003 and up there are some extensions to this
behaviour:
• You can alternatively use the printer name “HP LaserJet 4100” as a
queue-name (take care that your LPR can handle spaces in queue-
names or rename the printer accordingly).
• The printer does not need to be shared anymore to be accessed.
• If a printer is shared anyway, you can use both share-name and
printer-name.

Since the LPD service conforms to RFC 1179, you can print to that service from
any operating system that has an LPR port:

• from other Windows workstations or servers,
• from Mac OSX,
• from any UNIX,
• from an AS/400 or
• from an application like SAP R/3 or anything else.


Now recalling our knowledge from chapter (4) we know that the shared
printer itself can be re-directed to print to a file by using the virtual printer port
FILE: on the Windows NT-based server or workstation that is offering the TCP/IP
print service a.k.a. LPD service.


Thus we can use this method, for instance, to capture print jobs from any
operating system or server that has no print-to-file option, like AS/400 or SAP
R/3.

Since this is not considered a point-and-print connection, the save dialog box
will pop up on the server.

A rather clever method works as follows. Suppose you must capture the data stream
sent from an SAP R/3 application to an old printer with the IP address “11.22.33.44”,
an LPD service running on port 515 and a queue name of “RAW”.

Now do this:
• Disconnect the original printer from the network.
• Configure a laptop with the IP-Address of the old printer.
• Install a PCL printer like a HP LaserJet 4100 or similar.
• Use the FILE: port for this printer.
• Share that printer with the name “RAW” to the network.
• Enable or start the TCP/IP print server on the laptop.
• Connect the laptop to the network.
• Print from SAP R/3 to the <printer>.

Now you will be asked to enter a file name on the laptop, because the print job
did not go to the old printer but was redirected to your laptop and then to your
hard disk for later analysis.

MiniLPD:
There is another LPD Service that can be used on windows operating systems:
MiniLPD.

It can be found on a CANON web page (I don’t know if it is an official or
unofficial site; it seems to be run by some CANON engineers):
http://www.digitalissues.co.uk


Download MiniLPD.zip, create a new folder and unzip the program into that
directory. You can do this for instance by using the “Extract to folder …”
option of the WinZip context menu.

When you start MiniLPD you may receive an error message: Winsock Error
10078. This is because your “TCP/IP Print Server” (as discussed in the previous
chapter) is active and blocks access to port 515 which is needed for MiniLPD
to run. So stop the “TCP/IP Print Service” first!

Now you can start MiniLPD. It will accept ANY queue name on port 515 of
EVERY interface of the computer.

You can test MiniLPD by sending a test job using Windows' own LPR.exe
command from within a command shell. If you have a file called
“testpage.prn” then use:

C:\> LPR -S <IP address> -P ANY testpage.prn

Replace <IP address> with one of your computer's IP addresses. As
stated before, the queue name is not important; MiniLPD reacts to any name.
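
If no LPR client is at hand, a test job can also be sent with a few lines of
Python. The sketch below follows the job submission sequence of RFC 1179 as I
understand it (receive-job command, control file, data file, each acknowledged
by the server with a zero byte). The function names, the default host and user
names and the job number are illustrative assumptions. RFC 1179 also asks for
a source port in the range 721 to 731; this sketch ignores that, so a strict
LPD server may refuse the connection.

```python
import socket


def build_lpr_job(queue, data, host="capture-test", user="tester",
                  job=1, name="testpage.prn"):
    """Return the five messages that submit one job to an LPD queue."""
    df = f"dfA{job:03d}{host}"          # data file name
    cf = f"cfA{job:03d}{host}"          # control file name
    # Control file: host, user, "print file as-is", unlink, source name.
    control = f"H{host}\nP{user}\nl{df}\nU{df}\nN{name}\n".encode("ascii")
    return [
        b"\x02" + queue.encode("ascii") + b"\n",               # receive a job
        b"\x02" + f"{len(control)} {cf}\n".encode("ascii"),    # announce control file
        control + b"\x00",
        b"\x03" + f"{len(data)} {df}\n".encode("ascii"),       # announce data file
        data + b"\x00",
    ]


def send_lpr_job(server, queue, data, port=515, timeout=10.0):
    """Send data to an LPD server and check every acknowledgement."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        for message in build_lpr_job(queue, data):
            sock.sendall(message)
            ack = sock.recv(1)
            if ack != b"\x00":
                raise RuntimeError(f"LPD server rejected the job (reply {ack!r})")
```

A call such as send_lpr_job("<IP address>", "ANY", open("testpage.prn", "rb").read())
would then correspond to the LPR command shown above.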


Now look in the folder where MiniLPD.exe is located: You will find two new
files.

The BIN-file is the actual print job and the CTL-file is the file with the LPD-
specific control commands. The BIN-file is what we want so just delete the
CTL-file.
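
Before deleting the CTL-file it can be worth a look, because it tells you
which host and user sent the job. The following Python sketch assumes that the
CTL-file contains the plain control file lines defined in RFC 1179 (one
command letter followed by its operand per line); whether MiniLPD stores them
exactly this way is an assumption. The function name parse_lpd_control and the
dictionary keys are hypothetical.

```python
def parse_lpd_control(text: str) -> dict:
    """Collect the most useful fields of an RFC 1179 control file."""
    names = {
        "H": "host",         # host that submitted the job
        "P": "user",         # user identification
        "J": "job_name",     # job name for the banner page
        "N": "source_name",  # name of the source file
    }
    info = {"data_files": []}
    for line in text.splitlines():
        if not line:
            continue
        code, operand = line[0], line[1:]
        if code in names:
            info[names[code]] = operand
        elif code.islower():
            # Lower-case commands (f, l, o, p, ...) name a data file to print.
            info["data_files"].append(operand)
    return info
```

With a control file such as "Hhost1", "Palice", "ldfA001host1" the result
contains the host, the user and the name of the data file.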

MiniRAW:
There is also a TCP raw capture tool accompanying MiniLPD, called MiniRAW.
It runs on Windows only and listens on TCP port 9100.

It can be found on a CANON web page (I don’t know if it is an official or
unofficial site; it seems to be run by some CANON engineers):
http://www.digitalissues.co.uk

Download MiniRAW.zip, create a new folder and unzip the program into that
directory. You can do this for instance by using the “Extract to folder …”
option of the WinZip context menu.

Now you can start MiniRAW and listen on TCP port 9100. There is no way
to change that port, but it is by far the most common RAW printing port, so there
should be no need to do so.

You can test MiniRAW by creating a Standard TCP/IP Port, assigning
it to a printer driver and then sending a test page to it:


Now look in the folder where MiniRAW.exe is located: You will find a new file.

The BIN-file is the actual print job. This file is what we need. The naming
scheme is YYMMDDHHMMSS.bin.
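
The principle behind such a raw capture tool is simple enough to sketch in
Python: listen on TCP port 9100, read until the sender closes the connection
and write everything to a file. This is not the MiniRAW implementation; all
function names are hypothetical, only one job is captured per call, and the
file name merely imitates the YYMMDDHHMMSS.bin scheme described above.

```python
import socket
import time


def capture_filename(now=None):
    """Build a YYMMDDHHMMSS.bin file name from a time.struct_time."""
    return time.strftime("%y%m%d%H%M%S", now or time.localtime()) + ".bin"


def drain(conn, chunk_size=65536):
    """Read from a connected socket until the peer closes it."""
    chunks = []
    while True:
        chunk = conn.recv(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def capture_one_job(port=9100):
    """Accept one connection on the given port and save its data to a file."""
    with socket.create_server(("", port)) as server:
        conn, _peer = server.accept()
        with conn:
            data = drain(conn)
    name = capture_filename()
    with open(name, "wb") as out:
        out.write(data)
    return name
```

Note that such a listener never answers the sender, so drivers that expect
status information back from the printer may behave differently than with the
real device.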

cti:Downloader 2005:
If you need to capture raw IP transmissions such as port 9100 printing or LPD
communication on port 515, you can use the Downloader 2005 by CTi for this
purpose. This program features a raw port capture function for any port and
an LPD service.

First select a destination where the files should go and – if you like – select a
local network interface to listen to. By default all incoming communication
will be monitored.

Then select the capture method LPD or RAW and if necessary the port.


Now start the port monitoring and capturing with [Execute].

For every incoming data stream it will create a separate file. The naming
convention is according to the date and time set by your operating system.


If you capture LPD data, you can have one file that includes data and
control-commands, or two separate files (like with MiniLPD).

You must stop the capture by pressing the [Stop] button. The program cannot
decide on its own when to stop the capture and return to normal function.


UNIX
UNIX and all its relatives are ancient operating systems. A common printer
driver system does not exist, so the approach to printing is very basic.
Normally the application produces some kind of printer data
stream and then sends it to the printer using the LPR, TELNET or FTP protocol.

There are many locations where these files can be captured, but since there
is no real printer driver system, it is currently impossible to show how this can
be accomplished in general.


MacOSX
On the Macintosh, PostScript is the only supported printer
language, so you need a PostScript-enabled printer to actually print
something. The operating system itself does not support any other printing language
such as PCL 5 or PCL 6. The Mac OSX standard print dialog offers by default
the possibility to produce a PDF or PostScript file.


Port redirection
Port redirection is a method that is also known from NAT (network address
translation) gateways: one computer is put in the middle of the
communication and forwards all data received on one port to
another port and vice versa.

In NAT gateways it allows for many computers to use one outgoing (internet)
line. By using Port redirectors in a printing environment we can capture the
data stream from the source (the computer system) to the drain (the printer).

Port redirection is a drain capture method, since it requires reconfiguring
the source computer system or replacing the drain printer system.
In contrast to the methods introduced above, however, it can capture all
communication protocols, not just RAW-IP and LPR/LPD data streams.

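[Figure: a host sends a RAW data stream to port 5001; a laptop with a port
redirector listens on port 5001 and forwards the stream to port 5001 of the
printer.]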

In the above illustration you see a host printing an IPDS data stream to an
IPDS-enabled printer on Port 5001. To capture this data stream you can insert
a port redirector on a laptop that is capturing the data stream, saving it to
hard disk and forwarding all communication directly to the printer.

This also forwards all bi-directional TCP communication, so you get
not only the data stream but also the matching printout.

Please note that currently there are no UDP port forwarders available!

RelayTCP:
DLCsistemas (http://dlcsistemas.com) has a freeware application called
RelayTCP. This can act as a port redirector.

The command line version can be used to capture one TCP communication:

C:\> relaytcp <listenport> <remoteip> <remoteport> -d

For <listenport> insert the port on which you want to capture the data. <remoteip> and
<remoteport> are the printer's IP address and port. The -d option instructs
RelayTCP to save the data streams to a file in the same directory.

Example:
          Original      Capture
Host      10.0.0.15     10.0.0.15
Printer   10.0.0.200    10.0.0.199
Laptop                  10.0.0.200

1. Change the IP address of the printer to a free IP address → 10.0.0.199
2. Give the original IP address of the printer to the laptop → 10.0.0.200
3. Open a command prompt on the laptop and issue

Relaytcp 5001 10.0.0.199 5001 -2

4. Now print something to the printer. The data is sent to the laptop,
captured to a file and forwarded to the real printer. All bi-directional
communication is also forwarded and the host reacts normally.
5. Now remove the laptop and change everything back to the original state.
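
To illustrate what a port redirector does internally, here is a minimal Python
sketch of the same idea. It is not the RelayTCP implementation: the function
names pump and relay_once are hypothetical, it handles a single connection,
and it logs only the direction from the host to the printer while the answers
of the printer are forwarded unlogged.

```python
import socket
import threading


def pump(src, dst, log=None):
    """Copy bytes from src to dst until src closes; optionally log them."""
    while True:
        chunk = src.recv(65536)
        if not chunk:
            break
        if log is not None:
            log.write(chunk)
        dst.sendall(chunk)
    try:
        dst.shutdown(socket.SHUT_WR)   # pass the end-of-stream on
    except OSError:
        pass


def relay_once(listen_port, remote_ip, remote_port, capture_path):
    """Accept one connection, forward it to the printer and save the data."""
    with socket.create_server(("", listen_port)) as server:
        client, _peer = server.accept()
    with client, open(capture_path, "wb") as log:
        with socket.create_connection((remote_ip, remote_port)) as printer:
            # Printer -> host runs in a helper thread, host -> printer here.
            back = threading.Thread(target=pump, args=(printer, client), daemon=True)
            back.start()
            pump(client, printer, log)
            back.join(timeout=30)
```

For the example above, relay_once(5001, "10.0.0.199", 5001, "capture.bin")
would be started on the laptop before printing; the file name is arbitrary.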

RelayTCP can also be used to implement more than one port forwarding
facility. You can install it as a service under Windows NT and above and
configure more than one port. Please read the documentation for instructions
on how to accomplish this.

Interactive TCP relay – ITR


ITR from Imperva (http://www.imperva.com/) is a more visual implementation
of a TCP forwarder. To use ITR instead of RelayTCP for the above example, enter
the following settings:

To save the communication to a file, check “Save Log”. If the
communication is time-critical, check “Don’t show messages” to set ITR to
stealth mode.


Network capturing
The previously described methods captured the printer data
stream either at the source or at the drain. But what if we cannot use any of
the described methods?

If your printer is connected to a network, you can record the network
traffic directly and extract the printer data stream from it. This method is
called network capturing.
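
How the printer data stream can be extracted from such a recording is sketched
below in Python. The sketch makes several assumptions: the capture was saved
in the classic libpcap file format, the link type is Ethernet, the traffic is
unfragmented IPv4, and the file contains only one print job to the given port.
The function name extract_print_stream is hypothetical. TCP segments are
simply sorted by sequence number, so retransmissions are dropped, but sequence
number wrap-around is not handled.

```python
import struct


def extract_print_stream(pcap_bytes: bytes, dst_port: int = 9100) -> bytes:
    """Concatenate the TCP payload sent to dst_port in a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"                       # written on a little-endian machine
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    if struct.unpack(endian + "I", pcap_bytes[20:24])[0] != 1:
        raise ValueError("only Ethernet captures are handled")

    segments = {}
    offset = 24                            # skip the global header
    while offset + 16 <= len(pcap_bytes):
        incl_len = struct.unpack(endian + "I", pcap_bytes[offset + 8:offset + 12])[0]
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue                       # not IPv4
        ip = frame[14:]
        header_len = (ip[0] & 0x0F) * 4
        if header_len < 20 or ip[9] != 6:
            continue                       # not TCP
        total_len = struct.unpack("!H", ip[2:4])[0]
        tcp = ip[header_len:total_len]
        if len(tcp) < 20 or struct.unpack("!H", tcp[2:4])[0] != dst_port:
            continue                       # other destination port
        seq = struct.unpack("!I", tcp[4:8])[0]
        payload = tcp[(tcp[12] >> 4) * 4:]
        if payload:
            segments.setdefault(seq, payload)   # keep the first copy only
    return b"".join(segments[seq] for seq in sorted(segments))
```

For LPD traffic on port 515 the result would still contain the LPD commands
and the control file, so this shortcut is mainly useful for RAW printing.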

The most common problem for today’s network capture software is the use
of switches as internetworking devices. A switch by design sends incoming
data out only on the port where the destination device is
connected.

So if we want to capture a data stream between the source (the computer
system) and the drain (the printer), we must either temporarily replace the switch
with a hub that sends all data to all connected devices, or
we must persuade the switch to also send the data to the computer system
running the capture software.

If you use small office switches you will have to use a hub for this. More
sophisticated, manageable switches can normally
fall back to hub mode or feature a monitor port or promiscuous port. A
system connected to this monitor/promiscuous port receives all data packets
sent through the switch.


Now there is a second step to be taken prior to using network capture
software: activating the network card's promiscuous mode!

Since network cards and hubs were designed at the same time (when there
were no switches around), the designers built a packet filter into the card to
ease the load on the higher levels of the network protocols, which were
implemented in software.

So only packets that are addressed to the card itself are delivered to the
higher protocol stack levels; all others are dismissed silently.

To use a standard computer as a network capturing station, this network card
filter must be bypassed. This is called promiscuous mode.

Only in promiscuous mode can the network capturing software see the
packets on the network that are addressed to other computers and systems
attached to the network. Otherwise the software will only see the packets
addressed to the computer itself, which is the same as using the switch in its
native mode.


Sometimes the network capture software is able to do this mode switching by
itself, but this of course has its limits with newer network cards,
especially if they are integrated on a motherboard, because the
network capture software must know the card and how to switch it.

So sometimes the network capture software uses a special driver like the
WinPcap driver for this.

But using sophisticated network capture software is far beyond the scope of
this paper. Please refer to the user manual of your favourite network capture
and analysis software for more information.

[Screenshot: an example of Packetyzer capturing some network data.]

In the addendum you will find some (free) network capture and analysis tools
that may help you in getting your data stream.


Network tools
Netcat – The network swiss army knife
http://www.vulnwatch.org/netcat/

Ethereal – A network protocol analyser
http://www.ethereal.com/

Packetyzer – Windows Interface for Ethereal
http://www.networkchemistry.com/products/packetyzer/

RelayTCP – TCP/IP relay software
http://www.dlcsistemas.com/

Interactive TCP relay – TCP/IP relay software
http://www.imperva.com/application_defense_center/tools.asp

MiniLPD – small footprint LPD-to-file capture tool
MiniRAW – TCP-RAW-communication (Port 9100) capture tool
http://www.digitalissues.co.uk

cti:downloader 2006 – multi-purpose printer troubleshooting tool
http://www.cyrtech.de/
