Use Inkscape and XSLT to Create Cross-Platform Reports and Forms
Author Note: This was originally published in the April 2007 edition of Linux Journal. I am replicating it here for posterity.
I work for a small health care company named NovaSys Health developing application software. My colleagues and I are responsible for writing software to process health care claims, manage workflow and make the company as efficient as possible. We recently decided to replace a piece of third-party software that took health care claim data and put it on standard claim forms. The software would take the data and transpose it into PDF files that we stored on a large file server. Each PDF contained one claim on its proper form. We made the decision to replace the software because we needed something that was more agile. Since we already had the data stored in a database, we wanted something that would create the claim image dynamically and not use space on our servers.
Health care claims are very intricate. There are lots of boxes and boilerplate text that have to be drawn. The conventional way to do this with a software application would be to draw a series of lines using coordinates and lengths. Then lay the static and dynamic content on top of the newly drawn lines. The process of programming this is long and tedious, not to mention error prone. We wanted something that was easier to create and maintain. Here was our list of requirements:
- Must be able to print high quality versions of the claims.
- Claims must be accessible from a web browser.
- Has to be programming language independent. We use Python, PHP, Perl and Java. The images need to be created using any of these languages.
- Have to be able to convert the claim data and form into several different file formats, specifically PDF and PNG.
- Entire solution must be platform independent.
After reviewing the requirements we looked at several different open and closed source options. None of them met all of our requirements, so we turned to creating our own solution. We tried scanning a blank claim form and using ImageMagick to put the claim data on it. This almost gave us what we wanted. The problem was that it was going to be tedious and redundant to create the solution in all of the required languages. Next we turned to FOP (Formatting Objects Processor). This solution was closer to what we wanted, but it had a problem: it would take too long to create the claim forms. Plus the solution was not really language independent either. We could have written wrappers for it but we were convinced that there was still a better solution.
While exploring the FOP solution we had the idea of using Scalable Vector Graphics (SVG). Basically we would take a SVG image of the claim form and make it into an XSLT (eXtensible Stylesheet Language Transformation) since the SVG format is a special XML format. Then we would pull the claim data from the database and put it in a standard XML format. Using any of our languages we could then take the XSLT and the XML and create a SVG image of the claim. This solution met all of our requirements. It was language and platform independent. We could print the SVG images and embed them into web pages. Furthermore, since SVG images are just text they can easily be converted into different file formats. Another nice feature of this solution was the small file size of the SVG images. If we wanted to archive the images they would take up a fraction of the space the old solution did. Again being text, the images could be compressed and save even more space.
Creating the Master SVG
One of the things that made the SVG solution so appealing was how easy it would be to create and maintain the master SVG image. To do this we would use Inkscape. Inkscape is a SVG authoring tool that will work on Linux, Mac OSX, Windows and other UNIX-like operating systems. There are other SVG authoring tools available; we chose Inkscape because it is open and it is in the package manager for most Linux distributions.
The first thing we did to create the master SVG was open Inkscape and create a new US Letter size document. To keep things organized we created four layers in the new document: scan, overlay, boilerplate and dynamic text. Using the scan layer we imported a scan of a claim. Doing this allowed us to line everything up on the Inkscape stage without having to measure anything. After importing the image we locked the layer so that it could not be accidentally modified. In fact, as we finished with each layer of the SVG we locked it to ensure it did not get tampered with.
Next we used the overlay layer to trace all of the lines and boxes from the original claim that we imported. This step was a little tricky. When the image we scanned was created the lines were not spaced evenly for one reason or another. We decided to line things up correctly on our version. Fortunately, Inkscape has tools to do this automatically. By selecting all of the objects that need to be spaced out (Shift+Left Click) and using the “Align and Distribute” dialog (Object > Align and Distribute… in the menu) Inkscape fixed the spacing issues. When we finished we had something that looked like Figure 2.
After drawing all of the lines it was time to add all of the boilerplate text. For this we used the aptly named boilerplate layer. Before we got started we decided to hide the scan layer because the text no longer lined up with it after we adjusted the grid lines on the overlay layer. To align the text properly we used the Guides in Inkscape. Guides are exactly what their name suggests. They are guide lines that exist only inside of Inkscape for the purpose of aligning objects. To use a guide line just click the top or left-hand margin and drag the line into place. To get the most out of the guide lines we enabled the “Snap points to guides” feature (File > Document Preferences > Guides). Doing this allowed us to place all of the text exactly in alignment. Look at Figure 3 to see what the SVG looked like after this step.
Finally we switched to the dynamic text layer and added place holders where the claim data will be located. Again we used the guides to align everything. For the place holder we used a single “$”. Then to make life easier when we converted this to an XSLT we renamed each of the dynamic text objects to something relevant. We did this by left clicking on the object and going to “Object > Object Properties…” in the menu. Figure 4 shows the final master SVG with the guide lines.
Creating the master SVG took about four full hours of work. I would venture to guess that it would have taken several days to do this programmatically.
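Because every placeholder has to be found again by name when the XSLT markup is added, it can help to check the master SVG for text objects that were never renamed. The following is only a minimal sketch of such a check, not something we ran as part of the project. The SVG excerpt, the function names and the second object id are made up for illustration; the "dynamic text" layer label and the textPatientCity id follow the names used in this article, and the check assumes Inkscape's default ids look like "text" followed by digits.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
INKSCAPE_NS = "http://www.inkscape.org/namespaces/inkscape"

# Hypothetical two-object excerpt of a master SVG (not the real claim form).
MASTER_SVG = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape">
  <g inkscape:groupmode="layer" inkscape:label="dynamic text" id="layer4">
    <text id="textPatientCity" x="33.2" y="231.8"><tspan>$</tspan></text>
    <text id="text7280" x="120.0" y="231.8"><tspan>$</tspan></text>
  </g>
</svg>"""


def placeholder_ids(svg_source, layer_label="dynamic text"):
    """Return the ids of all text objects in the named Inkscape layer."""
    root = ET.fromstring(svg_source)
    ids = []
    for group in root.iter("{%s}g" % SVG_NS):
        # Inkscape stores the layer name in the inkscape:label attribute.
        if group.get("{%s}label" % INKSCAPE_NS) != layer_label:
            continue
        for text in group.iter("{%s}text" % SVG_NS):
            ids.append(text.get("id"))
    return ids


def unnamed_ids(ids):
    """Return the ids that still look like defaults, e.g. 'text7280'."""
    return [i for i in ids if re.fullmatch(r"text\d+", i)]


ids = placeholder_ids(MASTER_SVG)
print(unnamed_ids(ids))  # objects that still need a meaningful name
```

Running a check like this before the conversion keeps the later hand editing limited to objects with recognizable names.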
Converting the SVG to an XSLT
Once we had the master SVG finished it was time to convert it into an XSLT. Since SVG images are just XML files we added all of the XSLT markup with a text editor. Converting the SVG was a rather simple matter. To make it a true XSLT there are only a few lines that are required in the header. Example 1 shows a few lines of the SVG before we modified it. Example 2 shows the same set of lines with the XSLT markup.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
...
</svg>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/claim">
<svg
...
</svg>
</xsl:template>
</xsl:stylesheet>
As you can see there are four new lines. The first declares this file an XSLT. The second contains an XPath (XML Path Language) expression that matches the root element in our claim data XML. This line tells the XML transform engine where to start reading the XML to do the conversion. The last two new lines simply close the open xsl tags.
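We made this change by hand in a text editor, but the same wrapping can be scripted. The sketch below is an illustration only: the function name and the tiny stand-in SVG are assumptions, while the stylesheet and template lines are the ones from Example 2. It keeps the XML declaration as the first thing in the file and places the XSLT header after it.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# The two opening lines and two closing lines from Example 2.
XSLT_HEADER = (
    '<xsl:stylesheet version="1.0" xmlns:xsl="%s">\n'
    '<xsl:template match="/claim">\n' % XSL_NS
)
XSLT_FOOTER = "</xsl:template>\n</xsl:stylesheet>\n"


def wrap_svg_in_xslt(svg_source):
    """Wrap an SVG document in the stylesheet and template elements."""
    # The XML declaration must stay at the very top of the file, so split
    # it off before inserting the XSLT header.
    declaration = ""
    body = svg_source.lstrip()
    if body.startswith("<?xml"):
        end = body.index("?>") + 2
        declaration, body = body[:end] + "\n", body[end:].lstrip()
    return declaration + XSLT_HEADER + body.rstrip() + "\n" + XSLT_FOOTER


# Hypothetical stand-in for the master SVG saved by Inkscape.
svg = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n'
    '<svg xmlns="http://www.w3.org/2000/svg" width="8.5in" height="11in">\n'
    '</svg>\n'
)
xslt = wrap_svg_in_xslt(svg)

# Parsing the result confirms that the wrapped file is still well-formed XML.
root = ET.fromstring(xslt.encode("utf-8"))
print(root.tag)
```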
At this point the XSLT can be used, in conjunction with our claim data XML, to produce a SVG. However, the resulting SVG would look just like the SVG did before we modified it. To make it actually show the claim data we had to go into the SVG and add all of the XPath expressions to populate the SVG. Since we divided the SVG objects into layers we only had to modify the dynamic text layer. In the SVG XML the dynamic text layer is nothing more than a series of text tags. Example 3 shows the text tag for the Patient’s City box on our claim form.
<text
xml:space="preserve"
style="..."
x="33.237278"
y="231.77995"
id="textPatientCity"
sodipodi:linespacing="125.00000%"
inkscape:label="#text7272">
<tspan
sodipodi:role="line"
id="tspan7274"
x="33.237278"
y="231.77995"><xsl:value-of select="patient/address/city"/></tspan></text>
When the XSLT is applied to the claim data XML the value of “/claim/patient/address/city” will be substituted here. We went through the entire XSLT and added the appropriate XPath expressions where they belonged. In special cases we also added XPath conditional logic and formatting rules. See the resources for more information about XPath.
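To make the substitution concrete, here is a small sketch of how the relative path in Example 3 resolves against the claim element. It uses Python's standard library rather than an XSLT engine, so it only imitates what xsl:value-of does. The sample XML and its values are made up; only the claim/patient/address/city structure comes from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical claim data; the real XML has many more elements.
CLAIM_XML = """<claim>
  <patient>
    <name>Jane Doe</name>
    <address>
      <city>Little Rock</city>
    </address>
  </patient>
</claim>"""

# The template matches "/claim", so paths inside it are evaluated
# relative to the claim element.
claim = ET.fromstring(CLAIM_XML)

# Same relative path as the select attribute in Example 3.
city = claim.findtext("patient/address/city")

# A path that matches nothing yields an empty string here, much as
# xsl:value-of outputs nothing for an empty node set.
missing = claim.findtext("patient/address/zip", default="")

print(repr(city), repr(missing))
```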
The Claim Data XML
As I mentioned, all of our claim data was in a database, a Postgres database to be more specific. Since we wanted a solution that was not language specific we had to devise a way to get the claim data out of the database and into an XML format without depending on a specific programming language. Greg Lindstrom, our resident Python expert, had the idea to write a series of PL/pgSQL functions to return a single XML string that contained the claim data. His solution was brilliant. All we needed to do to get the claim data was run one small query with the claim number (see Example 4). The result was well-formed XML that we used to make claim images.
SELECT xe2_claim('09152006A5226');
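Any language with a database driver can run this query. As a rough sketch of what that might look like from Python, the function below takes an already open DB-API connection and returns the XML string. The function name is hypothetical, and the "%s" placeholder style is an assumption that holds for drivers such as psycopg2; other drivers may use a different style.

```python
def fetch_claim_xml(connection, claim_number):
    """Run the xe2_claim() query and return the XML string it produces."""
    cursor = connection.cursor()
    try:
        # Pass the claim number as a bound parameter instead of pasting
        # it into the SQL text by hand.
        cursor.execute("SELECT xe2_claim(%s)", (claim_number,))
        row = cursor.fetchone()
    finally:
        cursor.close()
    # The query returns a single row with a single column.
    return row[0] if row else None
```

With a psycopg2 connection, calling fetch_claim_xml(connection, '09152006A5226') would run the same query as Example 4.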
Displaying the Final SVG in a Browser
At first the primary point of creating this solution was to display claims in our web interface. All of our web applications are written in PHP5 and run in an Apache/mod_php environment. To do the XSLT transformation we used the XSL functions in PHP. This set of functions comes as an extension to PHP. The extension is a front end to the libxslt C library.
The XSL extension makes doing the transformation easy. Example 5 shows a portion of a PHP script that transforms the claim XML into a SVG and displays it in the browser.
// import the SVG XSLT
// (load() is an instance method, so create the document first)
$stylesheet = new DOMDocument();
$stylesheet->load("svg_xslt.xsl");
$xsl = new XSLTProcessor();
$xsl->importStylesheet($stylesheet);
// load the claim data XML
// $claim is the database result from example 4
$doc = new DOMDocument();
$doc->loadXML($claim);
// tell the browser this is a SVG document
header("Content-Type: image/svg+xml");
// print the SVG to the browser
echo $xsl->transformToXML($doc);
Example 5 is a simplified version of our solution. In our solution there is the possibility of having multiple pages for a single claim. To fix this we had to do multiple transformations, one for each page. To get the multiple page claims to display in the same browser window we had to embed them. This can be done using the “embed” and “object” HTML tags. Do note that there are several issues with browser compatibility when using these tags. To solve the compatibility issues we wrote a script that checks the user’s browser and decides which tag to use. Then we set the target object data/embedded source to a script similar to the one in Example 5. This allowed the web browser to display multiple SVG images in the same window.
There are other considerations that must be made when using SVG images in a web browser environment. Internet Explorer does not have native support for SVG images. The user is forced to use a third-party plug-in to display the images. Adobe provides one of these for free. Mozilla Firefox has built-in support for SVG images starting with version 1.5. However, there are several aspects of SVG images that Firefox does not support such as scaling and grouped objects. Fortunately for us all of our users use an up-to-date version of Firefox.
Printing and Archiving the SVG Images
Once we finished the web end of our solution we turned our sights towards the rest of our integration. This meant that we had to print the SVG images and find a way to archive them. Since all of our back-end software is written in Python it also meant that we had to do the XML transformation in a different language. To do all of the XML work we used the 4Suite XML API.
To print the images we again turned to Inkscape because our PostScript printer drivers did not know what to do with the SVG images. Inkscape has a handful of options that are only available from the command line. The one we used to print is the -p option. This combined with the “lpr” command allowed us to print our images without any user interaction. Example 6 shows how we did the same transform we did in Example 5. It also shows how we called Inkscape to print our claim images.
import os

from Ft.Xml.Xslt import Processor
from Ft.Xml import InputSource
from Ft.Xml.Domlette import NonvalidatingReader

# load the claim data XML
# claim is the database result (the XML string) from example 4
# claim_number is the claim number passed to that query; the name is
# assumed here so the file is not named after the XML itself
doc = NonvalidatingReader.parseString(claim, "http://spam.com/doc.xml")
# load and process the XSLT
xsl = InputSource.DefaultFactory.fromUri("file://svg_xslt.xsl")
processor = Processor.Processor()
processor.appendStylesheet(xsl)
# do the transformation
result = processor.runNode(doc, "http://spam.com/doc.xml")
# write the SVG to a file
svg_path = "/tmp/" + claim_number + ".svg"
f = open(svg_path, "w")
f.write(result)
f.close()
# print the image on the default printer
os.system("inkscape " + svg_path + " -p '| lpr'")
Earlier I mentioned we often have multiple pages per claim. When printing this was not an issue; we would just send each page to the printer as a separate job. When it came to archiving we had to do something different. As with the web interface we had to group the pages, this time into a file rather than a browser window. When archiving we had to store the files in PDF format because that is what our clients wanted. To get the images into a PDF and combine the multiple page claims we used Inkscape and Ghostscript.
Just as it has an option for printing, Inkscape has an option to export a file into PostScript format. Instead of using “-p” we use “-P” and pass Inkscape the desired output file name. After all of the pages of a claim have been written to files we use the following Ghostscript command to put the pages into a single PDF and archive them.
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=out.pdf /tmp/foo1.ps /tmp/foo2.ps
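The export and merge steps are easy to script for claims with any number of pages. The sketch below only illustrates the idea; the function names and file naming are assumptions, and the flags are the ones described above for the Inkscape release we used, so newer Inkscape versions may need different options.

```python
import os
import subprocess


def export_commands(svg_paths, pdf_path):
    """Build the Inkscape and Ghostscript command lines for one claim."""
    # /tmp/foo1.svg becomes /tmp/foo1.ps, and so on for every page.
    ps_paths = [os.path.splitext(path)[0] + ".ps" for path in svg_paths]
    # One Inkscape call per page: -P writes a PostScript file.
    commands = [
        ["inkscape", svg, "-P", ps] for svg, ps in zip(svg_paths, ps_paths)
    ]
    # One Ghostscript call merges every page into a single PDF.
    commands.append(
        ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
         "-sOutputFile=" + pdf_path] + ps_paths
    )
    return commands


def archive_claim(svg_paths, pdf_path):
    """Run the commands in order, stopping at the first failure."""
    for command in export_commands(svg_paths, pdf_path):
        subprocess.check_call(command)


# Show the commands for a two page claim without running them.
for command in export_commands(["/tmp/foo1.svg", "/tmp/foo2.svg"], "out.pdf"):
    print(" ".join(command))
```

Passing each command as a list avoids shell quoting problems with file names, which is why the sketch uses subprocess instead of building one long command string.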
Maintaining the Forms
Shortly after we finished the project we were faced with making two rounds of changes to the layout of the form. The first round changed the positioning of text objects. The second round was far more extensive: we had to draw a series of new boxes on the form to accommodate a new identification system. Since we could not open the modified SVG in Inkscape we had to make our changes to the original SVG and then manually apply them to the XSLT version.
At first we thought making the changes would be hard and tedious but as it turned out the process was simple. For the first round we simply made the changes in the master using Inkscape, being careful to keep a note of the objects we changed. Then using a text editor we replaced the old portions of XML with the new ones in the XSLT. Since the second batch of changes consisted only of additions we decided to just make another layer in the master and add the boxes. When we finished we simply copied the layer into the XSLT using a text editor.
Conclusion
From start to finish our project took a little over a month to design, build, test and publish. Our solution has made all of our applications more agile and effective. We have also saved terabytes of storage space on our servers.