Elements or Lower

Thu, 20 May 2004

Web Analytics in the CMS

CMS Watch has a recent feature advising vendors to build analytics into their CMS products.

It’s with no small amount of smugness that I realize my CMS for Woking is ahead of the game in this respect, as it contains a fairly substantial traffic analysis system built right into the administration area. Since the CMS revolves around the concept of a resource that may potentially have multiple URLs, it was essential that the CMS could log site traffic in the same terms.
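To illustrate what logging "in the same terms" buys you, here is a minimal sketch in Python (not the CMS's actual Perl code). It assumes each logged hit carries the ID of the resource that was served as well as the URL requested; the URLs, IDs and function names are invented for the example.

```python
from collections import Counter

# Hypothetical log rows: (requested URL, resource ID). All values are made up.
hits = [
    ("/council/meetings", 17),
    ("/meetings", 17),
    ("/contact", 42),
    ("/meetings", 17),
]


def hits_per_url(rows):
    """Count hits the way a conventional log analyser would: by URL."""
    return Counter(url for url, _resource_id in rows)


def hits_per_resource(rows):
    """Count hits by resource, so all URLs of one resource are combined."""
    return Counter(resource_id for _url, resource_id in rows)
```

Counting by URL splits resource 17 across its two addresses, whereas counting by resource ID reports it as a single item, which is the view a publisher working with resources would want.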

Since the CMS delivers the site dynamically through the miracle of mod_perl, it was straightforward to write a custom log handler that records a selection of the properties of each delivered page as known to the CMS: the resource ID, the resource owner, the processing time taken to construct the page, and so forth. Having built the database with all this extra information, it was then a matter of writing the analysis code for the Administration Shell, so that publishers can now see traffic analysis for any period that they define.

The log analysis is made all the better by judicious use of Leo Lapworth’s excellent SVG::TT::Graph to create on-the-fly graphs of log data in SVG format. This wasn’t without problems, however. On IE/Windows, when Adobe’s SVG browser plugin doesn’t feel it’s received an entire SVG file (for example, if the SVG file is actually a 404), it presents a grey box and the browser continues to “fetch” the file ad infinitum. Here in the studio, about a quarter of the graphs would “hang” like that, although a browser refresh would pretty much always clear it. The graphs were always being produced and delivered properly, however, so I was at a loss as to why the browser would sometimes behave as though they weren’t. They’d render perfectly on Safari, for example, every time.

More worryingly, when the staff at the Council tried to look at the graphs, they’d always see the grey box. I couldn’t help thinking that whatever was playing up for me intermittently was being exacerbated for them by the fact that their web connections all run through a proxy server.

I found that when I saved out a graph as a static file, and got them to visit that, everything would be fine. Whatever was going wrong was down to the way I was generating the graphs dynamically and then outputting them directly through mod_perl. I’m guessing there’s an HTTP header that I wasn’t setting properly (although the MIME type was correct) — or perhaps the plugin needs to acquire the data using byteserving?

So, I changed the system to:

  1. Generate the graph for a specific set of parameters as normal

  2. Save it to a disk cache

  3. Issue an HTTP redirect to the cached file

This also meant that I could jump straight to step 3 for a given set of parameters if the graph had already been created.
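The three steps can be sketched roughly as follows. This is a Python illustration, not the mod_perl code the site actually runs. The cache directory, the URL prefix, the hashing scheme and `render_svg` (a stand-in for the real SVG::TT::Graph call) are all assumptions made for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape

# Hypothetical locations; the real cache path and URL are not given here.
CACHE_DIR = Path(tempfile.gettempdir()) / "graph-cache"
CACHE_URL_PREFIX = "/graph-cache/"


def cache_name(params):
    """Derive a stable file name from the graph parameters."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest() + ".svg"


def render_svg(params):
    """Stand-in for the real graph generator; returns a trivial SVG."""
    title = escape(str(params.get("title", "")))
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="200" height="50">'
        f'<text x="10" y="30">{title}</text></svg>'
    )


def graph_redirect(params):
    """Return (status, headers) redirecting the browser to the cached graph."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    name = cache_name(params)
    path = CACHE_DIR / name
    if not path.exists():
        # Steps 1 and 2: generate the graph, then save it to the disk cache.
        path.write_text(render_svg(params), encoding="utf-8")
    # Step 3: redirect to the static file, which the web server delivers
    # like any other file on disk.
    return 302, {"Location": CACHE_URL_PREFIX + name}
```

Because the file name is derived only from the parameters, a repeat request for the same graph skips generation entirely and goes straight to the redirect. A real version would also need some way to expire cached graphs when new log data arrives, which this sketch leaves out.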

And it worked a charm — the graphs now load properly, for everyone, every time. So, like John Gruber, I thought I should document the issue for the sake of Google!