Elements or Lower

Mon, 30 Aug 2004

Woking’s CMS and the Content Management Landscape

David Wheeler’s new article for perl.com is a superb explanation of how Bricolage fits into the CMS landscape. I do wish more CMS vendors would summarise their products in that way: this is what we mean by workflow, this is how we expect you to author content, this is how your site gets delivered to the visitor. Often, wading through the swamp of buzzwords on CMS vendors’ sites leaves you feeling more confused than when you began. Sometimes, the executive summary summarises nothing.

When I started developing the new CMS for Woking Borough Council, I was aware of Bricolage, and somewhat concerned that it might have been better to customise it instead of writing a new system more-or-less from scratch. I’m now reassured that, whilst Bricolage is almost certainly a better piece of programming than my efforts, it occupies a slightly different position from the one we needed.

The key element here was that the existing Woking site contained a fair sampling of CGI applications that we wanted to preserve in the new site — things like the Planning Applications search, The Woking Forum, the events guide and the Council meetings system. These systems needed, of course, to present their content dynamically, but in a way nonetheless integrated into the CMS navigation and presentation system. Consequently, it seemed clear that the presentation aspect of the site itself should be dynamic, rather than generated as static pages by “publishing” content from the CMS administration area.

This means that there are four principal resource types within the Woking CMS:

  1. Text resources
  2. PDF resources
  3. Database resources
  4. External database resources

The first of these is the traditional web page: content is authored visually, using a highly customised version of HTMLArea, and the presentation system wraps it into the site templates and navigation in a predictable way. PDF resources are PDF documents uploaded to the site and integrated as resources within the site navigation. Database resources are the CGI scripts I mentioned above, converted to output their content to the presentation system in the same way that the static content of a text resource is passed through. Finally, external database resources are external systems whose output is proxied over HTTP and transformed in order to be passed through the presentation system as well. This means that, with a suitable XSLT filter, more-or-less anything can be integrated into the site in a way that’s invisible to the end user.
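To make the "everything goes through the presentation system" idea concrete, here is a rough Python sketch rather than the real mod_perl code. All the names (wrap, text_fragment, database_fragment, external_fragment) are invented for illustration; the fetch and transform callables merely stand in for the HTTP proxy and the XSLT filter. PDF resources are left out because how they are rendered isn't described above.

```python
from typing import Callable


def wrap(title: str, fragment: str) -> str:
    """Stand-in for the presentation system: drop a fragment into the site template."""
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><nav>site navigation</nav><main>{fragment}</main></body></html>"
    )


def text_fragment(stored_html: str) -> str:
    """Text resource: the authored HTML is passed through as-is."""
    return stored_html


def database_fragment(script: Callable[[], str]) -> str:
    """Database resource: a converted script hands back its content as a fragment."""
    return script()


def external_fragment(fetch: Callable[[], str], transform: Callable[[str], str]) -> str:
    """External resource: fetch the remote output, then transform it.

    `fetch` stands in for the HTTP proxy and `transform` for the XSLT filter
    (both hypothetical placeholders, not real network or XSLT calls).
    """
    return transform(fetch())


if __name__ == "__main__":
    page = wrap(
        "Planning search",
        external_fragment(
            fetch=lambda: "<result>No applications found</result>",
            transform=lambda xml: xml.replace("result>", "p>"),
        ),
    )
    print(page)
```

Whatever the source, the visitor only ever sees the output of wrap, which is the point of keeping presentation dynamic.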

When authoring content for the site, each resource is given content (or a content source) according to its type, metadata, a “nodename” and a single location within the site hierarchy. URLs are then constructed by working from the nodename upwards through the hierarchy. So, the page on “the Woking Martian” has the nodename martian and is an immediate descendant of the Places to Visit resource, which has the nodename attractions and is in turn an immediate descendant of the Leisure and Tourism resource, which has the nodename leisure and is an immediate descendant of the homepage. Thus, the Martian’s canonical URL is http://www.woking.gov.uk/leisure/attractions/martian. A URL history is kept, so that if a resource is moved around the site, its old URLs will continue to work.
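The URL scheme is easy to sketch. What follows is a minimal Python illustration, not the actual implementation: the Resource and UrlHistory classes are invented here, and the history only records the resource that was moved, whereas a real system would also have to account for its descendants.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Resource:
    nodename: str
    parent: Optional["Resource"] = None  # None marks the homepage


def canonical_path(resource: Resource) -> str:
    """Collect nodenames from the resource up to, but not including, the homepage."""
    parts = []
    node = resource
    while node.parent is not None:
        parts.append(node.nodename)
        node = node.parent
    return "/" + "/".join(reversed(parts))


class UrlHistory:
    """Hypothetical URL history: maps retired paths to the resource that lived there."""

    def __init__(self) -> None:
        self._old: Dict[str, Resource] = {}

    def move(self, resource: Resource, new_parent: Resource) -> None:
        # Remember the path the resource is leaving before re-parenting it.
        self._old[canonical_path(resource)] = resource
        resource.parent = new_parent

    def current_path(self, old_path: str) -> Optional[str]:
        resource = self._old.get(old_path)
        return canonical_path(resource) if resource is not None else None


if __name__ == "__main__":
    home = Resource("")
    leisure = Resource("leisure", home)
    attractions = Resource("attractions", leisure)
    martian = Resource("martian", attractions)
    print(canonical_path(martian))  # /leisure/attractions/martian

    history = UrlHistory()
    history.move(martian, leisure)
    print(history.current_path("/leisure/attractions/martian"))  # /leisure/martian
```

Because the path is derived from the hierarchy rather than stored, moving a resource changes its URL automatically, which is exactly why the history is needed.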

The administration area is built around a database of editing options, each of which can apply to the site as a whole, to individual resources, or to any resource matching certain criteria. Options can be restricted to site administrators or to publishers according to their relationship to the resource in question. For example, the option to edit the content of a text resource is restricted to site administrators and to the owner of the resource (the owner being the publisher who originally authored it). This means that we can customise editing options in a very precise way: there’s an editing option, for example, to add a new PDF to the Forward Plan of Key Decisions that’s only available to site administrators and the owner of that resource.
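As a rough illustration of how such options might be matched, here is a Python sketch under several assumptions: the class names, the "forward-plan" nodename and the site-wide "Manage publishers" option are all made up for the example, and the real system keeps its options in a database rather than a list.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class User:
    name: str
    is_admin: bool = False


@dataclass(frozen=True)
class Resource:
    nodename: str
    kind: str   # e.g. "text" or "pdf"
    owner: str  # name of the publisher who authored the resource


@dataclass(frozen=True)
class EditingOption:
    label: str
    matches: Callable[[Optional[Resource]], bool]  # None means "the site as a whole"
    allowed: FrozenSet[str]                        # any of "admin", "owner"


def roles_for(user: User, resource: Optional[Resource]) -> FrozenSet[str]:
    """Work out the user's relationship to the resource in question."""
    roles = set()
    if user.is_admin:
        roles.add("admin")
    if resource is not None and resource.owner == user.name:
        roles.add("owner")
    return frozenset(roles)


def available_options(options: List[EditingOption], user: User,
                      resource: Optional[Resource] = None) -> List[str]:
    """Labels of the options that match the resource and that the user may use."""
    roles = roles_for(user, resource)
    return [o.label for o in options if o.matches(resource) and roles & o.allowed]


# Hypothetical option table for illustration only.
OPTIONS = [
    EditingOption("Edit content",
                  lambda r: r is not None and r.kind == "text",
                  frozenset({"admin", "owner"})),
    EditingOption("Add PDF to Forward Plan",
                  lambda r: r is not None and r.nodename == "forward-plan",
                  frozenset({"admin", "owner"})),
    EditingOption("Manage publishers",
                  lambda r: r is None,
                  frozenset({"admin"})),
]
```

Keeping the criteria as data (here, a predicate per option) is what allows an option to be scoped to one resource, a class of resources, or the whole site without special-casing.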

As I’ve mentioned in a previous entry, workflow is determined by the category metadata assigned to a resource. The relevant Service Head is alerted by email whenever a new resource is posted in one of their categories (or if certain kinds of changes are made to a resource already published in that category), and they’re invited (well, actually, nagged) to either approve that resource for publication to the live site or defer it. If they approve, a further message is sent to the site editor (one of the site administrators) inviting her to do the same. If she approves, the resource is made available to the public.
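The two-stage approval can be pictured as a tiny state machine. This Python sketch is only an approximation: the state names, the email addresses and the category table are invented, and it treats a deferral as final, which may not match what the real system does next.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class State(Enum):
    AWAITING_SERVICE_HEAD = "awaiting service head"
    AWAITING_SITE_EDITOR = "awaiting site editor"
    LIVE = "live"
    DEFERRED = "deferred"


# Hypothetical lookup from category metadata to the responsible Service Head.
SERVICE_HEADS: Dict[str, str] = {"Leisure and Tourism": "leisure.head@example.org"}
SITE_EDITOR = "site.editor@example.org"


def submit(category: str) -> Tuple[State, str]:
    """A new resource starts with the Service Head for its category."""
    return State.AWAITING_SERVICE_HEAD, SERVICE_HEADS[category]


def decide(state: State, approved: bool) -> Tuple[State, Optional[str]]:
    """Apply one decision; return the new state and who to email next, if anyone."""
    if state not in (State.AWAITING_SERVICE_HEAD, State.AWAITING_SITE_EDITOR):
        raise ValueError(f"no decision is pending in state {state.value!r}")
    if not approved:
        return State.DEFERRED, None
    if state is State.AWAITING_SERVICE_HEAD:
        return State.AWAITING_SITE_EDITOR, SITE_EDITOR
    return State.LIVE, None


if __name__ == "__main__":
    state, notify = submit("Leisure and Tourism")
    state, notify = decide(state, approved=True)   # Service Head approves
    state, notify = decide(state, approved=True)   # site editor approves
    print(state.value)  # live
```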

Aside from Dublin Core and eGMS metadata, a resource can additionally be set to be hidden from navigation altogether, assigned a colour scheme, set to persist down the hierarchy, or set to masquerade as another resource depending on context.

As my reader will know, this is all built on mod_perl, so those CGI scripts have become Apache::Registry scripts, and the whole thing is held together by customising parts of the Apache lifecycle.

I guess in landscape terms, that positions it somewhere over there, with the camels — no, not the adults or the baby ones, but the teenagers; yes, that’s it, the “individual” one with all the makeup.