Building an Atlassian Confluence plugin without Atlas, et al

Note: This posting has been updated so as to work with Confluence 4.x. (The previous version of the posting targeted Confluence 2.x.)

I spent several semi-productive hours this weekend writing a plugin for Atlassian's Confluence wiki. The plugin enables the running of JavaScript on the server and placing the result into the page displayed. I have found that using a wiki to intermix static content and dynamic content makes for a great reporting and situation-awareness tool kit (see JSPWiki). I wanted to do the same again within the confines of Confluence 4.x.

Confluence has a really good plugin manager webapp. You can search a repository for existing plugins for immediate inclusion, and you can also upload a plugin from your desktop. I gave the plugin manager a workout getting my plugin to work and can attest to its stability.

And here is the rub: Atlassian has made including and using plugins simple but has made creating a simple plugin almost impossible. My plugin's actual logic is very simple:

private static final ScriptEngineManager factory = new ScriptEngineManager();
ScriptEngine engine = factory.getEngineByName("JavaScript");
Object result = engine.eval(script);
return result.toString();

To make this happen, however, Atlassian wants me to use Atlas. Atlas, as far as I can tell, is much like Ruby on Rails and Spring Roo, where the tool lays out a directory structure with files that together are the parts of and tools for building a "Hello World" application/plugin. In addition to Atlas, Maven and (perhaps) Eclipse are also needed. If my occupation were building whole applications on top of the Atlassian products and their APIs I could understand the logic of this tool chain. But I have a 4 line (!) plugin.

Confluence has been around for a long time and I guessed that the pre-Atlas way of building plugins was documented, with code examples. As far as I can tell, and this is after much searching, this is not the case. I was not able to find a single example of a basic plugin built with, for example, Ant. Since, in the end, a plugin is nothing more than a jar containing code and configuration I was shocked that this was missing from the mass of other documentation Atlassian provides. A whole population of mostly in-house programmers is being ignored. These are the programmers that are going to build plugins, i.e. small extensions to big tools that improve the match between Confluence and the users' needs.

To this end, here is how I build a basic macro plugin. (Note that the plugin documented here is not what I finally created. In the process of using the JDK's implementation of JavaScript I discovered that it is an old version of Mozilla's Rhino that does not support E4X, the XML language extensions. E4X makes XML a first class data type within JavaScript. Even the JavaScript syntax has been extended to allow for XML constants, for example x = <a/>. And so the final plugin uses Rhino 1.7R3 which does support E4X and JavaScript 1.8.)

The plugin's jar needs a minimum of two files: the Java class and the atlassian-plugin.xml configuration file. The development directory tree is

The JavaScriptPlugin class extends BaseMacro and must override the methods isInline(), hasBody(), getBodyRenderMode() and execute(). The isInline() method specifies if the output of the plugin is suitable for an HTML span or block. The hasBody() method specifies if the plugin has content, as does, for example, the {code} macro. The getBodyRenderMode() method specifies how Confluence is to handle the macro's output. Returning RenderMode.COMPATIBILITY_MODE specifies that the output is wiki text to be rendered as HTML. And, finally, execute() does the work of the plugin.

package com.andrewgilmartin.confluence.plugins.script;

import java.util.Map;
import com.atlassian.renderer.RenderContext;
import com.atlassian.renderer.v2.macro.BaseMacro;
import com.atlassian.renderer.v2.macro.MacroException;
import com.atlassian.renderer.v2.RenderMode;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class JavaScriptPlugin extends BaseMacro {

    private static final ScriptEngineManager factory = new ScriptEngineManager();

    // The macro's output is suitable for an HTML span.
    public boolean isInline() {
        return true;
    }

    // The script to run is the macro's body.
    public boolean hasBody() {
        return true;
    }

    // The output is wiki text to be rendered as HTML.
    public RenderMode getBodyRenderMode() {
        return RenderMode.COMPATIBILITY_MODE;
    }

    public String execute(Map params, String body, RenderContext renderContext) throws MacroException {
        try {
            ScriptEngine engine = factory.getEngineByName("JavaScript");
            engine.put("params", params);
            Object evalResult = engine.eval(body);
            // A script can evaluate to null; render that as empty text.
            return evalResult == null ? "" : evalResult.toString();
        }
        catch (ScriptException e) {
            throw new MacroException(e);
        }
    }
}

The atlassian-plugin.xml is an XML declaration of the plugin. It must be in the root of the plugin's jar file. The file contains two main sections. The first is plugin-info which declares the plugin as a whole. The second is the (repeatable) macro which declares the specific plugin. The atlassian-plugin.xml has several further refinements of these sections. I was not able to find an XML schema for this file type.

What I have discovered about it is from reviewing Atlassian's own plugins. The atlassian-plugin element's key attribute seems to be a unique identifier but does not have a prescribed structure: I am here just using the JavaScriptPlugin class's package name. The macro element's name attribute is the name of the macro as used by the user. For example, {javascript}1+2+3{javascript}.

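<atlassian-plugin key="com.andrewgilmartin.confluence.plugins.script" name="JavaScript Macro Plugin">
    <plugin-info>
        <description>A JavaScript macro plugin</description>
        <vendor name="Andrew Gilmartin" url="" />
        <!-- Assumed values: the version number and the macro's key. -->
        <version>1.0</version>
    </plugin-info>
    <macro name="javascript" key="javascript" class="com.andrewgilmartin.confluence.plugins.script.JavaScriptPlugin">
        <description>A JavaScript macro plugin. Place the script to execute within the body of the macro.</description>
    </macro>
</atlassian-plugin>
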

The build script is

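<project name="com.andrewgilmartin.confluence.plugins.script" default="dist">

    <property environment="env" />

    <path id="build.classpath">
        <fileset dir="${env.HOME}/lib/atlassian-confluence-4.3.3/confluence/WEB-INF/lib">
            <include name="**/*.jar"/>
        </fileset>
        <pathelement location="${basedir}"/>
    </path>

    <target name="dist">
        <!-- Assumed steps: compile in place, then jar the classes with the descriptor. The jar name is an assumption. -->
        <javac srcdir="${basedir}" destdir="${basedir}" classpathref="build.classpath" includeantruntime="false"/>
        <jar destfile="${basedir}/${ant.project.name}.jar" basedir="${basedir}" includes="atlassian-plugin.xml,**/*.class"/>
    </target>

    <target name="clean">
        <delete>
            <fileset dir="${basedir}" includes="**/*.class"/>
            <fileset dir="${basedir}" includes="**/*.jar"/>
        </delete>
    </target>

</project>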

Replace ${env.HOME}/lib/atlassian-confluence-4.3.3/confluence/WEB-INF/lib with the location of your Confluence jar files.

For more information about building out your plugin do read the documentation and review the code for Atlassian's own plugins, for example Confluence Basic Macros, Confluence Advanced Macros, Confluence Information Macros and Chart Macro.

Atlassian uses the Spring toolkit in their development. Since Spring performs dependency-injection of objects matching the results of using Java reflection to find "bean" names, a lot of Atlassian's code looks like magic is happening. That is, there is no visible configuration or other assignment of objects to values and yet the assignments have to be made for the code to work. Spring is the magician.

And that is it. The built jar file is a valid Confluence 4.x plugin. Happy wiki scripting.
Download an archive of the development tree (Thanks to David Wilkinson for the tool to create this data URI.)

Coda: I was asked why I created my own scripting plugin when two already exist in Atlassian's plugin repository. The initial reason was that our MySQL 5.0 installation has too small a max_allowed_packet size and so Confluence was not able to install the existing script plugins into the database. The ultimate reason was that I knew what I wanted from the plugin and said to myself "how hard can it be?"

Tools & notes

What I am focusing on in this photo is that he is using a computer and a notebook. I do the same. It suddenly occurred to me that every crafts person, since the beginning of time, uses this work arrangement. Tool to one side and notes to the other side. This is a great pattern. Why do we not perpetuate it today? Why put both on one machine? Some places have. When I worked at Lotus we always had a "development" machine and an "administration" machine. Perhaps some places do.

I should be able to open my laptop for use on one side of the desk and then slip out the tablet to place on the other side. Just say'n.


I am because we are.

Horizon charts

I have been slowly working on a metrics collection and monitoring service. There are many others, but I wanted something very simple to feed and to integrate with a wiki to monitor/display. During the development of the service I discovered horizon charts by way of (the brilliant) Cubism.js.

The goal of a horizon chart is to use a minimum of vertical height without loss of precision. Horizon charts look like bar charts. However, horizon charts use both the top and bottom edges as axes. The top edge is used to show values below a threshold with the bars going downward and the bottom edge is used to show values above the threshold with the bars going upward. Further, to extend the range of a value beyond the height of the chart the values are "folded" and the folds layered on the chart. Each fold is drawn as a bar on top of the previous fold's bar. The illustration at Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations shows this very well.

I was a little intimidated by Cubism.js and D3.js; they are sophisticated toolkits that will take more time for me to understand than I wanted to commit just now. Plus I really wanted to learn more about using HTML's canvas element, and so I set about my own implementation. The code is my first cut at the implementation. It works for a fixed, two fold horizon chart. The example chart below plots the values between +/-200 with folds at +/-100.
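The folding step can be sketched on its own, apart from any drawing. This is a minimal illustration in Java rather than the canvas JavaScript of the actual implementation; the class and method names (HorizonFold, fold) are hypothetical. It splits a value's magnitude into per-fold bar heights, each capped at the fold size. The sign of the value, not handled here, would decide which edge the bars hang from.

```java
import java.util.Arrays;

final class HorizonFold {

    // Split |value| into `folds` bar heights, each at most `foldSize` high.
    // Anything beyond folds * foldSize is clipped.
    static double[] fold(double value, double foldSize, int folds) {
        double magnitude = Math.abs(value);
        double[] heights = new double[folds];
        for (int k = 0; k < folds; k++) {
            double remaining = magnitude - k * foldSize;
            heights[k] = Math.max(0.0, Math.min(foldSize, remaining));
        }
        return heights;
    }

    public static void main(String[] args) {
        // A value of 150 with folds at 100 fills the first fold and half of the second.
        System.out.println(Arrays.toString(fold(150, 100, 2))); // [100.0, 50.0]
    }
}
```

Each fold would then be drawn over the previous one, typically in a darker shade, so a chart only 100 units high can show values up to 200.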

Update: Updated the working example to be more general. Removed the code from this posting in preference to the Gist. FYI: To run the code just download the gist and open it in a browser. To play with it copy the gist into the HTML panel of and run.

Blackout Friday!

Looking forward to Blackout Friday! I am so surprised that the national brand stores and the electrical grid operators have come together to turn off the power on Friday so we all can stop and view, with utter revulsion, the rapacity that is now "Christmas."


Jeff Atwood recently proposed that Markdown users come together and create a unified specification. While I do think a unified specification would be useful I do not think Markdown is the right syntax. I have used it and many of the other structured text markup alternatives. For programmer types and anyone else used to machine readable documents, structured text markup is a great way to reduce the effort needed to get content on the page. However, HTML is a great way to reduce the effort needed too. The problem with both is that Markdown is too weak a syntax and HTML is too strong a one. We need a middle ground.

Here is a proposal for a middle ground. I am going to call this syntax Permatext:
  • A valid HTML document is a valid Permatext document.
  • HTML class and style information is discarded by a Permatext processor.
  • Any HTML end tag can be dropped if it is clear from the context that the end tag would be used. So, for example, an opening P tag automatically closes the previous P tag. An opening A tag need not use the closing tag if the link's title is a single word. Etc.
  • A short closing tag, </>, closes the most recent open tag.
  • Text within a code or pre tag need not be XML escaped. However, escaped XML will be honored. That is, if a valid named or numbered entity is found then it is honored. [tricky]
  • TODO
For example, the document contained in Atwood's posting would be marked up as follows:

<h1>Lightweight Markup Languages</>

<p>According to <i>Wikipedia</>:

   A <a href="">lightweight markup language</>
   is a markup language with a simple syntax, designed 
   to be easy for a human to enter with a simple text 
   editor, and easy to read in its raw form. 

<p>Some examples are:


Markup should also extend to <i>code</>: 

    20 GOTO 10

[NOTE: The A tag is hard to read. Perhaps assume href.]
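
The short closing tag rule is the easiest one to mechanize. A minimal sketch, assuming a hypothetical ShortCloseExpander class and a deliberately naive regular expression for tags: it keeps a stack of open tags and rewrites each </> as the end tag for the most recently opened one. It does not attempt the implied end tags (such as P closing the previous P), nor does it know about void elements like a bare <br>.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShortCloseExpander {

    // Matches "</>", or an explicit end tag (group 1), or a start tag (group 2).
    private static final Pattern TAG = Pattern.compile(
        "</>|</([A-Za-z][A-Za-z0-9]*)\\s*>|<([A-Za-z][A-Za-z0-9]*)(?:[\\s/][^>]*)?>");

    static String expand(String text) {
        Deque<String> open = new ArrayDeque<>(); // most recently opened tag first
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(text);
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            last = m.end();
            String token = m.group();
            if (token.equals("</>")) {
                // Short close: name the most recent open tag, if there is one.
                out.append(open.isEmpty() ? token : "</" + open.pop() + ">");
            } else {
                if (m.group(1) != null) {
                    open.removeFirstOccurrence(m.group(1)); // explicit end tag
                } else if (!token.endsWith("/>")) {
                    open.push(m.group(2));                  // start tag, not self-closing
                }
                out.append(token);
            }
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("<p>According to <i>Wikipedia</>:"));
    }
}
```

A stack is the natural choice here because "the most recent open tag" is exactly what the top of a stack holds.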

  For all Permatext details see

A simple monitor with SMS messaging

Now and then you need to watch a directory for changes. For me, this is mostly to ensure a data build is continuing. When the directory size remains constant then it is likely that the data build has failed. A simple monitor is to regularly check the directory's size and send an email message should it not change. If the email is to an email-to-SMS service then you will know sooner about the problem.

To check regularly, use cron. While you can have a program that loops over the check and then sleeps for an interval, this is very likely to fail at some time, that is, the program dies without warning. Cron will not fail in this way.

Since you are interested in change you need to record the previous value to compare to the new value. Just use a file. Anything else will likely fail.

So here is your crontab line (place all lines on one line) to run the command several times per hour
*/5 * * * * $HOME/bin/watch-directory-and-mail
                -o $HOME/var/ 
                -w $HOME/var/data-2012-09-18/

This says to watch the directory "$HOME/var/data-2012-09-18/", keep the record of previous directory size in "$HOME/var/", and to raise an alarm by sending email to "" (AT&T's email-to-SMS service). The "$HOME/bin/watch-directory-and-mail" script is  

#!/bin/bash

function send-message() {
  echo "$3" | mail -s "$2" "$1" >/dev/null 2>&1
}

while getopts "t:s:w:o:h" opt
do
 case $opt in
  t) TO=$OPTARG ;;
  s) SUBJECT=$OPTARG ;;
  w) WATCH=$OPTARG ;;
  o) OUTPUT=$OPTARG ;;
  *) echo "usage: $(basename $0) \
        -t email \
        -s subject \
        -w directory/file \
        -o status-file" ; exit 1 ;;
 esac
done

# Default the subject to the watched name once the options are known.
SUBJECT=${SUBJECT:-$(basename "$WATCH")}

V1=$(du -s "$WATCH"|cut -f 1)
if [ -r "$OUTPUT" ]
then
  V0=$(cat "$OUTPUT")
  if [ "$V0" -eq "$V1" ]
  then
    send-message "$TO" "$SUBJECT" "$(basename "$WATCH") unchanged size at $V1"
  fi
else
  send-message "$TO" "$SUBJECT" "$(basename "$WATCH") initial size is $V1"
fi

echo "$V1" >"$OUTPUT"


Update: The script works just as well with a file as with a directory. If the rate of growth of the directory/file is slow then use "du -sb" to get byte counts. (OS X's du does not have the -b option.)

Update: If you have multiple watches going on then the message sent is not very helpful. I have updated the code to enable the user to specify the message.

Update: If you don't want the script texting you during the night then restrict the crontab schedule to daytime hours, e.g. "*/5 8-17 * * * $HOME/bin/watch-directory-and-mail ..." runs only from 8 AM until just before 6 PM.

Using a simple command line script and Marked to keep and display notes

I was reading this and it sparked an idea for the combined use of Marked and a command line script. I often would like to simply collect notes as I work at the command line. The notes can all go into the same file and I will sort them out afterwards (GTD "inbox" style). So the simple command line

function n() { echo "$(date): $*" >>~/Inbox/Note.txt; }
defines an "n" command that works well enough. Usage is simply to type "n this is my note" at the command line and the script will put the timestamp and the "this is my note" text into the file ~/Inbox/Note.txt.

The spark, however, was that I could use Marked to display an automatically updated view of the note file's contents. Further, I could add a little markdown notation to the script to help visually separate the notes. The final script is

function n() { printf "\n### %s\n%s\n" "$(date)" "$*" >>~/Inbox/Note.txt; }
Here the date is formatted as a header and the note's text as a paragraph.

The end result is a window like this

When to delete an abandoned project?

One of the projects I worked on was funded by the EU and so the code needed to be open-sourced. We did this by placing the code on SourceForge. And that was mostly all that was done. Eventually the EU stopped funding the project and so now the code sits abandoned. Untended. Crumbling with each new release of a dependent toolkit or tool.

When it was coded it represented some of my best work. Some of it is still good. Some of it I look upon with horror. This posting is not about me and my code but about the code's life. I can say without any doubt that the code will never be used again. Even in part. Anything that was useful has already been subsumed into other projects. So why keep it? Why should SourceForge pay to keep it available? I don't have a good answer yet, but I am leaning towards deleting it.

A less violent killmatching

After reading my previous posting Jonathan Stockdill suggested that the command really needed dry-run and/or confirmation flags. He is right.

A challenge in doing this in bash was in coordinating the input from the ps pipeline and the user's confirmation. I came up with an answer but am not sure it is the right one and so have asked on for help. Until then, here is a less violent killmatching:

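#!/bin/bash -e

PS_OPTS=${PS_OPTS:- -A -o pid,command}

INTERACTIVE=false
DRY_RUN=false
KILL_OPTS=

while [ 0 -ne $(expr "$1" : '\-') ]
do
  case "$1" in
    -i) INTERACTIVE=true ;;
    -n) DRY_RUN=true ;;
    -h) echo "usage: killmatching [-in] [kill-options] pattern1 pattern2 ... patternN" ; exit ;;
    *) KILL_OPTS="$KILL_OPTS $1" ;;
  esac
  shift 1
done

# Keep the terminal on descriptor 4; inside the pipeline stdin is ps's output.
exec 4<&0

for PATTERN in "$@"
do
  ps $PS_OPTS \
  | grep -v -F grep \
  | grep -v -F "$0" \
  | grep -e "$PATTERN" \
  | while read PID COMMAND
    do
      if $INTERACTIVE
      then
        read -u 4 -p "kill $PID $COMMAND? (y/N) " choice
        if [ "$choice" == "Y" -o "$choice" == "y" ]
        then
          kill $KILL_OPTS $PID
        fi
      elif $DRY_RUN
      then
        echo kill $KILL_OPTS $PID
      else
        kill $KILL_OPTS $PID
      fi
    done
done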

Killmatching: a violent killall.

I always seem to need a better killall especially when running script or Java processes where the command name is, for example, "java". Scratching the itch, here is killmatching

#!/bin/bash -e

# usage: killmatching [ kill-options] pattern1 pattern2 ... patternN

PS_OPTS=${PS_OPTS:- -A -o pid,command}

KILL_OPTS=

# Leading arguments that start with a dash are passed on to kill.
while [ 0 -ne $(expr "$1" : '\-') ]
do
  KILL_OPTS="$KILL_OPTS $1"
  shift 1
done

for pattern in "$@"
do
  for pid in $(ps $PS_OPTS \
    | grep -v -F grep \
    | grep -v -F "$0" \
    | grep -e "$pattern" \
    | awk '{ print $1 }')
  do
    kill $KILL_OPTS $pid
  done
done

Be very careful using it: with one wrong pattern you can shut down the computer.

Some ps commands do not support the -A and -o options. Edit these options for your ps.

Establishing a new leader and informing the lead

I have a distributed system design problem. The problem brief is
We have many clients to a service. The service needs to be fault-tolerant and so it will have many replicas. When a client can no longer access the service it will switch to a replica and inform all the other clients to switch to the same replica. (It is not acceptable to load-balance across the replicas as the replicas' data values are not exactly the same but all the clients must return the same data values at all times.)
My current design is to have the client, on failing to reach the service, ask for a new service leader. When a new service leader is established it will then notify all the clients to use it.

As with many distributed coordination designs I need a distributed group manager. I am considering the use of JGroups and/or Apache Zookeeper in a solution.

Is there an existing recipe or recipes that I should be looking at to solve this problem?

Update: We finally decided to have a master make a new copy of the repository, copy it to all service hosts, and notify the service hosts to switch to the new repository. The master tracks who has switched and raises an alarm if there are missing switchers after a fixed interval.
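
The master's bookkeeping can be very small. What follows is only a sketch under assumptions, not the code we deployed: the SwitchTracker class, its method names, and the use of caller-supplied millisecond clocks are all hypothetical. It records which hosts have yet to confirm the switch and, once the interval has passed, reports the ones still missing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

final class SwitchTracker {

    private final Set<String> pending;  // hosts that have not yet confirmed the switch
    private final long deadlineMillis;  // when missing confirmations become an alarm

    SwitchTracker(Set<String> hosts, long startMillis, long intervalMillis) {
        this.pending = new TreeSet<>(hosts);
        this.deadlineMillis = startMillis + intervalMillis;
    }

    // Called when a service host reports that it now uses the new repository.
    synchronized void acknowledge(String host) {
        pending.remove(host);
    }

    // Hosts to raise an alarm about: empty until the interval has elapsed.
    synchronized Set<String> overdue(long nowMillis) {
        if (nowMillis < deadlineMillis) {
            return Collections.emptySet();
        }
        return new TreeSet<>(pending);
    }

    public static void main(String[] args) {
        SwitchTracker tracker = new SwitchTracker(
            new TreeSet<>(Arrays.asList("ds1", "ds2", "ds3")), 0L, 1000L);
        tracker.acknowledge("ds1");
        tracker.acknowledge("ds3");
        System.out.println(tracker.overdue(1000L)); // [ds2]
    }
}
```

Passing the clock in, rather than reading it inside the class, keeps the deadline logic easy to exercise without waiting.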

"I like Obamacare."

About a week ago now I was tired of NPR using the term "Obamacare" instead of the law's real name, The Affordable Care Act. And so I sent email and Facebook messages to the local and national NPR offices:

Dear NPR, please call the law by its rightful name, The Affordable Care Act, and not the prejudicial name its opponents want it called.

Both offices wrote back. The local office, RIPR, said that they would use the real name during its initial reference in a story and Obamacare afterwards. Further, "[...] because (like it or not) that's what many people know it by." I was peeved by this sloppiness and the lack of an historical grasp of the significance of names and so sent a rather strident response (for which I did apologize later). The message I received from NPR's ombudsman was remarkable:

Thank you for your inquiry. We asked Ron Elving about the term and here is his reponse:

Initially, the term was coined and used by opponents of the ACA. It had a sneering kind of tone to it, implying that the president was trying to imitate or piggyback on the popularity of Medicare. (As indeed a lot of commercial products have done since 1965.) The White House initially resisted the term for this reason, preferring Patient Protection and Affordable Care Act.

In the headline wars of the cable tv news world, of course, PPACA never had a chance. Obamacare became increasingly common.

So some while ago, the White House did a turnaround and embraced the term. I am attaching a copy of the memo David Axelrod wrote about the term and why it was okay to use it as far as he was concerned. Axelrod was still in the White House as the top political advisor at the time and is now in Chicago co-directing the re-elect campaign.

From: David Axelrod
Date: Fri, 23 Mar 2012 17:34
To: [Supporters]
Subject: Hell yeah, I like Obamacare

Friend --

I like Obamacare.

I'm proud of it -- and you should be, too.

Here's why: Because it works.

So if you're with me, say it: "I like Obamacare."

Obamacare means never having to worry about getting sick and running up against a lifetime cap on insurance coverage. It gives parents the comfort of knowing their kids can stay on their insurance until they're 26, and that a "pre-existing condition" like an ear infection will never compromise their child's coverage.

It's about ending the practice of letting insurance companies charge women 50 percent more -- just because they're women.

And Obamacare can save seniors hundreds of dollars a year on prescription drugs -- and gives them access to preventive care that is saving their lives.

President Obama never lost sight of the fact that this reform is about people. People like his own mother, who spent the last years of her life fighting cancer -- and fighting with insurance companies, too.

That shouldn't happen. And because of Obamacare, it can't.

So next time you hear someone railing against Obamacare, remember what they're actually saying they want to take away.

And, today, stand with me in saying, "Hell yeah, I'm for Obamacare":



P.S. -- Side note: Can you imagine if the opposition called Social Security "Roosevelt Security"? Or if Medicare was "LBJ-Care"? Seriously, have these guys ever heard of the long view?

If President Obama wants to call it Obamacare then I am dropping my opposition and joining those who are reclaiming it as a term for a positive accomplishment of his first term.

Open a new terminal window from the command line with helper script on OSX

With the help of a little script called term you can quickly tail the logs of remote machines each in its own window using

for n in 1 2 3 4 ; do \
term -t ds$n ssh ds$n tail -F ~qs/var1/logs/\?s-common.log ; \
done

The four machines I am watching are called ds1, ds2, ds3, and ds4 and the two log files I am interested in tailing -- qs-common.log and ds-common.log -- are in the same location. Term uses the -t option to set the window's title (if it is not given then the first argument is used). Don't forget to use ssh-copy-id to configure the remote hosts to use a private key so no password is needed.

The term script is

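#!/bin/bash

# The title is the -t argument or, failing that, the first word of the command.
if [ "$1" == "-t" ]
then
  TITLE="$2"
  shift 2
else
  TITLE="$1"
fi

osascript <<EOH
tell app "Terminal"
set currentTab to do script "printf \\"\\\\e]0;$TITLE\\\\a\\""
do script "$@" in currentTab
end tell
EOH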

A kanban for one

I know that a visual workplace works to improve productivity. What I did not think about was that it can work for the individual just as well as for the team. During the weekend's browsing I came across this posting about Kanban for One. And, better, the inspiring photograph of Nomad8's office with a kanban board at most desks.

Thus inspired, I created my own Kanban board and used it right from the start of the day. And it worked! It was much easier to manage the set of activities over the day. It was delightful seeing the steady accumulation of "done" tasks. It was also surprising how quickly tasks got stuck. (And it is the stuck tasks that tend to get forgotten.) I plan to keep using the system and see how it goes.

The board is a 1'×2' piece of foam core board from a discarded marketing sign, the markings are with white pencil, and the tasks are written on 2"×1.5" Post It notes. It really couldn't be simpler to construct.

Note that I am using the kanban board in conjunction with an existing bug tracking system. The backlog is managed by the bug tracker. The queue needs to be on the kanban board, however.

Less opportunity to see how much hard work is necessary

Comment on Ian Schreiber's blog post Dreamers:

There seems to be less and less opportunity to see how much hard work is necessary to achieve proficiency and (sometimes) success. The actions of a craftsman in wood, for example, can be seen by sitting at the side of a bench and being attentive. What works and what does not is equally visible. And if the craftsman has the habit of talking to the work, some of his interior worries and choices are available.

My children have no visibility into my work. I am a software programmer. I have been at this trade for 30 years but what can be seen of my efforts and the growth? There is nothing.

From my children's perspective there is no history; there is only now and the future. No wonder they and the kids you write of think they can be famous. It requires so little effort.

Sending Growls from a remote host

There are times when I want a notification about the status of a long running process I have running on a remote system. I could have it send an email. I could have it send an instant message. I decided that, since I have a Mac, I would have it send a Growl notification. Growl's GNTP makes this easy and the language bindings available make it easier still.

The one obstacle I had was getting the IP address of the connecting host (i.e., my Mac). Commands like finger and who often show this data but I did not want to scrape the command output as it is different between operating systems. Digging into the code for who at I discovered that it used the user accounting database utmpx API for getting this data. And so, with all the parts needed, I coded up growler

#!/usr/bin/perl -w

use strict;
use Getopt::Long;
use Growl::GNTP;
use User::Utmp qw(:constants :utmpx);

sub getRemoteHost() {
    my @utmp =
        sort { $b->{ut_time} <=> $a->{ut_time} } # reverse sort by time
        grep { $_->{ut_user} eq getlogin() } # filter by current user
        getutx(); # get user accounting data
    # the most recent login is first; drop any ":display" suffix from its host
    ( my $host = $utmp[0]->{ut_host} ) =~ s/:.*$//;
    return $host;
}

my $title = "";
my $host = getRemoteHost();
my $port = 23053;

GetOptions(
    'title|t=s' => \$title,
    'host|h=s' => \$host,
    'port|p=i' => \$port )
or do {
    print STDERR "usage: $0 ",
        " [ --title text ]",
        " [ --host name/ip ]",
        " [ --port number ]",
        " message\n";
    exit 1;
};

my $growl = Growl::GNTP->new(
    AppName => "growler",
    PeerHost => $host,
    PeerPort => $port
);

$growl->register([ { Name => "growler" } ]);

$growl->notify(
    Event => "growler",
    Title => $title,
    Message => join( ' ', @ARGV )
);

My original installation of Growl was broken. This caused all manner of errors during testing. To finally get growler to work I had to uninstall Growl and then install Growl 1.4.

Refactoring Java classes with explicit serialVersionUIDs

I had a small problem today where I needed to refactor a Java class and maintain its serialization identity. I guessed that if I maintained the class's instance variable order and explicitly defined the serialVersionUID class variable with the class's previous implicit value I would be safe. So, before refactoring, I used Java's serialver to get the class's serialVersionUID. I then refactored the class and added the serialVersionUID class variable. But it didn't work. What was wrong? I needed to look into the object's serialization.

An object's serialized form is defined in the Java Object Serialization Specification. I really didn't want to read the specification: all that I really wanted was to see the differences between the class's pre-refactor stream and the post-refactor stream. So I used jdeserialize to dump the protocol encoding of the streams.

What was quickly apparent in the dump was that the order of the instance variables within the stream has no relation to their order in the class definition. I was not expecting this. During the refactoring I had renamed some instance variables to better reflect their purpose. Once I reverted the instance variables' names to their previous values the serialization was the same. Success.
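
The ordering can be seen without dumping a stream. A minimal sketch, assuming a hypothetical Sample class (not the class I refactored): ObjectStreamClass reports a class's serializable fields in its canonical order, which is primitives first and then object fields, each group sorted by name. Declaration order plays no part, which is why a rename alone is enough to change the stream.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.Arrays;

final class FieldOrderDemo {

    // Hypothetical class; the fields are deliberately declared out of name order.
    static class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        String zebra;
        String apple;
        String mango;
        int count;
    }

    // Names of the serializable fields in the order the serialization runtime uses.
    static String[] serializedFieldNames(Class<?> type) {
        ObjectStreamField[] fields = ObjectStreamClass.lookup(type).getFields();
        String[] names = new String[fields.length];
        for (int i = 0; i < fields.length; i++) {
            names[i] = fields[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(serializedFieldNames(Sample.class)));
        // [count, apple, mango, zebra]
    }
}
```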

A distressing aspect of this exploration was that two objects with the same serialVersionUID will be considered the same by the serialization implementation and so it will misinterpret the bytes in the stream for the bytes needed by the class. The end result is that you get an object that is initialized wrong. I would have expected an exception that indicated the mismatch between data in the stream and data expected by the class. Perhaps there is a command line switch or system property I can use to enforce a stricter pairing. Be very careful when refactoring a class with an explicit serialVersionUID to either remove or update its value before committing the changes.

The trifecta of low-ceremony document management

My comment on How I name files on my Mac:
I too add dates to file names even though I know the creation date is part of the file's meta-data. When files are moved between systems the meta-data is not always preserved. This chance of loss is also what thwarts me from using a tagging tool as I can not be assured that the tags will be preserved between systems. And so, the trifecta of low-ceremony document management is
  • a good title,
  • a date in the title, and
  • a good full-text search.

I want more "geek lights"

Geek-lights are the indicator lights embedded on the surface of your electrical equipment. The green "On" light. The blinking light on your ethernet port or USB thumb-drive telling you data is being transferred. Most of them tell you redundant or otherwise obvious facts. I don't need an "on" light for the monitor I am using. It is useful information when in power saving mode, however. Apple seems to have learned this.

However, geek-lights are incredibly useful for other kinds of information. Information that is different for you and me. For example, one to indicate if traffic is too heavy to venture out now. Another to tell me that I have email. Another to tell others that I am in a conference call and not just talking to myself again.

I would like to have a box of these geek-lights that I can use each for a specific purpose. The geek-light would be nothing more than a shallow and wide cylinder with the light on the top, temporary adhesive on the bottom, and the electronics sandwiched in between. Let's call them eyelets as they let you see something ordinarily invisible. Each eyelet is individually addressable by a short range wireless signal. An eyelet receives a signal that indicates whether it is illuminated or not. Eyelets don't send signals back. I can imagine the hardware being something akin to Logitech's Unifying receiver -- that is, unobtrusive.

The question now is, should eyelets have a battery charge indicator light?

"Acting like a startup" is overrated

"Anyone who isn't acting like a startup in publishing [replace with your industry] has a serious problem." I hate this aphorism. It is not useful and mostly wrong. The only difference between a new product at a startup and one at an established company is where the seed money comes from. Everything else is the same.

A rose by any other name is incorrect: a rant on character sets and encodings

I have been working with XML since SGML. I have been working with character-sets since before cuneiform was contemporary. That is, a long time. I have seen and continue to see characters very poorly handled. Even the use of the term "characters" is wrong. Unicode calls them "code points" as a single character (graphic impression) can be made from one or more code points. Unicode supports both a single code point that represents an e with an acute, U+00E9, and a multiple code point representation, U+0065 with a U+0301. The single code point is 'é' while the multiple code points are 'é'. For most users and most web browsers there is no difference in how this character is presented.
Now é is represented differently in different character-sets and in different character-encodings. In UTF-8 it is represented as two bytes, 0xC3 and 0xA9. In XML it is represented as the numbered entity (numeric character reference) "&#xE9;" or "&#233;". In HTML both of the XML forms can be used, as can the named entity "&eacute;". With percent-encoding within a URL's path and query it is represented as "%C3%A9". Within a domain name that uses punycode it is "xn--9ca". While these differences are not inherently problematic, it has been my experience that their combined use very much is.
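To make these representations concrete, here is a minimal Python sketch that derives each of them from the code point itself. It uses only the standard library; the function name representations and the dictionary keys are my own choices for illustration, and the punycode form relies on Python's built-in idna codec.

```python
import unicodedata
from urllib.parse import quote

composed = "\u00e9"      # one code point: e with acute
decomposed = "e\u0301"   # two code points: e followed by a combining acute


def representations(ch: str) -> dict:
    """Return the encodings of a single code point discussed above."""
    return {
        "utf8_bytes": ch.encode("utf-8").hex(),           # UTF-8 bytes as hex
        "xml_hex": f"&#x{ord(ch):X};",                    # hexadecimal numbered entity
        "xml_decimal": f"&#{ord(ch)};",                   # decimal numbered entity
        "percent": quote(ch),                             # percent-encoding of the UTF-8 bytes
        "punycode": ch.encode("idna").decode("ascii"),    # domain-name label
    }


# The two spellings look the same on screen but are different strings
# until they are normalized to the same form.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

Comparing composed and decomposed directly yields False, which is exactly the kind of surprise that turns up when two systems disagree about normalization.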

For example, if I have a URL that contains an é and I want to use this in an HTML A tag's href how do I encode it? All of the following are useable but only the first is correct:
And being correct matters more and more because this stuff is being handled by code. Code is written by programmers and programmers don't universally understand character-encodings and character-sets. Moreover, you have programmers at both ends of the supply chain: The writer writes it wrong and the reader reads it wrong. How many times have you seen the following?
  • An HTML named entity being used in XML.
  • An entity double encoded, e.g. &eacute; is incorrectly encoded as &amp;eacute;.
  • A French text, for instance, with question marks oddly scattered throughout.
I don't have a solution to the misunderstandings and misuses. My only advice is that you have some simple rules and adhere to them:
  1. Be clear that you and your supplier know the difference between a character-set and a character-encoding.
  2. Don't ever accept bad data. Being generous and accepting (and "correcting") bad data never turns out well for either you or your supplier.
  3. Only accept percent character-encoding for URLs.
  4. Only accept numbered entities character-encoding for XML & HTML (except for &lt;, &gt;, &amp;, and &quot;).
  5. Only accept content for which there is a specified character-set. (For XML it is UTF-8 by default.) So, for HTML form elements make sure to use the accept-charset attribute. For HTTP requests make sure to require the charset parameter of the Content-Type header. If you can, only accept UTF-8, mostly because the supplier's tools will almost always do the right thing with UTF-8.
  6. Never store text in your byte-oriented repositories as anything other than UTF-8. (So, when you get a percent-encoded URL make sure to store it not as supplied but as its decoding.)
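Rules 2, 3, and 6 can be sketched in a few lines of Python. This is only an illustration, assuming the supplier hands you a single percent-encoded URL component; the name decode_percent_encoded is hypothetical, and the function refuses bad data rather than trying to "correct" it.

```python
import re
from urllib.parse import unquote_to_bytes

# A '%' that is not followed by two hexadecimal digits is malformed.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def decode_percent_encoded(value: str) -> str:
    """Decode a percent-encoded URL component or raise ValueError."""
    if not value.isascii():
        # Rule 3: anything outside ASCII should have arrived percent-encoded.
        raise ValueError("raw non-ASCII character; expected percent-encoding")
    if _BAD_ESCAPE.search(value):
        raise ValueError("malformed percent escape")
    # Rules 2 and 6: the bytes must be valid UTF-8. A strict decode raises
    # UnicodeDecodeError (a subclass of ValueError) instead of guessing.
    return unquote_to_bytes(value).decode("utf-8", errors="strict")
```

The decoded string is what gets stored; it is percent-encoded again only when a URL is written back out.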
I hope this helps.

Need to paint my soldiers.

I have returned to earth after some months of all-things-wargames-all-the-time. I have learned much about Roman, feudal, and 18C and 19C warfare. Perhaps not enough to author a Bluffer's Guide but enough to ask respectable initial and follow-up questions. But what I have not yet done is actually play a game! My Baccus Saxons and Vikings are still unpainted. I do not have a game table. I haven't tracked down local DBA wargamers. So over the coming weeks I intend to paint my toy soldiers: that is, after helping with kids' projects, house projects, garden projects, etc. Wish me luck.

Library catalog, data: URIs, & bookmarklets

My public library, and, I expect, yours too, uses Innovative Interfaces' catalog tools. This web application is the poster child for poor user interface and user experience. It is not my objective to enumerate the problems. Instead, I discovered a useful mechanism for getting around the problem I have with remembering my library card's barcode. (Why on earth the designers of this software expect me to remember my barcode is unfathomable.)

My original solution was to embed the barcode in a URL that mimicked the web application's form-based login. This URL was then bookmarked to allow for immediate access. This worked for a good number of years until the catalog software was updated. At which point the URL broke and I was unable to reproduce this solution using the updated software. (The obstacle seemed to be the need for a session identification that I could not contrive.) And so I needed a different approach.

The approach I took was to show the barcode whenever I used the catalog. Having a DIV fixed at the bottom of the browser's window is easy to do with CSS. The problem was that to do this I needed to dynamically alter the catalog's HTML. I really did not want to install GreaseMonkey to accomplish this. The next best solution was to have a standalone page with an embedded IFRAME. This gave me full control over what was on the page. The HTML is

<html>
    <head>
        <title>South Kingstown Library Catalog</title>
        <style>
            * {
                margin: 0;
                padding: 0;
                border: 0;
            }
            #tab {
                font-family: Verdana;
                position: fixed;
                bottom: 0px;
                right: 4ex;
                padding: 2ex;
                border-top-left-radius: 1ex;
                border-top-right-radius: 1ex;
                color: white;
                background: gray;
            }
        </style>
    </head>
    <body>
        <!-- src is a placeholder: the catalog's address goes here -->
        <iframe src="..." width="100%" height="100%"></iframe>
        <div id="tab">
        barcode: 123456789
        </div>
    </body>
</html>

I didn't like having to store this page as a file on my local machine and storing it on an HTTP server seemed overkill. Clearly, I was overthinking, but this is what I do for my day job. It then occurred to me that I could encode the page as a data URI and hoped that when I used this URI the browser would render the encoded page. I used David Wilkinson's data: URI creation tool to create the URI. And, to my wonderment, Safari, Firefox, and Chrome (all on OS X) did exactly as hoped for.
What this means is that my future bookmarklets can be more sophisticated than I ever considered practical before. Perhaps you can use this discovery too.
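If you would rather not depend on an online tool, a data: URI can be built with a few lines of code. The following Python sketch is one way to do it and not necessarily how the tool I used works; html_to_data_uri and its use_base64 flag are names made up for illustration.

```python
import base64
from urllib.parse import quote


def html_to_data_uri(html: str, use_base64: bool = True) -> str:
    """Encode an HTML page as a data: URI a browser can render."""
    data = html.encode("utf-8")
    if use_base64:
        # Base64 keeps the URI free of characters that need escaping.
        payload = base64.b64encode(data).decode("ascii")
        return "data:text/html;charset=utf-8;base64," + payload
    # Otherwise percent-encode the UTF-8 bytes.
    return "data:text/html;charset=utf-8," + quote(data)


page = '<div id="tab">barcode: 123456789</div>'
uri = html_to_data_uri(page)
```

The resulting string can be pasted into the browser's address bar or saved as a bookmark.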

Moving forward

One of my favorite lines from M*A*S*H was Frank Burns insisting that "we must have a plan even if it is wrong." While this was a line intended to ridicule Frank, it has been my experience that creating a plan, even if it turns out wrong, is a positive step forward.

A miniature wargamer begins

I am enamored with all things miniature wargame. Last year I dove into the world of boardgames (and card games) as my sons were now of an age where we could play games that were enjoyable to each of us. Boardgaming is not, however, a hobby. A hobby requires hands-on creativity. (Aka, dirty hands.) A hobby is what you do between events. While I am up to the challenge of developing boardgames and card games, it is too much of an intellectual challenge only and I have enough of that in my day-job. And further, I don't really want to do it.

As a young teenager I had a detailed model Great Western Railway in N gauge. I loved this hobby. It allowed for the creation of a whole world and one run with British timeliness. (Timeliness is something that has vanished from the modern UK rail system, I am told.) Modeling railways and railroads is still something I look forward to doing again in my dotage, but not now.

What sparked my interest in miniature wargaming was an unexpected encounter at the Hobby Bunker. This game store is near my employer's office and so I went there to look at their stock of boardgames. What I encountered was not, principally, a boardgame store but instead a miniatures wargaming store. And, as it turns out, the Hobby Bunker is one of the best such stores in the US. And so for two hours I wandered the aisles of shelves stocked with unpainted miniatures in plastic and metal of all historical periods; the selection of paints in acrylic and enamel; the shelves of books with the historical accounts of nations, armies, commanders, and battles; wargamer magazines; and wargame rules. (Hobby Bunker does also have a good collection of boardgames.)

That day I only bought an issue of Wargames Illustrated but the next time I bought some plastic Saxons and the paint for them. And, as with every good wargamer, they remain unpainted in the box still. But that is not because of waning interest; instead I have dived into the deep end of the pool and am currently surrounded by magazines, military history books, wargaming books (mostly the foundational ones), rule books, wargaming podcasts, and, soon to arrive, armies of Vikings and Saxons in 6mm.

As one of my sons noted, "you are serious about this dad!" I am.


I had a great time Saturday attending the Boston Battle Group's Havoc Game Convention (HAVOC). My goal in attending was to see different historical periods and different rule sets in play. In the morning I watched Day of Battle rules played out by Christopher Parker (the rule set's author), his brother, Jen, and Richard Brian. The game is driven by a unique combination of dice (d6) and a standard deck of cards. I need to see the rules played out in a tournament to make a judgement about them. I want to thank Chris and Dick for sharing their time and knowledge with me.

In the afternoon I watched a Punic wars battle played with the De Bellis Antiquitatis (DBA) rules (say what). This rule set seems most approachable. It had the feel of a boardgame but played with miniatures on an open table. Since returning from HAVOC I have spent the most time with this rule set. I want to thank Maureen, Spencer, Mike, and Harrison for a wonderful afternoon.

All in all, I met a wonderful group of people at HAVOC. All were most willing to offer advice and help. And Vic Gregoire invited me to join his "The Usual Gang of Idiots" for some wargames later in the year.

Why is Spring taking a perfectly good Set value and passing along a null?

Why is Spring taking a perfectly good Set value and passing along a null to the property's setter? That is, I had a simple bean declared like the following

<bean name="set" class="java.util.HashSet"/>

<bean name="x" class="...">
   <property name="y">
      <map>
         <entry key="z" value-ref="set"/>
      </map>
   </property>
</bean>

Where the property's setter is declared

void setY( Map<String,Object> y )

But every time setY() was called the key z's value was null. The problem's answer was very hard to track down. It comes from how Spring automatically converts objects to type-specific property values. Spring has two ways of converting a value for use in a property's setter method. The first is based on converting a textual representation to an object via the java.beans.PropertyEditorSupport API. The second is based on Spring's own org.springframework.core.convert.converter.Converter API.

I had used a Converter in an unrelated part of my application's Spring context. This was my application's first use of the Converter API. Turning the Converter API on, however, brought with it other conversions I was not expecting. The most detrimental was the Collection to Object converter. When Spring sees that the property expects an Object it tries to convert any value to an Object. (Even though every value is an object.) The Collection to Object converter converts an empty collection to null. Let me say that again .. with emphasis. The Collection to Object converter converts an empty collection to null. My set was empty. And so, a perfectly good empty set value was converted to a null.

The correction is to change the setter for property y to

void setY( Map<String,?> y )

And all works now. P.S. I would never have been able to debug this without Spring being open source. Thanks VMware, SpringSource, and all the contributors.

Need multiple-view calendar

The calendar displays that we have had forever just don't help me anymore. They show too much information and/or need too much interaction. I want my calendars to show me what I want without interaction. I want to distinguish what is common and what is unique. Common, for example, are events such as every workday at 9 AM I have a standup meeting; Thursdays I deploy new releases; every second Monday of the month during a school year I have a PTA meeting. Unique ones are the dentist appointment and the odd professional event at night. I don't really want a one-view calendar anymore. I need a multiple-view calendar like the one illustrated here.

The top chart contains my unique events over the last 2 and next 8 days. The second chart contains my common events over the last 2 and next 8 days. The bottom left chart is a list of common events for the next several weeks. And the bottom right chart is a list of unique events for the next several weeks.

Since we have separated events that actually occupy the same timeline we need to rejoin them somehow. My initial thought was to add a ghost to each chart that signifies the presence of an event in another chart, but I am not wedded to this.

Does anyone make a calendar like this, either visually or purposefully?

An idea for fixing the flow of money in pharma

Last year I attended a few lectures and read news reports about the pharmaceutical business in which drug researchers and clinicians spoke about how money was disproportionately being spent on life-style drugs while existing (often unique) drugs were being discontinued. This got me thinking about how we could continue to allow pharmaceutical companies to make gobs of money (I don't object to profiting from a great product) and also allow clinicians to depend on a consistent supply of drugs. We need a segmenting and deregulation like what was done with electric utilities in the mid-1990s.

The electric utilities in the mid-1990s were split into three segments: generation, transmission, and local distribution. While the customer does not have a choice of local distribution and transmission, they should have a choice of generation. Local distribution and transmission continued to be regulated. Generation was deregulated to enable competition. This change was fundamental in the motivation to build wind farms, solar farms, etc.

The pharmaceutical companies need to be split into three segments: research and development (R&D), manufacturing, and distribution. No one company can be in more than one segment. Manufacturing and distribution are typically low-risk, low-margin businesses. They are also critical to continuing care and so will be regulated. The R&D segment is typically a high-risk, high-margin business. This will be unregulated (well, more like less regulated). Within R&D there are two means of funding. The first is the self-funding pharmaceutical industry we have today. The second kind of R&D companies are the startups financed by venture capital.

Given these segments, how does the money flow to encourage innovation and discourage abandonment of established, low-margin markets? Firstly, by separating manufacturing and distribution from R&D, these businesses can continue to exist and be profitable as they do not need to directly bear the costs of R&D. People the world over continue to get sick with the same diseases!

How is R&D paid for? R&D works within a venture capital environment. Money comes from outside the pharmaceutical industry and money comes from inside the pharmaceutical industry. The money from within the industry comes from the sale of drugs to manufacturers and a royalty on retail sales. This money does not go directly to the R&D businesses. This money goes into a venture capital fund. This is one of the many funds that R&D businesses can pitch a therapy to. I do not know how the internal fund is managed. This might be a kind of utility with public oversight.

This segmentation and organization allows for the continuation of the high-risk and high-reward therapy R&D but takes out of R&D's control the manufacturing and distribution. A win for all interested parties -- and we are all interested parties in health.

Lawrence Lessig: How Money Corrupts Congress and a Plan to Stop It - The Long Now

I have mentioned before Lawrence Lessig's work to remove the inequitable distribution of power through money and the dependency corruption that is pervasive in US politics, esp. at the federal level. His book Republic, Lost is a challenging read. He recently gave a talk at the Long Now Foundation that I highly recommend. Please do listen to it. Well worth an hour of your time.

Lawrence Lessig: How Money Corrupts Congress and a Plan to Stop It - The Long Now

Rants (aka Tweets) on resumes

Rants (aka Tweets) on resumes:

A resume should not be a ouija board. Your mother might not need to understand it but I do. Make your work clear.

Your resume should tell me about the teams you were on; team size; team duration; role(s) you played; tools you used. Everything else is BS.

If your resume can't hook me on the first page the following pages are doing nothing more than annoying me.

I hope to god that I never see a 3 page resume again.

Apple had just one customer

“Apple had just one customer. He passed away last year.” - Seth Godin

The realpath command line utility

I seem to use my command line utility realpath all the time. All it does is take a relative path and return its absolute path. It is very useful. So useful that I can't believe all systems don't have one in /usr/bin/. If you are missing this functionality too then here is the source
#include <stdio.h>
#include <limits.h>
#include <stdlib.h>

/* Print the absolute, symlink-resolved form of each path argument. */
int main( int argc, char** argv ) {

    int i;
    char _realpath[ PATH_MAX ];

    for ( i = 1; i < argc; i++ ) {
        if ( realpath( argv[i], _realpath ) != NULL ) {
            printf( "%s\n", _realpath );
        }
        else {
            /* Stop at the first path that cannot be resolved. */
            exit( 1 );
        }
    }

    exit( 0 );
}
Save this to the file realpath.c and then build with make realpath.
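
If a C compiler is not at hand, much the same behavior can be had from a scripting language. Here is a rough Python equivalent, offered as a sketch only; resolve_all is a name I made up for illustration. One difference worth noting: as written, os.path.realpath does not fail on a path that does not exist, whereas the C version exits with status 1.

```python
import os
import sys


def resolve_all(paths):
    """Return the absolute, symlink-resolved form of each path."""
    return [os.path.realpath(p) for p in paths]


if __name__ == "__main__":
    # Print one resolved path per command line argument.
    for resolved in resolve_all(sys.argv[1:]):
        print(resolved)
```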

Cloning a Subversion repository

If you are following the explanation in Version Control with Subversion, "Repository Replication", page 177, of how to clone a Subversion repository and want a working example then here it is as a Makefile

# NOTE ROOT must be an absolute path

    svnadmin create $(CLONE)
    ln -s /usr/bin/true $(CLONE)/hooks/pre-revprop-change
    ln -s /usr/bin/true $(CLONE)/hooks/start-commit
    svnsync init file://$(CLONE) $(SOURCE)
    svn co file://$(CLONE) $(SANDBOX)
    svnsync sync file://$(CLONE)
    ( cd $(SANDBOX) ; svn update ; ls -l )

    rm -rf $(CLONE) $(SANDBOX)


Replace SOURCE with a value appropriate to you. (Here I am using a convenient, small, public repository.)