Logging and the utility of message patterns

I wrote a logging layer over log4j sometime ago. The layer was to encourage our developers to log at any point in the code where they thought some extra context would help resolve runtime errors. And so the layer had to have a familiar API, be effortless to code, and cheap to call. This posting is mostly about the cost to call the logger.

When a message is not going to be used you don't want to incur the cost of creating it. How often have we seen logging code that looks like this

logger.debug("short explanation: v1="+v1+"; v2="+v2+";");


Here the logged message is composed before the call to logger.debug() and it is composed even if the logging level is above DEBUG. If v1 or v2 have a complex toString() implementation, then you add this compositional time to the calling cost too.

The simple solution is to use java.text.MessageFormat with the debug() method, for example

logger.debug("short explanation: v1={0}; v2={1};", v1, v2 );


In this way, the time cost to calling debug() is the time taken to push three values on to the stack, call the method, and pop the three values off the stack. No composition time (or memory) costs are incurred at all. (I handle exception messages the same way, that is, all exception constructors accept a message pattern and message parameters and the message string is not composed until it is actually needed.)

I have since replaced Java's MessageFormat with a (mostly) compatible local implementation. This local implementation does simple composition tasks like formatting numbers without commas and Unicode escaping all non-ASCII, and non-visual characters. However, the primary reason to implement a replacement was to expand the available format types and format styles.

The first new format type added was json so that now when a parameter is so typed its value is transformed to a JSON representation before adding to the message. For example,

Map v1 = ...;
logger.debug("short explanation: v1={0,json}", v1 );


will convert v1 to a JSON string only if the log level is DEBUG

short explanation: v1={ "k1": [ "e1", "e2", ... ], "k2": [ ... ] }


(In hindsight it should have been a format style.) 

Today I added the string format type and two format styles. One format style specifies a substring of the value using a start index & length and the other a start & an end index. The start and end indexes can be given as numbers or as searches. For example, the style /foo/:10 specifies a 5 character substring starting where "foo" is first found, and the style /foo/../bar/ specifiers a substring between (and including) the two searches. The using search is particularly useful if you know that somewhere in a very long string is an identifier that would be useful to have in the log.