Traversing, aka picking for scraping part 2

Extending the idea in the Picking for scraping posting is the traverse. A traverse safely traverses a (typically) hierarchical data structure where any of the intermediary traversal steps may encounter a missing node. For example, in an hypothetical application most data instances have a value at the end of the a traversal from A to B to C to D, however, this data instance is missing the B node. Traversing from A to B is safe, but traversing from B to C will result in an null pointer exception (NPE).

The common ways to avoid NPEs while traversing is to check each node for null before traversing it. And so you see code that looks like a ladder, eg

value = data
if ( value != null ) value = value.A
if ( value != null ) value = value.B
if ( value != null ) value = value.C
if ( value != null ) value = value.D

While this works it is very cumbersome and prone to copy-paste errors due to small variations found on some ladder rungs.

Some languages offer syntax support for traversing. This is called a null coalescing operator. Java, the language I mostly work in, does not have this operator and so a different solution is needed.

The solution I use allows for a safe traversing using data format specific classes. That is, XML has its own traverser, Json has its own traverser, etc. A fully general mechanism might be possible, but the technique is simple and so really does not warrant one. Here is the example of safely traversing some Json data

value = new Traverse(data).at('A').at('B').at('C').at('D').value()

Or, let's have B be an array and we want to traverse the first element

value = new Traverse(data).at('A').at('B').at(0).at('C').at('D').value()

The dictionary at() method is

public Traverse at( String key ) {
    if ( value != null ) {
        if ( value instanceof Map ) {
            value = ((Map)value).get(key);
            // data might be null indicating an unsuccessful, 
            // but safe from NPE, traversal
        }
        else {
            // an unsuccessful traversal always sets data to null
            value = null;
        }
    }
    return this;
}

While the array at() method is

public Traverse at( int index ) {
    if ( value != null ) {
        if ( value instanceof List ) {
            value = ((List)value).get(index);
            // data might be null indicating an unsuccessful, 
            // but safe from NPE, traversal
        }
        else {
            // an unsuccessful traversal always sets data to null
            value = null;
        }
    }
    return this;
}

As with the Builder pattern, the Traverse pattern always returns the traverse instance.

A common aspect is the check for type and this can be refactored out.

public Traverse at( int index ) {
    List l = cast(List.class);
    if ( l != null ) {
        data = l.get(index);
    }
    return this;
}

protected <T> T cast( Class expectedClass ) {
    if ( value != null &amp;&amp; ! expectedClass.isAssignableFrom(value.getClass()) {
        value = null;
    }
    return (T) value;
}

Expanded, working versions of XML traversal and Json traversal are at [TODO].

Oh, how do you get the traversed to value?

public <T> T value() {
    return (T) value;
}

END

No comments: