Workflow automation needs to take a more central role in our distributed systems designs

I am always at a loss to understand why workflow automation is not part of the common operations infrastructure of even the smallest software development shop. Few continuous, autonomous processes can operate without needing to keep humans informed or occasionally ask them for input. These needs are usually met by the process sending an email to a functional address, eg "clean-the-data@firm.com," containing a short description of the issue and a link to an HTTP-enabled form for providing the input. The email is forwarded to the right person, he or she completes the form, and the autonomous process moves forward, sometimes changing its path. The sent email and the form's use are rarely centrally logged and so the autonomous process cannot be fully audited. Repeat this ad hoc solution for a dozen more conditions and use it a few times per month and you have created post hoc chaos.

We do see some workflow automation tools regularly used in devops groups, but they are specialized. Jenkins, for example, is used to build and distribute applications. Rundeck manages "cron" tasks that need to be executed on all the hosts or services. Even a tool like Spinnaker is, at heart, workflow automation. I suspect that all these could be implemented as workflows on top of, for example, Camunda.

Historically, workflow automation has been entwined with ERP and BPM. I don't recall anyone ever saying their company's ERP migration was a pleasure to participate in. (Which reminds me of Douglas Adams's statement that "It can hardly be a coincidence that no language on Earth has ever produced the expression 'as pretty as an airport'.") To discard workflow automation due to the historical horror stories is truly a lost opportunity for the future.

Workflow automation needs to take a more central role in our distributed systems designs. Camunda seems like a good place to begin -- search for talks by Bernd Rücker of Camunda.


Who wants to perpetuate a flawed design when a proper one is just around the corner?

The SimpleDB (SDB) persistence design in Adding persistence to the Incident Response Slack application is poor. The plan was to persist a block of data containing all of a Slack channel's tasks. This SDB item would be named (SDB's primary key) with some combination of Slack channel attributes, such as channel id and enterprise id. I expected to add a version attribute to the SDB item so that I could use SDB's conditional-put to prevent overwriting someone else's changes to the same task list. This all sounds acceptable, but upon further consideration it is not.

Its flaw is that the design focuses on sets of tasks while the UI -- the slash command -- focuses primarily on individual tasks. This flaw led to a design that adds unnecessary contention to task updating. That is, if tasks A and B were being updated at the same time by users X and Y, respectively, then it is likely that one of the two updates would fail and have to be retried. Add more concurrency and tasks to the mix and the service will feel unresponsive (pushing users away) and burden the server (increasing operating costs).
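
The contention can be illustrated with a small in-memory sketch. It does not call SimpleDB: VersionedStore and its conditionalPut are hypothetical stand-ins that only mimic a "write succeeds only if the version is still the one I read" rule. The per-task keying in the second method is my assumption about what a less contended design might look like, not something the post specifies.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a store with a version-checked put.
// This is NOT the SimpleDB API; it only mimics the conditional-put behavior.
class VersionedStore {
    private final Map<String, Integer> versions = new HashMap<>();
    private final Map<String, String> values = new HashMap<>();

    synchronized int version(String key) {
        return versions.getOrDefault(key, 0);
    }

    // The write succeeds only when the caller's expected version is current.
    synchronized boolean conditionalPut(String key, int expectedVersion, String value) {
        if (version(key) != expectedVersion) {
            return false; // someone else wrote first; the caller must re-read and retry
        }
        values.put(key, value);
        versions.put(key, expectedVersion + 1);
        return true;
    }
}

class ContentionDemo {
    // One item per channel: X and Y edit different tasks but share one version.
    static boolean[] perChannel() {
        VersionedStore store = new VersionedStore();
        int seenByX = store.version("channel-1");
        int seenByY = store.version("channel-1");
        boolean x = store.conditionalPut("channel-1", seenByX, "A=done;B=open");
        boolean y = store.conditionalPut("channel-1", seenByY, "A=open;B=done");
        return new boolean[] { x, y };
    }

    // One item per task (assumed alternative): the same two edits do not collide.
    static boolean[] perTask() {
        VersionedStore store = new VersionedStore();
        int seenByX = store.version("channel-1/task-A");
        int seenByY = store.version("channel-1/task-B");
        boolean x = store.conditionalPut("channel-1/task-A", seenByX, "done");
        boolean y = store.conditionalPut("channel-1/task-B", seenByY, "done");
        return new boolean[] { x, y };
    }

    public static void main(String[] args) {
        System.out.println("per channel: " + Arrays.toString(perChannel()));
        System.out.println("per task:    " + Arrays.toString(perTask()));
    }
}
```

With one item per channel, Y's write is rejected even though Y never touched task A. With one item per task, both writes go through.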

The other problem comes from SDB's eventual consistency and how conditional-put will affect throughput. If X and Y are changing tasks then how long will each have to wait on the other before persisting their change? Intuitively, a conditional-put on the version attribute would require that the value be consistent everywhere before a next change. This effectively pipelines all changes to a channel's tasks without any of the advantages a purposefully designed pipeline has.

For this hobby project these flaws are irrelevant. Nevertheless, who wants to perpetuate a flawed design when a proper one is just around the corner?

Internal displacement

When I see these charts I want to know how I can use my skills to help people and not machines.


Source http://www.internal-displacement.org/

What does my pile of tech at Crossref look like?

What does my pile of tech at Crossref look like?

Service infrastructure is 29 deployments -- mostly 4 CPUs and 8G RAM -- handling 100M external (unique) queries and 1B internal requests per month.

Server infrastructure is Tomcat, ActiveMQ, MySQL, and Oracle.

Service development is primarily in Java w/ Spring. Infrastructure operations aided with Bash and Perl scripts.

Primary datastores are RDBMS using MySQL and Oracle.

Secondary datastores are NoSQL using Oracle Berkeley DB, Solr, and bespoke solutions.

Full text search uses Solr, and bespoke Lucene solutions.

Data originates primarily in XML, JSON, tabular, and semi-structured text.

Lots of Linux operations experience. Some AWS operations experience.

No server, disk, or network hardware configuration and operations experience.

Source code managed in Subversion, developed in NetBeans, bespoke CI, and bespoke automated deployments.

Oh, and my trusty MacBook Plus.

Equivalence and Equality

There is an interesting debate going on over at The Eternal Issue About Object.hashCode() in Java. The root of the problem in the debate is that the participants are not distinguishing between equivalence and equality (aka identity). In computing we make a distinction between the two. Any two instances are equivalent if their values are the same. Equality is a stronger equivalence. Ie, it is not enough that the values are the same; the values must be the same instance. Eg, A = "joe", B = "joe", and C = A then A, B, and C are equivalent while only A and C are equal.

Java has made a mess of this distinction because its Object class 1) implements an equals() method and 2) the implementation is equality. So, by default, no two instances are ever equivalent. Most people would consider "joe" and "joe" to be equal, but not Java. (Java should have defined equals() and hashCode() as abstract or, better yet, defined them in an Equatable interface.)
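
A small sketch of that default behavior follows. PlainName and ValueName are hypothetical classes made up for illustration: the first inherits Object.equals() and so compares by identity, while the second overrides equals() and hashCode() to compare by value.

```java
class EquivalenceDemo {
    // Inherits Object.equals(): instances compare by identity ("equality" above).
    static class PlainName {
        final String value;
        PlainName(String value) { this.value = value; }
    }

    // Overrides equals()/hashCode(): instances compare by value ("equivalence" above).
    static class ValueName {
        final String value;
        ValueName(String value) { this.value = value; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueName && ((ValueName) o).value.equals(value);
        }

        @Override
        public int hashCode() { return value.hashCode(); }
    }

    public static void main(String[] args) {
        PlainName a = new PlainName("joe");
        PlainName b = new PlainName("joe");
        PlainName c = a;
        System.out.println(a.equals(b)); // false: same value, different instances
        System.out.println(a.equals(c)); // true: the very same instance
        System.out.println(new ValueName("joe").equals(new ValueName("joe"))); // true
    }
}
```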

My reading of the Eternal Issue article is that there is a contract that base-class instances would use a value, provided during construction, to distinguish themselves. This is equivalence. Eg
class Baseclass {
   String name;
   Baseclass(String name) {
      this.name = name;
   }
   // Equivalence: two instances are the same when their names match.
   boolean equals(Baseclass that) {
      return this.name.equals(that.name);
   }
}
The sub-class would depend on this, in part, for its determination of equivalence, eg
class Subclass extends Baseclass {
   int age;
   Subclass(String name, int age) {
      super(name);
      this.age = age;
   }
   // Builds on the base-class check, then compares this instance's own field.
   boolean equals(Subclass that) {
      return super.equals(that) && this.age == that.age;
   }
}
Now, what happens when you remove Baseclass's equals() method? Doing that changes Baseclass's distinguishing behavior from equivalence to equality as instances now default to using Object.equals(). This is an extraordinary change to the contract between the classes. Subclass equivalence will immediately start to fail because no two Subclass instances will ever be at the same memory location.

Never mix equivalence and equality and an inheritance hierarchy.

A consequence of the rule is that since you decided not to use name to distinguish Baseclass instances then all instances are equivalent. Ie, they are indistinguishable. The only correct change that adheres to the contract would be not to remove equals() but to replace it with
boolean equals(Baseclass that) { return true; }
Of course, any change to Baseclass is likely a bad idea and I would never recommend that you respond to this change with anything less than tar and feathers. But, at least, make sure the changer of the Baseclass understands what they did vis a vis the contract.

Adding persistence to the Incident Response Slack application

Adding persistence to the Incident Response Slack application is the next feature to implement. For this application, change happens at a human pace. That is, even the busiest incident response is unlikely to have more than a few dozen changes per hour. That is, changes per channel per hour. The application might need to coordinate many thousands of channels of changes per hour. Given this situation, persistence at the channel level can be coarse while persistence at the application level needs to be fine.

For coarse persistence with infrequent access, storing the whole model as a chunk of data is usually sufficient. Within a channel our model is a collection of tasks each with a description, assignments, and a status. There might be one or two dozen tasks at any time. With an expectation of, on average, short descriptions, one user assignment, and one status we expect 100 to 200 bytes per task and so some 1200 to 4800 bytes in total, ie, 12 tasks * 100 bytes to 24 tasks * 200 bytes. This amount of data is too small for reading and writing it to be a performance concern; that is, the storage mechanism's overhead will dominate each operation.

For fine persistence with frequent access, persisting must be done at the item level. The storage mechanism must allow random access to individually addressable data. We don't need the storage mechanism to provide structure within the item. A key-value store will do.

A simple system design would have one application instance running on a host that has RAID or SAN storage. If the application crashes the host will automatically restart it and so only incur a second of downtime. And the likelihood of losing the RAID or SAN is too low to worry about. If your level of service allows for this system design then a useful key-value store is the humble file-system. Unfortunately, this design is also the most expensive choice from cloud providers.
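
As a rough sketch of the file-system as a key-value store, the class below keeps one file per key under a root directory. FileKeyValueStore, its key restrictions, and the temporary file naming are all assumptions made for illustration; the move into place is also not guaranteed to be atomic on every file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical file-per-key store; the name and layout are illustrative only.
class FileKeyValueStore {
    private final Path root;

    FileKeyValueStore(Path root) {
        try {
            this.root = Files.createDirectories(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Keys are limited to a safe character set, and may not start with a dot,
    // so that a key can be used directly as a file name.
    private Path pathFor(String key) {
        if (!key.matches("[A-Za-z0-9_-][A-Za-z0-9._-]*")) {
            throw new IllegalArgumentException("unsupported key: " + key);
        }
        return root.resolve(key);
    }

    // Write to a temporary file first, then move it over the real file, so a
    // crash in the middle of a write does not leave a half-written value.
    void put(String key, String value) {
        Path target = pathFor(key);
        try {
            Path tmp = Files.createTempFile(root, "put-", ".tmp");
            Files.write(tmp, value.getBytes(StandardCharsets.UTF_8));
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Optional<String> get(String key) {
        Path source = pathFor(key);
        try {
            if (!Files.exists(source)) {
                return Optional.empty();
            }
            return Optional.of(new String(Files.readAllBytes(source), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would construct it with a directory, for example new FileKeyValueStore(Paths.get("/var/lib/ir")), where that path is only a placeholder.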

Cloud providers will want you to allow them to manage your compute and storage separately. This enables them to provide your application with the highest level of service to your customers. A consequence of this is that your application needs to be designed to run with multiple, interchangeable instances, remote storage, and network partitioning. Unlike the one host & one disk platform, the cloud platforms are not going to help your application that much. The problems associated with distributed application design — CAP, CQRS, consensus, etc — are still largely the application's to solve.

Incident Response is, fortunately, too simple a tool to warrant sophisticated tooling [1]. If two users update the same task at the same time then one of them will win. We will attempt to tell the user of the collision, but the limits of eventual consistency may preclude that. Every cloud platform has a managed key-value store (with eventual consistency) and managed web applications. Since I know AWS, I plan on using SimpleDB and Elastic Beanstalk for the next implementation.

[1] I really want to explore Apache Geode!

Grumble about the JDK standard libraries

I recently wrote a Slack application and restricted myself to using only the JDK. For some reason, I wanted to remind myself of how awkward Java programming is for newbies.

No one chooses a Java implementation without the expectation of needing to include a shedload of external libraries. Luckily, Java has the most best-of-class libraries available, libraries that are easily incorporated with Maven and that have facilitated the great variety of successful applications being built and maintained every day. The JDK's standard libraries are, however, doddering and incomplete, especially as to building applications for the internet. Scripting languages like Python, PHP, and Ruby do have standard libraries that have evolved to incorporate the internet. And these languages are being successfully used by newbies [1]. Nevertheless, I had my question to answer.

My application uses Slack's outgoing webhooks and so is a specialized HTTP server. The JDK does have an HTTP server, but it is in the com.sun namespace and does not implement the Servlet API. Implementing the Servlet API would give the newbie the experience of using a standard container and, moreover, allow his or her application to be deployed to any number of cloud providers. Implementing an HTTP server is not a quick task and so I chose to use the JDK's.

When Slack sends a webhook request the body content is URL form encoded, eg "a=1&b=2&b=3". (You more often see this used in URL queries.) Apart from an early spat over delimiting with ampersands vs semicolons this encoding[2] has been universally, consistently used for decades. Unfortunately, the JDK does not have a standard means to encode or decode this data. I implemented a decoder.
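
What such a decoder might look like is sketched below. This is not the decoder from the application; FormDecoder is a hypothetical name. It leans on java.net.URLDecoder for the percent and plus decoding of each name and value and only adds the splitting into pairs. The Charset overload of URLDecoder.decode used here assumes Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical decoder for an application/x-www-form-urlencoded body.
class FormDecoder {
    // Decodes "a=1&b=2&b=3" into {a=[1], b=[2, 3]}, keeping repeated names.
    static Map<String, List<String>> decode(String body) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (body == null || body.isEmpty()) {
            return params;
        }
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            // A name without "=" is treated as having an empty value.
            String rawName = eq < 0 ? pair : pair.substring(0, eq);
            String rawValue = eq < 0 ? "" : pair.substring(eq + 1);
            String name = URLDecoder.decode(rawName, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(decode("a=1&b=2&b=3")); // {a=[1], b=[2, 3]}
    }
}
```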

The response to the webhook is a JSON encoded message. JSON's adoption rate has been breakneck, but standard libraries have caught up. The Java Community Process (JCP) did define a JSON API several years ago, but an implementation is not in the JDK. While outputting JSON is straightforward, ensuring it is syntactically correct calls for having support. I implemented an encoder.
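
The part that most needs support is string escaping. The sketch below is not the encoder from the application; JsonStrings is a hypothetical name. It escapes the quote, the backslash, and control characters, which is what the JSON grammar requires of a string, and the one-field object with a "text" member is only an illustrative message shape.

```java
// Hypothetical minimal JSON string encoder; it covers strings only,
// not numbers, arrays, or nested objects.
class JsonStrings {
    // Returns the text as a quoted JSON string literal.
    static String quote(String text) {
        StringBuilder out = new StringBuilder(text.length() + 2);
        out.append('"');
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        out.append('"');
        return out.toString();
    }

    // Builds a one-field message object, eg {"text":"..."}.
    static String message(String text) {
        return "{\"text\":" + quote(text) + "}";
    }

    public static void main(String[] args) {
        System.out.println(message("Task 3 is \"done\""));
    }
}
```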

Lastly, and most sadly, is the situation with fixed width hex values. Slack wants colors expressed in the RGB hex notation, ie #RRGGBB. JDK has a Color class, but it can't be used to create an RGB hex value. (Its Color.toString() method is practically useless.) To format the value you can use

String.format("#%02x%02x%02x", c.getRed(), c.getGreen(), c.getBlue())

but, honestly, what newbie is going to know that or have any hope of Googling the answer?

Grumble over. Back to real world Java development.


[1] That academics use Java in introductory programming classes is appalling.

[2] It is not really "encoding" and "decoding," but marshalling and unmarshalling.

I am leaving Crossref

I handed in my notice at work. I will be there until the new year helping them transition. After 9 years at Crossref, there is a fair amount to hand over. The hardest part in any of these long transitions is keeping up the momentum in the face of increasing detachment. Crossref has given me much over the years and so extra perseverance is warranted.

If you would like to talk about coming opportunities then contact me at andrew@andrewgilmartin.com and we will schedule a time to do so.

Incident Response Slack App

As we learn how to better use Slack we are experimenting with different ways of managing our response to incidents, aka emergencies. One experiment is that when an incident is discovered the existing #incident-response channel is used to send an alert to those on-call. We then immediately create a new channel for only the new incident's staffing and communications. While we rarely have overlapping incidents, having a dedicated channel does prevent the interleaving of messages about other incidents, limits tangents, and ensures that only those working the incident are disturbed by @channel or @here messages. When the incident is resolved the channel's messages can be copied into the beginnings of the post-mortem document, and the channel archived.

During the response, tasks emerge that need to be assigned and tracked. Slack itself is not good at this alone. There are many applications for task management that can be made accessible via Slack slash-commands. For incident response tasks, however, these general purpose applications were too focused on the user and not enough on the channel. When listing tasks we only want to see those for this incident. Their extra features also had a cognitive weight that I bristled at. Overall, their fit for purpose was poor.

What was needed was a task manager with a scope limited to one channel. The task manager would be installed in the workspace, and so accessible to everyone, everywhere without configuration, but when in use would be channel focused. The task manager needed to support these simple use cases
  • Adding tasks with a description, assignments, and a status.
  • Updating a task’s description, assignments, or status. Only notify the channel when the changes are pertinent to all.
  • Listing the tasks with optional criteria.

These use cases turned into the /ir slash-command
/ir description [ user … ] [ status ]
/ir task-id [ description ] [ user … ] [ status ]
/ir [ all | finished ] [ user … ] [ status … ]

If you are interested in the implementation then see github.com/andrewgilmartin/com.andrewgilmartin.incidentresponse

Slack messages are rarely ever completed thoughts

Comments on TechCrunch's article "Distributed teams are rewriting the rules of office(less) politics."

I don't think that article quite hit its mark, but it did contain two important bits. Loneliness is an issue for me. I like being among my colleagues. Even if nothing is said all day the presence of others roots us to a common purpose and common comfort. The other is the need to write more and with specificity.

A requirement of asynchronous communications is leaving information for others to respond to later. Ad infinitum. The longer the timespan between communications the more context each leaving must include. It is a skill to provide brief context in these leavings along with clear answers and, perhaps, further questions. Many characterize asynchronous communications as a conversation. However, conversations are informal. Our communications are more like a dialogue where we explore problems and solutions. Problems and solutions are formal, or rather somewhere between not informal and algorithmic. And they must be recorded and the records curated. Asynchronous communications require a diligence that is not provided by simple conversation.

One of my favorite discoveries (unfortunately, shortly after college!) was Thinking on Paper by Howard and Barton. Writing is design thinking. (Coding is design thinking.) Design thinking is iterative and unsteady in its forward motion. Two steps forward, one step backward. Unfortunately, people still think of writing as something that happens after thoughtfulness. This misunderstanding can lead to written mistakes being harder to recover from than said ones. If you take away one bit of advice from here, it would be to assume that Slack messages are rarely ever completed thoughts.

Remote teams and hiring

While I was driving to work this morning I was thinking about an earlier conversation that touched on an earlier blog posting about the difficulty of having too large a skills gap in your team or in your development organization. We agree about the skills gap. Then he said something that made me pause and totally reconsider my stance on remote development organizations. My stance had been that they don't work as too often during a project you need the high bandwidth available in face to face collaboration. But this stance comes from assuming there is a large gap between staff. What if the gap was small? What if the gap was of zero width?

It has been my experience that you can successfully communicate and work remotely together when the gap is small. The eureka moment came when I reconsidered the reputed benefit of remote work that allows you to hire the best from anywhere in the world. It is not a benefit, but a rule. It is not that you can, but that you must hire the best.

Chris Sinjakli's talk "Doing Things the Hard Way"

I am sure you watched James Mickens's talk Q: Why Do Keynote Speakers Keep Suggesting That Improving Security Is Possible? today. It was honest and funny -- even my ceramic artist wife laughed at the IoT humor -- and contained a great primer on machine learning -- my wife skipped that part. As it ended YouTube's autoplay started Chris Sinjakli's talk Doing Things the Hard Way from this year's SREcon18 Asia/Australia. From where I sit today his experience, analysis, and advice are spot on. So, if you don't have 51 mins to listen to Mickens I highly recommend you spend 22 mins listening to Sinjakli. On second thought, listen to both.

X. Kishore Mahbubani and "Has the West Lost It? Can Asia Save It?"

When your country is falling from first place into second place your best course of action is to establish a world order that gives second place countries advantage. Bluntly, multilateralism is far better than isolationism. As China and India continue to rise the USA is destined to be 3rd place, at best; perhaps even 4th place behind Indonesia. I highly recommend this lecture by X. Kishore Mahbubani "Has the West Lost It? Can Asia Save It?"

Grumble about Google Calendar display of vacations

Does anyone else hate the new Google Calendar display of vacations? I only want one weekday column highlighted and that is today. Not P's vacation day on Friday. Not M's midweek vacation next week. G's all-day event block at the top of Friday's column is fine; even better is T's multi-day vacation line at the top of the day columns. Really, Google Calendar UI design team, what were you thinking!?



Ousterhout's "A Philosophy of Software Design"

Greatly enjoyed John Ousterhout's Google talk "A Philosophy of Software Design." I know him for the Tcl language and the Tk user interface toolkit. I was a fan of Tcl back in the day when you didn't add a REST API to your application; instead, you added a scripting language. For that purpose Tcl was a perfect match: easy integration of the interpreter into the application and easy extension of the interpreter with application functionality. Tcl did not make the transition to the Web and so has mostly faded into software development history. If you think DSLs are awesome you will wish your language had uplevel and upvar. If you think Docker container images are awesome you will be interested in Tcl's Starkit.

In this talk Ousterhout is chronicling his attempt to teach software design at Stanford University's CS 190: Software Design Studio class. I want to find out if the local universities are trying this and offer to be a teaching assistant.

Technical Junk

Much of what is called "technical debt" is more likely "technical junk." Technical debt has the ring of responsibility around it. As we build new services and maintain the others we tell ourselves that we have made reasoned judgments as to what to ignore for now:

• We don't need to update that library just yet, but we know that falling too far behind the current version will make updating grueling and so will do it later.

• We don't rewrite a troubled module just yet, but we know the rewrite will relieve us from burdensome support and so will do it later.

• We don't replace the data design implementation just yet, but we know its scope has been exceeded and is now impeding enhancements and so will do it later.

• We don't broaden the testing regime just yet, but we know that doing so critically supports systems changes and so will do it later.

• We don't speed the ever slower, periodic, automated task just yet, but we know that task overlap has dire consequences for downstream processes and so will do it later.

• We don't hire additional staff just yet, but we know that additional staff is critical to efficiency and so will do it later.

Now, this sounds like technical debt and it is when you actually attend to it later. When you don't you have technical junk. The system and its data are patched, brittle, duplicated, lossy, slowing, and only through the sheer force of willpower can it be enhanced and maintained. Still, we continue to tell ourselves and our management a compelling story of progress on the two fronts of enhancement and maintenance.

Why have group meetings in software development?

Why have group meetings in software development? I think there are only three good reasons.

The first good reason is when several people need to come to a consensus. The outcomes of these meetings are decisions. Ideally, everyone comes to the meeting prepared for the discussion. I like for a proposal to be written and distributed before the meeting. This means that at least one person has thought through the decision's context and ramifications, and that the meeting's participants have time to read and ponder it beforehand. Jeff Bezos has an interesting workaround for unprepared participants and that is to have meetings start with several minutes dedicated to reading the prepared materials. That Inc. article has several other good tips. I can't help myself and so must include a reference to Tufte's The Cognitive Style of PowerPoint. Thankfully, slide decks have been mostly absent from my last several years of professional work.

The second good meeting is the daily standup. It is too easy for a developer to not ask for help soon enough, and standups quickly stop that situation from worsening or the developer going dark. My manager at Cadre used a system of green, yellow, and red work status. If the work was going well then it was green. When there were complications the work was yellow. When it was red it was blocked. It is useful to use this system even in a standup. The first part of the standup has all participants give a brief status. The second part deals with yellow statuses with a brief description of the problem and assignment of who is best able to help. Red statuses are dealt with outside of the standup. In all cases, however, don't solve problems in the standup. Giving solutions might not derail the standup the first few times, but it has been my experience that without maintaining the standup's principles it will soon devolve into a group chat and finally abandonment.

The third good meeting is for celebration. I am most interested in celebrating the group's achievement as a whole. Generally, everyone contributes as they are asked and as they are able, and so I see no need to single out individual contributions. The one exception is for someone who has shown marked growth. Pizza in the conference room can work, but you will have more success if you take everyone out to lunch.

Update: The TechCrunch article "Distributed teams are rewriting the rules of office(less) politics" had the link to Amazon's "narratively structured six-page memos."

Some more comments about a healthy software development organization

That last posting was a little too high level, especially for someone like me who likes things to become grounded -- at least for one day! Most of my career has been in small companies building small products. Apart from places like Lotus and Geac my employers have had fewer than 20 developers. This organization size has shaped my expectations of what is useful for a project to succeed.

The first document any project needs is the "Product or Feature Digest," a template of which is here. This document organizes the other documents. It is the one place everyone can read and get a grounding on the product and its implementation. If what is being built is very small then it might be the only document. Most sections of the document have an obvious purpose. However, the first two require further explanation.

I have never worked on a product or feature that did not change its goals. I doubt you have either. Most changes are refinements due to a better customer understanding, or due to initially unforeseen constraints, or revisions to feature priorities, or features removed or added, etc. The reason we keep product management's original document is that it was the locus of everything that happened afterwards. The differences between it and the current revision give the reader an understanding of the maturation of what is being shipped. For senior management, whose attention to the project is periodic, it helps bridge their previous understanding to a current understanding.

I like my projects to have this kind of documented grounding, but this does not make me a waterfall methodology advocate. I agree completely with Kent Beck's statements in Extreme Programming that software change is cheap. I like the agile methodologies that stem from this seed. I draw the line, however, at the idea that product management and software development are nothing more than reorganizing and implementing the backlog. Developers need to know that what they are doing is coherent and concise. Otherwise the work becomes little more than hacking at the coal face -- an endless drudgery.

Within a project I don't care much what tool you use to enumerate the work items and their dependencies. What I do care about is that the discussions related to these work items are located in that tool. If the tool's commenting interface is cumbersome then don't select it. If you do, your staff will not use it. Let me repeat that. Your development staff, as a whole, does not like to write stuff down, and if you make it inconvenient to do so they will attempt to make progress without using it [1]. When this happens the only record you will have of the obstacles found and decisions made will be in heads. When the project ends and the staff disperse you will have little from which to draw on to hold a successful postmortem. Worse, however, is that your development organization is doomed to remain, at best, at Level 2.

Lastly, for now, where do instant messaging tools sit in a software development organization? For me, at the bottom of the communication modes. What message is so urgent that it can't wait until tomorrow morning or some other synchronization time [2]? For smaller organizations where one developer might have multiple roles the interruptions will be harder to control, but they can be controlled. The head of development needs to take control of them.

The other problem with instant messaging is that it becomes the primary mode for discussions and decisions. Instant messaging is technically an asynchronous form of communication, but it is rarely used with that expectation [3]. Instant messaging has become akin to oral communication with all of its concomitant weaknesses. It is easier for a senior staffer to initiate, for example, a Slack conversation to come to a decision than it is to open an issue and discuss it there or simply wait until the next meeting. Then there is the developer who interrupts everyone to ask a question that could have been answered elsewhere with a little effort on his or her part.

At this point I am sure you are thinking I am a madman. I have each developer sitting alone at his or her desk coding and communicating only online and only asynchronously. It would be a lifeless place without actual face to face communications. Hallway conversations and meetings are vitally important to a development organization. Important enough to write about separately in another posting.

[1] Jira is a good example of a bad interface. Atlassian has put so much effort into enabling customizability that it has made the hourly effort of interacting with issues & comments on a par with the one-time effort of creating a set of project "status" tokens.

[2] For example, place all announcements on the kitchen refrigerator or on the bathroom mirrors.

[3] Why We’re Betting Against Real-Time Team Messaging criticism of Slack is spot on.


Some comments about a healthy software development organization

I have been thinking a lot about healthy software development organizations recently [1]. I have never headed a development organization. I have always been an individual contributor working as an architect, a principal engineer, a team leader, or one of the many other roles on my way to these. So I work within an existing organization and have some command to shape it. There are limits to this command, however. The shape is principally formed and administered by the department's head.

I have worked for several heads of development. All were good people. All managed differently. Some were curt, some loyal, some inclusive, and some neglectful. I have learned much from working with them. Here is some of what I have learned.

A healthy development organization is one that has
  1. Staff with a range and overlap of skills & experiences.
  2. Work that is finished when all steps are complete.
  3. Communication that is balanced between structured & unstructured.
  4. Achievements that are celebrated.
In a larger organization these points are easier to achieve, but even a small organization must position itself to attain them.

You need to have a staff with a range of skills & experience and without large gaps in either. For example, having three junior developers and an architect is not healthy. Beyond the one bus problem is the problem that the larger the gap, the more likely the organization will create an implementation that is inconsistent. Consistency enables an organization to potentially use many more of its staff to resolve bugs and add improvements. You don't want to have anyone say "Andrew wrote that. I don't touch it without his input."

The work of a software product is not just code. We all know that, but the pressure to release the implementation alone is great. I am a firm advocate of documentation and testing. Testing is easier for developers to accept as it is a channel for more coding; coding is something they want to get good at and enjoy doing even when the effort is going badly. Getting development documentation written is an uphill struggle.

I have only ever worked at one organization where developers prepared documentation without acrimony and that was the CASE company Cadre. I was an apprentice programmer at the time and ready to accept, without question, everything I learned there. I have since come to be more selective, but the importance of documentation has remained. Documentation is a part of my third point about communications. Communications includes all its in-person and online forms. Communication is about coming to a common understanding and then achieving consensus.

A software project has a product goal and an implementation goal. The product's goal, ie having an implementation that works, is the easy part of software development. Having an implementation that the development organization can support is much harder. The initial expression of the implementation is not code, but a written design. This written design might only consist of a few diagrams and an enumeration of constraints and problem areas, but having it written down means that someone made the effort to attain a comprehensive grasp of the product implementation and to communicate it to others. From that the development staff can begin to come to a common understanding. There are other documentation needs, but for now, let's start with an upfront design!

My last point is one coders and heads of development publicly dismiss, even belittle, but privately value when it is done well. So much of a developer's work is unseen. The reviewer finds problems. Who finds successes? (We don't even have a title for such a role!) Developers want others to know about their trials (their stories), and for their accomplishments and improvements to be acknowledged. None of this is needed for the product. All of this is needed for a healthy development organization.

[1] I am going to use the term software development and not software engineering. My ego is buffed by the engineering term, but, frankly, software development is far from an engineering discipline as the term is used in other inventive organizations. Our work is, with the best of meanings, craft.

The best way to think about Silicon Valley is as one large company

"The best way to think about Silicon Valley is as one large company, and what we think of as companies are actually just divisions. Sometimes divisions get shut down, but everyone who is capable gets put elsewhere in the company: Maybe at a new start-up, maybe at an existing division that’s successful like Google, but everyone always just circulates. So you don’t worry so much about failure. No one takes it personally, you just move on to something else. So that’s the best way to think about the Valley. It’s really engineered to absorb failure really naturally, make sure everyone is taken care of, and go on to something productive next. And there’s no stigma around it."

-- Valley of Genius

Found at Stuff The Internet Says On Scalability For August 3rd, 2018

Gaming with Alexa

I am an armchair tabletop wargame and boardgame geek. I say armchair as I mostly seem to read and speculate about games far more than I play them. In part this is due to a lack of available opponents and in part to simply not making the time. Nevertheless, I persist.

When my children were very young I noticed that they would spend long periods of time studying the details of intricate pictures. This was their "dinosaur period" and so it was mostly illustrations of Jurassic flora and fauna. I had the notion of a game set on a large rug sized illustration they could scamper around on. The rug was pressure sensitive so the children's location was known. The rug would speak and listen. The children would respond to its directions either alone or in small groups. They could be the hunters, the hunted, the treasure seekers, the jungle veterinarians, etc.

The game remained speculative, but the ideas of location aware game boards, audio interaction, and physical game pieces have continued to interest me. I explored using old school pen digitizers, old school touch screens that used infrared interference for locating, magnets and mechanical switches, RFID, image pattern recognition with and without QR-codes or colored dot markers, etc.

Gaming in the wild came under scrutiny. How would LARPing or scavenger hunts change with augmented reality? What about audio only games? What would a naval or starship strategy game require from a driver stuck in commuter traffic? How much of the map or simple orientation could the player keep in their head? Clearly, these would not be realtime games or there would be a high likelihood of distracting the driver into actual vehicular combat.

When Amazon's Alexa was introduced I read the SDK documentation with excitement. Amazon had done the hard work of creating a conversational model for audio interaction. I think a small jet fighter combat game, oriented over the driver's car roof, is well within the skills of even a moderately skilled programmer. Now to make the time.

When getting it to work is the least of your problems

There are 3 important traits of a good software implementation
  1. It works.
  2. It is maintainable.
  3. It is reusable.
We have heard many times over the years how a language, a framework, or a methodology is better (and often "the best"). And I don't doubt this for a minute, but over time their betterment is very unlikely to prevail. Three examples are in order.

The developer that creates a data driven implementation that only he or she can understand has created something that is not going to be maintainable by your low skilled, procedural developers. The implementation works, but it can not be considered maintainable and reusable if these activities are limited to one developer.

The development team picks a language that has a passionate following and generally good library coverage for implementing the initial product release, but that does not have broad acceptance in the development organization. This implementation is not going to have staff readily available for maintenance. So the original staff become its maintainers and these are, very likely, the most expensive developers on your staff.

The development organization eschews using anything but in-house developed tools and frameworks. No matter how well these are architected, designed, used in implementations, and supported with on-boarding training, they are never going to be better than what is available outside. The longer your staff remain, the less employable they become, and so the more anxious they are to leave or, worse, hunker down into survival mode.

When you are creating a product for a one-time use then by all means use whatever it takes to get it working. Remember that these products are gadgets. Even if the gadget is critical to the success of some other endeavor -- eg, it specializes in an island's one-time disaster response logistical problems -- it is still a gadget and so expendable. When you are creating an appliance then getting it to work is the least of your problems. Maintaining it and making its parts reusable are central to the development organization's success and repeatability. This requires that you be able to continually staff your organization with a range of developers with costs appropriate to the lifecycle stages of your products.

Slack and "Operation Crossfire aftermath"

I was watching Operation Crossfire aftermath and another use for Slack came to mind. Lindybeige was coordinating a miniatures wargame with several dozen people, across the world, over a 4 hour period using email. It was a logistical nightmare trying to keep track of each conversation and the materials that needed to be gotten and sent to each participant -- half of whom were called "Steve"! As I watched it occurred to me that had Lindybeige instead used Slack and created, for example, a channel for each participant and a channel or two for general purpose it would have been logistically much easier. Channels in bold have unread messages, messages are organized chronologically, messages can include maps or command and control documents, messages can be pinned for easier reference, etc. Lindybeige still would have been exhausted by the effort, but, hopefully, in good spirits throughout.

I should note that using Slack in this way is not the same as using a messaging platform such as Skype, IRC, etc. In a messaging platform you are talking with an individual within the context of the rest of their life's conversations. Instead, a Slack workspace is being used to gather individuals for a shared purpose. The rest of their life happens outside of the workspace. When the individual is in the workspace they are, for all intents and purposes, the character they are playing. This is reinforced by their communicating via the named channel. I am no longer Andrew Gilmartin, but instead I am #Second Lieutenant Anderson.

[1] This is a follow up to Slack for One.

[2] I know next to nothing about who Lindybeige is. I do enjoy his YouTube channel.

Slack for One

I have a deep background thread running in my head that started when I read in The Year Without Pants: WordPress.com and the Future of Work about how Automattic, Inc. created a new WordPress site for every bug. I tend to think of blogs as having weight. They are heavy. However, this is a psychological response to how blogs have been used. Technically, they are not much more than a column value in a database table. One of the lightest pieces of data there is. Slack is no different in this regard than is WordPress; Slack's Workspace is no more than a column value in a table.

Recently, I had the idea that you could use a private Slack site for personal projects and journals. Each channel is a new project or a journal. Slack's chronologically organized messages allow for text, images, (simple) formatting, and the inline inclusion of external content such as from Google Apps. There are many bots for managing each project's actions and reminders, for example. And if you do need to share the project you can. As I thought about this more there seemed no end to how to use Slack as purely a personal information tool rather than a group communications tool. I wonder if anyone has actually used Slack this way?

So, how little data are too few?

Don't use percentages when the data you have are too few. If you have 3 data points then saying "67% of reviewers gave it 5 stars" is both accurate and misleading. It is better to say "2 out of 3 reviewers gave it 5 stars." Doing this assists the reader's intuitive grasp of the usefulness of the rating. When I was following the local school district's quantitative-heavy presentations it was obvious when they chose to use percentages and when they chose to use counts in order to give a positive depiction of bad news. Don't cause your reader to doubt your trustworthiness.

So, how little data are too few? I would say anything fewer than 100 data points.
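The rule above can be sketched in a few lines of Java. This is only an illustration: the class name, method name, and message wording are my own assumptions, and the threshold of 100 is simply the figure suggested above.

```java
final class RatingSummary {
    // Threshold taken from the text: fewer than 100 data points is "too few".
    static final int MIN_COUNT_FOR_PERCENT = 100;

    // Describes a 5-star rating as a count when the sample is small
    // and as a percentage (with the sample size) when it is large enough.
    static String describe(int fiveStar, int total) {
        if (total <= 0) {
            return "no reviews yet";
        }
        if (total < MIN_COUNT_FOR_PERCENT) {
            return fiveStar + " out of " + total + " reviewers gave it 5 stars";
        }
        long percent = Math.round(100.0 * fiveStar / total);
        return percent + "% of " + total + " reviewers gave it 5 stars";
    }

    public static void main(String[] args) {
        System.out.println(describe(2, 3));      // count form for a tiny sample
        System.out.println(describe(670, 1000)); // percentage form for a large sample
    }
}
```

Even in the percentage form the sketch keeps the sample size in the sentence, so the reader can still judge how much weight the number deserves.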

Four at a time

My miniature painting has not gone well. I am deep in guilt. A fellow gamer gave me all his 28mm dark age armies when he decided that he was not going to play that period any more. It was a very generous gift and I am grateful. I did not like his Norman army's painting, however, and so decided to repaint them. I stripped 27 figures and primed them. I used a priming method I had not used before on the advice of another gamer I trusted. The primer is a mixture of black gesso and a medium used for painting on glass. My results were not so good. I am sure it was my fault. The consequence was that the figures sat on the worktable for months. My guilt set in. I had made the gift useless.

Months later I was reading about primers, as you do, and was inspired by the article on using Krylon ColorMaster ultra flat black primer. I stripped all 27 of the figures again and applied the primer. My results were not so good. I am sure it was my fault. Again, the figures sat on the worktable in limbo. My guilt grew heavier.

This weekend I stripped all 27 figures again. They are, as I write this, sitting in a second round of stripper. Each of my successive primings seems to have actually added tooth to the figure's surface and so has made cleaning them more difficult!

I have learned a few useful lessons from this experience. The small lesson is that the primer I have used for years works and I really don't need to experiment. The big lesson is I can't work on so many figures at once. If I had only worked on 4 figures at a time then by the end of each week I would have 4 painted figures. Within two months all the figures would have been painted and have been on the table killing the Anglo-Saxons.

When the second round of stripper has eaten away at the bond between primer and figure I am going to pack them away. Too much bad karma has enveloped them. I and they need to rest. Instead, I am going to paint some old Warhammer 40K space marines. 4 at a time.

Getting Things Done for Teens

My children are off to college in the Fall and so my mind has been on what will help them succeed. I have always liked GTD and so I read the recent publication of Getting Things Done for Teens. The book does a good job of using a voice that is not too young and not too formal. The GTD advice is laid out as in any other GTD tutorial and is supported with some useful illustrations. I don't mind the two cartoon characters that are used to distinguish the impulsive and the steady centers of the brain. Monkey brain and Owl brain are useful mnemonics, but, perhaps, a little childish.

The book is composed of 3 sections. The 1st section is the GTD framework. The 2nd section is life planning with GTD. The 3rd section is troubleshooting with GTD. The 1st section is required reading. The 3rd section is useful, but its advice could have been rolled into the 1st section. The 2nd section is a mistake. Few American teens are mature enough to use the advice in this section. The 2nd section adds a considerable page count to the book's total. And here is the rub; the book is useful, but at 288 pages it is 200 pages longer than most American teens are willing to read without a clear & present need.

I don't doubt that the authors know their audience. I suspect they would agree that a shorter book is more likely to be read than a longer one. So why include the 2nd section at all? I suspect it has more to do with selling a standard sized product than helping the teens. My advice is to tear the book into front and back parts and then throw away the back part.

Graphviz online

I like Graphviz for quickly creating network graphs. Xin Huang compiled the code to JS and created this useful online editor. https://github.com/dreampuf/GraphvizOnline

JNDI, Tomcat, and ClassLoaders

If you want to use JNDI within a servlet container then you need to be careful to use the corresponding class loader for your shared resource. So, ensure that you have only one copy of the class in your JVM and then when you want to bind or lookup the resource use the resource's class loader. For example,
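import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Binds value under name while the thread's context class loader is valueClass's loader.
public <T> void bind(String name, Class<T> valueClass, T value) throws NamingException {
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    try {
        Context c = new InitialContext();
        Thread.currentThread().setContextClassLoader(valueClass.getClassLoader());
        c.bind(name, value);
    } finally {
        // Always restore the caller's context class loader.
        Thread.currentThread().setContextClassLoader(cl);
    }
}

// Looks up name while the thread's context class loader is valueClass's loader.
public <T> T lookup(String name, Class<T> valueClass) throws NamingException {
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    try {
        Context c = new InitialContext();
        Thread.currentThread().setContextClassLoader(valueClass.getClassLoader());
        // A checked cast replaces the unchecked (T) cast.
        return valueClass.cast(c.lookup(name));
    } finally {
        // Always restore the caller's context class loader.
        Thread.currentThread().setContextClassLoader(cl);
    }
}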
I am passing the valueClass to bind() so that an interface can be used if wanted.

Problems of a modern, digital, open office

Never liked open plan offices. Lots of studies going back to the 1960s on their problems. This new, small study shows the problems of a modern, digital, open office. Summary at Open Offices Make You Less Open.

Off-shore oil rigs

When I was a kid in the 1970s I was fascinated with off-shore oil rigs. I was living in the UK and there was much news about developing the North Sea oil reserves. I would see short snippets on the TV of rigs on maiden voyages, and exciting photographs in magazines highlighting their intricacies. Compared with today's availability of information these few images were as ephemeral as smoke on the wind. So off-shore rigs had a mystery and a fantasy around them. Earthbound space colonies. Nevertheless, I was, and continue to be, inspired by their engineering and the unmitigated confidence needed to build them.

Now, of course, an internet search finds enough information to become an armchair expert in the off-shore oil industry. There are models to build. There is even, it seems, a sub-culture of 3D renders of them and other industrial sites (eg).

Photograph is of Norway's Draugen Oil Platform (src).

Computer assisted support for historical wargames will happen if...

My favorite wargames podcast is Meeples & Miniatures. The hosts are all wargaming butterflies, and I mean this kindly, and so there is a good amount of new and old discussed and compared. Their 250th episode is their last of the Summer as they take a break to, well, rest and renew like the rest of the northern hemisphere. For this episode they are answering questions from their audience. I've not finished listening yet, but felt the need to bring a different perspective to one of their answers.

The question was whether or not an assistant computer application would be used in historical wargaming? Apps are appearing more regularly now for boardgames. Some of these boardgames have game mechanics and parts that are very close to those of miniature wargames. Even the forthcoming X Wing 2 is supposed to be app assisted. So, there is a trend in apps and there is a trend in wargames to be more like boardgames (in their initial costs and time commitments). An overlap is inevitable.

The hosts' common answer was one having to do with implementation rather than game play. They discussed how an app is dependent upon a general device, and a large networking and computing infrastructure. The general device is your phone which, for all practical purposes, you own but have little control over its OS or applications suite. The infrastructure is mostly the publisher's backend servers in data centers that run the core of the game programming; which, again, you have no practical control over. The hosts see these dependencies as the Achilles heel of apps. They suggest that only large publishers have the money to continue to support a game that has passed its peak sales and so must sustain the game's implementation on smaller, incremental sales. While not said, I expect that the hosts would agree that it is optimistic to expect that any publisher would continue to support a game beyond its suitable profitability; ie, if profitability is too small then the company is better off discontinuing the game and using the freed-up resources on higher profit games. So apps are doomed!

Not so. There are implementations that do not require the publisher's continued support. First, some assumptions.

1. A phone or tablet is cheap enough to have a single, specialized use. The device is the game and it is the "rule book." Rule books are around US $15 to $50 these days. The device would be the same. The device will never be upgraded so the apps on it will continue working barring mechanical failure.

2. The device needs access to a messaging network. The messaging network is one where every device has a unique address and a message can be sent to that address. If players are not colocated then a wide area network would be needed. The Short Message Service (SMS, ie texting) is one such network. There are others, but SMS is by far the most common, well supported, and almost future-proof given the world's telecoms' commitment to it.

In many ways, the Kindle with 3G is an archetypal example of this device and network.

With such a device and network you can implement a multiplayer game very successfully. All the algorithms needed, eg peer to peer and modular co-operating services, are battle tested and open. The device's computational and storage requirements are minimal. The networking bandwidth needed is small, eg a few kilobytes, irregularly sent around to all player devices [1]. Embedded systems manufacturers have been designing and massively deploying just this kind of environment for years.
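To give a feel for how small such messages can be, here is a minimal sketch of a single player order encoded as text that fits in one SMS segment (160 characters of the GSM 7-bit alphabet). Everything about the format is hypothetical: the "WG1" tag, the field names, and the ';' separator are assumptions made for illustration, not part of any existing game or protocol.

```java
final class OrderMessage {
    // A single SMS segment carries up to 160 characters of the GSM 7-bit alphabet.
    static final int SMS_MAX_CHARS = 160;
    // Hypothetical protocol tag so a device can ignore unrelated texts.
    static final String TAG = "WG1";

    final String gameId;
    final int turn;
    final String unit;
    final int x;
    final int y;

    OrderMessage(String gameId, int turn, String unit, int x, int y) {
        this.gameId = gameId;
        this.turn = turn;
        this.unit = unit;
        this.x = x;
        this.y = y;
    }

    // Encodes the order as ';' separated fields. Fields must not contain ';'.
    String encode() {
        String text = String.join(";", TAG, gameId, String.valueOf(turn),
                unit, String.valueOf(x), String.valueOf(y));
        if (text.length() > SMS_MAX_CHARS) {
            throw new IllegalArgumentException("order does not fit in one SMS");
        }
        return text;
    }

    // Parses text produced by encode(); rejects anything without the tag.
    static OrderMessage decode(String text) {
        String[] parts = text.split(";");
        if (parts.length != 6 || !TAG.equals(parts[0])) {
            throw new IllegalArgumentException("not an order message: " + text);
        }
        return new OrderMessage(parts[1], Integer.parseInt(parts[2]), parts[3],
                Integer.parseInt(parts[4]), Integer.parseInt(parts[5]));
    }

    public static void main(String[] args) {
        String sms = new OrderMessage("hastings", 3, "NormanCav2", 14, 7).encode();
        System.out.println(sms + " (" + sms.length() + " chars)");
        System.out.println("unit after decode: " + decode(sms).unit);
    }
}
```

A real game would also need ordering, acknowledgement, and some protection against spoofed senders, none of which this sketch attempts.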

So, if you consider this different implementation of computer assisted support for historical wargames then the answer is yes, just as soon as gaming companies have lead engineers and architects that have a broader view of the devices and their communication. Now the real and important question can be asked, is historical wargames game play enhanced by having this assistant?

See my earlier posting $0.43 for Psychotherapist Barbie services about funding backend servers for toys.

[1] MMOs (massively multiplayer online games) are a different beast. These systems do need a central or federated infrastructure.

Fourier Transform & Dubstep Drops

Two of my favorite recent finds.

But what is the Fourier Transform? A visual introduction.

Encoding data in dubstep drops

12 years training for this role!

So far, my children's senior year of high school has been mostly wasted time. This is especially so as their graduation day comes ever closer. I propose that we alter senior year from one of listening from the back of the classroom to one of teaching from the front. It should be a year when the seniors go out into all the lower grades, K through 11, and are teachers' assistants. Who else has a better understanding of how to be useful and successful in schools than do these young adults that have spent 12 years training for this role?

Sleeping dogs

Sometimes you forget those services that quietly run for months at a time in a corner of the datacenter. I am thinking of using gRPC to replace the communication layer for our micro-services. They currently use a home-grown RPC over HTTP that, while successful and simple to use for new services, is home-grown, non-streaming, and Java specific. One of our earliest micro-services handles over 2B requests per month. Perhaps I should let sleeping dogs lie....

Struggling to load Arabic text into MySQL 5.6

If you are struggling to load Arabic text or Emojis into your MySQL 5.6 table then:
  1. Ensure your database, table, and column use the utf8mb4 character set.
    create database XXXX character set 'utf8mb4' collate 'utf8mb4_unicode_ci';
    
    create table ( ... ) ... character set 'utf8mb4' collate 'utf8mb4_unicode_ci';
    
  2. Ensure that the data file you are loading is UTF-8 encoded.
  3. Run the mysql client with the binary character set.
    mysql ... --default-character-set=binary ...
    
  4. Load the data with the binary character set.
    load data local infile 'XXXX.DAT' into table XXXX character set binary ( ... );
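As background for step 1: MySQL's older utf8 character set stores at most three bytes per character, while utf8mb4 allows four. A small Java sketch can show which text actually needs the fourth byte. The class and method names are hypothetical, and the sample strings are my own. In these samples the Arabic letters take two bytes each and only the emoji takes four.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Width {
    // Largest number of UTF-8 bytes used by any single code point in the text.
    static int maxBytesPerCodePoint(String text) {
        return text.codePoints()
                .map(cp -> new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8).length)
                .max()
                .orElse(0);
    }

    // True when some character cannot be stored in a three-byte utf8 column.
    static boolean needsUtf8mb4(String text) {
        return maxBytesPerCodePoint(text) > 3;
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // an Arabic greeting
        String emoji = "\uD83D\uDE00";                    // U+1F600, a grinning face
        System.out.println("arabic max bytes: " + maxBytesPerCodePoint(arabic));
        System.out.println("emoji max bytes: " + maxBytesPerCodePoint(emoji));
        System.out.println("emoji needs utf8mb4: " + needsUtf8mb4(emoji));
    }
}
```

Running a check like this over a sample of the data file is one way to confirm whether the column character set or the client and load character sets (steps 3 and 4) are the more likely source of trouble.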
    

Good Terraform and AWS tutorial

The Learn DevOps: Infrastructure Automation With Terraform tutorial from Udemy is very good. While it focuses on using Terraform to manage AWS it also gives a good introduction on how to assemble AWS' many moving parts.

The only trouble I had with the examples was that the initial ones assume that you have IP access via the AWS default security group. That might not be the case and so you will need to use the AWS console to allow your machine's public IP address ingress. To do this, go to the AWS Console, then to the EC2 dashboard, then to the Security Groups resource, and in that dialog select the default security group and add an Inbound rule for your public IP address.

Two masters

I attended last night's release of the Providence Geeks' new website developed by Kenzan. Owen Buckley, Kenzan's director of engineering, did a good job describing the management and implementation of the project. The final result is a useful website, but with little polish. This is because Kenzan had two customers and not one for this project. Providence Geeks (customer 1) wanted a new website to replace Facebook. Kenzan (customer 2) wanted a learning opportunity for its staff. So the project had too many acceptance criteria, or, as Owen Buckley said, "when do you know it is done?" Unfortunately, "done" was having built an 18-wheeler to haul a pin.