It is a one-way hash

I keep forgetting to correct my use of terms. The substitution cipher in the last postings is really a one-way hash function. That is, you cannot reverse the ciphering process and recover the input (in this example, the domain name). This is because more than one character is mapped to a single digit. In the demonstration grid all instances of 'a', 'g', 'm', and 's' are replaced with 1, and everything that is not between 'a' and 'x' is replaced with 0. This is a one-way hash. The most famous one-way hash is MD5.
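To make the many-to-one point concrete, here is a minimal sketch using the demonstration grid. The names GRID, buildTable and substitute are my own for illustration and are not part of the bookmarklet.

```javascript
// Demonstration grid: rows are separated by periods, as in the bookmarklet.
var GRID = 'abcdef.ghijkl.mnopqr.stuvwx';

// Build a letter -> column-number table (columns are numbered from 1).
function buildTable(grid) {
  var table = {};
  grid.split('.').forEach(function (row) {
    row.split('').forEach(function (letter, column) {
      table[letter] = column + 1;
    });
  });
  return table;
}

// Replace every character with its column number, or 0 if it is not in the grid.
function substitute(text, table) {
  return text.split('').map(function (ch) {
    return table[ch] || 0;
  }).join('');
}

var table = buildTable(GRID);
console.log(substitute('agms', table)); // 1111
console.log(substitute('sam', table) === substitute('mag', table)); // true
```

Since "sam" and "mag" both come out as "111", there is no way to tell from the digits alone which input produced them.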

The start of a bookmarklet for generating passwords

Following up on yesterday's password generation posting, here is some Javascript that generates passwords based on the last two names in the domain name. It is presented in a form suitable for use in a bookmarklet.


( function( substitution_cipher_grid ) {
    /* Remove all whitespace so only letters and row-separating periods remain. */
    substitution_cipher_grid = substitution_cipher_grid.replace(/\s+/g,'');
    /* Map each letter to its 1-based column number within its row. */
    var substitution_cipher = {};
    var rows = substitution_cipher_grid.split('.');
    for ( var r = 0; r < rows.length; r++ ) {
        var cells = rows[r].split( '' );
        for ( var c = 0; c < cells.length; c++ ) {
            substitution_cipher[cells[c]] = c + 1;
        }
    }
    /* Keep only the last two names of the host, eg "yahoo.com". */
    var domain = window.location.host.split('.').reverse().splice(0,2).reverse().join('.');
    /* Characters that are not in the grid become 0. */
    var password = domain.split('').map( function(c) { return substitution_cipher[c] || 0; } ).join('');
    window.prompt( 'Password for '+domain+' is:', password );
} ) ( " a b c d e f . \
g h i j k l . \
m n o p q r . \
s t u v w x " );


The substitution_cipher_grid is expressed as a Javascript string where each row is separated by a period and each cell is separated by a space. Again, the actual grid used here is only for demonstration and should not actually be used.
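The domain-name step in the bookmarklet is the least obvious line, so here it is pulled out as a small sketch. The function name lastTwoNames is hypothetical, and slice(-2) is swapped in for the reverse/splice/reverse chain; it yields the same result.

```javascript
// Keep only the last two period-separated names of a host.
function lastTwoNames(host) {
  return host.split('.').slice(-2).join('.');
}

console.log(lastTwoNames('mail.yahoo.com')); // yahoo.com
console.log(lastTwoNames('yahoo.com'));      // yahoo.com
console.log(lastTwoNames('www.bbc.co.uk'));  // co.uk
```

The last line shows a limit of the "last two names" rule: for country-code domains it keeps only the shared suffix, so every ".co.uk" site would get the same digits.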

The most significant flaw in this example is having the substitution_cipher_grid encoded in the bookmarklet. Ideally, the first time the bookmarklet is used it would ask for the grid and store it away until you quit the browser or its time expires. This is the functionality of a cookie. Unfortunately, I do not know how to make a cookie that is only associated with a bookmarklet. Another caching mechanism would be acceptable too. Any ideas anyone?

Update: The code has been updated to not use the strip() function. I did not realize during testing that I was within a site that was using Prototype which adds methods to Javascript objects. One of these methods is strip().

A password generator I can remember

I want a password generator that I can remember. In most cases I will have my phone or computer that can be used to regenerate the password. However, I don't want to be so reliant on these tools for this. It would be great if I had a system that worked outside of the virtual world too. I want to be able to regenerate the password with pencil and paper if I need to. What I need is a substitution cipher and a secret or two.

For example, let's say the secrets about the password are that 1) it uses the substitution cipher on the domain name and 2) it appends a memorable suffix. For "yahoo.com" the substitution is "012330331", the suffix is "MainSt", and so the password is "012330331MainSt". This is a fairly good password. It is not a dictionary word and it contains a mixture of numbers and letters of different case.

The cipher is made by arranging the letters of the alphabet in a grid. The substitution is obtained by finding the letter in the grid and recording its cell position, either both the x and y coordinates or just one dimension as done here. Any character not in the grid is replaced with 0. The grid used for the cipher above is

cypher-grid-a

The grid is 4 rows of 6 columns with the alphabet filling the cells from left to right and top to bottom. The column numbers are the substitutions. This is not a very good substitution cipher, however; it is shown here simply to have a clear example. The following grid is better as the alphabet fills the cells in a non-obvious way that is, for me, easily remembered.

cypher-grid-b

The "yahoo.com" substitution is "012440243".
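Going back to the first, demonstration grid, the whole pencil-and-paper procedure can be sketched in a few lines. This is only an illustration: the names LETTERS, COLUMNS, columnOf and makePassword are mine, and the column is computed from the letter's position in the alphabet rather than looked up in a drawn grid.

```javascript
// Letters in the order they fill the demonstration grid, row by row.
var LETTERS = 'abcdefghijklmnopqrstuvwx';
var COLUMNS = 6;

// Column number (1 to 6) of a character, or 0 if it is not in the grid.
function columnOf(ch) {
  var index = LETTERS.indexOf(ch);
  return index === -1 ? 0 : (index % COLUMNS) + 1;
}

// Password = substituted domain name followed by the memorable suffix.
function makePassword(domain, suffix) {
  var digits = domain.toLowerCase().split('').map(columnOf).join('');
  return digits + suffix;
}

console.log(makePassword('yahoo.com', 'MainSt')); // 012330331MainSt
```

The second grid would need an explicit lookup table instead of the modulo arithmetic, since its letters are not in alphabetical order.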

A hacker might try guessing the substitution cipher part of your password, but it would take them up to 40,353,607 guesses: each of the 9 characters in the domain name can map to one of 7 replacement digits (0 through 6), giving 7^9 possibilities. Even then they still would not have the secret suffix. Further, well before 40,353,607 login attempts Yahoo! would have locked out the account.
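The guess count is easy to check. A tiny sketch, with variable names of my own choosing:

```javascript
var digits = 7;                   // replacement digits 0 through 6
var length = 'yahoo.com'.length;  // 9 characters in the domain name
var guesses = Math.pow(digits, length);

console.log(guesses); // 40353607
```

A longer domain name raises the count quickly, since each extra character multiplies it by 7.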

I am not sure if I will use this technique just yet. It has all the characteristics I want. It is also easily coded as a bookmarklet or web application and an iPhone application.

If you know of similar password generators or better ones please send me a note.

Update: I like the ideas behind the password card at http://www.passwordcard.org/

Chickens: Status 18 May AM


Well, we can't tell which of the eight chickens are sick. In fact, after reading far too much about the composition of chicken poop we are not even sure that they are sick. Nevertheless, all the chickens are now in their own recuperation box, under lights in the studio's bathroom. If there was any doubt about our sanity this picture clearly clears that doubt away.

Jim Coombs's tools

The Extreme Geek piece by Cory Doctorow on writing tools reminds me of Jim Coombs's tools: both are about writers using geek tools to aid the research and composition process. Coombs's tools used XEDIT under IBM's mainframe VM/CMS operating system. His tools were implemented well before the web and so not much evidence of their existence is easily found today. This period is captured in McCarty's HUMANIST: Lessons from a Global Electronic Seminar.

FYI: The timing of local news cycles

The timing of local news cycles nicely supports some of what I wrote about in yesterday's Journalism and the new news organization's three products.

Journalism and the new news organization's three products

I agree with comments (other places) that much of the discussion about the demise of newspapers did not distinguish between newspaper publishing and journalism. The newspaper publishing business model is on its last legs. While I will miss the physical artifact I don't mind seeing it go. Journalism, however, is needed now more than at any other time in its history.

Government at all levels encompasses more of people's lives than ever before. It is huge: from products to processes to places there is not much that is not touched by a regulation. Without investigative journalism we, as a country, will be lost. Any corruption we have now will pale in comparison to what we will have. Any danger we have now will pale in comparison to the future dangers we will have. Journalists study these things. They are the nudge that gets answers. Journalism needs to be supported.

There is talk that David Geffen will buy the New York Times and make it a non-profit, something like what was done for the St Petersburg Times in Florida. On the surface this seems like a good thing for the New York Times. However, given how much debt the Times has it is already a non-profit. But seriously, making all news entities non-profits does not make long term sense. I am sure that there is enough philanthropic money out there to do it but journalism needs to stand on its own feet. We don't want "journalism" to have the same air as "academic" has today: work distinct from and unrelated to everyday lives.

What to do? My opinion is that news organizations need three products. The products are distinguished by factors of intellectual effort and historical perspective. Each has a different business model. They all derive benefit from the skills and the act of investigative journalism.

The first product is the "news stream." The recently released Times Wire is a model presentation for this. News streams represent the events and reportage happening now: a better Twitter for the news room. The content is enough for the reader to get the gist of the story. Sometimes the content comes from journalists but content can also come automatically from the raw data available from sources such as government administration, police, fire, hospitals, etc. Overall, there is very little editing that goes into the news stream. The news stream is paid for by advertising. It is free to the reader.

The second product is the "news edition." Its primary function is to present a day's or a week's worth of general local and regional news and information to the public. It is similar to the newspapers and news magazines we have today. It contains some of what is in the news stream but in longer form. Long form investigative pieces are its bread and butter. What is investigated is, in part, driven by the readers' choices. The news edition is paid for by advertising, subscriptions, and single issue purchases. Advertisements must be subscriber specific (just as Google search is today) so a higher advertising rate can be charged.

The third product is the "news horizon." These are reports containing a deep investigative analysis of a single topic. "Commercial Fishing in South Kingstown", for example. Each report is informed by the current and historical trends and the current and historical facts. Depending on the topic it is updated biennially, yearly, or quarterly. It is considered the principal source of an objective perspective on the topic. Its readers are businesses and investors and sometimes the general public. The news horizon is paid for by subscriptions and single issue purchases. Any advertising would be limited to an underwriting notice.

Just Landed: Processing, Twitter, MetaCarta & Hidden Data

Just Landed: Processing, Twitter, MetaCarta & Hidden Data is a brilliant example of social visualization from found data.

Eight chickens

If you build a coop then you should expect chickens and be prepared. But, just like when H&O came, I feel totally unprepared. Having successfully cared for H&O I expect that I can do the same with the eight new additions to the household.

No plain text password

Every site that asks you to register and use a password should clearly declare whether or not the password is stored in plain text within their systems. No system need store a password as plain text but so many still do. Sigh.
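For what it is worth, here is a minimal sketch of what "not plain text" can look like, assuming a Node.js server and its built-in crypto module. The function names hashPassword and verifyPassword and the "salt:hash" record format are my own choices for illustration, not any particular site's scheme.

```javascript
var crypto = require('crypto');

// Derive a storable record from a password: a random salt plus the scrypt hash.
function hashPassword(password) {
  var salt = crypto.randomBytes(16);
  var hash = crypto.scryptSync(password, salt, 64);
  return salt.toString('hex') + ':' + hash.toString('hex');
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, record) {
  var parts = record.split(':');
  var salt = Buffer.from(parts[0], 'hex');
  var expected = Buffer.from(parts[1], 'hex');
  var actual = crypto.scryptSync(password, salt, expected.length);
  return crypto.timingSafeEqual(actual, expected);
}
```

Only the record is stored. At login the site calls verifyPassword with what the user typed, so the password itself never needs to be kept anywhere.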

South Kingstown, RI Now and Transparent Government

My head these days is stuck in thinking about and working to achieve a more transparent government and so some of my actually interesting postings are over at South Kingstown, RI Now. Interesting even to non-Townies.

^T ? "three significant terms" ^M

The most used keyboard combination during my day (besides the backspace!) is control-t, question-mark, "three significant terms", and enter.

Where would I be without fast access to Google in Firefox?

Chicken 911?

Chris and I have discovered that we are late ordering chicks for our coop. McMurray Hatchery is mostly sold-out. Agway is still taking orders. Alle's is not. And so our selection of breeds is limited. If you know of other places in RI where we can order chicks please send details to andrew@andrewgilmartin.com. Thanks.

Wooden Baskets for Mother’s Day

Here is one of the two wooden baskets Henry and Owen made this weekend for Mother’s Day. The baskets were inspired by a small book on weekend projects, a kind of “Vogue for men”. All you need is a miter-saw, glue, and wire nails. It takes about an hour to cut and prepare the pieces and about another hour for a 9 year-old to put it together.

Chris and I have a plot at the new South Kingstown community garden and so these baskets will be useful for carrying out weeds and veggies.

(This posting also tests the Windows Live desktop software for blogging (and for photo gathering and management). I have been looking for a Blogger editor for some time and so if this one works I can move on to other sources of minor irritation.)

Using inotifywait with queue directories

After writing Using directories as queues I realized I did not mention how you should check the queue. Using a cron job is usually adequate. However, this does mean your script will be polling the queue-directory and having to distinguish between an empty and a non-empty queue-directory. If you don't like polling then use inotify and specifically inotifywait (part of the Ubuntu package inotify-tools). For example, this command line will monitor the directory /queue/new and, when files are added to or moved into the directory, move each one to /queue/cur and run the word count command on it.

Q=/queue ; inotifywait \
   --monitor \
   --event moved_to \
   --event create \
   --format "mv $Q/new/%f $Q/cur/ && wc $Q/cur/%f" $Q/new/ | sh

Using directories as queues

Software managed queues are great tools. Most enterprise system installations have some sort of message broker available. These are sophisticated tools. This posting is not about them. This posting is about using directories as queues.

The basic idea of using a directory as a queue is to place new "work" as a file in a "queue" directory. When the queue-processor is ready to process the new work-files it first moves the queue-directory's files to the queue-processor's "working" directory and then processes them.

The assumption here is that the work-files have all their content. But we all know that writing to a file takes time and during this writing time the queue-processor might awake and start processing the new, incomplete work-files. This is not good. If you are using a directory as a queue it is likely your queue-processor is a script and, unfortunately, most script solutions have poor error handling. The result is that the incomplete work-file gets ignored.

The good news is that there is an easy solution. I first saw this solution used within the qmail MTA. A feature of Unix file systems is that adding files to or removing files from a directory is an atomic action. That is, all processes wanting to alter the set of files in a directory are queued up and only one at a time is allowed to make changes. So, when you write the work-content file make sure to write it first outside of the queue-directory and then move the completed work-file to the queue-directory (both locations must be on the same file system so that the move is a rename rather than a copy). It is critical that the work-content creator do the moving.

The archetype for this is to create a queue-directory containing two sub-directories:
mkdir -p queue/new/
mkdir -p queue/tmp/
Add the work-content file to queue/tmp with a globally unique file name. Once writing is finished and the file closed, then move it from queue/tmp/ to queue/new/. This even works nicely across networks, for example:
QUEUE=/x/y/z
SOURCE="/a/b/c/foo.txt"
TARGET="$QUEUE/tmp/$(basename $SOURCE)-$(uuid)"
scp -q $SOURCE user@host:$TARGET
ssh -q user@host "mv $TARGET $QUEUE/new/$(basename $SOURCE)"
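For a queue on the local file system, the same write-then-move pattern can be sketched in Javascript, assuming Node.js and its fs module. The function names createQueue, enqueue and drain are hypothetical, and the "cur" working directory is an assumption borrowed from the inotifywait posting.

```javascript
var fs = require('fs');
var path = require('path');
var crypto = require('crypto');

// Create the tmp, new and cur sub-directories under the given queue root.
function createQueue(root) {
  ['tmp', 'new', 'cur'].forEach(function (name) {
    fs.mkdirSync(path.join(root, name), { recursive: true });
  });
}

// Producer: write the complete file in tmp/, then rename it into new/.
function enqueue(root, content) {
  var name = Date.now() + '-' + crypto.randomBytes(8).toString('hex');
  var tmpPath = path.join(root, 'tmp', name);
  fs.writeFileSync(tmpPath, content);
  fs.renameSync(tmpPath, path.join(root, 'new', name));
  return name;
}

// Consumer: move each file from new/ to cur/ and then process it.
function drain(root, processFile) {
  fs.readdirSync(path.join(root, 'new')).forEach(function (name) {
    var curPath = path.join(root, 'cur', name);
    fs.renameSync(path.join(root, 'new', name), curPath);
    processFile(curPath);
  });
}
```

A producer calls enqueue(root, content) and a script run from cron calls drain(root, callback). Because the consumer only ever looks in new/, it never sees a half-written file.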

See also Using inotifywait with queue directories.

Rhode Island's state budget study group

I am forming a study group to read and discuss the actual Rhode Island state budget. Tom Sgouros, of the Rhode Island Policy Reporter, has agreed to facilitate the group. He would help us understand the necessary background and generally facilitate the meetings. The group's members would be responsible for assigning readings and presenting summaries. Over the course of several meetings we will have collectively read the budget. I know this is incredibly wonkish and verges on crazy but I can't imagine truly understanding the RI budget any other way. If you are interested please contact me at andrew@andrewgilmartin.com.

A new technical lust


Man, just when I was recovering from my Kindle 2 technical lust I now find myself lustful for the Kindle DX. At least I have until the summer to save my pennies for this one. Does anyone out there want me to join their Kindle development team?

TweetChat

Just found TweetChat and it is a very useful Twitter tool. It allows you to focus on a single area of Twitter conversation, which it calls a "room." Within a tweet, a room is indicated by a name preceded by a hash mark, eg "#gov20" and "#wif09". TweetChat could be improved by showing the resolved URLs (and perhaps page titles) of each tweet's embedded tiny URLs.

See also hashtags.org.