Write dammit, v.not-2-yet

May 17, 2011 12:37:05 PM CDT

My digital typewriter has grown into a line editor since the last time I talked about it.

A line editor, for those of you who weren't using computers in the days of teletypes and VT100 terminals, is the next step up from a typewriter. It does edit text, but only one line at a time. I'm treating that as a compromise between the sheer go-forward imperative of a device with no backtracking at all, and the endless twiddle capacity of a full-blown screen editor. It lets me back up to fix typos, but still makes it difficult to futz around with entire paragraphs.

This one also does limited HTML. If you put asterisks around a chunk of text (*like this*), the code automatically converts it to boldface. Slashes (/like so/) produce italics.
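For the curious, here's roughly what that conversion could look like. This is a minimal sketch in Python, not the code running on this page: the function name inline_markup and the choice of <b> and <i> tags are my assumptions for illustration, and it deliberately ignores nesting.

```python
import html
import re

# One pattern, one pass: *text* or /text/. Doing both in a single pass keeps
# the slash rule from tripping over the "/" inside a freshly made "</b>" tag.
_INLINE = re.compile(r"\*(?P<bold>[^*]+)\*|/(?P<ital>[^/]+)/")


def inline_markup(text: str) -> str:
    """Convert *bold* and /italic/ spans to HTML (hypothetical helper)."""

    def repl(match: re.Match) -> str:
        if match.group("bold") is not None:
            return f"<b>{match.group('bold')}</b>"
        return f"<i>{match.group('ital')}</i>"

    # Escape &, < and > first so typed text can't smuggle in raw HTML.
    return _INLINE.sub(repl, html.escape(text, quote=False))


print(inline_markup("asterisks (*like this*) and slashes (/like so/)"))
```

A sketch this simple has an obvious weakness: any two slashes on the same line get paired up, so "1/2 and 3/4" would sprout italics in the middle. A real version would need a rule about what may sit next to the delimiters.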

How did I get both the markup and the demonstration text in the paragraph above? Simple: I cheated. The source display on the right lets you go back and play with things that are just too much for the line editor to handle.

Other features:

If the first character of a line is a pound sign (#), you get a header (H1) instead of a paragraph. Two pound signs (##) produce a second-level header (H2), "###" produces H3, etc. The series stops at H6, which is the smallest header in the HTML spec. You can also use numbers for the deeper headers: "#4" does the same thing as "####".
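The header rule is easy to sketch in a few lines of Python. The names header_level and render_line are made up for this example, and clamping seven or more pound signs down to H6 is my assumption about what "the series stops at H6" should mean in practice.

```python
import re
from typing import Optional, Tuple


def header_level(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, text) if the line starts with a header marker, else None."""
    if not line.startswith("#"):
        return None
    # Numeric shorthand: "#4 Title" means the same as "#### Title".
    shorthand = re.match(r"#([1-6])", line)
    if shorthand:
        return int(shorthand.group(1)), line[2:].strip()
    # Otherwise count the run of pound signs at the start of the line.
    hashes = len(line) - len(line.lstrip("#"))
    # Assumption: anything past six pound signs is clamped to H6.
    return min(hashes, 6), line[hashes:].strip()


def render_line(line: str) -> str:
    """Render one line as a header or a plain paragraph."""
    parsed = header_level(line)
    if parsed is None:
        return f"<p>{line}</p>"
    level, text = parsed
    return f"<h{level}>{text}</h{level}>"


print(render_line("#4 Other features"))
print(render_line("#### Other features"))
```

Both calls print the same H4 line, which is the whole point of the shorthand.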

If the first sequence of characters in a line is "&date;", the software automatically inserts a date stamp. Three dashes at the beginning of a line ('---') produce a horizontal rule.


The 'first characters' rule is serious: if you use pound signs, the &date; entity or the '---' structure anywhere else in a line, they show up as literal text. That includes starting a line with one or more spaces. The spaces won't show up in the output (HTML ignores leading whitespace anyway), but they'll keep the parser from treating any of the 'start with this' sequences specially.
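Here's a small Python sketch of that strictness for the date stamp and the rule. It is an illustration under assumptions, not the real parser: render_block is a hypothetical name, the stamp format leaves out the time zone, and dropping any text that follows '---' is a guess on my part.

```python
from datetime import datetime


def render_block(line: str, now: datetime) -> str:
    """Apply the start-of-line rules to one line (hypothetical helper)."""
    # startswith() is the whole trick: no stripping, no searching mid-line.
    if line.startswith("&date;"):
        stamp = now.strftime("%B %d, %Y %I:%M:%S %p")
        return f"<p>{stamp}{line[len('&date;'):]}</p>"
    if line.startswith("---"):
        # Assumption: anything typed after the dashes is discarded.
        return "<hr>"
    # Everything else, including lines with leading spaces, is a paragraph.
    return f"<p>{line}</p>"


when = datetime(2011, 5, 17, 12, 37, 5)
print(render_block("---", when))
print(render_block(" ---", when))
print(render_block("&date;", when))
```

The second call comes back as an ordinary paragraph with the dashes intact, because the leading space means the line no longer starts with the magic sequence.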

The code parses the line every time you enter a character, so you can see changes to the formatting happen as you type. The date stamp ticks along like a clock until you end that line and move on to the next.
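The reparse-on-every-keystroke idea fits in a tiny class. This Python sketch is an assumption about the shape of the thing, not the actual code: LineEditor and press are invented names, and the renderer is passed in so the example stays independent of any particular parser.

```python
from typing import Callable, List


class LineEditor:
    """Holds one editable line and re-renders it after every keystroke."""

    def __init__(self, render: Callable[[str], str]) -> None:
        self.render = render        # any function from raw text to HTML
        self.done: List[str] = []   # rendered HTML for finished lines
        self.current = ""           # raw text of the line being typed

    def press(self, ch: str) -> str:
        """Handle one character and return the HTML for the current line."""
        if ch == "\n":
            # Ending the line freezes its output; there is no going back.
            self.done.append(self.render(self.current))
            self.current = ""
            return ""
        if ch == "\b":
            self.current = self.current[:-1]   # backspace within the line
        else:
            self.current += ch
        # Reparse the whole line from scratch on every keystroke.
        return self.render(self.current)


def demo_render(text: str) -> str:
    # Stand-in renderer for the demo: "#" at the start means a header.
    return f"<h1>{text[1:]}</h1>" if text.startswith("#") else f"<p>{text}</p>"


editor = LineEditor(demo_render)
for key in "#Hi":
    print(editor.press(key))
```

Because the line is rendered fresh each time, a renderer that reads the clock would produce a new date stamp on every keystroke, and the stamp would stop changing once the line is committed.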

Discussion:

My goal is to get a piece of software that lets me create pages in situ, handles the boring-and-distracting mechanics as much as possible, but still limits the amount of fiddling I can do while trying to bash out the next page.

It isn't there yet. It still doesn't do links, lists, or images, and those are glaring holes.

To make matters worse, after writing 500 lines of code for a shift-reduce parser, I found an article on a different technique for finding structure in text, called "top-down priority parsing." It looks like an easier way to get the same effect, with more readable code along the way. That means I'll probably scrap the code for this version and start over from scratch on the new one.

That's how programming works, though. Often as not, you start a program not knowing exactly what you want, and feel your way to a solution as you go. By the time you have a really good idea of what you want the program to do, your implementation sucks and the best thing to do is start a fresh version using what you learned from the first one.

The hardest part of building the second version is knowing when to stop. The geek classic _The Mythical Man-Month_, by Fred Brooks, discusses a second-version software project that ended up being a billion-dollar fiasco. Brooks was the project's manager, and the book is a big-ass warning to keep other software developers from making the same mistakes in the future.

The software industry as a whole may not have learned the lesson, but I have, so the new parser for this project will have fewer features than this one, at least until I get comfortable with a stable version of the code.

In the meantime, here's the current version for the amusement of anyone who happens to be amused by such things.

Gallery: