WordPressing

The Thinker by Rodin

I hope to change my blogging software to WordPress within the next month. Currently I am using Movable Type to host this blog. I would like to say I am a loyal Movable Type customer, since I have used it for the nearly five years I have been hosting this blog. However, since its license allows me to use it for free for personal use, I never actually bought the software. That is not to say I have not given its owners, SixApart, some money. I needed to buy support when I re-hosted this year in order to make the dynamic publishing feature work. That cost me $50 and made me start wondering if I wanted to convert to WordPress, which serves all its content dynamically.

No, I stayed with Movable Type not necessarily out of loyalty, but mostly out of convenience. Just as after a certain point it is hard to move from Quicken to Microsoft Money because of the hassle of retraining, so it seemed easier to stay with Movable Type than work my way through the myriad issues associated with moving from one blogging solution to another.

Nonetheless, I am taking the plunge. I recently received an announcement from SixApart informing me that Movable Type 4.0 was ready. It extolled all its wonderful and latest features. Did I really want to upgrade and spend the considerable time learning how to use all these new features, particularly when I would not use most of them? Should I stay with Movable Type 3.3 until it gradually deteriorated into irrelevance? On the other hand, should I bite the bullet and move to what most of us non-commercial bloggers use today, which is WordPress?

Since yesterday was a holiday for me and being geeky looked more appealing than weeding the garden, I took the plunge. Installing WordPress, an open source blogging solution, turned out to be painless. I had to create a new MySQL database instance (easy enough to do in phpMyAdmin), copy the files over to my web server, edit a few settings in a configuration file, and then run the installation program. Installation time: about 15 minutes.

Next step: move 700 plus blog entries and 400 plus comments from Movable Type to WordPress. After digging through the WordPress documentation, I discovered I had to export my entries in Movable Type then import them into WordPress. It was relatively straightforward. Total time: another 15 minutes.

Unfortunately, by default all my WordPress posts will have URLs that are completely different from the URLs generated by Movable Type. All those search engine links would become obsolete, meaning that Google might unlist my blog again. This would not do, so I went hunting through the documentation to find out how to solve the problem. Blessedly, WordPress lets you create a customized path and post names for your entries. I could make the resulting URLs look just like those on Movable Type. Problem solved?

Not quite. There was a significant number of impedance issues resulting from blogging for five years with Movable Type. For one thing, until mid 2005, Movable Type limited entry name URLs to 15 characters. (Before that, entry names were numbers, like 000001.html.) Therefore, if your entry was titled “This is a long entry name” the resulting URL was “this_is_a_long.html”. If you wrote another entry with a similar name Movable Type would make sure you didn’t reuse the same name, so the next one was “this_is_a_long_1.html”. During one Movable Type upgrade, this limitation went away so I allowed entry name URLs to be up to 50 characters long, but this still left hundreds of entries where the entry name URL was truncated at 15 characters. In addition, WordPress puts dashes where blank spaces would be in your URLs. Movable Type substitutes underscores. I followed the helpful online advice but still had hundreds of mismatched URLs. Eventually, in frustration I wrote a little PHP script that identified the mismatched URLs. I also came up with a strategy for fixing discrepancies in the 15-character entry name URLs. Many of these entries had underscores in the last character that had to be fixed. Most of these could be fixed with one SQL statement.
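To make the mismatch concrete, here is a rough sketch in Python (not the PHP script I actually wrote) of the two naming schemes described above. The function names are made up for illustration, and the exact rules are assumptions: I assume punctuation is dropped, the name is cut at 15 characters, and a trailing underscore left by the cut is trimmed, which reproduces the "this_is_a_long" example.

```python
import re


def _words(title):
    # Assumption: lowercase the title and drop anything that is not a
    # letter, digit or whitespace before splitting into words.
    return re.sub(r"[^a-z0-9\s]", "", title.lower()).split()


def mt_basename(title, taken, limit=15):
    """Old Movable Type-style name: underscores, truncated, de-duplicated."""
    base = "_".join(_words(title))[:limit].rstrip("_")
    candidate, n = base, 1
    while candidate in taken:  # avoid reusing an existing entry name
        candidate = f"{base}_{n}"
        n += 1
    taken.add(candidate)
    return candidate


def wp_slug(title):
    """WordPress-style name: dashes where the blank spaces were, no truncation."""
    return "-".join(_words(title))


def find_mismatches(pairs):
    """Return the (old_name, new_name) pairs whose names differ."""
    return [(old, new) for old, new in pairs if old != new]


taken = set()
first = mt_basename("This is a long entry name", taken)
second = mt_basename("This is a long entry title", taken)
print(first, second, wp_slug("This is a long entry name"))
```

Running a comparison like find_mismatches over every entry is essentially how the problem URLs can be listed before deciding how to repair them.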

When I make the switch to WordPress, my individual entry and monthly archives should now match correctly. Category archives though are not so simple. Had they been stored under /category/archive_name it would be straightforward but I have them under /archive_name. I am still pondering how to solve this one. The most expeditious way seems to be to create symbolic links.

In addition, currently there is no way to move over my Movable Type entry tags. This is an issue that must be solved before I can migrate the blog. The good news is that by examining how tags are stored in Movable Type and WordPress, I think I have found a way to do it using SQL and PHP. I will be testing it when I have some spare time. Moving over the tags was not possible until I first had addressed the inconsistent URLs.

I know there are all sorts of other embedded URLs that will break and will need to be addressed. These include tag archive URLs, feed URLs and differences in the search interface. Then there is the look of the blog itself. There is no utility to move over Movable Type templates, so once rehosted in WordPress, this site will have to look a bit different. Fortunately, WordPress has hundreds of themes to choose from, and they are much easier to edit than Movable Type templates. My WordPress blog is a work in progress and can be viewed here. If you have any feedback on the look and feel, let me know.

Overall, WordPress is slick. The user interface is much more straightforward and feels more powerful than Movable Type. It is also wholly written in PHP. Movable Type started out as a Perl application, and in its current incarnation on my site, it is a mixture of Perl and PHP. However, I understand PHP and loathe Perl, and I know that PHP is highly scalable. I expect WordPress to be very fast and easier for me to customize with my own programs than Movable Type. WordPress, being open source, is unlikely to disappear. Both Movable Type and WordPress have plug-in architectures, but WordPress has a huge user community. Consequently, the selection of templates, plug-ins and widgets is much greater. Moreover, it is likely that upgrading WordPress will be much more straightforward and less of a hassle than upgrading Movable Type. Therefore, I am confident this project will pay off in the end.

I thought there would be more users who had moved over their blogs from Movable Type to WordPress. While there are clearly some, the online documentation was inadequate. Therefore, I have been contributing to improving the documentation by adding my experience in the WordPress Wiki. While wikis have been around for a while, I am still taken aback that I can make changes instantly to their official online documentation and no one bothers to review these changes.

So I expect things to look a bit different around here within a month. One thing will not change: I will continue to set high standards for myself for all the entries I place here.

Rehosted

This will be a short entry. My writing here lately has been constrained because (a) I have been busy at work, (b) having Google abandon my blog has made it more difficult to get inspired, (c) I have been busy doing phpBB modifications work for clients, and (d) I have been up to my armpits with rehosting issues.

Thankfully, the rehosting issue is finally solved. I went through a tedious process of moving over my two phpBB message boards (Oak Hill Virginia Online and The Potomac Tavern), but the last domain, this blog, proved daunting. With help from my friend Jim Goldbloom, calls to the tech support people here at westhost.com, and helpful users in their forums, plus a lot of the common sense troubleshooting skills acquired from being in this business 20 years, this blog is now rehosted too.

So hopefully I will feel a bit more inspired, Google will put me back in their index and clients will not need my services as much, so I will have more leisure time to get back to the sober and well crafted blogging I hope I do so well.

Thank you for your patience.

Spam Solutions for phpBB and MovableType

I was pleased to discover two real spam solutions for phpBB and MovableType recently.

phpBB is open source forum software. As you might expect, it is written in the PHP programming language, which is installed by default on virtually every UNIX or Linux based web server. I run a message board using phpBB as well as earn some spare change installing and writing modifications to this popular software. However, spam has been a real problem lately for phpBB message boards. Spammers have created software that automatically creates and registers phony users for phpBB message boards. Their software is clever enough to defeat the Visual Confirmation modification, which is now integrated into phpBB. (This modification shows a word embedded in an image that you have to enter into the registration form in order to register.) Once “registered” these spam robots sometimes post spam as topics on the message board. They always place pointers to spam sites in the “Interests” and “Home Page” fields of the Member List.

My workarounds to date have had limited success. That is until I found the Anti-bot Question Modification. This is a clever solution. It requires, as part of the registration process, that the user answer a question that only a human could answer. Since I have installed it, I have had zero spam registrations. (I used to get dozens a week.) One small problem is that the modification was written in German. The English translation is workable, however. Therefore, if you have spam and a phpBB forum then installing this modification should be a no-brainer. In the event that the spam robots learn how to defeat the standard questions, simply create your own. You can also change the name of the registration form variable that collects the answer to the question easily through the Administrator Control Panel, further adding complexity which will drive away spam robots.
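The idea behind the question check is simple enough to sketch. The following is only an illustration in Python, not the modification's own PHP code; the function name, the field name and the sample question are all made up. It shows why being able to rename the form variable and write your own question helps: a robot has to guess both.

```python
def passes_antibot(form, field_name, accepted_answers):
    """Return True when the submitted answer matches an accepted one.

    form             -- dict of submitted registration fields
    field_name       -- name of the answer field (hypothetical, changeable)
    accepted_answers -- answers a human would give to the question
    """
    answer = form.get(field_name, "").strip().lower()
    accepted = {a.strip().lower() for a in accepted_answers}
    return bool(answer) and answer in accepted


# Hypothetical question: "What color is the sky on a clear day?"
human = {"username": "alice", "q_7f3a": " Blue "}
robot = {"username": "cheap-pills", "antibot_answer": "http://spam.example"}

print(passes_antibot(human, "q_7f3a", ["blue"]))  # human answered the question
print(passes_antibot(robot, "q_7f3a", ["blue"]))  # robot used the wrong field
```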

MovableType is the software I use to run this blog. With MovableType, the problem has been comment spam. The solution I found is mt-keystrokes. It uses Javascript to infer that a human entered information into a comment field. When a user types information into the comment text field, it triggers a Javascript event. This in turn causes the value of a hidden field posted with the form to change. This plug-in then has to check for the correct value in this field. If it has not changed, it assumes the form was submitted by a robot and is consequently spam. Otherwise, it assumes a human entered the comment. So far, it has worked flawlessly. As a result, my Junk Comments folder has been gloriously empty. There is no reason to sift through it looking for that one comment that might be legitimate. However, I was unable to get it to work correctly unless I used the form variable they provided. Consequently, this solution may be a temporary balm.
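For the curious, the server side of this trick can be sketched in a few lines. This is an illustration in Python under my own assumptions, not the plug-in's actual code: the field name and both values are invented, since the real ones are whatever the plug-in ships with.

```python
# Hypothetical names and values, chosen only for this sketch.
HIDDEN_FIELD = "keystroke_flag"
INITIAL_VALUE = "0"  # value the hidden field has when the page loads
TYPED_VALUE = "1"    # value the Javascript event sets once someone types


def looks_human(form):
    """True only if the hidden field was changed to the expected value."""
    return form.get(HIDDEN_FIELD, INITIAL_VALUE) == TYPED_VALUE


def route_comment(form):
    """Send the comment to be published or to the junk folder."""
    return "publish" if looks_human(form) else "junk"


print(route_comment({"text": "Nice entry!", HIDDEN_FIELD: TYPED_VALUE}))
print(route_comment({"text": "Buy now!!!", HIDDEN_FIELD: INITIAL_VALUE}))
```

The weakness is visible in the sketch too: once a robot learns the field name and the expected value, it can simply post them, which is why a fixed form variable makes this a temporary fix.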

Now if only I could permanently banish email spam from my life. I have found a combination of solutions, but nothing that guarantees me that I will not miss a legitimate message or two. I strongly suspect the whole email architecture of the Internet will have to change before that problem is solved.

FastSearch: a faster search for MovableType

For reasons I have not been able to wholly ascertain, the search functionality on this blog keeps timing out. Part of it is due to the 15 second timeout, which I discussed recently, that is inherent in the Apache web server I am using. Even so, with 604 entries, 15 seconds should be plenty to locate relevant entries, so I suspect some inefficient coding by the folks at SixApart.

Fortunately, with a little Googling I found this nice hack for MovableType called FastSearch. It makes searching for entries with MovableType very swift again. Enjoy.

Tag! I’m it!

Hosting this blog is not without cost. The direct hosting costs are largely trivial. There is a cost to time, however. It was that investment that made me procrastinate upgrading my blog software (MovableType) from version 3.2 to version 3.3.

Shortly after the new version came out, I did make an attempt to upgrade. As usual, it failed to install for mysterious reasons, and I did not want to dig into their code or sift through their forums to solve the problem. Last weekend, in an attempt to clear out my inbox, I made one more attempt. After an hour or so of scratching my head I figured out the problem. Suddenly I was running MovableType 3.33.

There are many new features in this upgrade. The most useful is the ability to tag my blog entries. To tag, I associate relevant descriptive words with a blog entry. These words are indexed. Once indexed, they allow readers like you to find relevant information across my blog.

Tags are a recent phenomenon, but ideas like it have been tried before. For example, MovableType already supports HTML keywords. I never use them, mainly because search engines ignore them; consequently, they are of no use to my readers. MovableType also supports categorization. I always categorize blog entries. For example, this entry will go into the Technology Category. Categorization though is too broad. Tags are an attempt to allow more refinement. For example, if I discuss Hillary Clinton it will likely end up in one of the Politics categories. However, if I tag “Hillary Clinton” to a blog entry, people who come to my blog who want to read everything I consider relevant about Hillary Clinton can search for entries where I have tagged her name.

There is one problem implementing tags: I have been blogging for nearly four years and none of my entries had been tagged. This means that I have to go back, reread all 580 or so blog entries and enter appropriate tags for each one. The alternative was just to start tagging and forget about tagging previous entries.

The latter idea was attractive, but of course I opted for the former. This is a blog of essays, which means it is primarily a blog of ideas and ordered thoughts. Tags are a natural fit for my kind of blog. So of course I have been busy tagging my entries during my limited spare time.

Tagging remains a work in progress, but you can see the results. Next to the search form on the left panel I added a link that will take you to my tag index. Click on a tag link and you can easily read all blog entries for that particular tag.
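If you are wondering what a tag index amounts to, it is just a lookup from each tag to the entries that carry it. Here is a small sketch in Python; it is my own illustration, not how MovableType stores tags, and the entry titles and tags are made up.

```python
from collections import defaultdict


def build_tag_index(entries):
    """Map each tag (lowercased) to the titles of the entries tagged with it."""
    index = defaultdict(list)
    for title, tags in entries:
        for tag in tags:
            index[tag.strip().lower()].append(title)
    return dict(index)


# Made-up entries for illustration.
entries = [
    ("Review: Million Dollar Baby", ["Movies", "Clint Eastwood"]),
    ("Another movie review", ["Movies"]),
    ("Tag! I'm it!", ["Tagging", "MovableType"]),
]

index = build_tag_index(entries)
print(index["movies"])  # every entry tagged Movies, in the order entered
```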

I discovered two issues with tagging that might not be obvious. One is that there are no real criteria for proper tagging. Tagging seems to have been invented in response to the inherent difficulty of finding relevant content on the web. Tagging lets you sort of, but not completely, do this job. This is because it relies on the person who is tagging to decide which tags are relevant. Most of us doing the tagging have no training in taxonomy.

Another issue is that it is hard to know which tags to use. Assign too specific a tag to an entry and sifting through tags becomes a real problem. Take for example, a movie review. For my entry on Million Dollar Baby, should I tag Clint Eastwood? What about Hilary Swank? There are no limits to the number of tags that I can assign to an entry. How many tags for an entry constitute too many? Another approach is to limit the entry to the single tag of Movies. Where does one draw the line? How am I supposed to know what tags others will consider to be relevant?

I see tagging as a necessary bump on the road toward a truly semantic web. Although Windows 95 was a vast improvement over Windows 3.1, it was still flaky, buggy and confusing, just less so than Windows 3.1. Yet it was a necessary step in the evolution of Windows and it still ran programs written for Windows 3.1. Similarly, HTML is not going to go away, but it does slowly evolve over time. Perhaps tagging entries is the next logical step toward finding relevant content on the web.

Tagging will definitely help me find entries on a particular topic. But will it help the casual web surfer? Only time will tell. It is unlikely, though, that we think in similar ways. Consequently, I could choose different terms to categorize my entries than you would. What I do know is that it is time consuming to go and tag over five hundred entries. So far I have completed 2005 to the present. I have about sixty percent of my blog entries left to tag.

There are other new features in MovableType 3.3 that I will likely use in time. Drag and drop templates and widgets are now available. Actually templates have been around in MovableType for a long time, but you still had to know HTML and study the MovableType template tag library to be creative with them. In other words, you had to be a bit of a geek. The drag and drop interface should make it easier for me to maintain the presentation of this site without writing HTML.

One thing is worse as a result of upgrading: comment spam. I don’t know why but spam that used to get sent directly to my junk folder now makes it as a comment for review. I have changed the spam threshold, but so far it has made little difference.

I hope you find the tagging feature useful.

The joy of coding

I’m a software engineer and a project manager so I don’t do much in the way of coding software anymore. In truth most code writing and testing isn’t that much fun. I was kind of glad to be led out of the programming hole I was stuck in some ten years back. I realized I was writing the same code over and over again. It was getting boring. How many times can you code variations on the same do/while loop without pulling your hair out? It was better to give the work to some programmer grunts and work at a higher level of abstraction. Project management pays better anyhow and college tuitions will be coming due in a few years.

Programmers may dispute this assessment, but they are the blue collar people of the information age. We coders are software mechanics, really. At some point I was led out of the software garage and into the manager’s office because others thought I had bigger fish to fry. I try to keep a toe or two back in the garage though. It feels more real than project management. Programming feels tangible and something I can take to the bank. Being a project manager feels ephemeral. I’m not sure I will have enough work to keep me busy a year from now. But I can always hang out my sign “Will code for food” if need be. I doubt “Will manage projects for food” will have the same marketing appeal. So I try, but don’t always succeed, to keep up my programming skills. This is a market that moves very quickly. I’ve done some programming in the Java language, for example, but need to do a lot more. I won’t be asked to code Java servlets in my job, however. I may need to assign people to do the work for me, though.

I took up teaching web page design partially to force myself to keep up with new technology. It worked: I can now create validated XHTML, can usually write cascading style sheets without consulting a reference manual, can code cross browser Javascript and have a good working knowledge of some hot server side scripting languages like PHP and ASP.

This blog is one place I practice. The underlying software is Movable Type, which is written in a programming language called Perl. If necessary I can go in and tweak the code, but so far it hasn’t been necessary. Setting up this place was pretty straightforward. Fortunately I also get to play with the PHP server scripting language on my forum, The Potomac Tavern.

My forum is based on open source bulletin board software written in PHP called phpBB. About the time I installed it I also ordered some manuals so I could learn to write PHP. phpBB also requires a database. A database called MySQL comes free from my web host so I used that and ordered a book on MySQL. The combination of the server operating system (Linux), PHP and MySQL is a zero cost option for creating extremely robust and reliable web based systems. And it turns out you don’t have to be a programming guru to do serious stuff in this environment. Much like those at the start of the PC revolution who put together HeathKit personal computers in their garages, the hobbyist with decent understanding of programming languages can do it themselves and have some fun. No need to work on a car in your garage anymore for amusement. Program some scripts for the web instead!

A lot of programming is boring for me because it doesn’t mean that much. I’ve done a lot of patching and upgrading of systems written by others in my career, and it’s definitely not that interesting. It’s necessary work, just like the mechanic who has to replace your muffler, but it is boring. Most programmers would like to write something original and all their own. It gives them a feeling of ownership and that they have created something meaningful. Unfortunately, unless you do it for your own amusement, such experiences tend to be few and far between. Sadly, much of this work can be outsourced to India instead of keeping Americans gainfully employed as programmers.

So it’s a joy to find such a coding project recently that was both creative for me and actually useful for a large number of people. Back in May I was looking at the phpBB forum software and thinking “Why can’t it have digests? It works for Yahoo! Groups!” I frankly expected someone to have done it before but no one had. So I began work on a “mod” or “modification” to the official blessed phpBB software. With my modification you don’t get sent every email to your group, as happens with Yahoo! Groups. Rather, this software allows you to fine tune the digest you get to pick particular forums of interest, and to set a fairly wide variety of options. It is customized for you. It was a great mod that I installed on my own forum. I learned a lot about the phpBB architecture and how to write good PHP code in the process. Eventually I packaged up the whole thing in a ZIP file and posted it on the phpBB web site. I figured it would get people excited.
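The core of the digest idea fits in a few lines. What follows is only a sketch in Python of the selection step, not the PHP in my modification; the function name and the shape of a post (a dict with forum, subject and posted keys) are assumptions made for illustration.

```python
from datetime import datetime


def build_digest(posts, subscribed_forums, since):
    """Group the subjects of new posts by forum, for one subscriber.

    posts             -- list of dicts with "forum", "subject" and "posted" keys
    subscribed_forums -- set of forum names the subscriber picked
    since             -- datetime of the subscriber's previous digest
    """
    digest = {}
    for post in sorted(posts, key=lambda p: p["posted"]):
        if post["forum"] in subscribed_forums and post["posted"] > since:
            digest.setdefault(post["forum"], []).append(post["subject"])
    return digest


# Made-up posts for illustration.
posts = [
    {"forum": "Politics", "subject": "Election talk", "posted": datetime(2003, 8, 2, 9, 0)},
    {"forum": "Movies", "subject": "Weekend picks", "posted": datetime(2003, 8, 2, 10, 0)},
    {"forum": "Politics", "subject": "Old thread", "posted": datetime(2003, 7, 1, 8, 0)},
]

print(build_digest(posts, {"Politics"}, since=datetime(2003, 8, 1)))
```

The real work is in everything around this step: storing each member's options and formatting and mailing the result.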

But it didn’t. It just sat there and got ignored. I didn’t understand it because it was a great idea. But I guess its time hadn’t come then. A week or two back I started getting inquiries about my modification. Is it going to be finished? Will it be submitted as an official phpBB modification?

Its time has come. Now it has garnered a lot of interest and my spare time has been kept increasingly busy making more modifications to it and getting feedback from the developer community. Shortly it will be submitted as an official modification and when it shows up on the list of approved phpBB software modifications, as I hope it will, I suspect it will be pretty popular.

No, there is no money in this work. When building on top of an open source platform you just give it away. But there is a thrill and pride of ownership not only in writing some very cool and efficient code optimized for this phpBB software, but also in garnering some fleeting low level fame among this community of people. These people are appreciative of my work. It is not only a needed enhancement to phpBB, but from the feedback I am getting it is also very well designed and thought out.

And that makes me feel happy and gives me a tangible feeling of accomplishment. Some people are jumping the gun and won’t wait for the final release. One guy from Brazil has been writing me with questions. I’ve been helping him out. When I took a look at his site though I realized that I was really helping out … a low level pornographer!

Well, why am I not surprised? Who were the pioneers on the internet? Not Bill Gates, that’s for sure. No, it was the smut merchants who first figured out how to turn a profit from the internet. If a pornographer or two finds a way to use my software modification to push down adult content to some horny end users looking for some cheap thrills, that’s part of the deal. I’m sure it will find more legitimate uses in time.

It’s still a damn fine set of code. And I’m glad to know I still got the right stuff.