Leedberg.com

The online home for Greg Leedberg, since 1995.

Monday, May 29, 2006

No More Word Files!!

No, I'm not declaring that the Microsoft Word file format is dead, or soon to be so. Rather, I'm calling for people to realize why it should be dead (or at least marginalized!).

First, let's solidify what this argument is about. Microsoft Word is a word processor, developed by Microsoft. Over the years, it has become the most-used word processor around the world. Interestingly, it's quite expensive -- Microsoft Office Standard (which includes Word plus a handful of other less-used applications) costs in the neighborhood of $130.

Many arguments have been made against Word on the basis that it is closed-source, while there are free, open source alternatives out there, such as OpenOffice, KOffice, and AbiWord. I'm not here to make this argument. I think that corporations have a right to make money off of their products if they want to (and that other people have the right to make their software available for free, if they want to). Likewise, if someone thinks a product is worth what it costs, they have every right to pay for it. And if having the source code isn't important to them, so be it.

My argument is against the file format Word uses. Whenever you type up a document in Word, you likely save it such that it has a ".doc" extension. This signifies that it is stored in the Word file format. The Word file format is what's considered a "closed" file format. It's binary, so a human wouldn't be able to look at the raw contents and understand them. Worse yet, only Microsoft fully understands the format, and they don't release the specifications of that format. So, only products made by Microsoft can (in theory) fully and reliably read and write Word files. Contrast this with an open file format. Generally, an open file format's raw contents are human-readable, so it's easy to figure out what's going on in the file. Most importantly, the specification for the format is documented and publicly available, so that anybody is free to make a program that can read and write the format.

Ignoring the specific case of Microsoft Word, there are lots of problems with closed (or "proprietary") file formats in general. Most obviously, they lock you into a particular vendor's products. This means if you use Microsoft Word to create a document, to ensure full compatibility you will always need Microsoft Word in order to read that file. There are some free projects that have attempted to reverse-engineer the Word format, but none of them are 100% accurate. This hurts you in the present, since it means that any computers you own will have to have Microsoft products on them in order for you to carry your work between the computers. More frighteningly, this introduces lots of possible problems in the future. You will basically need a copy of Microsoft Word forever in order to continue to read your files. What if Microsoft goes out of business? Stops making Word? Stops making Word for the particular operating system you are currently using, forcing you to upgrade unwillingly? By creating this lock-in, the closed format decreases competition, as people are less likely to use a competing product if all of their existing files will be unreadable. This is true for any product that uses proprietary formats.

Also, using closed formats is a hindrance to open communication. If you want to type up a document in a closed format such as Word, and send it to someone, they have to have Word as well. This turns the closed file format almost into a "virus" of sorts -- it keeps spreading as people find a need to communicate with someone who already has it. If the person you want to send the file to doesn't have Word, you won't be able to share your information with them.

The above reasons are general arguments against all closed file formats, and they all apply to the Microsoft Word file format. But of course, the Word format has several of its own particular downsides. For one, if you are forced to upgrade to the newest version of Word in order to read your old files, you may very well find that the new version of Word can't actually read your old files. Even though Microsoft has the specifications to this closed format, it has a notorious reputation for somehow making it so that new versions of Word have problems reading certain older files. And of course since only Microsoft has the specification to this format, you're out of luck if you want to try and find some other program to use.

Also, it is a problem that Word is not cross-platform -- it is only available to people who run Microsoft Windows or Apple's operating systems. So, if you want to send a file to someone using some other operating system, such as Linux or BSD, there is simply no way for them to acquire a copy of Microsoft Word, and you are completely blocked from communicating with them. On this same train of thought, you have to keep in mind that Microsoft Office is a very, very expensive program to purchase. As I said above, it costs $130 just for the most basic functionality. It is entirely possible that this is more than some people can afford, or is more than some people think Office is worth. It's not at all clear to me why someone would assume that their peer has purchased a program that costs this much money. Sending someone Word files may be putting pressure on them to spend the money for Office -- money they may not be able to spare.

So what's the solution? Clearly, the point I'm getting to is that we should try to use open formats rather than closed formats. Currently, the best example of an open format for word processing is OpenDocument. OpenDocument is an open, XML-based file specification that was developed by a committee of interested organizations. It incorporates the vast majority of word processing features that existing products such as Word offer. However, the specification is completely open, and anybody can produce a product that can read and write it. Several already do, most notably OpenOffice. It is expected that in the future there will be a plugin for Word that will allow it to use this format, and eventually it's likely that Word will even natively support it.
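To make "anybody can read and write it" concrete, here is a minimal Python sketch using only the standard library. It relies on the fact that an OpenDocument text file is a ZIP package whose main text lives in an XML file named content.xml. The helper names (make_sample_odt, read_paragraphs) are made up for this illustration, and the sample package is deliberately stripped down (it omits the manifest and styles that a real word processor would write), so treat it as a sketch of the idea rather than a complete OpenDocument writer.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespaces defined by the OpenDocument specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_sample_odt(paragraphs):
    """Build a tiny OpenDocument-style package in memory.

    Hypothetical helper for illustration only: it skips the manifest
    and styles, so it is not a complete, valid .odt file.
    """
    body = "".join(f"<text:p>{escape(p)}</text:p>" for p in paragraphs)
    content = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        f"<office:body><office:text>{body}</office:text></office:body>"
        "</office:document-content>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("content.xml", content)
    return buf.getvalue()


def read_paragraphs(odt_bytes):
    """Return the plain text of each paragraph found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


if __name__ == "__main__":
    sample = make_sample_odt(["No more Word files!", "Open formats last."])
    for line in read_paragraphs(sample):
        print(line)
```

The reading half needs nothing beyond a ZIP reader and an XML parser, both of which exist for practically every platform and language. That is the practical payoff of a documented, text-based format.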

OpenDocument is probably the best open format for word processing currently, but even if you don't use OpenDocument right now, you should at least use something more open than Word (especially when you send a file to someone). Formats such as RTF, PDF, and HTML are relatively well-understood and/or open, and have both free and commercial readers and writers available for most operating systems. Coincidentally, both RTF and HTML are natively able to be read and written from within Word.

In conclusion, I think that the success of the Word file format is one of the worst things to happen to the computer industry -- ever. It's pretty bad for storing your own personal files, but it's especially bad for cases where you want to share your files with other people -- closed formats simply weren't designed for this. If you need to send a file to someone, please, please, don't send them a Word file. Convert it to something more open. And even if you store your everyday documents in Word format, consider saving your most important documents in a format that you know will still be accessible 10 years from now.

Of course, ideally, you should just use OpenDocument for everything.
