Author Archives: dleppik

Windows is less secure than Linux

Over the years, there have been countless arguments and counter-arguments about Linux security versus MS Windows security. Those of us who have to deal with computer security know that Linux (and other Unix-based OSes) makes it possible to maintain a secure system, whereas Windows is encumbered with a rich legacy of bad design decisions. Unfortunately, there are enough knee-jerk partisans and opinion-for-hire analysts to keep the water perpetually muddy.

These charts show just how much more needlessly complex Windows is, and by extension, how much harder it is to secure. They also help to explain why Windows Vista missed its deadlines by several years. This is why Linux (and Mac OS X, which is similar) get cool new features year after year, whereas every major Windows version is a deadline-breaking fiasco.

Struts/JSF update: why don’t I open source it?

The obvious rebuttal to the previous Struts/JSF post is: “If you’re so smart, why don’t you open source your solution and show the whole world your bright ideas?”

And perhaps I should. But open sourcing is significantly more work than just writing an in-house project. For one thing, you have to keep from having in-house dependencies. That part’s relatively easy. Harder is finalizing an API which is generic enough to be generally useful, but not overblown.
For example, I don’t have a use for multilingual error messages, but you cut yourself off from most of the world if you have an English-only attitude. One of the annoying things about Struts is that it requires internationalization of everything– you can’t just give it text, it requires the name of a text field where it can look in a language-specific file.

The best way to get a good API is to start out with a minimally useful API, throw it out to a bunch of testers, and evolve it based on actual needs at a variety of places. Which means you need to get interest in it and foster a community of developers. And that takes time and effort and a willingness to toot one’s own horn.

At the same time, I’ve found myself reinventing the Struts wheel here and there. Particularly for error handling. Many of the design decisions they made were good. Still, I’ve written only a tiny amount of code, and it’s about as full-featured as I need. Much of the value in a good framework is in the design patterns it enforces; the code itself is rarely particularly clever.

The obligatory iPhone post: I love my Treo

I’m a big Mac fan. I’ve been into Macs since my dad bought the original 128K Mac. I regularly read Mac news sites, especially John Gruber’s excellent Daring Fireball. And yet, despite all the hype, I don’t see an iPhone in my future– even if the price comes down.

The fact of the matter is that I love my Treo. And I love it for many of the things that Apple is rejecting. It has great physical buttons; I can play Astraware Sudoku comfortably with one hand while holding a sleeping baby in the other hand– at times without looking at the screen. I have appointments in my calendar dating back to 1997 and recipes dating back nearly as far. If I really want to geek out, I can write little Lisp programs with it. And it plays a text adventure written by my friend Seebs. Last week I bought a Spanish dictionary for it. The iPhone, on the other hand, will only have limited capabilities to run third-party software. How limited, Apple won’t say.

The fact of the matter is that the cell phone market isn’t like the PC market or the pre-iPod music player market. Apple isn’t the only company in the world that does good design work. Apple has been able to excel by not making the sorts of mistakes that technology companies often make.

The main mistake is to think of features only. Features before function and features before fashion leads to features without fun. If getting to a feature is annoying or frustrating, people won’t use it. Apple’s signature is to simplify a design until it looks more like a slab of plastic or metal than a device with buttons. And they also like to shrink a device to unimaginable proportions.

In fact, that same design aesthetic is what launched the Palm Pilot. Apple’s own product, the Newton, was a technological tour-de-force with an incredibly sophisticated operating system. But it was slightly too big (the size of a paperback) and the handwriting recognition required legible handwriting. The Palm Pilot, like the original iPod, was lambasted by critics for being too limited. It had no printer port and rather than recognizing handwriting, it had a funny language for entering text. But it took over the handheld market because it had a long battery life, synchronized well with computers, felt good in one hand, and–most crucial– fit in a shirt pocket.

The cell phone market is one which trades in Apple’s strengths: highly integrated software and hardware, design that makes a fashion statement, and top-notch usability. Microsoft couldn’t penetrate the market for the longest time because they forgot what Nokia and Palm never do: that the phone must work flawlessly as a phone, first and foremost.

So why is it that, after Steve Jobs’s demo, everyone is complaining about how hard cell phones are to use? (Strangest to me is the claim that it’s too hard to insert phone numbers, since just about every cell phone will offer to add a number after every call, sometimes including the caller ID info.) Part of the answer is wishful thinking: typing on these things isn’t as nice as using a full-sized keyboard, and we can hope (despite early reports to the contrary) that Apple’s solution is better. And every new technology sounds better before you try it. But Apple does have an advantage that Palm and Nokia don’t. Integration.

One of the cool features of the iPhone is voice mail as a graphical application. That’s something that every cell phone could do and should do. And don’t think that the manufacturers haven’t thought of it. The problem is that they need buy-in from the cellular providers. And the cellular providers don’t care enough to handle the technical hurdles. From the demo, it’s not clear if the voice mail goes over the cell network, the messaging network, or the Internet. If it’s the Internet, you’re dead in the water if you don’t buy that additional service. I don’t have Internet service on my Treo, and I’m glad I don’t need it.

There are lots of chicken-and-egg issues in the cell phone business. Cell phone makers don’t want to sell to only one provider, so they don’t work with one provider to add special services. Meanwhile, providers in the US often disable features they don’t like. Apple, being a newcomer with a huge amount of clout, was able to cut a deal with Cingular to provide a phone with all sorts of integrated services.

A second issue, specific to Palm, is the Palm OS. Once it was a paragon of simplicity. Now it’s under-capable. Not just that, but Palm, trying to be the next Microsoft, split into a software company and a hardware company. Long story short, Palm (the hardware company) hasn’t always had control over Palm OS. And the latest phones need PC-style multitasking capabilities. Palm isn’t there yet. Manufacturers which use Linux are there. (If you strip away the desktop user interface, it takes a hard core geek to tell Mac OS from Linux. On a cell phone, Mac OS has no particular advantage.)

So the iPhone has two distinct advantages: no legacy, and tight integration with Cingular. In terms of making a cell phone with an easy-to-use and flashy interface, those are big advantages. But seeing as Verizon has better customer service and better coverage around here than Cingular, and seeing as I’m enjoying my ability to use my old Palm apps (and would hate to lose my recipes!) I’ll be sticking with Palm for the foreseeable future.

I expect that the iPhone will shake up the industry in a good way. Expect to see graphical voice mail everywhere in a few years. I just hope Palm weathers the competition (which is not just from Apple right now) and comes back fighting.

Why Java Server Faces gets it wrong

Several years ago, Sun came up with a technology called Enterprise Java Beans (EJB). Because of their marketing clout (they own the Java language), for a while you could hardly get a job as a Java programmer without EJB on your resume. I know; I was job hunting at the time. The problem was, rather than making your life easier, EJB made you describe, in triplicate, the data you were trying to pass around. It took someone writing a book about how EJB is a bloated mess for developers to come to their senses and revolt.

A programming framework is supposed to make your life as a programmer easier. If you find it’s making your life harder, you shouldn’t use it. In fact, it should be more than a little helpful, since it must overcome the cost of learning the framework, installing it, and staying on top of its bugs and security holes.

Perhaps the best rule of thumb is the DRY principle: Don’t Repeat Yourself. EJB was a terrible violation of DRY. Not only did you have to write three classes to define one “bean” of data, you also had to run an extra program to generate additional classes and description files.

About the same time that EJB was all the rage, I learned about a new framework called Struts from a job interview. Struts is supposed to make web forms easier to write, and it has a lot going for it. These days, Sun has cloned Struts with something called Java Server Faces, or JSF. Struts got it wrong, although they eventually started to improve. JSF is just as wrong.
Struts handles data entered from web forms, keeps track of data validation problems, and perhaps most important, refills the form so that every little error doesn’t force the user to re-enter everything. This sort of stuff is a real pain to program, but very important. It’s not uncommon to have a form online that asks for your name, address, email address, phone number, and a zillion other details, all of which must be valid before you can proceed. If every time you forget to put in a required field you have to re-enter everything, you’re likely to give up pretty quickly.

My honeymoon with Struts ended when I realized that, to keep track of what was happening on my web pages, I had to keep track of Struts, JSP (the Java technology Struts lives on), Servlets (the technology underlying JSP), Tomcat (the program which runs the servlets), and all of the HTML/HTTP stuff that comprises the web. And a bug on your web page could be caused by any of these, or by a weird interaction between any of them.

Struts violates the DRY principle. You start out with a web form. You need a Java class (a “bean”) which describes the data in the form. If the bean isn’t the same as the Java class you ultimately want to populate (and it usually isn’t), you’ve just repeated yourself three times: the form, the bean, and the target class. And on top of that, you need to write an XML file which maps between the name of the form bean Java class and the name of the form bean that you use on your web pages.

The thing you want to do from a DRY perspective is to write one Java class, and have a little JSP tag that says “here’s my data, generate a form.” In reality, that’s usually not what you want, since your Java class may have lots of internal-use-only fields that you don’t want just anyone to modify. And a particular web page often has a particular look that an auto-generated form wouldn’t match. So it is necessary to repeat yourself, if only to mention which fields go where on the web page.

I ultimately gave up on Struts because VocaLabs does a lot of surveys. Every survey has a different set of questions, so you really do need the entire form to be auto-generated. Struts ultimately introduced something called a DynaActionForm, which allows you to define your form data in the XML file, rather than writing custom code. Even so, the fields are fixed, so it wouldn’t work for surveys. As far as I know Java Server Faces still doesn’t have this feature.

So today I decided to give Java Server Faces a look, since I’m working on something similar to the problem JSF is supposed to solve. Earlier this month I finished writing an internal utility which allows us to edit all the data in our database through web forms. It was surprisingly easy, considering we have 72 tables spread across two databases. The utility lets us avoid editing the database directly, thus enforcing the rules that are defined in our Java classes. And every field in the web form is cross-referenced against documentation gleaned from our source code, so when our system administrator (or–worse– our CEO) pokes at things, he can see the trouble he’s getting into.

Today I’m writing a web form so that our clients can change their account information. I can’t just give them access to our internal utility, but I’d like to leverage the work I did on it. JSF, as I mentioned, is completely the wrong tool. And I realized what the right tool is.

The right approach

To fill in a web form, Java already has some conversion routines to, for example, turn the string “123” into an integer. What I’m working on builds on those conversion routines. To put it simply, I work on a field-by-field basis, not a form-by-form basis. Each conversion class generates the HTML for its part of the form, and knows how to convert that back into the appropriate data type. The conversion classes are easy to write, so that special cases are easy to handle. When I write the code to generate the web page, I tell it where the data comes from, and it does all the rest. Rather than having special “beans” to remember what the web form looks like, when the form is submitted it works straight from the HTTP request.
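
To make the field-by-field idea concrete, here is a minimal sketch. It is not the in-house code: the names FieldConverter, IntegerField and FieldConverterDemo are made up for illustration, and a plain Map of parameter names to values stands in for the real HTTP request.

```java
import java.util.Map;

/** Hypothetical per-field converter: renders its own HTML and parses its own input. */
interface FieldConverter<T> {
    /** HTML for this one field, pre-filled with the current value. */
    String toHtml(String name, T value);

    /** Converts the submitted string back to the target type; throws on bad input. */
    T fromRequest(String name, Map<String, String> params);
}

/** Example converter for whole-number fields (an assumed name, not a library class). */
class IntegerField implements FieldConverter<Integer> {
    @Override
    public String toHtml(String name, Integer value) {
        String shown = (value == null) ? "" : value.toString();
        // A real version would HTML-escape the field name; this sketch skips that.
        return "<input type=\"text\" name=\"" + name + "\" value=\"" + shown + "\">";
    }

    @Override
    public Integer fromRequest(String name, Map<String, String> params) {
        String raw = params.get(name);
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException(name + " is required");
        }
        try {
            // Builds on the conversion routine Java already provides.
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(name + " must be a whole number", e);
        }
    }
}

class FieldConverterDemo {
    public static void main(String[] args) {
        IntegerField age = new IntegerField();
        // Generate the field, then parse a (simulated) submission of it.
        System.out.println(age.toHtml("age", 42));
        System.out.println(age.fromRequest("age", Map.of("age", "123")));
    }
}
```

A special case, such as a date picker or a survey question, would just be another small class implementing the same interface, with no form-wide bean to keep in sync.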

Remarkably, writing my own custom framework that does the right thing should take less time than reading the documentation for JSF.

Romance vs. Everything You Need to Know About Artificial Intelligence

This is a follow-up to my last post.  In it, I was talking about how fascinating Artificial Life is.  Genetic algorithms (trying to do something useful with Artificial Life) and neural networks may be among the most overused artificial intelligence (AI) algorithms for the simple reason that they’re romantic.  The former conjures up the notion that we are recreating life itself, evolving in the computer from primordial soup to something incredibly advanced.  The latter suggests an artificial brain, something that might think just like we do.

When you’re thinking about AI from a purely recreational standpoint, that’s just fine.  (Indeed, I have occasionally been accused of ruining the fun of a certain board game by pointing out that it is not about building settlements and cities, but simply an exercise in efficient resource allocation.)

But lest you get seduced by one claim or another, here are the three things you need to know about artificial intelligence.

First, knowledge is a simple function of alleged facts and certainty (or uncertainty) about those facts.  Thus, for any circumstance the right decision (from a mathematical standpoint) is a straightforward calculation based on the facts.  This is simply the union of Aristotle’s logic with probability.  Elephants are grey.  Ruth is an elephant.  Therefore Ruth is grey.  Or, with uncertainty:  elephants are grey 80% of the time.  Ruth is probably (90%) an elephant.  Therefore there is a (0.80*0.90=) 72% chance that Ruth is grey.  If you have your facts straight and know the confidence in them– something you can learn from a statistics class– you can do as well as any intelligence, natural or artificial.
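
The Ruth arithmetic can be written out as a small calculation. This is only a sketch with made-up names (RuthIsGrey, probabilityGrey). One assumption is worth stating loudly: the 72% figure treats a non-elephant Ruth as never grey. If non-elephants are sometimes grey, that term adds to the total, so 72% is really a lower bound.

```java
class RuthIsGrey {
    /**
     * Law of total probability:
     * P(grey) = P(elephant) * P(grey | elephant) + P(not elephant) * P(grey | not elephant).
     */
    static double probabilityGrey(double pElephant, double pGreyIfElephant, double pGreyIfNot) {
        return pElephant * pGreyIfElephant + (1.0 - pElephant) * pGreyIfNot;
    }

    public static void main(String[] args) {
        // Assumption from the text's arithmetic: a non-elephant Ruth is never grey.
        System.out.println(probabilityGrey(0.90, 0.80, 0.0));
        // Illustrative only: if half of all non-elephants were grey, the chance goes up.
        System.out.println(probabilityGrey(0.90, 0.80, 0.5));
    }
}
```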

Think about this for a moment.  For just about every situation, one of three cases applies.  In the first case, there isn’t enough data to be sure of anything, so any opinion is a guess.  How popular a movie will be when almost nobody has seen it falls into this category.   Second, there is enough data that one possibility is likely, but not certain.  (The same movie, once it has opened to a small audience, and opinions were mixed.)  And third, evidence is overwhelming in one direction.  But in none of these cases will a super smart computer (or human analyst) be able to do better on average than anyone else doing the same calculations with the same data.  Yet we tend to treat prognosticators with lucky guesses as extra smart.

Which leads us to the second thing you need to know about AI:  computers are almost never smarter than expert, motivated humans.  They may be able to sort through more facts more quickly, but humans are exceptionally good at recognizing patterns.  In my experience, a well-researched human opinion beats a computer every time.  In fact, I’ve never seen a computer do better than a motivated idiot.  What computers excel at is giving a vast quantity of mediocre opinions.  Think Google.  It’s impressive because it has a decent answer for nearly every query, not because it has the best answer for any query.  And it does as well as it does because it piggybacks on informal indexes compiled by humans.

And the third and final thing you need to know about AI is that every AI algorithm is, at one level, identical to every other.  Genetic algorithms and neural networks may seem completely different, but they fit into the same mathematical framework.  No AI algorithm is inherently better than any other; they all approach the same problem with a slightly different bias.  And that bias is what determines how good a fit it is for a particular problem.

Think about a mapping program, such as MapQuest.  You give it two addresses, and it tells you how to get from your house to the local drug store.   Internally, it has a graph of roads (edges, in graph terminology) and intersections (vertices).  Each section of road has a number attached to it.  The maps I get from AAA have the same number– minutes of typical travel time.  MapQuest finds the route where the sum of those numbers– the total travel time– is minimized.  In AI terminology, the numbered graph is the search space, and every AI problem can be reduced to finding a minimal path in a search space.
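
As an illustration of searching a numbered graph, here is a sketch using Dijkstra’s algorithm, a standard way to find the path with the smallest total. I have no idea what MapQuest actually runs; the class name, the tiny road map and the travel times are all invented for the example.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class ShortestRoute {
    /**
     * roads maps each intersection to its neighbours and the minutes needed to reach them.
     * Returns the total minutes of the fastest route, or -1 if the goal is unreachable.
     */
    static int fastestMinutes(Map<String, Map<String, Integer>> roads, String start, String goal) {
        Map<String, Integer> best = new HashMap<>();
        // Queue entries are (intersection, minutes so far), cheapest first.
        Comparator<Map.Entry<String, Integer>> byMinutes = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(byMinutes);
        best.put(start, 0);
        queue.add(Map.entry(start, 0));
        while (!queue.isEmpty()) {
            Map.Entry<String, Integer> current = queue.poll();
            String here = current.getKey();
            int soFar = current.getValue();
            if (soFar > best.getOrDefault(here, Integer.MAX_VALUE)) {
                continue; // stale entry: a faster way here was already found
            }
            if (here.equals(goal)) {
                return soFar;
            }
            Map<String, Integer> neighbours = roads.getOrDefault(here, Map.of());
            for (Map.Entry<String, Integer> road : neighbours.entrySet()) {
                int candidate = soFar + road.getValue();
                if (candidate < best.getOrDefault(road.getKey(), Integer.MAX_VALUE)) {
                    best.put(road.getKey(), candidate);
                    queue.add(Map.entry(road.getKey(), candidate));
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical neighbourhood: edges are one-way road sections, weights are minutes.
        Map<String, Map<String, Integer>> roads = Map.of(
            "home", Map.of("A", 5, "B", 2),
            "A", Map.of("store", 4),
            "B", Map.of("A", 1, "store", 10));
        System.out.println(fastestMinutes(roads, "home", "store"));
    }
}
```

In this toy map the direct-looking routes lose to the detour through B and then A, which is the kind of thing that is easy to miss by eye and trivial for the search.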

What makes AI interesting is that the search space is often much too large to be searched completely, so the goal is to find not the shortest path, but a path which is short enough.  Sometimes the path isn’t as important as finding a good destination, for example, finding the closest drug store.
In the case of artificial life, each “creature” is a point in the search space.  Consider Evolve, the program I wrote about the other day.  In it, each creature is defined by a computer program.  The search space is the set of all possible programs in that programming language– an infinite set.  And any transformation from one program to another– another infinite set– is an edge.  By defining mutation and reproduction rules, a limited number of edges are allowed to be traversed.

So, to summarize:  certain AI algorithms sound romantic, but they are all essentially the same.  And humans are smarter than computers.

Artificial life

For some reason, yesterday I was obsessed with artificial life. It started out on Friday night when I was trying to sleep but found myself thinking about things that are entertaining but don’t help me sleep. (This happens to me a lot. In one particularly insomnia-provoking bout a few years ago, I figured out how long division works as a way of proving that 0.9999999… equals one, and discovered that the same is true for the highest digit in bases other than base ten.)

In this case, in the morning I searched online to find something interesting enough that I wouldn’t have to write my own to satisfy my curiosity. Right now I barely have enough time to read about other people’s software, let alone write my own. Unfortunately the field is so diverse that I just might have to write my own someday. Not because the world needs more artificial life, but just because.

Artificial life consists of three things: a genome; a method for mutation and “natural” selection; and a universe, which provides the domain where the life forms reside, as well as the physical laws of that universe. (The body of an artificial creature can be thought of as the union of the genome with the universe.) All three of these things are incredibly broad in their possibilities.

At one extreme is Framsticks, which is a research project (as well as shareware and other stuff) which models systems that could be physically possible in our world. Think of it as a construction kit, not unlike an Erector Set or Lego Mindstorms, that exists entirely in software. There are sticks, muscles, neurons, and sensors, all of which are modeled on real-world things. For the genome, they have several options. One is a low-level language which simply describes body parts and connections between them, much like a blueprint or circuit diagram. The rest are all intended to be more realistic or interesting genomes. They wrote a paper which describes in engrossing detail something I regard as completely obvious (and which makes artificial life compelling to me): the genome you choose has a profound impact on the organisms you get. In the low-level Framsticks language, a simple mutation (changing one letter in the genome) is likely to result in a useless change. Languages which have symbols for repetition or recursion allow simple mutations to yield symmetrical bodies or repeating patterns.

On the topic of natural selection, the defining quality of artificial life is that there is mutation (and optionally gene sharing, a.k.a. sexual reproduction) and culling of the less fit. In some cases the computer chooses the most fit organisms each generation according to an explicit “fitness function”– for example, choosing the fastest or the tallest organisms. That is analogous to breeding. The other option is to build reproduction into the organisms and limit the resources so that some of them die off (or worse, none die and your computer gets swamped calculating the actions of random, unevolved– and therefore uninteresting– critters). The former case has practical applications– a genetic algorithm has yielded a patentable circuit design– but the latter strikes me as more entertaining. Framsticks supports both modes.
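
Here is a toy sketch of the first mode, where the computer breeds against an explicit fitness function. Everything in it is an assumption made for illustration: the genome is a row of bits, fitness is the number of bits switched on, and the class and method names are invented. It has nothing to do with how Framsticks is implemented.

```java
import java.util.Random;

class BreedingSketch {
    /** Explicit fitness function: here, simply the number of true genes (a toy goal). */
    static int fitness(boolean[] genome) {
        int score = 0;
        for (boolean gene : genome) {
            if (gene) {
                score++;
            }
        }
        return score;
    }

    /** Mutation: copy the parent and flip one randomly chosen gene. */
    static boolean[] mutate(boolean[] genome, Random random) {
        boolean[] child = genome.clone();
        int i = random.nextInt(child.length);
        child[i] = !child[i];
        return child;
    }

    /** Each generation, breed a litter of mutants and keep only the fittest individual. */
    static boolean[] evolve(boolean[] start, int generations, int litterSize, Random random) {
        boolean[] best = start.clone();
        for (int g = 0; g < generations; g++) {
            boolean[] champion = best; // the parent survives unless a child beats it
            for (int k = 0; k < litterSize; k++) {
                boolean[] child = mutate(best, random);
                if (fitness(child) > fitness(champion)) {
                    champion = child;
                }
            }
            best = champion; // cull everything but the fittest
        }
        return best;
    }

    public static void main(String[] args) {
        // Start from an all-false genome of 16 genes and breed for 200 generations.
        boolean[] evolved = evolve(new boolean[16], 200, 4, new Random(1));
        System.out.println("fitness after breeding: " + fitness(evolved));
    }
}
```

The second mode would drop the fitness function entirely and let creatures reproduce and starve inside a simulated universe, which is a much bigger program.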

Even though Framsticks is a complete toolkit which includes great eye candy and enough complexity to keep you busy for years, I’m somewhat more fascinated by a little program called Evolve. It is in every way the opposite of Framsticks. It’s a one-person project. The genome is simple computer code, a simple version of Forth. So it’s not unlike what I write for a living. The universe is a grid, like a chess board. (The technical term for this sort of grid universe is a cellular automaton, the most famous of which is Conway’s Game of Life.)

What makes Evolve so fascinating is it’s so simple and comprehensible to a programmer like me.  Its author has learned a lot about what sort of languages and mutations yield interesting results.  Are these things modelling plants or animals or what?  Do they have brains, or is it all just DNA?  In this case, it’s the latter, and the DNA contains commands like “MOVE”, “EAT”, “GROW” and “MAKE-SPORE”.  It is far more like a computer program than a life form, and yet it is all the more compelling because it is so unlike earthly life.

Finally, this isn’t artificial life, but it’s the inspiration for a lot of artificial life sims.  Core Wars.   First described in a popular Scientific American article, Core Wars is a battle to the death between two computer programs in a simulated machine.  The assembly language for the Core Wars computer (called Redcode) is simple enough that the article suggests running it as a pen-and-paper game.  (Or you could write to the author to get the computer code; this was before the Internet.)

Until yesterday, I’d heard of Core Wars but never looked into it, since I tend to think of assembly language as difficult.  Which it is, if you’re trying to write something complicated like a web server.  But for this sort of thing, it’s simple and approachable– not to mention the best language for doing the sort of mischief that other languages try to keep you from doing.  Self-modifying code in particular, which is the core of the game:  modify your opponent before it modifies you.

The worst product idea in a very long time

I read a story in the paper yesterday about a product called Nicogel. It is a hand lotion substitute for the nicotine patch. Only it’s available without a prescription, because it’s not a drug. It contains tobacco extract, rather than purified nicotine, so it is presumably categorized as an herbal supplement. Or it would be if it were a food. As it is, it appears to be completely unregulated. And it contains a highly addictive substance. And it’s in a form that kids love.

I can just imagine lots of 12-year-olds trying to get high on hand lotion, only to find themselves addicted.  And perhaps getting skin cancer.