Tuesday, March 3, 2009

HtmlUnit vs HttpUnit

Today I bid a fond farewell to HttpUnit.

I've been using HttpUnit for my black-box testing for about 3 years, and I really like it. However, its JavaScript support just hasn't kept up with our needs, and it seems HtmlUnit has a much more active community around it.

I converted about 6,000 lines of test scripts in about 3 days. I thought the HtmlUnit folks (if not the HttpUnit folks) might be interested in what I experienced.

First, the FUD
HttpUnit is a great project, and very similar to HtmlUnit.

Blog entries like this (from an HtmlUnit guy) paint an inaccurate picture, saying that HttpUnit is 'fairly low-level, modeling web interactions at something approaching the HTTP request and response level' whilst HtmlUnit is 'more high-level than HttpUnit’s, modeling web interaction in terms of the documents and interface elements which the user interacts with'. It then gives an HttpUnit example using requests and responses, and an HtmlUnit example using forms and input controls.

But the examples are not comparing apples to apples. HttpUnit does forms (and input controls, and tables, and JavaScript) too - and in almost exactly the same way as HtmlUnit. I'm not saying this is deliberate deception, but I think such a comparison is unfair and may even be detrimental to HtmlUnit because developers may be more reluctant to 'make the switch' if they perceive the APIs are very different.

In fact, the API methods are almost 1-to-1 identical. In converting about 6,000 lines of code, here's what I found:
  • With HttpUnit: 6,198 lines
  • With HtmlUnit: 6,285 lines
That's 87 extra lines on 6,198, or about a 1.4% difference - not really a difference of 'high level' versus 'low level'.

Next, the Good
The HtmlUnit API definitely feels nicer.
  • There's some neat 'public <I> I getInputByName' code that saves a lot of casting - I've stolen this idea for the next release of Metawidget
  • I love the asText and asXml methods which do a lot of parsing for you
  • HttpUnit used to silently submit forms using 'null' if the button you asked it to submit didn't exist (e.g. form.submit( form.getSubmitButton( 'not-there' ))). HtmlUnit doesn't do this
  • You can set file upload boxes just like regular text boxes. HttpUnit required you to wade through some form.getRequest goo
  • I love that things like HtmlTextInput are implemented as direct extensions of the internal DOM, rather than some parallel hierarchy
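
To show why the 'public <I> I getInputByName' trick saves casting, here is a rough sketch of the pattern in plain Java. The classes (Form, Input, TextInput, CheckBox) are made-up stand-ins for illustration, not HtmlUnit's real classes, and the bound on the type parameter is my assumption:

```java
import java.util.HashMap;
import java.util.Map;

class TypedLookupSketch {

    // Hypothetical stand-ins for form controls; NOT HtmlUnit's real classes.
    static class Input {
        final String name;
        String value = "";
        Input(String name) { this.name = name; }
    }

    static class TextInput extends Input {
        TextInput(String name) { super(name); }
        void type(String text) { value = value + text; }
    }

    static class CheckBox extends Input {
        boolean checked;
        CheckBox(String name) { super(name); }
    }

    static class Form {
        private final Map<String, Input> inputs = new HashMap<>();

        void add(Input input) { inputs.put(input.name, input); }

        // The compiler infers I from the assignment target,
        // so the caller does not have to write a cast.
        @SuppressWarnings("unchecked")
        <I extends Input> I getInputByName(String name) {
            Input input = inputs.get(name);
            if (input == null) {
                // Fail loudly rather than handing back null.
                throw new IllegalArgumentException("No input named '" + name + "'");
            }
            return (I) input;
        }
    }

    public static void main(String[] args) {
        Form form = new Form();
        form.add(new TextInput("username"));
        form.add(new CheckBox("remember"));

        // No casts needed at the call site.
        TextInput username = form.getInputByName("username");
        username.type("alice");
        CheckBox remember = form.getInputByName("remember");
        remember.checked = true;

        System.out.println(username.value + " " + remember.checked);
    }
}
```

The cast hasn't gone away, it has just moved inside the method. Ask for the wrong type and you still get a ClassCastException at the assignment, but the test code reads a lot cleaner.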

Finally, the Bad
There are some things from the HttpUnit API I missed:
  • Locating anchors in HtmlUnit is very fiddly. You have page.getAnchorByName and page.getAnchorByHref. But this is black box testing - it's meant to test the 'user experience'. And the user never gets to see either an anchor's name or its href. HttpUnit had a response.getLinkWith method that located an anchor by its 'innerHTML' or 'what the user actually sees'. This would be very helpful?
  • Choosing options from select boxes in HtmlUnit is similarly fiddly. You have select.setSelectedAttribute(optionValue) but again this is keying off something the user never sees. I'd really like a select.setSelectedAttribute(innerHTML) so I can simulate choosing what the user chooses?
  • HtmlUnit warns that use of the script type 'text/javascript' is obsolete, and maybe it is (as of about 2007). But that's a pretty recent change. If you try <script/> in the W3C validator it still suggests using 'text/javascript', and older browsers will want to see it. So this seems a very noisy warning to have on by default?
  • There doesn't seem to be a good equivalent to HttpUnit's form.getParameterNames?
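
To make the anchor and select box wishes above concrete, here is a minimal sketch of the kind of lookup I mean: find an element by the text the user actually sees, not by its href or value. Everything here (the Element class, findByVisibleText, normalize) is hypothetical and written in plain Java; it is not part of either library's API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class VisibleTextLookupSketch {

    // Hypothetical stand-in for an anchor or an option.
    // 'key' is the href or value the user never sees; 'text' is what is rendered.
    static class Element {
        final String key;
        final String text;
        Element(String key, String text) {
            this.key = key;
            this.text = text;
        }
    }

    // Collapse runs of whitespace so markup formatting doesn't break the match.
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Return the first element whose visible text matches, if any.
    static Optional<Element> findByVisibleText(List<Element> elements, String visibleText) {
        String wanted = normalize(visibleText);
        return elements.stream()
                .filter(e -> normalize(e.text).equals(wanted))
                .findFirst();
    }

    public static void main(String[] args) {
        List<Element> links = Arrays.asList(
                new Element("/home.do", "Home"),
                new Element("/logout.do?session=abc", "  Log   out "));

        // The test names what the user clicks, not the href behind it.
        Optional<Element> logout = findByVisibleText(links, "Log out");
        System.out.println(logout.isPresent() ? logout.get().key : "not found");
    }
}
```

The same shape would work for select boxes, with the option's value as the key and its label as the text.
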
Overall, though, there's really no bad: HtmlUnit is the better product. Certainly its JavaScript support seems much better. I guess the only bad is that interest around HttpUnit has waned and it's therefore no longer a close competitor. Competition is good, incentive to improve is good, and it could only have made both products better.

Many thanks to both the HttpUnit and the HtmlUnit teams for all their hard work and contributions to the community!