New Python unittest module?

Collin Winter blogs about an updated unittest module he wrote. His update fixes the internal structure, and therefore the expandability, of the module, but also cleans up the external API. There are still a few minor improvements I would like to see, but nevertheless I hope that his updated version will be included in Python’s standard library eventually.

Java File API

I don’t like Java’s File API. It’s main problem is that is mixes several responsibilities into one class. One responsibility is handling of “abstract” file and path names, i.e. operating system independent file names. The other responsibilities are all file system related: Querying file meta information (access rights, access times, file type), creating files, and listing directory contents. In my opinion these should be clearly separated.

I am also missing a constructor or construction method that creates a File instance from a list of path components like this:

  File f = new File("/", "path", "to", "file");

Maybe I should start a Java warts page, similar to my Python warts page …

Java: Iterators are not Iterable

This is something I stumbled across multiple times now in Java: Iterators are not Iterable. Java 1.5 introduced the foreach statement, which allows easy iteration over collections like this:

for (MyClass item: myCollection) {
    doStuffWithItem(item);
}

For this to work class MyClass must implement the Iterable interface. This interface defines just one method that returns an Iterator:

    Iterator<T> iterator();

This works fine in many cases. But now suppose we have a class that can be iterated in multiple ways. For example a Tree class that can be iterated breadth first or depth first. I would define two methods and use them like this:

class Tree {
    Iterator<Node> depthFirstIterator();
    Iterator<Node> breadthFirstIterator();
}

// ...
    for (Node node: myTree.depthFirstIterator()) {
    }

This won’t work in Java. Iterators do not extend the Iterable interface, like they in other languages. For example in Python an iterator will return itself when its iterator() equivalent is called. Instead in Java I have to return an object that itself implements the Iterable interface like this:

class IterableIterator<T> implements Iterable<T> {
    private Iterator<T> iter;

    public IterableIterator(Iterator<T> iter) {
        this.iter = iter;
    }

    public Iterator<T> iterator() {
        return iter;
    }
}

class Tree {
    IterableIterator<Node> depthFirstIterator();
    IterableIterator<Node> breadthFirstIterator();
}

Not only is this boilerplate code for no gain, a generic class like the IterableIterator above is not included in Java’s standard library.

JavaScript Includes Revisited

I’ve given in in the JavaScript and Modules matter. I’m doing now what all the cool kids are doing: Using XMLHttpRequest synchronously to load external JavaScript files. This has the advantage that it works synchronously, a very important thing for includes. Therefore I don’t need callback and moduleLoadedhackery anymore. Also I have sensible error checking, i.e. I can notice when loading a JavaScript module failed. This enables me to have multiple search paths, for example, and sensible error handling.

But it comes at a cost, too. XMLHttpRequest is only supported by “modern browsers”. But since moderns browsers means any browser of the last five years I don’t care too much. When XMLHttpRequest is not supported, I suppose I can’t rely on other useful stuff neither. The synchronous requests also mean that there might be a slight performance hit. But the advantages of synchronous includes outweigh this by far. Finally, XMLHttpRequest doesn’t provide the means to recognize when a synchronous request is stalled. This is not of too much concern, since a stalled request usually means that an include files can’t be downloaded and so the JavaScript application is not usable anyways. It might be a concern for sensibly degrading, though.

Update: Here is the code to the updated module. Using it is now as simple as this:

<script src="include-http.js"></script>
<script>
    include.include("my.module");
    myfunc(...);
</script>

Porting JavaScript to Internet Explorer

While porting a small AJAX application I wrote to Internet Explorer, I encountered a few problems:

  1. IE doesn’t like <script> tags without a matching closing tag. This is especially a problem if you use the src attribute and try to use an XHTML-like closing tag like this:
    <script src="..."/>
    

    In this case IE doesn’t draw anything at all, since it keeps looking for the end of the script and doesn’t find it. This is especially a problem with templating toolskits like Kid. If Kid encounters a construct like <foo></foo> it will correctly shrink this to <foo/> in the XML output. The solution is rather easy: Insert a newline character between openening and closing tag in the template. Since Kid has now way to determine whether this whitespace is significant it will keep it in the output.

  2. It seems that IE doesn’t like trailing commas in JavaScript object literals:
    var o = {
        x: 'Foo',
        y: 1,
    };
    

    I like to insert trailing commas in cases like this. It’s easy to forget to add one if extending the property list. Firefox parses the trailing comma, while IE doesn’t. (IE follows the ECMAScript standard here, which doesn’t allow the trailing comma. Indeed in array literals a trailing comma has a special meaning, so it’s probably a good idea to disallow the trailing comma in object literals.)

  3. My AJAX application queries a web service for a list of items and displays that list in a table. The table is generated in HTML code like this:
    <table id="mytable">
        <tr>...</tr>
    </table>
    

    I then used JavaScript code like the following to populate that table:

    var table = document.getElementById("mytable");
    var trTag = createRow();
    table.appendChild(trTag);
    

    This worked fine in Firefox, but no matter what I did, IE refused to render the table rows. After much hair tearing I remembered that Firefox’s DOM Inspector showed a TBODY element just beneath the TABLE element and TR elements are attached to the former. The DOM Inspector basically showed the following structure:

    • TABLE id=mytable
      • TBODY
        • TR (my header row from the HTML)
      • TR (a dynamically generated row)

    So I changed my code to append the newly generated rows to the TBODY instead:

    var table = document.getElementById("mytable").
        getElementsByTagName("tbody")[0];
    var trTag = createRow();
    table.appendChild(trTag);
    

    This works in Firefox and IE.

Apache, cgi-bins, and the Authorization header

I had a problem: Server A runs a web service, which requires users to authenticate using the standard HTTP authentication mechanism. Server B should have web pages that use AJAX to query A’s web services. Server B’s web pages also require authentication, using the same scheme, backend and database as server A. There are two problems:

  1. JavaScript web pages can only access web services/pages on the same server using XMLHttpRequest, for security reasons. Solution: Use a forwarding/proxy service. E.g. to access http://a.example.com/service from b.example.com add a service http://b.example.com/servicethat just forwards requests to the web service on A. This solution is quite straight forward.
  2. Since B uses the same authentication scheme as A we need to forward authentication information passed to B’s forwarding service on to A. Unfortunately this is not straight-forward, since the Apache HTTP Server provides no easy way to read the full authentication information passed to it via a cgi-bin. The only available information is the REMOTE_USER environment variable. This is not enough to construct a new Authenticationheader, though, since password information is stored encrypted in the account database.Finally I found a solution in the Zope 2 documentation. Apache’s mod_rewrite comes to the rescue. It allows you to read arbitrary HTTP headers and add arbitrary environment variables before executing a cgi-bin. The following recipe added to the appropriate .htaccess file adds a HTTP_AUTHORIZATION variable:
    RewriteEngine on
    RewriteBase /
    RewriteCond %{HTTP:Authorization}  ^(.*)
    RewriteRule ^(.*)$ $1 [e=HTTP_AUTHORIZATION:%1] 
    

Loading Modules in JavaScript

One of the weaknesses of the JavaScript language is that it does not have the concept of “importing” or “including” other source files or modules. In client-side JavaScript as used in web browsers you have to include all JavaScript files you intend to use in the main HTML file like this:

<html>
    <head>
        <title>...</title>
        <script src="A.js"></script>
        <script src="B.js"></script>
        <script src="S.js"></script>
    </head>
    ...
</body>

In this case S.js is a page-specific JavaScript script that uses features from A.js and B.js.

The standard way to load other JavaScript files dynamically is to add a script element programmatically to the head section of the document, using DOM:

function include(filename) {
    var scriptTag = document.createElement("script");
    scriptTag.setAttribute("lang"), "javascript"));
    scriptTag.setAttribute("type"), "application/javascript"));
    scriptTag.setAttribute("src", filename);
    document.getElementsByTagName("head")[0].appendChild(scriptTag);
}

Unfortunately this approach has some problems as well: The include file is loaded “lazily”, i.e. the next time the JavaScript interpreter hands back control to the browser. (At least it is this way in Firefox.) If, for example, a file A.js defines a function func_a, the following will not work:

include("A.js");
func_a();

My first solution was to use a callback, invoked when the module has been loaded:

function include(filename, callback) {
    var scriptTag = createScriptTag(filename); // as above
    scriptTag.onreadystatechange = function() {
        if (scriptTag.readyState == "complete") {
            callback();
        }
    }
    scriptTag.onload = function() {
        callback(file);
    }
}

This solution uses the non-standard onload (Firefox) or onreadstatechange (IE) events. It is invoked like this:

include("A.js", function() {
    a_func();
});

This solution also has several problems: It is non-standard, it is ugly (though this can’t be helped, due to the asynchronous script loading), and including a module from a module still doesn’t work. To explain the last problem, I will use a script S.js which depends on the function a_func defined in A.js, which in turn depends on b_func defined in B.js. Now the code should look like this:

// S.js
include("A.js", function() {
    a_func();
});

// A.js
include("B.js", function() {
    a_func = function() {
    }
    a_func.prototype = new b_func();
});

// B.js
// ...

The problem is that the callback in S.js is not necessarily processed after the callback in A.js, leading to a_func being undefined.

The solution I have come up with is not particularily nice, but this can’t really be helped. Basically each module has to call a function when it has fully loaded, like this:

// A.js

include.module("B.js", function() {
    // ...

    include.moduleLoaded("A");
});

The main HTML file must load the include.js before loading any other module and modules should only be loaded via include.module. Also my implementation uses “real” module names, e.g. a.b.c instead of a/b/c.js. This allows me to expand the mechanism easily, for example to add a module search path.

Sample implementation

PyUnit and Decorators

One of the shortcomings of the Python standard library is in my opinion that PyUnit doesn’t use decorators (as later versions of JUnit and NUnit do). Therefore I have written a sample implementation that adds decorator support. (unittest.pydiff to Python 2.5’s unittest.py)

To do:

  • Warn when old-style and new-style tests are mixed.
  • Throw an exception when multiple setup or teardown methods exist.
  • Ability to warn when old-style methods are used.

Porting Tomboy from C# to C++

Hubert Figuiere writes about porting Tomboy from C# to C++. I tried to leave a comment in his blog, but something doesn’t work, so here is my comment:

Porting Tomboy from C# to C++ seems like a step back to me. C# is a higher level language than C++ and is much easier to write good and clean code with than C++. C++ rises the maintenance burden.

Also, at the moment Tomboy’s code base is under heavy development with one WSOP project (Maria Soler Climent’s) and one Summer of Code project (mine) going on. The code base is changing rapidly, so it seems this is a bad time to do such porting.

XML Test Runner for PyUnit

JUnit features an XML Test Runner. This means that the result of a test run is not written to stdout, but instead written into an XML file. This XML file can then be automatically processed. This is for example useful for fully automated builds using software like CruiseControl. The XML output is easily converted into an XHTML file for easy human reading.

Until now PyUnit was lacking an XML Test Runner. Since I needed one for one of my projects that used CruiseControl, I wrote one. It is available for download here. Since this is an extension to a unit testing framework, the classes are of course fully tested. 🙂

If you have any suggestions or criticism, please let me know. I’m a nitpicker myself, so even if you have some small suggestions about coding style or improvement hints, I would like to hear about them.

Update: I have now submitted this patch to Python’s patch tracker, ticket number 1522704.

Update^2: I created a page with more information about PyUnit and CruiseControl integration, including sample configuration files and scripts.

Update^3: I updated the XmlTestRunner after input from Mirko Friedenhagen: It now recovers gracefully from unit tests overriding sys.stdout and sys.stderr, the XmlTestRunner returns the TestResult instance instead of a boolean value, and you can now stream multiple test suite results into a single XML file, since the XML header is not written to file streams by default.

Update 2017-10-30: While XMLTestRunner can be found on GitHub nowadays, its use has been deprecated for a while. Instead, use maintainer projects, such as unittest-xml-reporting.