The USES Issue
Tuesday, December 2, 2008 • Category: Automatic Mind
Intro
It is hard to name the phenomenon without offending someone. Good names would be Scienceware, or Guruware, or even better Scientistware. They are all taken by companies or other institutions that presumably do far too good a job to lend their names to a negative phenomenon. So let me call it USES, for Unsustainable Software Emerging from Science. This blog post sheds some light on the issues of USES and on possible reasons for them.
What USES is All About
As a computational linguist, I work with specialized software each and every day, be it part-of-speech taggers, tools to explore corpora or treebanks, or simply software development tools such as compilers or even integrated development environments. But one type of software stands out: software that emerged from science (USES). Common features of this type of program include:
- Usually developed for a very special and highly sophisticated purpose that is only understood within the field.
- Developed by a single person who is a major expert within this field.
- The major expert developing the software is often not an expert in either software design or software architecture.
- Examples of file formats are rare and there is little to no documentation about the software.
- The software reacts unexpectedly to certain types of input, e.g. it ignores syntax mistakes in grammar files and then malfunctions without telling users why.
- The software is often not completely finished and includes some missing bridges at the end of some roads without any warning signs.
- The only person who knows how to work with it is the major expert in the field who is not the major expert in writing usable software.
Since I do not intend to offend anybody, let me give an anonymous example. In a paper about parsing noun phrases with a certain parser, it is written that a single day was spent on writing the slightly over 100 grammar rules in use. A number of source code examples garnish the publication, and it seems to describe a mighty piece of software that performs well and is easy to operate. But this opinion can change rapidly once one starts using the software. The grammar parser is not robust enough to deal with missing semicolons. Sometimes it notices them, reporting an error 20 or 30 lines before or after; at other times it just ignores the issue and interprets something the grammar writer did not intend at all, without saying so.
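To illustrate what well-behaved handling of such a mistake could look like, here is a minimal sketch in Python. The grammar format (one rule per line, written as `LHS -> RHS ;`) and all names such as `load_rules` and `GrammarSyntaxError` are assumptions made up for this illustration; they are not taken from the parser discussed above.

```python
import re


class GrammarSyntaxError(Exception):
    """Error that carries the line number of the offending rule."""

    def __init__(self, line_no, message):
        super().__init__(f"line {line_no}: {message}")
        self.line_no = line_no


# Hypothetical rule format: "LHS -> symbol symbol ... ;"
RULE = re.compile(r"^\s*(\w+)\s*->\s*(.+?)\s*;\s*$")


def load_rules(text):
    """Parse one-rule-per-line grammar text and fail loudly on the exact line."""
    rules = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if not stripped.endswith(";"):
            raise GrammarSyntaxError(
                line_no, f"missing ';' at end of rule: {stripped!r}"
            )
        match = RULE.match(stripped)
        if match is None:
            raise GrammarSyntaxError(line_no, f"cannot parse rule: {stripped!r}")
        lhs, rhs = match.groups()
        rules.append((lhs, rhs.split()))
    return rules
```

The point of the sketch is only that the loader refuses to guess: a rule without its semicolon is rejected with the number of the line it stands on, instead of being silently merged with its neighbors.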
I am sorry to say that this behavior, which is only one example of bad application behavior, is shared by a number of applications I have used so far. In all cases, documentation was sparse and I spent a number of days or weeks on trial-and-error procedures.
The Lack of Documentation
Why is there a lack of documentation of USES? Here is my hypothesis: science as a system rewards publications, be they books or (probably even more important for most authors) papers accepted at conferences or by journal boards. In papers, people report on the great insights they gained. Of course, these insights were gained by employing USES, which is all right. However, there are three things that are not rewarded by this system: 1) the free availability of the described software to other researchers, 2) the free availability of the data required for the described experiments, such as corpora, grammars, or other computational resources, 3) the existence and free availability of reasonably good documentation for the described software.
Authors should be encouraged to consider these three points. If they are not fulfilled, other researchers can neither confirm nor refute the results published in the corresponding papers. And that, I hope not only in my opinion, is what a large part of the science business should be about.
The Lack of Quality
By quality I refer to the engineering side of software: it must be stable, usable and not too complicated to install and maintain. I am not referring to the actual purpose of the software. The LaTeX typesetting system is a good example: its output is regarded as some of the most properly typeset books and papers out there, but most people writing books and papers might find its programming-like user interface simply not usable at all. Imagine Microsoft Word being »user-friendly« in the same way: it simply would not sell.
But why is this so hard? One possible answer: it takes time. Plenty of time. Designing a graphical user interface (GUI) is said to take 60% of a project's time and therefore of its costs. Now USES usually does not include GUIs, but every programmer knows how tedious it is to catch all errors and produce intelligent error messages for the user. Again, meaningful error messages and a good user interface are not rewarded in the world of publication-based science. They steal researchers' time, and so researchers prefer to do without them.
The Lack of Design and Architecture
There is an even more important type of quality: the quality of the engine under the hood, the properties of those parts that the car driver does not even know exist.
Knowledge of software design is something that many programmers in the scientific business must do without. Even worse, it is regarded as superfluous overhead. Its absence explains the lack of what software designers call the -ilities, denoting properties of code such as reusability, scalability, manageability, reliability, sustainability, and so on. These are features one typically finds in enterprise software, and features that are not honored by the system of publications. An important side effect of software design is that it allows the coordination of software development in a team. Without software design, this can be quite hard, depending on the size of the project and the team. Without a team, software becomes as idiosyncratic as USES tends to be.
One step further from software design, one finds software architecture. It is usually pattern-based. A pattern sketches a common problem and its solution. One could see it as a template solution or recipe for a given problem. These patterns are well documented. Using them can ease communication among developers. If a comment in the source code of a program reads »Using the Observer pattern here«, nobody with knowledge of software architecture needs any further explanation of what is going on. This can simplify development in a team or the takeover by a new maintainer of the project.
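For readers who have not met the Observer pattern, here is a minimal sketch in Python. The `Subject` and `Tagger` classes and the dummy tag "UNK" are hypothetical names invented for this illustration; real taggers obviously do more.

```python
class Subject:
    """Keeps a list of observers and notifies each of them about events."""

    def __init__(self):
        self._observers = []

    def attach(self, observer):
        """Register a callable that will receive every event."""
        self._observers.append(observer)

    def detach(self, observer):
        """Remove a previously registered callable."""
        self._observers.remove(observer)

    def notify(self, event):
        # Iterate over a copy so observers may detach themselves while notified.
        for observer in list(self._observers):
            observer(event)


class Tagger(Subject):
    """Hypothetical tagger that announces every sentence it has processed."""

    def tag(self, sentence):
        tagged = [(token, "UNK") for token in sentence.split()]
        # Using the Observer pattern here: listeners learn about results
        # without the tagger knowing who they are.
        self.notify(tagged)
        return tagged
```

A progress display, a logger or a test harness can each call `tagger.attach(...)` with their own function, and the tagger itself never has to change.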
The Lack of Completeness and Maintenance
Most USES is either incomplete or many years old. If it has been written in rather machine-oriented programming languages such as C or C++, it is often hard or impossible to get USES running on an up-to-date operating system. Why is this? So far I have bombarded the system of publications with criticism. But there is another issue: science often works project-based. A project proposal is written, hopefully a grant is given, and then the project is worked on. At some point, the project is over. This usually happens way before the USES product has reached a state of completeness. Researchers are then forced to move on to other projects, and the old program lies there somewhere on the Web server, becomes old, grows gray hair and is rendered unusable as time brings changes to computer platforms and file formats. Bugs are detected by users, but they are not documented in a central bug tracking system, and since the development period is over, nobody will ever fix them.
Outro
In this blog post I have described the issues of software written by scientists. This is not meant to offend programmers out there, but the problem must be addressed. Good quality software is likely to quicken interest in your work among other researchers and students. It is likely to improve the gain of knowledge in computational scientific disciplines in general as it enables real reviews. Furthermore, good quality software has the potential to support good teaching instead of leaving students sitting madly frustrated in computer rooms.
One question remains: how can we reward people in science avoiding USES?
Graphics taken from Open Clip Art Library, modified by Niels Ott.
Addenda
- 2008-12-05: Jochen Leidner pointed me to a readable article that discusses the same issue with a lot more analytic expertise. Read Empiricism is Not a Matter of Faith by Ted Pedersen.

13 Comments
"One question remains: how can we reward people in science avoiding USES?"
By citing them frequently?
I think that people should get more recognition for writing useful, interesting, and easy-to-use programs. But then again, that might just happen one day. I think that the best thing we can do to help is to write great software...
BTW. I pronounce USES as /useless/. Most of my experiences with software of this kind have been... aaaaaargh. Let's just say that I'd like those hours of my life *back*.
But you're right: for the most part, the motivation is not there. Too bad.
In my opinion, making code open source can be a first step towards making the world a better place. As Ted Pedersen points out in the article I added, one must decide early to publish code in order to avoid legal issues, e.g. those caused by using code snippets or libraries that conflict with open source licenses.
What one always can do is to try to be a good example oneself, hoping that others might follow. This is why the »Software Projects« section of my web site here exists, even though it should be equipped with more code that I wrote in the past.
I think the only way there's gonna be change is by influencing the coming generations of students, change the curricula etc.
Just a couple of remarks:
The fact that code is open source doesn't make it better, it just makes it obvious for other people that the code is crap. However, I wouldn't associate the -ilities with commercial software. Companies want to sell software, not create good software - and not all of them realise that writing good software in the first place might save money further down the road.
Apart from that, I hate writing GUIs, mitigating all the one million ways users can screw up. Writing "behavior is undefined if a > b or c == 0" in a function docstring is so much more fulfilling.
By the way, a good way to go through USES hell is to enroll in CL at a French university and do the projects required for some seminars. I feel like I've spent an entire semester writing glue code.
Concerning user interfaces: some tools simply are stupid without GUIs. Displaying linguistic tree structures or AVMs without a GUI would be terrible. But then again, a GUI is not always a must. A good TUI should do the job in many situations. And by good I do not mean one saying things like "oops, signal 11 received, good bye" or "NullPointerException in WeirdClass, corrupting your data" but one with real error messages that make sense. As I wrote in the blog post, many people do not make the effort to conform even to this small set of standards of well-behaved application behavior.
I remember really wishing http://casper.sf.net would get off the ground, but it seems to be stuck
-----
I once tried implementing two parsers that were described in those 10-page articles for a college project, it took us about 3 months longer than expected due to all the details we had to fill in on our own (or make up, in cases where we couldn't analyze or calculate our way to the correct formulas).
In the end, 3 months overdue, we got one of the parsers working at about 3/4 the accuracy that they reported. I still don't really know why it didn't perform as well, since so many details might be different.
(We released our own source though
However, just demanding Agile methodology from scientists is not likely to get anywhere fast. This is especially so since even those who want to release code/data are afraid to do it before the publication, then forget to do it and then can't even remember how it works.
There is a need for some very basic guidance. I think that should be code release, plus runnable examples including source and result data. That way one could download the package, make sure the examples run correctly and then run the same configuration on one's own data. Or even re-implement the code with the basic assurance that the packaged examples still produce the same results as with the original code.
Once we have that, we can ask for more. But it should be easy to start being good!
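The idea of shipping runnable examples together with their result data can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `check_examples`, the toy tokenizer standing in for the released tool, and the two packaged examples.

```python
def check_examples(run, examples):
    """Run `run` on each packaged input and collect every mismatch."""
    failures = []
    for name, (source, expected) in sorted(examples.items()):
        actual = run(source)
        if actual != expected:
            failures.append((name, expected, actual))
    return failures


def toy_tokenizer(text):
    """Stand-in for the released tool: lowercase and split on whitespace."""
    return text.lower().split()


# Hypothetical packaged examples: name -> (source data, result data)
EXAMPLES = {
    "simple": ("The cat sleeps", ["the", "cat", "sleeps"]),
    "spaces": ("  two   words ", ["two", "words"]),
}
```

Calling `check_examples(toy_tokenizer, EXAMPLES)` returns an empty list as long as the packaged results are reproduced. A re-implementation can be passed in as `run` and checked against the same examples.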
For literature on these issues, I would take a look at the “Empirical Studies of Software Development” research group at the Open University, UK, headed by Helen Sharp:
http://www.springerlink.com/content/w214725153770u22/
http://portal.acm.org/citation.cfm?id=1082983.1083117
- ran on a single computer
- used a bunch of data sources with mutually incompatible license requirements
- was a monolithic blob of modules
- relied on some C/C++ code that I just compiled manually (i.e., no makefile - someone who doesn't know the code would have to guess which programs to run at which point).
Now - I've gotten much better and my parser (which shares some of the code with the anaphora resolution stuff) now uses Python's distutils, but still
- gcc 4.2 miscompiles part of it and I haven't found out why
- other people weren't successful in installing it, even though it's a one-line install in the case where it works
- it still relies on proprietary data sets (notably, SMOR, and a word clustering derived from a corpus that is only for internal use at another Uni)
Now, people will say, that's because of the horrid mix of Python and C++ that you're using, and doing stuff for German with pure open source is doomed anyways.
So, let's come to the last point, a 40KLoC project for coreference resolution (BART - see www.bart-coref.org). We're working on it with 3-4 people at the same time, sometimes doing different, independent research. But:
- a lot of time goes into refactoring the system so it doesn't degenerate into a messy blob (and that's with multiple people in there who actually know software engineering)
- since we're writing and rewriting different parts of the system, it occurs that some modification that's useful for Italian makes the performance for English worse, or breaks things, or makes it not work with JDK 1.5, or activates the hidden function to shoot deadly microwaves at your neighbour.
So, in sum:
- it's a lot of work
- not everyone can do it
- it doesn't get rewarded at all (I really mean that. And if you ever wondered if it helped to explicitly fund that sort of thing - guess again, what will happen is that those people who don't really care about re-use will write nice-sounding proposals and then spend their time creating overengineered ISO standards that no one will ever implement or find useful. CES or LAF, anyone?)
- surprisingly often, the approach is just doomed because you need proprietary component X anyway, which you can't redistribute
It is of course true that the endeavor of releasing your code will face serious trouble in case you cannot do without components with incompatible licensing. But then again, if everybody thought like that, the world would never ever change. If you don't try, you can only fail.
(Please use the preview function next time, your formatting turned out to be kind of… confusing.)