Phantom Readability Library
Traditional Readability Formulas in Java Easily
Phantom is a library for computing readability measures developed by Niels Ott. It was originally a modification of Larry Ogrodnek's Java Fathom but the released version does not share too much with the original any more. Phantom includes a Java port of a syllable counter written in Perl by Laura Kassner. There is some regular expression-based language processing on board for sentence counting and tokenization. The library is designed for maximum flexibility, allowing you to make use of your own NLP analysis. It is not a requirement to use the built-in analysis components.
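To give a rough idea of what regular expression-based sentence counting and tokenization can look like, here is a minimal sketch. It is not Phantom's actual code: the class name RegexTextSketch, its two methods and both patterns are assumptions made purely for illustration, and the built-in components may well work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Phantom API.
class RegexTextSketch {

    // A token is a run of letters, optionally joined by apostrophes or hyphens.
    private static final Pattern TOKEN =
            Pattern.compile("\\p{L}+(?:['-]\\p{L}+)*");

    // A sentence boundary is a run of . ! or ? followed by whitespace or end of text.
    // Abbreviations such as "e.g." will be miscounted by this simple rule.
    private static final Pattern SENTENCE_END =
            Pattern.compile("[.!?]+(?=\\s|$)");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    static int countSentences(String text) {
        Matcher m = SENTENCE_END.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        // Non-blank text without any end punctuation still counts as one sentence.
        if (count == 0 && text.trim().length() > 0) {
            count = 1;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "Don't panic. It works!";
        System.out.println(tokenize(text));
        System.out.println(countSentences(text));
    }
}
```

Heuristics like these are cheap but coarse, which is one reason to plug in your own NLP analysis when it is available.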
To play with a demo GUI, check out this little Java Web Start demo! If your browser is unsure what to do with the file, tell it to open the file with javaws.
The following measures can be computed:
- Automated Readability Index (ARI)
- Coleman-Liau Index
- Flesch-Kincaid
- Flesch Reading Ease
- FORCAST
- Gunning Fog Index
- Läsbarhetsindex (LIX)
- Simple Measure of Gobbledygook (SMOG)
The formulas in this library have all been checked against the corresponding original publications while I was writing my MA thesis. The thesis contains explanations and references for each and every formula; just skim through chapter 2.2.
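As an illustration of how such measures are computed from plain counts, the following sketch implements the commonly published forms of Flesch Reading Ease and the Flesch-Kincaid grade level. The class ReadabilityFormulaSketch and its method names are hypothetical and not part of Phantom; the library's own code may round or count differently.

```java
// Hypothetical illustration only; not part of the Phantom API.
class ReadabilityFormulaSketch {

    // Flesch Reading Ease: 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)
    static double fleschReadingEase(int words, int sentences, int syllables) {
        checkCounts(words, sentences);
        double wordsPerSentence = (double) words / sentences;
        double syllablesPerWord = (double) syllables / words;
        return 206.835 - 1.015 * wordsPerSentence - 84.6 * syllablesPerWord;
    }

    // Flesch-Kincaid grade level: 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    static double fleschKincaidGrade(int words, int sentences, int syllables) {
        checkCounts(words, sentences);
        double wordsPerSentence = (double) words / sentences;
        double syllablesPerWord = (double) syllables / words;
        return 0.39 * wordsPerSentence + 11.8 * syllablesPerWord - 15.59;
    }

    // Both formulas divide by the word and sentence counts, so reject empty input.
    private static void checkCounts(int words, int sentences) {
        if (words <= 0 || sentences <= 0) {
            throw new IllegalArgumentException("words and sentences must be positive");
        }
    }

    public static void main(String[] args) {
        // Example counts: 100 words, 5 sentences, 150 syllables.
        System.out.println(fleschReadingEase(100, 5, 150));
        System.out.println(fleschKincaidGrade(100, 5, 150));
    }
}
```

Both scores depend only on three counts, so the quality of the result is determined by how words, sentences and syllables are counted in the first place.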
Usage Examples
The simplest way of computing readability scores is the following:
String text = "...";
Readability r = new Readability(text);
System.out.println(r.calcFlesch());
Readability can also be instantiated with an instance of TextStats instead of a string. A TextStats instance can be obtained from TextAnalyzer, which accepts input at various levels of prior analysis. For example, it can be fed a list of tokens.
System Requirements
The Phantom Readability Library requires Java 1.5 or newer.
Download
Please be aware that this package is released under the terms of the General Public License v.2.
- phantom-0.2.0.jar is the ready-to-use package (run it with java -jar to get the demo GUI)
- Source code in a Zip file
- View the JavaDoc online or download it in a Zip file
TODO/Open Issues
- The JavaDoc needs some reworking: it is incomplete and sometimes lousy.