GenericLevenshtein
Yet Another Implementation of Levenshtein Distance
GenericLevenshtein is an implementation of Minimum Edit Distance, also called Levenshtein Distance, written by Ramon Ziai and Niels Ott. The algorithm is widely used to measure how similar two strings are. What sets this implementation apart is that it operates on sequences of any Java objects that implement equals(Object). Whether you want to compare genome sequences, sequences of numbers, or just strings, here you go!
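To illustrate the idea of computing edit distance over arbitrary objects, here is a minimal, self-contained sketch of the textbook dynamic-programming algorithm with unit costs. It is not the library's source code: the class name EditDistanceSketch and the method distance are made up for this example, and elements are assumed to be non-null and compared only through equals(Object).

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; NOT part of GenericLevenshtein.
public class EditDistanceSketch {

    /**
     * Unit-cost edit distance between two sequences of non-null elements,
     * comparing elements with equals(Object).
     */
    public static <T> int distance(List<T> source, List<T> target) {
        // Two rows of the DP table are enough: the previous row and the current one.
        int[] previous = new int[target.size() + 1];
        int[] current = new int[target.size() + 1];

        // Turning an empty source into the first j target elements takes j inserts.
        for (int j = 0; j <= target.size(); j++) {
            previous[j] = j;
        }

        for (int i = 1; i <= source.size(); i++) {
            // Turning the first i source elements into an empty target takes i deletes.
            current[0] = i;
            for (int j = 1; j <= target.size(); j++) {
                boolean same = source.get(i - 1).equals(target.get(j - 1));
                int replace = previous[j - 1] + (same ? 0 : 1);
                int delete = previous[j] + 1;
                int insert = current[j - 1] + 1;
                current[j] = Math.min(replace, Math.min(delete, insert));
            }
            // Reuse the arrays: the row just filled becomes the previous row.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[target.size()];
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3, 4);
        List<Integer> b = Arrays.asList(1, 3, 4, 5);
        // One delete (2) plus one insert (5).
        System.out.println(distance(a, b)); // prints 2
    }
}
```

Because the only operation performed on the elements is equals(Object), the same method works unchanged for lists of characters, integers, or any other type with a sensible equals.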
Furthermore, the costs of the replace, insert, and delete operations can be customized by implementing the simple WeightCalculator<T> interface. In that case there is no need to rely on equals(Object), as your implementation can compare objects in whatever way you like.
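The following sketch shows what pluggable operation costs can look like in practice. The method signatures of WeightCalculator<T> are not given here, so the example defines its own hypothetical CostModel<T> interface instead. All names in it (WeightedDistanceSketch, CostModel, IgnoreCase, chars) are assumptions made for illustration and should not be read as the library's API.

```java
// Illustrative sketch only; CostModel is a hypothetical stand-in,
// NOT the library's WeightCalculator<T> interface.
public class WeightedDistanceSketch {

    /** Hypothetical cost interface: one method per edit operation. */
    public interface CostModel<T> {
        int replaceCost(T from, T to);
        int insertCost(T inserted);
        int deleteCost(T deleted);
    }

    /** Example cost model: letters that differ only in case count as identical. */
    public static class IgnoreCase implements CostModel<Character> {
        public int replaceCost(Character from, Character to) {
            char a = Character.toLowerCase(from.charValue());
            char b = Character.toLowerCase(to.charValue());
            return a == b ? 0 : 1;
        }

        public int insertCost(Character inserted) {
            return 1;
        }

        public int deleteCost(Character deleted) {
            return 1;
        }
    }

    /** Edit distance where every comparison and cost is delegated to the cost model. */
    public static <T> int distance(T[] source, T[] target, CostModel<T> costs) {
        int[][] d = new int[source.length + 1][target.length + 1];

        // First column: delete every source element.
        for (int i = 1; i <= source.length; i++) {
            d[i][0] = d[i - 1][0] + costs.deleteCost(source[i - 1]);
        }
        // First row: insert every target element.
        for (int j = 1; j <= target.length; j++) {
            d[0][j] = d[0][j - 1] + costs.insertCost(target[j - 1]);
        }

        for (int i = 1; i <= source.length; i++) {
            for (int j = 1; j <= target.length; j++) {
                int replace = d[i - 1][j - 1] + costs.replaceCost(source[i - 1], target[j - 1]);
                int delete = d[i - 1][j] + costs.deleteCost(source[i - 1]);
                int insert = d[i][j - 1] + costs.insertCost(target[j - 1]);
                d[i][j] = Math.min(replace, Math.min(delete, insert));
            }
        }
        return d[source.length][target.length];
    }

    /** Helper: turns a string into an array of boxed characters. */
    public static Character[] chars(String s) {
        Character[] result = new Character[s.length()];
        for (int i = 0; i < s.length(); i++) {
            result[i] = Character.valueOf(s.charAt(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // With IgnoreCase, "HELLO" and "hello" are at distance 0.
        System.out.println(distance(chars("HELLO"), chars("hello"), new IgnoreCase()));
    }
}
```

Note that distance never calls equals(Object) itself. Whether two elements match is decided entirely by replaceCost returning zero, which is the kind of freedom the paragraph above describes.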
Usage Examples
There is a simple convenience method for comparing strings:
System.out.println(SimpleLevenshtein.getStringDistance("Quasselsack", "Niels"));
This demonstrates the use of the generic algorithm:
LevenshteinDistance<Character> levDistance = new LevenshteinDistance<Character>();
System.out.println(levDistance.getDistance(Conversion.convertToArray("Quasselsack"), Conversion.convertToArray("Niels")));
System Requirements
The GenericLevenshtein library requires Java 1.5 or later.
Download
Please be aware that this package is released under the terms of the Apache License v.2.
- ozGenericLevenshtein-0.4.0.jar is the ready-to-use package
- Source code in a Zip file
- View the JavaDoc online or download it in a Zip file