Archive for the ‘Programming’ Category
[FR] Brown Bag Lunch (BBL)
À la manière de David Gageot, je propose de venir dans votre entreprise entre midi et 2 pour vous présenter une session de code sur un sujet donné… contre un sandwich.
Plutôt que de réaliser des sessions de code qui resteront enfermées dans ma société, je propose de partager avec vous les sessions les plus réussies. Pour ce faire, j’ai besoin :
- d’une date,
- d’une salle (avec son adresse et éventuellement des gens dedans),
- d’un projecteur,
- d’un sandwich ^^
Actuellement, je propose les sujets suivant :
-
Réaliser en Scala en moins d’une heure un framework Web qui poutre
Découverte de Scala à travers la réalisation d’un mini framework tout en restant lisible -
Fabriquez vous même votre type Option
Tests unitaires et refactoring pour comprendre la monade Option en Java 8
Par la suite, d’autre sujet pourrait venir.
Sinon, je suis principalement disponible sur Paris et proche Paris.
Tout le détail est ici : https://kerflyn.wordpress.com/bbl/
Playing with Scala Parser Combinator
Notice
The work in this article is in progress. But you are free to read it in its current state and free to post comments. If you want to see the change set, go to the bottom of the page.
Maybe you are of those people who gets headache when writing a parser. LL or LALR grammars, lex and yacc/bison are completely unknown or are some sort of old phantoms from the past, that you are trying to forget since you have leaved the university. You cannot imagine that creating a parser can be as easy as 1-2-3. But, did you take a look at parser combinators?
A really simple parser
Parser combinators comes directly from functional programming to help you create complex parsers in a declarative way. From the Scala point of view, it looks like writing almost directly an EBNF grammar.
Here, we start with a really simple parser that recognizes the string "hello"
only.
import scala.util.parser.combinator.RegexParsers object ReallySimpleParser1 extends RegexParsers { def hello = "hello" }
First, notice that your parser combinator must be declared has an object
(a kind of mix between a singleton and a purely static class). It has to extends one of the object derived from Parsers
. RegexParsers
is one of them. JavaTokenParsers
is another one, with some predefine parsers to recognize Java-compliant identifiers, numbers, strings, etc.
Second, you can see that the pattern to recognize is declared through the method hello
. The method hello
is a Parser
and more exactly a Parser[String]
. A Parser[+T]
is a function that takes an input and returns a ParseResult[T]
. A ParseResult
is a class that has three children:
Success
: the parser has recognized the input. The instance ofSuccess
contains the result and the remaining input.Failure
: the parser has not recognized the input and will back-track. The instance contains an error message and the remaining input.Error
: something fatal occurred. The instance contains an error message and the remaining input. No back-track is proceeded.
Usually, to use your parser combinator, you have to call the method parseAll
. It takes a Parser
(like our method hello
above) and a char sequence as parameters. parseAll
returns a ParseResult
.
Below are two tests as it would appear in tests based on specs2. They show the use of our really simple parser:
import org.specs2.mutable._ import org.specs2.matcher.ParserMatchers class ReallySimpleParser1Test extends Specification with ParserMatchers { val parsers = ReallySimpleParser1 import ReallySimpleParser1.hello "hello" should { "succeed to recognize 'hello'" { hello must succeedOn("hello").withResult("hello") } "fail to recognize 'world'" { hello must failOn("world") } }
Parse numbers
RegexParsers
is used for parsers based on regular expressions. The previous example does not show this.
Below is an example that recognizes numbers. It uses a regular expression to do so:
object NumberParser extends RegexParsers { def number: Parser[Int] = """-?\d""".r ^^ { _.toInt } }
The number parser is more exactly defined in the method number
with the regular expression -?\d+
where -?
represents the optional sign, \d
represents a digit, and thus \d+
a series of at least one digit. The regular expression is followed by a function. The operation ^^
says that if the previous pattern is recognized then apply the following function. The function have to take a string in parameter, representing the recognized sequence, and can return a value of any kind (limited to what has been declared for the method). Otherwise, the method that defines the parser returns implicitly a Parser[String]
.
number must succeedOn("42").withResult(42) number must succeedOn("-24").withResult(-24) number must failOn("hello")
Yet another simple example to parse a sequence
object ReallySimpleParser2 extends RegexParsers { def sentence = hello ~ world def hello = "hello" def world = "world" }
a ~ b
parse a sequence made ofa
thenb
a | b
introduce an alternative parser that parsesa
orb
a?
introduce an optional parsera*
introduce on optional and repeatable parsera+
introduce a repeatable parsera ~> b
like~
but ignore the left member (a
)a <~ b
like~
but ignore the right member (b
)
Parse an arithmetic expression and evaluate it
Remember my previous article on Scala? The one talking about pattern matching? Do you remember that it ends with a way to evaluate arithmetic expressions based on a kind of AST? Here I present the missing part of this article: the parse of an arithmetic expression to generate the AST.
sealed abstract class ExpressionSymbol case class Const(value: Int) extends ExpressionSymbol case object X extends ExpressionSymbol case class Add(x: ExpressionSymbol, y: ExpressionSymbol) extends ExpressionSymbol case class Mul(x: ExpressionSymbol, y: ExpressionSymbol) extends ExpressionSymbol
object ExpressionParser extends RegexParsers { def expression = operation1 def operation1: Parser[ExpressionSymbol] = operation2 ~ rep("+" ~ operation2) ^^ { case op ~ list => list.foldLeft(op) { case (x, "+" ~ y) => Add(x, y) } } def operation2: Parser[ExpressionSymbol] = operand ~ rep("*" ~ operand) ^^ { case op ~ list => list.foldLeft(op) { case (x, "*" ~ y) => Mul(x, y) } } def operand: Parser[ExpressionSymbol] = (constant | variable) def variable: Parser[ExpressionSymbol] = "x" ^^ { _ => X } def constant: Parser[ExpressionSymbol] = """-?\d+""".r ^^ { s => Const(s.toInt) } def apply(input: String): Option[ExpressionSymbol] = parseAll(expression, input) match { case Success(result, _) => Some(result) case NoSuccess(_, _) => None } }
There is a method apply
that aims to simplify the use of the parser. It only takes a string in parameter. It parses it and returns an Option
that indicates if the parse has succeeded or failed.
Changeset
- [2012-08-25] initial publication
- [2012-09-07] added more explanation for the first example + show longer version for the first test
Towards Pattern Matching in Java
Pattern matching is such a cool feature that you always dream of in your language to represent alternative structures. It already exists in OCaml, in Haskell, or in Scala (here is an post in this blog showing the different usages of the pattern matching in Scala: Playing with Scala’s pattern matching). In Java, what is almost closed to pattern matching is named switch-case. But, you’re limited to scalar values (chars, integer types, booleans), enum types, and since Java 7 you can use strings. With the Java’s switch-case, you don’t have a way to check the type of arguments or to really mix the different kind of check, for example. These are features that pattern matching provides.
In this post, we’ll study a possibility to get an implementation in Java close to pattern matching. We’ll use lambda expressions of Java 8 in a view to get a more readable structure. Here, I don’t propose to make a real pattern matching, but to get a structure that can match different kinds of patterns and which is more flexible than the traditional switch-case.
Main implementation
First, we need a PatternMatching class. This class is a set of Pattern instances. Each Pattern can check if a value matches (a pattern) and is bound to a process/function that it can apply on.
The class PatternMatching provides just the method matchFor
. This method tests the different Pattern instances that it contains to a value. If a Pattern matches, the method matchFor
is applied to the value the function bound to the Pattern. If no pattern has been found, the method throws an exception.
To instantiate this class, you’ll need to provide a set of patterns.
public class PatternMatching { private Pattern[] patterns; public PatternMatching(Pattern... patterns) { this.patterns = patterns; } public Object matchFor(Object value) { for (Pattern pattern : patterns) if (pattern.matches(value)) return pattern.apply(value); throw new IllegalArgumentException("cannot match " + value); } }
Pattern
Pattern is an interface that provides two methods. matches
should check if an object meets some requirements. apply
should execute a process on a object. apply
should be called once matches
succeeded.
public interface Pattern { boolean matches(Object value); Object apply(Object value); }
A first Pattern example: pattern for types
For example, ClassPattern is a Pattern implementation that checks the type of a value.
public class ClassPattern<T> implements Pattern { private Class<T> clazz; private Function<T, Object> function; public ClassPattern(Class<T> clazz, Function<T, Object> function) { this.clazz = clazz; this.function = function; } public boolean matches(Object value) { return clazz.isInstance(value); } public Object apply(Object value) { return function.apply((T) value); } public static <T> Pattern inCaseOf(Class<T> clazz, Function<T, Object> function) { return new ClassPattern<T>(clazz, function); } }
Take a look at the example below:
PatternMatching pm = new PatternMatching( inCaseOf(Integer.class, x -> "Integer: " + x), inCaseOf(Double.class, x -> "Double: " + x) ); System.out.println(pm.matchFor(42)); System.out.println(pm.matchFor(1.42));
This code displays:
Integer: 42 Double: 1.42
Usual patterns
Here are some “usual patterns”, in the sense that they imitate some functionality you already have with switch-case. First, we start with the pattern to match strings:
public class StringPattern implements Pattern { private final String pattern; private final Function<String, Object> function; public StringPattern(String pattern, Function<String, Object> function) { this.pattern = pattern; this.function = function; } @Override public boolean matches(Object value) { return pattern.equals(value); } @Override public Object apply(Object value) { return function.apply((String) value); } public static Pattern inCaseOf(String pattern, Function<String, Object> function) { return new StringPattern(pattern, function); } }
Now, here is a pattern to match integers:
public class IntegerPattern implements Pattern { private final Integer pattern; private final Function<Integer, Object> function; public IntegerPattern(int pattern, Function<Integer, Object> function) { this.pattern = pattern; this.function = function; } @Override public boolean matches(Object value) { return pattern.equals(value); } @Override public Object apply(Object value) { return function.apply((Integer) value); } public static Pattern inCaseOf(int pattern, Function<Integer, Object> function) { return new IntegerPattern(pattern, function); } }
And to finish, here is a pattern that matches everything. It is the equivalent of default
in switch-case.
public class OtherwisePattern implements Pattern { private final Function<Object, Object> function; public OtherwisePattern(Function<Object, Object> function) { this.function = function; } @Override public boolean matches(Object value) { return true; } @Override public Object apply(Object value) { return function.apply(value); } public static Pattern otherwise(Function<Object, Object> function) { return new OtherwisePattern(function); } }
Now, you can express something like this:
PatternMatching pm = new PatternMatching( inCaseOf(Integer.class, x -> "Integer: " + x), inCaseOf( "world", x -> "Hello " + x), inCaseOf( 42, _ -> "forty-two"), otherwise( x -> "got this object: " + x.toString()) );
Factorial based on PatternMatching
Our pattern matching API in Java 8 can define the body of recursive functions:
public static int fact(int n) { return ((Integer) new PatternMatching( inCaseOf(0, _ -> 1), otherwise( _ -> n * fact(n - 1)) ).matchFor(n)); }
Here, we use the lambda expressions to improve the readability. In this case, Java 8 is necessary. But it is easy to use the implementations of PatternMatching and Pattern above with former versions of Java. Nevertheless, you have to sacrify the readability.
For example, here is the implementation of the same factorial function for versions 6 and 7 of Java:
public static int fact_old(final int n) { return ((Integer) new PatternMatching( inCaseOf(0, new Function<Integer, Object>() { @Override public Object apply(Integer _) { return 1; } }), otherwise(new Function<Object, Object>() { @Override public Object apply(Object _) { return n * fact(n - 1); } }) ).matchFor(n)); }
Conclusion
In this post, we’ve seen a way to define a structure close to pattern matching. This structure helps us to extend the semantic of switch-case in Java. Besides, it’s extendable with new implementations based on Pattern. The use of the Java 8’s lambda expressions make it more readable.
Nevertheless, if a PatternMatching is defined in a method, it’ll be completely reinstantiated at each call of this method. But if performance is needed, you can store the pattern matching in a constant.
So, we are able to represent pretty much the same functionnality than switch-case. The usual patterns I haven’t presented can be easily implemented. In addition, we can find a way to do pattern matching based on regular expressions. What is missing here is the ability to aggregate a set of patterns to the same process (ie. a kind of OR relation between some patterns for the same process), the addition of guard condition and above all the ability to represent algebraic types and use them in our structure.
From Option to Monad with Java Only
In a previous post, I’ve shown how to use the type Optional provided by Guava, in a view to create a monad. Personally, I’m not satisfied with the resulting code. Since, I’ve found another way to represent the Option monad in Java only, and the new representation seems to be better.
So we start with an interface to represent functions (this is the same as the Guava).
public interface Function<T, R> { R apply(T input); }
Then we implement the Option type. This type contains three monadic operations :
wrap
that converts an object into an Option,andThen
that applies a function to the current instance of Option,fail
that returns a None instance.
I add also two getter methods : get
and isPresent
.
public abstract class Option<T> { // opérations de base public abstract T get(); public abstract boolean isPresent(); // opérations monadiques public static <T> Option<T> wrap(T element) { return new Some<T>(element); } public static <T> Option<T> fail() { return new None<T>(); } public abstract <U> Option<U> andThen(Function<T, Option<U>> function); }
And now, here is the implementation of the classes Some and None.
public class None<T> extends Option<T> { public T get() { throw new IllegalStateException(); } public boolean isPresent() { return false; } public <U> Option<U> andThen(Function<T, Option<U>> function) { return new None<U>(); } } public class Some<T> extends Option<T> { private T element; private Some(T element) { this.element = element; } public T get() { return element; } public boolean isPresent() { return true; } public <U> Option<U> andThen(Function<T, Option<U>> function) { return function.apply(this.element); } }
To come back to the example of the previous post, here are implementations of the different accessors that enables you to explore the (ugly) recursive structure made of Java Maps representing a set of products arranged according to their supplier, city, and country:
public static Function< Map<String, Map<String, Map<String, Map<String, Product>>>>, Option<Map<String, Map<String, Map<String, Product>>>> > accessCountry(String country) { return getFromKey(country); } public static Function< Map<String, Map<String, Map<String, Product>>>, Option<Map<String, Map<String, Product>>> > accessCity(String city) { return getFromKey(city); } public static Function< Map<String, Map<String, Map<String, Product>>>, Option<Map<String, Map<String, Product>>> > accessCity(String city) { return getFromKey(city); } public static Function< Map<String, Map<String, Product>>, Option<Map<String, Product>> > accessSupplier(String supplier) { return getFromKey(supplier); } public static Function< Map<String, Product>, Option<Product> > accessProduct(String productCode) { return getFromKey(productCode); }
Here is the definition of the method getFromKey
:
public static <K, V> Function<Map<K, V>, Option<V>> getFromKey(final K key) { return new Function<Map<K, V>, Option<V>>() { @Override public Option<V> apply(Map<K, V> map) { V value = map.get(key); if (value == null) { return Option.none(); } return Option.wrap(value); } }; }
Now look how the method getProductFrom
is simplified:
public Option<Product> getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { return wrap(productsByCountry) .andThen(accessCountry(country)) .andThen(accessCity(city)) .andThen(accessSupplier(supplier)) .andThen(accessProduct(code)); }
But I cheating a little bit with the way monads are usually used. Normally, a succession of process calls within a monad isn’t a linear chaining like in the example above, but a kind of recursive chaining. I can get such a chaining (without the ugliness of the Java verbosity) if I use the lambda expression of Java 8.
public Option<Product> getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { return getFrom(productsByCountry, country).andThen( cities -> getFrom(cities, city).andThen( suppliers -> getFrom(suppliers, supplier).andThen( products -> getFrom(products, code)))); } public <K, V> Option<V> getFrom(Map<K, V> map, K key) { V value = map.get(key); if (value == null) { return Option.none(); } return Option.wrap(value); }
As a remainder, the construct x -> f(x)
currently defines a lambda expression in Java 8.
From Optional to Monad with Guava
Guava is a kind of Swiss Army knife API proposed by Google for Java 5 and more. It contains many functionalities that help the developer in its every day life. Guava proposes also a basic functional programming API. You can represent almost statically typed functions with it, map functions on collections, compose them, etc. Since its version 10.0, Guava has added a new class called Optional
. It’s an equivalent to the types Option
in Scala or Maybe
in Haskell. Optional
aims to represent the existence of a value or not. In a sense, this class is a generalization of the Null Object pattern.
In this article, we see the Optional
class in action through two use cases involving java.util.Map
instances. The first use case is a very basic one. It shows how to use the Optional
class. The second use case is based on a problem I’ve met in a real project. We’ll see if Optional
is helpful in this case.
Optional vs. null
Suppose that you want to get a value from a Map
. The value is supposed to be located under the key "a"
, but you aren’t sure. In the case where the key "a"
doesn’t exist, the Map implementation is supposed to return null
. But, you want to continue with a default value.
Here, you have two solutions. This one
Integer value = map.get("a"); if (value == null) { value = DEFAULT; } process(value);
And, this one
Integer value = map.get("a"); process(value == null ? DEFAULT : value);
In order to use the Optional
class, we need to define a new accessor for Map
instances.
public static <K, V> Optional<V> getFrom(Map<K, V> map, K key) { Optional.fromNullable(maps.get(key)) }
We notice that you can instantiate Optional by different ways: 1/ by a call to Optional.absent()
if there is nothing to return (except the Optional
instance), 2/ by of call to Optional.of(value)
if you want to return a value. It sounds logical that an accessor to Map
returns an Optional
, because you aren’t sure that the given key exists in the Map
instance.
Below is the same program but using Optional
class.
process(getFrom(map, "a").or(DEFAULT));
With this code, we don’t see null
anymore.
Going further: Optional to the limit
Now, we suppose that we develop a application based on a set of products differentiated by a unique identifier. The products are organized by product identifier, supplier, city, and country. In order to stock these products, imbricated Map
are used: the first level uses the country, the second level uses the city, the third level uses the supplier, and the fourth level uses the product identifier to give access to the product. Here is the code you get with no use of Optional
class (please, don’t do this at work! Seriously, don’t do this!)
public Product getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { Map<String, Map<String, Map<String, Product>>> productsByCity = productsByCountry.get(country); if (productsByCity != null) { Map<String, Map<String, Product>> productsBySupplier = productsByCity.get(city); if (productsBySupplier != null) { Map<String, Product> productsByCode = productsBySupplier.get(supplier); if (productsByCode != null) { return productByCode.get(code); } } } return null; }
This doesn’t sounds great, is it?
Now, below is the solution using Optional
class.
public Optional<Product> getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { Optional<Map<String, Map<String, Map<String, Product>>>> productsByCity = getFrom(productsByCountry, country); if (productsByCity.isPresent()) { Optional<Map<String, Map<String, Product>>> productsBySupplier = getFrom(productsByCity.get(), city); if (productsBySupplier.isPresent()) { Optional<Map<String, Product>> productsByCode = getFrom(productsBySupplier.get(), supplier); if (productsByCode.isPresent()) { return getFrom(productByCode.get(), code); } } } return Optional.absent(); }
It looks ugly too!
In a view to simplified this implementation, I propose to introduce the notion of monad.
Option(al) monad
A monad is a programming structure with two operations: unit and bind. Applied to the Optional
class, unit converts a value into an Optional
instance and bind applied to an Optional
a function from value to Optional
. Below, you’ve there definitions in Java:
public class OptionalMonad { public static <T> Optional<T> unit(T value) { return Optional.of(value); } public static <T, U> Optional<U> bind( Optional<T> value, Function<T, Optional<U>> function) { if (value.isPresent()) return function.apply(value.get()); else return Optional.absent(); } }
Notice that bind checks the presence of a value before to apply the function.
Now, to be used by our example in the section above, we need to define a function that will be used as parameter for the bind operator. This function is based on the method getFrom
, previously defined, which gives access to a value of a Map
from a given key.
public static <K, V> Function<Map<K, V>, Optional<V>> getFromKey(final K key) { return new Function<Map<K, V>, Optional<V>>() { @Override public Optional<V> apply(Map<K, V> map) { return getFrom(map, key); } }; }
Here is the new code
public Optional<Product> getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { Optional<Map<String, Map<String, Map<String, Product>>>> productsByCity = bind(unit(productsByCountry), Maps2.<String, Map<String, Map<String, Map<String, Product>>>>getFromKey(country)); Optional<Map<String, Map<String, Product>>> productsBySupplier = bind(productsByCity, Maps2.<String, Map<String, Map<String, Product>>>getFromKey(city)); Optional<Map<String, Product>> productsByCode = bind(productsBySupplier, Maps2.<String, Map<String, Product>>getFromKey(supplier)); Optional<Product> product = bind(productsByCode, Maps2.<String, Product>getFromKey(code)); return product; }
Or more directly
public Optional<Product> getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { return bind(bind(bind(bind(unit(productsByCountry), Maps2.<String, Map<String, Map<String, Map<String, Product>>>>getFromKey(country)), Maps2.<String, Map<String, Map<String, Product>>>getFromKey(city)), Maps2.<String, Map<String, Product>>getFromKey(supplier)), Maps2.<String, Product>getFromKey(code)); }
OK! It’s weird too. We have to ‘fight’ with inline recursive calls of the bind
method and also with Java generics. The Java type inference system isn’t sufficiently powerful to guess them. To reduce occurrences of generics in this code, we could have written a specific version of getFromKey
method for each part of the given Map
: getFromCountry
, getFromCity
, getFromSupplier
, etc. This moves and distributes the complexity of the code in those methods.
public Function< Map<String, Map<String, Map<String, Map<String, Product>>>>, Optional<Map<String, Map<String, Map<String, Product>>>> > getFromCountry(String country) { // type inference system knows what to do here return Maps2.getFromKey(country); } // declaration of the other methods here // ... public Optional<Product> getProductFrom( Map<String, Map<String, Map<String, Map<String, Product>>>> productsByCountry, String country, String city, String supplier, String code) { return bind( bind( bind( bind( unit(productsByCountry), getFromCountry(country)), getFromCity(city)), getFromSupplier(supplier)), getFromCode(code)); }
The positive side lies into the fact that the if imbrications have disappeared and the code is linear. Notice that due to the bind
operator, if one of those call to getFromKey
method returns an Optional.absent()
, then all the following call to bind
will also return an Optional.absent()
.
Is there an happy end for this?
It’s difficult to do something better in Java by using the Optional
class (unless you have better solution). By considering another JVM language, you’ve below a solution in Scala.
def getProductFrom(products: Map[String, Map[String, Map[String, Map[String, Product]]]], country: String, city: String, supplier: String, code: String): Option[Product] = { for ( cities <- products.get(country); suppliers <- cities.get(city); codes <- vendors.get(supplier); product <- codes.get(code) ) yield product }
Here, the for
structure provides syntactic sugar to do something close to imperative programming. But in fact, each line in this for
structure does the same thing as our previous implementations using the bind
operator. The get
method here acts the same way as our getFrom
method by returning an instance of the type Option
. The type Option
in Sala is equivalent to the type Optional
in Guava. So when you call our Scala version of getProductFrom
, if all your parameters appears in the Map
, it returns a product. But if one of the parameter isn’t present, you get the default value.
In Other JVM languages proposes the safe-navigation or null-safe operator value?.method()
, like in Groovy, in Fantom, in Gosu, etc. It checks if the value isn’t null
before calling the method.
// get a product or null product = products?.get(country)?.get(city)?.get(supplier)?.get(code)
For the kind of problems presented in this post, the null-safe operator does a better job than the monad approach. In fact, monads represent a programming structure with a really wider scope. Associated with types other than Optional
, you can extend the application field of monads and get the necessary expressivity to explore non-determinism, concurrent programming, handle of errors, etc.
Conclusion
So, we’ve seen the class Optional
provided by Guava in two use cases. For the simple case, we’ve seen how Optional
helps to make null reference disappears. But for the complex case, we’ve seen that Optional
alone provides no real advantage. Optional
class can make the approach a little easier if we introduce the notion of Optional
monad. Thereby, not only the null references disappear but also the recursive if structure. But, we have to fight with the Java generics. And even after this, the code is still hard to read.
It seems clear that to explore a recursive data structure made of Map
with Java, you have to choose between to fight with if statements or to fight with generics. But, Java may be not the good language for this kind of problem. In other hand, you should ask yourself if using such a structure is a judicious choice?
EDIT 2012-02-23: simplified getFrom method + corrected the returned type from getFromCountry method.
Implementing the Factorial Function Using Java and Guava
What I like with factorial function is that it’s easy to implement by using different approaches. This month, I’ve presented functional programming and some of its main concepts (recursion, tail recursion optimization, list processing) through different ways to code the factorial function. After the presentation, I’ve asked the audience to implement the factorial by using a recursive and a tail recursive approaches, with Java alone, then with Guava’s Function
interface.
In this post, we see the solutions of the exercises I’ve proposed and some explanation about the functional programming concepts used.
Factorial
Here is a reminder of the behavior of the factorial:
The factorial of a given integer n (or n!) is the product of the integers between 1 and n.
In imperative style, you write the factorial like this:
public static int factorial(int n) { int result = 1; for (int i = 1; i <= n; i++) { result *= i; } return result; }
The recursive solution in Java
The previous implementation of factorial is based explicitly on state manipulation: the variable result
is changed successively in the for loop at each iteration. The functional approach dislike particularly the explicit manipulation of a state in the body of a function. Usually, functional programming prohibits the use of loops like for, while, repeat, etc., for this reason. The only way to express a loop is to use recursion (ie. a function that calls itself).
In Mathematics, you express the factorial recursively like this:
0! = 1
n! = n . (n – 1), for n > 0
Or in plain English:
The factorial of 0 is 1 and the factorial of n is the product between n and the factorial of the preceding integer.
The implementation in Java of the recursive factorial is closed to the mathematic definition:
public static int factorial(int n) { if (n == 0) return 1; else return n * factorial(n - 1); }
Recursive solution in Guava
Guava proposes a function representation as object. It’s based on the generic interface Function<T, R>
. It looks like this very simple implementation:
public interface Function<T, R> { R apply(T value); }
T is the input type and R the returned type of the function. If you create an instance based on the interface Function, you’ve to defined the method apply(), that represents the body of the function. Here’s the implementation of the recursive version of factorial based on Guava:
Function<Integer, Integer> factorial = new Function<Integer, Integer>() { @Override public Integer apply(Integer n) { if (n == 0) return 1; else return n * apply(n - 1); } };
If you want to use this function, you’ve to write this:
Integer result = factorial.apply(5);
The Guava approach has this particularity to allow you to declare recursive and anonymous functions:
Integer result = new Function<Integer, Integer>() { @Override public Integer apply(Integer n) { if (n == 0) return 1; else return n * apply(n - 1); } }.apply(5);
Tail recursion in Java
There’s another way to implement the recursive version of the factorial. This approach consists in having the recursive call executed just before to return from the function. There must have no other instruction to execute between the recursive call and the return instruction. This approach is called tail recursion. The previous implementation isn’t a case of tail recursion, because you have to execute a multiplication between n and the value returned by the recursive call before to exit the function.
In order to implement a tail recursive factorial, we have to introduce a second parameter (named k) to the function. This parameter contains the partial result of the function through the successive recursive calls. Once the stop condition is reached, k contains the final result. Below, you’ve the implementation of the tail recursive factorial:
private static int fact(int n, int k) { if (n == 0) return k; else return fact(n - 1, n * k); } public static int factorial(int n) { return fact(n, 1); }
Notice that for the first call, the parameter k is initialized to 1. This relates to the basis case: when n is 0 then the function should return 1.
Motivation behind the tail recursion
There’s an important difference in behavior between the recursive and the tail recursive implementations. These difference is visible through the the call stack. In the recursive case, the result is built as you come back from recursive calls. Here is the different states of the call stack in recursive version when we call factorial(5)
:
-> factorial(5) // first call -> factorial(4) -> factorial(3) -> factorial(2) -> factorial(1) -> factorial(0) // here, we've reached the stop condition <- 1 <- 1 = 1 * 1 // all remaining multiplications are executed <- 2 = 2 * 1 <- 6 = 3 * 2 <- 24 = 4 * 6 <- 120 = 5 * 24 // we have computed the result in the last return
Now, you can see the call stack for the tail recursive implementation for the same call:
-> fact(5, 1) // first call -> fact(4, 5) // result is built through the successive calls -> fact(3, 20) -> fact(2, 60) -> fact(1, 120) -> fact(0, 120) // here, we've reached the stop condition <- 120 // the final result is obtained directly in the last recursive call <- 120 <- 120 <- 120 <- 120 <- 120
You can notice that when we’re coming back from the recursive calls here, the same value is returned. Thus, we can easily imagine to optimize the tail recursive implementation. In fact, there two possible optimizations at this level. The first one is call trampolining. It consists in generating some small modification where recursive call is marked bounce and return is marked landing. Then an external fonction is used to emulate the recursive calls based on a while loop.
The second optimization uses a deeper transformation of the source code, where the while loop is directly put inside the function body. For the tail recursive implementation of the factorial function, this second optimization would turn our source code into this:
public static int factorial(int n, int k) { while (!(n == 0)) { k = n * k; n = n - 1; } return k; }
These optimizations prevent the deep use of the call stack. Thus, you have no occurrence of stack overflow. But, you might have an infinite loop if you mistype the stop condition. The advantage of the second optimization over the first one and all recursive implementations is that it’s really quick as there’s no use of the call stack. A language like Scala proposes this second optimization by default. A precise look at the produced bytecode shows the transformation of the recursive call into a goto.
These optimizations aren’t present in Java.
Notice that not all recursive functions can be converted to a tail recursive function. This is the case of the function that computes the Fibonnacci series. It based on the jonction of two recursive calls.
public static fib(int n) { if (n <= 1) return 1; else return fib(n-1) + fib(n-2); }
Tail recursion in Guava
We’ve seen that the tail recursive implementation of the factorial needs two parameters. But in Guava, you can only define functions that accept a unique argument! So how do we do to transform a function of one argument into a function of two arguments?
There are two possibilities. The first one consists in creating a class that represents a pair of elements. Thus a function that takes two arguments is equivalent to a function that takes a pair of elements. The second possibility consists in using the curryfication. The curryfication is a use case of the higher order functions. With this, a function that takes many arguments is converted into a function that takes the first argument, and return a function that takes the second argument, and so on till we get the last argument. For example, the addition function (add
) is typically a function of two arguments (a
and b
). You can write add
in such a way that add
takes a
and return a function that waits for b
. Once you get b
, the function is evaluated. The interest of such an approach isn’t to force you to provide all parameters of a function at the same time. For our add
function, you can use add
directly to execute an addition: add.apply(1).apply(3) == 4
. Or you can use add
to define the function add_one
just by providing the first parameter only: add_one = add.apply(1)
. Then, you can provide the second parameter when you want through add_one
: add_one.apply(3) == 4
.
Below is the implementation of tail recursive factorial based on Guava. The signature of this function is Function<Integer, Function<Integer, Integer>>
. Here, you’ve to understand:
factorial is a function that takes a first parameter n and returns another function that takes a second parameter k and returns the factorial of n.
public class FactorialFunction implements Function<Integer, Function<Integer, Integer>> { @Override Function<Integer, Integer> apply(final Integer n) { return new Function<Integer, Integer>() { @Override public Integer apply(Integer k) { if (n == 0) return k; else return FactorialFunction.this.apply(n - 1).apply(k * n); } }; } } FactorialFunction fact = new FactorialFunction(); Integer result = fact.apply(5).apply(1); // factorial of 5
Notice the use of FactorialFunction.this
in order to use the outer object in the inner one. You have to reference the outer object in order to set all parameters before the recursive call. This forces you to create a named class to represent your factorial function.
Compare this implementation with the implementation below in Haskell, which is tail recursive and curryfied already. They’re both equivalent:
fact n k = if n == 0 then k else fact (n-1) (n*k)
In fact, there are differences in the Guava implementation: it isn’t optimized and you create a new object for each recursive call.
Instantiate in Java with the Scala’s Way
Case classes in Scala facilitate the instantiation by removing the use of the keyword new
. It isn’t a big invention, but it helps to have a source code more readable. Especially when you imbricate these instantiations one in another. The force of Scala’s case classes is to provide a way to practice symbolic computation — in my post named Playing with Scala’s pattern matching, read the section Advanced pattern matching: case class.
Suppose that you want to declare a type that represents a pair of elements. We don’t know the type of these elements and they may have different types. In Scala, you’ll write something like this:
case class Pair[T1, T2](first: T1, second: T2) // usage: val myPair = Pair(1, 2)
The implementation in Java below is a (hugely) simplified equivalent to the implementation in Scala:
public class Pair<T1, T2> { private T1 first; private T2 second; public Pair(T1 first, T2 second) { this.first = first; this.second = second; } public static <T1, T2> Pair<T1, T2> Pair(T1 first, T2 second) { return new Pair(first, second); } }
Here is an example where I design a binary tree structure with the help of the Java version of Pair
. Notice that with this approach, I don’t have to use the diamond declaration in the instantiation (ie. Pair<Integer, Pair<String, ...>>
). It’s automatically determined by the type inference algorithm.
Pair<Integer, Pair<String, Pair<Boolean, Object>>> tree = Pair(1, Pair("hello", Pair(true, new Object())));
Now, you can say it’s really helpful!
My First Coding Dojo
I took part in a coding dojo. We aimed to find the firsts prime numbers with the help of Clojure. The primary goal was not to solve the prime number problem. In fact, we tried to learn how to program in Clojure, also trying to be as close to the Clojure’s programming style as we could.
Photo by Ulrich Vachon
We started from a configuration using Cake to manage the project, a text editor, and some unit tests. We’ve worked by pair, changing one member and switching roles time to time.
I must confess that it isn’t something easy to think FP and write a program in Clojure. But the cohesion of the group and the overall will to reach the goal made it easier to finally write a program that succeeded all the tests.
So, after 1 hour and 20 minutes, here’s the result of our dojo :
Comparing Java APIs for Functional Programming [FR]
While the Java Community Process (JCP) has announced the appearance of the functional programming in the Java language, with the introduction of the lambda expressions (JSR 335: Lambda Expressions for the JavaTM Programming Language), is it possible with the current version of Java to practice this paradigm? While writing those lines, the JCP is in a deep brainstorming on this topic. There are different propositions about the syntax to adopt for the JSR 335 : a straw-man proposal, a prototype for OpenJDK is in progress, the BGGA proposal (Bracha, Gafter, Gosling, and von der Ahé), etc. But none of these syntaxes have formalized. Nevertheless, a first draft should be available during September 2011 and should appear in 2012 with Java 8.
Till then, there are different APIs that allow the developers to use functional programming with Java and they don’t have to learn a new the language.
My first article on the Xebia’s blog is available in French: http://blog.xebia.fr/2011/06/29/comparaison-dapi-java-de-programmation-fonctionnelle/
How TreeMap can save your day?
In Java collections, the TreeMap is a sorted and navigable map that organizes elements in a self-balancing binary tree. It can help you solve problems like “Which element in this collection is the closest to this one?”
Introduction
You want to plan reservations for a room. Here are some reservations:
- From 4-Jan to 7-Jan, occupant: Mr. A
- From 10-Jan to 21-Jan, occupant: Mrs. B
- From 5-Feb to 18-Feb, occupant: Mr. A
- From 20-Feb to 3-Mar, occupant: Mr. C
Suppose that there is no possibility for a reservation to override another one. How do you determine in a programmatic way who is the occupant the 10-Feb? And the 19-Feb?
Algorithm
There are many possibilities to answer to those questions. Basically, one of them is to store the reservations in a set. When searching for an occupant at a given date, you walk through the set, testing each element. You stop when you have a reservation that contains the given date or when there is no more reservation to test. This is a linear search algorithm. In the worst case, the time complexity is O(n), ie. when you have to test all elements and there is no matching element or the matching element is the last one.
To improve this algorithm, you can also use a binary search algorithm. It consists at each step in dividing the set of elements in two parts and continuing searching in one of those parts. This algorithm needs a set of sorted elements. Its time complexity is O(log(n)) in the worst case. This is better because if you have 7 elements, you will only need 3 tests against 7 for linear search, in the worst case. And if you have 1000 elements, you will only need 10 tests against 1000 for the linear version, in the worst case. However, with the binary search algorithm, you may lost time while sorting elements. Java proposes in classes java.util.Arrays and java.util.Collections the method sort() the uses the merge sort algorithm that guaranties a time complexity of O(n.log(n)). Thus, sorting + binary search is worst than a linear search, that doesn’t need a sorted set of elements. Note that Java SE proposes implementations of the binary search algorithm in the methods java.util.Arrays.binarySearch() and java.util.Collections.binarySearch().
Another approach to the binary search is to generate the corresponding binary tree. Indeed, the binary search algorithm is like a walk through a height-balanced binary tree. At each step of the algorithm, we select the left part (left branch) or right part (right branch) of the subset of elements (the subtree) according to a central element (a node). But the big difference is that if you apply an algorithm like the red-black tree for your binary tree, two aspects are guaranteed:
- Each time you add an element, it is sorted with the rest of the tree.
- The tree is height-balanced. It means that its maximal height is O(log(n)).
For such a tree, the operations insert and search cost of time is O(log(n)) each.
The class java.util.TreeMap represents such a height-balancing binary tree and uses the red-black tree algorithm. But what is even more interesting with this class, is that it provides the methods ceilingEntry() and floorEntry(). They return the map entry which key is the closest respectively after and before the given key.
Representation of a reservation
Below is the source code of a simplified class to represent such reservations. This implementation uses guava.
public class Reservation { public Date from; public Date to; public String occupant; public Reservation(Date from, Date to, String occupant) { this.from = from; this.to = to; this.occupant = occupant; } public boolean contains(Date date) { return !(from.after(date) || to.before(date)); } @Override public String toString() { return Objects.toStringHelper(this) .add("from", from) .add("to", to) .add("occupant", occupant) .toString(); } @Override public int hashCode() { return Objects.hashCode(from, to, occupant); } @Override public boolean equals(Object obj) { if (this == obj) { return true; } if (obj == null || !(obj instanceof Reservation)) { return false; } Reservation reservation = (Reservation) obj; return Objects.equals(from, reservation.from) && Objects.equals(to, reservation.to) && Objects.equals(occupant, reservation.occupant); } }
We declare below all four reservations that you can find in the introduction at the top of the post.
Date from, to; Calendar calendar = Calendar.getInstance(); calendar.set(2011, 0, 4); from = calendar.getTime(); calendar.set(2011, 0, 7); to = calendar.getTime(); Reservation reserv1MrA = new Reservation(from, to, "Mr. A"); calendar.set(2011, 0, 10); from = calendar.getTime(); calendar.set(2011, 0, 21); to = calendar.getTime(); Reservation reserv1MrsB = new Reservation(from, to, "Mrs. B"); calendar.set(2011, 1, 5); from = calendar.getTime(); calendar.set(2011, 1, 18); to = calendar.getTime(); Reservation reserv2MrA = new Reservation(from, to, "Mr. A"); calendar.set(2011, 1, 20); from = calendar.getTime(); calendar.set(2011, 2, 3); to = calendar.getTime(); Reservation reserv1MrC = new Reservation(from, to, "Mr. C");
How to find a reservation from a given date
I group a set of reservations in a class that I’ve called Planning. Once instantiated, you can add reservations (method add()) and you can get a reservation at a given date (method getReservationAt()).
There are two steps to get a reservation close to a date.
- I first use the method floorEntry(). It gets you the map entry which key is the greatest one less than the given date.
- Second, I check if the reservation (that is the value of the entry) contains the given date.
public class Planning { public final TreeMap<Date, Reservation> reservations; public Planning() { reservations = new TreeMap<Date, Reservation>(); } public void add(Reservation reservation) { reservations.put(reservation.from, reservation); } public Reservation getReservationAt(Date date) { Entry<Date, Reservation> entry = reservations.floorEntry(date); if (entry == null) { return null; } Reservation reservation = entry.getValue(); if (!reservation.contains(date)) { return null; } return reservation; } }
Test
We first store the reservations declared above in a planning.
Planning planning = new Planning(); planning.add(reserv1MrA); planning.add(reserv1MrsB); planning.add(reserv2MrA); planning.add(reserv1MrC);
Now, who has made a reservation the 10-Feb and the 19-Feb? (Here, I use FEST-assert as assertion API.)
Calendar calendar = Calendar.getInstance(); calendar.set(2011, 1, 10); Date dateOfMrA = calendar.getTime(); assertThat(planning.getReservationAt(dateOfMrA).occupant).isEqualTo("Mr. A"); calendar.set(2011, 1, 19); Date dateNoReservation = calendar.getTime(); assertThat(planning.getReservationAt(dateNoReservation)).isNull();
You must be logged in to post a comment.