Kerflyn's Blog

Well… It's a blog!

Posts Tagged ‘pattern matching

Towards Pattern Matching in Java

Pattern matching is such a cool feature that you always dream of in your language to represent alternative structures. It already exists in OCaml, in Haskell, or in Scala (here is an post in this blog showing the different usages of the pattern matching in Scala: Playing with Scala’s pattern matching). In Java, what is almost closed to pattern matching is named switch-case. But, you’re limited to scalar values (chars, integer types, booleans), enum types, and since Java 7 you can use strings. With the Java’s switch-case, you don’t have a way to check the type of arguments or to really mix the different kind of check, for example. These are features that pattern matching provides.

In this post, we’ll study a possibility to get an implementation in Java close to pattern matching. We’ll use lambda expressions of Java 8 in a view to get a more readable structure. Here, I don’t propose to make a real pattern matching, but to get a structure that can match different kinds of patterns and which is more flexible than the traditional switch-case.

Main implementation

First, we need a PatternMatching class. This class is a set of Pattern instances. Each Pattern can check if a value matches (a pattern) and is bound to a process/function that it can apply on.

The class PatternMatching provides just the method matchFor. This method tests the different Pattern instances that it contains to a value. If a Pattern matches, the method matchFor is applied to the value the function bound to the Pattern. If no pattern has been found, the method throws an exception.

To instantiate this class, you’ll need to provide a set of patterns.

public class PatternMatching {
    private Pattern[] patterns;

    public PatternMatching(Pattern... patterns) { this.patterns = patterns; }

    public Object matchFor(Object value) {
        for (Pattern pattern : patterns)
            if (pattern.matches(value))
                return pattern.apply(value);

        throw new IllegalArgumentException("cannot match " + value);
    }
}

Pattern

Pattern is an interface that provides two methods. matches should check if an object meets some requirements. apply should execute a process on a object. apply should be called once matches succeeded.

public interface Pattern {
    boolean matches(Object value);
    Object apply(Object value);
}

A first Pattern example: pattern for types

For example, ClassPattern is a Pattern implementation that checks the type of a value.

public class ClassPattern<T> implements Pattern {

    private Class<T> clazz;

    private Function<T, Object> function;

    public ClassPattern(Class<T> clazz, Function<T, Object> function) {
        this.clazz = clazz;
        this.function = function;
    }

    public boolean matches(Object value) {
        return clazz.isInstance(value);
    }

    public Object apply(Object value) {
        return function.apply((T) value);
    }

    public static <T> Pattern inCaseOf(Class<T> clazz, Function<T, Object> function) {
        return new ClassPattern<T>(clazz, function);
    }
}

Take a look at the example below:

PatternMatching pm = new PatternMatching(
    inCaseOf(Integer.class, x -> "Integer: " + x),
    inCaseOf(Double.class, x -> "Double: " + x)
);

System.out.println(pm.matchFor(42));
System.out.println(pm.matchFor(1.42));

This code displays:

Integer: 42
Double: 1.42

Usual patterns

Here are some “usual patterns”, in the sense that they imitate some functionality you already have with switch-case. First, we start with the pattern to match strings:

public class StringPattern implements Pattern {
    private final String pattern;
    private final Function<String, Object> function;

    public StringPattern(String pattern, Function<String, Object> function) {
        this.pattern = pattern;
        this.function = function;
    }

    @Override
    public boolean matches(Object value) { return pattern.equals(value); }

    @Override
    public Object apply(Object value) { return function.apply((String) value); }

    public static Pattern inCaseOf(String pattern, Function<String, Object> function) {
        return new StringPattern(pattern, function);
    }
}

Now, here is a pattern to match integers:

public class IntegerPattern implements Pattern {
    private final Integer pattern;
    private final Function<Integer, Object> function;

    public IntegerPattern(int pattern, Function<Integer, Object> function) {
        this.pattern = pattern;
        this.function = function;
    }

    @Override
    public boolean matches(Object value) { return pattern.equals(value); }

    @Override
    public Object apply(Object value) { return function.apply((Integer) value); }

    public static Pattern inCaseOf(int pattern, Function<Integer, Object> function) {
        return new IntegerPattern(pattern, function);
    }
}

And to finish, here is a pattern that matches everything. It is the equivalent of default in switch-case.

public class OtherwisePattern implements Pattern {
    private final Function<Object, Object> function;

    public OtherwisePattern(Function<Object, Object> function) {
        this.function = function;
    }

    @Override
    public boolean matches(Object value) { return true; }

    @Override
    public Object apply(Object value) { return function.apply(value); }

    public static Pattern otherwise(Function<Object, Object> function) {
        return new OtherwisePattern(function);
    }
}

Now, you can express something like this:

PatternMatching pm = new PatternMatching(
    inCaseOf(Integer.class, x -> "Integer: " + x),
    inCaseOf(      "world", x -> "Hello " + x),
    inCaseOf(           42, _ -> "forty-two"),
    otherwise(              x -> "got this object: " + x.toString())
);

Factorial based on PatternMatching

Our pattern matching API in Java 8 can define the body of recursive functions:

public static int fact(int n) {
    return ((Integer) new PatternMatching(
            inCaseOf(0, _ -> 1),
            otherwise(  _ -> n * fact(n - 1))
    ).matchFor(n));
}

Here, we use the lambda expressions to improve the readability. In this case, Java 8 is necessary. But it is easy to use the implementations of PatternMatching and Pattern above with former versions of Java. Nevertheless, you have to sacrify the readability.

For example, here is the implementation of the same factorial function for versions 6 and 7 of Java:

public static int fact_old(final int n) {
    return ((Integer) new PatternMatching(
            inCaseOf(0, new Function<Integer, Object>() {
                @Override
                public Object apply(Integer _) {
                    return 1;
                }
            }),
            otherwise(new Function<Object, Object>() {
                @Override
                public Object apply(Object _) {
                    return n * fact(n - 1);
                }
            })
    ).matchFor(n));
}

Conclusion

In this post, we’ve seen a way to define a structure close to pattern matching. This structure helps us to extend the semantic of switch-case in Java. Besides, it’s extendable with new implementations based on Pattern. The use of the Java 8’s lambda expressions make it more readable.

Nevertheless, if a PatternMatching is defined in a method, it’ll be completely reinstantiated at each call of this method. But if performance is needed, you can store the pattern matching in a constant.

So, we are able to represent pretty much the same functionnality than switch-case. The usual patterns I haven’t presented can be easily implemented. In addition, we can find a way to do pattern matching based on regular expressions. What is missing here is the ability to aggregate a set of patterns to the same process (ie. a kind of OR relation between some patterns for the same process), the addition of guard condition and above all the ability to represent algebraic types and use them in our structure.

Advertisement

Written by fsarradin

2012/05/09 at 22:39

Playing with Scala’s pattern matching

How many times have you been stuck in your frustration because you were unable to use strings as entries in switch-case statements. Such an ability would be really useful for example to analyze the arguments of your application or to parse a file, or any content of a string. Meanwhile, you have to write a series of if-else-if statements (and this is annoying). Another solution is to use a hash map, where the keys are those strings and values are the associated reified processes, for example a Runnable or a Callable in Java (but this is not really natural, long to develop, and boring too).

If a switch-case statement that accepts strings as entries would be a revolution for you, the Scala’s pattern matching says that this is not enough! Indeed, there are other cases where a series of if-else-if statements would be generously transformed into a look-alike switch-case statement. For example, it would be really nice to simplify a series of instanceof and cast included in if-else-if to execute the good process according to the type of a parameter.

In this post, we see the power of the Scala’s pattern matching in different use cases.

Notice that the pattern matching is not something new. It is something that appears already in some other languages like OCaml, Haskell. A kind of pattern matching also exits in Prolog, but it uses a far different mechanism (unification in this case).

Traditional approach

There is an approach to the Scala’s pattern matching that looks similar to the switch-case structure in C and Java: each case entry use an integer or any scalar type. Here is an example:

def toYesOrNo(choice: Int): String = choice match {
    case 1 => "yes"
    case 0 => "no"
    case _ => "error"
  }

So if  you enter toYesOrNo(1), Scala says “yes”. And if you enter toYesOrNo(2), Scala says “error”. Notice that the symbol _ is used to formulate the default case.You also have to remark that there is no need to use the break statement.

But if you want that Scala says “yes” when you enter toYesOrNo(1), toYesOrNo(2), or toYesOrNo(3), you will write the function like this:

def toYesOrNo(choice: Int): String = choice match {
    case 1 | 2 | 3 => "yes"
    case 0 => "no"
    case _ => "error"
  }

Now, you can use a string for each case entry. Using strings is interesting when you want to parse options of your applications:

def parseArgument(arg: String) = arg match {
    case "-h" | "--help" => displayHelp
    case "-v" | "--version" => displayVerion
    case whatever => unknownArgument(whatever)
  }

So if you enter parseArgument(“-h”) or parseArgument(“–help”), Scala calls the displayHelp function. And if you enter parseArgument(“huh?”), Scala calls unknownArgument(“huh?”).

Notice that I have not used _ for the default case. Instead, I have put an identifier as its value is used in the associated instruction.

Typed pattern

Sometimes in Java, you have to deal with an instance of something that appears like an Object or any high level classes or interfaces. After checking for nullity, you have to use series of if-else statements and instanceof-cast to check the class or the interface that the instance extends or implements, before using it and processing the instance correctly. This the case when you have to override the equals() method in Java. Here is the way Scala sees the thing:

def f(x: Any): String = x match {
    case i:Int => "integer: " + i
    case _:Double => "a double"
    case s:String => "I want to say " + s
  }
  • f(1) → “integer: 1”
  • f(1.0) → “a double”
  • f(“hello”) → “I want to say hello”

Is not it better than a succession of if+instanceof+cast?

This kind of pattern matching is useful to walk through a structure using the composite design pattern. For example, you can use it to explore the DOM of an XML document or the object model of a JSON message.

Functional approach to pattern matching

A side effect of such a matching is that it provides an alternative way to design functions. For example, consider the factorial function. If you choose the recursive version, usually, you would define it like this:

def fact(n: Int): Int =
    if (n == 0) 1
    else n * fact(n - 1)

But you can use Scala’s pattern matching in this case:

def fact(n: Int): Int = n match {
    case 0 => 1
    case n => n * fact(n - 1)
  }

Notice the use of the variable n in this case. It is matched with any value that does not appears in the preceding cases. You have to understand that the n in the last case is not the same as the one used in the function signature. If you wanted to use the parameter n and not a new variable with the same name, you have to delimit the variable name with back-quote in the case definition, like this: `n`

Here is a simple way to build you factorial function in Scala:

def fact(n) = (1 to n).foldLeft(1) { _ * _ }

Pattern matching and collection: the look-alike approach

Pattern matching may be applied to the collections. Below is a function that computes the length of a list without pattern matching:

def length[A](list : List[A]) : Int = {
    if (list.isEmpty) 0
    else 1 + length(list.tail)
  }

Here is the same function with pattern matching:

def length[A](list : List[A]) : Int = list match {
    case _ :: tail => 1 + length(tail)
    case Nil => 0
  }

In this function, there are two cases. The last one checks if the list is empty with the Nil value. The first one checks if there is at least one element is the list. The notation _ :: tail should be understood “a list with whatever head followed by a tail”. Here the tail can be Nil (ie. empty list) or a non-empty list.

To go further, we will use this approach with tuples in order to improve our method parserArgument() above:

  def parseArgument(arg : String, value: Any) = (arg, value) match {
    case ("-l", lang) => setLanguageTo(lang)
    case ("-o" | "--optim", n : Int) if ((0 < n) && (n <= 5)) => setOptimizationLevelTo(n)
    case ("-o" | "--optim", badLevel) => badOptimizationLevel(badLevel)
    case ("-h" | "--help", null) => displayHelp()
    case bad => badArgument(bad)
  }

Notice first the use of the operator | that allows to match alternative forms of arg inside the tuple. Notice also the use of two patterns for options -o and --optim. These patterns are distinguished by the use of a guard condition. The guard conditions help you to get fine tuned pattern matching when pattern matching alone is not enough.

Advanced pattern matching: case class

Case classes are classes with part of their behavior predefined in order to make easier their construction and their use in a pattern. Case classes enables you to manipulate parameterized symbols for example, parsed by a compiler or used by your internal Domain Specific Language (DSL).

The example below shows how to use case classes and pattern matching in order to build simple mathematical expressions, evaluate them, and compute their derivative. First, lets define the symbol to represent expressions: the variable X, constants, addition, multiplication, and the negative operator (for the fun!). Here sealed means that there is no other children of the class Expression outside of the namespace.

  sealed abstract class Expression
  case class X() extends Expression
  case class Const(value : Int) extends Expression
  case class Add(left : Expression, right : Expression) extends Expression
  case class Mult(left : Expression, right : Expression) extends Expression
  case class Neg(expr : Expression) extends Expression

Now, lets define a function to evaluate expressions with a given value for the variable by using pattern matching.

  def eval(expression : Expression, xValue : Int) : Int = expression match {
    case X() => xValue
    case Const(cst) => cst
    case Add(left, right) => eval(left, xValue) + eval(right, xValue)
    case Mult(left, right) => eval(left, xValue) * eval(right, xValue)
    case Neg(expr) => - eval(expr, xValue)
  }

Lets try the eval() function:

val expr = Add(Const(1), Mult(Const(2), Mult(X(), X())))  // 1 + 2 * X*X
assert(eval(expr, 3) == 19)

Now, we define a function that compute the (unreduced) derivative of an expression:

  def deriv(expression : Expression) : Expression = expression match {
    case X() => Const(1)
    case Const(_) => Const(0)
    case Add(left, right) => Add(deriv(left), deriv(right))
    case Mult(left, right) => Add(Mult(deriv(left), right), Mult(left, deriv(right)))
    case Neg(expr) => Neg(deriv(expr))
  }

Lets try the deriv() function:

val df = deriv(expr)

Here is what you get in df:

Add(Const(0),Add(Mult(Const(0),Mult(X(),X())),Mult(Const(2),Add(Mult(Const(1),X()),Mult(X(),Const(1))))))
// = 0 + (0 * X*X + 2 * (1*X + X*1)) = 4 * X
assert(eval(df, 3), 12)

Other advanced pattern matching features

There are some special notations used in the Scala’s pattern matching. Scala allows to put aliases on patterns or on parts of a pattern. The alias is put before the part of the pattern, separated by @. For example, in the expression address @ Address(_, _, "Paris", "France"), we want any addresses in Paris/France, but we do not care about its content after the match is done. So after, we just use the alias address.

There is also a specific notation to match sequences with _*. It matches zero, one or more elements in a sequence to its end.

In Scala, pattern matching does not appears only after a the function match. You can put pattern matching in any closure and also in a catch block.

You can provide your own behavior for pattern matching. This is called extractor. To do so, you have to provide your own implementation of the unapply() method (and/or unapplySeq() for a sequence).

Conclusion

In this post, we have seen different ways to use the pattern matching with Scala. The main advantage of such a feature is to provide an easy way to build alternative structures based the matching of scalar values, strings, collections, but also types and parameterized symbols. For me, pattern matching is one of the sexiest alternatives to if statements! A good use of pattern matching can make your program really readable and helps you to build an internal DSL.

Written by fsarradin

2011/02/14 at 18:52

Posted in Programming

Tagged with , ,