I decided to write this article because I remember that when I started learning Scala, around 2013-2014, the problem of how to return early from a loop came up for me a few times. I had a collection that I wanted to go through, looking for the first entry that fulfilled some requirements, and then I wanted to jump out of the loop, taking that result with me. At the time I was working in Java on a time-management webapp, and situations like this happened to me pretty often: there were long collections of entries recording how much time a given person worked on a given project on a given day, and I ran complicated queries against them. “Find a day when the team worked more than X person-hours.” “Find an example of a bug which took more than Y days to fix.” And so on.

In Java, I usually followed the same pattern. I took a collection of original entries (let’s call them foos, of the type Foo), performed a sometimes quite expensive conversion to a derived entry (a bar, of the type Bar), then ran a sometimes equally complex validation of that bar, and if it fulfilled the requirements, I returned it. If I didn’t find anything, I returned null.

Bar complexConversion(Foo foo) {
  ...
}

boolean complexValidation(Bar bar) {
  ...
}

Bar findFirstValidBar(Collection<Foo> foos) {
  for (Foo foo : foos) {
    Bar bar = complexConversion(foo);
    if (complexValidation(bar)) return bar;
  }
  return null;
}

The imperative approach

Naturally, my first step in Scala as a Java developer was to simply translate the Java code to Scala almost one to one, word for word:

def complexConversion(foo: Foo): Bar = ...
def complexValidation(bar: Bar): Boolean = ...

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] = {
  for (foo <- seq) {
    val bar = complexConversion(foo)
    if (complexValidation(bar)) return Some(bar)
  }
  None
}

The only real difference here is that I avoid null and use an Option instead. For all our practical purposes here, an Option is a collection which can consist of either zero or one element. It’s either Some(bar), and then I can later extract that bar from it, or it is None.
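To make the idea of Option concrete, here is a minimal sketch that uses plain integers instead of Foo and Bar. The values and the helper describe are made up for illustration only.

```scala
// An Option[Int] holds either exactly one Int (Some) or nothing (None).
val found: Option[Int] = Some(42)
val missing: Option[Int] = None

// getOrElse extracts the value or falls back to a default.
val a = found.getOrElse(0)
val b = missing.getOrElse(0)

// Pattern matching is another way to get the value out.
def describe(opt: Option[Int]): String = opt match {
  case Some(n) => s"found $n"
  case None    => "nothing found"
}
```

Unlike null, the possible absence of a result is visible in the return type, so the caller has to deal with it explicitly.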

But the translated findFirstValidBar is not good Scala code, far from it. The keyword return, even though it exists in Scala, is strongly discouraged. In most cases, developers use return as the last statement in a block. In Scala, every block of code is an expression that evaluates to the result of the last expression in the block. So, instead of return x you can simply write x on the last line. Most often you don’t even do that: the fact that everything is an expression lets you reorder the whole block in ways that would look very clunky in Java, and end up with much more concise Scala code. That eliminates the vast majority of returns. As for the rest, mostly early returns from loops, well… Early returns are considered a code smell. They make a sequence of expressions less readable because the programmer can no longer rely on the last expression being the one that produces the result of the block. If possible, you should reorder the code in such a way that the return is no longer necessary.
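As a small illustration of the "everything is an expression" point, the sketch below writes the same hypothetical method twice, once in Java style and once in expression style. The method names are invented for this example.

```scala
// Java style: an explicit return in every branch.
def signWithReturn(n: Int): String = {
  if (n < 0) return "negative"
  if (n == 0) return "zero"
  return "positive"
}

// Expression style: if/else is itself an expression, so its value
// becomes the result of the method and no return is needed.
def sign(n: Int): String =
  if (n < 0) "negative"
  else if (n == 0) "zero"
  else "positive"
```

Both versions give the same results; the second one simply has a single expression whose value is the answer.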

But reordering is not so easy in this case. First I learned a few half-solutions to get around it, and then I finally learned the proper method. Then I forgot all about it: back then I had so much to learn at the same time that it just left my brain to make room for other new concepts. Only recently did I realize that the code I write from time to time is actually an FP equivalent of that old Java early return. Hence this article. The answer is…

Baby steps

… we’ll get to it step by step. Or you can just skip the next few paragraphs if you want. But I’d like to invite you to just go with me through a few intermediate versions of the method findFirstValidBar, so that in the end you will have a better understanding of why the final solution looks like it does and what more you can do with it.

First, let’s take the path of least resistance: every standard collection in Scala defines the methods find and map. find returns the first element of the collection satisfying a predicate, wrapped in an Option; it returns None if no element fulfills the requirements. map converts a collection of elements of one type into a collection of elements of another type. And since Option is a collection too, we can call map on the result of find and so turn the whole findFirstValidBar method into this:

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] = 
  seq.find(foo => complexValidation(complexConversion(foo)))
     .map(complexConversion)

For every foo in seq, we convert it to a bar, then we validate it, and if it passes the validation, we stop iterating over seq and return… well, find returns the original foo, not the derived bar, so we need to take that foo and convert it again before we can return it. If the conversion is cheap we may ignore this drawback. After all, it’s only one additional conversion, or zero if we didn’t find any valid element. But in general, I don’t like to settle for a sub-optimal solution, especially if a better one is not that much more complex.
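The extra conversion can be made visible with a counter. The sketch below is only an illustration: the fields of Foo and Bar, the conversion rule and the validation threshold are all hypothetical stand-ins for the real, expensive logic.

```scala
// Hypothetical stand-ins for the article's Foo and Bar.
case class Foo(hours: Int)
case class Bar(personHours: Int)

// Counts how often the "expensive" conversion runs.
var conversions = 0

def complexConversion(foo: Foo): Bar = {
  conversions += 1
  Bar(foo.hours * 3) // pretend three people worked those hours
}

def complexValidation(bar: Bar): Boolean = bar.personHours > 20

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] =
  seq.find(foo => complexValidation(complexConversion(foo)))
     .map(complexConversion)

val result = findFirstValidBar(Seq(Foo(2), Foo(5), Foo(8), Foo(9)))
// result is Some(Bar(24)) and conversions is 4:
// three calls inside find (for 2, 5 and 8) plus one more in map.
```

Foo(9) is never converted, so the early exit works; the only waste is the repeated conversion of the element that was found.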

In the Scala standard collection library there is a method called collect which merges the functionality of filter and map. Filtering a collection and then mapping the results to a collection of something else is so common that it pays to have a shorter way to write it. Similarly, there is a method collectFirst which merges find and map. After all, find is just a filter which stops after finding the first valid element. So we can modify the version from above and write it as:

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] =   
  seq.collectFirst {
    case foo if complexValidation(complexConversion(foo)) => complexConversion(foo) 
  }

The curly braces and a line starting with case indicate that collectFirst takes one argument: a partial function. The function will produce a result (complexConversion(foo)) only if the condition (complexValidation(complexConversion(foo))) is fulfilled. If the result is produced, collectFirst will stop iterating over seq and return it. Otherwise, it will move on to the next foo.
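Here is a minimal sketch of how collect and collectFirst relate to filter, map and find, using a made-up sequence of integers rather than foos:

```scala
val numbers = Seq(1, 2, 3, 4, 5, 6)

// filter followed by map...
val viaFilterMap = numbers.filter(_ % 2 == 0).map(_ * 10)

// ...gives the same result as a single collect with a partial function.
val viaCollect = numbers.collect { case n if n % 2 == 0 => n * 10 }

// collectFirst stops at the first element the partial function accepts.
val firstEven = numbers.collectFirst { case n if n % 2 == 0 => n * 10 }

// No element matches here, so the result is None.
val noMatch = numbers.collectFirst { case n if n > 100 => n }
```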

unapply

By itself, collectFirst still doesn’t solve our problem. To check if the condition is fulfilled, we need to perform both a conversion and a validation, and then, if the validation is successful, we need to convert the original foo once more, just as in the version with find and map. But this time we get a hint. Partial functions in Scala take advantage of pattern matching. Just as in match/case, a case in a partial function may be used to deconstruct the element foo into something else, and that deconstruction mechanism, implemented by the unapply method, can be used for both conversion and validation.

But wait, isn’t unapply used simply to extract field values from case classes?

Nope.

object ValidBar {
  def unapply(foo: Foo): Option[Bar] = {
    val bar = complexConversion(foo)
    if (complexValidation(bar)) Some(bar) else None
  }
}

In virtually any place in the code, we can create a new object, call it something meaningful (note that objects do not always have to be companion objects to classes), and implement inside it an unapply method which takes an instance of one type and returns an Option of another. In our case, it takes a foo, converts it, validates it, and returns Some(bar) if the validation is successful, or None otherwise. Afterward, we can use the unapply method in pattern matching everywhere, just like this:

foo match {
  case ValidBar(bar) => // do something with our valid bar
  case _ => // ...
}

So let’s use it in collectFirst:

object ValidBar {
  def unapply(foo: Foo): Option[Bar] = {
    val bar = complexConversion(foo)
    if (complexValidation(bar)) Some(bar) else None
  }
}

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] =   
  seq.collectFirst {
    case ValidBar(bar) => bar 
  }

Victory! Now we have only one conversion and one validation per element, and we finish iterating over seq when we find the first valid element. On top of that, we can also use it in collect: a simple seq.collect { case ValidBar(bar) => bar } will give us all bars that pass the validation.
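To check the "only one conversion" claim, the sketch below counts conversions while running collectFirst and collect through ValidBar. As before, the fields of Foo and Bar, the conversion rule and the threshold are hypothetical and chosen only to keep the example small.

```scala
// Hypothetical stand-ins for the article's Foo and Bar.
case class Foo(hours: Int)
case class Bar(personHours: Int)

// Counts how often the "expensive" conversion runs.
var conversions = 0

def complexConversion(foo: Foo): Bar = {
  conversions += 1
  Bar(foo.hours * 3)
}

def complexValidation(bar: Bar): Boolean = bar.personHours > 20

object ValidBar {
  def unapply(foo: Foo): Option[Bar] = {
    val bar = complexConversion(foo)
    if (complexValidation(bar)) Some(bar) else None
  }
}

val foos = Seq(Foo(2), Foo(5), Foo(8), Foo(9))

// Stops at Foo(8): three conversions, none of them repeated.
val first = foos.collectFirst { case ValidBar(bar) => bar }
val conversionsForFirst = conversions

// Visits every element once and keeps all valid bars.
val all = foos.collect { case ValidBar(bar) => bar }
```

Here first is Some(Bar(24)) after three conversions, and all contains Bar(24) and Bar(27).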

What we may not like about this solution, however, is its verbosity. It’s no longer a one-liner. Actually, it’s even longer than the original imperative version of findFirstValidBar. But that’s only because this solution requires a certain overhead at the start: we need an object and an unapply method even for the simplest applications. But as the application — that is, conversions and validations — becomes more complex, the overhead becomes less significant. We only need to write the unapply once, and then we can use it in collectFirst everywhere. In fact, there is a case where unapply may even let us save a few lines.

Imagine that the conversion from Foo to Bar may not always be possible. Instead of def complexConversion(foo: Foo): Bar we need to have def safeComplexConversion(foo: Foo): Option[Bar] where we return Some(bar) if the conversion was successful, and None otherwise. Now our imperative version of findFirstValidBar needs to look something like this:

def safeComplexConversion(foo: Foo): Option[Bar] = ...
def complexValidation(bar: Bar): Boolean = ...

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] = {  
  for (foo <- seq) 
    safeComplexConversion(foo) match { 
      case Some(bar) if complexValidation(bar) => return Some(bar)    
      case _ => 
    } 
  None
}

It’s more complex and it’s ugly. The return keyword is now nested not only in a for loop, but also in a match/case. Here we again encounter the reason why return should not be used. In Scala code, many levels of nesting are quite common, because features like lambdas, anonymous classes and pattern matching let us make the code more concise. But a return keyword buried in such code makes it much harder to read.

Safety from impossible conversions

Meanwhile, since our unapply method already deals with options, switching to safeComplexConversion makes the unapply-based version of findFirstValidBar even simpler than before:

def safeComplexConversion(foo: Foo): Option[Bar] = ...
def complexValidation(bar: Bar): Boolean = ...

object ValidBar {
  def unapply(foo: Foo): Option[Bar] =     
    safeComplexConversion(foo).find(complexValidation)
}

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] =   
  seq.collectFirst {
    case ValidBar(bar) => bar 
  }
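The one-line unapply works because Option has its own find method, which keeps the value only if it satisfies the predicate. A minimal sketch with made-up numbers:

```scala
// The value is present and passes the check, so it is kept.
val kept = Some(24).find(_ > 20)

// The value is present but fails the check, so the result is None.
val rejected = Some(15).find(_ > 20)

// There was no value to begin with, so the result is None as well.
val absent = Option.empty[Int].find(_ > 20)
```

So a failed conversion and a failed validation both end up as None, which is exactly what unapply needs to signal "no match".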

Besides, the conversion and validation methods also need to go somewhere — if they are always used together, now they can be put next to unapply, inside the object. It means that if Foo needs to be deconstructed in several ways in your code — it’s quite common that from one original data structure we can derive more than one secondary type — you will only need to care about meaningful object names for those derivations:

object ValidBar {
  def convert(foo: Foo): Option[Bar] = ...
  def validate(bar: Bar): Boolean = ...
  def unapply(foo: Foo): Option[Bar] = convert(foo).find(validate)
}
// use as case ValidBar(bar) => ...

object ValidBaz {
  def convert(foo: Foo): Option[Baz] = ...
  def validate(baz: Baz): Boolean = ...
  def unapply(foo: Foo): Option[Baz] = convert(foo).find(validate)
}
// use as case ValidBaz(baz) => ...

object BarValidInADifferentWay {
  def convert(foo: Foo): Option[Bar] = ...
  def validate(bar: Bar): Boolean = ...
  def unapply(foo: Foo): Option[Bar] = convert(foo).find(validate)
}
// use as case BarValidInADifferentWay(bar) => ...

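This allows for significant code reuse. Not only can we now use those unapply methods in collectFirst and collect, but also in match/case and in virtually any method from the Scala collections library that accepts a function written as a block of case clauses: map, flatMap, foreach, whichever you need. Keep in mind that, unlike collect and collectFirst, those methods call the function for every element, so the cases should cover all inputs (for example with a final case _), or a MatchError will be thrown. We can also combine several extractors in one partial function if, let’s say, we look for a foo that can be converted and validated in either one way or another: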

seq.collectFirst {
  case ValidBar(bar) => bar
  case BarValidInADifferentWay(bar) => bar
} // stops at the first element valid in any of those ways
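Below is a runnable sketch of combining two extractors. Everything concrete in it is an assumption made for illustration: the fields of Foo and Bar, and both sets of conversion and validation rules.

```scala
// Hypothetical data model: hours worked by a number of people.
case class Foo(hours: Int, people: Int)
case class Bar(personHours: Int)

// Valid when the total person-hours exceed 20.
object ValidBar {
  def convert(foo: Foo): Option[Bar] =
    if (foo.people > 0) Some(Bar(foo.hours * foo.people)) else None
  def validate(bar: Bar): Boolean = bar.personHours > 20
  def unapply(foo: Foo): Option[Bar] = convert(foo).find(validate)
}

// Valid when a single person worked at least 8 hours.
object BarValidInADifferentWay {
  def convert(foo: Foo): Option[Bar] =
    if (foo.people == 1) Some(Bar(foo.hours)) else None
  def validate(bar: Bar): Boolean = bar.personHours >= 8
  def unapply(foo: Foo): Option[Bar] = convert(foo).find(validate)
}

val foos = Seq(Foo(3, 2), Foo(9, 1), Foo(6, 5))

// Stops at Foo(9, 1), which is valid in the second way.
val first = foos.collectFirst {
  case ValidBar(bar)                => bar
  case BarValidInADifferentWay(bar) => bar
}

// With map the function is called for every element,
// so a fallback case is needed to avoid a MatchError.
val labels = foos.map {
  case ValidBar(bar)                => s"valid: ${bar.personHours}"
  case BarValidInADifferentWay(bar) => s"valid differently: ${bar.personHours}"
  case _                            => "invalid"
}
```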

A common trait for all deconstructions

Once you get used to this pattern, you may notice that unapply is always the same, and extract the common logic into a trait:

// one such trait for the whole codebase
trait Deconstruct[From, To] {
  def convert(from: From): Option[To]
  def validate(to: To): Boolean
  def unapply(from: From): Option[To] = 
    convert(from).find(validate)
}

// one for each implementation of convert and validate
object ValidBar extends Deconstruct[Foo, Bar] {
  override def convert(foo: Foo): Option[Bar] = ...
  override def validate(bar: Bar): Boolean = ...
}

// for each use
def findFirstValidBar(seq: Seq[Foo]): Option[Bar] =   
  seq.collectFirst {
    case ValidBar(bar) => bar 
  }
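To show that the trait is not tied to Foo and Bar, here is a sketch of a second, hypothetical instance that deconstructs a String into a positive Int. The object PositiveInt and its rules are invented for this example; the trait is repeated so that the snippet is self-contained.

```scala
import scala.util.Try

trait Deconstruct[From, To] {
  def convert(from: From): Option[To]
  def validate(to: To): Boolean
  def unapply(from: From): Option[To] =
    convert(from).find(validate)
}

// Hypothetical instance: parse a String and accept only positive numbers.
object PositiveInt extends Deconstruct[String, Int] {
  // The conversion fails (None) when the text is not a number.
  override def convert(s: String): Option[Int] = Try(s.trim.toInt).toOption
  override def validate(n: Int): Boolean = n > 0
}

val inputs = Seq("abc", "-5", " 7 ", "12")

// "abc" fails the conversion, "-5" fails the validation, " 7 " passes.
val firstPositive = inputs.collectFirst { case PositiveInt(n) => n }
val allPositive = inputs.collect { case PositiveInt(n) => n }
```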

Lazy collections

After I published the initial version of this blog post, it was brought to my attention that there is at least one other way of achieving the same effect: lazy collections (thank you, Balmung-san!). A lazy collection is a collection that, instead of having its elements already computed and ready to access, computes a given element only when it is needed. Since we go through the collection only until we find the first element fulfilling certain conditions, and we are not interested in any elements after it, the collection won’t run the conversion and validation methods for them.

One such lazy collection in Scala is Iterator. We can create an iterator from the original Seq[Foo] simply by writing seq.iterator. The whole code needed to find a valid bar looks like this:

def findFirstValidBar(seq: Seq[Foo]): Option[Bar] =
  seq.iterator
     .map(safeComplexConversion)
     .find(_.exists(complexValidation))
     .flatten

There is, however, some overhead involved. It’s unsafe to keep the iterator around, because an iterator can be traversed only once, so we should create it from the original seq every time. Code reusability is also reduced: if we have more than one way to convert and validate an element, we will need to write more code than with unapply + collectFirst. But overall this is a good example of how in Scala everything can be done in more than one way, depending on what trade-offs we are willing to accept.
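The effect of laziness can be observed by counting conversions. In the sketch below, CountingConverter is a hypothetical helper that exists only to count calls, and the conversion and validation rules are made up; the same pipeline is run once on an iterator and once directly on the Seq.

```scala
// Hypothetical stand-ins for the article's Foo and Bar.
case class Foo(hours: Int)
case class Bar(personHours: Int)

// Hypothetical helper: wraps the conversion and counts how often it runs.
final class CountingConverter {
  var calls = 0
  def safeComplexConversion(foo: Foo): Option[Bar] = {
    calls += 1
    if (foo.hours >= 0) Some(Bar(foo.hours * 3)) else None
  }
}

def complexValidation(bar: Bar): Boolean = bar.personHours > 20

val foos = Seq(Foo(-1), Foo(5), Foo(8), Foo(9))

// Lazy: the iterator converts elements one by one and stops at the first hit.
val lazyConverter = new CountingConverter
val lazyResult = foos.iterator
  .map(lazyConverter.safeComplexConversion)
  .find(_.exists(complexValidation))
  .flatten

// Strict: mapping the Seq directly converts every element before find runs.
val strictConverter = new CountingConverter
val strictResult = foos
  .map(strictConverter.safeComplexConversion)
  .find(_.exists(complexValidation))
  .flatten
```

Both pipelines return Some(Bar(24)), but the lazy one performs three conversions while the strict one performs four, one for each element of foos.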

And that’s it. Thanks for reading. I hope all this will be useful to you.

The cover photo made by Lucian from Flickr, Creative Commons, some rights reserved.
