Sequence Comprehensions describes Scala’s for “loop”
You can read the wikipedia entry on list comprehensions if you like, but what we’re talking about here is the many ways in which Scala allows you to traverse and process a list.
We’ve seen some of this in ScalaFunctions and XmlLiterals , but these deal with more specific means of traversing a list (e.g. to
find a specific element or
map elements to other elements). These are all specialized versions of the so-called “for-comprehension”, which is a fancy name for a
Suppose we have a list of U.S. States that have been tagged with their general location in the U.S. (e.g. “east” vs. “midwest” vs. “south”). Now suppose we wish to get a list of all the states that aren’t on the east, except for Washington, DC, which, for some reason, we don’t consider east coast
state <- states is a generator, which assigns a value from
states to the value
state in succession. Each time this happens, the “guard condition” that starts with
if is evaluated. If it evaluates to true (or if there is no guard condition), we “mark” this item for iteration. Once the collection has been processed, we iterate over each “marked” item executing the body of the for-comprehension (this is the part that comes after the final paren). In this case, we use the
yield keyword, which essentially means “yield this value back to the list we are creating”. Yes, we are creating a list, here. The end result of this is a list of states that matched our guard condition.
Note the subtle difference here; in Java we would iterate over the entire collection; in Scala we are building up a collection over which to iterate via the guard condition and then executing the body of the for-comprehension. This means that, in general, conditions inside the for-comprehension body cannot affect conditions in the guard condition:
var done = false for (state <- states if !done) if state.code == "DC" done = true
When we hit DC, we will not stop; this is because the guard condition has already been evaluated for every item in the list before we got to the body. Yikes (See this mind-blowing article on the crazy subtleties of the
If we didn’t want to actually yield a list, we can omit the
yield keyword alltogether:
for( state <- states if state.location != 'east || state.code == "DC") println(state)
Instead of nesting
for loops, we can simply add expressions to the comprehension. Suppose we wish to find pairs of states that could be “sister” states; states that aren’t in the same geographic area.
val sisters = for( state1 <- states; state2 <- states if state1.location != state2.location) yield (state1,state2)
Here, we loop through each pair of states, yield a tuple of states that don’t have the same
In reality, the
for comprehensions are translated by the compiler into calls to
flatMap. Section 10.3 of Scala By Example details this.
My Thoughts on this Feature
I thought I understood this feature until I read this article . Now, I’m a bit lost, mostly as to why it works this way. Perhaps the name
for is misleading; in most programming languages,
for means “do something for everything in the list”. I guess the definition of the “list” is what’s unclear. I find it hard to know what a particular
for statement in Scala will actually do. That doesn’t seem good.
I suppose once I’ve internalized how it works, it will seem more obvious.