Loading the file with Source

Our implementation must drop the first line, which contains the headers, and then, for each remaining line, split it into fields and create a new instance of EquityData:

package retcalc

import scala.io.Source

case class EquityData(monthId: String, value: Double, annualDividend: Double) {
  val monthlyDividend: Double = annualDividend / 12
}

object EquityData {
  def fromResource(resource: String): Vector[EquityData] =
    Source.fromResource(resource).getLines().drop(1).map { line =>
      val fields = line.split("\t")
      EquityData(
        monthId = fields(0),
        value = fields(1).toDouble,
        annualDividend = fields(2).toDouble)
    }.toVector
}

This code is quite compact, and you might lose a sense of what types are returned by intermediate calls. In IntelliJ, you can select a portion of code and hit Alt + = to show the inferred type of the expression.

We first load the .tsv file using scala.io.Source.fromResource. This takes the name of a file located in a resource folder and returns a Source object. The file can be in src/test/resources or src/main/resources. When you run a test, both folders are searched. When you run the production code, only the files in src/main/resources are accessible.

getLines returns Iterator[String]. An iterator is a mutable data structure that allows you to iterate over a sequence of elements. It provides many functions that are common to other collections. Here, we drop the first line, which contains the header, and transform each remaining line using an anonymous function passed to map.

The anonymous function takes a line of type String, transforms it into Array[String] using split, and instantiates a new EquityData object.
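To make the intermediate types visible, the same pipeline can be spelled out with one explicitly typed value per step. This is only a sketch: PipelineSketch, parse, and sampleLines are hypothetical names, and the sample rows are made-up data standing in for the contents of the .tsv file, so that the example runs without a resource on the classpath.

```scala
// Same case class as in the listing above, repeated so the sketch is self-contained.
case class EquityData(monthId: String, value: Double, annualDividend: Double) {
  val monthlyDividend: Double = annualDividend / 12
}

object PipelineSketch {
  // Hypothetical sample data: one header line followed by two tab-separated rows.
  val sampleLines: List[String] = List(
    "month\tSP500\tdividend",
    "2117.01\t100.0\t10.0",
    "2117.02\t101.0\t12.0")

  def parse(rawLines: Iterator[String]): Vector[EquityData] = {
    // Skip the header line; the result is still a lazy iterator.
    val withoutHeader: Iterator[String] = rawLines.drop(1)
    // Split each line on tabs and build one EquityData per line.
    val parsed: Iterator[EquityData] = withoutHeader.map { line =>
      val fields: Array[String] = line.split("\t")
      EquityData(
        monthId = fields(0),
        value = fields(1).toDouble,
        annualDividend = fields(2).toDouble)
    }
    // Materialize the iterator into an immutable collection.
    parsed.toVector
  }
}
```

Calling PipelineSketch.parse(PipelineSketch.sampleLines.iterator) should yield a Vector with two EquityData elements, one per data row.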

Finally, we convert the resulting Iterator[EquityData] into Vector[EquityData] using .toVector. This step is very important: we convert the mutable, unsafe iterator into an immutable, safe Vector. Public functions should, in general, not accept or return mutable data structures:

  • Mutable structures make the code harder to reason about, as you have to remember the state the structure is in.
  • The program will behave differently depending on the order and repetition of the function calls. An iterator, for instance, can be traversed only once. If you try to iterate again, you won't get any data:
scala> val iterator = (1 to 3).iterator
iterator: Iterator[Int] = non-empty iterator

scala> iterator foreach println
1
2
3

scala> iterator foreach println

scala>
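By contrast, a Vector can be traversed as many times as needed. The following minimal sketch (TraversalSketch and its value names are hypothetical) puts the two behaviors side by side. As in the session above, it reuses an exhausted iterator purely for demonstration, which is something to avoid in real code.

```scala
object TraversalSketch {
  // An iterator is consumed by its first full traversal.
  val iterator: Iterator[Int] = (1 to 3).iterator
  val firstPass: List[Int] = iterator.toList   // all three elements
  val secondPass: List[Int] = iterator.toList  // nothing left to read

  // A Vector is immutable, so every traversal sees the same elements.
  val vector: Vector[Int] = (1 to 3).toVector
  val firstSum: Int = vector.sum
  val secondSum: Int = vector.sum
}
```

Here firstPass holds the three elements and secondPass is empty, while firstSum and secondSum are equal.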