Make your Google Spreadsheets Scala-ble
Google Spreadsheets (and softwares alike such as Excel) are a great way to visualise and organise (medium-sized) data. Google spreadsheets are also easily connectible to a whole lot of Google services such as Google analytics. Hence, they are the perfect entry point for your data. They even offer powerful tools for manipulating the data via a fully featured API.
However, the problem is that with this power comes a great responsibility to keep the code inside the cells readable and easy to maintain. Sadly, this task quickly becomes impossible to accomplish, and you soon break any of the “clean code good practices” out there (such as DRY, KISS…). Keeping a clear versioning also is a challenge.
Fortunately, Google allows us to write custom functions that you can then use as any other built in function inside your spreadsheet. This is already a huge leap forward, but you have to write them in JavaScript, which is arguably not the most scalable and easy to maintain language. Wouldn’t it be great, however, if we could write these function in a language that is, aka Scala?
Luckily for us, Scala code can be compiled to JavaScript, and allows to expose functions for JavaScript to use them. That is all we need to achieve what we want!
In the following, we’re going to describe how to use the project available here.
TLDR
In order to use Scala for making custom functions, follow these steps:
- clone the repo.
- write and export your functions to the global scope
- (install and) launch sbt
- execute the `fastCompileCreateFunctions` task
- copy-paste the compiled `google.js` file (in `target/scala-2.13/google.js`) into Google Spreadsheets’ Script Editor
How it works
Scala-js compiles a Scala project into a big JavaScript file that we can then copy-paste into the “Script Editor” associated to the Spreadsheet. But there is a catch. Scala-js puts the exported functions into the global scope, while Google wants custom functions to be exactly defined with the syntax
function foo(args) { … }
This is a minor problem, as it is easy to write a small script that parses the Scala source files and adds all the relevant definitions at the end of the compiled JavaScript file. This is exactly what is done in there. It’s not pretty, but now, inside sbt (one of the Scala build tools), you can simply type in
fastCompileCreateFunctions
and you’ll get a nice `google.js` file to be copy-pasted in the Script Editor. What’s more, Google Spreadsheets offer auto-completion and a nice doc display if you provide documentary comments. The comments in your Scala code will also be copy-pasted, so you don’t have anything to do.
If you don’t know Scala-js, it’s not a big deal. Just write usual Scala code (without relying on Java libraries) and you’ll be fine. If you don’t know Scala, it’s also not a big deal, it’s easy to learn.
How to make it work
According to Google custom function documentation, you have four types of data from the cells, namely `String`, `Double`, `Boolean` or `js.Date` (see here for JavaScript Date class). Cell contents either enter your functions as is (if you only feed in one cell value), or as arrays of arrays of data, where indices are first along the rows, then along the columns.
That means that when creating a function to be exported, you need to take that as argument type. A function that computes the double of the content of the Cell would be written
def times2(x: Double): Double = 2 * x
and a function that computes the double of each cell in the range would be written
import scala.scalajs.jsdef times2(xs: js.Array[js.Array[Double]]): js.Array[js.Array[Double]] = xs.map(_.map(2 * _))
In order to have access to it from the spreadsheet, we simply need to “annotate” it with the `JSExportTopLevel` annotation. For example, like so
@JSExportTopLevel(“TIMESTWO”)def times2(x: Double): Double = 2 * x
The `Cell` class and the `Data` type
Since you can receive different types from the cell contents, the project contains a type `Data` defined as follows:
type Data = String | Double | Boolean | js.Date
When defining functions you can always use `Data` (or the range counterpart `js.Array[js.Array[Data]]`) as arguments to the function.
Then, the project also contains a class `Cell`, which is simply a wrapper around the `Data` type and has facility methods to actually check and cast to the concrete type that it contains.
Also, as is it more natural in Scala to work with `Vector[Vector[Cell]]` instead of JavaScript arrays of `Data`, the `Cell` companion object contains two implicit classes, one for going from JS arrays to Scala vector, with an implicit method called `asScala`, and another one to go back to JS arrays, with an implicit method called `toGoogleCells`. A typical function would then look like
In the Google sheets, this would then be used in a cell as, e.g.,
= FOO(A1:B7)
Once you developed your functions, you can use the `fastCompileCreateFunctions` sbt task, and copy paste the contents of the `google.js` file located at `target/scala-2.13/google.js`.
The CustomFunction abstraction
The way of doing in the previous paragraph is rather cumbersome, and not incredibly type-safe. Indeed, in order to manipulate the data, we need to manually takes the matrices in arguments, massage them, and form a returned matrix to go back to the spreadsheet. Not to mention that we should always overload the method to check whether we actually received a matrix (multiple cells) or a scalar (when only one cell is passed in argument).
For these reasons, the project offers an abstraction that can help reduce the boilerplate, and increase the type-safety.
The idea behind it is quite easy. Let’s expose it in the case of a single argument function. What we have to give to Google is a function from a matrix of `Data` (or a `Data`) to a matrix of `Data`, while what we want to do is write a function from a type `T` to a type `U`. For that, we just need to have a way to transform a matrix of data into a `T`, and a way to transform a `U` into a matrix of data, with a pipeline that looks like
Matrix of data -> encode into a T -> apply function to get a U -> Matrix of data
With the CustomFunction abstraction, the first and the last part of the pipeline are taken care of implicitly with, respectively, an `Encoder[T]` and a `Decoder[U]`. That way, if you define an implicit encoder in the companion object of `T`, and an implicit decoder in the companion object of `U`, you can define functions with type `T => U` and use them as custom functions.
Here is an example of counting the number of occurrences of “foo” in the given cells.
Above, we have `Input` as a type alias for `Data | js.Array[js.Array[Data]]` and `Output` an alias for `js.Array[js.Array[Data]]`. The `asCustomFunction` implicit method comes from the `FromFunction1` implicit class. In here, we don’t have to create encoders and decoders respectively for `Vector[Vector[String]]` and `Int` because there are already defined.
Another good thing is that an `Encoder[T]` actually asks you to encode an `Input` into a `Try[T]`. If there is a failure while encoding, then the function will return a single cell with the error message that occurred when encoding, or later in the execution.
For other examples, please have a look in here.
Always use range based functions
Scala.js now gave us even more powers than the build in functions of Google spreadsheets. They however come with a cost.
From the Google documentation, we can learn that each time a custom function is called, the editor script content is queried. Since the number of such calls is limited on a given window of time, it is better to make functions that operate on big cell ranges.
Typically, it is best and not unreasonable to generate an entire page as output, and calling the custom function in position `A1` in the sheet.
A concrete example
A typical use of Google spreadsheets are in conjunction with Google analytics, as you can easily pull the results into your spreadsheets.
Let’s say that you collected the data of visits on a website for each month and each device (mobile, desktop and tablet). For each month, you gather the information about the number of “sessions” and the “bounce rate”. The bounce rate is the proportion of people that land into your page and leave without doing anything on it. (You typically want to keep it low.) In order to better understand the trend of the data, we would like a view that gives us the current month values, the previous month values and the “year to date” values (“year to date” — ytd for short — is the time window between the first of January and the current month). The “current month” will sit in a “configuration cell” and corresponds to the month for which we desire our GA report.
When you pull your data into Google spreadsheet, you get a table with four columns: “Month” (let’s say as a `js.Date` of the first of the month), “Device”, “Sessions” and “Bounce rate”. We can then model our data with a case class of the form
final case class MonthData(date: js.Date, device: String, sessions: Int, bounceRate: Double)
Of course, we will need an encoder for that. This is done as follows
Now we define the output type and its decoder. The final output in the sheets will be a table summarising the aggregated session and bounce rate data for each device. We thus define a class that contains all the data, and a decoder for that class which formats the data into a table. The result of this is:
And finally, we define the function that is going to take a `Vector[MonthData]` as input, and return a `GASessionsBounceRateReport` as output. This function is wrapped into a `CustomFunction2` and exported to JavaScript. The implementation details can be found here. It basically makes use of the shiny new `groupMapReduce` method of the 2.13 collections.
And here are the results in action:
The formula in `G4` is simply `=GASESSIONSBOUNCERATEREPORT(A1:D, H1)`.
Google services
Many Google services can be used within Google spreadsheet. That is part of what makes them great. In custom functions, these services are available through global objects which have utility functions.
You can of course use these functions from Scala. This is out of the scope of this blog post, but you can have a look here and the imports thereof for a small example.
Conclusion
Making custom functions for Google Spreadsheets in Scala is very easy, yet it allows to have a powerful and scalable tool for your most complex spreadsheets. In the same fashion, Scala can be used to make complicated and premium quality Spreadsheets Add-ons.
Thanks to its comprehensive standard collection Library, Scala is an excellent language for manipulating data, especially the kind that you have in spreadsheets.
Don’t hesitate to give it a try!