Setting up Scala for Spark App Development

by Ian Hellström (4 January 2016)

Apache Spark is a popular framework for distributed computing, both inside and outside the Hadoop ecosystem. Spark offers interactive shells for Scala as well as Python. Applications can be written in any language for which there is an API: Scala, Python, Java, or R. Since it can be daunting to set up your environment to begin developing applications, I have created a presentation that gets you up and running with Spark, Scala, sbt, and ScalaTest in (almost) no time.

You can access the presentation via Bitbucket. You can navigate between the slides with the arrow keys on your keyboard, by clicking with your mouse, or by swiping with your finger. There is plenty of additional information available to the ‘presenter’. You can enter presenter mode by hitting ‘p’ on the keyboard.


The presentation is built on top of remark.js and styled with an adaptation of the CSS template included in the overview presentation of the project. The style sheet (remarkjs.less) is available on Bitbucket. Thanks to Less you can easily modify it to suit your needs (e.g. use different colours). As always, attribution to the source (i.e. me) would be greatly appreciated, although I cannot stop you from putting your name on it and claiming it as your own work.

Remark.js allows you to create HTML presentations by means of Markdown. I have seen and made too many bad PowerPoint presentations, especially ones with code in them. Formatting is usually a pain, which is why I like the proposition of Markdown with a touch of CSS: separation of content and formatting, where the content is still very readable. Sure, LaTeX and its beamer class are acceptable too, but even as a long-time LaTeX user I’m still not sold on its user friendliness. There is also reveal.js, which looks amazing. However, I prefer writing Markdown to writing HTML.

One caveat with remark.js: when you use ??? to separate a slide from its presenter notes, make sure there are no trailing spaces after the question marks, as that tends to mess up the formatting of the slides that follow.
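If you want to catch this before opening the presentation in a browser, a small check along the following lines might help. It is only a sketch: the names TrailingSpaceCheck and badSeparatorLines are made up for this example and are not part of remark.js, and it assumes that the separator sits on a line of its own.

```scala
object TrailingSpaceCheck {
  // A notes separator (???) followed by one or more spaces or tabs.
  private val BadSeparator = """^\?\?\?[ \t]+$""".r

  /** Returns the 1-based numbers of the lines where ??? has trailing whitespace. */
  def badSeparatorLines(markdown: String): Seq[Int] =
    markdown
      .split("\n", -1)   // keep empty lines so the numbering stays correct
      .toSeq
      .zipWithIndex
      .collect {
        // stripSuffix handles Windows line endings (\r\n)
        case (line, idx) if BadSeparator.findFirstIn(line.stripSuffix("\r")).isDefined =>
          idx + 1
      }
}
```

You could paste the object into the Scala REPL and call TrailingSpaceCheck.badSeparatorLines on the contents of your Markdown file, for instance read with scala.io.Source.fromFile(...).mkString. An empty result means that none of the separators has trailing whitespace.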

That’s all, folks! I hope I could help you get your stuff up and running, so you can start building Spark applications.