Scala and the JVM as a Big Data Platform: Lessons from Apache Spark

Here is another recording from last year’s Reactive Summit in Austin: Dean Wampler’s talk, “Scala and the JVM as a Big Data Platform: Lessons from Apache Spark”.

Dean is the Big Data Architect for Lightbend, where he leads the projects that build products and services centered around Kafka, Spark, Flink, Mesos, and Akka. He is the author of “Programming Scala, Second Edition”, the co-author of “Programming Hive”, and the author of “Functional Programming for Java Developers”, all from O’Reilly. Dean is a contributor to several open source projects and the co-organizer of several technology conferences and Chicago-based user groups.

In this talk, he explains that the success of Apache Spark is bringing developers to Scala. For Big Data, the JVM uses memory inefficiently, which causes significant garbage collection (GC) challenges. Spark’s “Tungsten” project is fixing these problems with custom data layouts and code generation.
You’ll see what we’ve learned from Spark, which improvements are ongoing, and what we should do to improve Scala and the JVM for Big Data.


If you want more information about Big Data, you can check out the following resources: