Scio Deep Dive - Building a Scala API from Scratch

We will re-implement Scio* from scratch, step by step, in 15 self-contained, single-file mini frameworks. Along the way, we will learn about common patterns in Scala API design, distributed data processing frameworks, Scala-Java interop, and some under-the-hood optimizations. The talk will focus on code, with minimal slides.

*Scio is a Scala API for Apache Beam and Google Cloud Dataflow for unified batch and streaming data processing. It is used by 300+ developers at Spotify for 1500+ production batch and streaming data pipelines, and by many other companies worldwide.

Code for the talk: https://github.com/nevillelyh/scio-deep-dive
Scio: https://github.com/spotify/scio

Session length
40 minutes
Language of the presentation
English
Target audience
Intermediate: Requires a basic knowledge of the area
Who is your session intended for
People who work with distributed data processing systems, e.g. Scio, Flink, Spark
People who work with Scala APIs for Java libraries
Speaker
Neville Li (Spotify)
