Spark is a framework for parallel distributed data processing, which may sound like something difficult to work with even if you are interested in it. In reality, Spark is a data processing tool that runs on your local machine just as well as on a large-scale cluster. Scala users will also feel right at home, since Spark is open-source software written in Scala. You can therefore start quickly, running a data processing pipeline in Scala on your local machine, and keep building on it from there. Spark supports both processing of structured data, such as string and numeric types, and stream processing. This session will introduce how easily you can build a stream processing pipeline using Spark's new feature, Structured Streaming, which lets you write stream processing concisely as operations on structured data.
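To give a flavor of what "stream processing as structured data" looks like, here is a minimal sketch of the well-known streaming word count, assuming Spark 2.x with the `spark-sql` dependency on the classpath; the socket host and port are placeholder values:

```scala
import org.apache.spark.sql.SparkSession

object StructuredWordCount {
  def main(args: Array[String]): Unit = {
    // Runs on a local machine; swap the master URL to target a cluster
    val spark = SparkSession.builder
      .appName("StructuredWordCount")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Read lines from a socket as an unbounded DataFrame
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost") // placeholder host
      .option("port", 9999)        // placeholder port
      .load()

    // The same Dataset API as batch processing, applied to a stream
    val wordCounts = lines.as[String]
      .flatMap(_.split(" "))
      .groupBy("value")
      .count()

    // Print the running counts to the console as new data arrives
    val query = wordCounts.writeStream
      .outputMode("complete")
      .format("console")
      .start()
    query.awaitTermination()
  }
}
```

Note how the streaming query uses the ordinary DataFrame/Dataset operations (`flatMap`, `groupBy`, `count`); Structured Streaming treats the stream as an unbounded table, so batch-style code carries over almost unchanged.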