This is a candidate session. ScalaMatsuri will select sessions with reference to participants' votes, which will be cast later.

(Scala, Apache Spark, TensorFlow, XGBoost) => Easy Machine Learning

Sophisticated open-source machine learning (ML) libraries such as Google TensorFlow and XGBoost have emerged. However, achieving high prediction accuracy with ML still requires tedious tuning through much trial and error. To relieve this pain, we have developed a tool that runs many TensorFlow/XGBoost processes simultaneously and computes the prediction scores of the trained models at high speed by leveraging Scala and Apache Spark. The tool is also designed to be extensible, so that upcoming open-source ML libraries can be adopted easily. In this talk, we will unveil the tool and share both the attractive and the challenging aspects of using Spark with Scala, along with what we have learned about using Spark effectively.

Session length: 40 minutes
Language of the presentation: Japanese
Target audience: Beginner (no prior knowledge required)
Who is this session intended for: People who are interested in data analysis in Scala
People who have experience using Apache Spark for distributed processing
People who use machine learning libraries such as TensorFlow and XGBoost
Speaker: Masato Asahara (@m_asahara) (NEC R&D)
