Feature engineering is the process of selecting and transforming properties of a data set to prepare it for training a machine learning model, and it is a vital component of successful ML systems. The task often involves writing cumbersome boilerplate, and that code is usually coupled to a specific processing system. At Spotify, we built Featran to simplify this time-consuming task and to support several processing frameworks under a uniform API, leveraging the powerful features and type safety of Scala. This talk will begin with an overview of big data and ML at Spotify, and then we’ll dive into the design and implementation of Featran.