This session will focus on:
What are features and why is feature engineering the most creative and time-consuming data science activity?
What are typical feature engineering approaches for different types of data, such as time-series data, event log data, and geographical data?
How do you decide which features are most relevant to solve your problem?
Description of the session:
Any intelligent algorithm that learns from data requires that the data be presented in a suitable form. The process of transforming the data and extracting its most relevant distinguishing characteristics is called feature engineering. It is arguably the most important step in the data science workflow, as even the most intelligent algorithm will not produce satisfactory results if the data used does not capture the essential properties of the phenomenon under study. There is no clearly defined formal process for engineering features; consequently, it requires a lot of creativity, iteration, and domain knowledge.
The goal of this session is to give an overview of the most commonly used approaches, as well as lessons learnt and common pitfalls for different types of data (sensor data, location data, etc.) and problem settings (prediction, profiling, etc.).
Further information and registration: http://www.sirris.be/agenda/mastercourse-data-innovation-art-feature-engineering-3rd-edition