Budapest BI 2015 has ended
International Business Intelligence and analytics conference in the lovely city of Budapest, Hungary
Wednesday, October 14 • 14:40 - 15:10
Composing testable and robust machine learning pipelines

Finding a good structure for number-crunching code can be a problem. This applies especially to the routines that precede the core algorithms: transformations such as data processing and cleanup, as well as feature construction. Such code easily turns into a sequence of highly interdependent operations that are hard to separate. It can be challenging to test, maintain and reuse such “Data Science Spaghetti code”.
Data scientists face these problems on a day-to-day basis when writing machine learning pipelines. They matter even more if the models are to be used in a production environment. Scikit-Learn offers a simple yet powerful interface for data science algorithms: the estimator and composite classes (called meta-estimators). By example, I show how clever usage of meta-estimators can encapsulate elaborate machine learning models into a maintainable tree of objects that is both handy to use and simple to test. Looking at examples, I will show how this approach simplifies model development, testing and validation and how it brings together best practices from software engineering as well as data science.
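The idea of composing estimators into a tree of objects can be illustrated with a small, dependency-free sketch. It is not taken from the talk and it is not Scikit-Learn's implementation: the classes `MeanImputer`, `NearestCentroid` and `Pipeline` below are hypothetical toy versions that only imitate the `fit` / `transform` / `predict` conventions, and the data is made up. In Scikit-Learn itself, the `Pipeline` and `FeatureUnion` classes play the role of such composites.

```python
# Toy imitation of the estimator / meta-estimator pattern (illustrative only).

class MeanImputer:
    """Cleanup step: replaces None with the column mean learned in fit()."""

    def fit(self, X, y=None):
        self.means_ = []
        for column in zip(*X):
            known = [v for v in column if v is not None]
            self.means_.append(sum(known) / len(known))
        return self

    def transform(self, X):
        return [
            [self.means_[j] if v is None else v for j, v in enumerate(row)]
            for row in X
        ]


class NearestCentroid:
    """Final estimator: predicts the label of the closest class centroid."""

    def fit(self, X, y):
        self.centroids_ = {}
        for label in set(y):
            rows = [row for row, target in zip(X, y) if target == label]
            self.centroids_[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def squared_distance(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_,
                key=lambda label: squared_distance(row, self.centroids_[label]))
            for row in X
        ]


class Pipeline:
    """Meta-estimator: chains transformers and ends with a final estimator."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y=None):
        for step in self.steps[:-1]:
            X = step.fit(X, y).transform(X)
        self.steps[-1].fit(X, y)
        return self

    def predict(self, X):
        for step in self.steps[:-1]:
            X = step.transform(X)
        return self.steps[-1].predict(X)


# Made-up training data with missing values (None).
X_train = [[1.0, None], [2.0, 1.0], [8.0, 9.0], [None, 10.0]]
y_train = [0, 0, 1, 1]

model = Pipeline([MeanImputer(), NearestCentroid()]).fit(X_train, y_train)
predictions = model.predict([[1.5, None], [9.0, 9.5]])
```

Because every step is a separate object with the same small interface, each one can be unit-tested in isolation (for example, checking that `MeanImputer` fills a missing value with the learned mean), while the composite behaves like a single estimator to the calling code.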

Holger Peters

Data Scientist and Software Developer, Blue Yonder GmbH
Holger studied physics at the Karlsruhe Institute of Technology and the University of Waterloo. He has used the scientific Python stack for data analysis ever since working on his thesis. He works at Blue Yonder GmbH as a data scientist and software engineer, mainly programming in Python...

Wednesday October 14, 2015 14:40 - 15:10 CEST
Mátyás II.