Practical introduction to Pandas and Scikit-learn via Kaggle problems
August 26, 2014 · 7:00 PM
Recent update:
Due to space constraints, we placed a limit on the number of attendees. It would be great if you could update your RSVP if you're no longer able to attend, as currently there is a waitlist.
We will be covering the following materials with the IPython Notebook:
https://github.com/savarin/kaggleberlin-tutorial
You will need to have IPython, pandas and scikit-learn installed. If you're not sure how to install these packages, we recommend the free Anaconda distribution.
Naturally, you will also need a Kaggle account.
Please arrive early to guarantee table space and access to a power outlet. We will spend the first 15 minutes setting up, and start promptly after.
Description:
Spending months on the nuances of machine learning models is a luxury most of us don’t have. This hands-on, practical IPython tutorial aims to provide a highly directed introduction to machine learning through solving Kaggle problems.
We’ll start with data manipulation using pandas - loading data, cleaning data and making simple plots. We'll then use scikit-learn to make predictions. By the end of the session, we will have solved a supervised learning problem from start to finish and seen how well we did on the leaderboard.
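To give a rough idea of the workflow described above, here is a minimal sketch. It is not taken from the tutorial notebook: the toy data, the column names (age, fare, label) and the choice of logistic regression are all assumptions made for illustration. With a real Kaggle problem the data would typically be loaded from the competition's CSV file using pd.read_csv, and the plotting step is left out here to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for a Kaggle training file.
# With a real competition you would load the downloaded CSV via pd.read_csv.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.normal(30, 10, n),
    "fare": rng.exponential(30, n),
})
# Made-up binary target that depends on age plus some noise.
df["label"] = (df["age"] + rng.normal(0, 5, n) > 30).astype(int)

# Blank out 10% of the ages to mimic missing values in real data.
missing_rows = df.sample(frac=0.1, random_state=0).index
df.loc[missing_rows, "age"] = np.nan

# Cleaning step: fill the missing ages with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# Split into features and target, then hold out 25% for evaluation.
X = df[["age", "fare"]]
y = df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a simple classifier and measure accuracy on the held-out rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

On Kaggle, the final step would instead be to predict on the competition's test file and upload the predictions to see the leaderboard score.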
This tutorial is designed to be an introductory session to pandas and scikit-learn. If you're familiar with GridSearchCV and Pipeline, then our intermediate tutorial might be more suitable.
If you're also interested in improving your Python skills in general, we would recommend the Open Tech School.