During the summer of 2019 I taught a 432-hour Data Science bootcamp at NEOLAND, covering the following contents:
- Mathematical foundations. Basic notions of Probability and Statistics. Elementary Linear Algebra. Multivariable Calculus.
- Practical programming. Introduction to Python. Numpy. Pandas. Input/Output libraries. Introduction to R.
- Data visualization. Matplotlib. Seaborn.
- Data mining. Twitter API. Scraping. Machine Learning. Scikit-Learn. Linear Regression. Logistic Regression. Naive Bayes. K Nearest Neighbours. Singular Value Decomposition and Principal Component Analysis. Support Vector Machines. K Means. Trees. Ensemble Learning. Natural Language Processing. Sentiment analysis. Deep Learning. Neural nets. TensorFlow. Keras.
- Database systems. MySQL. SQLite3. MongoDB.
- Final project.
Here are some useful references:
- Hands-On Machine Learning with Scikit-Learn and TensorFlow, by Aurélien Géron. Also check out the Machine Learning Notebooks that accompany this fantastic book.
- Kaggle. The biggest Data Science platform. This is the best place to get datasets, take part in competitions and follow online courses.
- William Chen's answer to "How can I become a data scientist?". In this Quora post, William Chen offers a great guide, full of references to online resources. Highly recommended reading.
Here is some material that I developed for the course (in Spanish):
- Introductory notes on Probability and Statistics.
- Notes on Multivariable Calculus.
- Exercises on Multivariable Calculus.
- Introductory notes on databases.
- Slides on relational and non-relational databases.
- Database mini-project: SQL in Python.
- Notebook on SQL in Python (Jupyter File).
- Notebook on MongoDB in Python (Jupyter File).
- Introduction to Machine Learning notebook (Jupyter File).
- Notes on notation in modelling problems.
- Notes on Logistic Regression and classification problems.
- Notebook on Logistic Regression (Jupyter File).
- Notes on the Naive Bayes classifier.
- Notebook on the Naive Bayes classifier (Jupyter File).
- Notes on the K Nearest Neighbours algorithm.
- Notebook on the K Nearest Neighbours algorithm (Jupyter File).
- Notes on Support Vector Machines.
- Notebook on Support Vector Machines (Jupyter File).
- Notes on Ensemble Learning.
- Notebook on Ensemble Learning (Jupyter File).
- Notes on the K Means clustering algorithm.
- Notebook on the K Means clustering algorithm (Jupyter File).
- Notebook on Natural Language Processing and Sentiment Analysis (Jupyter File).