There is no better time to learn machine learning and deep learning than now.
Machine learning tools and frameworks have improved dramatically over the last few years. They do most of the heavy lifting, so you can focus on gaining practical skills.
This post is a detailed guide to learning machine learning on your own. I’ll share the topics you’ll need to know, the best available resources, and the order in which to study them.
If you follow this guide, you’ll intuitively understand machine learning techniques and gain practical experience in solving a broad range of problems.
What You Need to Know
Machine learning requires mastery of the following knowledge areas and topics:
|Knowledge Area|Topic (Course / Language / Framework)|
|Data Manipulation|NumPy and Pandas|
|Visualization|Matplotlib and Seaborn|
|Deep Learning|Keras and TensorFlow|
Let’s discuss these topics and the related study resources in detail.
Python
Python is the most popular programming language for machine learning. Here are a few reasons for that:
- It has a simple, beginner-friendly syntax that reads almost like natural language.
- Python has a vast number of open-source libraries covering every aspect of machine learning and data science.
- It has a vibrant developer community, so if you face any issues, you can quickly get help.
- There are far more machine learning job opportunities involving Python than any other language.
Python Crash Course, 3rd Edition is the only book you’ll need to learn Python fundamentals.
Here’s what I recommend you study from this book:
Chapters 1 to 11: They cover Python essentials such as data types, conditional statements, functions, classes, working with files, and unit testing.
Chapters 15 to 18: These project-based chapters will give you a first, bite-sized taste of data manipulation and analysis. You’ll learn to load data from APIs and CSV files and visualize it using Matplotlib.
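To give a feel for the essentials those chapters cover, here is a minimal sketch that combines a function, a conditional, a class, and a small test. The names (`letter_grade`, `Student`) and the grade thresholds are made up for illustration and do not come from the book.

```python
def letter_grade(score):
    """Map a numeric score (0-100) to a letter grade using conditionals."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    return "C"


class Student:
    """A tiny class that stores a name and a list of scores."""

    def __init__(self, name, scores):
        self.name = name
        self.scores = list(scores)

    def average(self):
        """Return the arithmetic mean of the stored scores."""
        return sum(self.scores) / len(self.scores)


def test_average_grade():
    """A test function in the style that a runner such as pytest can discover."""
    student = Student("Ada", [95, 85, 90])
    assert student.average() == 90
    assert letter_grade(student.average()) == "A"
```

If you save this in a file whose name starts with `test_`, a test runner such as pytest should pick up `test_average_grade` automatically.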
Statistics
If you intend to master machine learning, you must have a solid foundation in statistics. There is simply no way around it.
You don’t need a master’s or Ph.D. in statistics, though. You’ll be ready to grasp machine learning concepts if you are comfortable with high school statistics.
Become a Probability & Statistics Master (Udemy) by Krista King is the best introductory statistics course I’ve come across. It has enough exercises to help you reinforce what you’ve learned.
If you prefer physical books, Statistics in Plain English, 5th Edition by Timothy Urdan is an excellent alternative.
I used both of these resources as they complement each other nicely.
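For a sense of the level meant by high school statistics, the sketch below computes a few descriptive statistics with Python’s built-in `statistics` module. The sample values are invented for illustration.

```python
import statistics

# Hypothetical sample: exam scores for eight students.
scores = [62, 70, 70, 75, 81, 88, 90, 96]

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value
stdev = statistics.stdev(scores)    # sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}, stdev={stdev:.2f}")
```

If you can explain what each of these four numbers says about the data, you are likely at the level this guide assumes.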
NumPy and Pandas
Machine learning involves storing and manipulating large amounts of data. Regular Python lists can become too slow and memory-hungry for such demanding tasks.
That’s where NumPy comes to our rescue. It can store and perform operations on multi-dimensional numerical data efficiently. NumPy also contains useful classes and functions for random number generation, statistics, linear algebra, etc.
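As a rough illustration of those capabilities, the sketch below builds a small two-dimensional array and applies vectorized arithmetic, a per-column statistic, a matrix product, and seeded random number generation. The array values are arbitrary.

```python
import numpy as np

# A 2-D array: 3 rows (samples) by 2 columns (features).
data = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])

# Vectorized arithmetic applies to every element without a Python loop.
scaled = data * 10

# Statistics along an axis: axis=0 gives one mean per column.
column_means = data.mean(axis=0)

# Linear algebra: matrix product of the transpose with the data (2x2 result).
gram = data.T @ data

# Random number generation with a seeded generator for reproducibility.
rng = np.random.default_rng(seed=0)
noise = rng.normal(loc=0.0, scale=1.0, size=data.shape)

print(scaled.shape, column_means, gram.shape, noise.shape)
```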
The Pandas library goes one step further. It provides tools to handle messy datasets you’ll find in the real world.
Pandas can load data from a wide range of sources: CSV, Excel, SQL databases, JSON, XML, HTML, and more. Then you can use Pandas to clean, summarize, filter, and even merge datasets!
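The load, clean, summarize, filter, and merge steps can be sketched as follows. The CSV content, the column names, and the `countries` table are hypothetical; the data is read from an in-memory string so the example runs without any files.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to read_csv.
csv_text = """city,temp_c
Paris,21.5
Oslo,
Rome,27.0
Paris,19.5
"""
temps = pd.read_csv(io.StringIO(csv_text))

# Clean: drop rows with a missing temperature.
temps = temps.dropna(subset=["temp_c"])

# Summarize: mean temperature per city.
mean_by_city = temps.groupby("city")["temp_c"].mean()

# Filter: keep only readings above 20 degrees.
warm = temps[temps["temp_c"] > 20]

# Merge: attach country information from a second (made-up) table.
countries = pd.DataFrame({"city": ["Paris", "Rome"], "country": ["France", "Italy"]})
merged = temps.merge(countries, on="city", how="left")

print(mean_by_city)
print(merged)
```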
I highly recommend The Complete Pandas Bootcamp (Udemy) by Alexander Hagmann for an in-depth introduction to Pandas.
The best feature of this course is that it has tons of practical exercises. If you finish all of them (without peeking at solutions!), you can consider yourself an expert in Pandas.
The course also has a section on NumPy. You should finish it before starting with Pandas.
Python Data Science Handbook, 2nd Edition by Jake VanderPlas covers NumPy and Pandas extensively as well (Chapters 4 - 24). The only downside is that it doesn’t include any exercises, so you don’t get a chance to practice your newly acquired skills.
Matplotlib and Seaborn
Matplotlib is the most widely used visualization library for Python. You can use it to draw many kinds of graphs: line plots, pie charts, bar plots, scatter plots, histograms, box plots, and more. You can even draw 3D and interactive plots.
Seaborn extends Matplotlib and helps you create beautiful graphs. You can use its themes and color palettes to customize the look and feel of your plots.
Seaborn also works well with Pandas. That’s a huge plus, given that you’ll be doing most of your data analysis using Pandas.
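Here is a minimal sketch of the two libraries side by side, assuming both are installed. It draws a plain Matplotlib line plot and a Seaborn scatter plot from the same Pandas DataFrame. The data and column names are made up, and the non-interactive `Agg` backend is chosen only so the script runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the script runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up measurements for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 58, 65, 70, 78, 85],
    "group": ["a", "b", "a", "b", "a", "b"],
})

sns.set_theme(style="whitegrid")  # apply one of Seaborn's built-in themes

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Plain Matplotlib: a line plot drawn from DataFrame columns.
ax_left.plot(df["hours_studied"], df["exam_score"], marker="o")
ax_left.set_xlabel("hours_studied")
ax_left.set_ylabel("exam_score")

# Seaborn: pass the DataFrame and refer to columns by name.
sns.scatterplot(data=df, x="hours_studied", y="exam_score", hue="group", ax=ax_right)

fig.tight_layout()

# Render to an in-memory PNG; pass a file name instead to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Notice that the Seaborn call takes the DataFrame directly and handles axis labels and the legend itself, which is what makes it convenient alongside Pandas.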
The Udemy course mentioned earlier, The Complete Pandas Bootcamp, has great introductory sections on Matplotlib and Seaborn.
For more thorough coverage, you can read Chapters 25 - 36 of Python Data Science Handbook, 2nd Edition.
Machine Learning
Machine learning can be a challenging subject for beginners. They are introduced to many concepts in rapid succession:
- Exploratory data analysis
- Data preparation
- Training techniques (train-test split, cross-validation, hyperparameter tuning, etc.)
- Machine learning algorithms (linear and logistic regression, support vector machines, tree-based methods, etc.)
- Model evaluation metrics
- Scikit-Learn classes & methods for training, tuning, and testing models
- Deploying trained models in production
This barrage of new information can overwhelm even the most motivated students! That’s why it’s crucial to find learning resources that introduce these topics as gently as possible.
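To show how several of the concepts listed above fit together, here is a compact Scikit-Learn sketch covering data preparation, a train-test split, cross-validated hyperparameter tuning, and evaluation. The dataset (the built-in iris data), the choice of logistic regression, and the parameter grid are assumptions made for illustration, not a recommended recipe.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset (chosen only for illustration).
X, y = load_iris(return_X_y=True)

# Train-test split: hold out 25% of the rows for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Data preparation and model chained in one pipeline.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Hyperparameter tuning with 5-fold cross-validation on the training set.
param_grid = {"logisticregression__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Model evaluation on data the model has never seen.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline means it is refit on each cross-validation fold, so no information from the validation folds leaks into training.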
2022 Python for Machine Learning (Udemy) by Jose Portilla is the perfect course for a beginner. It provides an intuitive understanding of each topic before delving into technical details. There are ample projects for you to practice your skills.
Once you finish the above course, you’ll be ready to take your knowledge to an advanced level.
That’s where Hands-On Machine Learning with Scikit-Learn.., 3rd Edition by Aurélien Géron will come in handy. Chapters 1 - 9 of this book provide a mathematical foundation for the algorithms and cover Scikit-Learn in much greater depth.
Deep Learning
Deep learning has exploded into the mainstream within the last few years. It’s now used in applications such as speech recognition, computer vision, fraud detection, and autonomous driving.
I highly recommend Deep Learning with Python, 2nd Edition by François Chollet (Keras creator). It’s one of the rare books that’s written by a top expert and yet is extremely beginner-friendly.
The book doesn’t include any exercises, so consider studying it alongside Complete Tensorflow 2 and Keras Deep Learning Bootcamp by Jose Portilla. This Udemy course includes plenty of projects to get your hands dirty.
Order of Study
We’ve covered all the resources you’ll need to master machine learning. But how should you use them?
In what order should you study them to get the best learning experience?
Here’s the recommended sequence. Think of each numbered step as the prerequisite for the next one.
1. Start with Python and statistics. They are independent subjects, so you can learn them at the same time.
2. Next, learn data manipulation (NumPy and Pandas) and visualization (Matplotlib and Seaborn). You can study these simultaneously as well.
3. Then dive into machine learning.
4. Finally, study deep learning.
I’ve summarized this plan and all the resources for your reference:
One Last Thought
I’ll leave you with some advice that helped me immensely. Be sure to complete all exercises and projects when you study from the resources mentioned in this guide.
You can’t prepare for a marathon by just reading about it. That’s true for machine learning as well. Practice your skills at every chance you get.