Welcome to Proclus Academy

Insightful Tutorials for Inquisitive Developers
Here are our latest articles on Machine Learning, React, REST APIs, Databases, and more.

Let's explore how Data Distribution enables you to extract general patterns from the data.

You'll also learn to visualize a distribution as a Histogram and a Density Curve using Matplotlib and Seaborn.

Visualizing Data Distribution: image showing a histogram and a density curve (KDE). Graph generated using Python, Matplotlib, and Seaborn.
Let's explore how you can use Matplotlib to draw pie charts with customized colors and labels. You can even apply styles tailored to each slice.

Along the way, you'll see what an exploding pie chart is and how to draw it. Finally, you'll learn to plot Donut Charts!

Summary Image for: Customizing Matplotlib Pie Chart
Do you want samples that accurately represent the population? Here's how Stratified Sampling can help.

You'll also develop practical skills and learn how to do sampling using Python and Pandas.

Summary Image for: What is Stratified Sampling and How to do it using Pandas?
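As a quick taste, here's a minimal sketch of stratified sampling with pandas. The toy `population` DataFrame and its `group` column are made up for illustration, and `groupby(...).sample()` is just one way to do it (it needs pandas 1.1 or newer).

```python
import pandas as pd

# Hypothetical toy population: 80 rows in stratum "A", 20 rows in stratum "B".
population = pd.DataFrame({
    "group": ["A"] * 80 + ["B"] * 20,
    "value": range(100),
})

# Sample 10% from each stratum so the sample keeps the population's proportions.
sample = population.groupby("group").sample(frac=0.1, random_state=42)

counts = sample["group"].value_counts().to_dict()
print(counts)
```

Because each stratum is sampled at the same rate, the 80/20 split of the population carries over to the sample.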
Let's explore how to create classification datasets with balanced or imbalanced classes and binary or multiclass labels.

You can even produce datasets that are harder to classify!

Summary Image for Scikit-Learn make_classification: 3 pears and an apple
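For a flavor of what that looks like, here's a small sketch using Scikit-Learn's `make_classification`. The specific values (1000 samples, a 90/10 class split, `class_sep=0.5`) are assumptions picked for illustration, not recommendations.

```python
from collections import Counter

from sklearn.datasets import make_classification

# Imbalanced binary dataset: roughly 90% of samples in class 0.
# flip_y=0 turns off random label noise so the class split stays close to `weights`.
X, y = make_classification(
    n_samples=1000,
    n_classes=2,
    weights=[0.9, 0.1],
    flip_y=0,
    random_state=42,
)
class_counts = Counter(y)
print(class_counts)

# A lower class_sep moves the classes closer together, making them harder to separate.
X_hard, y_hard = make_classification(
    n_samples=1000,
    class_sep=0.5,
    random_state=42,
)
```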
Let's learn how to calculate the Confusion Matrix and Accuracy using Python libraries.

I'll also show you two different ways to visualize the Confusion Matrix.

Summary Image: Confusion Matrix and Accuracy Using Scikit-Learn & Seaborn
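Here's a tiny sketch of the calculation with Scikit-Learn. The `y_true` and `y_pred` labels below are invented for illustration only.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and model predictions for 8 samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy_score(y_true, y_pred)

print(cm)   # [[3 1]
            #  [1 3]]
print(acc)  # 0.75
```

Accuracy is the share of correct predictions: the diagonal of the matrix (3 + 3) divided by all 8 samples.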
Let's look at the basic metrics to estimate a classification model’s predictive performance.

You'll also gain practical skills to generate and visualize these metrics using Scikit-Learn and Seaborn.

Summary Image - Using Confusion Matrix and Accuracy to Test Classification Models
How can you perform K-Fold Cross-Validation to evaluate machine learning models?

I'll show you two ways using Python and Scikit-Learn's helper functions: cross_val_score() and cross_validate().

Summary Image: K-Fold Cross-Validation Using Python and Scikit-Learn
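To give you an idea, here's a minimal sketch of both helpers. The synthetic dataset and the `LogisticRegression` model are stand-ins chosen for illustration; any estimator would work the same way.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, cross_validate

# Synthetic data and a simple model, both assumed for this example.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# cross_val_score: one metric, returns an array with one score per fold.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(scores.mean())

# cross_validate: several metrics at once, returns a dict that also holds timings.
results = cross_validate(model, X, y, cv=5, scoring=["accuracy", "f1"])
print(sorted(results.keys()))
```

Reach for `cross_val_score()` when one number per fold is enough, and `cross_validate()` when you want multiple metrics or fit and score times.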
It's time to learn Cross-Validation, the tool serious data scientists use to estimate model performance.

Cross-Validation builds upon Train Test Split and provides a better estimate of a machine learning model's performance.

Summary Image: K-Fold Cross-Validation
Sometimes you want to draw boxplots where each column gets its own y-axis.

Here's how you can do it using pandas, Matplotlib, and Seaborn.

Boxplot with separate Y-Axis For Each Column: Summary image - Taipei skyline with skyscrapers of different heights
Outliers can overshadow a feature's other data points. That distorts standard scaling, which relies on the mean and standard deviation.

Here's why you should use robust scaling to handle outliers.

Robust Scaling Summary Image: A tiny boat and a tall lighthouse.
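Here's a small sketch of the difference, using a made-up feature with a single outlier. The five values are assumptions for illustration; `RobustScaler` is used with its defaults (center on the median, scale by the interquartile range).

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# Hypothetical feature: four ordinary values and one outlier (100).
values = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# Standard scaling uses the mean and standard deviation, both inflated by the outlier.
standard = StandardScaler().fit_transform(values)

# Robust scaling uses the median (3) and the IQR (4 - 2 = 2), which the outlier barely affects.
robust = RobustScaler().fit_transform(values)

print(standard.ravel())
print(robust.ravel())  # [-1.  -0.5  0.   0.5 48.5]
```

With standard scaling, the four ordinary values end up squeezed into a narrow band. With robust scaling, they stay evenly spread while the outlier remains clearly separated.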