# Pandas: Why and How to Use idxmin() and idxmax()

Today we’ll closely examine two pandas methods - **idxmin()** and **idxmax()**.

You can use these methods to get the **index labels** for the **minimum** or **maximum** **values** of a **Series** or **DataFrame**.

We’ll explore these operations using practical examples. Let’s dive in!

## The Example Dataset

Suppose you own a small bookstore and want to review the sales data for the past week.

Let’s create a pandas **Series** containing these sales numbers (US dollars). We’ll set the **day as the index**:

```
# Load pandas
import pandas as pd
# Day of the past week
days = [
'Sunday', 'Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday'
]
# and corresponding sales
sales = [987, 600, 320, 437, 520, 680, 1067]
store_sales = pd.Series(
data=sales, # Sales numbers
name='Sales', # Series name
index=days # Use days as the index
)
store_sales
```

```
Sunday 987
Monday 600
Tuesday 320
Wednesday 437
Thursday 520
Friday 680
Saturday 1067
Name: Sales, dtype: int64
```

## idxmin() - Index of the Minimum Value

Suppose you want to know - *which day of the last week had the lowest sales amount?*

Since we used the day as the index, we can translate this question to: *What’s the index of the minimum value of the Series store_sales?*

The method **idxmin()** will get you the answer:

```
# The series method idxmin() returns
# the index of the minimum value
store_sales.idxmin()
```

`'Tuesday'`

Thus the lowest sales of the week occurred on Tuesday. And what was that amount? Let’s get that using the method **min()**:

```
# Get the lowest sales amount
store_sales.min()
```

`320`

## idxmax() - Index of the Maximum Value

The Series method **idxmax()** returns the index of the maximum value. We’ll use it to find the day when the store generated the highest sales amount:

```
# The series method idxmax() returns
# the index of the maximum value
store_sales.idxmax()
```

`'Saturday'`

Let’s get the highest amount using the **max()** method:

`store_sales.max()`

`1067`

Thus the store had the best sales ($1067) on Saturday.

## Repeated Min / Max Values

What if the dataset contains the **minimum or maximum value** **more than once**? How would the methods **idxmin()** or **idxmax()** behave? Let’s find out.

Suppose you calculate the average sales per weekday for the last year. And store them in a pandas **Series**:

```
# week day
days = [
'Sunday', 'Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday'
]
# Average sale for each day
sales = [1100, 550, 300, 450, 300, 700, 1100]
avg_sales = pd.Series(
data=sales,
name='Sales',
index=days
)
# print as DataFrame for readability
avg_sales.to_frame()
```

Sales | |
---|---|

Sunday | 1100 |

Monday | 550 |

Tuesday | 300 |

Wednesday | 450 |

Thursday | 300 |

Friday | 700 |

Saturday | 1100 |

The minimum value, 300, occurs twice on Tuesday and Thursday. The maximum value, 1100, is repeated on Sunday and Saturday.

We expect the method **idxmin()** to return multiple index labels. But it only returns Tuesday, the row label of the **first occurrence of the minimum value**:

`avg_sales.idxmin()`

`'Tuesday'`

So **idxmin()** **failed** to get us all the days when the minimum value occurred. **What’s the alternative?**

We can use the boolean mask followed by the select operation. Let’s see this action.

First, we create a **boolean mask** that returns `True`

when the data value equals the minimum value:

```
# Boolean mask -
# Entries that equal the minimum value
# will be set to True
avg_sales == avg_sales.min()
```

```
Sunday False
Monday False
Tuesday True
Wednesday False
Thursday True
Friday False
Saturday False
Name: Sales, dtype: bool
```

The mask is set to `True`

for both Tuesday and Thursday. Next, use this mask to select the `True`

entries:

`avg_sales[avg_sales == avg_sales.min()]`

```
Tuesday 300
Thursday 300
Name: Sales, dtype: int64
```

We got the expected answer! Both Tuesday and Thursday had minimum sales of $300.

You can also extract the index labels (days) from the above query using the `index`

attribute:

`avg_sales[avg_sales == avg_sales.min()].index.to_list()`

`['Tuesday', 'Thursday']`

Let’s use this technique to extract the days of maximum average sales:

`avg_sales[avg_sales == avg_sales.max()].index.tolist()`

`['Sunday', 'Saturday']`

## idxmin() & idxmax() for DataFrames

The method **idxmin()** provides additional functionality for **DataFrames**. You can use it to **get the index label of the minimum value across rows or columns**.

Let me illustrate with an example.

Suppose you have employed three sales agents in your bookstore - Ava, Maya, and Priya. Let’s look at their average sales per day. We’ll store this data in a **DataFrame**:

```
days = [
'Sunday', 'Monday', 'Tuesday', 'Wednesday',
'Thursday', 'Friday', 'Saturday'
]
# Sales per day for each agent
sales = {
'Ava': [ 168, 186, 107, 155, 152, 212, 175],
'Maya': [ 200, 168, 106, 173, 164, 188, 184],
'Priya': [ 176, 172, 158, 164, 91, 198, 234]
}
# Load in a DataFrame
agent_sales = pd.DataFrame(
data=sales,
index=pd.Index(days, name='Days')
)
agent_sales
```

Ava | Maya | Priya | |
---|---|---|---|

Days | |||

Sunday | 168 | 200 | 176 |

Monday | 186 | 168 | 172 |

Tuesday | 107 | 106 | 158 |

Wednesday | 155 | 173 | 164 |

Thursday | 152 | 164 | 91 |

Friday | 212 | 188 | 198 |

Saturday | 175 | 184 | 234 |

Suppose you want to know: *Get me the day when each agent generated the minimum sales?*

You can use the DataFrame **idxmin()** method to get this information.

**By default**, it gets the minimum value across the row index. That is, it returns the **row index** (day) **of the minimum value** (sales) **for each column** (agent):

`agent_sales.idxmin()`

```
Ava Tuesday
Maya Tuesday
Priya Thursday
dtype: object
```

However, what if you ask a different question: *Which sales agent had the minimum sales on each day?*

We’ll need to use **idxmin()** with the ** axis** parameter. When you set the

**to**

`axis`

**,**

`'columns'`

**idxmin()**finds the

**index of minimum value across columns**.

Thus, the below code will return the agent (**column index**) with minimum sales for each day (**row**):

```
# Index of the minimum values across columns
agent_sales.idxmin(axis='columns')
```

```
Days
Sunday Ava
Monday Maya
Tuesday Maya
Wednesday Ava
Thursday Priya
Friday Maya
Saturday Ava
dtype: object
```

We can perform similar operations to get the index of the maximum values.

The default **idxmax()** call returns the **index of the maximum value across rows**. Thus below code finds the day (**row index**) of the maximum sales for each agent (**column**):

```
# DataFrame idxmax() call
# Find the index of the maximum value across rows
agent_sales.idxmax()
```

```
Ava Friday
Maya Sunday
Priya Saturday
dtype: object
```

When you set the ** axis** parameter to

**,**

`'columns'`

**idxmax()**finds the

**index of maximum value across columns**.

Thus it returns the **agent** (column index) with the maximum sales for each **day** (row index):

`agent_sales.idxmax(axis='columns')`

```
Days
Sunday Maya
Monday Ava
Tuesday Priya
Wednesday Maya
Thursday Maya
Friday Ava
Saturday Priya
dtype: object
```

## Summary

This quick tutorial showed you how to use pandas methods - **idxmin()** and **idxmax()**.

You can use them to get the index of the minimum or maximum value of a pandas **Series**.

Things get interesting with **DataFrames**. The **idxmin()** and **idxmax()** methods allow you to work across the **row** or **column** index. You can use the parameter ** axis** to control that behavior.

The tutorial included plenty of examples. So you should feel confident using these methods on your own.