Sorting Data

Sorting data is a fundamental operation in data analysis, enabling you to organize rows or columns based on specific criteria. Pandas provides the sort_values() and sort_index() methods for sorting DataFrames and Series. This tutorial explores how to sort data effectively using these methods.

Sorting by Column

To sort rows based on the values in a specific column, use the sort_values() method. By default, it sorts in ascending order. Here’s an example:

import pandas as pd

# Create a sample DataFrame
data = {
    "Name": ["Karthick", "Durai", "Praveen", "Naveen"],
    "Age": [25, 30, 22, 28],
    "City": ["Chennai", "Coimbatore", "Madurai", "Trichy"]
}

df = pd.DataFrame(data)

# Sort by Age in ascending order
sorted_df = df.sort_values(by="Age")
print(sorted_df)

Output

       Name  Age        City
2   Praveen   22     Madurai
0  Karthick   25     Chennai
3    Naveen   28      Trichy
1     Durai   30  Coimbatore

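Explanation: The sort_values(by="Age") method returns a new DataFrame with the rows ordered by the Age column in ascending order; df itself is unchanged, and each row keeps its original index label. The default algorithm (kind="quicksort") is not guaranteed to keep rows with equal values in their original order; pass kind="stable" when that order matters.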
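To see how ties are handled, here is a minimal sketch using a hypothetical DataFrame with duplicate ages (the names and values are made up for illustration and are not part of the sample data above). It passes kind="stable" so that rows with the same Age keep their original relative order.

```python
import pandas as pd

# Hypothetical data with two pairs of tied ages (illustration only)
ties = pd.DataFrame({
    "Name": ["Asha", "Bala", "Chitra", "Deepak"],
    "Age": [30, 25, 30, 25],
})

# kind="stable" keeps tied rows in their original relative order
stable_sorted = ties.sort_values(by="Age", kind="stable")
print(stable_sorted["Name"].tolist())  # ['Bala', 'Deepak', 'Asha', 'Chitra']
```

Within each age group the rows appear in the order they had in ties: Bala before Deepak, and Asha before Chitra.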

Sorting in Descending Order

To sort in descending order, set the ascending parameter to False. Here’s an example:

# Sort by Age in descending order
sorted_df = df.sort_values(by="Age", ascending=False)
print(sorted_df)

Output

       Name  Age        City
1     Durai   30  Coimbatore
3    Naveen   28      Trichy
0  Karthick   25     Chennai
2   Praveen   22     Madurai

Explanation: The ascending=False parameter sorts the rows in descending order based on the Age column.
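The ascending parameter also accepts a list of booleans, one per column named in by, which allows a different direction for each sort key. The sketch below assumes a hypothetical DataFrame (invented names, repeated cities) and sorts City from A to Z while ordering Age from highest to lowest within each city.

```python
import pandas as pd

# Hypothetical data: two cities with two people each (illustration only)
people = pd.DataFrame({
    "Name": ["Asha", "Bala", "Chitra", "Deepak"],
    "Age": [30, 25, 22, 28],
    "City": ["Madurai", "Chennai", "Madurai", "Chennai"],
})

# City ascending, then Age descending within each city
mixed = people.sort_values(by=["City", "Age"], ascending=[True, False])
print(mixed)
```

The Chennai rows come first, with Deepak (28) ahead of Bala (25), followed by the Madurai rows, with Asha (30) ahead of Chitra (22).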

Sorting by Index

To sort rows by their index labels, use the sort_index() method. This is useful for restoring the original row order after an operation such as sort_values() has rearranged the rows. Here’s an example:

# Sort by index
df_sorted_by_index = df.sort_index()
print(df_sorted_by_index)

Output

       Name  Age        City
0  Karthick   25     Chennai
1     Durai   30  Coimbatore
2   Praveen   22     Madurai
3    Naveen   28      Trichy

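Explanation: The sort_index() method sorts the rows by their index labels, in ascending order by default. Because df still carries its default index (0 to 3) in order, the output here matches the original DataFrame; calling sort_index() on sorted_df would restore the same order.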
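The following sketch shows sort_index() undoing a value sort, and also its axis=1 option, which sorts the column labels instead of the row labels. It rebuilds the sample DataFrame so it can run on its own; the variable names by_age, restored, and by_column_name are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Karthick", "Durai", "Praveen", "Naveen"],
    "Age": [25, 30, 22, 28],
    "City": ["Chennai", "Coimbatore", "Madurai", "Trichy"],
})

# Sorting by Age rearranges the rows, so the index labels are out of order
by_age = df.sort_values(by="Age")
print(by_age.index.tolist())  # [2, 0, 3, 1]

# sort_index() puts the rows back in index order
restored = by_age.sort_index()
print(restored.index.tolist())  # [0, 1, 2, 3]

# axis=1 sorts the column labels alphabetically instead of the rows
by_column_name = df.sort_index(axis=1)
print(by_column_name.columns.tolist())  # ['Age', 'City', 'Name']
```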

Key Takeaways

  • Sort by Column: Use sort_values() to sort rows based on column values.
  • Descending Order: Set ascending=False to sort in descending order.
  • Sort by Index: Use sort_index() to reorder rows based on their index values.
  • Purpose: Sorted data is easier to scan, compare, and present.