Sorting Data
Sorting data is a fundamental operation in data analysis, enabling you to organize rows or columns based on specific criteria. Pandas provides the sort_values() and sort_index() methods for sorting DataFrames and Series. This tutorial explores how to sort data effectively using these methods.
Sorting by Column
To sort rows based on the values in a specific column, use the sort_values() method. By default, it sorts in ascending order. Here’s an example:
import pandas as pd
# Create a sample DataFrame
data = {
    "Name": ["Karthick", "Durai", "Praveen", "Naveen"],
    "Age": [25, 30, 22, 28],
    "City": ["Chennai", "Coimbatore", "Madurai", "Trichy"],
}
df = pd.DataFrame(data)
# Sort by Age in ascending order
sorted_df = df.sort_values(by="Age")
print(sorted_df)
Output
Name | Age | City |
---|---|---|
Praveen | 22 | Madurai |
Karthick | 25 | Chennai |
Naveen | 28 | Trichy |
Durai | 30 | Coimbatore |
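Explanation: The sort_values(by="Age") method sorts the rows based on the Age column in ascending order and returns a new DataFrame, leaving df unchanged. The default algorithm (kind="quicksort") is not guaranteed to be stable, so rows with equal values may not keep their original relative order. Pass kind="mergesort" or kind="stable" if the order of ties matters.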
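How ties are ordered can be made explicit with the kind parameter of sort_values(). The sketch below is a minimal illustration, assuming a hypothetical people DataFrame with duplicate ages (not part of the example above) so that tie handling is visible:

```python
import pandas as pd

# Hypothetical data with tied ages, used only to make tie ordering visible
people = pd.DataFrame({
    "Name": ["Karthick", "Durai", "Praveen", "Naveen"],
    "Age": [25, 30, 25, 30],
})

# kind="mergesort" is a stable algorithm: rows with equal Age keep
# the relative order they had before sorting
stable_sorted = people.sort_values(by="Age", kind="mergesort")
print(stable_sorted)
```

With a stable sort, Karthick stays ahead of Praveen and Durai stays ahead of Naveen, because that is the order in which they appear in the input.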
Sorting in Descending Order
To sort in descending order, set the ascending parameter to False. Here’s an example:
# Sort by Age in descending order
sorted_df = df.sort_values(by="Age", ascending=False)
print(sorted_df)
Output
Name | Age | City |
---|---|---|
Durai | 30 | Coimbatore |
Naveen | 28 | Trichy |
Karthick | 25 | Chennai |
Praveen | 22 | Madurai |
Explanation: The ascending=False parameter sorts the rows in descending order based on the Age column.
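The same parameter applies to a Series, which has only one set of values and therefore takes no by argument. A minimal sketch, assuming a hypothetical ages Series built from the sample data and indexed by name:

```python
import pandas as pd

# Hypothetical Series: ages from the sample data, labelled by name
ages = pd.Series(
    [25, 30, 22, 28],
    index=["Karthick", "Durai", "Praveen", "Naveen"],
)

# Series.sort_values() sorts by the values themselves; no `by` is needed
oldest_first = ages.sort_values(ascending=False)
print(oldest_first)
```

The index labels move with their values, so each name stays attached to the right age after sorting.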
Sorting by Index
To sort rows based on their index, use the sort_index() method. This is useful for restoring row order after operations that rearrange rows. Here’s an example:
# Sort by index
df_sorted_by_index = df.sort_index()
print(df_sorted_by_index)
Output
Name | Age | City |
---|---|---|
Karthick | 25 | Chennai |
Durai | 30 | Coimbatore |
Praveen | 22 | Madurai |
Naveen | 28 | Trichy |
Explanation: The sort_index() method sorts the rows by their index values. Because df still has its default index (0 to 3), the output matches the original row order. This is helpful for reordering rows after transformations.
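The effect of sort_index() is easier to see on a DataFrame whose rows have already been rearranged. The sketch below rebuilds the sample data, sorts it by Age, and then restores the original order. It also shows axis=1, which sorts columns by label instead of rows. The variable names by_age, restored, and by_column_label are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Karthick", "Durai", "Praveen", "Naveen"],
    "Age": [25, 30, 22, 28],
    "City": ["Chennai", "Coimbatore", "Madurai", "Trichy"],
})

# Sorting by value rearranges the rows; each row keeps its index label
by_age = df.sort_values(by="Age")
print(by_age.index.tolist())

# Sorting by index puts the rows back in their original order
restored = by_age.sort_index()
print(restored.equals(df))

# axis=1 sorts the columns by their labels instead of sorting rows
by_column_label = df.sort_index(axis=1)
print(by_column_label.columns.tolist())
```

Note that sort_index() only reorders rows by their existing labels. It does not renumber the index.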
Key Takeaways
- Sort by Column: Use sort_values() to sort rows based on column values.
- Descending Order: Set ascending=False to sort in descending order.
- Sort by Index: Use sort_index() to reorder rows based on their index values.
- Efficiency: Sorting helps organize data for better analysis and presentation.