Pandas Read CSV
A CSV (Comma-Separated Values) file is one of the most common formats for storing tabular data. Pandas provides an easy-to-use function, read_csv(), to read CSV files into a DataFrame. This allows for efficient data manipulation and analysis. Let’s explore how to read and handle CSV files using Pandas.
Reading a CSV File
To read a CSV file, you can use the pd.read_csv() function by providing the file path as an argument. The following example demonstrates reading a CSV file containing information about Indian rivers:
import pandas as pd
# Read a CSV file into a DataFrame
df = pd.read_csv("indian_rivers.csv")
# Display the first 5 rows
print(df.head())
Output
River | Length (km) | Origin | States Covered |
---|---|---|---|
Ganga | 2525 | Gangotri Glacier | 11 |
Godavari | 1465 | Trimbakeshwar | 6 |
Krishna | 1400 | Mahabaleshwar | 5 |
Kaveri | 805 | Talakaveri | 4 |
Brahmaputra | 2900 | Angsi Glacier | 5 |
Explanation: The pd.read_csv() function reads the CSV file indian_rivers.csv into a DataFrame named df. The .head() method returns the first 5 rows of the DataFrame by default, making it easy to preview the dataset. The columns River, Length (km), Origin, and States Covered represent the data fields in the CSV file.
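Since indian_rivers.csv itself is not included here, the following is a minimal sketch that builds an equivalent file in memory and reads it the same way. The use of io.StringIO and the variable names csv_text and rivers are assumptions made for illustration; the sample rows are copied from the output above.

```python
import io

import pandas as pd

# In-memory stand-in for indian_rivers.csv
# (assumed layout; rows copied from the output table above)
csv_text = """River,Length (km),Origin,States Covered
Ganga,2525,Gangotri Glacier,11
Godavari,1465,Trimbakeshwar,6
Krishna,1400,Mahabaleshwar,5
Kaveri,805,Talakaveri,4
Brahmaputra,2900,Angsi Glacier,5
"""

# read_csv accepts a file path or any file-like object
rivers = pd.read_csv(io.StringIO(csv_text))

print(rivers.shape)    # (number of rows, number of columns)
print(rivers.dtypes)   # column types inferred while parsing
print(rivers.head(3))  # head(n) returns the first n rows
```

Checking shape and dtypes right after loading is a quick way to confirm that the header row and numeric columns were parsed as expected.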
Specifying Parameters
The read_csv() function provides several parameters to customize the data import process. For example, you can specify a delimiter if the file uses a separator other than commas, skip rows, or select specific columns. Here’s an example:
# Read only selected columns; delimiter="," is the default and is shown here for illustration
df = pd.read_csv("indian_rivers.csv", delimiter=",", usecols=["River", "Length (km)"])
# Display the DataFrame
print(df)
Output
River | Length (km) |
---|---|
Ganga | 2525 |
Godavari | 1465 |
Krishna | 1400 |
Kaveri | 805 |
Brahmaputra | 2900 |
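Explanation: In this example, the usecols parameter selects only the River and Length (km) columns from the CSV file. The delimiter parameter (an alias for sep) sets the field separator. A comma is already the default, so passing it here changes nothing; it only shows where a different separator, such as a semicolon or a tab, would go.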
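To show a separator other than a comma together with row skipping and column selection, here is a small sketch under stated assumptions: the semicolon-separated text, its extra title line, and the names raw and subset are hypothetical and not part of the original dataset.

```python
import io

import pandas as pd

# Hypothetical export: one title line to skip, then semicolon-separated data
raw = """Exported river data
River;Length (km);Origin
Ganga;2525;Gangotri Glacier
Godavari;1465;Trimbakeshwar
Krishna;1400;Mahabaleshwar
"""

subset = pd.read_csv(
    io.StringIO(raw),
    sep=";",                           # separator other than a comma
    skiprows=1,                        # skip the title line above the header
    usecols=["River", "Length (km)"],  # keep only these two columns
)

print(subset)
```

Here skiprows=1 drops the first line so that the second line is treated as the header, and usecols then filters by those header names.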
Key Takeaways
- Simple Import: The pd.read_csv() function is used to read CSV files into DataFrames.
- Preview Data: Use .head() to display the first few rows of the DataFrame.
- Customizable Parameters: Parameters like usecols and delimiter allow flexibility in reading specific parts of the data.
- Common Format: CSV is a widely used format for tabular data, making it essential for real-world data analysis tasks.