Pandas Plotting

Pandas includes built-in plotting methods that use Matplotlib under the hood. You can create line plots, bar charts, histograms, scatter plots, and other chart types directly from DataFrames and Series. This tutorial explores how to generate and customize plots in Pandas.

Basic Plotting

To create a basic plot, use the plot() method on a DataFrame or Series. By default, it generates a line plot. Here’s an example:

import pandas as pd
import matplotlib.pyplot as plt

# Create a sample DataFrame
data = {
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales": [100, 120, 150, 130, 170]
}

df = pd.DataFrame(data)

# Plot a line chart
df.plot(x="Month", y="Sales", kind="line", title="Monthly Sales", xlabel="Month", ylabel="Sales (Units)")
plt.show()

Output: A line plot displaying sales trends over months.

Explanation: The plot() method generates a line chart using the Month column as the x-axis and Sales as the y-axis. Titles and labels are added using the title, xlabel, and ylabel parameters for better understanding.
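Since plot() is also available on a Series, the same chart can be drawn without naming x and y columns. The following is a minimal sketch, assuming the months are stored as the Series index; the variable name sales and the Agg backend line are illustrative choices, not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import pandas as pd

# Same sample values as above, with the months used as the index
sales = pd.Series(
    [100, 120, 150, 130, 170],
    index=["Jan", "Feb", "Mar", "Apr", "May"],
    name="Sales",
)

# Series.plot() takes the x-axis from the index and draws a line by default
ax = sales.plot(title="Monthly Sales", ylabel="Sales (Units)")
```

In an interactive session, drop the backend line and call plt.show() as in the example above to display the figure.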

Customizing Plots

You can change the plot type with the kind parameter of the plot() method. Supported values include "bar", "hist" (histogram), and "scatter". Here’s an example of a bar plot:

# Create a bar plot
df.plot(x="Month", y="Sales", kind="bar", title="Monthly Sales", color="skyblue")
plt.show()

Output: A bar chart visualizing sales across different months.

Explanation: The kind="bar" parameter specifies a bar plot, and color sets the fill color of the bars. Only a title is set here; xlabel and ylabel can be passed in the same way as in the line plot to label the axes.
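A histogram follows the same pattern, with kind="hist" as the value to pass. The sketch below reuses the sample sales figures; the bin count, the edge color, and the axis label are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales": [100, 120, 150, 130, 170],
})

# kind="hist" groups the Sales values into bins and counts how many fall in each
ax = df["Sales"].plot(
    kind="hist",
    bins=4,                 # illustrative bin count
    title="Distribution of Monthly Sales",
    color="skyblue",
    edgecolor="black",      # passed through to Matplotlib to outline the bars
)
ax.set_xlabel("Sales (Units)")
```

With only five data points the histogram is coarse, but the call is the same for larger datasets.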

Scatter Plots

Scatter plots visualize relationships between two numerical variables. To create a scatter plot, set kind="scatter" and provide both x and y columns. Here’s an example:

# Create a scatter plot
data = {
    "Temperature": [30, 35, 28, 25, 40],
    "Sales": [100, 150, 80, 50, 200]
}

df = pd.DataFrame(data)

df.plot(x="Temperature", y="Sales", kind="scatter", title="Temperature vs Sales", color="red")
plt.show()

Output: A scatter plot showing the relationship between temperature and sales.

Explanation: The kind="scatter" parameter specifies the plot type. Each row becomes one point at its (Temperature, Sales) position, which makes it easy to see whether sales tend to rise or fall as temperature changes.
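Pandas also exposes each plot type as a method on the plot accessor, so the scatter plot above can be written as df.plot.scatter(...). The following sketch uses that form; the marker size s=60 is an arbitrary value added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import pandas as pd

df = pd.DataFrame({
    "Temperature": [30, 35, 28, 25, 40],
    "Sales": [100, 150, 80, 50, 200],
})

# Accessor form of kind="scatter"; s sets the marker size
ax = df.plot.scatter(
    x="Temperature",
    y="Sales",
    s=60,
    color="red",
    title="Temperature vs Sales",
)
```

Both spellings produce the same kind of chart; the accessor form can be easier to discover through editor autocompletion.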

Key Takeaways

  • Line Plots: The default plot type in Pandas is a line plot, ideal for visualizing trends.
  • Customizing Plots: Use the kind parameter (for example "bar", "hist", or "scatter") to generate other plot types.
  • Scatter Plots: Scatter plots help visualize relationships between two numerical variables.
  • Matplotlib Integration: Pandas plots are built on Matplotlib, allowing for further customization using Matplotlib’s functions.
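
As an illustration of the Matplotlib integration, plot() returns a Matplotlib Axes object that can be adjusted after the plot is drawn. This is a minimal sketch rather than a recommended recipe; the average line, its styling, and the tick rotation are assumptions added for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs without a display
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "Sales": [100, 120, 150, 130, 170],
})

# plot() hands back the Matplotlib Axes it drew on
ax = df.plot(x="Month", y="Sales", kind="bar", color="skyblue")

# From here on, these are ordinary Matplotlib calls
ax.set_title("Monthly Sales")
ax.set_ylabel("Sales (Units)")
ax.axhline(df["Sales"].mean(), color="gray", linestyle="--", label="Average")
ax.tick_params(axis="x", rotation=0)  # keep month labels horizontal
ax.legend()
ax.figure.tight_layout()
```

Keeping a reference to the returned Axes avoids relying on Matplotlib's implicit current figure when a script draws several plots.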