R Data Cleaning Functions

Data cleaning is an essential step in data analysis. Base R provides functions for handling missing data, removing duplicates, and converting data types.

Key Topics

Handling Missing Data

# Removing rows with missing values
cleaned_data <- na.omit(data)

# Checking for missing values
anyNA(data)

Note:

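Use na.omit() to drop every row that contains at least one NA, and anyNA() to test whether an object contains any NA at all. anyNA() returns a single TRUE or FALSE, not a per-row or per-column result.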
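
The two calls above can be tried on a small data frame. This is a minimal sketch: the data frame df, its columns, and its values are made up for illustration, and the per-column count with is.na() and colSums() is an extra step not covered above.

```r
# Hypothetical example data: two of the four rows contain an NA
df <- data.frame(
  id    = c(1, 2, 3, 4),
  score = c(90, NA, 75, 60),
  group = c("a", "b", NA, "a")
)

# anyNA() returns a single TRUE/FALSE for the whole object
has_missing <- anyNA(df)

# Count the NA values in each column to see where they occur
na_per_column <- colSums(is.na(df))

# na.omit() drops every row that has an NA in any column
cleaned_df <- na.omit(df)

print(has_missing)
print(na_per_column)
print(nrow(cleaned_df))
```

Because na.omit() discards a whole row even when only one column is missing, checking the per-column counts first shows how much data the call would remove.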

Removing Duplicates

# Removing duplicate rows
unique_data <- unique(data)

Note:

Use unique() to remove duplicate rows from the data. For a data frame, it keeps the first occurrence of each distinct row and drops later repeats.
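
As a rough illustration of how unique() treats repeated rows, the sketch below uses a hypothetical data frame df with one repeated row. It also calls duplicated(), which is not covered above, to show which rows would be dropped.

```r
# Hypothetical example data: rows 1 and 3 are identical
df <- data.frame(
  id   = c(1, 2, 1, 3),
  name = c("ann", "bob", "ann", "cy")
)

# duplicated() flags each row that repeats an earlier row
dup_flags <- duplicated(df)

# unique() keeps the first occurrence of each distinct row
unique_df <- unique(df)

print(dup_flags)
print(nrow(unique_df))
```

The result keeps the row names of the original rows, so they may no longer be consecutive. Setting rownames(unique_df) <- NULL resets them if that matters for later steps.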

Key Takeaways

  • Use na.omit() to remove rows with missing values.
  • Use anyNA() to check for missing data in your dataset.
  • Use unique() to remove duplicate rows from your data.