C# String Manipulation with Regular Expressions

Regular expressions (regex) allow for powerful pattern matching and manipulation in strings. The System.Text.RegularExpressions namespace provides support through the Regex class, enabling developers to perform complex searches, validations, and transformations on text.

Key Topics

1. Using Regex.Match()

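The Regex.Match() method searches an input string for the first occurrence of a regular expression pattern and returns it as a Match object. The object's Success property indicates whether a match was found, and its Value property holds the matched text.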

Example: Extracting Numbers

using System;
using System.Text.RegularExpressions;

string input = "The price is 100 dollars.";
// Verbatim string (@"...") so the backslash reaches the regex engine unchanged.
string pattern = @"\d+";
Match match = Regex.Match(input, pattern);
Console.WriteLine(match.Value);  // Outputs: 100

Output:

100

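Code Explanation: The pattern \d+ matches one or more consecutive digits. It is written as a verbatim string (@"\d+") because "\d" is not a valid escape sequence in a regular C# string literal and would not compile. Regex.Match() returns the first match, which is 100.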
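When the input may not contain a match at all, it helps to check Match.Success before reading Match.Value. The following is a minimal sketch of that check; the second input string is made up for illustration and is not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

// The second input is a hypothetical string with no digits.
string[] inputs = { "The price is 100 dollars.", "No digits here." };

foreach (string input in inputs)
{
    Match match = Regex.Match(input, @"\d+");
    if (match.Success)
    {
        // Index is the zero-based position of the match in the input.
        Console.WriteLine($"Found '{match.Value}' at index {match.Index}");
    }
    else
    {
        // A failed match has Success == false and an empty Value.
        Console.WriteLine("No match found.");
    }
}
// Prints:
// Found '100' at index 13
// No match found.
```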

2. Validating Patterns

Regular expressions can be used to validate the format of strings, such as email addresses, phone numbers, and more.

Example: Validating an Email Address

using System;
using System.Text.RegularExpressions;

string email = "test@example.com";
string pattern = @"^[^\s@]+@[^\s@]+\.[^\s@]+$";
bool isValid = Regex.IsMatch(email, pattern);
Console.WriteLine(isValid);  // Outputs: True

Output:

True

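Code Explanation: The pattern @"^[^\s@]+@[^\s@]+\.[^\s@]+$" requires one or more characters that are neither whitespace nor @, then a single @, then a domain part containing at least one dot. Because @ is excluded everywhere else, the address contains exactly one @ symbol. This is a basic format check, not full validation: it accepts strings such as "a@b.c" and does not verify that the domain exists.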
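To see what the pattern accepts and rejects, it can be run against a few candidate strings. This is a sketch only: the candidate addresses are invented for illustration, and constructing a reusable Regex instance is one possible design choice, not something the original example does.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern as the example above, constructed once and reused.
var emailRegex = new Regex(@"^[^\s@]+@[^\s@]+\.[^\s@]+$");

// Hypothetical test inputs.
string[] candidates =
{
    "test@example.com",   // well-formed
    "a@b.c",              // accepted: the pattern only checks the basic shape
    "no-at-sign.com",     // rejected: no @
    "two@@example.com",   // rejected: more than one @
    "user@localhost"      // rejected: no dot after the @
};

foreach (string candidate in candidates)
{
    Console.WriteLine($"{candidate}: {emailRegex.IsMatch(candidate)}");
}
// Prints:
// test@example.com: True
// a@b.c: True
// no-at-sign.com: False
// two@@example.com: False
// user@localhost: False
```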

3. Replacing Patterns

Regular expressions can replace parts of strings that match a specific pattern, which is useful for masking sensitive information or reformatting text.

Example: Masking Sensitive Information

using System;
using System.Text.RegularExpressions;

string text = "My phone number is 123-456-7890.";
// Verbatim string (@"...") so that \d is passed to the regex engine as written.
string pattern = @"\d{3}-\d{3}-\d{4}";
string result = Regex.Replace(text, pattern, "[redacted]");
Console.WriteLine(result);  // Outputs: My phone number is [redacted].

Output:

My phone number is [redacted].

Code Explanation: The pattern @"\d{3}-\d{3}-\d{4}" matches phone numbers written as three digits, three digits, and four digits separated by hyphens. Regex.Replace() replaces every match in the input, not only the first, with "[redacted]".
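Replacement strings can also reuse parts of the match, which is how reformatting and partial masking are usually done. The sketch below assumes you want to keep the last four digits visible; the capture group and the "***-***-" mask are illustrative choices, not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "My phone number is 123-456-7890.";

// The parentheses capture the last four digits as group 1.
string pattern = @"\d{3}-\d{3}-(\d{4})";

// In the replacement string, $1 is substituted with the text of group 1.
string result = Regex.Replace(text, pattern, "***-***-$1");

Console.WriteLine(result);  // Outputs: My phone number is ***-***-7890.
```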

4. Regex Options

The Regex class accepts options that control how a pattern is matched, such as case-insensitivity (RegexOptions.IgnoreCase) and multiline mode (RegexOptions.Multiline). Options are passed as a RegexOptions value and can be combined with the bitwise OR operator (|).

Example: Case-Insensitive Matching

using System;
using System.Text.RegularExpressions;

string input = "Hello World";
string pattern = "hello";
Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
Console.WriteLine(match.Success ? match.Value : "No match found.");  // Outputs: Hello

Output:

Hello

Code Explanation: By using RegexOptions.IgnoreCase, the regex engine ignores case differences, allowing "hello" to match "Hello".
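Multiline mode changes what the ^ and $ anchors mean: with RegexOptions.Multiline they match at the start and end of every line instead of only at the start and end of the whole string. The following sketch uses an invented three-line input and Regex.Matches(), which returns all matches rather than only the first.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical three-line input.
string input = "first line\nSECOND line\nthird line";

// Without Multiline, ^ matches only at the start of the whole string.
MatchCollection single = Regex.Matches(input, @"^\w+");
Console.WriteLine(single.Count);  // Outputs: 1

// With Multiline, ^ also matches at the start of each line.
MatchCollection multi = Regex.Matches(input, @"^\w+", RegexOptions.Multiline);
Console.WriteLine(multi.Count);   // Outputs: 3

// Options can be combined with the bitwise OR operator.
bool found = Regex.IsMatch(
    input,
    @"^second",
    RegexOptions.IgnoreCase | RegexOptions.Multiline);
Console.WriteLine(found);         // Outputs: True
```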

5. Advanced Regex Techniques

Advanced regex techniques include using groups, lookaheads, and other assertions to perform more complex matching and extraction. These techniques enable more sophisticated string manipulations and validations.

Example: Using Groups to Extract Data

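using System;
using System.Text.RegularExpressions;

string logEntry = "2024-04-27 14:35:22 ERROR An unexpected error occurred.";
// Named groups: date, time, level, and message.
string pattern = @"^(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}:\d{2}:\d{2}) (?<level>\w+) (?<message>.+)$";
Match match = Regex.Match(logEntry, pattern);
if (match.Success)
{
    Console.WriteLine($"Date: {match.Groups["date"].Value}");
    Console.WriteLine($"Time: {match.Groups["time"].Value}");
    Console.WriteLine($"Level: {match.Groups["level"].Value}");
    Console.WriteLine($"Message: {match.Groups["message"].Value}");
}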

Output:

Date: 2024-04-27
Time: 14:35:22
Level: ERROR
Message: An unexpected error occurred.

Code Explanation: The pattern uses named groups, written as (?<name>...), to capture the date, time, log level, and message from a log entry. After a successful match, each value is read from match.Groups by its group name and printed.
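Lookaheads are the other technique mentioned in this section. A positive lookahead, written as (?=...), requires that some text follows the current position without including it in the match. The sketch below is an illustration with an invented input string: it extracts only the amounts that are followed by the word "dollars".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input mixing two currencies.
string input = "Items: 100 dollars, 250 euros, 75 dollars.";

// \d+ matches the number; (?= dollars) requires " dollars" to follow
// but does not consume it, so it is not part of match.Value.
string pattern = @"\d+(?= dollars)";

foreach (Match match in Regex.Matches(input, pattern))
{
    Console.WriteLine(match.Value);
}
// Prints:
// 100
// 75
```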

Key Takeaways

  • Regular expressions offer powerful pattern matching capabilities for strings.
  • The Regex class provides methods like Match(), IsMatch(), and Replace() for various operations.
  • Regex options like IgnoreCase and Multiline can modify matching behavior.
  • Advanced techniques such as grouping and lookaheads enable complex string manipulations.
  • Always test and validate your regex patterns to ensure they work as intended.