Java Regular Expressions

Regular expressions (regex) are a powerful tool for processing text. They let you describe a pattern of text to search for, validate, or transform. Java provides the java.util.regex package, whose classes compile regular expressions and apply them to text. This tutorial guides you through the basics of using regular expressions in Java.

Basics of Regular Expressions

A regular expression is a sequence of characters that defines a search pattern. This pattern can be used for string searching and manipulation. In Java, regular expressions are implemented through the Pattern and Matcher classes.

Pattern Class

A Pattern object is a compiled representation of a regular expression. Pattern has no public constructor, so you cannot create an instance with new. Instead, you call the static Pattern.compile() method, which throws a PatternSyntaxException if the expression is malformed.

java
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("regex");
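As a minimal sketch of the compile step described above, the following shows Pattern.compile() accepting a well-formed expression and rejecting a malformed one. The class name CompileSketch and the helper isValidRegex are hypothetical names chosen for this illustration; Pattern.compile(), Pattern.pattern(), and PatternSyntaxException come from java.util.regex.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class CompileSketch {
    // Hypothetical helper: returns true if the string compiles as a regex.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("[a-z]+");
        System.out.println(pattern.pattern());       // prints the source string: [a-z]+
        System.out.println(isValidRegex("[a-z]+"));  // true: well-formed
        System.out.println(isValidRegex("[a-z"));    // false: unclosed character class
    }
}
```

Because compilation is where syntax errors surface, it is usually worth compiling a pattern once and reusing the resulting object rather than recompiling it on every use.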

Matcher Class

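The Matcher class performs match operations on a specific input string using a compiled pattern. You create a Matcher object by invoking the matcher() method on a Pattern object. A single Pattern can create any number of Matcher objects, one for each input.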

java
import java.util.regex.Matcher;

Matcher matcher = pattern.matcher("text to search");
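The relationship between the two classes can be sketched as follows: one compiled Pattern is shared, and a fresh Matcher is created for each input. The class MatcherSketch, the constant DIGITS, and the helper locate are hypothetical names for this illustration; find(), group(), start(), and end() are standard Matcher methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherSketch {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Hypothetical helper: describes each match as "text@start-end".
    static List<String> locate(String input) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input); // a fresh Matcher per input
        while (matcher.find()) {
            // start() is inclusive, end() is exclusive
            hits.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(locate("order 42, item 7")); // [42@6-8, 7@15-16]
        System.out.println(locate("no digits here"));   // []
    }
}
```

Unlike Pattern, a Matcher holds mutable state about the current match, so it should not be shared between threads.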

Basic Example

java
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class RegexExample {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("w3schools");
        Matcher matcher = pattern.matcher("Visit w3schools!");

        // find() returns true if the pattern occurs anywhere in the input
        boolean found = matcher.find();
        if (found) {
            System.out.println("Match found");
        } else {
            System.out.println("Match not found");
        }
    }
}

Special Characters in Regex

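  • .: Matches any single character except a line terminator (unless Pattern.DOTALL is set).
  • ^: Matches the beginning of the input, or the beginning of each line when Pattern.MULTILINE is set.
  • $: Matches the end of the input (or just before a final line terminator), or the end of each line when Pattern.MULTILINE is set.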
  • *: Matches zero or more occurrences of the preceding element.
  • +: Matches one or more occurrences of the preceding element.
  • ?: Matches zero or one occurrence of the preceding element.
  • []: Matches any single character within the brackets.
  • {}: Specifies how many times the preceding element must occur: {n} exactly n times, {n,} at least n times, {n,m} between n and m times.
  • (): Defines a group.
  • |: Acts as a logical OR.
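The list above can be tried out with a short sketch such as the following. The class MetacharSketch and the helper fullMatch are hypothetical names; the helper wraps the static Pattern.matches() method, which requires the whole input to match the expression.

```java
import java.util.regex.Pattern;

public class MetacharSketch {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("c.t", "cat"));          // true: . matches 'a'
        System.out.println(fullMatch("ab*c", "ac"));          // true: * allows zero 'b'
        System.out.println(fullMatch("ab+c", "ac"));          // false: + needs at least one 'b'
        System.out.println(fullMatch("colou?r", "color"));    // true: ? makes 'u' optional
        System.out.println(fullMatch("[0-9]{3}", "12"));      // false: {3} needs exactly three digits
        System.out.println(fullMatch("(cat|dog)s", "dogs"));  // true: | chooses an alternative inside the group
    }
}
```

To match one of these characters literally, escape it with a backslash, which must itself be doubled inside a Java string literal (for example, "\\." matches a literal dot).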

Methods of Matcher Class

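  • find(): Scans the input for the next subsequence that matches the pattern; call it repeatedly to iterate over all matches.
  • matches(): Attempts to match the entire region against the pattern.
  • lookingAt(): Attempts to match the pattern starting at the beginning of the region; unlike matches(), it does not require the whole region to match.
  • group(): Returns the input subsequence matched by the previous successful match; throws IllegalStateException if no match has been attempted or the last attempt failed.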
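The difference between these methods is easiest to see side by side. The following is a small sketch; the class MatchMethodsSketch and the helpers describe and firstMatch are hypothetical names introduced for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchMethodsSketch {
    private static final Pattern LETTERS = Pattern.compile("[a-z]+");

    // Hypothetical helper: reports find/matches/lookingAt results for one input.
    static String describe(String input) {
        boolean found = LETTERS.matcher(input).find();        // match anywhere
        boolean whole = LETTERS.matcher(input).matches();     // match the entire input
        boolean prefix = LETTERS.matcher(input).lookingAt();  // match at the start only
        return "find=" + found + " matches=" + whole + " lookingAt=" + prefix;
    }

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(String input) {
        Matcher matcher = LETTERS.matcher(input);
        // group() is only valid after a successful match attempt
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));      // find=true matches=true lookingAt=true
        System.out.println(describe("abc123"));   // find=true matches=false lookingAt=true
        System.out.println(describe("123abc"));   // find=true matches=false lookingAt=false
        System.out.println(firstMatch("123abc")); // abc
    }
}
```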

Examples

Matching Patterns

java
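import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("[a-z]+"); // one or more lowercase letters
Matcher matcher = pattern.matcher("regex123");

// Each call to find() advances to the next match; here only "regex" matches
while (matcher.find()) {
    System.out.println("Found a match: " + matcher.group());
}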

Splitting Strings

java
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("\\s+"); // one or more whitespace characters
String[] words = pattern.split("One Two Three");

for (String word : words) {
    System.out.println(word);
}

Output:

One
Two
Three

Replacing Text

java
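import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("dog");
Matcher matcher = pattern.matcher("The quick brown fox jumps over the lazy dog.");

// replaceAll() replaces every match and returns a new string
String replaced = matcher.replaceAll("cat");
System.out.println(replaced);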

Output:

The quick brown fox jumps over the lazy cat.
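Replacement strings are not always literal: in replaceAll(), $1, $2, and so on refer to the text captured by the corresponding group. The sketch below reorders a date this way and also shows Pattern.quote() and Matcher.quoteReplacement() for cases where both the target and the replacement should be treated as plain text. The class ReplaceSketch, the constant ISO_DATE, and both helper methods are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceSketch {
    // Three capturing groups: year, month, day.
    private static final Pattern ISO_DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Hypothetical helper: rewrites yyyy-mm-dd dates as dd/mm/yyyy.
    static String toDayFirst(String text) {
        return ISO_DATE.matcher(text).replaceAll("$3/$2/$1");
    }

    // Hypothetical helper: replaces target with replacement, treating both as literal text.
    static String replaceLiterally(String text, String target, String replacement) {
        return Pattern.compile(Pattern.quote(target))
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(toDayFirst("Released on 2024-03-15."));            // Released on 15/03/2024.
        System.out.println(replaceLiterally("Total: 5 USD", "5 USD", "$5"));  // Total: $5
    }
}
```

Without Matcher.quoteReplacement(), the "$5" in the second call would be read as a reference to group 5, which does not exist in that pattern.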

Flags

Pattern.compile() also accepts a second argument of flags that modify matching behavior. Combine multiple flags with the bitwise OR operator (|):

  • Pattern.CASE_INSENSITIVE: Enables case-insensitive matching.
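  • Pattern.MULTILINE: Makes ^ and $ match at the start and end of each line, not only at the start and end of the entire input.
  • Pattern.DOTALL: Makes . match any character, including line terminators.

java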
Pattern pattern = Pattern.compile("abc", Pattern.CASE_INSENSITIVE);
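To see what each flag changes, the following sketch counts matches of the same expression under different flag settings. The class FlagsSketch and the helper count are hypothetical names for this illustration; passing 0 as the flags argument means no flags are set.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagsSketch {
    // Hypothetical helper: counts how many times the regex is found under the given flags.
    static int count(String regex, int flags, String text) {
        Matcher matcher = Pattern.compile(regex, flags).matcher(text);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "item one\nitem two\nITEM three";

        // Without MULTILINE, ^ matches only at the very start of the input.
        System.out.println(count("^item", 0, text));                                            // 1
        // With MULTILINE, ^ also matches after each line terminator.
        System.out.println(count("^item", Pattern.MULTILINE, text));                            // 2
        // Flags are combined with bitwise OR.
        System.out.println(count("^item", Pattern.MULTILINE | Pattern.CASE_INSENSITIVE, text)); // 3
        // Without DOTALL, . does not match the newline between "one" and "item".
        System.out.println(count("one.item", 0, text));                                         // 0
        System.out.println(count("one.item", Pattern.DOTALL, text));                            // 1
    }
}
```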

Regular expressions in Java are a powerful feature for string processing. The java.util.regex package provides all the necessary classes and methods to work with regex. With practice, you can use regex to simplify complex text processing tasks in your Java applications. Remember that regex can be complicated and sometimes difficult to read, so always document your patterns for future reference.