
Java - Regular Expressions

Java provides the java.util.regex package for pattern matching with regular expressions. Java regular expressions are very similar to those of the Perl programming language and are easy to learn. A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. They can be used to search, edit, or manipulate text and data. The java.util.regex package primarily consists of the following three classes:

Pattern Class: A Pattern object is a compiled representation of a regular expression. The Pattern class provides no public constructors. To create a pattern, you must first invoke one of its public static compile methods, which will then return a Pattern object. These methods accept a regular expression as the first argument.

Matcher Class: A Matcher object is the engine that interprets the pattern and performs match operations against an input string. Like the Pattern class, Matcher defines no public constructors. You obtain a Matcher object by invoking the matcher method on a Pattern object.

PatternSyntaxException: A PatternSyntaxException object is an unchecked exception that indicates a syntax error in a regular expression pattern.

Capturing Groups:
Capturing groups are a way to treat multiple characters as a single unit. They are created by placing the characters to be grouped inside a set of parentheses. For example, the regular expression (dog) creates a single group containing the letters "d", "o", and "g".

Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:

1. ((A)(B(C)))
2. (A)
3. (B(C))
4. (C)

To find out how many groups are present in the expression, call the groupCount method on a matcher object. The groupCount method returns an int showing the number of capturing groups present in the matcher's pattern.

There is also a special group, group 0, which always represents the entire expression. This group is not included in the total reported by groupCount.
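The numbering rule can be checked with a short sketch. The class name GroupNumberingDemo and the helper captureAll are hypothetical names chosen for this illustration; the input "ABC" is simply a string that the expression ((A)(B(C))) matches in full.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberingDemo {
    // Returns the text captured by every group; index 0 holds the whole match.
    static String[] captureAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        // groupCount() does not count group 0, so reserve one extra slot for it.
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] groups = captureAll("((A)(B(C)))", "ABC");
        for (int i = 0; i < groups.length; i++) {
            System.out.println("group " + i + ": " + groups[i]);
        }
    }
}
```

Groups 0 and 1 both hold "ABC" here, because the outermost parentheses enclose the entire expression.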

Example:
The following example illustrates how to find a digit string in a given alphanumeric string:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexMatches {
        public static void main(String[] args) {
            // String to be scanned to find the pattern.
            String line = "This order was placed for QT3000! OK?";
            String pattern = "(.*)(\\d+)(.*)";

            // Create a Pattern object
            Pattern r = Pattern.compile(pattern);

            // Now create matcher object.
            Matcher m = r.matcher(line);
            if (m.find()) {
                System.out.println("Found value: " + m.group(0));
                System.out.println("Found value: " + m.group(1));
                System.out.println("Found value: " + m.group(2));
            } else {
                System.out.println("NO MATCH");
            }
        }
    }

This would produce the following result:

    Found value: This order was placed for QT3000! OK?
    Found value: This order was placed for QT300
    Found value: 0

Because the first (.*) is greedy, it consumes as many characters as it can and leaves only the final digit for (\d+).
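If the goal is to capture the whole run of digits rather than only its last digit, one option is to make the leading group reluctant. The sketch below is an illustration, not part of the original example: it assumes a hypothetical helper named firstDigitRun and swaps (.*) for (.*?), which takes as few characters as possible.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyDigitsDemo {
    // Returns the first run of digits in the line, or null when there is none.
    static String firstDigitRun(String line) {
        // (.*?) is reluctant: it stops as soon as (\d+) can start matching.
        Matcher m = Pattern.compile("(.*?)(\\d+)(.*)").matcher(line);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Prints 3000 for this input.
        System.out.println(firstDigitRun("This order was placed for QT3000! OK?"));
    }
}
```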

Regular Expression Syntax:


Here is a table listing the regular expression metacharacter syntax available in Java:

Subexpression    Matches
^                Matches beginning of line (beginning of the input unless MULTILINE mode is enabled).
$                Matches end of line (end of the input unless MULTILINE mode is enabled).
.                Matches any single character except a line terminator. The DOTALL flag, (?s), allows it to match line terminators as well.
[...]            Matches any single character in brackets.
[^...]           Matches any single character not in brackets.
\A               Matches beginning of entire string.
\z               Matches end of entire string.
\Z               Matches end of entire string except allowable final line terminator. If a newline exists, it matches just before the newline.
re*              Matches 0 or more occurrences of preceding expression.
re+              Matches 1 or more occurrences of preceding expression.
re?              Matches 0 or 1 occurrence of preceding expression.
re{n}            Matches exactly n occurrences of preceding expression.
re{n,}           Matches n or more occurrences of preceding expression.
re{n,m}          Matches at least n and at most m occurrences of preceding expression.
a|b              Matches either a or b.
(re)             Groups regular expressions and remembers matched text.
(?:re)           Groups regular expressions without remembering matched text.
(?>re)           Matches independent pattern without backtracking.
\w               Matches word characters.
\W               Matches nonword characters.
\s               Matches whitespace. Equivalent to [ \t\n\x0B\f\r].
\S               Matches nonwhitespace.
\d               Matches digits. Equivalent to [0-9].
\D               Matches nondigits.
\G               Matches point where last match finished.
\n               Back-reference to capture group number n, where n is a digit (for example, \1).
\b               Matches word boundaries. Perl-style engines treat \b inside brackets as a backspace (0x08); do not rely on this in Java.
\B               Matches nonword boundaries.
\n, \t, etc.     Matches newlines, carriage returns, tabs, etc.
\Q               Escapes (quotes) all characters up to \E.
\E               Ends quoting begun with \Q.
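A few of these constructs are easier to follow in running code. The sketch below exercises a counted quantifier, a back-reference, and \Q...\E quoting; the class name SyntaxSamplesDemo, the helper found, and the sample inputs are assumptions made for this illustration. Note that every backslash is doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

public class SyntaxSamplesDemo {
    // True when the regex matches somewhere inside the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Counted quantifiers: exactly three digits, a hyphen, exactly four digits.
        System.out.println(Pattern.matches("\\d{3}-\\d{4}", "555-0199")); // true

        // Back-reference: \1 must repeat whatever group 1 captured.
        System.out.println(found("(\\w)\\1", "hello")); // true, matches "ll"
        System.out.println(found("(\\w)\\1", "world")); // false, no doubled character

        // Quoting: between \Q and \E the + is an ordinary character.
        System.out.println(Pattern.matches("\\Q1+1=2\\E", "1+1=2")); // true
        // Unquoted, 1+ means "one or more 1s", so the literal text no longer matches.
        System.out.println(Pattern.matches("1+1=2", "1+1=2")); // false
    }
}
```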

Methods of the Matcher Class:


Here is a list of useful instance methods:

Index Methods:
Index methods provide useful index values that show precisely where the match was found in the input string:

SN   Methods with Description
1    public int start()
     Returns the start index of the previous match.
2    public int start(int group)
     Returns the start index of the subsequence captured by the given group during the previous match operation.
3    public int end()
     Returns the offset after the last character matched.
4    public int end(int group)
     Returns the offset after the last character of the subsequence captured by the given group during the previous match operation.
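The group-specific overloads can be sketched as follows. The class name GroupOffsetsDemo, the helper groupSpan, and the sample input "id=42;" are hypothetical and serve only to show how start(int) and end(int) bracket a captured group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupOffsetsDemo {
    // Returns {start, end} of the given group in the first match, or null if no match.
    static int[] groupSpan(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(group), m.end(group) };
    }

    public static void main(String[] args) {
        String input = "id=42;";
        int[] span = groupSpan("(\\w+)=(\\d+)", input, 2);
        // end(group) is exclusive, so it can be passed straight to substring.
        System.out.println(span[0] + ".." + span[1] + " -> "
                + input.substring(span[0], span[1]));
    }
}
```

For this input the second group spans offsets 3 to 5, which is the text "42".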

Study Methods:

Study methods review the input string and return a boolean indicating whether or not the pattern is found:

SN   Methods with Description
1    public boolean lookingAt()
     Attempts to match the input sequence, starting at the beginning of the region, against the pattern.
2    public boolean find()
     Attempts to find the next subsequence of the input sequence that matches the pattern.
3    public boolean find(int start)
     Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
4    public boolean matches()
     Attempts to match the entire region against the pattern.
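Of these, find(int start) is the least obvious, so here is a minimal sketch of it. The class name FindFromIndexDemo and the helper findFrom are hypothetical; the helper returns -1 when nothing matches at or after the requested index.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindFromIndexDemo {
    // Returns the start offset of the first match at or after 'from', or -1 if none.
    static int findFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // find(int) resets the matcher before searching from the given index.
        return m.find(from) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "cat cat cat";
        System.out.println(findFrom("cat", input, 0)); // 0
        System.out.println(findFrom("cat", input, 1)); // 4
        System.out.println(findFrom("cat", input, 9)); // -1
    }
}
```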

Replacement Methods:
Replacement methods are useful methods for replacing text in an input string:

SN   Methods with Description
1    public Matcher appendReplacement(StringBuffer sb, String replacement)
     Implements a non-terminal append-and-replace step.
2    public StringBuffer appendTail(StringBuffer sb)
     Implements a terminal append-and-replace step.
3    public String replaceAll(String replacement)
     Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
4    public String replaceFirst(String replacement)
     Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
5    public static String quoteReplacement(String s)
     Returns a literal replacement String for the specified String. This method produces a String that will work as a literal replacement s in the appendReplacement method of the Matcher class.
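The need for quoteReplacement shows up as soon as the replacement text contains a dollar sign, which replacement strings otherwise read as a group reference. The sketch below assumes a hypothetical placeholder token PRICE and a hypothetical helper replacePrice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {
    // Replaces every PRICE token with the given text, taken literally.
    static String replacePrice(String input, String literalReplacement) {
        Matcher m = Pattern.compile("PRICE").matcher(input);
        // Without quoting, "$5" in the replacement would be read as a
        // reference to capturing group 5, which this pattern does not have.
        return m.replaceAll(Matcher.quoteReplacement(literalReplacement));
    }

    public static void main(String[] args) {
        System.out.println(replacePrice("Total: PRICE", "$5.00")); // Total: $5.00
    }
}
```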

The start and end Methods:

The following example counts the number of times the word "cat" appears in the input string:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexMatches {
        private static final String REGEX = "\\bcat\\b";
        private static final String INPUT = "cat cat cat cattie cat";

        public static void main(String[] args) {
            Pattern p = Pattern.compile(REGEX);
            Matcher m = p.matcher(INPUT); // get a matcher object
            int count = 0;

            while (m.find()) {
                count++;
                System.out.println("Match number " + count);
                System.out.println("start(): " + m.start());
                System.out.println("end(): " + m.end());
            }
        }
    }

This would produce the following result:

    Match number 1
    start(): 0
    end(): 3
    Match number 2
    start(): 4
    end(): 7
    Match number 3
    start(): 8
    end(): 11
    Match number 4
    start(): 19
    end(): 22

You can see that this example uses word boundaries to ensure that the letters "c", "a", "t" are not merely a substring in a longer word, which is why "cattie" is skipped. It also gives some useful information about where in the input string the match has occurred. The start method returns the start index of the previous match, and end returns the index of the last character matched, plus one.

The matches and lookingAt Methods:

The matches and lookingAt methods both attempt to match an input sequence against a pattern. The difference, however, is that matches requires the entire input sequence to be matched, while lookingAt does not. Both methods always start at the beginning of the input string. Here is an example explaining the functionality:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexMatches {
        private static final String REGEX = "foo";
        private static final String INPUT = "fooooooooooooooooo";
        private static Pattern pattern;
        private static Matcher matcher;

        public static void main(String[] args) {
            pattern = Pattern.compile(REGEX);
            matcher = pattern.matcher(INPUT);

            System.out.println("Current REGEX is: " + REGEX);
            System.out.println("Current INPUT is: " + INPUT);

            // lookingAt() only needs a match at the start of the input.
            System.out.println("lookingAt(): " + matcher.lookingAt());
            // matches() needs the whole input to match.
            System.out.println("matches(): " + matcher.matches());
        }
    }

This would produce the following result:

    Current REGEX is: foo
    Current INPUT is: fooooooooooooooooo
    lookingAt(): true
    matches(): false

The replaceFirst and replaceAll Methods:


The replaceFirst and replaceAll methods replace text that matches a given regular expression. As their names indicate, replaceFirst replaces the first occurrence, and replaceAll replaces all occurrences. Here is an example explaining the functionality:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexMatches {
        private static String REGEX = "dog";
        private static String INPUT = "The dog says meow. " +
                                      "All dogs say meow.";
        private static String REPLACE = "cat";

        public static void main(String[] args) {
            Pattern p = Pattern.compile(REGEX);
            // get a matcher object
            Matcher m = p.matcher(INPUT);
            INPUT = m.replaceAll(REPLACE);
            System.out.println(INPUT);
        }
    }

This would produce the following result:

    The cat says meow. All cats say meow.
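For comparison, replaceFirst on the same input changes only the first occurrence. This is a minimal sketch with a hypothetical class name, ReplaceFirstDemo, and a hypothetical helper, replaceFirstDog.

```java
import java.util.regex.Pattern;

public class ReplaceFirstDemo {
    // Replaces only the first "dog" in the input with "cat".
    static String replaceFirstDog(String input) {
        return Pattern.compile("dog").matcher(input).replaceFirst("cat");
    }

    public static void main(String[] args) {
        // Prints: The cat says meow. All dogs say meow.
        System.out.println(replaceFirstDog("The dog says meow. All dogs say meow."));
    }
}
```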

The appendReplacement and appendTail Methods:


The Matcher class also provides appendReplacement and appendTail methods for text replacement. Here is an example explaining the functionality:
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegexMatches {
        private static String REGEX = "a*b";
        private static String INPUT = "aabfooaabfooabfoob";
        private static String REPLACE = "-";

        public static void main(String[] args) {
            Pattern p = Pattern.compile(REGEX);
            // get a matcher object
            Matcher m = p.matcher(INPUT);
            StringBuffer sb = new StringBuffer();
            while (m.find()) {
                // Appends the text before the match, then the replacement.
                m.appendReplacement(sb, REPLACE);
            }
            // Appends whatever follows the last match.
            m.appendTail(sb);
            System.out.println(sb.toString());
        }
    }

This would produce the following result:

    -foo-foo-foo-
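The loop form becomes useful when each replacement depends on the text that was matched, which replaceAll with a fixed string cannot express. The sketch below doubles every number in a sentence; the class name AppendReplacementDemo and the helper doubleNumbers are hypothetical, and the helper assumes the numbers fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendReplacementDemo {
    // Returns the input with every run of digits replaced by twice its value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Compute the replacement from the matched text itself.
            int doubled = Integer.parseInt(m.group()) * 2;
            m.appendReplacement(sb, Integer.toString(doubled));
        }
        // Copy the remainder of the input after the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: 6 apples and 24 pears
        System.out.println(doubleNumbers("3 apples and 12 pears"));
    }
}
```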

PatternSyntaxException Class Methods:


A PatternSyntaxException is an unchecked exception that indicates a syntax error in a regular expression pattern. The PatternSyntaxException class provides the following methods to help you determine what went wrong:

SN   Methods with Description
1    public String getDescription()
     Retrieves the description of the error.
2    public int getIndex()
     Retrieves the error index.
3    public String getPattern()
     Retrieves the erroneous regular expression pattern.
4    public String getMessage()
     Returns a multi-line string containing the description of the syntax error and its index, the erroneous regular expression pattern, and a visual indication of the error index within the pattern.
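Because the exception is unchecked, the compiler does not force you to catch it, but doing so is a reasonable way to validate a pattern that comes from user input. The sketch below assumes a hypothetical helper named describeError and a deliberately broken pattern with an unclosed parenthesis; the exact wording of the description depends on the Java runtime.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SyntaxErrorDemo {
    // Returns a one-line summary of the syntax error, or null if the regex compiles.
    static String describeError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " at index " + e.getIndex()
                    + " in pattern " + e.getPattern();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeError("(unclosed")); // summary of the error
        System.out.println(describeError("a+b"));       // null, the pattern is valid
    }
}
```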
