Re Expression

C# - Regular Expressions
A regular expression is a pattern that can be matched against input text.
The .NET Framework provides a regular expression engine that performs such
matching. A pattern consists of one or more character literals, operators, or
constructs.
Constructs for Defining Regular Expressions

There are various categories of characters, operators, and constructs that let
you define regular expressions:
Character escapes
Character classes
Anchors
Grouping constructs
Quantifiers
Backreference constructs
Alternation constructs
Substitutions
Miscellaneous constructs
The Regex Class

The Regex class is used for representing a regular expression.
The Regex class has the following commonly used methods, each listed with its
description:
public bool IsMatch( string input )

Indicates whether the regular expression specified in the Regex
constructor finds a match in a specified input string.
public bool IsMatch( string input, int startat )

Indicates whether the regular expression specified in the Regex
constructor finds a match in the specified input string, beginning at
the specified starting position in the string.
public static bool IsMatch( string input, string pattern )

Indicates whether the specified regular expression finds a match
in the specified input string.
public MatchCollection Matches( string input )

Searches the specified input string for all occurrences of a regular
expression.
public string Replace( string input, string replacement )

In a specified input string, replaces all strings that match a regular
expression pattern with a specified replacement string.
public string[] Split( string input )

Splits an input string into an array of substrings at the positions
defined by a regular expression pattern specified in the Regex
constructor.
For the complete list of methods and properties, please read the Microsoft
documentation on C#.
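The methods listed above can be tried together in a short sketch. The class name RegexMethodsDemo, its helper methods, and the sample whitespace pattern are illustrative assumptions, not part of the original tutorial.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexMethodsDemo
{
    // Hypothetical sample pattern: one or more whitespace characters.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    // Replace: collapse every run of whitespace into a single space.
    public static string Normalize(string input)
    {
        return Whitespace.Replace(input, " ");
    }

    // Split: cut the input at each run of whitespace.
    public static string[] Words(string input)
    {
        return Whitespace.Split(input);
    }

    // Matches: count the runs of whitespace in the input.
    public static int CountRuns(string input)
    {
        return Whitespace.Matches(input).Count;
    }

    public static void Main()
    {
        string text = "one  two\tthree";

        // Instance IsMatch, with and without a starting position.
        Console.WriteLine(Whitespace.IsMatch(text));     // True
        Console.WriteLine(Whitespace.IsMatch(text, 9));  // False: no whitespace from index 9 on

        // Static IsMatch takes the pattern as its second argument.
        Console.WriteLine(Regex.IsMatch(text, @"\d"));   // False: no digits

        Console.WriteLine(CountRuns(text));              // 2
        Console.WriteLine(Normalize(text));              // one two three
        Console.WriteLine(string.Join(",", Words(text))); // one,two,three
    }
}
```

Storing the Regex in a static readonly field means the pattern is parsed once and reused by every call.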
Example 1
The following example matches words that start with 'S':
using System;
using System.Text.RegularExpressions;

namespace RegExApplication
{
    class Program
    {
        private static void showMatch(string text, string expr)
        {
            Console.WriteLine("The Expression: " + expr);
            MatchCollection mc = Regex.Matches(text, expr);
            foreach (Match m in mc)
            {
                Console.WriteLine(m);
            }
        }

        static void Main(string[] args)
        {
            string str = "A Thousand Splendid Suns";
            Console.WriteLine("Matching words that start with 'S': ");
            showMatch(str, @"\bS\S*");
            Console.ReadKey();
        }
    }
}
When the above code is compiled and executed, it produces the following result:
Matching words that start with 'S':

The Expression: \bS\S*
Splendid
Suns
Example 2
The following example matches words that start with 'm' and end with 'e':
using System;
using System.Text.RegularExpressions;

namespace RegExApplication
{
    class Program
    {
        private static void showMatch(string text, string expr)
        {
            Console.WriteLine("The Expression: " + expr);
            MatchCollection mc = Regex.Matches(text, expr);
            foreach (Match m in mc)
            {
                Console.WriteLine(m);
            }
        }

        static void Main(string[] args)
        {
            string str = "make maze and manage to measure it";
            Console.WriteLine("Matching words start with 'm' and ends with 'e':");
            showMatch(str, @"\bm\S*e\b");
            Console.ReadKey();
        }
    }
}
When the above code is compiled and executed, it produces the following result:
Matching words start with 'm' and ends with 'e':
The Expression: \bm\S*e\b
make
maze
manage
measure
Example 3
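This example replaces extra white space:
using System;
using System.Text.RegularExpressions;

namespace RegExApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            // Runs of spaces between and after the words.
            string input = "Hello   World   ";
            string pattern = "\\s+";
            string replacement = " ";
            Regex rgx = new Regex(pattern);
            string result = rgx.Replace(input, replacement);
            Console.WriteLine("Original String: {0}", input);
            Console.WriteLine("Replacement String: {0}", result);
            Console.ReadKey();
        }
    }
}
When the above code is compiled and executed, it produces the following result:
Original String: Hello   World
Replacement String: Hello World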
C# Regex: Checking for a-z and A-Z
I want to check whether a string consists of characters between a-z or A-Z.
Somehow my regular expression doesn't seem to pick it up. It always
returns true. I am not sure why; I gather it has to do with how I am
writing my regular expression. Any help would be appreciated.
private static bool isValid(String str)
{
    bool valid = false;
    Regex reg = new Regex((@"a-zA-Z+"));
    if (reg.Match(str).Success)
        valid = false;
    else
        valid = true;
    return valid;
}
Tags: c#, regex
asked May 16 '11 at 12:58 by Sophie Ker
edited May 16 '11 at 13:04 by jlafay
Comments:
You're setting it to false after it matches. (jlafay, May 16 '11 at 13:03)
A tip: rather than writing a-zA-Z you can use (?i) to make your regex pattern case
insensitive and then just write a-z wherever required. (NeverHopeless, Nov 12 '12 at
13:02)
3 Answers
Answer 1 (accepted):
The right way would be like so:
private static bool isValid(String str)
{
    return Regex.IsMatch(str, @"^[a-zA-Z]+$");
}
This code has the following benefits:
- Using the static method instead of creating a new instance every
  time: the static method caches the regular expression.
- Fixed the regex. It now matches any string that consists of one or
  more of the characters a-z or A-Z. No other characters are allowed.
- Much shorter and more readable.
answered May 16 '11 at 13:03 by Daniel Hilgarth (edited at 13:08)
Comments:
Because of the anchors ^ and $, ^[a-zA-Z]+$ will match a string if it is entirely composed
of letters (probably what the OT intends, but you should update the explanation).
(Ekkehard.Horner, May 16 '11 at 13:10)
@Ekkehard: IMHO, my explanation states exactly that... (Daniel Hilgarth, May 16 '11 at
13:11)
Answer 2:
Use
Regex.IsMatch(str, @"^[a-zA-Z]+$");
answered May 16 '11 at 12:59 by mathieu (edited at 13:45)
Answer 3:
Regex reg = new Regex("^[a-zA-Z]+$");
^ start of the string
[] character set
+ one or more times
$ end of the string
^ and $ are needed because you want to validate the whole string, not just
part of it.
answered May 16 '11 at 13:05
Creating Regular Expressions
Regular expressions are an efficient way to process text. The following regular
expression looks complicated to a beginner:
^\w+$
The Perl developer would smile. All this regular expression does is match input that
consists of a single word. The symbols look difficult to understand at first, but each
has a simple meaning. The ^ symbol refers to the start of the string. The $ refers to
the end of the string. The \w refers to a single word character: A-Z, a-z, 0-9, or
underscore. The + means one or more repetitions. The regular expression would
match:
test
testtest
test1
1test
Using Regular Expressions in C# .NET

The System.Text.RegularExpressions namespace contains the Regex class used to
form and evaluate regular expressions. The Regex class contains static methods used to
compare regular expressions against strings. The Regex class uses the IsMatch() static
method to compare a string with a regular expression.
bool match = Regex.IsMatch(string input, string pattern);
If writing C# code, the example above would be:
if (Regex.IsMatch("testtest", @"^\w+$"))
{
    // Do something here
}
Another useful static method is Match(), which returns a Match object for the first
match in the input string. The Match object exposes the groups captured by that match
through its Groups property, which is useful when the pattern contains more than one
group. The following code produces a match with more than one group:
string text = "first second";
string reg = @"^([\w]+) ([\w]+)$";
Match m = Regex.Match(text, reg, RegexOptions.CultureInvariant);
foreach (Group g in m.Groups)
{
    Console.WriteLine(g.Value);
}
The expression groups are entered in parentheses. The example above returns three
groups: the entire text as the first group, then the first word, and then the second
word. Expression groups are useful when text needs to be broken down and grouped into
several pieces of related text for storage or further manipulation.
A Quick Example
In this example, we validate an email address using regular expressions. My regular
expression works:
^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$
However, this isn't the only expression used to validate email addresses. There are at
least two other ways that I have come across. There are many more.
We write a small C# console application that takes some text as an input, and
determines if the text is an email address.
using System;
using System.Text.RegularExpressions;

// Read one line of input and test it against the email pattern.
string text = Console.ReadLine();
string reg = @"^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$";
if (Regex.IsMatch(text, reg))
{
    Console.WriteLine("Email.");
}
else
{
    Console.WriteLine("Not email.");
}
Try this with a few real and fake email addresses and see if it works. Let me know if
you find an error.
Documentation
Regular expressions are developed differently. The same task can be accomplished
using many different expressions. Expressions created by a developer may be
undecipherable by another.
This is why documenting regular expressions is a very important part of the
development process. The comments for an expression often span several lines, and are
worth the effort in case your expression has unintended effects, or if another developer
takes over your code. Enforcing good documentation standards for regular expressions
helps keep maintenance issues to a minimum.
For example, if we document the regular expression for validating email addresses
above, we would write comments like these:
// Validating email addresses
//
// @"^((([\w]+\.[\w]+)+)|([\w]+))@(([\w]+\.)+)([A-Za-z]{1,3})$"
//
// The expression has three expression groups.
//
// 1. ((([\w]+\.[\w]+)+)|([\w]+))
//    The LHS of the or clause states that there may be more than one
//    sequence of two words with a . between them.
//    The RHS of the or clause states that there may be a single word.
//
// 2. (([\w]+\.)+)
//    This expression states that there may be as many words separated
//    by a . between them as necessary.
//
// 3. ([A-Za-z]{1,3})
//    This expression states that the last set of characters may be upper
//    or lowercase letters. There must be a minimum of 1 and a maximum of 3.
This may be considered a long set of comments for a lot of development standards, but
the expression has been broken down into expression groups. A new developer has
very little difficulty in understanding the function and motivation behind writing the
expression. This practice should be consistently enforced to avoid headaches when
upgrading or debugging software.
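One alternative to a separate comment block is to put the comments inside the pattern itself. The sketch below assumes the RegexOptions.IgnorePatternWhitespace option, under which unescaped whitespace in the pattern is ignored and text from # to the end of a line is treated as a comment. The class name DocumentedEmailPattern and the method IsEmail are illustrative, and the expression is the same email pattern as above, only spread over several lines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DocumentedEmailPattern
{
    // The email expression from the article, laid out one group per line.
    // Whitespace and #-comments are skipped by the regex engine because
    // of RegexOptions.IgnorePatternWhitespace.
    private const string Pattern = @"
        ^
        ( ( ([\w]+\.[\w]+)+ ) | ([\w]+) )   # 1. dotted word sequences, or a single word
        @
        ( ([\w]+\.)+ )                      # 2. one or more words, each followed by a dot
        ( [A-Za-z]{1,3} )                   # 3. one to three letters
        $";

    private static readonly Regex Email =
        new Regex(Pattern, RegexOptions.IgnorePatternWhitespace);

    // Returns true when the whole input matches the documented pattern.
    public static bool IsEmail(string input)
    {
        return Email.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsEmail("john.smith@example.com")); // True
        Console.WriteLine(IsEmail("not an email"));           // False
    }
}
```

Keeping the comments inside the pattern means they cannot drift apart from the expression they describe, at the cost of having to escape any literal space or # that the pattern needs.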
Useful Regex Software

If you've used a shell script in *NIX, then you've probably used grep. Windows has the
PowerGrep tool, which is similar to grep. PowerShell is another tool which is built on
the .NET Regular Expression engine, and has command line scripting utilities. Expresso
by Ultrapico (www.ultrapico.com) is a free Regular Expression Editor which you can use
to build and test your regular expressions.
Conclusion
Regular expressions are an efficient way to search, identify and validate large quantities
of text without having to write any comparisons. Although they may be complicated,
writing and documenting regular expressions allows the developer to concentrate on
more important parts of the implementation process. The use of several free and open
source regular expression tools makes understanding and building regular expressions
a worthwhile task.
To download this technical article in PDF format, go to the Coactum Solutions website
at http://www.coactumsolutions.com/Articles.aspx.
C# Regex.Match
Regex.Match searches strings based on a pattern. It isolates part of a string
based on the pattern specified. It requires that you use the text-processing
language for the pattern. It proves to be useful and effective in many C#
programs.
Input and output required for examples
Input string:
/content/some-page.aspx
Required match: some-page
Input string:
/content/alternate-1.aspx
Required match: alternate-1
Input string:
/images/something.png
Required match: -
Example
We first see how you can match
the filename in a directory path
with Regex. This has more
constraints regarding the
acceptable characters than many
methods have. You can see the
char range in the second
parameter to Regex.Match.
Program that uses Regex.Match: C#
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // First we see the input string.
        string input = "/content/alternate-1.aspx";

        // Here we call Regex.Match. The escaped hyphen (\-) in the
        // character class lets names such as "alternate-1" match.
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$",
            RegexOptions.IgnoreCase);

        // Here we check the Match instance.
        if (match.Success)
        {
            // Finally, we get the Group value and display it.
            string key = match.Groups[1].Value;
            Console.WriteLine(key);
        }
    }
}
Output
alternate-1
In this example, we use the @ verbatim string syntax, so backslashes in the
pattern do not need to be doubled. The pattern starts with "content/". We
require that our group, which is in parentheses, comes after the "content/"
string.
Also: The symbols between "[" and "]" are ranges of characters, or single
characters. These are the allowed characters in our group.
What it captures from the string. It captures a Group. The content matched
inside the parentheses is collected. We first check that the match succeeds,
and then we access the value with Groups[1].
Tip: It is important to note that captured groups in the Groups collection on
Match objects start at index 1. Groups[0] holds the entire matched text.
And: Most collections in the C# language start at index 0, so we must remember
this.
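A minimal sketch of how the Groups collection is indexed, using the same input as the example above. The class name GroupIndexDemo and the method Describe are illustrative, and the character class is written with an escaped hyphen so that "alternate-1" matches.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Returns { whole match, first captured group }, or an empty array
    // when the pattern does not match.
    public static string[] Describe(string input)
    {
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$");
        if (!match.Success)
        {
            return new string[0];
        }
        return new[] { match.Groups[0].Value, match.Groups[1].Value };
    }

    public static void Main()
    {
        string[] parts = Describe("/content/alternate-1.aspx");
        Console.WriteLine(parts[0]); // content/alternate-1.aspx
        Console.WriteLine(parts[1]); // alternate-1
    }
}
```

Groups[0] always exists on a successful match, so a pattern with one pair of parentheses yields a Groups collection with two entries.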
ToLower
Using ToLower instead of RegexOptions.IgnoreCase on the Regex yielded a 10% or
higher improvement. Since I needed a lowercase result, calling the C# string
ToLower method first was simpler.
Program that also uses Regex.Match: C#
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // This is the input string.
        string input = "/content/alternate-1.aspx";

        // Here we lowercase our input first.
        input = input.ToLower();
        Match match = Regex.Match(input, @"content/([A-Za-z0-9\-]+)\.aspx$");
    }
}
Static Regex
Here we see that using a Regex instance object is faster than using the static
Regex.Match. For performance, you should always use an instance object. It can
be shared throughout the entire project.
Program that uses static Regex: C#
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // The input string again.
        string input = "/content/alternate-1.aspx";

        // This calls the static method specified.
        Console.WriteLine(RegexUtil.MatchKey(input));
    }
}

static class RegexUtil
{
    static Regex _regex = new Regex(@"/content/([a-z0-9\-]+)\.aspx$");

    /// <summary>
    /// This returns the key that is matched within the input.
    /// </summary>
    static public string MatchKey(string input)
    {
        Match match = _regex.Match(input.ToLower());
        if (match.Success)
        {
            return match.Groups[1].Value;
        }
        else
        {
            return null;
        }
    }
}
Output
alternate-1
This static class stores an instance Regex that can be used project-wide. We
initialize it inline. The class exposes a MatchKey method. This is a useful
method I developed to return the string that we want from the input value.
Pattern description. It uses a letter range. In this code I show the Regex with
the "A-Z" range removed, because the string is already lowercased. I found that
removing as many options from the Regex as possible boosted performance.
Tip: With this code, I found that using RegexOptions.RightToLeft made the
pattern slightly faster as well.
Note: The expression engine has to evaluate fewer characters in this case. But
this option could slow down or speed up your Regex.
Numbers
One common requirement is extracting a number from a string. We can do this
with Regex.Match. Match handles only one number; if a string has more than one,
use Regex.Matches instead.
Next: We extract a group of digit characters and access the Value string
representation of that number.
Also: To parse the number, use int.Parse or int.TryParse on the Value here.
This will convert it to an int.
Program that uses Match on numbers: C#
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // ... Input string.
        string input = "Dot Net 100 Perls";

        // ... One or more digits.
        Match m = Regex.Match(input, @"\d+");

        // ... Write value.
        Console.WriteLine(m.Value);
    }
}
Output
100
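For a string that holds more than one number, the Regex.Matches and int.TryParse calls mentioned above can be combined. This is a minimal sketch: the class name NumberExtractor, the method ExtractNumbers, and the sample input are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NumberExtractor
{
    // Returns every run of digits in the input as an int, in order.
    // Runs that int.TryParse rejects (for example, values too large
    // for an int) are skipped.
    public static List<int> ExtractNumbers(string input)
    {
        var numbers = new List<int>();
        foreach (Match m in Regex.Matches(input, @"\d+"))
        {
            int value;
            if (int.TryParse(m.Value, out value))
            {
                numbers.Add(value);
            }
        }
        return numbers;
    }

    public static void Main()
    {
        // Hypothetical input with two numbers.
        foreach (int n in ExtractNumbers("Dot 55 Perls 100"))
        {
            Console.WriteLine(n);
        }
        // Prints 55, then 100.
    }
}
```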
Performance
You can add the RegexOptions.Compiled flag for a substantial performance gain
at runtime. This will however make your program start up slower. With
RegexOptions.Compiled we often see 30% better performance.
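A minimal sketch of where the RegexOptions.Compiled flag goes. The class name CompiledRegexDemo and the method FirstNumber are illustrative, and any speedup depends on the pattern and workload, so it is worth measuring in your own program.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompiledRegexDemo
{
    // The flag is passed once, when the Regex is constructed. Keeping the
    // instance in a static field pays the compilation cost a single time.
    private static readonly Regex Digits =
        new Regex(@"\d+", RegexOptions.Compiled);

    // Returns the first run of digits in the input, or null if none.
    public static string FirstNumber(string input)
    {
        Match m = Digits.Match(input);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(FirstNumber("Dot Net 100 Perls")); // 100
    }
}
```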
Summary
We used Regex.Match. This method extracts a single match from the input string.
We can access the matched data with the Value property. Similar methods, such
as IsMatch and Matches, are often helpful.
How to: Search Strings Using Regular Expressions (C# Programming Guide)
Visual Studio 2008
The System.Text.RegularExpressions.Regex class can be used to search strings. These searches can range
in complexity from very simple to making full use of regular expressions. The following are two examples
of string searching by using the Regex class. For more information, see .NET Framework Regular
Expressions.
Example
The following code is a console application that performs a simple case-insensitive search of the strings in
an array. The static method Regex.IsMatch performs the search given the string to search and a string that
contains the search pattern. In this case, a third argument is used to indicate that case should be ignored.
For more information, see System.Text.RegularExpressions.RegexOptions.
C#
class TestRegularExpressions
{
    static void Main()
    {
        string[] sentences =
        {
            "C# code",
            "Chapter 2: Writing Code",
            "Unicode",
            "no match here"
        };
        string sPattern = "code";
        foreach (string s in sentences)
        {
            System.Console.Write("{0,24}", s);
            if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern,
                System.Text.RegularExpressions.RegexOptions.IgnoreCase))
            {
                System.Console.WriteLine(" (match for '{0}' found)", sPattern);
            }
            else
            {
                System.Console.WriteLine();
            }
        }
        // Keep the console window open in debug mode.
        System.Console.WriteLine("Press any key to exit.");
        System.Console.ReadKey();
    }
}
/* Output:
                 C# code (match for 'code' found)
 Chapter 2: Writing Code (match for 'code' found)
                 Unicode (match for 'code' found)
           no match here
*/
The following code is a console application that uses regular expressions to validate the format of each
string in an array. The validation requires that each string take the form of a telephone number in which
three groups of digits are separated by dashes, the first two groups contain three digits, and the third
group contains four digits. This is done by using the regular expression ^\\d{3}-\\d{3}-\\d{4}$. For
more information, see Regular Expression Language - Quick Reference.
C#
class TestRegularExpressionValidation
{
    static void Main()
    {
        string[] numbers =
        {
            "123-555-0190",
            "444-234-22450",
            "690-555-0178",
            "146-893-232",
            "146-555-0122",
            "4007-555-0111",
            "407-555-0111",
            "407-2-5555",
        };
        string sPattern = "^\\d{3}-\\d{3}-\\d{4}$";
        foreach (string s in numbers)
        {
            System.Console.Write("{0,14}", s);
            if (System.Text.RegularExpressions.Regex.IsMatch(s, sPattern))
            {
                System.Console.WriteLine(" - valid");
            }
            else
            {
                System.Console.WriteLine(" - invalid");
            }
        }
        // Keep the console window open in debug mode.
        System.Console.WriteLine("Press any key to exit.");
        System.Console.ReadKey();
    }
}
/* Output:
  123-555-0190 - valid
 444-234-22450 - invalid
  690-555-0178 - valid
   146-893-232 - invalid
  146-555-0122 - valid
 4007-555-0111 - invalid
  407-555-0111 - valid
    407-2-5555 - invalid
*/
