Tip

The RegularExpression validator control and a primer on regular expressions



A vision

The job given me by the Almighty Programmer was gatekeeper. The clouds parted below me and I could see a long sinewy line of expressions marching toward me in single file. Some looked like dates, others like digits and some (to be honest) looked like gibberish. One by one, they would try to get past me but I know no fear - for I am the RegularExpressionValidator.

The RegularExpressionValidator control

The purpose of the RegularExpressionValidator control is to filter out unwanted or invalid input. It can be the first line of defense against input that is not formatted the way you want to see it. Correctly formatted expressions get through with no fanfare. Exceptions, however, cause the control to display its error message and the postback process is halted.

To demonstrate how this control works, let's drop a few controls on a Web form. I'm using Visual Studio .NET with code-behind and my language is VB. I've set my form to flow layout, but none of that is really necessary to understand the principles involved. VS.NET has declared my control automatically with the following statement:

Protected WithEvents RegularExpressionValidator1 As System.Web.UI.WebControls.RegularExpressionValidator

The rest is as easy as 1,2,3.

1. First I drop a text box on my form. This is where I enter my test expression. I'll give the control an id of "txtTest."

<asp:TextBox id="txtTest" runat="server"> </asp:TextBox>

2. Next I drop a RegularExpressionValidator control on the form. I give this control an ID of "RegularExpressionValidator1" since I'm not feeling very creative today. I set the "ControlToValidate" property to the ID of the control I want to validate (which is the text box I just created). I set the "ErrorMessage" property to what I want to display if the expression is not valid. I set the "ValidationExpression" property to the regular expression that I want to compare against the input. In this case, I've just set the expression to the letter "m."

<asp:RegularExpressionValidator id="RegularExpressionValidator1"
runat="server" ControlToValidate="txtTest"
ErrorMessage="Sorry. You are not a valid expression!"
ValidationExpression="m"> </asp:RegularExpressionValidator>

3. Next I drop a button control on the form. This is just so we have a place for the cursor to go after we leave the text box.

<asp:Button id="Button1" runat="server" Text="Submit"> 
</asp:Button>

Now we are ready to test our control, which should only allow the letter "m" (lower case) to get by. If we enter "m" and press TAB or click the SUBMIT button, everything looks good -- no error messages. If we leave the text box blank, there is no error message either, because the control only checks input that exists, not the existence of input. If, however, we enter any other character or digit, the control's error message will be displayed. In this case, the error message is: "Sorry. You are not a valid expression!"
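If you want to experiment with this behavior away from the Web form, the check can be approximated in a few lines. What follows is only a sketch in Python, not the control's own code: the function name is_valid is made up for illustration, and re.fullmatch stands in for the control's rule that the pattern must account for the entire input.

```python
import re

def is_valid(pattern: str, text: str) -> bool:
    """Rough stand-in for the validator's check (hypothetical helper)."""
    # Blank input is not checked at all, so it always passes.
    if text.strip() == "":
        return True
    # Otherwise the pattern must account for the entire input.
    return re.fullmatch(pattern, text) is not None

for entry in ["m", "", "M", "x", "9"]:
    outcome = "passes" if is_valid("m", entry) else "shows the error message"
    print(repr(entry), outcome)
```

Only "m" and the blank entry pass; every other entry would trigger the error message.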

Now that we know how to use the control, what we really need is a primer on regular expressions. Then we can easily drop a RegularExpressionValidator control on our form, associate it with an entry in a text box and set the "ValidationExpression" property to what we want the input to look like.

Regular expressions

The subject of regular expressions is often confusing but we are going to take an approach that will give you a basic understanding upon which you can build. Becoming proficient with regular expressions takes a lot of practice, just like anything else.

Regular expressions let us search for certain patterns. In the case of our validator control, if we find the pattern in the text box, the control is satisfied and lets the text through to the promised land. If we do NOT find that pattern, the control is not happy and displays its error message. This is not all we can do with regular expressions. We can also "search and replace" and reformat text. Just be aware that you can use regular expressions to find all instances of some word or phrase and replace it with another or reformat it. You can use them to mine documents for e-mail addresses or URLs.
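To make the "search and replace" and "reformat" ideas concrete, here is a small sketch in Python (chosen only for illustration; the sample sentences are invented). The date pattern uses digits, repetition and groups, which are explained later in this tip.

```python
import re

text = "I want my mom. My mom wants me."

# Find every instance of a word.
found = re.findall(r"mom", text)

# Search and replace: swap one word for another.
replaced = re.sub(r"mom", "dad", text)

# Reformat: capture year, month and day, then emit them in a new order.
reformatted = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\2/\3/\1", "Published 2003-01-15")

print(found)        # ['mom', 'mom']
print(replaced)     # I want my dad. My dad wants me.
print(reformatted)  # Published 01/15/2003
```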

Character matching

The easiest kind of matching we can do is "character matching." That's what we did when we put the letter "m" in our validator control. We are simply asking if the text is the letter "m" and nothing else. So the lowercase letter "m" passes but the uppercase "M" does not. If you enter "mom," it does not pass because "mom" is not "m" -- it is something more than just "m."

Character matching is not limited to a single character. If you use the regular expression "mom," then the only expression that matches will be "mom." "MOM" will not work, "mommy" will not work and "I want my mom" will not work because none of these expressions are exactly equivalent to "mom." OK, I think we are clear on that point.
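The difference between "is exactly mom" and "contains mom somewhere" is worth seeing side by side. In this Python sketch (an illustration only, not the control's code), re.fullmatch behaves like the validator described here, while re.search looks for the pattern anywhere in the text.

```python
import re

pattern = "mom"
entries = ["mom", "MOM", "mommy", "I want my mom"]

# Whole-input match: the way the validator treats the expression.
whole = {e: re.fullmatch(pattern, e) is not None for e in entries}

# Match anywhere in the text: NOT how the validator behaves.
anywhere = {e: re.search(pattern, e) is not None for e in entries}

for e in entries:
    print(repr(e), "whole:", whole[e], "anywhere:", anywhere[e])
```

Only "mom" passes the whole-input test, even though "mommy" and "I want my mom" both contain the pattern.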

A period is a special character that matches any single character. So the regular expression "m.m" would match "mom" or "mam" or "m9m."

You can search for a string that has a single character from a group of predetermined characters. For example: "m[ao]m" would match up with "mam" or "mom" because the middle letter is in the group [ao], but "m9m" would not match because "9" is not an "a" or an "o."

You can search for a string that has a single character that is in a range. For example: "m[a-y]m" will match up with "mam" or "mbm" or "mcm" or any other letter in the middle so long as it is in the range [a-y]. The letter "z" is not in that range, and neither is any uppercase letter.
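The period, the bracketed group and the bracketed range can all be tried out with a short Python sketch. The helper name matches is hypothetical, and re.fullmatch is used so that the whole entry has to fit the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True only if the whole text fits the pattern.
    return re.fullmatch(pattern, text) is not None

# A period stands for any single character.
any_char = [t for t in ["mom", "mam", "m9m", "mm"] if matches("m.m", t)]

# Brackets list the characters allowed in that one position.
from_set = [t for t in ["mam", "mom", "m9m"] if matches("m[ao]m", t)]

# A hyphen inside brackets gives a range of allowed characters.
in_range = [t for t in ["mam", "mbm", "mzm", "mAm"] if matches("m[a-y]m", t)]

print(any_char)  # ['mom', 'mam', 'm9m']
print(from_set)  # ['mam', 'mom']
print(in_range)  # ['mam', 'mbm']
```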

Here is a list of the special characters covered above that relate to character matching:

.  Matches any single character. 
m.m  Matches "mom", "mam" or "m9m".
[xyz]  Matches any single character from the group in the brackets. 
m[ao]m  Matches "mam" or "mom" but not "m9m".
[x-y]  Matches any single character in the range x to y. 
m[a-y]m  Matches "mam" or "mbm" but not "mzm" or "mAm".

Repetition matching

The question mark is a special character that matches zero or one instances of the character that precedes it. So the regular expression "moms?" would match "mom" because the letter "s" appears zero times after "mom." It would match "moms", of course.

The asterisk is a special character that matches zero or more instances of the character that precedes it. So the regular expression "moms*" would match "mom" because the "s" appears zero times. It would match "moms" because the "s" appears once. It would also match "momsssssssss."

The plus sign is a special character that matches one or more instances of the character that precedes it. So the regular expression "moms+" would match "moms" or "momsssss" but not "mom" because the s has not occurred at least once.

If you wanted to find a match on n instances of the preceding character, you can use {n}. For instance, the regular expression "moms{3}" would only match "momsss" because there are exactly three instances of the letter "s". It would not match "moms" or "momss." If you wanted to match the word "moon," you could use the regular expression "moon" (the most direct method) or you could use "mo{2}n."
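Here is a quick way to check the repetition characters against the "moms" examples. It is a Python sketch for illustration only, with a hypothetical helper named matches that requires the whole entry to fit the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True only if the whole text fits the pattern.
    return re.fullmatch(pattern, text) is not None

samples = ["mom", "moms", "momss", "momsss"]

# For each pattern, collect the samples that pass.
results = {
    pattern: [s for s in samples if matches(pattern, s)]
    for pattern in ["moms?", "moms*", "moms+", "moms{3}"]
}

for pattern, passed in results.items():
    print(pattern, "->", passed)

# "mo{2}n" is just another way of writing "moon".
print(matches("mo{2}n", "moon"))  # True
```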

Here are some common examples of repetition matching:

{x}  Match exactly x occurrences of a regular expression. 
\d{5}  Matches five digits such as 12345. 
{x,}  Match x or more occurrences of a regular expression. 
\s{2,}  Matches at least two whitespace characters.  
{x,y}  Matches x to y number of occurrences of a regular expression. 
\d{2,3}  Matches at least two but no more than three digits. 
?  Match zero or one occurrences. Equivalent to {0,1}.
a\s?b  Matches "ab" or "a b".
*  Match zero or more occurrences. Equivalent to {0,}.
+  Match one or more occurrences. Equivalent to {1,}.
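The examples in this list rely on two shorthand classes written with a backslash: \d for a digit and \s for a whitespace character. The Python sketch below (illustrative only; the helper name matches is hypothetical) checks each example against the whole entry.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True only if the whole text fits the pattern.
    return re.fullmatch(pattern, text) is not None

five_digits = [matches(r"\d{5}", t) for t in ["12345", "1234", "123456"]]
two_or_more_spaces = [matches(r"\s{2,}", t) for t in [" ", "  ", "     "]]
two_or_three_digits = [matches(r"\d{2,3}", t) for t in ["1", "12", "123", "1234"]]
optional_space = [matches(r"a\s?b", t) for t in ["ab", "a b", "a  b"]]

print(five_digits)          # [True, False, False]
print(two_or_more_spaces)   # [False, True, True]
print(two_or_three_digits)  # [False, True, True, False]
print(optional_space)       # [True, True, False]
```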

Matching special characters

We've already introduced several "special characters." Special characters are those which have a special meaning. In the above discussion, the period, the question mark, the asterisk and the plus sign are all "special characters." To match a special character literally, you have to precede it with a backslash ("\") in the regular expression. The backslash is also used to write characters you cannot type directly, such as a tab or a new line. Here is a list of the most common of these escape sequences.

\n  Matches a new line. 
\f  Matches a form feed.
\r  Matches a carriage return.
\t  Matches horizontal tab. 
\v  Matches vertical tab. 
\?  Matches ?
\*  Matches *
\+  Matches +
\.  Matches .
\\  Matches \
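To see what the backslash changes, compare a pattern with and without it. This is a Python sketch for illustration; the helper name matches and the sample strings are made up, and re.escape is Python's own convenience function for escaping a literal string, not something the validator control provides.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True only if the whole text fits the pattern.
    return re.fullmatch(pattern, text) is not None

# Unescaped, the period matches any character, so "3x14" slips through.
unescaped = matches(r"3.14", "3x14")

# Escaped, the period only matches a real period.
escaped_ok = matches(r"3\.14", "3.14")
escaped_no = matches(r"3\.14", "3x14")

# The same idea works for the question mark, the tab and the backslash itself.
question = matches(r"why\?", "why?")
tab = matches(r"a\tb", "a\tb")
backslash = matches(r"C:\\temp", "C:\\temp")

# Python can add the backslashes for you when the text should be taken literally.
literal = "1+1=2?"
auto = matches(re.escape(literal), literal)

print(unescaped, escaped_ok, escaped_no, question, tab, backslash, auto)
```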

Alternation and grouping

Alternation and grouping are used to develop more complex regular expressions. Grouping wraps part of an expression in parentheses so that it is treated as a single clause. Groups may be nested. "(ab)?(c)" matches "abc" or "c."

Alternation combines clauses into one regular expression and then matches any of the individual clauses. "(ab)|(cd)|(ef)" matches "ab" or "cd" or "ef."
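Both examples can be tried with a short Python sketch (for illustration only; the helper name matches is hypothetical). It also shows that each group remembers the text it matched, or nothing at all if the optional group was skipped.

```python
import re

def matches(pattern: str, text: str) -> bool:
    # Hypothetical helper: True only if the whole text fits the pattern.
    return re.fullmatch(pattern, text) is not None

# Grouping: the "?" applies to the whole clause "ab", not just to "b".
grouped = [t for t in ["abc", "c", "ac", "ab"] if matches(r"(ab)?(c)", t)]

# Alternation: any one of the clauses may match.
alternated = [t for t in ["ab", "cd", "ef", "abcd", "gh"] if matches(r"(ab)|(cd)|(ef)", t)]

# Each group keeps what it matched; a skipped optional group is None.
with_ab = re.fullmatch(r"(ab)?(c)", "abc").groups()
without_ab = re.fullmatch(r"(ab)?(c)", "c").groups()

print(grouped)     # ['abc', 'c']
print(alternated)  # ['ab', 'cd', 'ef']
print(with_ab)     # ('ab', 'c')
print(without_ab)  # (None, 'c')
```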

More examples and references

1. You can find many ready-made regular expression samples at http://www.regexlib.com/. These include regular expressions for dates, e-mail addresses, URLs, ZIP codes and things like that. Some of these are better than others, so make sure you test and understand the expression before you put it into production.

2. Microsoft has a fairly dense, if somewhat disorganized, coverage of the subject here.

Conclusion

The RegularExpressionValidator control is powerful and useful if you have a basic understanding of regular expressions. There are benefits to using this control over other methods. (1) Depending on the browser being used, client-side code will be generated for the validation. This means expressions will be validated at the client without having to make the round trip to the server. Up-level browsers (IE 4.0+) will have the validation rendered in JavaScript/DHTML to enable client-side validation, while down-level browsers will provide strictly server-side validation. (2) Pattern matching may be faster than other methods. For instance, a date validator can be much faster than testing for a valid date using the IsDate() function.

I will talk in more depth about regular expressions in a future article. There is much more to this subject than just the validator control but I think this will get you off to a good start. If you have any ideas, complaints or suggestions, please email me at RogerMcCook@hotmail.com.

Roger D. McCook
McCook Software, Inc.

This was first published in January 2003
