Professional C++ - Marc Gregoire [259]
    if (regex_match(str, r))
        cout << " Valid date." << endl;
    else
        cout << " Invalid date!" << endl;
}
Code snippet from RegularExpressions\regex_match_dates_1.cpp
The first line creates the regular expression. The expression consists of three parts separated by forward slash (/) characters: one part for the year, one for the month, and one for the day. The following list explains these parts:
^\d{4}: This will match any combination of four digits, for example 1234, 2010, and so on, at the beginning of the string.
(?:0?[1-9]|1[0-2]): This sub part of the regular expression is wrapped inside parentheses to make sure the precedence is correct. We don’t need any capture group so (?:...) is used. The inner expression consists of an alternation of two parts separated by the | character:
    0?[1-9]: This will match any number from 1 to 9 with an optional 0 in front of it. For example it will match 1, 2, 9, 03, 04, and so on. It will not match 0, 10, 11, and so on.
    1[0-2]: This will match 10, 11, or 12, nothing else.
(?:0?[1-9]|[1-2][0-9]|3[0-1])$: This sub part is also wrapped inside a non-capture group and consists of an alternation of three parts followed by the end of the string:
    0?[1-9]: This is the same as the first part of the month matcher explained above.
    [1-2][0-9]: This will match any number between 10 and 29 and nothing else.
    3[0-1]: This will match 30 or 31 and nothing else.
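The behavior of the three parts can be checked against a few sample strings. The following is a minimal sketch rather than code from the book: isDateFormat is a hypothetical helper name, and a raw string literal is assumed so that \d reaches the regex engine without extra escaping.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the book): returns true when the whole
// string matches the year/month/day pattern described above.
bool isDateFormat(const std::string& s)
{
    // Raw string literal, so the backslash in \d needs no doubling.
    static const std::regex r(
        R"(^\d{4}/(?:0?[1-9]|1[0-2])/(?:0?[1-9]|[1-2][0-9]|3[0-1])$)");
    return std::regex_match(s, r);
}
```

With this helper, "2011/12/01" and "2011/1/9" are accepted, while "2011/13/01", "2011/00/10", and "2011/12/32" are rejected. The pattern checks only the format, so a string such as "2011/02/31" is also accepted even though it is not a real calendar date.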
The example then enters an infinite loop to ask the user to enter a date. Each date entered is then given to the regex_match() algorithm. When regex_match() returns true the user has entered a date that matches the date regular expression pattern.
This example can be expanded a bit by asking the regex_match() algorithm to return captured sub-expressions in a results object. The following code extracts the year, month, and day digits into three separate integer variables.
To understand this code, you have to understand what a capture group does. By specifying a match_results object like smatch in the call to regex_match(), the elements of the match_results object are filled in when the regular expression matches the string. To be able to extract these sub-strings, you must create capture groups, so although parentheses are not required for grouping in this example, they are used to define new capture groups.
The first element, [0], in a match_results object contains the string that matched the entire pattern. When using regex_match() and a match is found, this is the entire source sequence. When using regex_search(), discussed in the next section, this is a sub-string in the source sequence that matches the regular expression. Element [1] is the sub-string matched by the first capture group, [2] by the second capture group, and so on.
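The difference in element [0] between the two algorithms can be illustrated with a short sketch. The helper names wholeMatch and firstSearchHit are hypothetical and are used here only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: element [0] after regex_match(), or an empty
// string when the pattern does not match the entire input.
std::string wholeMatch(const std::string& s, const std::regex& r)
{
    std::smatch m;
    return std::regex_match(s, m, r) ? m[0].str() : std::string();
}

// Hypothetical helper: element [0] after regex_search(), or an empty
// string when the pattern is not found anywhere in the input.
std::string firstSearchHit(const std::string& s, const std::regex& r)
{
    std::smatch m;
    return std::regex_search(s, m, r) ? m[0].str() : std::string();
}
```

Assuming an unanchored pattern such as R"((\d{4})/(\d{1,2}))", wholeMatch() returns the full input for "2011/12" and an empty string for "on 2011/12 we met", whereas firstSearchHit() returns the sub-string "2011/12" for the second input.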
The regular expression in the revised example has a few small changes. The first part matching the year is wrapped in a capture group, while the month and day parts are now also capture groups instead of non-capture groups. The call to regex_match() includes a smatch parameter, which will contain the matched capture groups. Here is the adapted example:
// The backslash is doubled so that the regex engine receives \d.
regex r("^(\\d{4})/(0?[1-9]|1[0-2])/(0?[1-9]|[1-2][0-9]|3[0-1])$");
while (true) {
    cout << "Enter a date (year/month/day) (q=quit): ";
    string str;
    if (!getline(cin, str) || str == "q")
        break;
    smatch m;
    if (regex_match(str, m, r)) {
        // m[1], m[2], and m[3] hold the year, month, and day capture groups.
        int year = atoi(m[1].str().c_str());
        int month = atoi(m[2].str().c_str());
        int day = atoi(m[3].str().c_str());
        cout << " Valid date: Year=" << year
             << ", month=" << month
             << ", day=" << day << endl;
    } else {
        cout << " Invalid date!" << endl;
    }
}
Code snippet from RegularExpressions\regex_match_dates_2.cpp
In this example, the smatch results object has four elements, the full match followed by the three captured groups:
[0]: the string matching the full regular expression, which is the full date in this example
[1]: the year
[2]: the month
[3]: the day
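One way to inspect these four elements is to copy them out of the results object. This is a sketch under the assumption that a raw string literal is acceptable for the pattern; matchElements is a hypothetical helper name, not part of the book's code.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: copies every element of the results object
// ([0] = full match, [1]..[3] = capture groups) into a vector.
// Returns an empty vector when the input does not match.
std::vector<std::string> matchElements(const std::string& s)
{
    static const std::regex r(
        R"(^(\d{4})/(0?[1-9]|1[0-2])/(0?[1-9]|[1-2][0-9]|3[0-1])$)");
    std::vector<std::string> out;
    std::smatch m;
    if (std::regex_match(s, m, r)) {
        for (const auto& sub : m) {  // each element is a std::ssub_match
            out.push_back(sub.str());
        }
    }
    return out;
}
```

For the input "2011/12/01" the returned vector holds "2011/12/01", "2011", "12", and "01", in that order. The day is still the text "01" at this point; it only becomes 1 after conversion to an integer.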
When you execute this example you can get the following output:
Enter a date (year/month/day) (q=quit): 2011/12/01
Valid date: Year=2011, month=12, day=1
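The matching and conversion steps can also be packaged as a function. The following is a minimal sketch and not the book's code: the Date struct and the parseDate name are hypothetical, and std::stoi is swapped in for atoi as the conversion routine.

```cpp
#include <regex>
#include <string>

// Hypothetical type holding the three extracted values.
struct Date {
    int year;
    int month;
    int day;
};

// Hypothetical helper: fills 'out' and returns true when 's' matches the
// year/month/day pattern; leaves 'out' untouched and returns false otherwise.
bool parseDate(const std::string& s, Date& out)
{
    static const std::regex r(
        R"(^(\d{4})/(0?[1-9]|1[0-2])/(0?[1-9]|[1-2][0-9]|3[0-1])$)");
    std::smatch m;
    if (!std::regex_match(s, m, r)) {
        return false;
    }
    // The capture groups contain only digits, so std::stoi cannot fail here.
    out.year = std::stoi(m[1].str());
    out.month = std::stoi(m[2].str());
    out.day = std::stoi(m[3].str());
    return true;
}
```

Calling parseDate("2011/12/01", d) returns true and sets d to year 2011, month 12, day 1, which corresponds to the output shown above.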