Online Book Reader

Home Category

MariaDB Crash Course - Ben Forta [27]

By Root 526 0
match those characters, how would you do so? For example, if you wanted to find values that contain the . character, how would you search for it? Look at this example:

Input

SELECT vend_name

FROM vendors

WHERE vend_name REGEXP '.'

ORDER BY vend_name;

Output

+----------------+

| vend_name |

+----------------+

| ACME |

| Anvils R Us |

| Furball Inc. |

| Jet Set |

| Jouets Et Ours |

| LT Supplies |

+----------------+

Analysis

That did not work. . matches any character, and so every row was retrieved.

To match special characters they must be preceded by \\. So, \\- means find – and \\. means find .

Input

SELECT vend_name

FROM vendors

WHERE vend_name REGEXP '\\.'

ORDER BY vend_name;

Output

+--------------+

| vend_name |

+--------------+

| Furball Inc. |

+--------------+

Analysis

That worked. \\. matches ., and so only a single row was retrieved. This process is known as escaping, and all characters that have special significance within regular expressions must be escaped this way. This includes ., |, [], and all the other special characters used thus far.

\\ is also used to refer to metacharacters (characters that have specific meanings), as listed in Table 9.1.

Table 9.1 Whitespace Metacharacters

* * *

Tip: To Match \

To match the backslash character itself (\), you need to use \\\.

* * *

* * *

Note: \ or \\?

Most regular expression implementations use a single backslash to escape special characters to be able to use them as literals. MariaDB, however, requires two backslashes (MariaDB itself interprets one, and the regular expression library interprets the other).

* * *

Matching Character Classes


There are matches that you’ll find yourself using frequently—digits, or all alphabetical characters, or all alphanumerical characters, and so on. To make working with these easier, you may use predefined character sets known as character classes. Table 9.2 lists the character classes and what they mean.

Table 9.2 Character Classes

Matching Multiple Instances


All the regular expressions used thus far attempt to match a single occurrence. If there is a match, the row is retrieved, and if not, nothing is retrieved. But sometimes you require greater control over the number of matches. For example, you might want to locate all numbers regardless of how many digits the number contains, or you might want to locate a word but also be able to accommodate a trailing s if one exists, and so on.

This can be accomplished using the regular expressions repetition metacharacters, listed in Table 9.3.

Table 9.3 Repetition Metacharacters

Following are some examples.

Input

SELECT prod_name

FROM products

WHERE prod_name REGEXP '\\([0-9] sticks?\\)'

ORDER BY prod_name;

Output

+----------------+

| prod_name |

+----------------+

| TNT (1 stick) |

| TNT (5 sticks) |

+----------------+

Analysis

Regular expression \\([0-9] sticks?\\) requires some explanation. \\( matches (, [0-9] matches any digit (1 and 5 in this example), sticks? matches stick and sticks (the ? after the s makes that s optional because ? matches 0 or 1 occurrence of whatever it follows), and \\) matches the closing ). Without ? it would have been difficult to match both stick and sticks.

Here’s another example. This time we try to match four consecutive digits:

Input

SELECT prod_name

FROM products

WHERE prod_name REGEXP '[[:digit:]]{4}'

ORDER BY prod_name;

Output

+--------------+

| prod_name |

+--------------+

| JetPack 1000 |

| JetPack 2000 |

+--------------+

Analysis

As explained previously, [:digit:] matches any digit, and so [[:digit:]] is a set of digits. {4} requires exactly four occurrences of whatever it follows (any digit), and so [[:digit:]]{4} matches any four consecutive digits.

It is worth noting that when using regular expressions there is almost always more than one way to write a specific expression. The previous example could have also been written as follows:

Input

SELECT prod_name

FROM products

WHERE prod_name REGEXP '[0-9][0-9][0-9][0-9]'

ORDER BY prod_name;

Return Main Page Previous Page Next Page

®Online Book Reader