| Name | Description | NOT REGEXP |
Negation of REGEXP |
|---|---|
REGEXP |
Pattern matching using regular expressions |
RLIKE |
Synonym for REGEXP |
A regular expression is a powerful way of specifying a pattern for a complex search.
MySQL uses Henry Spencer’s implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. See Appendix G, Credits. MySQL uses the extended version to support pattern-matching operations performed with the REGEXP operator in SQL statements. See Section 3.3.4.7, “Pattern Matching”, and Section 11.4.1, “String Comparison Functions”.
This section is a summary, with examples, of the special characters and constructs that can be used in MySQL for REGEXP operations. It does not contain all the details that can be found in Henry Spencer’s regex(7) manual page. That manual page is included in MySQL source distributions, in the regex.7 file under the regex directory.
A regular expression describes a set of strings. The simplest regular expression is one that has no special characters in it. For example, the regular expression hello matches hello and nothing else.
Non-trivial regular expressions use certain special constructs so that they can match more than one string. For example, the regular expression hello|word matches either the string hello or the string word.
As a more complex example, the regular expression B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any other string starting with a B, ending with an s, and containing any number of a or n characters in between.
A regular expression for the REGEXP operator may use any of the following special characters and constructs:
-
^Match the beginning of a string.mysql>
SELECT 'fo\nfo' REGEXP '^fo$';-> 0 mysql>SELECT 'fofo' REGEXP '^fo';-> 1 -
$Match the end of a string.mysql>
SELECT 'fo\no' REGEXP '^fo\no$';-> 1 mysql>SELECT 'fo\no' REGEXP '^fo$';-> 0 -
.Match any character (including carriage return and newline).mysql>
SELECT 'fofo' REGEXP '^f.*$';-> 1 mysql>SELECT 'fo\r\nfo' REGEXP '^f.*$';-> 1 -
a*Match any sequence of zero or moreacharacters.mysql>
SELECT 'Ban' REGEXP '^Ba*n';-> 1 mysql>SELECT 'Baaan' REGEXP '^Ba*n';-> 1 mysql>SELECT 'Bn' REGEXP '^Ba*n';-> 1 -
a+Match any sequence of one or moreacharacters.mysql>
SELECT 'Ban' REGEXP '^Ba+n';-> 1 mysql>SELECT 'Bn' REGEXP '^Ba+n';-> 0 -
a?Match either zero or oneacharacter.mysql>
SELECT 'Bn' REGEXP '^Ba?n';-> 1 mysql>SELECT 'Ban' REGEXP '^Ba?n';-> 1 mysql>SELECT 'Baan' REGEXP '^Ba?n';-> 0 -
de|abcMatch either of the sequencesdeorabc.mysql>
SELECT 'pi' REGEXP 'pi|apa';-> 1 mysql>SELECT 'axe' REGEXP 'pi|apa';-> 0 mysql>SELECT 'apa' REGEXP 'pi|apa';-> 1 mysql>SELECT 'apa' REGEXP '^(pi|apa)$';-> 1 mysql>SELECT 'pi' REGEXP '^(pi|apa)$';-> 1 mysql>SELECT 'pix' REGEXP '^(pi|apa)$';-> 0 -
(abc)*Match zero or more instances of the sequenceabc.mysql>
SELECT 'pi' REGEXP '^(pi)*$';-> 1 mysql>SELECT 'pip' REGEXP '^(pi)*$';-> 0 mysql>SELECT 'pipi' REGEXP '^(pi)*$';-> 1 -
{1},{2,3}{n}or{m,n}notation provides a more general way of writing regular expressions that match many occurrences of the previous atom (or “piece”) of the pattern.mandnare integers.-
a*Can be written asa{0,}. -
a+Can be written asa{1,}. -
a?Can be written asa{0,1}.
To be more precise,
a{n}matches exactlyninstances ofa.a{n,}matchesnor more instances ofa.a{m,n}matchesmthroughninstances ofa, inclusive.mandnmust be in the range from0toRE_DUP_MAX(default 255), inclusive. If bothmandnare given,mmust be less than or equal ton.mysql>
SELECT 'abcde' REGEXP 'a[bcd]{2}e';-> 0 mysql>SELECT 'abcde' REGEXP 'a[bcd]{3}e';-> 1 mysql>SELECT 'abcde' REGEXP 'a[bcd]{1,10}e';-> 1 -
-
[a-dX],[^a-dX]Matches any character that is (or is not, if ^ is used) eithera,b,c,dorX. A-character between two other characters forms a range that matches all characters from the first character to the second. For example,[0-9]matches any decimal digit. To include a literal]character, it must immediately follow the opening bracket[. To include a literal-character, it must be written first or last. Any character that does not have a defined special meaning inside a[]pair matches only itself.mysql>
SELECT 'aXbc' REGEXP '[a-dXYZ]';-> 1 mysql>SELECT 'aXbc' REGEXP '^[a-dXYZ]$';-> 0 mysql>SELECT 'aXbc' REGEXP '^[a-dXYZ]+$';-> 1 mysql>SELECT 'aXbc' REGEXP '^[^a-dXYZ]+$';-> 0 mysql>SELECT 'gheis' REGEXP '^[^a-dXYZ]+$';-> 1 mysql>SELECT 'gheisa' REGEXP '^[^a-dXYZ]+$';-> 0 -
[.characters.]Within a bracket expression (written using[and]), matches the sequence of characters of that collating element.charactersis either a single character or a character name likenewline. The following table lists the allowable character names.The following table shows the allowable character names and the characters that they match. For characters given as numeric values, the values are represented in octal.
Name Character Name Character NUL0SOH001STX002ETX003EOT004ENQ005ACK006BEL007alert007BS010backspace'\b'HT011tab'\t'LF012newline'\n'VT013vertical-tab'\v'FF014form-feed'\f'CR015carriage-return'\r'SO016SI017DLE020DC1021DC2022DC3023DC4024NAK025SYN026ETB027CAN030EM031SUB032ESC033IS4034FS034IS3035GS035IS2036RS036IS1037US037space' 'exclamation-mark'!'quotation-mark'"'number-sign'#'dollar-sign'$'percent-sign'%'ampersand'&'apostrophe'\''left-parenthesis'('right-parenthesis')'asterisk'*'plus-sign'+'comma','hyphen'-'hyphen-minus'-'period'.'full-stop'.'slash'/'solidus'/'zero'0'one'1'two'2'three'3'four'4'five'5'six'6'seven'7'eight'8'nine'9'colon':'semicolon';'less-than-sign'<'equals-sign'='greater-than-sign'>'question-mark'?'commercial-at'@'left-square-bracket'['backslash'\\'reverse-solidus'\\'right-square-bracket']'circumflex'^'circumflex-accent'^'underscore'_'low-line'_'grave-accent'`'left-brace'{'left-curly-bracket'{'vertical-line'|'right-brace'}'right-curly-bracket'}'tilde'~'DEL177mysql>
SELECT '~' REGEXP '[[.~.]]';-> 1 mysql>SELECT '~' REGEXP '[[.tilde.]]';-> 1 -
[=character_class=]Within a bracket expression (written using[and]),[=character_class=]represents an equivalence class. It matches all characters with the same collation value, including itself. For example, ifoand(+)are the members of an equivalence class, then[[=o=]],[[=(+)=]], and[o(+)]are all synonymous. An equivalence class may not be used as an endpoint of a range. -
[:character_class:]Within a bracket expression (written using[and]),[:character_class:]represents a character class that matches all characters belonging to that class. The following table lists the standard class names. These names stand for the character classes defined in thectype(3)manual page. A particular locale may provide other class names. A character class may not be used as an endpoint of a range.alnumAlphanumeric characters alphaAlphabetic characters blankWhitespace characters cntrlControl characters digitDigit characters graphGraphic characters lowerLowercase alphabetic characters printGraphic or space characters punctPunctuation characters spaceSpace, tab, newline, and carriage return upperUppercase alphabetic characters xdigitHexadecimal digit characters mysql>
SELECT 'justalnums' REGEXP '[[:alnum:]]+';-> 1 mysql>SELECT '!!' REGEXP '[[:alnum:]]+';-> 0 -
[[:<:]],[[:>:]]These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in thealnumclass or an underscore (_).mysql>
SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]';-> 1 mysql>SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]';-> 0
To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other. For example, to match the string 1+2 that contains the special + character, only the last of the following regular expressions is the correct one:
mysql>SELECT '1+2' REGEXP '1+2';-> 0 mysql>SELECT '1+2' REGEXP '1\+2';-> 0 mysql>SELECT '1+2' REGEXP '1\\+2';-> 1
Filed under: MySql