REGEXP Operator

The REGEXP operator (or its synonym RLIKE) in SQL is used for searching and manipulating string data using regular expressions. Regular expressions provide powerful capabilities for complex search patterns that are difficult to implement with the LIKE operator.

When to use REGEXP instead of LIKE?

The LIKE operator is convenient for simple search patterns, such as finding strings that start or end with a certain set of characters or contain certain substrings. However, if a more complex and flexible search is required, such as searching by multiple conditions or using special characters and ranges, the REGEXP operator becomes an indispensable tool.

Regular expression syntax

MySQL
... WHERE table_field REGEXP 'pattern';

Where pattern is the regular expression that defines the search pattern.

Important Considerations

  1. Case insensitive

    By default, regular expressions in MySQL are not case-sensitive.
    For example, the expression REGEXP 'abc' will match the string abc, Abc, and ABC.

  2. Special characters

    Some characters have special meanings in regular expressions and require escaping (e.g., ., *, +, ?, [, ], (, ), {, }, |, \). Use a backslash \ for escaping.

Special characters and constructs

Characters and constructsMatches
*0 or more instances of the preceding string
+1 or more instances of the preceding string
.Any single character
?0 or 1 instance of the preceding string
^Matches the start of the string
$Matches the end of the string
[abc]Any character listed in the square brackets
[^abc]Any character not listed in the square brackets
[A-Z]Matches any uppercase letter
[a-z]Matches any lowercase letter
[0-9]Matches any digit
p1|p2|p3Matches any of the patterns p1 or p2 or p3
{n}n instances of the preceding string
{m,n}Between m and n instances of the preceding string

Examples with explanation

  • Get all users whose names start with "John":

    MySQL
    SELECT * FROM Users WHERE name REGEXP '^John'
    
    idnameemailemail_verified_atpasswordphone_number
    18John Travolta[email protected]2016-11-19T12:30:43.000Zfzjhl0v82o0amalr8649+1 202 555 0176
    28Johnny Depp[email protected]2017-05-26T01:19:06.000Zqpp6hbnae42cdhmxlk4j+7 401 195 7363

    This expression searches for strings starting with "John". The ^ symbol indicates the start of the string.

  • Display all school subjects whose names end with the letter “e” or “y”:

    MySQL
    SELECT * FROM  Subject WHERE name REGEXP '[ey]$'
    
    idname
    2Russian language
    3Literature
    5Chemistry
    6Geography
    7History
    8Biology
    9English language
    11Physical Culture
    13Technology

    In this example, [ey] defines a list of possible values for the pattern $, which defines what the string should end with.

  • Find all users whose email addresses end with “@outlook.com” or “@icloud.com”:

    MySQL
    SELECT * FROM Users WHERE email REGEXP '@(outlook.com|icloud.com)$'
    
    idnameemailemail_verified_atpasswordphone_number
    7Samuel L. Jackson[email protected]2018-07-19T11:16:13.000Zi6yvht95527z3idgqx9y+1 202 555 0162
    13Steve Martin[email protected]2016-07-29T04:25:00.000Zw76yphg3kvzg77ilmxfs+1 202 555 0138
    29Pierce Brosnan[email protected]2019-03-08T01:56:00.000Zlqiwecclne9rv8woo2go+7 401 749 3620
    30Sean Connery[email protected]2016-05-21T00:45:17.000Zlyh4jkdxkvtvulvqi5db+7 401 511 6783
    31Bruce Willis[email protected]2016-12-08T20:18:59.000Z0ofa2khvnptiackbssv0+375 154 771 3462

    Here, $ is used to indicate the end of the string and | is used to specify multiple options.

  • Find all users whose phone numbers do not contain the digits “2” and “8”:

    MySQL
    SELECT * FROM Users WHERE phone_number REGEXP '^[^28]*$'
    
    idnameemailemail_verified_atpasswordphone_number
    27Brad Pitt[email protected]2017-02-11T05:45:15.000Z829j2ygocn8btzae49kv+7 401 741 3797
    28Johnny Depp[email protected]2017-05-26T01:19:06.000Zqpp6hbnae42cdhmxlk4j+7 401 195 7363

    In this example, the symbol [^28] represents any character except “2” and “8”, and * means any number of such characters. The ^ and $ symbols indicate the start and end of the string respectively, ensuring that the entire string matches the pattern.

  • Find all users whose phone number starts with «+7»

    MySQL
    SELECT name, phone_number FROM Users WHERE phone_number REGEXP '^\\+7'
    
    namephone_number
    Hideo Kojima+7 401 452 0052
    ClINT Eastwood+7 401 722 0912
    Brad Pitt+7 401 741 3797
    Johnny Depp+7 401 195 7363
    Pierce Brosnan+7 401 749 3620
    Sean Connery+7 401 511 6783

    In this example, ^ denotes the beginning of a string. This means we are looking for strings that start with a specific pattern.

    Since + is a special character in regular expressions, it needs to be escaped with a double backslash (\\) so that it is treated as the literal + character. As a result, \\+ matches the + sign in the string.