The REGEXEXTRACT function allows you to extract text from a string based on a supplied regular expression. You can extract the first match, all matches or capturing groups from the first match.
Syntax
The REGEXEXTRACT function extracts strings within the provided text that matches the pattern.
The syntax of the REGEXEXTRACT function is:
REGEXEXTRACT(text, pattern, [return_mode], [case_sensitivity])
Argument |
Description |
---|---|
text (required) |
The text or the reference to a cell containing the text you want to extract strings from. |
pattern (required) |
The regular expression ("regex") that describes the pattern of text you want to extract. |
return_mode |
A number that specifies what strings you want to extract. By default, return mode is 0. The possible values are: 0: Return the first string that matches the pattern 1: Return all strings that match the pattern as an array 2: Return capturing groups from the first match as an array Note: Capturing groups are parts of a regex pattern surrounded by parentheses "(...)". They allow you to return separate parts of a single match individually. |
case_sensitivity |
Determines whether the match is case-sensitive. By default, the match is case-sensitive. Enter one of the following: 0: Case sensitive 1: Case insensitive |
Notes:
-
When writing regex patterns, symbols called ‘tokens’ can be used that match with a variety of characters. These are some simple tokens for reference:
-
“[0-9]”: any numerical digit
-
“[a-z]”: a character in the range of a to z
-
“.”: any character
-
“a”: the “a” character
-
“a*”: zero or more “a”
-
“a+”: one or more “a”
-
All regular expressions for this function, as well as REGEXTEST and REGEXREPLACE use the PCRE2 'flavor' of regex.
-
REGEXEXTRACT always return text values. You can convert these results back to a number with the VALUE function.
Examples
Copy the example data and paste it in cell A1 of a new Excel worksheet. If you need to, you can adjust the column widths to see all the data.
Extract names based on capital letters with pattern "[A-Z][a-z]+"
Data |
DylanWilliams |
Formulas |
=REGEXEXTRACT(A2,"[A-Z][a-z]+") |
=REGEXEXTRACT(A2,"[A-Z][a-z]+",1) |
The following illustration shows the results.
Extract phone numbers based on their structure with pattern "[0-9()]+ [0-9-]+"
Data |
Sonia Rees (378) 555-4195 Angel Brown (878) 555-8622 Blake Martin (437) 555-8987 William Kirby (619) 555-4212 Avery Smith (579) 555-5658 Parker Jones (346) 555-0925 Liam Small (405) 555-0887 Hollie Rees (666) 555-1872 |
Formulas |
=REGEXEXTRACT(A2,"[0-9()]+ [0-9-]+",1) |
The following illustration shows the result.