Extracting structured pieces of text from a free-form cell — emails out of a notes column, phone numbers out of a contact field, the domain out of a URL — is one of the most common Google Sheets tasks. This tutorial covers the regular-expression family (REGEXEXTRACT, REGEXMATCH, REGEXREPLACE) and the simpler positional functions (LEFT, RIGHT, MID, FIND, SEARCH) and shows when each fits.
Pick the right tool
If the field has a fixed format and the piece you want sits at a known position ("the first three characters"), the LEFT/RIGHT/MID family is faster and easier to reason about. If the piece is identified by a pattern ("anything that looks like an email"), regex wins. Mixing the two is fine: use LEFT for the obvious slice and REGEXEXTRACT for the messy bit.
REGEXEXTRACT in 60 seconds
=REGEXEXTRACT(A2, "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
REGEXEXTRACT returns the first match, or #N/A if there is no match. The pattern uses Google's RE2 dialect: most things you expect from PCRE work, but lookbehinds and some advanced features do not. Wrap in IFERROR for missing matches: =IFERROR(REGEXEXTRACT(A2, pattern), "").
REGEXMATCH for true/false tests
=IF(REGEXMATCH(A2, "^\d{5}(-\d{4})?\$"), "Valid ZIP", "Invalid")
REGEXMATCH returns TRUE/FALSE. Use it inside IF, FILTER, or COUNTIF criteria. The example above accepts five-digit ZIPs and ZIP+4. Always anchor your patterns with ^ and \$ when validating — without them, a substring match counts as success and you accept invalid data.
REGEXREPLACE for cleaning
=REGEXREPLACE(A2, "[^0-9]", "")
REGEXREPLACE substitutes every match. The pattern above strips everything that is not a digit — handy for normalizing phone numbers before comparison. Use capture groups in the pattern and $1, $2 in the replacement to keep specific pieces while reformatting around them.
Worked examples
Pull the email out of a notes field
=IFERROR(REGEXEXTRACT(A2, "[\w.+-]+@[\w-]+\.[\w.-]+"), "")
Pattern accepts most real-world email shapes; tighten to your own validation rules if needed.
Domain from a URL
=REGEXEXTRACT(A2, "https?://(?:www\.)?([^/]+)")
Capture group returns the domain without the scheme or leading www.
Strip non-digits from a phone number
=REGEXREPLACE(A2, "\D", "")
\D is shorthand for any non-digit character.
Common mistakes (and how to avoid them)
- Forgetting that backslashes in the formula need to be doubled when the pattern is inside a quoted string.
- Using lookbehinds — Google Sheets' RE2 engine does not support them. Restructure with capture groups instead.
- Returning
#N/Ainto a downstream calculation. Always wrap REGEXEXTRACT in IFERROR for resilience. - Validating without anchors.
REGEXMATCH(A2, "\d{5}")accepts "abc12345xyz" — anchor with^and\$.
Tips that pay off later
- Use named capture groups (when migrating from another engine) by switching to plain groups in RE2 — it does not support named groups.
- Test patterns in a scratch cell against three or four real values before pasting them into a column-wide ARRAYFORMULA.
- When you need multiple matches per cell, switch to
SPLITwith a regex separator, or call REGEXEXTRACT in a loop withBYROW.
When to reach for something else
If the source data is genuinely tabular but split across one column, SPLIT followed by trimming is faster and easier to maintain than three regex extracts.