
Regular Expressions (RegEx)

Regular Expressions, commonly known as RegEx or RegExp, are patterns used to match, search, and manipulate text. They provide a powerful and concise way to work with strings, making tasks like validation, searching, and text replacement much easier than using basic string methods.

Why learn RegEx?

RegEx is essential for web development because you’ll frequently need to:

  • Validate user input (emails, phone numbers, passwords)
  • Search and replace text in strings
  • Parse and extract data from text
  • Clean and format user data
  • Process log files or data exports

When to use RegEx:

  • ✅ Validating structured text patterns (email, phone, postal codes)
  • ✅ Searching for complex patterns in text
  • ✅ Replacing text with pattern-based rules
  • ✅ Extracting specific parts of strings

When NOT to use RegEx:

  • ❌ Parsing HTML or XML (use a proper parser like DOMParser)
  • ❌ Simple string operations (use includes(), startsWith(), etc.)
  • ❌ When readability matters more than conciseness
  • ❌ Very complex parsing (consider a parser library)

Learning curve warning

RegEx has a reputation for being difficult to read and write, and the reputation is deserved: the syntax is dense and cryptic. However, the basic patterns that cover most everyday tasks are not that complex. Start simple, and gradually build up your knowledge.

Creating Regular Expressions

JavaScript provides native support for regular expressions without needing external libraries. You can create them in two ways:

Literal Notation

Use forward slashes to define a pattern directly in your code:

// Basic pattern
let pattern = /hello/;

// Pattern with flags
let caseInsensitive = /hello/i;

Advantages:

  • More concise and readable
  • Pattern is compiled at script load time (slightly faster)
  • Most common approach in modern JavaScript

RegExp Constructor

Use the RegExp constructor when you need to build patterns dynamically:

// Basic pattern
let pattern = new RegExp("hello");

// Pattern with flags
let caseInsensitive = new RegExp("hello", "i");

// Dynamic pattern from variable
let searchTerm = "world";
let dynamic = new RegExp(searchTerm, "gi");

Advantages:

  • Allows dynamic pattern creation
  • Useful when pattern comes from user input or variables
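When the pattern text comes from user input, characters such as + or ( would otherwise be interpreted as RegEx syntax. The sketch below shows one common way to handle this; escapeRegExp and buildSearchPattern are hypothetical helper names, not built-in functions.

```javascript
// Hypothetical helper: escape characters that have special meaning in a pattern
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a global, case-insensitive search from untrusted text
function buildSearchPattern(searchTerm) {
    return new RegExp(escapeRegExp(searchTerm), "gi");
}

const userInput = "1+1=2 (really?)";
const searchPattern = buildSearchPattern(userInput);

console.log(searchPattern.source); // 1\+1=2 \(really\?\)
console.log("Is 1+1=2 (really?) true?".match(searchPattern)); // ["1+1=2 (really?)"]
```

Without the escaping step, the same input would be parsed as a pattern containing a quantifier and a group rather than as literal text.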

Escaping in constructor

When using the constructor, you need double backslashes for escape sequences:

// Literal notation
let pattern1 = /\d+/;

// Constructor - needs double backslash
let pattern2 = new RegExp("\\d+");

// Both match digits, but constructor syntax is more verbose
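If the doubled backslashes become hard to read, one option is to pass a String.raw template literal to the constructor, since String.raw keeps backslashes exactly as written. This is a small sketch of that idea; the phone-like pattern is only an illustration.

```javascript
// String.raw keeps backslashes as written, so \d does not need doubling
const fromRaw = new RegExp(String.raw`^\d{3}-\d{4}$`);

// Equivalent pattern written as an ordinary string
const fromString = new RegExp("^\\d{3}-\\d{4}$");

console.log(fromRaw.source === fromString.source); // true
console.log(fromRaw.test("555-1234"));             // true
console.log(fromRaw.test("5551234"));              // false
```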

Testing Patterns

The most basic operation is testing if a pattern exists in a string:

let pattern = /hello/;

// test() - returns true/false
console.log(pattern.test("hello world"));  // true
console.log(pattern.test("goodbye world")); // false

// Case-sensitive by default
let caseSensitive = /Hello/;
console.log(caseSensitive.test("hello")); // false

// Case-insensitive with i flag
let caseInsensitive = /Hello/i;
console.log(caseInsensitive.test("hello")); // true

RegEx Syntax Essentials

RegEx uses special characters to define patterns. Here are the most important ones:

Character Classes

// \d - any digit (0-9)
let digits = /\d+/;
console.log(digits.test("123"));    // true
console.log(digits.test("abc"));    // false

// \w - any word character (letters, digits, underscore)
let word = /\w+/;
console.log(word.test("hello"));    // true
console.log(word.test("hello123")); // true

// \s - any whitespace (space, tab, newline)
let space = /\s+/;
console.log(space.test("hello world")); // true
console.log(space.test("hello"));       // false

// . - any character except newline
let any = /h.llo/;
console.log(any.test("hello")); // true
console.log(any.test("hallo")); // true
console.log(any.test("hllo"));  // false

Negated character classes (uppercase = opposite):

// \D - any non-digit
let nonDigit = /\D/;
console.log(nonDigit.test("123"));  // false
console.log(nonDigit.test("12a3")); // true

// \W - any non-word character
let nonWord = /\W/;
console.log(nonWord.test("hello")); // false
console.log(nonWord.test("hello!")); // true

// \S - any non-whitespace
let nonSpace = /\S/;
console.log(nonSpace.test("   ")); // false
console.log(nonSpace.test(" a ")); // true

Quantifiers

Quantifiers specify how many times a pattern should match:

// + - one or more
let oneOrMore = /\d+/;
console.log(oneOrMore.test("1"));     // true
console.log(oneOrMore.test("123"));   // true
console.log(oneOrMore.test(""));      // false

// * - zero or more
let zeroOrMore = /\d*/;
console.log(zeroOrMore.test(""));     // true
console.log(zeroOrMore.test("123"));  // true

// ? - zero or one (optional)
let optional = /colou?r/;
console.log(optional.test("color"));  // true
console.log(optional.test("colour")); // true

// {n} - exactly n times
let exactly = /\d{3}/;
console.log(exactly.test("12"));    // false
console.log(exactly.test("123"));   // true
console.log(exactly.test("1234"));  // true (contains 3 digits)

// {n,m} - between n and m times
let range = /\d{2,4}/;
console.log(range.test("1"));     // false
console.log(range.test("12"));    // true
console.log(range.test("1234"));  // true
console.log(range.test("12345")); // true (contains 2-4 digits)

// {n,} - n or more times
let minimum = /\d{3,}/;
console.log(minimum.test("12"));   // false
console.log(minimum.test("123"));  // true
console.log(minimum.test("12345")); // true

Anchors

Anchors match positions, not characters:

// ^ - start of string
let startsWith = /^hello/;
console.log(startsWith.test("hello world")); // true
console.log(startsWith.test("world hello")); // false

// $ - end of string
let endsWith = /world$/;
console.log(endsWith.test("hello world")); // true
console.log(endsWith.test("world hello")); // false

// Combined - exact match
let exact = /^hello$/;
console.log(exact.test("hello"));       // true
console.log(exact.test("hello world")); // false
console.log(exact.test("hello!"));      // false

Character Sets

// [abc] - match any character in the set
let vowel = /[aeiou]/;
console.log(vowel.test("hello")); // true (contains 'e' and 'o')
console.log(vowel.test("xyz"));   // false

// [a-z] - match any character in range
let lowercase = /[a-z]+/;
console.log(lowercase.test("hello")); // true
console.log(lowercase.test("HELLO")); // false

// [0-9] - match any digit (same as \d)
let digit = /[0-9]/;
console.log(digit.test("abc123")); // true

// [^abc] - match any character NOT in the set
let notVowel = /[^aeiou]/;
console.log(notVowel.test("aaa")); // false
console.log(notVowel.test("abc")); // true (contains 'b' and 'c')

// Combining ranges
let alphanumeric = /[a-zA-Z0-9]+/;
console.log(alphanumeric.test("Hello123")); // true

Alternation

// | - OR operator
let color = /gray|grey/;
console.log(color.test("gray"));  // true
console.log(color.test("grey"));  // true
console.log(color.test("green")); // false

// Multiple alternatives
let fruit = /apple|banana|orange/;
console.log(fruit.test("I like apples")); // true

Escaping Special Characters

To match special characters literally, escape them with backslash:

// Match a literal dot
let dot = /\./;
console.log(dot.test("hello.world")); // true
console.log(dot.test("hello world")); // false

// Match literal parentheses
let parens = /\(hello\)/;
console.log(parens.test("(hello)")); // true

// Common characters to escape: . * + ? ^ $ { } ( ) | [ ] \ /
let specialChars = /\$\d+\.\d{2}/; // Match $10.99
console.log(specialChars.test("Price: $10.99")); // true

Flags

Flags modify how the pattern matching works:

// i - case-insensitive
let caseInsensitive = /hello/i;
console.log(caseInsensitive.test("HELLO")); // true
console.log(caseInsensitive.test("HeLLo")); // true

// g - global (find all matches, not just first)
let global = /\d/g;
let text = "abc123def456";
console.log(text.match(global)); // ["1", "2", "3", "4", "5", "6"]

// m - multiline (^ and $ also match at the start and end of each line)
let multiline = /^test$/m;
let multilineText = "line1\ntest\nline3";
console.log(multiline.test(multilineText)); // true

// s - dotAll (. matches newlines) - ES2018
let dotAll = /hello.world/s;
console.log(dotAll.test("hello\nworld")); // true

// u - unicode (proper Unicode handling)
let unicode = /\u{1F600}/u; // 😀 emoji
console.log(unicode.test("😀")); // true

// y - sticky (matches from lastIndex position)
let sticky = /\d/y;
sticky.lastIndex = 3;
console.log(sticky.test("abc123")); // true (starts at index 3)

// d - indices (provides start/end indices of matches) - ES2022
let indices = /(\d+)/d;
let result = indices.exec("abc123def");
console.log(result.indices); // [[3, 6], [3, 6]]

// Combining flags
let combined = /test/gi; // global + case-insensitive

String Methods with RegEx

JavaScript strings have several methods that work with regular expressions:

match() - Find Matches

let text = "The year is 2023, not 2022";

// Without g flag - returns first match with details
let single = text.match(/\d+/);
console.log(single); // ["2023", index: 12, input: "...", groups: undefined]

// With g flag - returns all matches
let all = text.match(/\d+/g);
console.log(all); // ["2023", "2022"]

// No match returns null
let noMatch = text.match(/\d{5}/);
console.log(noMatch); // null

matchAll() - Find All Matches with Details (ES2020)

let text = "Email: john@example.com, jane@test.com";
let pattern = /(\w+)@(\w+\.\w+)/g;

// matchAll returns an iterator
for (let match of text.matchAll(pattern)) {
    console.log(`Full: ${match[0]}`);
    console.log(`User: ${match[1]}`);
    console.log(`Domain: ${match[2]}`);
}
// Full: john@example.com, User: john, Domain: example.com
// Full: jane@test.com, User: jane, Domain: test.com

replace() - Replace Matches

let text = "Hello World";

// Simple replacement
let result1 = text.replace(/World/, "JavaScript");
console.log(result1); // "Hello JavaScript"

// With g flag - replace all occurrences
let text2 = "foo bar foo baz";
let result2 = text2.replace(/foo/g, "FOO");
console.log(result2); // "FOO bar FOO baz"

// Using function for dynamic replacement
let text3 = "I have 2 apples and 5 oranges";
let result3 = text3.replace(/\d+/g, match => parseInt(match) * 2);
console.log(result3); // "I have 4 apples and 10 oranges"

// Using captured groups
let date = "2023-08-27";
let formatted = date.replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");
console.log(formatted); // "27/08/2023"
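A replacer function also receives the captured groups as arguments after the full match. As a rough sketch, the hypothetical toKebabCase helper below uses them to turn camelCase names into kebab-case.

```javascript
// Hypothetical helper: convert camelCase to kebab-case
function toKebabCase(name) {
    // The replacer receives the full match first, then each captured group
    return name.replace(/([a-z0-9])([A-Z])/g, (match, before, upper) => {
        return `${before}-${upper.toLowerCase()}`;
    });
}

console.log(toKebabCase("backgroundColor")); // "background-color"
console.log(toKebabCase("marginTopLeft"));   // "margin-top-left"
```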

replaceAll() - Replace All (ES2021)

let text = "foo bar foo baz";

// No need for g flag
let result = text.replaceAll("foo", "FOO");
console.log(result); // "FOO bar FOO baz"

// Works with RegEx too (but requires g flag)
let result2 = text.replaceAll(/foo/g, "FOO");
console.log(result2); // "FOO bar FOO baz"

search() - Find Position

let text = "The year is 2023";

// Returns index of first match
let position = text.search(/\d+/);
console.log(position); // 12

// Returns -1 if not found
let notFound = text.search(/xyz/);
console.log(notFound); // -1

split() - Split String

let text = "one,two;three:four";

// Split by multiple delimiters
let parts = text.split(/[,;:]/);
console.log(parts); // ["one", "two", "three", "four"]

// Split on whitespace
let sentence = "Hello   world  from   JavaScript";
let words = sentence.split(/\s+/);
console.log(words); // ["Hello", "world", "from", "JavaScript"]
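One detail of split() that is easy to miss: if the separator pattern contains a capturing group, the captured delimiters are kept in the result array. A short sketch, using a made-up arithmetic string:

```javascript
const expression = "10+20-5";

// Without a group: the delimiters are dropped
const withoutDelimiters = expression.split(/[+-]/);
console.log(withoutDelimiters); // ["10", "20", "5"]

// With a capturing group: the delimiters are kept in the result
const withDelimiters = expression.split(/([+-])/);
console.log(withDelimiters); // ["10", "+", "20", "-", "5"]
```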

test() and exec() - Pattern Methods

let pattern = /(\d{4})-(\d{2})-(\d{2})/;
let text = "Date: 2023-08-27";

// test() - returns boolean
console.log(pattern.test(text)); // true

// exec() - returns match details or null
let result = pattern.exec(text);
console.log(result[0]); // "2023-08-27" (full match)
console.log(result[1]); // "2023" (first group)
console.log(result[2]); // "08" (second group)
console.log(result[3]); // "27" (third group)

Capturing Groups

Groups allow you to extract specific parts of a match:

Basic Groups

// Parentheses create capturing groups
let datePattern = /(\d{4})-(\d{2})-(\d{2})/;
let result = datePattern.exec("2023-08-27");

console.log(result[0]); // "2023-08-27" (full match)
console.log(result[1]); // "2023" (group 1)
console.log(result[2]); // "08" (group 2)
console.log(result[3]); // "27" (group 3)

Named Capturing Groups (ES2018)

Named groups make code more readable:

// Syntax: (?<name>pattern)
let datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
let result = datePattern.exec("2023-08-27");

// Access by name instead of number
console.log(result.groups.year);  // "2023"
console.log(result.groups.month); // "08"
console.log(result.groups.day);   // "27"

// Use in replace
let date = "2023-08-27";
let formatted = date.replace(
    /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/,
    "$<day>/$<month>/$<year>"
);
console.log(formatted); // "27/08/2023"
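Named groups combine well with matchAll() and object destructuring. The following is a minimal sketch; the deployLog text is invented sample data.

```javascript
const deployLog = "Deploys: 2023-08-27, 2023-09-03";
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;

// Each match exposes its named groups, which can be destructured directly
const dates = [];
for (const match of deployLog.matchAll(datePattern)) {
    const { year, month, day } = match.groups;
    dates.push({ year: Number(year), month: Number(month), day: Number(day) });
}

console.log(dates);
// [{ year: 2023, month: 8, day: 27 }, { year: 2023, month: 9, day: 3 }]
```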

Non-capturing Groups

Use (?:pattern) when you need grouping but don’t want to capture:

// Without non-capturing group (3 captures)
let pattern1 = /(\d{3})-(\d{2})-(\d{4})/;
let result1 = pattern1.exec("123-45-6789");
console.log(result1.length); // 4 (full match + 3 groups)

// With non-capturing group (2 captures)
let pattern2 = /(?:\d{3})-(\d{2})-(\d{4})/;
let result2 = pattern2.exec("123-45-6789");
console.log(result2.length); // 3 (full match + 2 groups)
console.log(result2[1]); // "45"
console.log(result2[2]); // "6789"
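A typical reason to need grouping without capturing is alternation: the group limits how far the | reaches. This sketch assumes a simple file-name check for a few image extensions, chosen only for illustration.

```javascript
// (?:...) keeps the alternatives together without adding a capture
const imageFile = /^(\w+)\.(?:jpe?g|png|gif)$/i;

const imageMatch = imageFile.exec("holiday.JPG");
console.log(imageMatch[1]);     // "holiday"
console.log(imageMatch.length); // 2 (full match + 1 group)

console.log(imageFile.test("notes.txt")); // false
```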

Common Validation Patterns

Here are starter patterns for common validation tasks. They cover the typical formats shown in the examples, not every valid input:

Email Validation

// Basic email validation
let emailPattern = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

console.log(emailPattern.test("john.doe@example.com"));  // true
console.log(emailPattern.test("user+tag@domain.co.uk")); // false (doesn't allow +)
console.log(emailPattern.test("invalid@"));              // false

// More permissive pattern
let emailPermissive = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
console.log(emailPermissive.test("user+tag@domain.co.uk")); // true

Email validation complexity

Perfect email validation with RegEx is nearly impossible due to RFC 5322 complexity. For production:

  • Use a simple pattern for basic validation
  • Send verification email for confirmation
  • Consider using a library like validator.js for comprehensive validation

Phone Numbers

// US phone number: (123) 456-7890 or 123-456-7890
let phonePattern = /^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$/;

console.log(phonePattern.test("(123) 456-7890")); // true
console.log(phonePattern.test("123-456-7890"));   // true
console.log(phonePattern.test("123.456.7890"));   // true
console.log(phonePattern.test("1234567890"));     // true

// Danish phone: +45 12 34 56 78 or 12345678
let danishPhone = /^(\+45\s?)?(\d{8}|\d{2}\s\d{2}\s\d{2}\s\d{2})$/;
console.log(danishPhone.test("+45 12 34 56 78")); // true
console.log(danishPhone.test("12345678"));         // true

Postal Codes

// US ZIP code: 12345 or 12345-6789
let usZip = /^\d{5}(-\d{4})?$/;
console.log(usZip.test("12345"));       // true
console.log(usZip.test("12345-6789"));  // true

// Danish postal code: 4 digits
let danishPostal = /^\d{4}$/;
console.log(danishPostal.test("2100")); // true
console.log(danishPostal.test("1234")); // true

// UK postcode: SW1A 1AA (simplified; the district may end in a letter or digit)
let ukPostcode = /^[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}$/i;
console.log(ukPostcode.test("SW1A 1AA")); // true
console.log(ukPostcode.test("M1 1AE"));   // true

Danish CPR Number

// Format: DDMMYY-XXXX
let cprPattern = /^(\d{2})(\d{2})(\d{2})-?(\d{4})$/;

let result = cprPattern.exec("271082-1234");
if (result) {
    console.log(`Day: ${result[1]}`);    // 27
    console.log(`Month: ${result[2]}`);  // 10
    console.log(`Year: ${result[3]}`);   // 82
    console.log(`Sequence: ${result[4]}`); // 1234
}

// With validation (basic)
function validateCPR(cpr) {
    let match = cprPattern.exec(cpr);
    if (!match) return false;

    let day = parseInt(match[1]);
    let month = parseInt(match[2]);
    let year = parseInt(match[3]);

    // Basic range checks
    return day >= 1 && day <= 31 && 
           month >= 1 && month <= 12 &&
           year >= 0 && year <= 99;
}

console.log(validateCPR("271082-1234")); // true
console.log(validateCPR("321382-5678")); // false (day > 31)

Password Strength

// Requirements:
// - At least 8 characters
// - At least one uppercase letter
// - At least one lowercase letter
// - At least one digit
// - At least one special character

let passwordPattern = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

console.log(passwordPattern.test("P@ssw0rd"));     // true
console.log(passwordPattern.test("password"));     // false (no uppercase, digit, special)
console.log(passwordPattern.test("PASSWORD1!"));   // false (no lowercase)
console.log(passwordPattern.test("Pass1!"));       // false (too short)

// More readable version with individual checks
function validatePassword(password) {
    return password.length >= 8 &&
           /[a-z]/.test(password) &&
           /[A-Z]/.test(password) &&
           /\d/.test(password) &&
           /[@$!%*?&]/.test(password);
}

console.log(validatePassword("P@ssw0rd")); // true

URL Validation

// Basic URL validation (the path part uses a single quantifier to avoid catastrophic backtracking)
let urlPattern = /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})[\/\w .-]*\/?$/;

console.log(urlPattern.test("http://example.com"));              // true
console.log(urlPattern.test("https://www.example.com/path"));   // true
console.log(urlPattern.test("example.com"));                     // true
console.log(urlPattern.test("not a url"));                       // false

// More permissive (requires http/https, accepts ports, query strings, and fragments)
let urlComprehensive = /^https?:\/\/[^\s/$.?#].[^\s]*$/i;
console.log(urlComprehensive.test("https://example.com:8080/path?query=value#anchor")); // true
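As an alternative to a pattern, the built-in URL constructor can do the parsing, since it throws when it cannot parse its input. The isHttpUrl helper below is a hypothetical sketch that additionally restricts the scheme to http and https.

```javascript
// Hypothetical helper: accept only absolute http(s) URLs
function isHttpUrl(value) {
    try {
        const url = new URL(value);
        return url.protocol === "http:" || url.protocol === "https:";
    } catch {
        return false; // new URL() throws on input it cannot parse
    }
}

console.log(isHttpUrl("https://example.com:8080/path?query=value#anchor")); // true
console.log(isHttpUrl("example.com"));       // false (no scheme)
console.log(isHttpUrl("not a url"));         // false
console.log(isHttpUrl("ftp://example.com")); // false (wrong scheme)
```

Unlike the basic pattern above, this approach rejects "example.com" because there is no scheme; whether that is desirable depends on what the form should accept.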

Real-World Examples

Form Validation

function validateForm(formData) {
    let errors = [];

    // Email validation
    let emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    if (!emailPattern.test(formData.email)) {
        errors.push("Invalid email address");
    }

    // Phone validation (flexible format)
    let phonePattern = /^\+?[\d\s()-]{10,}$/;
    if (!phonePattern.test(formData.phone)) {
        errors.push("Invalid phone number");
    }

    // Password strength
    if (formData.password.length < 8) {
        errors.push("Password must be at least 8 characters");
    }
    if (!/[A-Z]/.test(formData.password)) {
        errors.push("Password must contain uppercase letter");
    }
    if (!/[a-z]/.test(formData.password)) {
        errors.push("Password must contain lowercase letter");
    }
    if (!/\d/.test(formData.password)) {
        errors.push("Password must contain a number");
    }

    return {
        valid: errors.length === 0,
        errors: errors
    };
}

// Usage
let result = validateForm({
    email: "user@example.com",
    phone: "+45 12345678",
    password: "MyP@ss123"
});

if (result.valid) {
    console.log("Form is valid!");
} else {
    console.log("Errors:", result.errors);
}

Data Extraction from Text

// Extract all email addresses from text
let text = `
    Contact us at info@example.com or support@example.com.
    Sales: sales@company.org
`;

let emailPattern = /[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
let emails = text.match(emailPattern);
console.log(emails); // ["info@example.com", "support@example.com", "sales@company.org"]

// Extract hashtags from social media text
let tweet = "Learning #JavaScript and #RegEx is fun! #WebDev";
let hashtags = tweet.match(/#\w+/g);
console.log(hashtags); // ["#JavaScript", "#RegEx", "#WebDev"]

// Extract prices from text
let productText = "Laptop: $999.99, Mouse: $25.00, Keyboard: $75.50";
let prices = productText.match(/\$\d+\.\d{2}/g);
console.log(prices); // ["$999.99", "$25.00", "$75.50"]

Text Sanitization

// Remove HTML tags (quick display cleanup; not a substitute for a real HTML sanitizer)
function stripHtml(html) {
    return html.replace(/<[^>]*>/g, "");
}

let html = "<p>Hello <strong>World</strong>!</p>";
console.log(stripHtml(html)); // "Hello World!"

// Sanitize filename (remove invalid characters)
function sanitizeFilename(filename) {
    return filename.replace(/[<>:"/\\|?*]/g, "_");
}

console.log(sanitizeFilename("my file?.txt")); // "my file_.txt"

// Remove extra whitespace
function normalizeWhitespace(text) {
    return text.replace(/\s+/g, " ").trim();
}

let messy = "  Hello    world   \n  from    JavaScript  ";
console.log(normalizeWhitespace(messy)); // "Hello world from JavaScript"

String Formatting

// Format phone number
function formatPhone(phone) {
    let cleaned = phone.replace(/\D/g, "");
    let match = cleaned.match(/^(\d{3})(\d{3})(\d{4})$/);
    if (match) {
        return `(${match[1]}) ${match[2]}-${match[3]}`;
    }
    return phone;
}

console.log(formatPhone("1234567890")); // "(123) 456-7890"

// Format credit card number
function formatCreditCard(number) {
    let cleaned = number.replace(/\s/g, "");
    return cleaned.replace(/(\d{4})/g, "$1 ").trim();
}

console.log(formatCreditCard("1234567890123456")); // "1234 5678 9012 3456"

// Slug generation for URLs
function createSlug(title) {
    return title
        .toLowerCase()
        .replace(/[^\w\s-]/g, "")    // Remove non-word chars
        .trim()                       // Trim before spaces become hyphens
        .replace(/\s+/g, "-")         // Replace spaces with -
        .replace(/--+/g, "-");        // Replace multiple - with single -
}

console.log(createSlug("Hello World! This is a Test.")); // "hello-world-this-is-a-test"

Advanced Features

Lookahead and Lookbehind Assertions

Assertions check for patterns without including them in the match:

// Positive lookahead (?=pattern) - matches if pattern follows
let followedBy = /\d+(?=px)/g;
let text = "10px 20em 30px";
console.log(text.match(followedBy)); // ["10", "30"] (numbers followed by 'px')

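// Negative lookahead (?!pattern) - matches if pattern doesn't follow
// The \d inside the lookahead stops the engine from backtracking to a partial number such as "1" in "10px"
let notFollowedBy = /\d+(?!px|\d)/g;
console.log(text.match(notFollowedBy)); // ["20"] (number not followed by 'px')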

// Positive lookbehind (?<=pattern) - matches if pattern precedes (ES2018)
let precededBy = /(?<=\$)\d+/g;
let prices = "Item costs $50 and €30";
console.log(prices.match(precededBy)); // ["50"] (numbers preceded by '$')

// Negative lookbehind (?<!pattern) - matches if pattern doesn't precede (ES2018)
// The \d inside the lookbehind stops a match from starting in the middle of "50"
let notPrecededBy = /(?<![$\d])\d+/g;
console.log(prices.match(notPrecededBy)); // ["30"] (number not preceded by '$')

Practical example - password validation with lookaheads:

// Must contain lowercase, uppercase, and digit
let strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$/;

console.log(strongPassword.test("Password1"));  // true
console.log(strongPassword.test("password1"));  // false (no uppercase)
console.log(strongPassword.test("PASSWORD1"));  // false (no lowercase)
console.log(strongPassword.test("Password"));   // false (no digit)

Unicode Property Escapes (ES2018)

Match characters by Unicode properties:

// Match any letter from any language
let anyLetter = /\p{Letter}/u;
console.log(anyLetter.test("a"));  // true
console.log(anyLetter.test("å"));  // true
console.log(anyLetter.test("中")); // true

// Match emoji
let emoji = /\p{Emoji}/u;
console.log(emoji.test("😀")); // true
console.log(emoji.test("a"));  // false

// Match specific scripts
let cyrillic = /\p{Script=Cyrillic}/u;
console.log(cyrillic.test("Я"));  // true (Russian)
console.log(cyrillic.test("a"));  // false

// Match currency symbols
let currency = /\p{Currency_Symbol}/u;
console.log(currency.test("$"));  // true
console.log(currency.test("€"));  // true
console.log(currency.test("¥"));  // true
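Property escapes are handy when validating names that are not limited to ASCII letters. The pattern below is only a sketch: the set of allowed punctuation (space, apostrophe, hyphen) is an assumption and would need adjusting for real data.

```javascript
// Letters from any script, combining marks, spaces, apostrophes, and hyphens
const namePattern = /^[\p{L}\p{M}' -]+$/u;

console.log(namePattern.test("Søren Kierkegaard")); // true
console.log(namePattern.test("Jean-Luc O'Neill"));  // true
console.log(namePattern.test("李小龍"));             // true
console.log(namePattern.test("R2-D2"));             // false (digits are not letters)

// An ASCII-only class rejects the same Danish name
console.log(/^[A-Za-z' -]+$/.test("Søren Kierkegaard")); // false
```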

Greedy vs Lazy Matching

By default, quantifiers are greedy (match as much as possible):

let text = "<div>Hello</div><div>World</div>";

// Greedy - matches as much as possible
let greedy = /<div>.*<\/div>/;
console.log(text.match(greedy)[0]); 
// "<div>Hello</div><div>World</div>" (entire string)

// Lazy - matches as little as possible (add ?)
let lazy = /<div>.*?<\/div>/;
console.log(text.match(lazy)[0]); 
// "<div>Hello</div>" (first div only)

// More examples
let html = '<a href="link1">text1</a> and <a href="link2">text2</a>';

// Greedy
console.log(html.match(/href=".*"/)[0]); 
// href="link1">text1</a> and <a href="link2" (too much!)

// Lazy
console.log(html.match(/href=".*?"/)[0]); 
// href="link1" (correct!)
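Another common way to stop a match at the closing quote is a negated character set instead of a lazy quantifier. The sketch below reuses the same sample string; it is still a quick extraction trick, not an HTML parser.

```javascript
const html = '<a href="link1">text1</a> and <a href="link2">text2</a>';

// [^"]* cannot run past the closing quote, so no lazy quantifier is needed
const hrefPattern = /href="([^"]*)"/g;

const links = [];
for (const match of html.matchAll(hrefPattern)) {
    links.push(match[1]); // group 1 holds the attribute value
}

console.log(links); // ["link1", "link2"]
```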

Testing and Debugging RegEx

Online Tools

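RegEx can be tricky to write and debug. Online testers such as regex101.com let you try a pattern against sample input and inspect each match before you put it in code.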

Testing in Code

// Create test suite for your patterns
function testPattern(pattern, tests) {
    console.log(`Testing: ${pattern}`);
    tests.forEach(test => {
        let result = pattern.test(test.input);
        let status = result === test.expected ? "✓" : "✗";
        console.log(`${status} "${test.input}" => ${result}`);
    });
}

// Usage
let emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
testPattern(emailPattern, [
    { input: "user@example.com", expected: true },
    { input: "invalid.email", expected: false },
    { input: "user@domain", expected: false },
    { input: "@example.com", expected: false }
]);

Common Pitfalls

// Pitfall 1: Forgetting to escape special characters
let wrong = /user.name/; // . matches any character!
console.log(wrong.test("user@name")); // true (oops!)

let correct = /user\.name/; // Escaped dot
console.log(correct.test("user@name")); // false
console.log(correct.test("user.name")); // true

// Pitfall 2: Not anchoring patterns
let wrong2 = /\d{3}/; // Matches 3 digits anywhere
console.log(wrong2.test("abc123def")); // true

let correct2 = /^\d{3}$/; // Exactly 3 digits, nothing else
console.log(correct2.test("abc123def")); // false
console.log(correct2.test("123")); // true

// Pitfall 3: Catastrophic backtracking (performance issue)
let dangerous = /^(a+)+$/; // Can cause browser freeze!
// This pattern has exponential time complexity
// For input "aaaaaaaaaaaaaaaaaaaaX" the number of attempts roughly doubles with every extra "a"

// Better approach
let safe = /^a+$/; // Linear time complexity

// Pitfall 4: Reusing regex with g flag
let pattern = /\d/g;
console.log(pattern.test("123")); // true (lastIndex is now 1)
console.log(pattern.test("123")); // true (lastIndex is now 2)
console.log(pattern.test("123")); // true (lastIndex is now 3)
console.log(pattern.test("123")); // false (search started at the end of the string)
// Solution: Create new regex each time or reset lastIndex
pattern.lastIndex = 0;
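The same lastIndex bookkeeping is what makes a classic exec() loop work: each call resumes where the previous match ended. A minimal sketch with a made-up input string:

```javascript
const sample = "a1b22c333";
const digitRuns = /\d+/g;

const found = [];
let match;
// Each exec() call resumes at digitRuns.lastIndex and returns null when done
while ((match = digitRuns.exec(sample)) !== null) {
    found.push({ value: match[0], index: match.index, next: digitRuns.lastIndex });
}

console.log(found);
// [{ value: "1", index: 1, next: 2 },
//  { value: "22", index: 3, next: 5 },
//  { value: "333", index: 6, next: 9 }]
console.log(digitRuns.lastIndex); // 0 (reset after the final failed exec)
```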

Best Practices

When to Use RegEx

// ✅ DO: Use RegEx for pattern matching
let email = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
if (email.test(userInput)) { /* ... */ }

// ❌ DON'T: Use RegEx for simple string operations
// Bad:
let hasHelloRegex = /hello/.test(str);
// Good:
let hasHello = str.includes("hello");

// Bad:
let startsWithHelloRegex = /^hello/.test(str);
// Good:
let startsWithHello = str.startsWith("hello");

Keep It Simple

// ❌ DON'T: Write complex, unreadable patterns
let complex = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// ✅ DO: Break into multiple simple checks
function isStrongPassword(password) {
    return password.length >= 8 &&
           /[a-z]/.test(password) &&
           /[A-Z]/.test(password) &&
           /\d/.test(password) &&
           /[@$!%*?&]/.test(password);
}

Comment Complex Patterns

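// ✅ DO: Add comments for complex patterns
// JavaScript has no x (free-spacing) flag, so build the pattern from commented parts
let phonePattern = new RegExp([
    "^",          // Start of string
    "\\(?",       // Optional opening parenthesis
    "(\\d{3})",   // Area code (3 digits)
    "\\)?",       // Optional closing parenthesis
    "[-.\\s]?",   // Optional separator
    "(\\d{3})",   // Exchange (3 digits)
    "[-.\\s]?",   // Optional separator
    "(\\d{4})",   // Line number (4 digits)
    "$"           // End of string
].join(""));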

// Or use regular comments
let emailPattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// Matches: username + @ + domain + . + extension
// username: anything except whitespace and @
// domain: anything except whitespace and @
// extension: anything except whitespace and @

Validate on Both Client and Server

// ✅ DO: Use RegEx for client-side validation
function validateEmailClient(email) {
    let pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    return pattern.test(email);
}

// ✅ DO: Also validate on server
// Client-side validation can be bypassed!
// Always validate critical data server-side as well

Consider Alternatives

// For HTML parsing - DON'T use RegEx
// ❌ Bad:
let tagsByRegex = html.match(/<[^>]+>/g); // Unreliable!

// ✅ Good:
let parser = new DOMParser();
let doc = parser.parseFromString(html, "text/html");
let tags = doc.getElementsByTagName("*");

// For complex validation - use a library
// ❌ Bad:
let complexEmailPattern = /^[a-zA-Z0-9...very long pattern...$/;

// ✅ Good:
import validator from 'validator';
validator.isEmail(email);

Summary

Regular expressions are a powerful tool for pattern matching and text manipulation in JavaScript. While they have a steep learning curve, mastering the basics will significantly improve your ability to work with strings.

Key Takeaways

Creating RegEx:

  • Use literal notation /pattern/flags for static patterns
  • Use new RegExp() for dynamic patterns
  • Remember double backslashes in constructor: new RegExp("\\d+")

Essential Syntax:

  • Character classes: \d (digit), \w (word), \s (whitespace), . (any character except newline)
  • Quantifiers: + (1+), * (0+), ? (0-1), {n} (exactly n), {n,m} (range)
  • Anchors: ^ (start), $ (end)
  • Sets: [abc] (one of), [^abc] (not one of), [a-z] (range)
  • Escape special chars: \. \* \+ \? \$ etc.

Common Flags:

  • i - case-insensitive
  • g - global (find all matches)
  • m - multiline (^ and $ match at the start and end of each line)
  • s - dotAll (. matches newlines) - ES2018
  • u - Unicode mode
  • d - indices (match positions) - ES2022

String Methods:

  • test() - returns true/false (called on the pattern, not the string)
  • match() - returns array of matches
  • matchAll() - returns iterator with details - ES2020
  • replace() / replaceAll() - replace matches
  • search() - find position
  • split() - split by pattern

Modern Features:

  • Named groups (?<name>...) - ES2018
  • Lookbehind (?<=...) and (?<!...) - ES2018
  • Unicode properties \p{Letter} - ES2018
  • matchAll() for iteration - ES2020
  • d flag for indices - ES2022

Best Practices:

  • Keep patterns simple and readable
  • Comment complex patterns
  • Use online tools (regex101.com) for testing
  • Break complex logic into multiple simple checks
  • Escape special characters when matching literally
  • Validate on both client AND server
  • Use startsWith(), includes() etc. for simple operations
  • Don’t parse HTML with RegEx - use proper parsers
  • Watch for catastrophic backtracking in complex patterns

Common Use Cases:

  • Form validation (email, phone, passwords)
  • Data extraction (URLs, hashtags, prices)
  • Text sanitization (removing HTML, special chars)
  • String formatting (phone numbers, credit cards, slugs)
  • Search and replace with patterns

When NOT to use RegEx:

  • Simple string checks (use includes(), startsWith())
  • HTML/XML parsing (use DOMParser or libraries)
  • Very complex validation (use validator libraries)
  • When readability matters more than conciseness

RegEx is a skill that improves with practice. Start with simple patterns, use testing tools, and gradually build more complex expressions as needed. Remember: a simple, readable solution is usually better than a clever but incomprehensible RegEx pattern!