Regular Expressions

A regular expression, often called a pattern in Perl, is a template that either matches or doesn’t match a given string.

  • Metacharacters

o When referencing groups, remember that Perl reports the leftmost match in the string, and that a group repeated by a quantifier keeps the value from its last iteration

. # match any single character except newline
 \ # use literal value of next character
 * # quantifier, match zero or more times
 () # used for grouping
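 + # quantifier, match one or more times
 | # alternation: match the left side or the right side
 ^ # anchor to the start of the string; as the first character inside [ ], negates the class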
$_ = "yabba dabba doo baba foo fofo foooo!";
 if ( /(.)[^ba]/ ) { print "$1 \n"; } # a - matched "a " (the last a of yabba plus the space); the leftmost match wins
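The two roles of ^ and the leftmost-match rule can be checked with a short script. This is a minimal sketch: the variable names are made up for illustration, and @- is Perl's built-in array of match offsets.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "yabba dabba doo";

# Outside a character class, ^ anchors the pattern to the start of the string.
my $starts_with_y = ( $text =~ /^y/ ) ? 1 : 0;    # 1
my $starts_with_d = ( $text =~ /^d/ ) ? 1 : 0;    # 0

# As the first character inside [ ], ^ negates the class:
# any character followed by a character that is neither b nor a.
my ($before_other) = $text =~ /(.)[^ba]/;

# $-[0] is the offset at which the last successful match started.
my $offset = $-[0];

print "starts with y: $starts_with_y\n";          # starts with y: 1
print "starts with d: $starts_with_d\n";          # starts with d: 0
print "captured '$before_other' at $offset\n";    # captured 'a' at 4
```

Offset 4 is the final a of "yabba", which shows that the match is the leftmost one in the string, not the last.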
  • Quantifiers

o * The star (*) means to match the preceding item zero or more times

o + The plus (+) means to match the preceding item one or more times

o ? The question mark (?) makes the preceding item optional: it matches zero or one time
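The difference between the three quantifiers shows up when each is applied to the same letter. The sketch below is only an illustration: matches() is a hypothetical helper written for this example, not a Perl built-in.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper for this sketch: returns 1 if $string matches $pattern, else 0.
sub matches {
    my ( $string, $pattern ) = @_;
    return $string =~ $pattern ? 1 : 0;
}

for my $string ( "fd", "fod", "food" ) {
    my $star     = matches( $string, qr/fo*d/ );    # zero or more o
    my $plus     = matches( $string, qr/fo+d/ );    # one or more o
    my $question = matches( $string, qr/fo?d/ );    # zero or one o
    printf "%-4s star=%d plus=%d question=%d\n", $string, $star, $plus, $question;
}
# fd   star=1 plus=0 question=1
# fod  star=1 plus=1 question=1
# food star=1 plus=1 question=0
```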

  • Grouping

o Parentheses are used for grouping

$_ = "yabba dabba doo";
 if (/y((.)(.)\3\2) d\1/) { # \1 is "abba", \2 is "a", \3 is "b"
 print "It Matched!\n";
 }
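Groups are numbered by the position of their opening parenthesis, counting from the left, so a nested group gets a higher number than the group that contains it. The sketch below prints the captures for the pattern /y((.)(.)\3\2) d\1/, chosen here for illustration, and then shows what a group holds when a quantifier repeats it. The variable names are assumptions for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "yabba dabba doo";

# In list context a successful match returns the captures ($1, $2, $3).
my @groups = $text =~ /y((.)(.)\3\2) d\1/;
print "groups: @groups\n";          # groups: abba a b

# A group repeated by a quantifier keeps only its last iteration.
my ($last) = "abc" =~ /(.)+/;
print "last iteration: $last\n";    # last iteration: c
```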
  • Character Classes

o A list of possible characters inside square brackets ( [ ] )

o Example ( [abcdef] ) means to match any one of those characters

o ( [a-f] ) is another way to write the example above

o There are character class shortcuts and negation shortcuts, such as:

( [0-9] ) # \d
 ( [a-zA-Z0-9_] ) # \w (letters, digits, and underscore)
 \s # whitespace (space, tab, newline, and similar)
 ( [^\d] ) # \D
 ( [^\w] ) # \W
 ( [^\s] ) # \S
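The shortcuts can be compared side by side by capturing the first run of characters each one matches. This is a rough sketch on a made-up sample line. It assumes plain ASCII input, where \w behaves like [a-zA-Z0-9_].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample line invented for this illustration.
my $line = "Fred_42 lives at 301 Cobblestone Way";

my ($word)      = $line =~ /(\w+)/;          # letters, digits, underscore
my ($letters)   = $line =~ /([a-zA-Z]+)/;    # letters only
my ($digits)    = $line =~ /(\d+)/;          # digits only
my ($non_digit) = $line =~ /(\D+)/;          # everything up to the first digit
my ($non_word)  = $line =~ /(\W)/;           # first character that is not \w

print "word:      '$word'\n";         # word:      'Fred_42'
print "letters:   '$letters'\n";      # letters:   'Fred'
print "digits:    '$digits'\n";       # digits:    '42'
print "non-digit: '$non_digit'\n";    # non-digit: 'Fred_'
print "non-word:  '$non_word'\n";     # non-word:  ' '
```

The first two results differ because \w also accepts the underscore and the digits, which [a-zA-Z] does not.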

Example Code for Regex:

#!/usr/bin/perl
 # REGEX
 print "\nStart Program\n";

# Example Code:
 $_ = "yabba dabba doo baba foo fofo foooo!"; # set the default input
 if ( /(.)\1/ ) { print "$1 \n"; } # b - matched: bb
 if ( /y(....) d\1/ ) { print "$1 \n"; } # abba - matched: yabba dabba
 if ( /y(.)(.)\2\1/ ) { print "$2 $1 \n"; } # b a - matched: yabba
 # if ( /y(.)(.)\2\3/ ) { print "$2 $1 \n"; } # error! commented out so the rest of the script compiles

# error: /y(.)(.)\2\3/: reference to nonexistent group at ch7_regex.pl line 14.
 if ( /(.)\111/ ) { print "$1 \n"; } # no match - \111 is read as an octal escape (the letter I), not as \1 followed by 11; write \g{1}11 for that
 if ( /(.)\1\1\1/ ) { print "$1 \n"; } # o - matched: oooo (in foooo)
 if ( /(.)[^bar]/ ) { print "$1 \n"; } # a - matched "a " at the end of yabba (leftmost match)
 if ( /(.c)|(.f)/ ) { print "$1 $2 \n"; } # _f - matched " f" before foo (leftmost match); $1 is undefined here

# Exercise 1
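 $_ = "Fred Flintstone Harvey Dent Gotham Alfred Wayne Freddy. Kruger\n";
 # These patterns have no capture groups, so $& (the matched text) is printed instead of $1.
 if (/fred/) { print "1 $& \n"; } # 1 fred - matched inside Alfred; matching is case-sensitive, so Fred is skipped
 if (/jack/) { print "2 $& \n"; } # _ (no match)
 if (/[Aa]lfred/) { print "3 $& \n"; } # 3 Alfred - written [Aa], not [A|a]: inside a class, | is a literal character
 if (/alfred/) { print "4 $& \n"; } # _ (no match: lowercase a)
 if (/\.+/) { print "5 $& \n"; } # 5 . - matched the period after Freddy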
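The \111 line in the example script is worth a closer look. Perl reads \111 as an octal character escape (the letter I) unless the pattern has at least 111 groups. The \g{N} notation, available from Perl 5.10, removes the ambiguity. The sketch below uses made-up test strings and variable names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "ba11 baa11";

# \111 is an octal escape for "I", so this looks for any character followed by "I".
my ($octal) = $text =~ /(.)\111/;
print "octal form: ", ( defined $octal ? "matched" : "no match" ), "\n";    # octal form: no match

# The same pattern does match when an "I" is present.
my ($before_i) = "HI" =~ /(.)\111/;
print "before I: $before_i\n";    # before I: H

# \g{1}11 means: whatever group 1 captured, followed by the literal digits 11.
my ($repeated) = $text =~ /(.)\g{1}11/;
print "repeated: $repeated\n";    # repeated: a
```

The last match is "aa11" inside "baa11": group 1 captures the first a, \g{1} matches the second a, and 11 matches literally.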