Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. --Jamie Zawinski, in comp.lang.emacs
Regular expressions (RE) are used to extremes in Perl and are available in many languages. The following is designed as a quick reference / memory jog for experienced RE users. Any new users should: A) find another solution, B) copy existing working code, C) join a newsgroup or mailing list and ask for help, or D) take a class. RE is like shaking hands with an octopus.
Matches
  ^        Beginning.
  $        End.
  .        Any character (by default, anything except a newline).
  [.-.]    Any character from the first "." to the second, where "." is any character. e.g. [A-Z] matches any uppercase letter.

Literals
  \.       Quote. Treats "." as a literal value, where "." is any character. e.g. \$ matches the dollar sign, not the end of line.
  \###     Byte, where ### are three octal digits.
  \x##     Byte, where ## are two hexadecimal digits.

Flow control
  (.*)     Group. The pattern inside the parens is matched as a unit: all of it or nothing. Saves the match in $#, where # counts up the groups. e.g. Time: (..):(..):(..) will put the hours in $1, minutes in $2 and seconds in $3.
  .*|.*    Or. If the pattern before the "|" fails to match, it will try the pattern after. e.g. A|B will match A or B.

Repeat
  *        0 or more times. Same as {0,}. Greedy: will "eat" as much as it can, up to the end, unless followed by ? or by a pattern that still has to match.
  +        1 or more times. Same as {1,}. Greedy in the same way as *.
  ?        0 or 1 times. Same as {0,1}.
  {n}      Match exactly n times.
  {n,}     Match at least n times. Greedy in the same way as *.
  {n,m}    Match at least n but not more than m times.
  .*?      Match the minimum number of times possible, where .* is one of the repeat patterns above. e.g. foo(.*)bar used against "the food is barbecued in the barn" will set $1 to "d is barbecued in the " but foo(.*?)bar will set it to "d is ". Notice that foo(.*)barb will also produce "d is ", because "barb" occurs only once in the text.
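The greedy and minimal repeat examples above can be checked directly. The following is a minimal sketch in Python rather than Perl, so it assumes Python's `re` module, where the Perl variables $1, $2, $3 correspond to `group(1)`, `group(2)`, `group(3)`. The variable names and the sample time string are made up for illustration.

```python
import re

text = "the food is barbecued in the barn"

# Greedy: (.*) grabs as much as it can, so the match ends at the LAST "bar".
greedy = re.search(r"foo(.*)bar", text).group(1)

# Minimal: (.*?) grabs as little as it can, so the match ends at the FIRST "bar".
lazy = re.search(r"foo(.*?)bar", text).group(1)

# A longer literal after the group pins down where the greedy match must stop.
pinned = re.search(r"foo(.*)barb", text).group(1)

# Groups are numbered left to right, like $1, $2 and $3 in Perl.
hours, minutes, seconds = re.search(
    r"Time: (..):(..):(..)", "Time: 12:34:56"
).groups()

print(repr(greedy))   # 'd is barbecued in the '
print(repr(lazy))     # 'd is '
print(repr(pinned))   # 'd is '
print(hours, minutes, seconds)
```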
For a regular expression to match, the entire regular expression must match, not just part of it. So if the beginning of a pattern containing a quantifier succeeds in a way that causes later parts in the pattern to fail, the matching engine backs up and recalculates the beginning part--that's why it's called backtracking.
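Backtracking is easiest to see from its results. The sketch below uses Python's `re` module (an assumption; the text above is written with Perl in mind) and two made-up inputs. In both cases the quantifier first takes everything it can, then gives characters back until the rest of the pattern can match.

```python
import re

# \d+ first grabs all five digits, which leaves nothing for \d\d.
# The engine backs up one digit at a time until \d\d can match.
split = re.match(r"(\d+)(\d\d)", "12345").groups()

# .* first runs to the end of the string, finds no "c" after it,
# and backs up until the remaining "c" in the pattern can match.
found = re.search(r"a.*c", "abcabd").group(0)

print(split)   # ('123', '45')
print(found)   # abc
```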
See also:
City state zip:
  \s*(.*)\s*,\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)\s*
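As a rough usage sketch, the city / state / zip pattern can be wrapped in a small helper. This assumes Python's `re` module and writes the quantifiers with single braces ({2}, {5}, {4}). The helper name `parse_address` and the sample addresses are hypothetical.

```python
import re

# Groups: 1 = city, 2 = two-letter state, 3 = zip (with optional +4), 4 = the +4 part.
CITY_STATE_ZIP = re.compile(r"\s*(.*)\s*,\s*([A-Z]{2})\s+(\d{5}(\-\d{4})?)\s*")

def parse_address(line):
    """Return (city, state, zip) or None if the line does not match."""
    m = CITY_STATE_ZIP.match(line)
    if m is None:
        return None
    city, state, zip_code, _plus4 = m.groups()
    return city, state, zip_code

print(parse_address("Springfield, IL 62704-1234"))
print(parse_address("San Jose, CA 95113"))
print(parse_address("no comma here"))
```

Note that the pattern is not anchored at the end, so anything after the zip code is ignored rather than rejected.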
HTML eMail with only an image in it: the following expression will match a message that contains one or more images and no text at all:
  <BODY[^>]*>(<[^>]+>|\n|\r)*<IMG[^>]+>(<[^>]+>|\n|\r)*</BODY>

HTML eMail with an image:
  <BODY[^>]*>(<[^>]+>|\n|\r|\s)*<IMG[^>]*src=['"]?cid:
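The two HTML eMail patterns can be tried against small sample bodies. This is only a sketch under assumptions: it uses Python's `re` module, adds case-insensitive matching (not stated in the text above) so that lowercase tags also match, and the three sample messages are invented for illustration.

```python
import re

# Body that contains only tags and line breaks around at least one <IMG>.
IMAGE_ONLY = re.compile(
    r"<BODY[^>]*>(<[^>]+>|\n|\r)*<IMG[^>]+>(<[^>]+>|\n|\r)*</BODY>",
    re.IGNORECASE,  # assumption: real mail uses mixed-case tags
)

# Body whose first non-tag content is an image embedded by content ID ("cid:").
EMBEDDED_IMAGE = re.compile(
    r"""<BODY[^>]*>(<[^>]+>|\n|\r|\s)*<IMG[^>]*src=['"]?cid:""",
    re.IGNORECASE,
)

image_only_msg = '<BODY>\n<IMG src="a.gif">\n</BODY>'
text_and_image_msg = '<BODY>Hello<IMG src="a.gif"></BODY>'
cid_msg = '<BODY>\n <P><IMG border=0 src="cid:image001.gif"></P></BODY>'

print(bool(IMAGE_ONLY.search(image_only_msg)))      # True
print(bool(IMAGE_ONLY.search(text_and_image_msg)))  # False: "Hello" is text
print(bool(EMBEDDED_IMAGE.search(cid_msg)))         # True
print(bool(EMBEDDED_IMAGE.search(image_only_msg)))  # False: src is not "cid:"
```

The first pattern does not allow spaces between tags, only \n and \r, so a body indented with spaces will not match it. The second pattern adds \s to cover that case.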
IPv4 dotted IP address: Anything from
  /^\d+\.\d+\.\d+\.\d+$/
(which allows "448.90210.0.65535") to
  /^([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])\.([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])$/
which is impossible for normal humans to understand.
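One way to make the strict IPv4 pattern easier to read is to build it from a single octet sub-pattern. The sketch below assumes Python's `re` module and drops the Perl-style / delimiters. The names `LOOSE`, `OCTET`, `STRICT` and `is_ipv4` are made up for this example. The assembled pattern is character for character the same as the long one above.

```python
import re

# The loose form: four runs of digits separated by dots, no range check.
LOOSE = re.compile(r"^\d+\.\d+\.\d+\.\d+$")

# One octet, 0 to 255:
#   [1-9]?\d   0-99 (no leading zero on two-digit values)
#   1\d\d      100-199
#   2[0-4]\d   200-249
#   25[0-5]    250-255
OCTET = r"([1-9]?\d|1\d\d|2[0-4]\d|25[0-5])"

# Four octets joined by literal dots, anchored at both ends.
STRICT = re.compile(r"^" + r"\.".join([OCTET] * 4) + r"$")

def is_ipv4(text):
    """True if text is a dotted quad with every octet in 0..255."""
    return STRICT.match(text) is not None

print(bool(LOOSE.match("448.90210.0.65535")))  # True: the loose form is fooled
print(is_ipv4("448.90210.0.65535"))            # False
print(is_ipv4("192.168.0.1"))                  # True
print(is_ipv4("256.1.1.1"))                    # False
```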
file: /Techref/language/regxs.htm, 6KB, updated: 2018/11/17 21:08