nas probe: information on using and testing Regex/Regular Expressions

Document ID : KB000033958
Last Modified Date : 14/02/2018

Some customers report getting unexpected or inconsistent results when using the internal nas regex tester option or external (online) regex testers for use in parsing alarm messages.

A quick and easy way to test a regular expression for filtering on an alarm is to send a test alarm. To do so, open the nas GUI, choose the Status tab, right-click in the alarm window, and select Send a Test Alarm.

Input the exact alarm message you would like to test.

Tip: enter PLS IGNORE or TEST in the source field so that your operators know it is just a test message. If there are thousands of alarms in the alarm subconsole window, a distinctive source value also makes the test alarm easy to find with a 'source' filter.

Once the alarm shows up in the console, click the Alarm Filter icon, enter your regular expression into the alarm subconsole message filter, and click the Apply button. If the alarm is still displayed (you get a 'hit', so to speak), then your filter is working.

If you do not see your alarm displayed then the filter is wrong and has to be adjusted.

Additional Links/Information:
See the Infrastructure Manager Help doc, starting on page 304 for help and more details on using regular expressions.

Regular expressions

Regular expressions provide a mechanism to select specific strings from a set of character strings. The following text is extracted from the perlre man page, which should be consulted for a more in-depth understanding of regular expressions.

The patterns used in pattern matching are regular expressions such as those supplied in the Version 8 regexp routines. (In fact, the routines are derived (distantly) from Henry Spencer's freely redistributable reimplementation of the V8 routines.) See the section on Version 8 Regular Expressions for details. In particular the following metacharacters have their standard egrep-ish meanings:

\ Quote the next metacharacter
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation
( ) Grouping
[ ] Character class
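
As a rough illustration of the metacharacters above, the following Python sketch runs one pattern per metacharacter over a few sample messages. The messages, the patterns, and the helper name matching_indexes are invented for this example. Python's re module stands in for the nas regex engine and may behave differently in places, so sending a test alarm remains the way to confirm a filter in nas.

```python
import re

# Hypothetical alarm messages, invented for this illustration.
messages = [
    "CPU usage is 95% on host01",
    "Disk C: is 91% full",
    "host01 CPU usage back to normal",
    "Ping to 10.0.0.1 failed",
]

# One hypothetical pattern per metacharacter in the list above.
patterns = [
    r"^CPU",         # ^  beginning of the line
    r"normal$",      # $  end of the line
    r"9.%",          # .  any single character
    r"CPU|Disk",     # |  alternation
    r"host0[1-3]",   # [] character class
    r"10\.0\.0\.1",  # \  quotes the dot so it matches a literal "."
]


def matching_indexes(pattern, texts):
    """Return the indexes of the texts in which the pattern is found."""
    return [i for i, text in enumerate(texts) if re.search(pattern, text)]


for pattern in patterns:
    print(pattern, "->", matching_indexes(pattern, messages))
```

Note that re.search looks for the pattern anywhere in the message; whether the nas filter searches or requires a full match is not stated here, so treat that as an assumption of the sketch.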

By default, the "^" character is guaranteed to match only at the beginning of the string, the "$" character only at the end (or before the newline at the end) and Perl does certain optimizations with the assumption that the string contains only one line.

Embedded newlines will not be matched by "^" or "$". You may, however, wish to treat a string as a multi-line buffer, such that the "^" will match after any newline within the string, and "$" will match before any newline. At the cost of a little more overhead, you can do this by using the /m modifier on the pattern match operator. (Older programs did this by setting $*, but this practice is deprecated in Perl 5.) To facilitate multi-line substitutions, the "." character never matches a newline unless you use the /s modifier, which tells Perl to pretend the string is a single line, even if it isn't. The /s modifier also overrides the setting of $*, in case you have some (badly behaved) older code that sets it in another module.
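
The effect of the /m and /s modifiers can be sketched in Python, where re.MULTILINE and re.DOTALL play the corresponding roles. The two-line text below is invented for this example, and whether the nas filter field accepts such modifiers at all is not covered by this article, so this only illustrates the Perl semantics described above.

```python
import re

# Hypothetical two-line text, invented for this illustration.
text = "first line\nsecond line\n"

# Default: ^ matches only at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# re.MULTILINE (like /m): ^ also matches right after each embedded newline.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Default: $ matches at the end, or before the newline at the end.
default_ends = re.findall(r"line$", text)

# re.MULTILINE (like /m): $ also matches before each embedded newline.
multiline_ends = re.findall(r"line$", text, re.MULTILINE)

# Default: . does not match a newline, so the match stays on the first line.
default_dot = re.search(r"first.*line", text).group()

# re.DOTALL (like /s): . matches newlines too, so the match spans both lines.
dotall_dot = re.search(r"first.*line", text, re.DOTALL).group()

print(default_starts, multiline_starts)
print(default_ends, multiline_ends)
print(repr(default_dot), repr(dotall_dot))
```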

The following standard quantifiers are recognized:

* Match 0 or more times
+ Match 1 or more times
? Match 1 or 0 times
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match at least n but not more than m times

(If a curly bracket occurs in any other context, it is treated as a regular character.) The "*" modifier is equivalent to {0,}, the "+" modifier to {1,}, and the "?" modifier to {0,1}. n and m are limited to integral values less than 65536.

By default, a quantified subpattern is "greedy": it matches as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match. To match the minimum number of times possible instead, follow the quantifier with a "?". Note that the meanings don't change, just the "greediness":

*? Match 0 or more times
+? Match 1 or more times
?? Match 0 or 1 time
{n}? Match exactly n times
{n,}? Match at least n times
{n,m}? Match at least n but not more than m times
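
The difference between greedy and minimal quantifiers is easiest to see side by side. The sketch below uses Python's re module, which follows the same convention for these quantifiers; the sample strings are invented for this example.

```python
import re

# Hypothetical message, invented for this illustration.
message = "<host01> <host02> down"

# Greedy: .+ takes as much as it can, so the match runs to the LAST ">".
greedy = re.search(r"<.+>", message).group()

# Minimal: .+? takes as little as it can, so the match stops at the FIRST ">".
minimal = re.search(r"<.+?>", message).group()

# The same applies to counted quantifiers.
bounded_greedy = re.search(r"\d{2,3}", "code 12345").group()
bounded_minimal = re.search(r"\d{2,3}?", "code 12345").group()

print(greedy)           # <host01> <host02>
print(minimal)          # <host01>
print(bounded_greedy)   # 123
print(bounded_minimal)  # 12
```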

Since patterns are processed as double quoted strings, the following also work:

\t tab
\n newline
\r return
\f form feed
\a alarm (bell)
\e escape (think troff)
\033 octal char (think of a PDP-11)
\x1B hex char
\c[ control char
\l lowercase next char (think vi)
\u uppercase next char (think vi)
\L lowercase till \E (think vi)
\U uppercase till \E (think vi)
\E end case modification (think vi)
\Q quote regexp metacharacters till \E
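
Quoting matters most when an alarm message itself contains metacharacters. The following Python sketch shows the idea; note that Python's re module does not support \Q...\E or the case-modification escapes, so re.escape is used here as the nearest equivalent, which is a substitution made for this example. The messages are invented.

```python
import re

# Hypothetical message containing regex metacharacters.
message = "Disk full (threshold 90%) [critical]"

# Unquoted, "[critical]" is a character class: it matches ONE character
# out of c, r, i, t, a, l, so the first hit is the "i" in "Disk".
unescaped = re.search("[critical]", message).group()

# re.escape quotes every metacharacter (the role \Q...\E plays in Perl),
# so the brackets are matched literally.
escaped = re.search(re.escape("[critical]"), message).group()

# Hex and octal escapes both denote the escape character (code 27).
colored = "\x1b[31mALERT"
hex_hit = re.search(r"\x1B\[31m", colored) is not None
octal_hit = re.search(r"\033\[31m", colored) is not None

# \t matches a tab character.
tab_hit = re.search(r"key\tvalue", "key\tvalue") is not None

print(unescaped, escaped, hex_hit, octal_hit, tab_hit)
```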

In addition, Perl defines the following:

\w Match a "word" character (alphanumeric plus "_")
\W Match a non-word character
\s Match a whitespace character
\S Match a non-whitespace character
\d Match a digit character
\D Match a non-digit character
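
A short Python sketch of these character classes follows. The sample messages are invented for this example, and Python's re module is used as a stand-in; in Python 3 these classes are Unicode-aware by default, which makes no difference for the ASCII text shown here.

```python
import re

# Hypothetical alarm message, invented for this illustration.
message = "Queue depth 1024 on host_01 exceeded"

digits = re.findall(r"\d+", message)          # runs of digit characters
words = re.findall(r"\w+", message)           # runs of word characters
fields = re.split(r"\s+", message)            # split on runs of whitespace
first_word_char = re.search(r"\w", message).group()  # \w alone is ONE character

# The classes can also be used inside a bracketed character class.
tokens = re.findall(r"[\w.-]+", "srv-01.example.com is down")

print(digits)
print(words)
print(fields)
print(first_word_char)
print(tokens)
```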

Perl defines a consistent extension syntax for regular expressions. The syntax is a pair of parentheses with a question mark as the first thing within the parentheses (this was a syntax error in older versions of Perl). The character after the question mark gives the function of the extension.
Several extensions are already supported:
(?#text) A comment. The text is ignored. If the /x switch is used to enable whitespace formatting, a simple # will suffice.
(?:regexp) This groups things like "()" but doesn't make backreferences like "()" does. So split(/\b(?:a|b|c)\b/) is like split(/\b(a|b|c)\b/) but doesn't spit out extra fields.
(?=regexp) A zero-width positive lookahead assertion. For example, /\w+(?=\t)/ matches a word followed by a tab, without including the tab in $&.
(?!regexp) A zero-width negative lookahead assertion. For example, /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar". Note, however, that lookahead and lookbehind are NOT the same thing. You cannot use this for lookbehind: /(?!foo)bar/ will not find an occurrence of "bar" that is preceded by something which is not "foo". That's because the (?!foo) is just saying that the next thing cannot be "foo", and it's not, it's a "bar", so "foobar" will match. You would have to do something like /(?!foo)...bar/ for that. We say "like" because there's the case of your "bar" not having three characters before it.
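
The grouping and lookahead extensions can be tried out with the Python sketch below, which supports the same (?:...), (?=...) and (?!...) syntax. The sample strings are invented for this example. The last two checks use the three-dot form /(?!foo)...bar/, which is the workaround the perlre man page gives for the lookbehind case.

```python
import re

# (?:...) groups without capturing, so split() returns no extra fields.
plain_fields = re.split(r"\b(?:a|b|c)\b", "x a y b z")
extra_fields = re.split(r"\b(a|b|c)\b", "x a y b z")

# (?=...) positive lookahead: the tab must follow but is not part of the match.
word_before_tab = re.search(r"\w+(?=\t)", "name\tvalue").group()

# (?!...) negative lookahead: only the "foo" NOT followed by "bar" matches.
foo_positions = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# Pitfall: (?!foo)bar still matches inside "foobar", because at the point
# where "bar" starts, the next characters are "bar", not "foo".
pitfall_matches = re.search(r"(?!foo)bar", "foobar") is not None

# Workaround sketch: check the three characters in front of "bar".
rejects_foobar = re.search(r"(?!foo)...bar", "foobar") is None
accepts_bazbar = re.search(r"(?!foo)...bar", "bazbar") is not None

print(plain_fields, extra_fields)
print(word_before_tab, foo_positions)
print(pitfall_matches, rejects_foobar, accepts_bazbar)
```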
Note that \w matches a single alphanumeric character, not a whole word. To match a word you'd need to say \w+. You may use \w, \W, \s, \S, \d and \D within character classes (though not as either end of a range).
/.*substring.*/ matches "substring" anywhere in the complete string (beginning to end).
/^at start.*/ matches the text "at start" beginning at the first character position, followed by everything after it.
/^\s+$/ matches a line consisting only of whitespace (one or more characters) up to the end of the line (a blank line).
/^(\s+$|\*+\s+$)/ matches a blank line, or a line of one or more * followed by one or more whitespace characters up to the end of the line.
/^(?!193.71.55).*index.htm.*/ matches all lines that contain index.htm and do not start with 193.71.55. (The unescaped dots match any character; write \. to match a literal dot.)
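
To see how these sample patterns behave, the following Python sketch applies each one to a handful of lines. The lines and the helper name hits are invented for this example, and Python's re module is only a stand-in for the nas engine, so a test alarm in the nas GUI is still the check that counts.

```python
import re

# Hypothetical input lines, invented for this illustration.
lines = [
    "error: substring found here",
    "at start of the message",
    "   ",
    "*** ",
    "10.0.0.5 GET /index.htm",
    "193.71.55.10 GET /index.htm",
]


def hits(pattern, texts):
    """Return the texts in which the pattern is found."""
    return [text for text in texts if re.search(pattern, text)]


print(hits(r".*substring.*", lines))
print(hits(r"^at start.*", lines))
print(hits(r"^\s+$", lines))
print(hits(r"^(\s+$|\*+\s+$)", lines))
print(hits(r"^(?!193.71.55).*index.htm.*", lines))

# \s+ needs at least one whitespace character, so a truly empty
# string is not matched by the "blank line" pattern.
empty_is_blank = re.search(r"^\s+$", "") is not None
print(empty_is_blank)
```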