Pattern Matching
I found pattern matching in Bash extremely confusing, partly because there’s
- globbing (described by man glob.7)
- regexing (described by man regex.7), used in [[ $X =~ regex ]] and in grep, sed, awk… with various dialects.
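To make the difference concrete, here is a minimal sketch that tests the same string once with a glob and once with a regex. The variable names and the sample filename are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sample value, purely for illustration.
file="report-2024.txt"

# Glob: with == inside [[ ]], an unquoted right-hand side is a glob pattern.
# Here * means "any string" and the dot is a literal dot.
if [[ $file == report-*.txt ]]; then
  glob_result="glob matched"
else
  glob_result="glob did not match"
fi

# Regex: =~ takes a POSIX extended regular expression.
# Here + means "one or more of the preceding item" and the dot must be escaped.
if [[ $file =~ ^report-[0-9]+\.txt$ ]]; then
  regex_result="regex matched"
else
  regex_result="regex did not match"
fi

echo "$glob_result"
echo "$regex_result"
```

Both tests succeed for this input, but the same characters mean different things in each notation: * is a complete wildcard in a glob, whereas in a regex it only repeats whatever precedes it.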
Furthermore, the pattern-list notation described in the manual, as in ?(pattern-list),
doesn’t work by default, but requires
shopt -s extglob
to be set at the top of the script.
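As a rough sketch of what that looks like in practice, the function below uses ?(pattern-list) in a case statement. The function name and the test words are invented for this example. A case statement is used rather than [[ ... == ... ]] because, as far as I can tell, recent Bash versions treat the right-hand side of == inside [[ ]] as if extglob were already enabled.

```bash
#!/usr/bin/env bash
# extglob has to be enabled before the lines using ?(...) are parsed,
# which is why it sits above the function definition.
shopt -s extglob

# Hypothetical helper: succeeds for "color" or "colour".
# ?(u) matches zero or one occurrence of "u".
is_colour_word() {
  case $1 in
    colo?(u)r) return 0 ;;
    *)         return 1 ;;
  esac
}

is_colour_word "color"   && r1=yes || r1=no
is_colour_word "colour"  && r2=yes || r2=no
is_colour_word "colouur" && r3=yes || r3=no

echo "$r1 $r2 $r3"   # prints: yes yes no
```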
Back-references and Subexpressions
The above links to a section in the grep manual for which I’ve not managed to find any working examples.
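For what it’s worth, one small back-reference example that should work with grep’s basic regular expressions is finding lines containing a doubled character. The sample words are made up, and this is only a sketch rather than anything taken from the grep manual.

```bash
#!/usr/bin/env bash
# \(.\) captures any single character as subexpression 1;
# \1 then requires that same character to appear again immediately,
# so only lines with a doubled character are printed.
doubled=$(printf '%s\n' hello world | grep '\(.\)\1')

echo "$doubled"   # prints: hello
```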
A similar section appears in the sed manual, and sed seems to be the better way of extracting subexpressions from text in files.
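A minimal sketch of extracting a subexpression from a file with sed might look like the following. The temporary file and its name=value contents are invented for the example, and -E (extended regexes) is assumed to be available, as it is in GNU and BSD sed.

```bash
#!/usr/bin/env bash
# Hypothetical input file, created only for this sketch.
tmpfile=$(mktemp)
printf '%s\n' 'name=alice' 'some other line' 'name=bob' > "$tmpfile"

# -n  suppresses sed's default printing of every line
# -E  switches to extended regular expressions, so ( ) need no backslashes
# \1  is replaced by whatever the first parenthesised subexpression matched
# p   prints the line only if the substitution succeeded
names=$(sed -nE 's/^name=(.*)$/\1/p' "$tmpfile")

rm -f "$tmpfile"
echo "$names"
```

Only the lines that match the pattern produce output, reduced to the captured part.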
When wanting subexpressions from text stored in a Bash variable, using the array ${BASH_REMATCH[@]}
automagically created whenever [[ $X =~ regex ]]
is used strikes me as the easiest way to do this.
BASH_REMATCH
When you use the =~
operator to match a regular expression against a string, any captured groups (i.e., parenthesized subexpressions) are stored in the ${BASH_REMATCH[@]} array starting from index 1. The entire matched text is stored at index 0.
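The indexing described above can be sketched as follows. The version string and variable names are hypothetical, chosen only to show which capture group lands at which index.

```bash
#!/usr/bin/env bash
# Hypothetical input string for illustration.
X="bash-5.2.21"

# Storing the regex in a variable sidesteps quoting issues inside [[ ]].
# Three capture groups: a name, a major number and a minor number.
re='^([a-z]+)-([0-9]+)\.([0-9]+)'

if [[ $X =~ $re ]]; then
  whole=${BASH_REMATCH[0]}   # entire matched text
  name=${BASH_REMATCH[1]}    # first capture group
  major=${BASH_REMATCH[2]}   # second capture group
  minor=${BASH_REMATCH[3]}   # third capture group
  echo "$whole $name $major $minor"   # prints: bash-5.2 bash 5 2
fi
```

Note that index 0 holds only the part of the string the regex actually matched (bash-5.2 here), not the whole of $X, because the pattern is not anchored at the end.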