Frontier Software

Trimming

After trying to figure things out myself, I worked through the examples in the Pure Bash Bible where I encountered lots of syntax I’d never seen before.

#!/bin/bash

trim_string() {
    # Usage: trim_string "   example   string    "
    : "${1#"${1%%[![:space:]]*}"}"
    : "${_%"${_##*[![:space:]]}"}"
    printf '%s\n' "$_"
}


trim_string "    Hello,  World    "

First off, there’s the leading colon. From an answer to the question “What is the purpose of the : (colon) GNU Bash builtin?”:

A useful application for : is if you’re only interested in using parameter expansions for their side-effects rather than actually passing their result to a command.
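As a minimal sketch of that idea, the snippet below uses : with the ${var:=word} expansion, whose side effect is to assign a default. The variable name editor_choice is a hypothetical example and does not come from the original scripts.

```shell
#!/usr/bin/env bash

# editor_choice is a made-up name used only for this illustration.
unset editor_choice

# ${var:=word} assigns word to var when var is unset or empty.
# The : builtin ignores its arguments, so only the assignment takes effect.
: "${editor_choice:=vi}"

printf '%s\n' "$editor_choice"
```

Without the leading colon, bash would try to run the expanded value (vi) as a command.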

Next there is the underscore, which I’d never encountered before. I’d always held the result of a parameter expansion in a variable to pass on to the next step. It turns out $_ does that automagically: it holds the last argument of the previous command, which here is the expanded string handed to the colon.

At shell startup, set to the pathname used to invoke the shell or shell script being executed as passed in the environment or argument list. Subsequently, expands to the last argument to the previous simple command executed in the foreground, after expansion. Also set to the full pathname used to invoke each command executed and placed in the environment exported to that command. — Bash Variables
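To see how the two lines of trim_string fit together, here is a sketch that unpacks them into separate steps. The intermediate names (leading, no_leading, trailing, trimmed) are assumptions added for readability and are not part of the Pure Bash Bible version.

```shell
#!/usr/bin/env bash

str='    Hello,  World    '

# Step 1: drop everything from the first non-space character onwards,
# which leaves only the leading whitespace.
leading=${str%%[![:space:]]*}

# Step 2: remove exactly that text from the front. The : builtin does
# nothing with its argument, but $_ now holds it.
: "${str#"$leading"}"
no_leading=$_

# Step 3: drop everything up to the last non-space character,
# which leaves only the trailing whitespace.
trailing=${no_leading##*[![:space:]]}

# Step 4: remove exactly that text from the back.
: "${no_leading%"$trailing"}"
trimmed=$_

printf '[%s]\n' "$trimmed"
```

The brackets in the output make it easy to check that the padding is gone while the two spaces inside the string are kept.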

The symbol for removing from the front is # and from the back is %. A handy mnemonic is that the number sign # usually precedes numbers, while the percent sign % usually follows them.

As shown below, these usually need to be doubled to remove the longest match (possibly several whitespaces for my initial examples).
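The difference between the single and doubled operators is easiest to see with a pattern that can match more than once. This sketch uses a hypothetical filename, archive.tar.gz, chosen only because it contains two dots.

```shell
#!/usr/bin/env bash

# Hypothetical sample value with two dots in it.
file='archive.tar.gz'

printf '%s\n' "${file#*.}"    # shortest match removed from the front: tar.gz
printf '%s\n' "${file##*.}"   # longest match removed from the front:  gz
printf '%s\n' "${file%.*}"    # shortest match removed from the back:  archive.tar
printf '%s\n' "${file%%.*}"   # longest match removed from the back:   archive
```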

We need Bash’s composite patterns which are only available if

shopt -s extglob

is set. Composite patterns allow +(pattern-list) as used below to match possibly more than one space.
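A short sketch of why the composite pattern matters: doubling the operator is not enough on its own, because [[:blank:]] only ever stands for a single character. The sample string is an assumption made up for this illustration.

```shell
#!/usr/bin/env bash

# Enable extglob up front, before any code that relies on +(...).
shopt -s extglob

# Hypothetical sample value with three leading spaces.
str='   padded'

# [[:blank:]] matches exactly one character, so even ## removes one space.
one_removed=${str##[[:blank:]]}

# +([[:blank:]]) means "one or more blanks", so ## can take them all.
all_removed=${str##+([[:blank:]])}

printf '[%s]\n' "$one_removed"
printf '[%s]\n' "$all_removed"
```

The first line printed still has two leading spaces inside the brackets; the second has none.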

Trimming leading whitespace

The easiest is probably ${parameter##pattern} which removes the longest matching pattern. I only discovered this through trial and error, battling with my earlier attempts recorded below to use just a single #, which only removed the first leading space.

#!/usr/bin/bash

shopt -s extglob

ExampleGroup 'ways to remove leading, or left-side, whitespace'

  Example 'Easiest is ${str##+([[:blank:]])}'
    str='  Hello World'
    When call echo "${str##+([[:blank:]])}"
    The output should eq "Hello World"
  End

  Example 'Works correctly if there are no leading whitespaces'
    str='Hello World'
    When call echo "${str##+([[:blank:]])}"
    The output should eq "Hello World"
  End

  Example 'Don`t use ${str/+([[:blank:]])/} because if there is no leading whitespace, it will remove first in string'
    str='  Hello World'
    When call echo "${str/+([[:blank:]])/}"
    The output should eq "Hello World"
  End

  Example 'Here ${str/+([[:blank:]])/} returns HelloWorld, which is not what we want'
    str='Hello World'
    When call echo "${str/+([[:blank:]])/}"
    The output should eq "HelloWorld"
  End

  Example 'This can be fixed by using /# which `anchors` pattern to start, similar to ^ in regex'
    str='  Hello World'
    When call echo "${str/#+([[:blank:]])/}"
    The output should eq "Hello World"
  End

  Example '${str#[[:blank:]]} works if only one leading space'
    str=' Hello World'
    When call echo "${str#[[:blank:]]}"
    The output should eq "Hello World"
  End

  Example 'but fails if there are several'
    str='  Hello World'
    When call echo "${str#[[:blank:]]}"
    The output should eq " Hello World"
  End

End

Trimming trailing whitespace

ExampleGroup 'ways to remove trailing, or right-side, whitespace'

shopt -s extglob

  Example 'Easiest is ${str%%+([[:blank:]])}'
    str='Hello World  '
    When call echo "${str%%+([[:blank:]])}"
    The output should eq "Hello World"
  End

  Example 'Works correctly if there are no trailing whitespaces'
    str='Hello World'
    When call echo "${str%%+([[:blank:]])}"
    The output should eq "Hello World"
  End

  Example 'Using /% which `anchors` pattern to end, similar to $ in regex'
    str='Hello World  '
    When call echo "${str/%+([[:blank:]])/}"
    The output should eq "Hello World"
  End

  Example '${str%[[:blank:]]} works if only one trailing space'
    str='Hello World '
    When call echo "${str%[[:blank:]]}"
    The output should eq "Hello World"
  End

  Example 'but fails if there are several'
    str='Hello World  '
    When call echo "${str%[[:blank:]]}"
    The output should eq "Hello World "
  End

End
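One detail worth checking when choosing a character class: [[:blank:]] covers only space and tab, while [[:space:]] (used in trim_string at the top) also covers newlines and other whitespace. The sketch below compares the two on a hypothetical value that ends in a space, a tab and a newline.

```shell
#!/usr/bin/env bash

shopt -s extglob

# Hypothetical sample: 11 visible characters, then space, tab, newline.
str=$'Hello World \t\n'

# The value ends in a newline, which is not a blank, so no suffix
# matches +([[:blank:]]) and nothing is removed.
blank_trimmed=${str%%+([[:blank:]])}

# [[:space:]] includes the newline, so all three characters go.
space_trimmed=${str%%+([[:space:]])}

printf 'blank: %d characters\n' "${#blank_trimmed}"
printf 'space: %d characters\n' "${#space_trimmed}"
```

This prints 14 for the [[:blank:]] version and 11 for the [[:space:]] version.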

trim function

This two-line function took me a while to write. I had to figure out the need for escaping double quotes in the eval string (something that keeps tripping me up), and that ${!1} is used to retrieve the contents of a variable passed by name.

#!/usr/bin/bash

shopt -s extglob

function trim {
  eval "$1=\"${!1##+([[:blank:]])}\""
  eval "$1=\"${!1%%+([[:blank:]])}\""
}

ExampleGroup 'trimming function'

  Example 'trim "  Hello World"'
    declare -A arr
    arr["\"name\""]='  Hello World'
    trim 'arr["\"name\""]'
    When call echo "${arr["\"name\""]}"
    The output should eq 'Hello World'
  End

  Example 'trim "Hello World  "'
    declare -A arr
    arr["\"name\""]='Hello World  '
    trim 'arr["\"name\""]'
    When call echo "${arr["\"name\""]}"
    The output should eq 'Hello World'
  End

  Example 'trim "  Hello World  "'
    declare -A arr
    arr["\"name\""]='  Hello World  '
    trim 'arr["\"name\""]'
    When call echo "${arr["\"name\""]}"
    The output should eq 'Hello World'
  End

  Example 'trim "Hello World"'
    declare -A arr
    arr["\"name\""]='Hello World'
    trim 'arr["\"name\""]'
    When call echo "${arr["\"name\""]}"
    The output should eq 'Hello World'
  End


End
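For comparison, here is a sketch of the same idea without eval, using the printf -v builtin option to assign to a variable whose name is passed in. The function name trim_var and the variable greeting are hypothetical. This version is only exercised here with a plain variable name, not with the associative-array element used in the tests above, and it would misbehave if the caller's variable were itself called value.

```shell
#!/usr/bin/env bash

shopt -s extglob

# trim_var NAME: strip leading and trailing blanks from the variable
# called NAME. trim_var is a made-up name for this sketch.
trim_var() {
  local value=${!1}                # read the variable indirectly
  value=${value##+([[:blank:]])}   # drop leading blanks
  value=${value%%+([[:blank:]])}   # drop trailing blanks
  printf -v "$1" '%s' "$value"     # write back without eval
}

greeting='  Hello World  '
trim_var greeting
printf '[%s]\n' "$greeting"
```

Because the value is never re-parsed as shell code, quotes or dollar signs inside the string need no escaping.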

Finding directory basename or dirname

Another use for bash’s # and % string operators is as a more efficient replacement for the pathname utilities basename and dirname, since no external program has to be started.

ExampleGroup 'using # or % rather than basename and dirname'


  Example '${MYFILENAME##*/} duplicates basename'
    MYFILENAME='/home/digby/myfile.txt'
    When call echo "${MYFILENAME##*/}"
    The output should eq "myfile.txt"
  End

  Example 'basename duplicates ${MYFILENAME##*/}'
    MYFILENAME='/home/digby/myfile.txt'
    When call echo "$(basename "$MYFILENAME")"
    The output should eq "myfile.txt"
  End

  Example 'using basename to get filename without extension'
    MYFILENAME='/home/digby/myfile.txt'
    When call echo "$(basename "$MYFILENAME" .txt)"
    The output should eq "myfile"
  End

  Example 'using ${FILE%.*} to get filename without extension'
    MYFILENAME='/home/digby/myfile.txt'
    FILE="${MYFILENAME##*/}"
    When call echo "${FILE%.*}"
    The output should eq "myfile"
  End

  Example '${MYFILENAME%/*} duplicates dirname'
    MYFILENAME='/home/digby/myfile.txt'
    When call echo "${MYFILENAME%/*}"
    The output should eq "/home/digby"
  End

  Example 'dirname duplicates ${MYFILENAME%/*}'
    MYFILENAME='/home/digby/myfile.txt'
    When call echo "$(dirname "$MYFILENAME")"
    The output should eq "/home/digby"
  End 

End
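The expansions match basename and dirname for a full path like the one above, but they are not identical in every case. The sketch below compares them on a few hypothetical edge cases: a bare filename, a path with a trailing slash, and a file directly under the root.

```shell
#!/usr/bin/env bash

# Hypothetical sample paths chosen to show where the results diverge.
for path in '/home/digby/myfile.txt' 'myfile.txt' '/home/digby/' '/myfile.txt'; do
  printf 'path=[%s]\n' "$path"
  printf '  expansion dir=[%s]  dirname=[%s]\n' "${path%/*}" "$(dirname "$path")"
  printf '  expansion base=[%s]  basename=[%s]\n' "${path##*/}" "$(basename "$path")"
done
```

For a bare filename, dirname reports . while the expansion returns the filename unchanged. For a trailing slash, basename still reports digby while the expansion is empty. For a file under the root, dirname reports / while the expansion is empty.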