Core Utilities

grep me no patterns and I’ll tell you no lines. — fortune cooky

As explained in The Unix Programming Environment by Brian Kernighan and Rob Pike, sed can do just about everything grep can.

sed -n '/pattern/p' files

is the same as

grep -h pattern files

Why do we have both sed and grep? After all, grep is just a simple special case of sed. Part of the reason is history — grep came well before sed. But grep survives, indeed thrives, because for the job that they both do, it is significantly easier to use than sed is.
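
The equivalence quoted above is easy to check on a throwaway file. This is a minimal sketch, assuming a POSIX shell with sed, grep and mktemp available; the sample lines and the pattern `an` are made up for illustration.

```shell
# Hypothetical sample input, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'banana' 'cherry' 'mango' 'plum' > "$sample"

# sed: -n suppresses automatic printing, p prints only the matching lines.
sed_out=$(sed -n '/an/p' "$sample")

# grep: -h suppresses file name prefixes (this only matters with several files).
grep_out=$(grep -h 'an' "$sample")

rm -f "$sample"

printf '%s\n' "$sed_out"
```

Both variables end up holding the same two lines, which is the sense in which grep is a special case of sed.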

sed

Manual

Four types of sed scripts

1. Multiple Edits to the Same File

2. Making Changes Across a Set of Files

3. Extracting Contents of a File

4. Edits To Go
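
As a rough illustration of the first and third kinds of script in the list above, here is one plausible reading of each. The file contents and patterns are invented for the example: several edits are applied in one pass with repeated `-e` options, and part of a file is extracted by combining `-n` with an address range and `p`.

```shell
# Hypothetical input file.
notes=$(mktemp)
printf '%s\n' 'colour: red' 'BEGIN' 'keep me' 'END' 'centre: left' > "$notes"

# 1. Multiple edits to the same file: each -e adds one command to the script.
edited=$(sed -e 's/colour/color/' -e 's/centre/center/' "$notes")

# 3. Extracting contents of a file: print only the lines from BEGIN to END.
extracted=$(sed -n '/^BEGIN$/,/^END$/p' "$notes")

rm -f "$notes"

printf '%s\n' "$edited" "$extracted"
```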

Substitute [address]s/regexp/replacement/[flags]

The ‘s’ command (as in substitute) is probably the most important in ‘sed’ and has a lot of different options. The syntax of the ‘s’ command is ‘s/REGEXP/REPLACEMENT/FLAGS’.

The ‘/’ characters may be uniformly replaced by any other single character within any given ‘s’ command.
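
Choosing a different delimiter is mostly useful when the pattern or the replacement itself contains slashes, as a file path does. A small sketch, with a path invented for the example, showing that the two spellings give the same result:

```shell
path='/usr/local/bin/tool'   # hypothetical path

# With / as the delimiter, every slash in the pattern must be escaped.
with_slash=$(printf '%s\n' "$path" | sed 's/\/usr\/local/\/opt/')

# With | as the delimiter, the same substitution needs no escaping.
with_pipe=$(printf '%s\n' "$path" | sed 's|/usr/local|/opt|')

printf '%s\n' "$with_slash" "$with_pipe"
```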

awk

Manual

print

#!/usr/bin/bash

awk '{ print }' "$@"

BEGIN

#!/usr/bin/bash

awk 'BEGIN { print "NAME\tRATE\tHOURS"; print "" }
    { print }' "$@"

END

#!/usr/bin/bash

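awk '
# OFS is assigned in a per-record action, so BEGIN still sees the default
# single space: only the data rows and the AVERAGE line are tab-separated.
      { OFS="\t" }
BEGIN { print "NAME", "RATE", "HOURS"; print "" }
      { print $1, $2, $3 }
      { rate = rate + $2 }
      { hours = hours + $3 }
END   { print "AVERAGE", rate / NR, hours / NR }' "$@"

#!/usr/bin/bash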

ExampleGroup 'basic awk oneliner'

  Example 'print emp.data'
    When call bin/eg1.sh bin/emp.data
    The output should eq 'Beth	4.00	0
Dan	3.75	0
Kathy	4.00	10
Mark	5.00	20
Mary	5.50	22
Susie	4.25	18'
  End

  Example 'print a header using BEGIN'
    When call bin/eg2.sh bin/emp.data
    The output should eq 'NAME	RATE	HOURS

Beth	4.00	0
Dan	3.75	0
Kathy	4.00	10
Mark	5.00	20
Mary	5.50	22
Susie	4.25	18'
  End

  Example 'print average rates and hours using END'
    When call bin/eg3.sh bin/emp.data
    The output should eq 'NAME RATE HOURS

Beth	4.00	0
Dan	3.75	0
Kathy	4.00	10
Mark	5.00	20
Mary	5.50	22
Susie	4.25	18
AVERAGE	4.41667	11.6667'
  End

End
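
The third script assigns OFS in a per-record action, which is why its header comes out space-separated in the spec above while the rows use tabs. If a tab-separated header is wanted as well, one option is to move the assignment into BEGIN. The following is a variant sketch, not the script the spec tests; the sample data is inlined from the expected output above so that the snippet stands on its own.

```shell
# The three columns of emp.data as they appear in the spec, tab-separated.
emp=$(printf '%s\t%s\t%s\n' Beth 4.00 0 Dan 3.75 0 Kathy 4.00 10 \
                            Mark 5.00 20 Mary 5.50 22 Susie 4.25 18)

report=$(printf '%s\n' "$emp" | awk '
BEGIN { OFS = "\t"                        # set before any output is produced
        print "NAME", "RATE", "HOURS"; print "" }
      { print $1, $2, $3
        rate += $2; hours += $3 }         # running totals
END   { print "AVERAGE", rate / NR, hours / NR }')

printf '%s\n' "$report"
```

With this change the header line uses the same separator as every other line of the report.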

Built-in variables

ARGC
number of command-line arguments

ARGV[n]
array of command-line arguments

FILENAME
name of current input file

FNR
input record number in current file

FS
input field separator (default blank)

NF
number of fields in current input record

NR
input record number since beginning

OFMT
output format for numbers (default "%.6g")

OFS
output field separator (default blank)

ORS
output record separator (default newline)

RLENGTH
length of string matched by regular expression in match

RS
input record separator (default newline)

RSTART
beginning position of string matched by match

SUBSEP
separator for array subscripts of form [i,j...] (default "\034")
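
A few of these variables in action. The sketch below runs awk over two small temporary files whose contents are made up for the example. FILENAME is deliberately left out of the output because mktemp chooses unpredictable names.

```shell
a=$(mktemp); b=$(mktemp)
printf '%s\n' 'one two' 'three' > "$a"
printf '%s\n' 'four five six'   > "$b"

# NR keeps counting across files, FNR restarts at 1 for each file,
# NF is the field count of the current record, and $NF is its last field.
vars=$(awk '{ print NR, FNR, NF, $NF }' "$a" "$b")

rm -f "$a" "$b"

# FS controls how input is split; OFS is written between the arguments of print.
joined=$(printf 'a:b:c\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { print $1, $3 }')

printf '%s\n' "$vars" "$joined"
```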

Chapter 2 Awk book examples

countries.tsv
USSR	8649	275	Asia
Canada	3852	25	North America
China	3705	1032	Asia
USA	3615	237	North America
Brazil	3286	134	South America
India	1267	746	Asia
Mexico	762	78	North America
France	211	55	Europe
Japan	144	120	Asia
Germany	96	61	Europe
England	94	56	Europe

BEGIN and END example

#!/usr/bin/bash

awk 'BEGIN { FS = "\t"
             printf("%10s %6s %5s   %s\n\n", 
                   "COUNTRY", "AREA", "POP", "CONTINENT")
           }
    { printf("%10s %6d %5d   %s\n", $1, $2, $3, $4) 
      area = area + $2
      pop = pop + $3
    }
    END { printf("\n%10s %6d %5d\n", "TOTAL", area, pop) }' "$@"

#!/usr/bin/bash

ExampleGroup 'Chapter 2 Examples BEGIN and END'

  Example 'print countries with column headers and totals'
    When call bin/ch2_eg1.sh bin/countries.tsv
    The output should eq '   COUNTRY   AREA   POP   CONTINENT

      USSR   8649   275   Asia
    Canada   3852    25   North America
     China   3705  1032   Asia
       USA   3615   237   North America
    Brazil   3286   134   South America
     India   1267   746   Asia
    Mexico    762    78   North America
    France    211    55   Europe
     Japan    144   120   Asia
   Germany     96    61   Europe
   England     94    56   Europe

     TOTAL  25681  2819'
  End

End

String-matching Patterns

/regexpr/
Matches when the current input line contains a substring matched by regexpr. Shorthand for $0 ~ /regexpr/

expression ~ /regexpr/
Matches if the string value of expression contains a substring matched by regexpr.

expression !~ /regexpr/
Matches if the string value of expression does not contain a substring matched by regexpr.

#!/usr/bin/bash

ExampleGroup '/regexpr/'

  Example '$4 ~ /Asia/'
    When call awk '$4 ~ /Asia/' bin/countries.tsv
    The output should eq 'USSR	8649	275	Asia
China	3705	1032	Asia
India	1267	746	Asia
Japan	144	120	Asia'
  End

End
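
The spec above exercises only the `~` form. For completeness, here is a sketch of the other two forms, run against a three-line excerpt of countries.tsv that is inlined so the snippet stands on its own. Passing `-F'\t'` is an addition here, so that "North America" stays a single field.

```shell
# Three rows taken from countries.tsv, tab-separated.
countries=$(printf '%s\t%s\t%s\t%s\n' \
    Canada 3852 25 'North America' \
    China 3705 1032 Asia \
    France 211 55 Europe)

# /regexpr/ on its own is tested against the whole line ($0).
whole_line=$(printf '%s\n' "$countries" | awk -F'\t' '/Asia/ { print $1 }')

# !~ selects the lines whose fourth field does NOT match.
not_asia=$(printf '%s\n' "$countries" | awk -F'\t' '$4 !~ /Asia/ { print $1 }')

printf '%s\n' "$whole_line" "$not_asia"
```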

Range Patterns start, end

#!/usr/bin/bash

ExampleGroup 'Range Patterns'

  Example '/Canada/, /USA/'
    When call awk '/Canada/, /USA/' bin/countries.tsv
    The output should eq 'Canada	3852	25	North America
China	3705	1032	Asia
USA	3615	237	North America'
  End

End
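
Two details of range patterns are worth a quick sketch: the range includes both the start and the end line, and if the end pattern never matches, the range runs to the end of the input. The names below are four consecutive entries from the first column of countries.tsv, and /Africa/ is chosen precisely because it matches none of them.

```shell
names=$(printf '%s\n' Mexico France Japan Germany)

# Both the start line and the end line are part of the range.
closed=$(printf '%s\n' "$names" | awk '/France/, /Japan/')

# /Africa/ never matches, so the range stays open until the last line.
open_ended=$(printf '%s\n' "$names" | awk '/France/, /Africa/')

printf '%s\n' "$closed" "$open_ended"
```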

date

The primary reason I’m using bash rather than some programming language is to get easy access to GNU’s powerful date program. It automagically reads most text descriptions of dates and can then translate them to what I want, which is ISO 8601 as specified by schema.org.
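
A small sketch of that workflow, assuming GNU date: the `-d` option and the `%:z` format are GNU extensions and are not expected to work with BSD or macOS date. The event text is invented, and `-u` is used so that the result does not depend on the machine's time zone.

```shell
# Parse a free-form description and print it as an ISO 8601 timestamp.
# -u makes date read and print the time as UTC.
iso=$(date -u -d '15 March 2024 2:30pm' '+%Y-%m-%dT%H:%M:%S%:z')

printf '%s\n' "$iso"
```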

Getting dates right has been a constant source of bugs. It took me a while to figure out that, because my development machine is set to my local time zone, “Africa/Johannesburg”, and my server to “UTC”, joeblog.co.za was showing all events 2 hours early.
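
A two-hour gap of this kind can be reproduced by rendering one instant under two TZ settings. This is a sketch assuming GNU date. `UTC0` and `SAST-2` are POSIX-style TZ strings used here as stand-ins for the server and for Africa/Johannesburg, so that the example does not depend on the installed zone database, and the timestamp itself is made up.

```shell
instant='2024-03-15 12:30:00 +0000'   # hypothetical event time with an explicit offset
fmt='+%Y-%m-%dT%H:%M:%S%:z'           # ISO 8601 with a numeric offset

# The same instant, rendered for two different local time zones.
on_server=$(TZ=UTC0 date -d "$instant" "$fmt")
on_laptop=$(TZ=SAST-2 date -d "$instant" "$fmt")

printf '%s\n' "$on_server" "$on_laptop"
```

Keeping the offset in the stored string, as both results here do, is one way to make the two machines agree on which instant is meant.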

