awk
Manual
print
#!/usr/bin/bash
awk '{ print }' "$@"
BEGIN
#!/usr/bin/bash
awk 'BEGIN { print "NAME\tRATE\tHOURS"; print "" }
{ print }' "$@"
END
#!/usr/bin/bash
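# The header is printed with the default OFS (a space); OFS is then set once,
# so the records and the AVERAGE line are tab-separated.
awk 'BEGIN { print "NAME", "RATE", "HOURS"; print ""; OFS = "\t" }
{ print $1, $2, $3 }
{ rate = rate + $2 }
{ hours = hours + $3 }
END { print "AVERAGE", rate / NR, hours / NR }' "$@"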
#!/usr/bin/bash
ExampleGroup 'basic awk oneliner'
Example 'print emp.data'
When call bin/eg1.sh bin/emp.data
The output should eq 'Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18'
End
Example 'print a header using BEGIN'
When call bin/eg2.sh bin/emp.data
The output should eq 'NAME RATE HOURS
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18'
End
Example 'print average rates and hours using END'
When call bin/eg3.sh bin/emp.data
The output should eq 'NAME RATE HOURS
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
AVERAGE 4.41667 11.6667'
End
End
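The averages in the END example are written with `print`, so their format comes from OFMT ("%.6g"), and `rate / NR` divides by zero when the input is empty, which gawk reports as a fatal error. The following is a minimal sketch of both points, assuming a small inline sample rather than bin/emp.data; the names `data`, `avg_default`, `avg_fixed`, and `avg_empty` are made up for this illustration.

```shell
# Hypothetical three-record sample in the same layout as emp.data.
data='Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10'

# print formats the number with OFMT ("%.6g" by default).
avg_default=$(printf '%s\n' "$data" |
    awk '{ rate += $2 } END { if (NR > 0) print rate / NR }')

# printf gives explicit control over the number of decimals.
avg_fixed=$(printf '%s\n' "$data" |
    awk '{ rate += $2 } END { if (NR > 0) printf("%.2f\n", rate / NR) }')

# With no input, NR is 0; the guard skips the division and prints nothing.
avg_empty=$(awk '{ rate += $2 } END { if (NR > 0) print rate / NR }' < /dev/null)

printf 'default: %s\nfixed: %s\nempty: [%s]\n' "$avg_default" "$avg_fixed" "$avg_empty"
```

Guarding on `NR > 0` keeps the script usable on empty files without changing the output for non-empty ones.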
Built-in variables
- ARGC
- number of command-line arguments
- ARGV[n]
- array of command-line arguments
- FILENAME
- name of current input file
- FNR
- input record number in current file
- FS
- input field separator (default " ", which splits fields on runs of blanks and tabs)
- NF
- number of fields in current input record
- NR
- input record number since beginning
- OFMT
- output format for numbers (default "%.6g")
- OFS
- output field separator (default " ", a single space)
- ORS
- output record separator (default newline)
- RLENGTH
- length of the string matched by the match() function (-1 if there is no match)
- RS
- input record separator (default newline)
- RSTART
- starting position of the string matched by match() (0 if there is no match)
- SUBSEP
- separator for array subscripts of form [i,j...] (default "\034")
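A few of these variables are easiest to understand side by side. The sketch below assumes two throwaway files, `a.data` and `b.data`, created in a temporary directory purely for this demo; the variable `builtin_demo` is likewise just an illustrative name.

```shell
# Hypothetical sample data: two tiny input files created just for this demo.
tmpdir=$(mktemp -d)
printf 'Beth 4.00 0\nDan 3.75 0\n' > "$tmpdir/a.data"
printf 'Kathy 4.00 10\n' > "$tmpdir/b.data"

# NR keeps counting across files; FNR restarts at 1 for each new file.
# NF is the number of fields in the current record; OFS joins the
# comma-separated arguments of print.
builtin_demo=$(cd "$tmpdir" && awk 'BEGIN { OFS = ":" }
{ print FILENAME, NR, FNR, NF, $1 }' a.data b.data)

printf '%s\n' "$builtin_demo"
rm -r "$tmpdir"
```

The columns are FILENAME, NR, FNR, NF, and the first field, so the third line shows NR at 3 while FNR is back at 1.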
Chapter 2 Awk book examples
countries.tsv
USSR 8649 275 Asia
Canada 3852 25 North America
China 3705 1032 Asia
USA 3615 237 North America
Brazil 3286 134 South America
India 1267 746 Asia
Mexico 762 78 North America
France 211 55 Europe
Japan 144 120 Asia
Germany 96 61 Europe
England 94 56 Europe
BEGIN and END example
#!/usr/bin/bash
awk 'BEGIN {
    FS = "\t"
    printf("%10s %6s %5s %s\n\n",
           "COUNTRY", "AREA", "POP", "CONTINENT")
}
{
    printf("%10s %6d %5d %s\n", $1, $2, $3, $4)
    area = area + $2
    pop = pop + $3
}
END { printf("\n%10s %6d %5d\n", "TOTAL", area, pop) }' "$@"
#!/usr/bin/bash
ExampleGroup 'Chapter 2 Examples BEGIN and END'
Example 'print countries with column headers and totals'
When call bin/ch2_eg1.sh bin/countries.tsv
The output should eq ' COUNTRY AREA POP CONTINENT
USSR 8649 275 Asia
Canada 3852 25 North America
China 3705 1032 Asia
USA 3615 237 North America
Brazil 3286 134 South America
India 1267 746 Asia
Mexico 762 78 North America
France 211 55 Europe
Japan 144 120 Asia
Germany 96 61 Europe
England 94 56 Europe
TOTAL 25681 2819'
End
End
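The script sets FS to a tab because some continent names contain a space. As a rough illustration, assuming a single tab-separated row typed inline (the names `row`, `default_fs`, and `tab_fs` are made up here), the same record splits differently under the two separators:

```shell
# One hypothetical tab-separated record, as it would appear in countries.tsv.
row=$(printf 'Canada\t3852\t25\tNorth America')

# Default FS splits on any run of blanks or tabs, so "North America"
# becomes two fields and $4 is only "North".
default_fs=$(printf '%s\n' "$row" | awk '{ print NF ": " $4 }')

# With FS set to a tab, the continent stays in one field.
tab_fs=$(printf '%s\n' "$row" | awk 'BEGIN { FS = "\t" } { print NF ": " $4 }')

printf '%s\n%s\n' "$default_fs" "$tab_fs"
```

The first line printed is `5: North` and the second is `4: North America`.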
String-matching Patterns
- /regexpr/
- Matches when the current input line contains a substring matched by regexpr. Shorthand for $0 ~ /regexpr/
- expression ~ /regexpr/
- Matches if the string value of expression contains a substring matched by regexpr.
- expression !~ /regexpr/
- Matches if the string value of expression does not contain a substring matched by regexpr.
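The three forms can be compared on the same input. This is a small sketch, assuming a three-line inline sample instead of the full countries file; `sample`, `whole_line`, `field_match`, and `field_no_match` are illustrative names only.

```shell
# Hypothetical three-line excerpt in the layout of the countries data.
sample='USSR 8649 275 Asia
France 211 55 Europe
Japan 144 120 Asia'

# /regexpr/ tests the whole line ($0): lines containing "an" anywhere.
whole_line=$(printf '%s\n' "$sample" | awk '/an/ { print $1 }')

# expression ~ /regexpr/: only the first field is tested, anchored at its end.
field_match=$(printf '%s\n' "$sample" | awk '$1 ~ /an$/ { print $1 }')

# expression !~ /regexpr/: lines whose fourth field does not contain "Asia".
field_no_match=$(printf '%s\n' "$sample" | awk '$4 !~ /Asia/ { print $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$whole_line" "$field_match" "$field_no_match"
```

Here `/an/` selects France and Japan, `$1 ~ /an$/` selects only Japan, and `$4 !~ /Asia/` selects only France.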
#!/usr/bin/bash
ExampleGroup '/regexpr/'
Example '$4 ~ /Asia/'
When call awk '$4 ~ /Asia/' bin/countries.tsv
The output should eq 'USSR 8649 275 Asia
China 3705 1032 Asia
India 1267 746 Asia
Japan 144 120 Asia'
End
End
Range Patterns start, end
#!/usr/bin/bash
ExampleGroup 'Range Patterns'
Example '/Canada/, /USA/'
When call awk '/Canada/, /USA/' bin/countries.tsv
The output should eq 'Canada 3852 25 North America
China 3705 1032 Asia
USA 3615 237 North America'
End
End
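A range pattern matches every line from one that matches the first pattern through the next line that matches the second, inclusive. The range can start again later in the input, and if the second pattern never matches, the range runs to the end of the input. A minimal sketch of that behavior, assuming a made-up seven-line input (`lines` and `range_out` are illustrative names):

```shell
# Hypothetical input: one complete start/end range, then a second "start"
# with no closing "end".
lines='a
start
b
end
c
start
d'

# With no action, the default action prints every line inside a range.
range_out=$(printf '%s\n' "$lines" | awk '/start/, /end/')

printf '%s\n' "$range_out"
```

The lines `a` and `c` fall outside both ranges, while `d` is printed because the second range is never closed.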