Frontier Software

nc

Netcat (nc) allows us to write a simple HTTP server in Bash.

Though nc was a traditional part of Unix, Linux distributions don’t tend to include it. I installed openbsd-netcat rather than gnu-netcat since it seems to be more actively maintained.

A nice tutorial which got me started is Building a Web server in Bash. Besides nc, this introduced me to named pipes created with mkfifo.
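For readers who have not met named pipes before, here is a minimal sketch of what mkfifo provides. It is not taken from the tutorial; the temporary path and the message are made up for illustration.

```bash
#!/usr/bin/env bash
# Minimal named-pipe demo; the path and message are illustrative only.
fifo="$(mktemp -u)"      # pick an unused temporary path
mkfifo "$fifo"           # create the named pipe at that path

# Writer: opening the pipe blocks until a reader opens it too,
# so it runs in the background.
echo "hello through the pipe" > "$fifo" &

# Reader: receives what the writer sent, then sees end-of-file.
received="$(cat "$fifo")"
wait                     # reap the background writer
rm -f "$fifo"

echo "$received"
```

The server below uses the same mechanism: one process writes the response into the pipe while another reads it and hands it to nc.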

A simple manpage webapp written in Bash

For my first webapp, I want to create a manpage reader.

#!/usr/bin/env bash

PORT=$1

rm -f response
mkfifo response

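# Request line: METHOD, whitespace, URI, whitespace, HTTP version.
# Bash uses POSIX ERE, which has no lazy .*? quantifier, so match non-space runs instead.
HEADLINE_REGEX='^([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+HTTP'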

function handleGet() {
  readarray -d '/' -t uri_arr <<< "$URI"
  TOPIC="${uri_arr[1]//[[:cntrl:]]}"
  SECTION="${uri_arr[2]//[[:cntrl:]]}"
  read -r -d '' BODY << EndOfMessage
<!DOCTYPE html>
<html>
<head>
<title>man ${TOPIC:-man}(${SECTION:-1})</title>
</head>
<body>
<pre>
$(man "${TOPIC:-man}(${SECTION:-1})")
</pre>
</body>
</html>
EndOfMessage
#  BODY="${BODY//[[:cntrl:]]/\\r\\n}"
  BODY="$(sed ':a;N;$ba;s/\n/\r\n/g' <<< "$BODY")"
  CONTENT_LENGTH=$(wc -c <<< "$BODY")
  echo "$CONTENT_LENGTH"  # debug: log the computed body size
  read -r -d '' RESPONSE << EndOfMessage
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Content-Length: $CONTENT_LENGTH

$BODY
EndOfMessage
}

function handleRequest() {
  while read -r line; do
    echo "$line"
    trline="$(echo "$line" | tr -d "[:cntrl:]")"
    [[ -z "$trline" ]] && break
    [[ "$trline" =~ $HEADLINE_REGEX ]] && { 
      METHOD="${BASH_REMATCH[1]@U}"; 
      URI="${BASH_REMATCH[2]@L}";
    }
  done
  case "$METHOD" in
    GET) handleGet ;;
  esac
  echo -e "$RESPONSE" > response
}

echo "Listening on $PORT..."

while true; do
  cat response | nc -lN "$PORT" | handleRequest
done

Firing the server up with bash man-server.sh 8000 and then pointing my browser to http://localhost:8000/man or http://localhost:8000/man/1 brings up the man page for man, while http://localhost:8000/man/7 renders the manpage for groff_man(7).

After trying man2html and then groff -Thtml, I found that simply putting the output of man between pre tags garbled the output least.
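One caveat with the pre approach: the browser still parses markup inside pre, so any <, > or & in a man page can be misread as HTML. A small filter like the following could be placed between man and the page template. This is a sketch, and html_escape is a hypothetical helper name, not part of the script above.

```bash
#!/usr/bin/env bash
# html_escape: read text on stdin, write it with HTML metacharacters escaped.
# The ampersand must be handled first, otherwise the "&" in "&lt;" and "&gt;"
# would itself be escaped again.
html_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}
```

In the server it would be used as `man "$TOPIC" | html_escape` inside the command substitution that fills the pre element.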

shellcheck complains:

In man-server.sh line 27:
  cat response | nc -lN "$PORT" | handleRequest
      ^------^ SC2002 (style): Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.

For more information:
  https://www.shellcheck.net/wiki/SC2002 -- Useless cat. Consider 'cmd < file...

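However, replacing cat with a < redirection from the named pipe doesn’t seem to work. A likely reason is that the shell opens the redirected file before it starts nc, and opening a named pipe for reading blocks until a writer appears. The only writer is handleRequest, which writes after nc has delivered a request, so nc never gets to listen. With cat, only cat blocks on the pipe while nc starts listening straight away.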
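The blocking behaviour of a named pipe can be observed in isolation. The sketch below assumes the timeout utility from GNU coreutils is installed; the variable names are made up for the demo.

```bash
#!/usr/bin/env bash
fifo="$(mktemp -u)"
mkfifo "$fifo"

# No writer yet: the redirection itself hangs, so the command never runs
# and timeout has to kill the shell after one second.
if timeout 1 bash -c ': < "$1"' _ "$fifo"; then
  without_writer="opened"
else
  without_writer="blocked"
fi

# With a writer waiting in the background, the same kind of open succeeds.
echo "ready" > "$fifo" &
with_writer="$(timeout 1 bash -c 'cat < "$1"' _ "$fifo")"
wait
rm -f "$fifo"

echo "without writer: $without_writer"
echo "with writer: $with_writer"
```

In the server pipeline, nc with a < redirection would be in the position of the first case: stuck before it is even started.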

Step 1 in the handleRequest function involves parsing the HTTP request sent by the browser. The specification is RFC 9110 and Mozilla has a handy HTTP guide.
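As a standalone illustration of that parsing step, here is a sketch that splits a request line into method, URI and version, and then splits the URI path on slashes. The names parse_request_line, REQUEST_LINE_REGEX, VERSION and PATH_PARTS are hypothetical and do not appear in the scripts on this page.

```bash
#!/usr/bin/env bash
# POSIX ERE: method, whitespace, request target, whitespace, HTTP version.
REQUEST_LINE_REGEX='^([A-Za-z]+)[[:space:]]+([^[:space:]]+)[[:space:]]+HTTP/([0-9.]+)'

parse_request_line() {
  local line="${1%$'\r'}"          # drop the trailing carriage return, if any
  [[ "$line" =~ $REQUEST_LINE_REGEX ]] || return 1
  METHOD="${BASH_REMATCH[1]^^}"    # normalise the method to upper case
  URI="${BASH_REMATCH[2]}"
  VERSION="${BASH_REMATCH[3]}"
  # Split "/topic/section" on slashes; element 0 is the empty string
  # in front of the leading "/".
  IFS='/' read -r -a PATH_PARTS <<< "$URI"
}

parse_request_line $'GET /man/7 HTTP/1.1\r' &&
  echo "$METHOD ${PATH_PARTS[1]} ${PATH_PARTS[2]}"   # prints: GET man 7
```

The function returns a non-zero status for anything that does not look like a request line, so header lines can be passed to it without harm.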

Handling POST

Next I want to generalise my website to possibly handle help, info… besides man. Instead of using the URL path as my arguments, I want a form which sends a JSON-RPC formatted request to the server and then receives a JSON-RPC formatted response.

A JSON-RPC request could look like:

{
  "method": "man",
  "params": ["man", 7],
  "id": 1
}

And the response would look like:

{
  "result": "text output of man man 7 with escaped newlines",
  "error": null,
  "id": 1
}
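Escaping the newlines by hand is error-prone, so one option is to let jq build the response object. The following is a sketch that assumes jq 1.5 or later; jsonrpc_result is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# jsonrpc_result ID TEXT: print a compact success response.
# --arg passes TEXT as a string, so jq escapes newlines and quotes itself;
# --argjson passes ID as a JSON value, so a numeric id stays a number.
jsonrpc_result() {
  local id="$1" text="$2"
  jq -c -n --argjson id "$id" --arg result "$text" \
    '{result: $result, error: null, id: $id}'
}
```

For example, `jsonrpc_result 1 $'line one\nline two'` prints `{"result":"line one\nline two","error":null,"id":1}`, with the newline escaped inside the JSON string.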

Whereas the man page server above assumes GET, the next step is to accept POST, reading the data sent by the browser, specifically JSON sent using fetch.
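The mechanics of accepting a POST can be sketched on their own: read header lines until the blank line, pick up Content-Length, then read exactly that many bytes as the body. The function name read_request and the variable CL_REGEX are hypothetical.

```bash
#!/usr/bin/env bash
CL_REGEX='^content-length:[[:space:]]*([0-9]+)'

# read_request: consume one HTTP request from stdin.
# Sets CONTENT_LENGTH and BODY as global variables.
read_request() {
  local line
  CONTENT_LENGTH=0
  BODY=""
  while IFS= read -r line; do
    line="${line%$'\r'}"               # header lines end in CRLF
    [[ -z "$line" ]] && break          # a blank line ends the headers
    # Header names are case-insensitive, so compare in lower case.
    if [[ "${line,,}" =~ $CL_REGEX ]]; then
      CONTENT_LENGTH="${BASH_REMATCH[1]}"
    fi
  done
  if (( CONTENT_LENGTH > 0 )); then
    # -N reads exactly that many characters and does not stop at newlines;
    # the C locale is meant to make it count bytes, as Content-Length does.
    LC_ALL=C IFS= read -r -N "$CONTENT_LENGTH" BODY
  fi
}
```

Because the function sets globals, it has to run in the current shell, for example `read_request < request.txt`, rather than at the end of a pipeline.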

Sending forms through JavaScript

#!/usr/bin/env bash

PORT=$1

rm -f response
mkfifo response

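# Bash uses POSIX ERE, which has no lazy .*? quantifier, so match explicit character classes.
HEADLINE_REGEX='^([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+HTTP'
CONTENT_LENGTH_REGEX='^[Cc]ontent-[Ll]ength:[[:space:]]*([0-9]+)'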

function handleGet() {
BODY="$(cat test.html)"
read -r -d '' RESPONSE << EndOfMessage
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8

$BODY
EndOfMessage
}

function handlePost() {
# Quote the here-strings so the JSON body reaches jq unmodified.
id=$(jq -r '.id' <<< "$BODY")
method=$(jq -r '.method' <<< "$BODY")
topic=$(jq -r '.params[0]' <<< "$BODY")
section=$(jq -r '.params[1]' <<< "$BODY")
read -r -d '' json << EndOfMessage
{
  "error": null,
  "id": $id,
  "result": "$method ${topic}.${section}"
}
EndOfMessage
# result="$($method ${topic}.${section})"
# json="$(jq -r --arg result "$result" '.result = $result' <<< "$json")"
json="$(jq -c '.' <<< "$json")"
read -r -d '' RESPONSE << EndOfMessage
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8

$json
EndOfMessage
}


function handleRequest() {
  while read -r line; do
    echo "$line"
    trline="$(echo "$line" | tr -d "[:cntrl:]")"
    [[ -z "$trline" ]] && break
    [[ "$trline" =~ $HEADLINE_REGEX ]] && { 
      METHOD="${BASH_REMATCH[1]}"; 
      URI="${BASH_REMATCH[2]}";
    }
    [[ "$trline" =~ $CONTENT_LENGTH_REGEX ]] && CONTENT_LENGTH="${BASH_REMATCH[1]}"
  done
  # -N reads exactly CONTENT_LENGTH characters, even if the body contains newlines.
  [[ -n $CONTENT_LENGTH ]] && IFS= read -r -N "$CONTENT_LENGTH" -t 1 BODY
  case "$METHOD" in
    GET) handleGet ;;
    POST) handlePost ;;
  esac
  echo -e "$RESPONSE" > response
}

echo "Listening on $PORT..."

while true; do
  cat response | nc -lNC "$PORT" | handleRequest
done

The server above fails intermittently, which suggests nc is not a realistic foundation for this. The problems may be due to the missing Content-Length: header in the response, something I tried unsuccessfully to fix with ${#BODY} and wc -c ....
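For reference, here is one way a byte-accurate Content-Length could be computed. This is a sketch only: I have not confirmed that it cures the intermittent errors, and byte_length and build_response are hypothetical helper names. Note that ${#BODY} counts characters rather than bytes, and that a here-string adds a newline to whatever wc -c counts.

```bash
#!/usr/bin/env bash
# byte_length STRING: number of bytes in STRING, without a trailing newline.
# The arithmetic expansion strips the padding some wc implementations print.
byte_length() {
  echo $(( $(printf '%s' "$1" | wc -c) ))
}

# build_response CONTENT_TYPE BODY: write a complete HTTP/1.1 response.
# printf emits explicit CRLF line endings and adds nothing after the body,
# so the advertised length matches what is actually sent.
build_response() {
  local content_type="$1" body="$2" length
  length="$(byte_length "$body")"
  printf 'HTTP/1.1 200 OK\r\nContent-Type: %s\r\nContent-Length: %s\r\n\r\n%s' \
    "$content_type" "$length" "$body"
}
```

In handleRequest this would replace the heredoc and the final echo -e with something like `build_response 'application/json;charset=utf-8' "$json" > response`.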

Unfortunately, I couldn’t get this to work with a large string in the response, so I am pushing on with the Go version.