Capturing Script Logs

By Miroslav Shubernetskiy / Software Engineer at GameChanger

December 1, 2020

At GameChanger, we use scripts in many of our flows such as during deploys or running Ansible while booting new instances. Some of these flows are critical to our operations and require good visibility. Traditionally we tried to send both metrics and logs to DataDog where we can both monitor what scripts are doing and set alerts on various metrics. DataDog integration, however, is not always available such as during the instance boot. In addition, sometimes a DataDog alert does not provide enough context of what failed and most importantly why it failed. It simply states that some threshold was reached. To investigate the issue requires more manual steps by looking for the appropriate log which is not always intuitive. That is why recently at GameChanger we started integrating Slack error reporting directly into some of our critical scripts. This post describes the exact mechanism of how that is achieved since it uses a really cool bash trick.

Useful Commands

First some background on some of the useful commands.

`tee` Anyone?

Lets drink some tea

Tee is a really useful command. It captures an output from a script and both echoes it to standard out as well as forwards it to a file. This allows to both see an output as well as capture the same output for later use. For example:

$ echo hello there | tee hello.log
hello there
$ cat hello.log
hello there

Process Substitution

Lets swap some things

Some commands only work with files. For example a classic diff:

$ echo one > one.txt
$ echo two > two.txt
$ diff -u one.txt two.txt
--- one.txt
+++ two.txt
@@ -1 +1 @@
-one
+two

Sometimes however it is convenient to be able to refer to an output of commands as a file without manually creating a temporary file. This is what process substitution allows to do. Same example as above but with process substitution:

$ diff -u <(echo one) <(echo two)
--- /dev/fd/63  2020-11-30 12:33:08.539905663 -0500
+++ /dev/fd/62  2020-11-30 12:33:08.540272967 -0500
@@ -1 +1 @@
-one
+two

This is the output form of process substitution which uses <(command) syntax. It stores an output of a command into a temporary file for the duration of a command. There is also input form of process substitution which uses >(command) syntax. It similarly creates a temporary file to which data can be written to and process substitution will forward that content to a command. For example:

$ echo "
> 2
> 1" > >(sort)
1
2

`exec`

Lets execute things

Most common form of exec is to simply execute another command by replacing the process:

$ bash -c 'exec echo foo'
foo

However exec also overtakes process file descriptors which allows exec to adjust what a process does with its file descriptors which includes stdout and stderr. For example exec can capture script output into a file:

$ cat test.sh
exec &> log.txt
echo stdout > /dev/stdout
echo stderr > /dev/stderr
$ bash test.sh
$ cat log.txt
stdout
stderr

Similar results can be achieved by manually sending script output to a file (e.g. command &> log.txt) however exec allows to do that directly within the script.

Putting Everything Together

Lets put everything together

Putting all commands together allows to do something like:

LOG_PATH=/tmp/deploy_$(date +%s).log
exec &> >(tee -a $LOG_PATH)

function error_msg {
    echo "Deploy last 50 logs:
$(tail -n 50 $LOG_PATH)
"
}

function slack_error {
    msg=${1:-}
    slacksend \
        --channel=alerts \
        --color=danger \
        --snippet="$(error_msg)" \
        --filename="deploy.log" \
        "$msg" \
        || true
}

deploy || (slack_error "terminating rolling deployment at $HOSTNAME"; exit 1)

Here is what the above does:

Shows script output as normal via tee. This allows normal script log aggregation to work as normal.
In addition it fully captures script stdout and stderr to a log file via exec, process substitution and tee
When the deploy command fails, it sends a slack message with the last 50 lines of the script logs

Example slack message

Wrap Up

It might not seem like much, but immediately seeing what failed and why by including the failure logs makes for a much more pleasant debugging experience. Most importantly it does not adjust how the script is used so no other changes are required to other systems since all the enhancements are baked directly in into the script itself.

Hopefully you will find some of these cool bash capabilities useful as well.