Web Log Visualizations Using GoAccess
One of the challenges of running a website is knowing what your web server is doing. While various monitoring applications can alert you when your server is running with high loads or slow page responses, the only way to fully understand what is going on is to look at the web logs. It can take a lot of time to both read through pages of log data and understand what’s going on. This is where GoAccess comes in. GoAccess provides a real-time overview of what’s happening from your logs and provides statistics and visualizations to help convey this information. Data is available both through a web browser and at the terminal.
If you go over to the GoAccess website there is a live demo where you can get a feel for the program, and decide whether you think it’s a tool you could find helpful. If it is, read on for information on how you can install it on your server. For this tutorial, we’ll assume you already have Apache installed and serving web pages on your server.
Installing GoAccess
To start, we need to install GoAccess. On CentOS and Red Hat systems you’ll need to have enabled the Fedora EPEL repositories first, after which installation is simply:
sudo yum install goaccess
Unfortunately, the versions in the repositories on Debian or Ubuntu systems are quite out of date. However, the good news is that the GoAccess team maintains their own repositories. Here’s how to install using a GoAccess repository:
echo "deb http://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list
wget -O - http://deb.goaccess.io/gnugpg.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install goaccess
Log File Analysis
Once GoAccess has been installed, you can start using it to analyze your log files. The default behavior is for GoAccess to perform analysis and display it at the command line for a given file:
sudo goaccess /var/log/apache2/access.log
In this case GoAccess will start and ask for the format of the log file. For Apache access logs you can choose the “NCSA Combined Log Format” option. GoAccess then parses the file and displays the results for you to read. You can scroll up and down with the arrow keys to look at the information. The question mark (?) opens a help screen with information about more controls. Pressing q quits the program.
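If you would rather skip the format prompt, GoAccess also accepts the log format on the command line. The following is a minimal sketch rather than an official recipe: analyze_log is a hypothetical helper name, the default path is simply the one used in this tutorial, and it assumes your version of GoAccess recognizes the predefined COMBINED format name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GoAccess): run the terminal report
# without the interactive format prompt.
analyze_log() {
    # Default path is the one used in this tutorial; pass another as $1.
    local log_file="${1:-/var/log/apache2/access.log}"

    # Fail early with a readable message instead of a GoAccess error.
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file (wrong path, or try sudo)" >&2
        return 1
    fi
    if ! command -v goaccess >/dev/null 2>&1; then
        echo "goaccess is not installed" >&2
        return 127
    fi

    # COMBINED selects the NCSA Combined Log Format preset, so GoAccess
    # should not need to ask which format to use.
    goaccess "$log_file" --log-format=COMBINED
}
```

After sourcing the file, calling analyze_log with no arguments should open the same terminal view as before, and analyze_log /path/to/other.log points it at a different file.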
Log Outputs in a Web Browser
If you want to view the log output in a web browser, you’ll need to make some tweaks to the configuration file. By default, this is the /etc/goaccess.conf file.
sudo nano /etc/goaccess.conf
This file is well commented, with an explanation for each option. First, in the “Time Format Options” section, uncomment the time-format line for Apache/NGINX:
time-format %H:%M:%S
Then in the “Date Format Options” section, you need to uncomment the date-format line for Apache/NGINX:
date-format %d/%b/%Y
Next in the “Log Format Options” section, uncomment the line for NCSA Combined Log Format:
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
Now save and exit the file. Note that if you are planning on looking at different log files, there are plenty of options for various log file formats.
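To see what those format specifiers refer to, it can help to pull one log line apart by hand. The sketch below is only an illustration and is not how GoAccess parses logs internally: explain_combined_line is a hypothetical helper, and the sample entry is made up using reserved example addresses. It splits a Combined Log Format line with awk and labels each piece with the matching specifier.

```shell
#!/usr/bin/env bash
# Illustrative only: explain_combined_line is a hypothetical helper, not part
# of GoAccess. It labels the fields of one Combined Log Format line with the
# specifiers used in the log-format setting.
explain_combined_line() {
    printf '%s\n' "$1" | awk -F'"' '{
        split($1, head, " ")          # host, two ignored fields, [date:time, zone]
        split($3, nums, " ")          # status code and response size
        stamp = substr(head[4], 2)    # drop the leading "["
        printf "%%h=%s\n", head[1]                 # client host
        printf "%%d=%s\n", substr(stamp, 1, 11)    # date, e.g. 10/Oct/2023
        printf "%%t=%s\n", substr(stamp, 13)       # time, e.g. 13:55:36
        printf "%%r=%s\n", $2                      # request line
        printf "%%s=%s\n", nums[1]                 # status code
        printf "%%b=%s\n", nums[2]                 # response size in bytes
        printf "%%R=%s\n", $4                      # referrer
        printf "%%u=%s\n", $6                      # user agent
    }'
}

# Made-up sample entry (documentation-reserved address and host name).
sample='192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://www.example.com/start.html" "Mozilla/5.0"'
explain_combined_line "$sample"
```

For the sample line, the date comes out as 10/Oct/2023 and the time as 13:55:36, which is the shape that the %d/%b/%Y date-format and %H:%M:%S time-format settings describe.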
Generating A Web Page
Now we can look at generating a web page of the log information with:
sudo goaccess /var/log/apache2/access.log -o /var/www/html/goaccess.html
Note that you need to set the output file (designated with the -o flag) to somewhere accessible from your web server in order to view it in the browser. In this case, I’ve used the root of the default website directory for Apache. I can now access the report by opening that file on my website. For example:
http://www.example.com/goaccess.html
In this case, we’ve created a one-off report: a snapshot of your log file as it stood at that point. To make a continuously updating report, you need to add the “--real-time-html” flag to the command:
sudo goaccess /var/log/apache2/access.log -o /var/www/html/goaccess.html --real-time-html
You’ll notice that this time the GoAccess application stays running and prints the message “WebSocket server ready to accept new client connections”. Navigating to the web page now looks much the same as before, but it is kept up to date as new connections come in on the server.
A nice feature is that you aren’t limited to only analyzing a single log file at a time, so if you keep multiple simultaneous access logs on your server you can just list them all and GoAccess can analyze them all together:
sudo goaccess /var/log/apache2/access.log /var/log/apache2/another-access.log /var/log/apache2/yet-another-access.log -o /var/www/html/goaccess.html
Again, going to the web page will give you the output of GoAccess’s analysis of these log files.
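Listing files by hand gets tedious once log rotation is involved. As a rough sketch, and assuming your rotated logs sit next to the live one with names such as access.log.1 and access.log.2.gz (a common logrotate layout, but check your own server), a small function can stream all of them, compressed or not, so that the result can be piped into GoAccess. The name read_all_logs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GoAccess): print the contents of every
# log whose name starts with the given base name, compressed or not.
read_all_logs() {
    # $1: directory holding the logs; $2: base file name (default access.log)
    local dir="$1" base="${2:-access.log}"
    local files=("$dir"/"$base"*)

    # With no match the glob stays literal, so test that the first entry exists.
    if [ ! -e "${files[0]}" ]; then
        echo "no logs matching $dir/$base*" >&2
        return 1
    fi

    # gzip -cdf decompresses .gz files and passes plain files through
    # unchanged (the same behavior as zcat -f).
    gzip -cdf -- "${files[@]}"
}

# Possible usage, assuming your GoAccess version reads piped input when
# given "-" and that you have permission to read the logs and write the report:
#   read_all_logs /var/log/apache2 | goaccess - --log-format=COMBINED -o /var/www/html/goaccess.html
```

Because the data arrives on a pipe, GoAccess cannot prompt for the format, so the sketch passes it explicitly on the command line.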
This just scratches the surface of what you can do with log files and GoAccess, and it’s well worth looking through the help information and the details on the website to see what else you can do.