How To Filter Journalctl By Unit

22 Oct 2022, revised 28 Oct 2022
6 minute read

Journalctl is the standard way of viewing system logs on Linux. But since the logs from all processes get consolidated in one place, it can get quite spammy if one process dumps a lot of output. Here is how I filter out those bad processes.

A net stretched across a flowing river. Image by Dall-E.

Journalctl will let you filter the logs it shows, if you specify the unit files you are interested in, with the --unit option. If I want to see log entries from certbot I can do that:

$ journalctl --unit certbot
Sep 14 07:07:17 reiterate-03 systemd[1]: Starting Certbot...
Sep 14 07:07:17 reiterate-03 systemd[1]: certbot.service: Succeeded.
Sep 14 07:07:17 reiterate-03 systemd[1]: Finished Certbot.
Sep 14 20:01:40 reiterate-03 systemd[1]: Starting Certbot...
Sep 14 20:01:40 reiterate-03 systemd[1]: certbot.service: Succeeded.
Sep 14 20:01:40 reiterate-03 systemd[1]: Finished Certbot.
Sep 15 07:41:32 reiterate-03 systemd[1]: Starting Certbot...

But if I want to see the logs from every process except certbot, there’s no way to do that. Journalctl doesn’t let you exclude entries by unit. I checked and there is, in fact, a request for enhancement to add this functionality, from 2016, and it’s still open. Apparently it’s really hard to add filter-by-exclusion to the journal logs.

On my server, there are two spammy processes that I’d like to filter out. The first is sysstat-collect.service, which is run every ten minutes by default and fills my system logs with low-value entries. The second process I want to filter isn’t run under systemd directly, but rather through cron. Debian has been very slowly migrating from cron to systemd, and in general it’s taken an approach that tries to be all things for all people. There are still plenty of jobs that run under cron, but cron itself is run by systemd. For my purposes, I’d like to filter out any process that’s run by cron.

The way we’re going to do this is via JSON. Journalctl has a -o json flag whose output we can feed into jq. Jq is a powerful JSON processing engine that will let us do the filtering we want. Then we’ll have to reformat the JSON back into nicely readable log lines.
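As a small illustration of the JSON-plus-jq idea, here is a sketch that counts entries per unit, which can help spot which units are the noisy ones. The function name count_units and the sample entries are made up for this example; the field names _SYSTEMD_UNIT and UNIT are the ones used later in this post.

```shell
# Count journal entries per unit, reading journalctl's JSON lines on stdin.
# Falls back from _SYSTEMD_UNIT to UNIT, then to a placeholder label.
count_units() {
  jq -r '._SYSTEMD_UNIT // .UNIT // "(no unit)"' | sort | uniq -c | sort -rn
}

# Hand-written sample entries (hypothetical), standing in for:
#   journalctl -n 1000 -o json | count_units
printf '%s\n' \
  '{"_SYSTEMD_UNIT":"cron.service","MESSAGE":"job one"}' \
  '{"_SYSTEMD_UNIT":"cron.service","MESSAGE":"job two"}' \
  '{"UNIT":"certbot.service","MESSAGE":"Starting Certbot..."}' |
  count_units
```

The unit with the most entries comes out on top, so against a real journal the first few lines should point at the likely candidates for filtering.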

First, install jq if you don’t have it already (again, I’m doing everything with Debian here):

$ sudo apt-get install jq

Next, we install our new filtering command. I’m making this a bash function since I prefer that over a short shell script. If you’re not using bash it shouldn’t be too hard to translate this into some other scripting language. I’ll call the new function jf, for journal filtered.

I’m adding this to my .bash_aliases file:

jf() {
  journalctl -e -o json | \
    jq -Mr 'select(
      [(.UNIT != "sysstat-collect.service"), (._SYSTEMD_UNIT != "cron.service")] | all
    ) | 
    [(((.__REALTIME_TIMESTAMP | tonumber) / 1000000) | strflocaltime("%b %d %H:%M:%S")), .MESSAGE] |
    join(" ")' |
  less -S +G
}

Here’s a breakdown of how the script works, line by line.

  • line 2 We start by invoking journalctl and passing the -o json option. I’m also adding -e because I like to start viewing my logs from the end. This then gets piped forward to the rest of the function.
  • lines 3-5 jq starts processing through a select filter that removes anything from the sysstat-collect or cron systemd units. It has to check both the UNIT and _SYSTEMD_UNIT attributes, because some log lines have one and some have the other. As far as I can tell, systemd itself sets UNIT on the messages it writes about a unit (such as "Starting Certbot..."), while journald sets _SYSTEMD_UNIT on messages logged by the unit’s own processes. If you use journalctl --unit to only show logs from one unit, it actually checks both of those as well.
  • lines 6-7 Next we reformat the JSON back into a format that’s close to what journalctl outputs by default. The MESSAGE attribute is the main log message, and __REALTIME_TIMESTAMP contains our timestamp. However, journald records the timestamp in microseconds, and the strflocaltime function expects seconds, so we have to divide by a million.
  • line 8 Finally, we pipe it back to less -S +G so we can page up/down/left/right.
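To see the select and the timestamp conversion in isolation, here is a sketch that runs the same jq program on two hand-written entries instead of the real journal. The sample entries and the name filter_sample are invented for this example, and TZ=UTC is only there so the formatted time doesn’t depend on the machine’s time zone.

```shell
# Two hypothetical journal entries, written by hand for illustration only.
sample='{"__REALTIME_TIMESTAMP":"1666417227000000","UNIT":"certbot.service","MESSAGE":"Starting Certbot..."}
{"__REALTIME_TIMESTAMP":"1666417800000000","_SYSTEMD_UNIT":"cron.service","MESSAGE":"(root) CMD (some-job)"}'

# Same select/format program as jf, fed from the sample instead of journalctl.
# TZ=UTC pins strflocaltime so the result is independent of the local zone.
filter_sample() {
  printf '%s\n' "$sample" | TZ=UTC jq -Mr 'select(
      [(.UNIT != "sysstat-collect.service"), (._SYSTEMD_UNIT != "cron.service")] | all
    ) |
    [(((.__REALTIME_TIMESTAMP | tonumber) / 1000000) | strflocaltime("%b %d %H:%M:%S")), .MESSAGE] |
    join(" ")'
}

filter_sample
```

Only the Certbot entry should survive: the cron entry is dropped by the select, and the microsecond timestamp on the remaining entry is turned into a readable time after the division by a million.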

Here’s some sample output:

Oct 22 05:40:27 Starting Certbot...
Oct 22 05:40:28 certbot.service: Succeeded.
Oct 22 05:40:28 Finished Certbot.
Oct 22 06:02:23 Started Generate Log stats for Reiterate.
Oct 22 06:02:24 reiterate-logstats.service: Succeeded.
Oct 22 06:50:36 Starting Daily apt upgrade and clean activities...
Oct 22 06:50:58 apt-daily-upgrade.service: Succeeded.
Oct 22 06:50:58 Finished Daily apt upgrade and clean activities.
Oct 22 06:50:58 apt-daily-upgrade.service: Consumed 18.282s CPU time.
Oct 22 07:38:39 Starting Cleanup of Temporary Directories...
Oct 22 07:38:39 systemd-tmpfiles-clean.service: Succeeded.
Oct 22 07:38:39 Finished Cleanup of Temporary Directories.
Oct 22 09:05:14 Started Generate Log stats for Reiterate.
Oct 22 09:05:15 reiterate-logstats.service: Succeeded.
Oct 22 09:25:54 Accepted publickey for meckler from 2603:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx port 59758 ssh2: ED25519 SHA256:xxxxXXXXxxxx>

Nice! It contains just the essentials, and I can quickly scan through that output to see everything important my server has been doing over the past few hours.

This is the command I use every time I log in to my server, but with a little additional work we can remove the hardcoded filters and make a more flexible command. We’ll call it jx, for journal eXcept.

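jx() {
  local delim=""
  local all_cond=""
  local xc jq_args
  for xc in "$@"; do
    # A field can be missing (null), so fall back to "" before calling contains
    all_cond="$all_cond$delim((.UNIT // \"\") | contains(\"$xc\") | not), ((._SYSTEMD_UNIT // \"\") | contains(\"$xc\") | not)"
    delim=", "
  done
  # Build the jq program piece by piece into a single string
  jq_args="select([$all_cond] | all) |"
  jq_args+=" [(((.__REALTIME_TIMESTAMP | tonumber) / 1000000) | strflocaltime(\"%b %d %H:%M:%S\")), .MESSAGE] |"
  jq_args+=" join(\" \")"
  journalctl -e -o json | jq -Mr "$jq_args"
}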

This function lets you specify which units you’d like filtered out of your output. If you invoke it like

$ jx sysstat cron

Then it filters out the same two units as jf above. Unlike jf, it doesn’t pipe its output into less, and you can give it any number of units to filter out.
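One possible variation, sketched here under the assumption that your jq is version 1.6 or newer: instead of pasting the unit names into the jq program text, pass them with --args and read them from $ARGS.positional. The name jx_args is made up for this example, and the function reads JSON on stdin rather than calling journalctl itself.

```shell
# Hypothetical variant of jx: unit names reach jq as data via --args,
# so they are never spliced into the jq program text.
jx_args() {
  jq -Mr 'select(
      [(.UNIT // ""), (._SYSTEMD_UNIT // "")] as $units
      | all($ARGS.positional[]; . as $x | all($units[]; contains($x) | not))
    ) |
    [(((.__REALTIME_TIMESTAMP | tonumber) / 1000000) | strflocaltime("%b %d %H:%M:%S")), .MESSAGE] |
    join(" ")' --args "$@"
}

# Usage against the real journal:
#   journalctl -e -o json | jx_args sysstat cron
```

The behaviour is meant to match jx: an entry is kept only if neither of its unit fields contains any of the given names. The difference is that a unit name containing a quote or backslash can’t break the jq program.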

If you found this useful, let me know in the comments below!


This post is licensed under CC BY 4.0