lunedì 28 ottobre 2013

Log4j syslogAppender and Rsyslog

This is how I workarounded an annoying situation, working with log4j.syslogAppender (the syslog module from the Java logging library) and Rsyslog togheter, in order to sends Tomcat logs to a remote log analyzer (Splunk, on the same machine with Rsyslog), that understands 'log4j' format

The article is just meant to show some Rsyslog hacks

Of course, I noticed the problem from the log analyzer, which wasn't working because of the confused log format. So firstly, I wanted to have a look how the logs were cominig to Rsyslog

To show all the fields, I've enabled the RSYSLOG_DebugFormat, next to the entry in /etc/rsyslog.conf
local1.info /var/log/log_analyzer/application.log;RSYSLOG_DebugFormat
This the awfull result:
Debug line with all properties:
FROMHOST: 'localhost', fromhost-ip: '127.0.0.1', HOSTNAME: '2013-10-25', PRI: 142,
syslogtag '19:', programname: '19', APP-NAME: '19', PROCID: '', MSGID: '-',
TIMESTAMP: 'Oct 25 19:38:51', STRUCTURED-DATA: '-',
msg: '38:51,915 INFO
As shown, those fields are totally messed up (HOSTNAME: '2013-10-25', MSGID: '-' ..) The reasons behind this "unmatching" log4j/rsyslog fields, could be either a misconfiguration log4j side (don't ask me where since the configuration is minimal) or the broken syslogAppender module. Reading on the web I was convincing myself much for the latter

So, my decision at the moment is to bypass as much as I can, the Rsyslog handling of those logs in order to store the logs exactly as they come from log4j

These are the steps

1 - Let's define a custom template
Rsyslog has its default template, so for first we need to setup a custom template (Log4j, which basically will avoid any formatting for our logs) and make it point to our "local1.info" entry
$template Log4j, "%rawmsg%\n"
local1.info /var/log/log_analyzer/application.log;Log4j
2 - Turn the control character escaping off
With %rawmsg%, our logs will look like this
<142>2013-10-25 20:20:53,862 INFO  ["http-80"] org.apache.cxf.interceptor: Inbound Message#012Content-Type: xxx#012Headers: xxx
As we can see the string has many #012 in the middle. This represents a form feed control character. The default global option in Rsyslog is meant to escape the control characters, so let's disable it
$EscapeControlCharactersOnReceive off
Verify the Rsyslog configuration
lurch~# rsyslogd -N1
...
rsyslogd: Warning: backward compatibility layer added to following directive to rsyslog.conf: ModLoad imuxsock
rsyslogd: End of config validation run. Bye.
And restart the service
lurch~# service rsyslog restart 
Let's have a look to log again
<142>2013-10-25 20:20:53,862 INFO  ["http-80"] org.apache.cxf.interceptor: Inbound Message
Content-Type: xxx
Headers: xxx
Much better but

3 - Remove some chars from the string
Our string still begins with a <142>. This is the PRIO value in Rsyslog. Since the message is a RAW message, I assume that the PRIO is sent directly from log4j, so it wont be possible to play with Rsyslog fields, in order to remove it
So I decide to cut off the first characters in the string(<142>), in the template definition
$template Log4j, "%rawmsg:6:$%\n"
Finally:
2013-10-25 20:20:53,862 INFO  ["http-80"] org.apache.cxf.interceptor: Inbound Message
Content-Type: xxx
Headers: xxx