Using NoSQL databases to store RADIUS and Syslog data

Using NoSQL databases to
store RADIUS and Syslog
data, part 1: Idea
Karri Huhtanen
18.9.2012

Some background
• Currently, RADIUS accounting data is usually stored
in SQL databases with a fixed database schema.

• For Syslog messages an SQL database can be used,
but commercial log analyzers (like Splunk) usually
use their own solutions, which may or may not be
SQL databases.

• I started wondering whether a NoSQL database could be
applied to one or both of these.

RADIUS accounting message
Wed Aug  8 13:49:33 2012
    User-Name = "jotain@realm"
    NAS-Port = 8
    NAS-IP-Address = 192.168.229.131
    Framed-IP-Address = 192.168.163.226
    NAS-Identifier = "Cisco_66:77:88"
    Airespace-WLAN-Id = 4
    Acct-Session-Id = "50223ea9/00:11:22:33:44:55/2292"
    Acct-Authentic = Remote
    Tunnel-Type = 0:VLAN
    Tunnel-Medium-Type = 0:802
    Tunnel-Private-Group-ID = 0:222
    Event-Timestamp = 1344422780
    Acct-Status-Type = Alive
    Acct-Input-Octets = 1262012
    Acct-Input-Gigawords = 0
    Acct-Output-Octets = 13518133
    Acct-Output-Gigawords = 0
    Acct-Input-Packets = 11692
    Acct-Output-Packets = 11154
    Acct-Session-Time = 1235
    Acct-Delay-Time = 19
    Calling-Station-Id = "00:11:22:33:44:55"
    Called-Station-Id = "f4:7f:35:5e:bf:b0"
    cisco-avpair = "nas-update=true"
    Digest-Response = "P"C<188>"
    Digest-Response = "P"C<194>"
    Timestamp = 1344422954

One message contains an undetermined number of attributes. Some can be
interpreted, some stay unknown. The unknown attributes are usually left in
OID:FieldDataType binary format.

Because there can be a changing number of attributes of changing types, I began
to wonder whether NoSQL could be used for storing these.
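To make the "changing number of attributes" point concrete, here is a minimal sketch of turning `Name = value` lines like the ones in the message above into a schema-free dictionary. The function name `parse_accounting_record` and the two Digest-Response placeholder values are assumptions made for illustration; they are not part of the original slides.

```python
def parse_accounting_record(lines):
    """Turn 'Name = value' lines into a dict; repeated names become lists."""
    record = {}
    for line in lines:
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip the timestamp header and blank lines
        name = name.strip()
        value = value.strip()
        # Drop one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        if name in record:
            # An attribute such as Digest-Response may occur more than once.
            existing = record[name]
            if isinstance(existing, list):
                existing.append(value)
            else:
                record[name] = [existing, value]
        else:
            record[name] = value
    return record


# Shortened sample; the Digest-Response values are hypothetical placeholders.
sample = [
    "Wed Aug  8 13:49:33 2012",
    '\tUser-Name = "jotain@realm"',
    "\tNAS-Port = 8",
    '\tcisco-avpair = "nas-update=true"',
    '\tDigest-Response = "first"',
    '\tDigest-Response = "second"',
]
record = parse_accounting_record(sample)
print(record)
```

No attribute list is declared anywhere: whatever the NAS sends ends up as a key, which is the property that makes a document-style store look attractive here.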

Syslog message
Until researching into this, I thought Syslog messages had a fixed structure
and could then be handled with a fixed database schema. Then I read RFC 5424:
http://tools.ietf.org/html/rfc5424

The syslog message has the following ABNF [RFC5234] definition:

    SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]

    HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                      SP APP-NAME SP PROCID SP MSGID
    PRI             = "<" PRIVAL ">"
    PRIVAL          = 1*3DIGIT ; range 0 .. 191
    VERSION         = NONZERO-DIGIT 0*2DIGIT
    HOSTNAME        = NILVALUE / 1*255PRINTUSASCII

    APP-NAME        = NILVALUE / 1*48PRINTUSASCII
    PROCID          = NILVALUE / 1*128PRINTUSASCII
    MSGID           = NILVALUE / 1*32PRINTUSASCII

    TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
    FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
    DATE-FULLYEAR   = 4DIGIT
    DATE-MONTH      = 2DIGIT  ; 01-12
    DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                              ; month/year
    FULL-TIME       = PARTIAL-TIME TIME-OFFSET
    PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                      [TIME-SECFRAC]
    TIME-HOUR       = 2DIGIT  ; 00-23
    TIME-MINUTE     = 2DIGIT  ; 00-59
    TIME-SECOND     = 2DIGIT  ; 00-59
    TIME-SECFRAC    = "." 1*6DIGIT
    TIME-OFFSET     = "Z" / TIME-NUMOFFSET
    TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE

    STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
    SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
    SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
    SD-ID           = SD-NAME
    PARAM-NAME      = SD-NAME
    PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                                   ; ']' MUST be escaped.
    SD-NAME         = 1*32PRINTUSASCII
                      ; except '=', SP, ']', %d34 (")

    MSG             = MSG-ANY / MSG-UTF8
    MSG-ANY         = *OCTET ; not starting with BOM
    MSG-UTF8        = BOM UTF-8-STRING
    BOM             = %xEF.BB.BF

Here we once again have parameters, although they are within one defined
STRUCTURED-DATA field.

So could NoSQL also be used for Syslog?
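The STRUCTURED-DATA rule is where the variable part of a Syslog message lives. As a rough sketch of what that means for storage, the following parses a STRUCTURED-DATA string into nested dictionaries. The function and pattern names are assumptions for illustration, the SD-NAME check is simplified (it only excludes whitespace, '=', ']' and '"'), and a PARAM-NAME repeated inside one element would overwrite the earlier value. The first sample element is the example given in RFC 5424; the second one is hypothetical.

```python
import re

# SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]"
SD_ELEMENT = re.compile(
    r'\[([^\s=\]"]+)((?: [^\s=\]"]+="(?:[^"\\\]]|\\.)*")*)\]'
)
# SD-PARAM = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD_PARAM = re.compile(r' ([^\s=\]"]+)="((?:[^"\\\]]|\\.)*)"')
# Inside PARAM-VALUE the characters '"', '\' and ']' are backslash-escaped.
UNESCAPE = re.compile(r'\\(["\\\]])')


def parse_structured_data(text):
    """Return {SD-ID: {PARAM-NAME: PARAM-VALUE}} for a STRUCTURED-DATA field."""
    if text == "-":  # NILVALUE: no structured data present
        return {}
    result = {}
    pos = 0
    while pos < len(text):
        match = SD_ELEMENT.match(text, pos)
        if match is None:
            raise ValueError("malformed STRUCTURED-DATA at offset %d" % pos)
        sd_id, params = match.group(1), match.group(2)
        result[sd_id] = {
            name: UNESCAPE.sub(r"\1", value)
            for name, value in SD_PARAM.findall(params)
        }
        pos = match.end()
    return result


sample = (
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"]'
    '[examplePriority@32473 class="high"]'
)
parsed = parse_structured_data(sample)
print(parsed)
```

Each SD-ID becomes a sub-document with its own set of keys, so two messages from different applications can carry entirely different fields without any schema change.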

So what happens next?
• Selection of NoSQL database:

  • Likely a Column Family Store, if no one can suggest a
  better one?

  • Something easy to set up and use; I will concentrate on
  getting the RADIUS server and/or Syslogd to transfer
  data to the database.

• Setting up a WiFi access point and/or controller to
provide real RADIUS and Syslog data

• Storing data, retrieving data, searching data, deleting data
to see what works

• Writing and presenting Part II: “Implementation and
Results” of these slides

Results (hopefully)
• Is storing RADIUS accounting and Syslog messages in a
NoSQL database a brilliant idea, a brilliantly stupid idea or
something else?

• How hard can it be? What does it require to do this, is it
possible and how?

• Does it actually work? What can you do with the data? Is
there some indication of performance improvements or
problems?

• I will not do complete performance measurements,
though; designing and setting up a reliable measurement
environment would probably take too much time.

Using NoSQL databases to
store RADIUS and Syslog data,
part II: The Saga Continues

Karri Huhtanen
27.11.2012

Happened earlier
• Currently, RADIUS accounting data is usually stored
in SQL databases with a fixed database schema.

• For Syslog messages an SQL database can be used,
but commercial log analyzers (like Splunk) usually
use their own solutions, which may or may not be
SQL databases.

• I started wondering whether a NoSQL database could be
applied to one or both of these.

Results (luckily)
• Is storing RADIUS accounting and Syslog messages in a
NoSQL database a brilliant idea, a brilliantly stupid idea or
something else?
Answer: a good idea.

• How hard can it be? What does it require to do this, is it
possible and how?
Answer: easy; 1 night before the presentation was required.

• Does it actually work? What can you do with the data? Is
there some indication of performance improvements or
problems?
Answer: Yes. Store and process it. Unknown, with some issues
to be considered.

• I will not do complete performance measurements,
though; designing and setting up a reliable measurement
environment would probably take too much time.
Answer: coded one Python script.

So what happened?
• Selection of NoSQL database:
Status: MongoDB

  • Likely a Column Family Store, if no one can suggest a
  better one?

  • Something easy to set up and use; I will concentrate on
  getting the RADIUS server and/or Syslogd to transfer
  data to the database.

• Setting up a WiFi access point and/or controller to
provide real RADIUS and Syslog data

• Storing data, retrieving data, searching data, deleting data
to see what works
Status: done, but not thoroughly

• Writing and presenting Part II: “Implementation and
Results” of these slides
Status: done

That was the executive
summary. Thank you.

Now some more
detailed information
and even some code.

Storing RADIUS accounting and Syslog
messages in a NoSQL database

• It is a good idea because:
  • When we have massive amounts of log or accounting data, we need massive
  database clusters.

  • Data is mainly stored, read, analyzed and occasionally deleted. Data will not be
  updated or changed and is relatively simple (few tables with a lot of columns).

  • NoSQL may provide a better way to scale this horizontally by distribution and
  sharding.

  • It is already being done. Several log analyzers and log stores already use NoSQL
  databases as backends. There exist projects such as Graylog2 which
  provide complete solutions covering log storage, visualization, analysis etc.

  • Logs and accounting data are actually use cases for some NoSQL databases, for
  example: http://docs.mongodb.org/manual/use-cases/storing-log-data/
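As an illustration of the "distribution and sharding" point, the sketch below routes accounting sessions to shards by hashing a key. This is a generic hash-partitioning example, not MongoDB's actual sharding algorithm; the function name `shard_for`, the choice of four shards and the synthetic session identifiers are all assumptions.

```python
import hashlib
from collections import Counter


def shard_for(shard_key, shard_count):
    """Map a key (for example an Acct-Session-Id) to a shard number."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


# Synthetic session identifiers, used only to show how the keys spread out.
session_ids = ["session-%04d" % n for n in range(1000)]
distribution = Counter(shard_for(s, 4) for s in session_ids)
print(sorted(distribution.items()))
```

Because records are only written, read and deleted, never joined or updated across sessions, each record can live on whichever shard its key hashes to.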

Storing RADIUS accounting and Syslog
messages in a NoSQL database

• It is not a brilliant idea because:
  • If we look at what we need to do to optimize the performance, it starts to look
  a lot like designing and optimizing an SQL database:
  http://docs.mongodb.org/manual/use-cases/storing-log-data/

  • You cannot forget datatypes or database design even with NoSQL databases,
  especially when going into production.

  • Prototypes may be faster and easier for developers, but creating a design and
  configuration which survives production use may be as hard as it has ever been.
  The difference is that instead of an SQL database expert, you now need a NoSQL
  expert.

• ... but it is not a brilliantly stupid idea either; it is an idea
worth considering, depending on the project.

How hard can it be?
• With Ubuntu Linux Server 12.04 LTS:

  • sudo apt-get install python-pymongo mongodb syslog-ng syslog-ng-mod-mongodb

  • for Syslog-NG, just some configuration

  • for Radiator, some configuration and coding an external
  Python script to handle accounting messages

• But this is far from production use; it is more like a prototype or
proof-of-concept implementation done in 1 work day.

Syslog-ng
# /etc/syslog-ng/syslog-ng.conf

# mongodb log destination
destination karrin_net_mongodb {
    mongodb();
};

# ...

log {
    source(s_src);
    source(s_net);
    destination(karrin_net_mongodb);
};

# that’s it

https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.3-guides/syslog-ng-ose-v3.3-guide-admin-en.html/reference_destination_mongodb.html

Radiator RADIUS server
# /etc/radiator/radiator.cfg
#
# send all RADIUS accounting requests to external script
#
<Handler Request-Type = Accounting-Request>
    <AuthBy EXTERNAL>
        Command %D/acct2mongo.py
    </AuthBy>
    AcctLogFileName %L/acct-acct2mongodb-%Y-%M.log
</Handler>

acct2mongo.py
#!/usr/bin/env python
# Reads one accounting request as "Name = value" lines from stdin
# and stores it as a single MongoDB document.
# Connection is the older pymongo API; later pymongo releases
# replace it with MongoClient.
from pymongo import Connection
import datetime
import sys

def main():
    post = dict()

    # opening connection
    connection = Connection('localhost', 27017)
    # database 'radius'
    db = connection['radius']
    # collection 'accounting'
    collection = db['accounting']

    # record when this script received the message
    post['acct2mongotimestamp'] = datetime.datetime.utcnow()

    # every "Name = value" line becomes one field; values stay strings
    for line in sys.stdin.readlines():
        pieces = line.split(' = ', 1)
        if len(pieces) == 2:
            post[pieces[0].strip().strip('"')] = pieces[1].strip().strip('"')

    collection.insert(post)

    connection.end_request()
    connection.disconnect()

    # 0 means reply with an acceptance. For Access-Requests,
    # an Access-Accept will be sent. For Accounting-Requests,
    # an Accounting-Response will be sent.
    return 0

if __name__ == '__main__':
    sys.exit(main())
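The script above only stores data. Reading, searching and deleting could be sketched as follows. This is an assumption-laden illustration rather than part of the original implementation: the helper names are hypothetical, and the `find` and `delete_many` calls follow the Collection API of newer pymongo releases, not the older one used in the script. The helpers take the collection as a parameter, so they work with any object exposing those two methods.

```python
import datetime


def sessions_for_user(collection, user_name):
    """Return all stored accounting records for one User-Name."""
    return list(collection.find({"User-Name": user_name}))


def delete_older_than(collection, days, now=None):
    """Delete records whose acct2mongotimestamp is older than `days` days.

    Timestamps are compared as naive UTC datetimes, matching what
    acct2mongo.py stores in the acct2mongotimestamp field.
    """
    now = now or datetime.datetime.utcnow()
    cutoff = now - datetime.timedelta(days=days)
    result = collection.delete_many({"acct2mongotimestamp": {"$lt": cutoff}})
    return result.deleted_count


# Usage against a local server (assumes a newer pymongo with MongoClient):
#   from pymongo import MongoClient
#   accounting = MongoClient("localhost", 27017)["radius"]["accounting"]
#   print(len(sessions_for_user(accounting, "jotain@realm")))
#   print(delete_older_than(accounting, days=90))
```

Both operations are plain key lookups or range filters on a single collection, which fits the store, read and occasionally delete pattern described earlier.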

Does it actually work? What
can you do with data?
• Yes, it does actually work, but once again it does not solve
everything, nor is it applicable to everything.

• One can store, read, search and delete data supposedly very
efficiently, but anything more complicated is harder and must be
implemented by the developer.

• For example: MongoDB does not have a reliable decimal datatype. It
is better to keep numbers as strings and convert them when
processing data.

• Repeating the earlier statement: “You cannot forget datatypes or
database design even with NoSQL databases, especially when going
into production.”
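Converting the stored strings at processing time could look like the sketch below. The function name `total_octets` is hypothetical; the arithmetic relies on the RADIUS convention that the Gigawords attributes count how many times the corresponding 32-bit Octets counter has wrapped around. The first record reuses values from the accounting message shown earlier, the second is made up to show a wrapped counter.

```python
def total_octets(record, direction):
    """Combine Acct-<direction>-Octets and -Gigawords into one integer.

    `direction` is "Input" or "Output". Values are stored as strings and
    converted only here; missing attributes count as zero.
    """
    octets = int(record.get("Acct-%s-Octets" % direction, 0))
    gigawords = int(record.get("Acct-%s-Gigawords" % direction, 0))
    return gigawords * 2 ** 32 + octets


record = {
    "Acct-Input-Octets": "1262012",
    "Acct-Input-Gigawords": "0",
    "Acct-Output-Octets": "13518133",
    "Acct-Output-Gigawords": "0",
}
print(total_octets(record, "Input"))   # 1262012
print(total_octets(record, "Output"))  # 13518133

# Hypothetical record whose input counter has wrapped around once.
wrapped = {"Acct-Input-Octets": "5", "Acct-Input-Gigawords": "1"}
print(total_octets(wrapped, "Input"))  # 4294967301
```

The conversion also matters for sorting and comparison: as strings, "11692" sorts before "9", so any numeric query on the raw stored values needs this step first.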

Performance?
• Would need to be measured and verified with a
real production environment or solution.
• Would also need to be compared with a well-designed
and optimised SQL database, maybe even
one functioning as a NoSQL one.
• In the implementation this was not tested, as the
datasets were very small compared to real datasets.

Conclusions
• NoSQL should at least be considered as an option
when designing and implementing large-scale Syslog or
RADIUS accounting storage.

• For development it is flexible.

• For production use, a NoSQL solution still needs design,
careful planning and testing to verify that the
performance, reliability and security are sufficient. Probably
as much as SQL database design.

• The key issue will probably be whether an SQL database can handle
the data or whether horizontal scaling is required.
