Using NoSQL databases to
store RADIUS and Syslog
     data, part 1: Idea
       Karri Huhtanen
         18.9.2012
Some background
•   Currently, RADIUS accounting data is usually stored
    in SQL databases with a fixed database schema

•   For Syslog messages an SQL database can be used,
    but commercial log analyzers (like Splunk) usually
    use their own solutions, which may or may not be
    SQL databases

•   I started wondering whether a NoSQL database could be
    applied to one or both of these
RADIUS accounting message
    Wed Aug  8 13:49:33 2012
            User-Name = "jotain@realm"
            NAS-Port = 8
            NAS-IP-Address = 192.168.229.131
            Framed-IP-Address = 192.168.163.226
            NAS-Identifier = "Cisco_66:77:88"
            Airespace-WLAN-Id = 4
            Acct-Session-Id = "50223ea9/00:11:22:33:44:55/2292"
            Acct-Authentic = Remote
            Tunnel-Type = 0:VLAN
            Tunnel-Medium-Type = 0:802
            Tunnel-Private-Group-ID = 0:222
            Event-Timestamp = 1344422780
            Acct-Status-Type = Alive
            Acct-Input-Octets = 1262012
            Acct-Input-Gigawords = 0
            Acct-Output-Octets = 13518133
            Acct-Output-Gigawords = 0
            Acct-Input-Packets = 11692
            Acct-Output-Packets = 11154
            Acct-Session-Time = 1235
            Acct-Delay-Time = 19
            Calling-Station-Id = "00:11:22:33:44:55"
            Called-Station-Id = "f4:7f:35:5e:bf:b0"
            cisco-avpair = "nas-update=true"
            Digest-Response = "P"C<188>"
            Digest-Response = "P"C<194>"
            Timestamp = 1344422954

•   One message contains undetermined number of attributes.
    Some can be interpreted, some stay unknown.

•   [...] interpreted attributes, the unknown attributes are
    usually left in OID:FieldDataType binary format

•   Because there can be a changing number of changing type of
    attributes I began to wonder if NoSQL could be used for
    storing these?
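As a minimal sketch of why a schemaless document fits this message, the following Python turns the textual "Name = value" form shown above into a dictionary. The function name `parse_accounting` is made up for illustration, and collecting repeated attributes into a list is an assumption of this sketch, not something the slides prescribe. The two Digest-Response values in the sample are placeholders rather than the binary values from the slide.

```python
def parse_accounting(text):
    """Turn 'Name = value' lines into a dict; repeated names become lists."""
    record = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # not an attribute line, e.g. the leading date line
        name = name.strip()
        value = value.strip().strip('"')
        if name in record:
            existing = record[name]
            if isinstance(existing, list):
                existing.append(value)
            else:
                record[name] = [existing, value]
        else:
            record[name] = value
    return record


# Shortened sample; the Digest-Response values are placeholders.
sample = """Wed Aug  8 13:49:33 2012
        User-Name = "jotain@realm"
        NAS-Port = 8
        Acct-Status-Type = Alive
        cisco-avpair = "nas-update=true"
        Digest-Response = "first"
        Digest-Response = "second"
"""

record = parse_accounting(sample)
print(record["User-Name"])
print(record["Digest-Response"])
```

Every value stays a string here, and a message with more or fewer attributes simply yields a dictionary with more or fewer keys, which is the property a fixed SQL schema lacks.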
Syslog message
•   Until researching into this I thought Syslog messages had
    fixed structure and could be then handled with fixed
    database schema.

•   Then I read the RFC5424: http://tools.ietf.org/html/rfc5424

The syslog message has the following ABNF [RFC5234] definition:

      SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]

      HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                        SP APP-NAME SP PROCID SP MSGID
      PRI             = "<" PRIVAL ">"
      PRIVAL          = 1*3DIGIT ; range 0 .. 191
      VERSION         = NONZERO-DIGIT 0*2DIGIT
      HOSTNAME        = NILVALUE / 1*255PRINTUSASCII

      APP-NAME        = NILVALUE / 1*48PRINTUSASCII
      PROCID          = NILVALUE / 1*128PRINTUSASCII
      MSGID           = NILVALUE / 1*32PRINTUSASCII

      TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
      FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
      DATE-FULLYEAR   = 4DIGIT
      DATE-MONTH      = 2DIGIT  ; 01-12
      DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                                ; month/year
      FULL-TIME       = PARTIAL-TIME TIME-OFFSET
      PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                        [TIME-SECFRAC]
      TIME-HOUR       = 2DIGIT  ; 00-23
      TIME-MINUTE     = 2DIGIT  ; 00-59
      TIME-SECOND     = 2DIGIT  ; 00-59
      TIME-SECFRAC    = "." 1*6DIGIT
      TIME-OFFSET     = "Z" / TIME-NUMOFFSET
      TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE

      STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
      SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
      SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
      SD-ID           = SD-NAME
      PARAM-NAME      = SD-NAME
      PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                                     ; ']' MUST be escaped.
      SD-NAME         = 1*32PRINTUSASCII
                        ; except '=', SP, ']', %d34 (")

      MSG             = MSG-ANY / MSG-UTF8
      MSG-ANY         = *OCTET ; not starting with BOM
      MSG-UTF8        = BOM UTF-8-STRING
      BOM             = %xEF.BB.BF

•   Here we have once again parameters, although they are
    within one defined STRUCTURED-DATA field.

•   So could NoSQL be used also for Syslog?
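To make the STRUCTURED-DATA point concrete, here is a rough Python sketch that splits one RFC 5424 line into a nested dictionary, with each SD-ELEMENT becoming a sub-document keyed by its SD-ID. The function name `parse_syslog` and the output key names are invented for this illustration. The sample line is modelled on the examples in RFC 5424, and the facility/severity split (PRIVAL = facility * 8 + severity) comes from the RFC text rather than from the ABNF above. The sketch approximates SD-NAME loosely, ignores the BOM, and keeps only the last value if a parameter name repeats inside one element.

```python
import re

# SD-NAME is approximated as: one or more characters other than SP, '=', ']', '"'.
SD_ELEMENT = re.compile(r'\[([^ =\]"]+)((?: [^ =\]"]+="(?:\\.|[^"\\\]])*")*)\]')
SD_PARAM = re.compile(r' ([^ =\]"]+)="((?:\\.|[^"\\\]])*)"')
ESCAPED = re.compile(r'\\(["\\\]])')  # the three escapes allowed in PARAM-VALUE


def parse_syslog(line):
    """Split one RFC 5424 message into a dict suitable for a document store."""
    fields = line.split(" ", 6)
    if len(fields) != 7:
        raise ValueError("expected HEADER SP STRUCTURED-DATA")
    pri_version, timestamp, hostname, app_name, procid, msgid, rest = fields
    header = re.fullmatch(r"<(\d{1,3})>([1-9]\d{0,2})", pri_version)
    if header is None:
        raise ValueError("bad PRI or VERSION")
    prival = int(header.group(1))

    structured_data = {}
    if rest.startswith("-"):
        pos = 1  # NILVALUE: no structured data
    else:
        pos = 0
        element = SD_ELEMENT.match(rest, pos)
        while element is not None:
            structured_data[element.group(1)] = {
                name: ESCAPED.sub(r"\1", value)
                for name, value in SD_PARAM.findall(element.group(2))
            }
            pos = element.end()
            element = SD_ELEMENT.match(rest, pos)
        if pos == 0:
            raise ValueError("bad STRUCTURED-DATA")

    return {
        "facility": prival // 8,
        "severity": prival % 8,
        "version": int(header.group(2)),
        "timestamp": timestamp,
        "hostname": hostname,
        "app_name": app_name,
        "procid": procid,
        "msgid": msgid,
        "structured_data": structured_data,
        "msg": rest[pos + 1:] if rest[pos:pos + 1] == " " else "",
    }


sample_line = (
    '<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 '
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
    'An application event log entry'
)
doc = parse_syslog(sample_line)
print(doc["structured_data"])
```

The header fields map onto a fixed set of keys, while `structured_data` grows or shrinks per message, which is where a document model differs from a fixed table layout.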
So what happens next?
•   Selection of NoSQL database:

    •   Likely Column Family Store if no one can suggest a
        better one?

    •   Something easy to set up and use; will concentrate on
        getting the RADIUS server and/or syslogd to transfer
        data to the database.

•   Setting up a WiFi access point and/or controller to
    provide real RADIUS and Syslog data

•   Storing data, retrieving data, searching data, deleting data
    to see what works

•   Writing and presenting Part II: “Implementation and
    Results” of these slides
Results (hopefully)
•   Is storing RADIUS accounting and Syslog messages into
    NoSQL database: a brilliant idea, brilliantly stupid idea or
    something else?

•   How hard can it be? What does it require to do this, is it
    possible and how?

•   Does it actually work? What can you do with data? Is
    there some indication of performance improvements or
    problems?

•   Will not do complete performance measurements,
    though: designing and setting up a reliable measurement
    environment will probably take too much time.
Using NoSQL databases to
store RADIUS and Syslog data,
  part II: The Saga Continues

        Karri Huhtanen
          27.11.2012
Happened earlier
•   Currently, RADIUS accounting data is usually stored
    in SQL databases with a fixed database schema

•   For Syslog messages an SQL database can be used,
    but commercial log analyzers (like Splunk) usually
    use their own solutions, which may or may not be
    SQL databases

•   I started wondering whether a NoSQL database could be
    applied to one or both of these
Results (luckily)
•   Is storing RADIUS accounting and Syslog messages into
    NoSQL database: a brilliant idea, brilliantly stupid idea or
    something else?
    Answer: a good idea

•   How hard can it be? What does it require to do this, is it
    possible and how?
    Answer: easy, 1 night before presentation required

•   Does it actually work? What can you do with data? Is
    there some indication of performance improvements or
    problems?
    Answer: Yes. Store and Process. Unknown. Some issues to be considered.

•   Will not do complete performance measurements
    though, designing and setting up reliable measurement
    environment will probably take too much time.
    Answer: Coded one Python script.
So what happened?
•   Selection of NoSQL database:

    •   Likely Column Family Store if no one can suggest a
        better one?
        Answer: MongoDB

    •   Something easy to set up and use; will concentrate on
        getting the RADIUS server and/or syslogd to transfer
        data to the database.

•   Setting up a WiFi access point and/or controller to
    provide real RADIUS and Syslog data

•   Storing data, retrieving data, searching data, deleting data
    to see what works
    Answer: Done, but not thoroughly

•   Writing and presenting Part II: “Implementation and
    Results” of these slides
    Answer: Done
That was the executive
 summary. Thank you.
Now some more
detailed information
and even some code.
storing RADIUS accounting and Syslog
           messages into NoSQL database

•   It is a good idea because:
    •   When we have a massive amount of log or accounting data, we need massive
        database clusters.

    •   Data is mainly stored, read, analyzed and occasionally deleted. Data will not be
        updated or changed and is relatively simple (a few tables with a lot of columns).

    •   NoSQL may provide a better way to scale this horizontally by distribution and
        sharding.

    •   It is already being done. Several log analyzers and log stores already use NoSQL
        databases as backends. There exist projects such as Graylog2 which
        provide complete solutions for log storage, visualization, analysis etc.

    •   Logs and accounting data are actually use cases for some NoSQL databases, for
        example: http://docs.mongodb.org/manual/use-cases/storing-log-data/
storing RADIUS accounting and Syslog
           messages into NoSQL database

•   It is not a brilliant idea because:
    •   If we look at what we need to do to optimize performance, it starts to
        look a lot like designing and optimizing an SQL database:
        http://docs.mongodb.org/manual/use-cases/storing-log-data/

    •   You cannot forget datatypes or database design even with NoSQL databases,
        especially when going into production.

    •   Prototypes may be faster and easier for developers, but creating a design and
        configuration which survives production use may be as hard as it has ever been.
        The difference is that instead of an SQL database expert, you now need a NoSQL
        expert.

•   ... but it is not a brilliantly stupid idea either; it is an idea
    worth considering, depending on the project.
How hard can it be?
•   With Ubuntu Linux Server 12.04 LTS:

    •   sudo apt-get install python-pymongo mongodb syslog-ng
        syslog-ng-mod-mongodb

    •   for Syslog-NG, just some configuration

    •   for Radiator, some configuration and coding an external
        Python script to handle accounting messages

•   But this is far from production use; it is more like a prototype or
    proof-of-concept implementation done in 1 work day.
Demo
Syslog-ng
              # /etc/syslog-ng/syslog-ng.conf

              # mongodb log destination
              destination karrin_net_mongodb {
                      mongodb();
              };

              # ...

              log {
                         source(s_src);
                         source(s_net);
                         destination(karrin_net_mongodb);
              };

              # that’s it

https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.3-guides/syslog-ng-ose-v3.3-guide-admin-en.html/reference_destination_mongodb.html
Radiator RADIUS server
# /etc/radiator/radiator.cfg
#
# send all RADIUS accounting requests to external script
#
<Handler Request-Type = Accounting-Request>
         <AuthBy EXTERNAL>
                 Command %D/acct2mongo.py
         </AuthBy>
         AcctLogFileName %L/acct-acct2mongodb-%Y-%M.log
</Handler>
acct2mongo.py

#!/usr/bin/env python
# Note: this uses the PyMongo 2.x API that was current for these slides.
# Connection was deprecated in PyMongo 2.4 in favour of MongoClient and
# removed in PyMongo 3.0.
import datetime
import sys

from pymongo import Connection


def main():
        post = dict()

        # opening connection
        connection = Connection('localhost', 27017)
        # database 'radius'
        db = connection['radius']
        # collection 'accounting'
        collection = db['accounting']

        # time at which this script handled the message (UTC)
        post['acct2mongotimestamp'] = datetime.datetime.utcnow()

        # the accounting attributes are read from stdin, one
        # "Name = value" pair per line; other lines are ignored
        for line in sys.stdin:
                pieces = line.split(' = ', 1)
                if len(pieces) == 2:
                        post[pieces[0].strip().strip('"')] = pieces[1].strip().strip('"')

        collection.insert(post)

        connection.end_request()
        connection.disconnect()

        # 0 Means reply with an acceptance. For Access-Requests,
        # an Access-Accept will be sent. For Accounting-Requests,
        # an Accounting-Response will be sent.
        return 0

if __name__ == '__main__':
        # hand main()'s return value back as the process exit status
        sys.exit(main())
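The script above targets the PyMongo version of its day. As a hedged sketch of how the storing step might look with PyMongo 3.0 or later, where `MongoClient` and `insert_one()` replace `Connection` and `insert()`, the function below takes the collection as a parameter. The names `store_accounting` and `InMemoryCollection` are hypothetical; the stand-in class exists only so that the example runs without a MongoDB server.

```python
import datetime


def store_accounting(collection, attributes, now=None):
    """Insert one accounting message as a document and return that document.

    `collection` is anything with an insert_one() method, for example a
    pymongo Collection. `attributes` is a dict of already parsed
    'Name = value' pairs.
    """
    post = dict(attributes)
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    post["acct2mongotimestamp"] = now
    collection.insert_one(post)
    return post


class InMemoryCollection:
    """Hypothetical stand-in so the sketch runs without a MongoDB server."""

    def __init__(self):
        self.documents = []

    def insert_one(self, document):
        self.documents.append(document)


accounting = InMemoryCollection()
store_accounting(accounting, {"User-Name": "jotain@realm", "Acct-Status-Type": "Alive"})
print(len(accounting.documents))

# Against a real server (assumes PyMongo 3.0 or later and mongod on localhost):
#   from pymongo import MongoClient
#   client = MongoClient("localhost", 27017)
#   store_accounting(client["radius"]["accounting"], attributes)
#   client.close()
```

Passing the collection in keeps the parsing and the storage separate, so the parsing can be exercised without a database.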
Does it actually work? What
       can you do with data?
•   Yes, it does actually work, but once again it does not solve everything,
    nor is it applicable to everything.

•   One can store, read, search and delete data supposedly very
    efficiently, but anything more complicated is harder and must be
    implemented by the developer.

•   For example: MongoDB does not have a reliable decimal datatype. It
    is better to keep numbers as strings and convert them when
    processing data.

•   Repeating earlier statement: “You cannot forget datatypes or
    database design even with NoSQL databases especially when going
    into production.”
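A small sketch of the "store as string, convert when processing" approach, using the octet counters from the accounting message shown earlier. The helper name `total_octets` is made up for this example. The arithmetic relies on the RADIUS definition that a Gigawords attribute counts how many times the matching 32-bit Octets counter has wrapped around.

```python
def total_octets(record, direction):
    """Combine the string-valued Octets and Gigawords counters into one int.

    `direction` is "Input" or "Output". Missing attributes count as zero.
    """
    octets = int(record.get("Acct-%s-Octets" % direction, "0"))
    gigawords = int(record.get("Acct-%s-Gigawords" % direction, "0"))
    return gigawords * 2**32 + octets


# A document as it would come back from the database: every value is a string.
stored = {
    "Acct-Input-Octets": "1262012",
    "Acct-Input-Gigawords": "0",
    "Acct-Output-Octets": "13518133",
    "Acct-Output-Gigawords": "0",
}

print(total_octets(stored, "Input"))
print(total_octets(stored, "Output"))
```

The conversion lives in the processing code, so the stored documents never depend on how the database represents numbers.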
Performance?
•   Would need to be measured and verified with a
    real production environment or solution.
•   Would also need to be compared with a well-designed
    and optimised SQL database, maybe even
    one functioning as a NoSQL one.
•   In the implementation this was not tested as the
    datasets were very small compared to real datasets.
Conclusions
•   NoSQL should be at least considered as an option
    when designing and implementing large-scale Syslog or
    RADIUS accounting storage.

•   For development it is flexible.

•   For production use, a NoSQL solution still needs design,
    careful planning and testing to verify that the
    performance, reliability and security are sufficient, probably
    as much as an SQL database design does.

•   The key issue will probably be whether an SQL database can
    handle the data or whether horizontal scaling is required.

More Related Content

Similar to Using NoSQL databases to store RADIUS and Syslog data

LLVM Backend の紹介
LLVM Backend の紹介LLVM Backend の紹介
LLVM Backend の紹介
Akira Maruoka
 
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
CloudxLab
 
Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...
Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...
Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...
DataStax
 
Using Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraUsing Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into Cassandra
Jim Hatcher
 
Redis for your boss
Redis for your bossRedis for your boss
Redis for your boss
Elena Kolevska
 
Osol Pgsql
Osol PgsqlOsol Pgsql
Osol Pgsql
Emanuel Calvo
 
Mongodb workshop
Mongodb workshopMongodb workshop
Mongodb workshop
Harun Yardımcı
 
Oracle sharding : Installation & Configuration
Oracle sharding : Installation & ConfigurationOracle sharding : Installation & Configuration
Oracle sharding : Installation & Configuration
suresh gandhi
 
Sysdig
SysdigSysdig
Sysdig
gnosek
 
Smolder @Silex
Smolder @SilexSmolder @Silex
Smolder @SilexJeen Lee
 
Introduction to cloudforecast
Introduction to cloudforecastIntroduction to cloudforecast
Introduction to cloudforecast
Masahiro Nagano
 
Introduction to CUDA
Introduction to CUDAIntroduction to CUDA
Introduction to CUDA
Raymond Tay
 
Pontos para criar_instancia_data guard_11g
Pontos para criar_instancia_data guard_11gPontos para criar_instancia_data guard_11g
Pontos para criar_instancia_data guard_11gLeandro Santos
 
Unlocking Your Hadoop Data with Apache Spark and CDH5
Unlocking Your Hadoop Data with Apache Spark and CDH5Unlocking Your Hadoop Data with Apache Spark and CDH5
Unlocking Your Hadoop Data with Apache Spark and CDH5
SAP Concur
 
Write a C program that reads the words the user types at the command.pdf
Write a C program that reads the words the user types at the command.pdfWrite a C program that reads the words the user types at the command.pdf
Write a C program that reads the words the user types at the command.pdf
SANDEEPARIHANT
 
About Data::ObjectDriver
About Data::ObjectDriverAbout Data::ObjectDriver
About Data::ObjectDriver
Yoshiki Kurihara
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
lcplcp1
 

Similar to Using NoSQL databases to store RADIUS and Syslog data (20)

LLVM Backend の紹介
LLVM Backend の紹介LLVM Backend の紹介
LLVM Backend の紹介
 
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
 
Computer networkppt4577
Computer networkppt4577Computer networkppt4577
Computer networkppt4577
 
Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...
Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...
Using Spark to Load Oracle Data into Cassandra (Jim Hatcher, IHS Markit) | C*...
 
Using Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into CassandraUsing Spark to Load Oracle Data into Cassandra
Using Spark to Load Oracle Data into Cassandra
 
Redis for your boss
Redis for your bossRedis for your boss
Redis for your boss
 
A22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle HaileyA22 Introduction to DTrace by Kyle Hailey
A22 Introduction to DTrace by Kyle Hailey
 
Osol Pgsql
Osol PgsqlOsol Pgsql
Osol Pgsql
 
Mongodb workshop
Mongodb workshopMongodb workshop
Mongodb workshop
 
Oracle sharding : Installation & Configuration
Oracle sharding : Installation & ConfigurationOracle sharding : Installation & Configuration
Oracle sharding : Installation & Configuration
 
Sysdig
SysdigSysdig
Sysdig
 
Smolder @Silex
Smolder @SilexSmolder @Silex
Smolder @Silex
 
Introduction to cloudforecast
Introduction to cloudforecastIntroduction to cloudforecast
Introduction to cloudforecast
 
Introduction to CUDA
Introduction to CUDAIntroduction to CUDA
Introduction to CUDA
 
Pontos para criar_instancia_data guard_11g
Pontos para criar_instancia_data guard_11gPontos para criar_instancia_data guard_11g
Pontos para criar_instancia_data guard_11g
 
Unlocking Your Hadoop Data with Apache Spark and CDH5
Unlocking Your Hadoop Data with Apache Spark and CDH5Unlocking Your Hadoop Data with Apache Spark and CDH5
Unlocking Your Hadoop Data with Apache Spark and CDH5
 
Npc14
Npc14Npc14
Npc14
 
Write a C program that reads the words the user types at the command.pdf
Write a C program that reads the words the user types at the command.pdfWrite a C program that reads the words the user types at the command.pdf
Write a C program that reads the words the user types at the command.pdf
 
About Data::ObjectDriver
About Data::ObjectDriverAbout Data::ObjectDriver
About Data::ObjectDriver
 
Xdp and ebpf_maps
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
 

More from Karri Huhtanen

Disobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and Privacy
Disobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and PrivacyDisobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and Privacy
Disobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and Privacy
Karri Huhtanen
 
Wi-Fi Roaming Security and Privacy
Wi-Fi Roaming Security and PrivacyWi-Fi Roaming Security and Privacy
Wi-Fi Roaming Security and Privacy
Karri Huhtanen
 
OpenRoaming and CapPort
OpenRoaming and CapPortOpenRoaming and CapPort
OpenRoaming and CapPort
Karri Huhtanen
 
Suomen eduroam-juuripalvelun uudistukset
Suomen eduroam-juuripalvelun uudistuksetSuomen eduroam-juuripalvelun uudistukset
Suomen eduroam-juuripalvelun uudistukset
Karri Huhtanen
 
Adding OpenRoaming to existing IdP and roaming federation service
Adding OpenRoaming to existing IdP and roaming federation serviceAdding OpenRoaming to existing IdP and roaming federation service
Adding OpenRoaming to existing IdP and roaming federation service
Karri Huhtanen
 
OpenRoaming -- Wi-Fi Roaming for All
OpenRoaming -- Wi-Fi Roaming for AllOpenRoaming -- Wi-Fi Roaming for All
OpenRoaming -- Wi-Fi Roaming for All
Karri Huhtanen
 
Beyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoaming
Beyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoamingBeyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoaming
Beyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoaming
Karri Huhtanen
 
Routing host certificates in eduroam/govroam
Routing host certificates in eduroam/govroamRouting host certificates in eduroam/govroam
Routing host certificates in eduroam/govroam
Karri Huhtanen
 
Cooperative labs, testbeds and networks
Cooperative labs, testbeds and networksCooperative labs, testbeds and networks
Cooperative labs, testbeds and networks
Karri Huhtanen
 
Privacy and traceability in Wi-Fi networks
Privacy and traceability in Wi-Fi networksPrivacy and traceability in Wi-Fi networks
Privacy and traceability in Wi-Fi networks
Karri Huhtanen
 
EAP-TLS (extended version)
EAP-TLS (extended version)EAP-TLS (extended version)
EAP-TLS (extended version)
Karri Huhtanen
 
EAP-TLS
EAP-TLSEAP-TLS
Security issues in RADIUS based Wi-Fi AAA
Security issues in RADIUS based Wi-Fi AAASecurity issues in RADIUS based Wi-Fi AAA
Security issues in RADIUS based Wi-Fi AAA
Karri Huhtanen
 
TLS and Certificates
TLS and CertificatesTLS and Certificates
TLS and Certificates
Karri Huhtanen
 
What is Network Function Virtualisation (NFV)?
What is Network Function Virtualisation (NFV)?What is Network Function Virtualisation (NFV)?
What is Network Function Virtualisation (NFV)?
Karri Huhtanen
 
What is Network Function Virtualisation (NFV)?
What is Network Function Virtualisation (NFV)?What is Network Function Virtualisation (NFV)?
What is Network Function Virtualisation (NFV)?
Karri Huhtanen
 
Building secure, privacy aware, quality Wi-Fi coverage via cooperation
Building secure, privacy aware, quality Wi-Fi coverage via cooperationBuilding secure, privacy aware, quality Wi-Fi coverage via cooperation
Building secure, privacy aware, quality Wi-Fi coverage via cooperation
Karri Huhtanen
 
Connecting the Dots: Integrating RADIUS to Network Measurement and Monitoring
Connecting the Dots: Integrating RADIUS to Network Measurement and MonitoringConnecting the Dots: Integrating RADIUS to Network Measurement and Monitoring
Connecting the Dots: Integrating RADIUS to Network Measurement and Monitoring
Karri Huhtanen
 
Building city and nationwide Wi-Fi coverage via cooperation
Building city and nationwide Wi-Fi coverage via cooperationBuilding city and nationwide Wi-Fi coverage via cooperation
Building city and nationwide Wi-Fi coverage via cooperation
Karri Huhtanen
 
eduroam diagnostics in NTLR, IdPs and SPs
eduroam diagnostics in NTLR, IdPs and SPseduroam diagnostics in NTLR, IdPs and SPs
eduroam diagnostics in NTLR, IdPs and SPs
Karri Huhtanen
 

More from Karri Huhtanen (20)

Disobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and Privacy
Disobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and PrivacyDisobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and Privacy
Disobey 2024: Karri Huhtanen: Wi-Fi Roaming Security and Privacy
 
Wi-Fi Roaming Security and Privacy
Wi-Fi Roaming Security and PrivacyWi-Fi Roaming Security and Privacy
Wi-Fi Roaming Security and Privacy
 
OpenRoaming and CapPort
OpenRoaming and CapPortOpenRoaming and CapPort
OpenRoaming and CapPort
 
Suomen eduroam-juuripalvelun uudistukset
Suomen eduroam-juuripalvelun uudistuksetSuomen eduroam-juuripalvelun uudistukset
Suomen eduroam-juuripalvelun uudistukset
 
Adding OpenRoaming to existing IdP and roaming federation service
Adding OpenRoaming to existing IdP and roaming federation serviceAdding OpenRoaming to existing IdP and roaming federation service
Adding OpenRoaming to existing IdP and roaming federation service
 
OpenRoaming -- Wi-Fi Roaming for All
OpenRoaming -- Wi-Fi Roaming for AllOpenRoaming -- Wi-Fi Roaming for All
OpenRoaming -- Wi-Fi Roaming for All
 
Beyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoaming
Beyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoamingBeyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoaming
Beyond eduroam: Combining eduroam, (5G) SIM authentication and OpenRoaming
 
Routing host certificates in eduroam/govroam
Using NoSQL databases to store RADIUS and Syslog data

  • 1. Using NoSQL databases to store RADIUS and Syslog data, part 1: Idea Karri Huhtanen 18.9.2012
  • 2. Some background
      • Currently, RADIUS accounting data is usually stored in SQL databases with a fixed database schema.
      • For Syslog messages an SQL database can be used, but commercial log analyzers (like Splunk) usually use their own solutions, which may or may not be SQL databases.
      • I started wondering whether a NoSQL database could be applied to one or both of these.
  • 3. RADIUS accounting message
      Wed Aug  8 13:49:33 2012
          User-Name = "jotain@realm"
          NAS-Port = 8
          NAS-IP-Address = 192.168.229.131
          Framed-IP-Address = 192.168.163.226
          NAS-Identifier = "Cisco_66:77:88"
          Airespace-WLAN-Id = 4
          Acct-Session-Id = "50223ea9/00:11:22:33:44:55/2292"
          Acct-Authentic = Remote
          Tunnel-Type = 0:VLAN
          Tunnel-Medium-Type = 0:802
          Tunnel-Private-Group-ID = 0:222
          Event-Timestamp = 1344422780
          Acct-Status-Type = Alive
          Acct-Input-Octets = 1262012
          Acct-Input-Gigawords = 0
          Acct-Output-Octets = 13518133
          Acct-Output-Gigawords = 0
          Acct-Input-Packets = 11692
          Acct-Output-Packets = 11154
          Acct-Session-Time = 1235
          Acct-Delay-Time = 19
          Calling-Station-Id = "00:11:22:33:44:55"
          Called-Station-Id = "f4:7f:35:5e:bf:b0"
          cisco-avpair = "nas-update=true"
          Digest-Response = "P"C<188>"
          Digest-Response = "P"C<194>"
          Timestamp = 1344422954
      • One message contains an undetermined number of attributes. Some can be interpreted, some stay unknown.
      • The unknown attributes are usually left in OID:FieldDataType binary format.
      • Because there can be a changing number of attributes of changing types, I began to wonder if NoSQL could be used for storing these.
  • 4. Syslog message
      • Until researching this I thought Syslog messages had a fixed structure and could then be handled with a fixed database schema.
      • Then I read RFC 5424: http://tools.ietf.org/html/rfc5424
      The syslog message has the following ABNF [RFC5234] definition:
          SYSLOG-MSG      = HEADER SP STRUCTURED-DATA [SP MSG]
          HEADER          = PRI VERSION SP TIMESTAMP SP HOSTNAME
                            SP APP-NAME SP PROCID SP MSGID
          PRI             = "<" PRIVAL ">"
          PRIVAL          = 1*3DIGIT ; range 0 .. 191
          VERSION         = NONZERO-DIGIT 0*2DIGIT
          HOSTNAME        = NILVALUE / 1*255PRINTUSASCII
          APP-NAME        = NILVALUE / 1*48PRINTUSASCII
          PROCID          = NILVALUE / 1*128PRINTUSASCII
          MSGID           = NILVALUE / 1*32PRINTUSASCII
          TIMESTAMP       = NILVALUE / FULL-DATE "T" FULL-TIME
          FULL-DATE       = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY
          DATE-FULLYEAR   = 4DIGIT
          DATE-MONTH      = 2DIGIT  ; 01-12
          DATE-MDAY       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                                    ; month/year
          FULL-TIME       = PARTIAL-TIME TIME-OFFSET
          PARTIAL-TIME    = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND
                            [TIME-SECFRAC]
          TIME-HOUR       = 2DIGIT  ; 00-23
          TIME-MINUTE     = 2DIGIT  ; 00-59
          TIME-SECOND     = 2DIGIT  ; 00-59
          TIME-SECFRAC    = "." 1*6DIGIT
          TIME-OFFSET     = "Z" / TIME-NUMOFFSET
          TIME-NUMOFFSET  = ("+" / "-") TIME-HOUR ":" TIME-MINUTE
          STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
          SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
          SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
          SD-ID           = SD-NAME
          PARAM-NAME      = SD-NAME
          PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                                         ; ']' MUST be escaped.
          SD-NAME         = 1*32PRINTUSASCII
                            ; except '=', SP, ']', %d34 (")
          MSG             = MSG-ANY / MSG-UTF8
          MSG-ANY         = *OCTET ; not starting with BOM
          MSG-UTF8        = BOM UTF-8-STRING
          BOM             = %xEF.BB.BF
      • Here we have once again parameters, although they are within one defined STRUCTURED-DATA field.
      • So could NoSQL also be used for Syslog?
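To make the point about STRUCTURED-DATA concrete, here is a minimal Python sketch (not part of the original slides) that turns the STRUCTURED-DATA field into a nested dictionary, which is the shape a document database would store. The function name `parse_structured_data` and the regular expressions are assumptions made for illustration; they follow the SD-ELEMENT and SD-PARAM rules above but do not enforce the length or character-set limits of SD-NAME.

```python
import re

# One SD-ELEMENT: "[" SD-ID *(SP SD-PARAM) "]". Inside a PARAM-VALUE the
# characters '"', '\' and ']' may only appear in escaped form.
SD_ELEMENT = re.compile(r'\[([^ =\]"]+)((?: [^ =\]"]+="(?:\\.|[^"\\\]])*")*)\]')
SD_PARAM = re.compile(r' ([^ =\]"]+)="((?:\\.|[^"\\\]])*)"')
ESCAPE = re.compile(r'\\(["\\\]])')


def parse_structured_data(sd):
    """Turn a STRUCTURED-DATA string into {sd_id: {param_name: value}}."""
    if sd == "-":  # NILVALUE: no structured data present
        return {}
    doc = {}
    pos = 0
    while pos < len(sd):
        m = SD_ELEMENT.match(sd, pos)
        if m is None:
            raise ValueError("malformed STRUCTURED-DATA at offset %d" % pos)
        params = {}
        for name, value in SD_PARAM.findall(m.group(2)):
            # Remove the backslash from the three escaped characters.
            params[name] = ESCAPE.sub(r"\1", value)
        doc[m.group(1)] = params
        pos = m.end()
    return doc
```

Each SD-ID becomes a key whose value is a dictionary of that element's parameters, so messages with different sets of parameters need no schema change.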
  • 5. So what happens next?
      • Selection of NoSQL database:
          • Likely a Column Family Store, if no one can suggest a better one?
          • Something easy to set up and use; I will concentrate on getting the RADIUS server and/or Syslogd transferring data to the database.
      • Setting up a WiFi access point and/or controller to provide real RADIUS and Syslog data
      • Storing data, retrieving data, searching data, deleting data to see what works
      • Writing and presenting Part II: “Implementation and Results” of these slides
  • 6. Results (hopefully)
      • Is storing RADIUS accounting and Syslog messages in a NoSQL database a brilliant idea, a brilliantly stupid idea or something else?
      • How hard can it be? What does it require, is it possible and how?
      • Does it actually work? What can you do with the data? Is there some indication of performance improvements or problems?
      • I will not do complete performance measurements though; designing and setting up a reliable measurement environment would probably take too much time.
  • 7. Using NoSQL databases to store RADIUS and Syslog data, part II: The Saga Continues Karri Huhtanen 27.11.2012
  • 8. Happened earlier
      • Currently, RADIUS accounting data is usually stored in SQL databases with a fixed database schema.
      • For Syslog messages an SQL database can be used, but commercial log analyzers (like Splunk) usually use their own solutions, which may or may not be SQL databases.
      • I started wondering whether a NoSQL database could be applied to one or both of these.
  • 9. Results (luckily)
      • Is storing RADIUS accounting and Syslog messages in a NoSQL database a brilliant idea, a brilliantly stupid idea or something else?
        Answer: a good idea.
      • How hard can it be? What does it require, is it possible and how?
        Answer: easy, 1 night before the presentation required.
      • Does it actually work? What can you do with the data? Is there some indication of performance improvements or problems?
        Answer: Yes. Store and process. Unknown. Some issues to be considered.
      • Will not do complete performance measurements though; designing and setting up a reliable measurement environment would probably take too much time.
        Answer: coded one Python script.
  • 10. So what happened?
      • Selection of NoSQL database:
          • Likely a Column Family Store, if no one can suggest a better one?
            Result: MongoDB
          • Something easy to set up and use; I will concentrate on getting the RADIUS server and/or Syslogd transferring data to the database.
      • Setting up a WiFi access point and/or controller to provide real RADIUS and Syslog data
      • Storing data, retrieving data, searching data, deleting data to see what works
        Result: done, but not thoroughly
      • Writing and presenting Part II: “Implementation and Results” of these slides
        Result: done
  • 11. That was the executive summary. Thank you.
  • 12. Now some more detailed information and even some code.
  • 13. Storing RADIUS accounting and Syslog messages in a NoSQL database
      • It is a good idea because:
          • When we have a massive amount of log or accounting data, we need massive database clusters.
          • Data is mainly stored, read, analyzed and occasionally deleted. Data will not be updated or changed and is relatively simple (few tables with a lot of columns).
          • NoSQL may provide a better way to scale this horizontally by distribution and sharding.
          • It is already being done. Several log analyzers and stores already use NoSQL databases as backends. There are projects such as Graylog2 which provide complete solutions covering log storage, visualization and analysis.
          • Logs and accounting data are actually documented use cases for some NoSQL databases, for example: http://docs.mongodb.org/manual/use-cases/storing-log-data/
  • 14. Storing RADIUS accounting and Syslog messages in a NoSQL database
      • It is not a brilliant idea because:
          • If we look at what we need to do to optimize the performance, it starts to look a lot like designing and optimizing an SQL database: http://docs.mongodb.org/manual/use-cases/storing-log-data/
          • You cannot forget datatypes or database design even with NoSQL databases, especially when going into production.
          • Prototypes may be faster and easier for developers, but creating a design and configuration which survives production use may be as hard as it has ever been. The difference is that instead of an SQL database expert, you now need a NoSQL expert.
      • ... but it is not a brilliantly stupid idea either; it is an idea worth considering, depending on the project.
  • 15. How hard can it be?
      • With Ubuntu Linux Server 12.04 LTS:
          sudo apt-get install python-pymongo mongodb syslog-ng syslog-ng-mod-mongodb
      • For Syslog-NG, just some configuration.
      • For Radiator, some configuration and coding an external Python script to handle accounting messages.
      • But this is far from production use; it is more like a prototype or proof-of-concept implementation done in 1 work day.
  • 16. Demo
  • 17. Syslog-ng
          # /etc/syslog-ng/syslog-ng.conf

          # mongodb log destination
          destination karrin_net_mongodb { mongodb(); };

          # ...
          log { source(s_src); source(s_net); destination(karrin_net_mongodb); };

          # that’s it
      https://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.3-guides/syslog-ng-ose-v3.3-guide-admin-en.html/reference_destination_mongodb.html
  • 18. Radiator RADIUS server
          # /etc/radiator/radiator.cfg
          #
          # send all RADIUS accounting requests to external script
          #
          <Handler Request-Type = Accounting-Request>
              <AuthBy EXTERNAL>
                  Command %D/acct2mongo.py
              </AuthBy>
              AcctLogFileName %L/acct-acct2mongodb-%Y-%M.log
          </Handler>
  • 19. acct2mongo.py
          #!/usr/bin/env python
          from pymongo import Connection
          import datetime
          import sys

          def main():
              line = str()
              post = dict()
              # opening connection
              connection = Connection('localhost', 27017)
              # database 'radius'
              db = connection['radius']
              # collection 'accounting'
              collection = db['accounting']
              post['acct2mongotimestamp'] = datetime.datetime.utcnow()
              for line in sys.stdin.readlines():
                  pieces = line.split(' = ', 1)
                  if len(pieces) == 2:
                      post[pieces[0].strip().strip('"')] = pieces[1].strip().strip('"')
              collection.insert(post)
              connection.end_request()
              connection.disconnect()
              # 0 Means reply with an acceptance. For Access-Requests,
              # an Access-Accept will be sent. For Accounting-Requests,
              # an Accounting-Response will be sent.
              return 0

          if __name__ == '__main__':
              main()
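The script above targets the 2012-era PyMongo API; later PyMongo releases replace `Connection` with `MongoClient` and `insert` with `insert_one`. The following is a hedged Python 3 sketch of the same idea, not the author's code: the split into `parse_accounting` and `store` and both function names are assumptions made so that the parsing step can be tried without a running MongoDB.

```python
import datetime


def parse_accounting(lines):
    """Build one document from the 'Name = value' lines of a request."""
    post = {"acct2mongotimestamp": datetime.datetime.now(datetime.timezone.utc)}
    for line in lines:
        pieces = line.split(" = ", 1)
        if len(pieces) == 2:
            name = pieces[0].strip().strip('"')
            # Values stay strings, exactly as in the original script.
            post[name] = pieces[1].strip().strip('"')
    return post


def store(post, host="localhost", port=27017):
    """Insert the document into radius.accounting (requires PyMongo 3 or later)."""
    from pymongo import MongoClient  # imported here so parsing works without it

    client = MongoClient(host, port)
    try:
        client["radius"]["accounting"].insert_one(post)
    finally:
        client.close()
```

An external-command hook in the style of the original script would call `store(parse_accounting(sys.stdin.readlines()))` and exit with status 0. As in the original, an attribute that occurs twice in one message (such as Digest-Response in the sample accounting message) keeps only its last value.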
  • 20. Does it actually work? What can you do with data?
      • Yes, it does actually work, but once again it does not solve everything and is not applicable to everything.
      • One can store, read, search and delete data, supposedly very efficiently, but anything more complicated is harder and must be implemented by the developer.
      • For example: MongoDB does not have a reliable decimal datatype. It is better to keep numbers as strings and convert them when processing data.
      • Repeating the earlier statement: “You cannot forget datatypes or database design even with NoSQL databases, especially when going into production.”
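As an illustration of converting string values at processing time, here is a small Python sketch (not from the slides) that combines the octet and gigaword counters of a stored accounting document into one integer. The helper name `total_octets` is hypothetical; it relies on the RADIUS convention that an Acct-*-Gigawords attribute counts how many times the corresponding 32-bit octet counter has wrapped.

```python
GIGAWORD = 2 ** 32  # one gigaword = one wrap of the 32-bit octet counter


def total_octets(doc, direction):
    """Combine the string-valued octet and gigaword counters into one int.

    direction is "Input" or "Output"; missing attributes count as zero.
    """
    octets = int(doc.get("Acct-%s-Octets" % direction, "0"))
    gigawords = int(doc.get("Acct-%s-Gigawords" % direction, "0"))
    return gigawords * GIGAWORD + octets
```

With the sample message from slide 3, `total_octets(doc, "Input")` gives 1262012, since the gigaword counter is 0.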
  • 21. Performance?
      • Would need to be measured and verified with a real production environment or solution.
      • Would also need to be compared with a well designed and optimised SQL database, maybe even one functioning as a NoSQL one.
      • In this implementation performance was not tested, as the datasets were very small compared to real datasets.
  • 22. Conclusions
      • NoSQL should at least be considered as an option when designing and implementing large scale Syslog or RADIUS accounting storage.
      • For development it is flexible.
      • For production use a NoSQL solution still needs design, careful planning and testing to verify that the performance, reliability and security are sufficient. Probably as much as SQL database design.
      • The key issue will probably be whether the SQL database can handle the data or whether horizontal scaling is required.