Reconnoiter:
                 Large-scale trending and fault-detection



                             / another product built from pain



Sunday, August 1, 2010
Goals




                    •    make checks cheap: 10000+ checks on cheap hardware

                    •    centralized configuration management.

                    •    decentralized configuration manipulation.

                    •    decouple data collection from

                         •   visualization/trending

                         •   fault-detection

                    •    make life suck just a little less.




Architectural Design




System Components

                    •    noitd

                         •   C, hybrid thread/event model, async I/O, small, fast, efficient,

                         •   extensible in Lua.

                    •    stratcond

                         •   C, brokers and aggregates data, feeds PostgreSQL,

                         •   feeds Esper complex event processing system for fault-detection,

                         •   has a comet-style webserver built in for feeding web clients.

                    •    postgres

                    •    webconsole

                         •   PHP, almost entirely AJAX based,

                         •   uses canvas to draw.




Some basics on the architecture



                    •    Everything important happens over SSL:

                         •   Services exposed over SSL with certificates,

                         •   Client connects using client certificates as well.

                    •    Designed so that data collection is journalled and replayed

                         •   prevents data loss due to transient networking issues
                             between NOC and data center.

                    •    There are a lot of moving parts...

                         •   designed to work when the parts don’t get along.
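
The journal-and-replay idea above can be sketched in a few lines of shell. This is only an illustration of the technique, not Reconnoiter's actual journal format or transport: the journal path, the function names, and the sender command are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative journal-and-replay sketch; NOT Reconnoiter's real on-disk format.
JOURNAL=${JOURNAL:-./metrics.journal}   # hypothetical journal path

# Record every sample locally first, so a dead link never drops data.
journal_append() {
    printf '%s\n' "$1" >> "$JOURNAL"
}

# Replay each journalled line through the sender command given as arguments.
# The journal is truncated only after every line was delivered; if the sender
# fails midway the journal is kept whole, so delivery is at-least-once.
journal_replay() {
    [ -s "$JOURNAL" ] || return 0
    while IFS= read -r line; do
        "$@" "$line" || return 1    # link still down: keep the journal intact
    done < "$JOURNAL"
    : > "$JOURNAL"
}
```

For example, `journal_append "load 0.42"` followed later by `journal_replay echo` prints the stored samples and empties the journal; a real sender would replace `echo`.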




Installing the code




                    •    svn co https://labs.omniti.com/reconnoiter/tags/wangle

                    •    cd wangle

                    •    autoconf

                    •    ./configure

                    •    make

                    •    make install




Installing the database


                    •    createdb reconnoiter

                    •    createlang -d reconnoiter plpgsql

                    •    createuser reconnoiter

                    •    createuser stratcon

                    •    createuser prism

                    •    psql -U postgres reconnoiter

                         •   BEGIN;

                         •   \i sql/reconnoiter_ddl_dump.sql

                         •   COMMIT;

                    •    install the crontab in sql/crontab
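
The database steps above could be collected into one shell function, roughly as follows. This is a sketch under assumptions: the `DRY_RUN` switch and both function names are inventions of this example, and it uses psql's `--single-transaction` option in place of the hand-typed BEGIN/COMMIT. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the database install as one function. It PRINTS the commands
# unless DRY_RUN=0 is set (DRY_RUN is a convention of this sketch only).
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

install_reconnoiter_db() {
    run createdb reconnoiter &&
    run createlang -d reconnoiter plpgsql &&
    for role in reconnoiter stratcon prism; do
        run createuser "$role" || return 1
    done &&
    # --single-transaction wraps the file in BEGIN/COMMIT; ON_ERROR_STOP makes
    # psql stop at the first failing statement so nothing is half-applied.
    run psql -U postgres -v ON_ERROR_STOP=1 --single-transaction \
        -f sql/reconnoiter_ddl_dump.sql reconnoiter &&
    # Caution: 'crontab FILE' replaces the invoking user's existing crontab.
    run crontab sql/crontab
}
```

Calling `install_reconnoiter_db` from the top of the checkout shows the plan; `DRY_RUN=0 install_reconnoiter_db` would execute it.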




Installing the web console




                    •    The web UI lives in trunk/web/ui

                    •    Set up Apache (left as an exercise for the reader)




SSL is underneath everything



                    •    Set up SSL certs:

                         •   Need a CA (even if a dummy CA)

                         •   need a cert (signed) for each noitd server

                         •   need a client cert (signed) for the stratcond server

                         •   need a web cert (signed) for the stratcond server (future**)

                    •    More details by running make test and looking into the

                         •   test-noit.conf

                         •   test-stratcon.conf
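
For a throwaway test setup, the certificate list above might be produced with plain `openssl` along these lines. It is a minimal sketch, not the procedure `make test` uses: the directory, file names, and subject names are assumptions, and the certificates carry no extensions. How the resulting files are referenced is best taken from test-noit.conf and test-stratcon.conf.

```shell
#!/bin/sh
# Sketch: dummy CA plus one signed cert each for a noitd server and stratcond.
# All file and subject names below are illustrative assumptions.
set -e
CERT_DIR=${CERT_DIR:-./test-certs}
mkdir -p "$CERT_DIR"

# 1. Self-signed dummy CA (unencrypted key; fine for tests only).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=Dummy Reconnoiter CA" \
    -keyout "$CERT_DIR/ca.key" -out "$CERT_DIR/ca.crt"

# 2. For each daemon: a key and signing request, then a CA-signed cert.
for name in noit-test stratcon-test; do
    openssl req -newkey rsa:2048 -nodes -subj "/CN=$name" \
        -keyout "$CERT_DIR/$name.key" -out "$CERT_DIR/$name.csr"
    openssl x509 -req -in "$CERT_DIR/$name.csr" \
        -CA "$CERT_DIR/ca.crt" -CAkey "$CERT_DIR/ca.key" \
        -CAcreateserial -days 365 -out "$CERT_DIR/$name.crt"
done

# 3. Confirm both certs chain back to the dummy CA.
openssl verify -CAfile "$CERT_DIR/ca.crt" \
    "$CERT_DIR/noit-test.crt" "$CERT_DIR/stratcon-test.crt"
```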




Configure noitd




                         •   noit.conf

                         •   pretty good as a starting point, just clear out all the checks

                         •   module loading is at boot time, so make sure all the modules you
                             want are loaded.

                         •   checks can be added at run time via the text console.

                         •   noitd must be started as root; it drops privileges after module
                             initialization.

                         •   /usr/local/sbin/noitd -c /usr/local/etc/noit.conf




Configure stratcond




                         •   stratcon.conf

                         •   pretty good as a starting point, just clear out all the noit addresses

                         •   each noit in the field requires a line in the stratcon.conf file:

                             •   <noit address="10.225.209.25" port="43191" />

                         •   database passwords must be configured

                         •   hostname and document_domain must be configured

                         •   /usr/local/sbin/stratcond -c /usr/local/etc/stratcon.conf
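
With many noits in the field, the per-noit lines could be generated rather than typed. A small sketch follows; the helper name `noit_lines` is hypothetical, and any address other than the one shown above is a placeholder.

```shell
#!/bin/sh
# Emit one <noit .../> element per collector address.
# Usage: noit_lines PORT ADDRESS [ADDRESS...]
noit_lines() {
    port=$1
    shift
    for addr in "$@"; do
        printf '<noit address="%s" port="%s" />\n' "$addr" "$port"
    done
}
```

For instance, `noit_lines 43191 10.225.209.25 10.225.209.26` prints two elements ready to paste into the noit section of stratcon.conf (the second address is made up for illustration).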




Thank You




                    •    Documentation at https://labs.omniti.com/trac/reconnoiter

                    •    Commercial support available from OmniTI.

                    •    Please join in.

                    •    OmniTI is hiring: http://omniti.com/is/hiring

                    •    Thanks!





Noit ocon-2010
