Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Efficient logging in multithreaded C++ server

16,972 views

Published on

Published in: Technology, Business
  • Have a look at easylogging++ a simple (single-header based) multi-threaded logging c++ lib with many features (github.com/mkhan3189/easyloggingpp)
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • 大侠,这篇文章为何不让下载呢?呵呵
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
  • 很赞同,log不需要那么复杂,log4cxx之类太复杂了,基本满足我的需求(除了一些地方无法配置外)
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here

Efficient logging in multithreaded C++ server

  1. 1. 1 www.chenshuo.com EFFICIENT LOGGING IN MULTITHREADED C++ SERVER2012/06 Shuo Chen
  2. 2. Show me the code2  C++ logging library in muduo 0.5.0  http://code.google.com/p/muduo (release)  github.com/chenshuo/muduo (latest)  github.com/chenshuo/recipes/tree/master/logging  Performance on i5-2500  1,000,000+ log messages per second  Max throughput 100+MiB/s  1us~1.6us latency per message for async logging 2012/06 www.chenshuo.com
  3. 3. Two meanings of log3  Diagnostic log  Transaction log  log4j,logback, slf4j  Write-ahead log  log4cxx, log4cpp,  Binlog,redo log log4cplus, glog, g2log,  Journaling Pantheios, ezlogger  Log-structured FS  Here we mean diagnostic log/logging  Textual, human readable  grep/sed/awk friendly 2012/06 www.chenshuo.com
  4. 4. Features of common logging library4  Multiple log levels, enable/disable at runtime  TRACE, DEBUG, INFO, WARN, ERROR, FATAL  Flexibilities are not all necessary!  Appenders, layouts, filters  Many possible destinations? ie. Appender  There is one true destination, really  Configure log line format at runtime? ie. Layout  You won’t change format during life time of a project  Muduo logging is configured at compile time  No xml config file, but a few lines of code in main() 2012/06 www.chenshuo.com
  5. 5. What to log in distributed system5  Everything! All the time!  http://highscalability.com/log-everything-all-time  Log must be fast  Tensof thousands messages per sec is normal  without noticeable performance penalty  Should never block normal execution flow  Log data could be big  up to 1GB per minute, for one process on one host  An efficient logging library is a prerequisite for any non-trivial server-side program www.chenshuo.com 2012/06
  6. 6. Frontend and backend of log lib6  Frontend formats log messages  Backend sends log messages to destination (file)  The interface between could be as simple as  void output_log(const char* msg, int len)  However, in a multithreaded program, the most difficult part is neither frontend nor backend, but transfer log data from frontend to backend  Multiple producers (frontend), one consumer  Low latency / low CPU overhead for frontend  High throughput for backend 2012/06 www.chenshuo.com
  7. 7. Frontend should be easy to use7  Two styles in C++, C/Java function vs. C++ stream  printlog(“Received %d bytes from %s”, len, client);  LOG << “Received ” << len << “ bytes from ” << client;  print*() can be made type safe, but cumbersome  You can’t pass non-POD objects as (...) arguments  Pantheios uses overloaded function templates  LOG is easier to use IMO, no placeholder in fmt str  When logging level is disabled, the whole statement can be made a nop, almost no runtime overhead at all  http://www.drdobbs.com/cpp/201804215 2012/06 www.chenshuo.com
  8. 8. Why muduo::LogStream ?8  std::ostream is too slow and not thread-safe  One ostringstream object per log msg is expensive  LogStream is fast because  No formatting, no manipulators, no i18n or l10n  Output integers, doubles, pointers, strings  Fmt class is provided, though  Fixed-size buffer allocated on stack, no malloc call  Also limit the max log msg to 4000 bytes, same as glog  See benchmark result at  www.cnblogs.com/Solstice/archive/2011/07/17/2108715.html 2012/06  muduo/base/test/LogStream_bench.cc www.chenshuo.com
  9. 9. Log line format is also fixed9  You don’t want to change output format at runtime, do you?  One log message per line, easy for grep  date time(UTC) thread-id level message source 20120603 08:02:46.125770Z 23261 INFO Hello - test.cc:51 20120603 08:02:46.126926Z 23261 WARN World - test.cc:52 20120603 08:02:46.126997Z 23261 ERROR Error - test.cc:53  No overhead of parsing format string all the time  Further more, the date time string is cached in 1 sec  TRACE and DEBUG levels are disabled by default  Turn them on with environment variable www.chenshuo.com 2012/06
  10. 10. The one true destination: LogFile10  Local file, with rolling & timestamp in log filename  It is a joke to write large amount of log messages to  SMTP, Database, Network (FTP, “log server”)  The purpose of logging is to investigate what has happened in case of system failure or malfunction  Network could fail, how to log that event?  Log to network may also double bandwidth usage  Be aware of log to network mapped file system  What if disk fails? The host is not usable anyway  Check dmesg and other kernel logs 2012/06 www.chenshuo.com
  11. 11. Performance requirement11  Suppose PC server with SATA disks, no RAID  110MB/s 7200rpm single disk  Logging to local disk, 110 bytes per log message  1000k log messages per second with IO buffer  Target for a “high-performance” logging library  For a busy server with 100k qps, logging every request should not take too much CPU time  100k log msgs per second can only serve 10k qps  The log library should be able to write 100MB/s  And lasts for seconds (at peak time of course) 2012/06 www.chenshuo.com
  12. 12. Performance of muduo logging12  ~100-byte msgs, single thread, synchronous log  E5320  i5-2500  nop (msgs/s, bytes/s)  nop (msgs/s, bytes/s)  971k/s, 107.6MiB/s  2422k/s, 256.2MiB/s  /dev/null  /dev/null  912k/s, 101.1MiB/s  2342k/s, 247.7MiB/s  /tmp/log  /tmp/log  810k/s, =89.8MiB/s  2130k/s, 225.3MiB/s  1,000,000+ msgs per sec, saturate disk bandwidth 2012/06 www.chenshuo.com
  13. 13. Other trick for postmortem13  Log output must be buffered  can’t afford fflush(3)ing every time  What if program crashes, there must be some unwritten log messages in core file, how to find ?  Put a cookie (a sentry value) at beginning of message/buffer  Cookie can be an address of function, to be unique  Also set cookie to some other function in dtor  Identify messages with gdb find command 2012/06 www.chenshuo.com
  14. 14. Logging in multithreaded program14  Logging must be thread-safe (no interleaving)  And efficient  Better to log to one file per process, not per thread  Easierfor reading log, not jump around files  The OS kernel has to serialize writing anyway  Thread-safe is easy, efficient is not that easy  Global lock and blocking writing are bad ideas  One background thread gathers log messages and write them to disk. Aka. asynchronous logging. 2012/06 www.chenshuo.com
  15. 15. Asynchronous logging is a must15  Aka. Non-blocking logging  Disk IO can block for seconds, occasionally  Cause timeouts in distributed system  and cascade effects, eg. false alarms of deadlock, etc.  Absolutely no disk IO in normal execution flow  Very important for non-blocking network programming, check my other slides  We need a “queue” to pass log data efficiently  Not necessarily a traditional blocking queue,  no need to notify consumer every time there is 2012/06 something to write www.chenshuo.com
  16. 16. What if messages queue up?16  Program writes log faster than disk bandwidth?  Itqueues first in OS cache  Then in the process, memory usage increase rapidly  In case of overload, the logging library should not crash or OOM, drop messages instead  Send alerts via network if necessary  Not a problem in synchronous logging  Blocking-IO is easy/good for bandwidth throttling 2012/06 www.chenshuo.com
  17. 17. Double buffering for the queue17  Basic idea: two buffers, swap them when one is full  Allbusiness threads write to one buffer, memcpy only  Log thread writes the other buffer to disk  Improvement: four buffers, no waiting in most case  Critical code in critical sections, next two pages typedef boost::ptr_vector<LargeBuffer> BufferVector; typedef BufferVector::auto_type BufferPtr; muduo::MutexLock mutex_; muduo::Condition cond_; BufferPtr currentBuffer_; BufferPtr nextBuffer_; BufferVector buffers_; 2012/06 www.chenshuo.com
  18. 18. void AsyncLogging::append(const char* logline, int len) { muduo::MutexLockGuard lock(mutex_); if (currentBuffer_->avail() > len) { // most common case: buffer is not full, copy data here currentBuffer_->append(logline, len); } else // buffer is full, push it, and find next spare buffer { buffers_.push_back(currentBuffer_.release()); if (nextBuffer_) // is there is one already, use it { currentBuffer_ = boost::ptr_container::move(nextBuffer_); } else // allocate a new one { currentBuffer_.reset(new Buffer); // Rarely happens } currentBuffer_->append(logline, len); cond_.notify(); }18 } 2012/06 www.chenshuo.com
  19. 19. // in log thread BufferPtr newBuffer1(new Buffer); BufferPtr newBuffer2(new Buffer); boost::ptr_vector<Buffer> buffersToWrite(16); while (running_) { // swap out what need to be written, keep CS short { muduo::MutexLockGuard lock(mutex_); cond_.waitForSeconds(flushInterval_); buffers_.push_back(currentBuffer_.release()); currentBuffer_ = boost::ptr_container::move(newBuffer1); buffersToWrite.swap(buffers_); if (!nextBuffer_) { nextBuffer_ = boost::ptr_container::move(newBuffer2); } } // output buffersToWrite, re-fill newBuffer1/2 } // final note: bzero() each buffer initially to avoid page faults19 2012/06 www.chenshuo.com
  20. 20. Alternative solutions?20  Use normal muduo::BlockingQueue<string> or BoundedBlockingQueue<string> as the queue  Allocate memory for every log message, you need a good malloc that optimized for multithreading  Replace stack buffer with heap buffer in LogStream  Instead of copying data, passing pointer might be faster  But as I tested, copying data is 3x faster for small msgs ~4k  That’s why muduo only provides one AsyncLogging class  More buffers, reduce lock contention  Like ConcurrentHashMap, buckets hashed by thread id 2012/06 www.chenshuo.com
  21. 21. Conclusion21  Muduo logging library is fairly speedy  It provides the most fundamental features  No bells and whistles, no unnecessary flexibilities  Frontend: LogStream and Logging classes  LOG_INFO << "Hello"; 20120603 08:02:46.125770Z 23261 INFO Hello - test.cc:51  Backend: LogFile class, rolling & timestamped  logfile_test.20120603-144022.hostname.3605.log  For multithreaded, use AsyncLogging class  Check examples in muduo/base/tests for how to use 2012/06 www.chenshuo.com

×