Tensorflow and python : fault detection system - PyCon Taiwan 2017


Published in: Technology

  1. PyCon TW 2017. PyCon Taiwan 2017. Eric (Byungwook) Ahn. Tensorflow & Python: Fault Detection System
  2. Who am I? Experience: CDN, media streaming, device drivers (Windows, Linux). 2015 PyCon KR; 2015 PyCon HK; 2016 Swift KR; 2016 Tensorflow KR; 2017 PyCon Taiwan; ?
  3. Today …: Fault Detection; LOG LOG LOG; LOG2ML
  4. Currently, my company runs over 200 services. They use different systems and are located in different IDCs (Internet data centers).
  5. What is fault detection?
  6. Before looking at fault detection, we need to know what a fault is.
  7. Fault. Wikipedia: A fault is defined as an abnormal condition or defect at the component, equipment, or sub-system level which may lead to a failure.
  8. There are many fault detection systems.
  9. In general, they monitor: usage of CPU, memory, and disk I/O; network bandwidth; system logs and application logs; JVM monitoring; URI checks; …
  10. PRTG; Cacti; zabbix; L4/L7 (commercial product); log and process monitoring (commercial product); DB monitoring (open source, commercial product); Application Performance Monitoring (APM) for JVM; ElasticSearch; Hadoop; grafana; kibana; …
  11. Many views, charts, and alarm systems in an IDC.
  12. I would like to detect faults with ML.
  13. LOG LOG LOG
  14. Log formats: apache, squid, custom log formats, ..
  15. Types of log:
      Type of log      Log filename               Daemon      Description
      kernel log       /dev/console               kernel      Console log
      system log       /var/log/messages          syslogd
      security log     /var/log/secure            xinetd
      mail log         /var/log/maillog           sendmail    Sendmail
      cron log         /var/log/cron              crond
      booting log      /var/log/boot.log          kernel
      kernel boot log  /var/dmesg                 kernel
      kernel log       /var/log/wtmp              kernel      Record of all system logins
      kernel log       /var/log/utmp              kernel      Record of current logins, IP, ..
      ftp log          /var/log/xferlog           ftpd
      http log         /var/log/httpd/access.log  httpd
      http error log   /var/log/httpd/error.log   httpd
      named log        /var/log/named.log         named
      Application log  depends on application     app daemon
  16. Example: system log (/var/log/messages)
  17. Sample entries:
      May 14 03:43:01 web01 rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="1673" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
      May 14 04:40:01 web01 ntpdate[6617]: the NTP socket is in use, exiting
      May 14 05:40:01 web01 ntpdate[8437]: the NTP socket is in use, exiting
      May 14 06:40:01 web01 ntpdate[10212]: the NTP socket is in use, exiting
      May 14 07:40:01 web01 ntpdate[12315]: the NTP socket is in use, exiting
      May 14 08:40:01 web01 ntpdate[14090]: the NTP socket is in use, exiting
      May 14 15:40:01 web01 ntpdate[27169]: the NTP socket is in use, exiting
      May 14 16:40:01 web01 ntpdate[28940]: the NTP socket is in use, exiting
      May 14 17:40:01 web01 ntpdate[30706]: the NTP socket is in use, exiting
      May 14 18:40:01 web01 ntpdate[32818]: the NTP socket is in use, exiting
      May 14 19:40:01 web01 ntpdate[34583]: the NTP socket is in use, exiting
      Mar 15 10:45:58 web01 oddjobd: oddjobd shutdown succeeded
      Mar 15 10:45:59 web01 sshd[1153]: Received signal 15; terminating.
      Mar 15 10:45:59 web01 snmpd[1139]: Received TERM or STOP signal... shutting down...
      Mar 15 10:45:59 web01 snmpd[1139]: snmpd: send_trap: Failure in sendto (Network is unreachable)
      Mar 15 10:45:59 web01 xinetd[1164]: Exiting...
      Mar 15 10:45:59 web01 ntpd[1175]: ntpd exiting on signal 15
      Mar 15 10:45:59 web01 init: Disconnected from system bus
      Mar 15 10:45:59 web01 console-kit-daemon[37526]: WARNING: no sender#012
      Mar 15 10:45:59 web01 nslcd[1073]: caught signal SIGTERM (15), shutting down
      Mar 15 10:45:59 web01 nslcd[1073]: version 0.7.5 bailing out
      Mar 15 10:45:59 web01 kernel: Kernel logging (proc) stopped.
      May 16 03:40:01 web01 ntpdate[49591]: the NTP socket is in use, exiting
      May 16 04:40:01 web01 ntpdate[51553]: the NTP socket is in use, exiting
      May 16 05:40:01 web01 ntpdate[53365]: the NTP socket is in use, exiting
      May 16 11:40:01 web01 ntpdate[64664]: the NTP socket is in use, exiting
      May 16 12:40:01 web01 ntpdate[1211]: the NTP socket is in use, exiting
      May 16 13:29:40 web01 sshd[2748]: Did not receive identification string from 10.40.133.188
      May 16 13:29:45 web01 sshd[2749]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.40.133.188 user=pycon
      May 16 13:29:45 web01 sshd[2749]: Accepted password for pycon from 10.40.133.188 port 57463 ssh2
      May 16 13:29:45 web01 sshd[2749]: pam_unix(sshd:session): session opened for user pycon by (uid=0)
      May 16 13:29:49 web01 su: pam_unix(su-l:session): session opened for user root by pycon(uid=9000709)
      May 16 13:31:24 web01 sshd[2749]: pam_unix(sshd:session): session closed for user pycon
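Before entries like these can be fed to a model, the timestamp and host usually need to be separated from the message text. The following is a minimal sketch of one way to do that, assuming every entry follows the "month day time host program[pid]: message" layout seen in the sample. The regular expression and the function name `parse_syslog_line` are illustrative assumptions, not part of the talk.

```python
import re

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> <program>[<pid>]: <message>"
# The "[<pid>]" part is optional, as in "init: Disconnected from system bus".
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = SYSLOG_LINE.match(line.strip())
    return match.groupdict() if match else None


sample = "May 14 04:40:01 web01 ntpdate[6617]: the NTP socket is in use, exiting"
record = parse_syslog_line(sample)
print(record["program"], "|", record["message"])
```

Only the `program` and `message` fields carry the wording that a text classifier would look at; the timestamp and host could be kept separately for alerting.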
  18. LOG2ML
  19. Log data is also natural language. The order of words and expressions matters, so it is sequential data.
  20. Machine learning: supervised learning vs. unsupervised learning; binary classification vs. multi-label classification; sentence clustering; topic modeling; word2vec, doc2vec, paragraph2vec; sentiment analysis; CNN; RNN
  21. As you know, a CNN is an architecture commonly used for image classification.
  22. Figures: a regular 3-layer neural network; a ConvNet architecture. ref (2): "cs231n: Convolutional Neural Networks for Visual Recognition", Stanford
  23. What is a Conv layer? It computes the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume.
  24. Original image (5x5), 3x3 filter, and convolved feature (3x3):
      Original image    3x3 filter    Convolved feature
      0 0 1 1 0         x1 x0 x1      3 3 3
      0 1 1 1 0         x0 x1 x0      2 4 3
      0 0 1 1 0         x1 x0 x1      2 4 4
      0 0 1 1 0
      0 1 1 1 1
  25. Rows 1-3, columns 1-3: (0x1 + 0x0 + 1x1) + (0x0 + 1x1 + 1x0) + (0x1 + 0x0 + 1x1) = 3
  26. Rows 1-3, columns 2-4: (0x1 + 1x0 + 1x1) + (1x0 + 1x1 + 1x0) + (0x1 + 1x0 + 1x1) = 3
  27. Rows 1-3, columns 3-5: (1x1 + 1x0 + 0x1) + (1x0 + 1x1 + 0x0) + (1x1 + 1x0 + 0x1) = 3
  28. Rows 2-4, columns 1-3: (0x1 + 1x0 + 1x1) + (0x0 + 0x1 + 1x0) + (0x1 + 0x0 + 1x1) = 2
  29. Rows 2-4, columns 2-4: (1x1 + 1x0 + 1x1) + (0x0 + 1x1 + 1x0) + (0x1 + 1x0 + 1x1) = 4
  30. Rows 2-4, columns 3-5: (1x1 + 1x0 + 0x1) + (1x0 + 1x1 + 0x0) + (1x1 + 1x0 + 0x1) = 3
  31. Rows 3-5, columns 1-3: (0x1 + 0x0 + 1x1) + (0x0 + 0x1 + 1x0) + (0x1 + 1x0 + 1x1) = 2
  32. Rows 3-5, columns 2-4: (0x1 + 1x0 + 1x1) + (0x0 + 1x1 + 1x0) + (1x1 + 1x0 + 1x1) = 4
  33. Rows 3-5, columns 3-5: (1x1 + 1x0 + 0x1) + (1x0 + 1x1 + 0x0) + (1x1 + 1x0 + 1x1) = 4. The convolved feature is complete.
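The sliding-window arithmetic of slides 24 to 33 can be reproduced in a few lines. This is a minimal framework-free sketch using the image and filter values from the slides; the function name `convolve2d_valid` is made up for illustration, and the code assumes a square filter, stride 1, and no padding.

```python
# 5x5 input image and 3x3 filter copied from the slides
image = [
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]


def convolve2d_valid(image, kernel):
    """Slide a square kernel over the image with stride 1 and no padding."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    return [
        [
            # Multiply the window element-wise with the kernel and sum it up
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


convolved = convolve2d_valid(image, kernel)
for row in convolved:
    print(row)
```

Running it prints the rows [3, 3, 3], [2, 4, 3], and [2, 4, 4], the same convolved feature the slides build up step by step.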
  34. Figures: a regular 3-layer neural network; a ConvNet architecture. ref (2): "cs231n: Convolutional Neural Networks for Visual Recognition", Stanford
  35. Sentence, paragraph, document?
  36. Convolutional Neural Networks for Sentence Classification. Using a text CNN filter preserves the local information in the text, its sequential order, and its context. ref (1): "Convolutional Neural Networks for Sentence Classification", 2014
  37. Log2ML (CNN)
  38. Messages log, one labeled example per line (category: fault or normal):
      snmpd[1139]: Received TERM or STOP signal shutting down, fault
      caught SIGTERM shutting down, fault
      SIGHUP received Attempting to restart, fault
      sshd[2749]: Accepted password for pycon from 10.40.133.188 port 57463 ssh2, normal
      ntpdate[27169]: the NTP socket is in use exiting, normal
      Network: Input Layer -> Conv. Layer -> Pooling Layer -> Fully-Connected Layer
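Labeled lines in this "message, category" form have to be split into a text and a target vector before training. Here is a small sketch of that step, assuming the category is always the text after the last comma. The helper names `split_example` and `one_hot` and the class order in `CLASSES` are assumptions made for illustration.

```python
# Labeled examples in the "message, category" form shown on the slide
labeled_lines = [
    "snmpd[1139]: Received TERM or STOP signal shutting down, fault",
    "caught SIGTERM shutting down, fault",
    "SIGHUP received Attempting to restart, fault",
    "sshd[2749]: Accepted password for pycon from 10.40.133.188 port 57463 ssh2, normal",
    "ntpdate[27169]: the NTP socket is in use exiting, normal",
]

CLASSES = ["normal", "fault"]  # assumed class order


def split_example(line):
    """Split on the last comma: everything before it is the message."""
    text, label = line.rsplit(",", 1)
    return text.strip(), label.strip()


def one_hot(label):
    """Encode a category as a 0/1 vector with one position per class."""
    return [1 if label == name else 0 for name in CLASSES]


texts, targets = [], []
for line in labeled_lines:
    text, label = split_example(line)
    texts.append(text)
    targets.append(one_hot(label))

print(texts[1], targets[1])
```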
  39. To build an index of the words, use the VocabularyProcessor class of Tensorflow (figure: log example):
        import numpy as np
        import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.x

        # Map every word to an integer id, with a fixed length per document
        vocab_processor = tf.contrib.learn.preprocessing.VocabularyProcessor(max_document_length)
        x_train = np.array(list(vocab_processor.fit_transform(x_train)))
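To make clear what this indexing step produces, here is a framework-free sketch of the same idea: each distinct word gets an integer id, and every line is cut or zero-padded to a fixed length. It is a simplification and not the VocabularyProcessor implementation. Splitting on whitespace, reserving id 0 for padding and unknown words, and the helper names are all assumptions.

```python
def build_vocabulary(documents):
    """Assign each distinct word an integer id; id 0 is reserved for padding."""
    vocabulary = {}
    for document in documents:
        for word in document.split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary) + 1
    return vocabulary


def to_ids(document, vocabulary, max_document_length):
    """Map words to ids, then truncate or zero-pad to a fixed length."""
    ids = [vocabulary.get(word, 0) for word in document.split()]
    ids = ids[:max_document_length]
    return ids + [0] * (max_document_length - len(ids))


logs = [
    "caught SIGTERM shutting down",
    "SIGHUP received Attempting to restart",
]
vocabulary = build_vocabulary(logs)
x_train = [to_ids(line, vocabulary, max_document_length=6) for line in logs]
print(x_train)
```

With these two lines the result is [[1, 2, 3, 4, 0, 0], [5, 6, 7, 8, 9, 0]]: one row of word ids per log line, all of the same length.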
  40. Embedding matrix W: vocab size (number of words) x embedding size.
        # One trainable row of embedding_size values per word id
        W = tf.Variable(tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0), name="W")
        # Replace each word id in input_x with its row of W
        embedded_chars = tf.nn.embedding_lookup(W, input_x)
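An embedding lookup is just row selection from a table. The sketch below shows this without TensorFlow, assuming a small table filled with uniform random values in [-1, 1]; the sizes, the seed, and the function name `embedding_lookup` are hypothetical choices for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

vocab_size = 10      # number of distinct word ids (hypothetical)
embedding_size = 4   # length of each word vector (hypothetical)

# One row of uniform random values in [-1, 1] per word id
W = [
    [random.uniform(-1.0, 1.0) for _ in range(embedding_size)]
    for _ in range(vocab_size)
]


def embedding_lookup(table, ids):
    """Replace every word id with its row of the embedding table."""
    return [table[i] for i in ids]


input_x = [1, 2, 3, 4, 0, 0]  # one zero-padded line of word ids
embedded_chars = embedding_lookup(W, input_x)
print(len(embedded_chars), "words x", len(embedded_chars[0]), "values")
```

A line of 6 word ids becomes a 6 x 4 grid of numbers, which is what lets a convolution treat the sentence like a small image.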
  41. Conv. layer:
        # One filter spans filter_size words and the full embedding width
        filter_shape = [filter_size, embedding_size, 1, num_filters]
        W = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[num_filters]), name="b")
        conv = tf.nn.conv2d(
            self.embedded_chars_expanded,
            W,
            strides=[1, 1, 1, 1],
            padding="VALID",
            name="conv")
        # Add the bias and apply the ReLU nonlinearity
        h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
  42. Figure: each word of the log line (snmpd[1139]:, Received, TERM, or, …, signal, shutting, down) is one row of values V1, V2, V3, …, Vp-2, Vp-1, Vp. The filter covers several consecutive word rows.
  43. The filter slides down the word rows.
  44-47. Further positions of the sliding filter (same figure).
  48. Finally, we get the convolved feature.
  49. Filter size: 4
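The sliding filter in these figures can be sketched as follows: a filter of size 4 covers four consecutive word vectors at full width, yields one number per position, and moves down one word at a time. The toy vectors, the filter weights, and the function name `conv_over_words` are hypothetical; only the filter size of 4 comes from the slides.

```python
def conv_over_words(embedded, text_filter, bias=0.0):
    """Slide a filter that spans whole word vectors down the sentence.

    embedded:    sequence_length rows of embedding_size values
    text_filter: filter_size rows of embedding_size weights
    Returns sequence_length - filter_size + 1 ReLU activations.
    """
    filter_size = len(text_filter)
    features = []
    for start in range(len(embedded) - filter_size + 1):
        window = embedded[start:start + filter_size]
        total = bias + sum(
            value * weight
            for word_vec, weight_vec in zip(window, text_filter)
            for value, weight in zip(word_vec, weight_vec)
        )
        features.append(max(0.0, total))  # ReLU
    return features


# Toy data (hypothetical): 7 words, embedding size 2, filter size 4
embedded = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1], [1, 1]]
text_filter = [[1, 0], [0, 1], [1, 0], [0, -1]]
convolved_feature = conv_over_words(embedded, text_filter)
print(convolved_feature)
```

Seven words and a filter of size 4 give 7 - 4 + 1 = 4 positions, so the convolved feature has four entries.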
  50. Pooling layer:
        # Max-pool over all sequence_length - filter_size + 1 filter positions
        pooled = tf.nn.max_pool(
            h,
            ksize=[1, sequence_length - filter_size + 1, 1, 1],
            strides=[1, 1, 1, 1],
            padding="VALID",
            name="pool")

        # Combine all the pooled features
        # (pooled_outputs is not defined on the slide; presumably the list of
        # `pooled` tensors, one per filter size)
        num_filters_total = num_filters * len(filter_sizes)
        self.h_pool = tf.concat(pooled_outputs, 3)  # TensorFlow 1.0+ order: (values, axis)
        self.h_pool_flat = tf.reshape(self.h_pool, [-1, num_filters_total])
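Because the pooling window covers all sequence_length - filter_size + 1 positions, each filter contributes exactly one value: its strongest activation anywhere in the line. A minimal sketch of that pooling and of the concatenation across filter sizes follows. The activation values and the `feature_maps` layout are hypothetical.

```python
def max_over_time(feature_map):
    """Keep only the strongest activation of one filter."""
    return max(feature_map)


# Hypothetical activations for a 7-word line: filter size -> one feature map per filter
feature_maps = {
    3: [[0.2, 0.9, 0.1, 0.0, 0.4], [0.0, 0.3, 0.3, 0.7, 0.1]],
    4: [[0.5, 0.1, 0.6, 0.2], [0.0, 0.0, 0.8, 0.1]],
}

# One pooled value per filter, concatenated across all filter sizes
h_pool_flat = [
    max_over_time(feature_map)
    for filter_size in sorted(feature_maps)
    for feature_map in feature_maps[filter_size]
]
print(h_pool_flat)
```

Two filters for each of two filter sizes yield a flat vector of 2 x 2 = 4 features, which corresponds to num_filters * len(filter_sizes) in the code on the slide.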
  51. Fully-connected layer:
        with tf.name_scope("output"):
            W = tf.get_variable(
                "W",
                shape=[num_filters_total, num_classes],
                initializer=tf.contrib.layers.xavier_initializer())
            b = tf.Variable(tf.constant(0.1, shape=[num_classes]), name="b")
            # Accumulate the L2 penalty of the output weights and bias
            l2_loss += tf.nn.l2_loss(W)
            l2_loss += tf.nn.l2_loss(b)
            # One score per class; the prediction is the highest-scoring class
            self.scores = tf.nn.xw_plus_b(self.h_drop, W, b, name="scores")
            self.predictions = tf.argmax(self.scores, 1, name="predictions")
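The output layer computes one score per class as x·W + b and picks the class with the highest score. The sketch below shows that arithmetic for a single feature vector, without TensorFlow. The weights, the feature values, and the class order in `labels` are hypothetical and not taken from the talk.

```python
def xw_plus_b(x, W, b):
    """Compute x·W + b for one feature vector x."""
    return [
        sum(x_i * W[i][j] for i, x_i in enumerate(x)) + b[j]
        for j in range(len(b))
    ]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


labels = ["normal", "fault"]        # class order is an assumption
features = [0.9, 0.7, 0.6, 0.8]     # hypothetical pooled features
W = [[0.5, -0.5], [-1.0, 1.0], [0.25, 0.75], [0.0, 1.0]]  # hypothetical weights
b = [0.1, 0.1]

scores = xw_plus_b(features, W, b)
prediction = labels[argmax(scores)]
print(scores, prediction)
```

With these made-up weights the second score is the larger one, so the line is classified as "fault".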
  52. Convolutional Neural Networks for Sentence Classification
  53. $ tensorboard --logdir ./runs/1496236844/summaries
  54. [Figure, no caption]
  55. End. References:
      (1) Yoon Kim, "Convolutional Neural Networks for Sentence Classification", 2014
      (2) "cs231n: Convolutional Neural Networks for Visual Recognition", Stanford
