Analytics forward 2019-03

Mar. 19, 2019
  1. Introduction to Autoencoders & IoT Analytics
     Scott N. Gerard, PhD
  2. Neural Network Model
     [Diagram: input layer, 2 hidden layers, output layer, with example neuron activations; network width and depth labeled]
     • Layers are fully connected
     • Each edge carries a weight
     • The final answer is the output neuron with the highest value
     • Each function/layer f_i is non-linear: x → f1(x) → f2(f1(x)) → f3(f2(f1(x)))
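A minimal NumPy sketch of this feed-forward computation; the layer sizes and random weights are made up for illustration:

```python
import numpy as np

def relu(x):
    # Non-linear activation applied at each layer
    return np.maximum(0.0, x)

# Hypothetical fully connected network: 4 inputs -> two hidden layers -> 3 outputs
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 5)), rng.normal(size=(5, 5)), rng.normal(size=(5, 3))]

def feed_forward(x, weights):
    # Each layer computes a non-linear function of the previous layer's output:
    # x -> f1(x) -> f2(f1(x)) -> f3(f2(f1(x)))
    a = x
    for W in weights:
        a = relu(a @ W)   # layers are fully connected; every edge carries a weight
    return a

x = np.array([0.6, 0.1, 0.9, 0.5])
output = feed_forward(x, weights)
print("answer:", int(np.argmax(output)))  # final answer = output neuron with highest value
```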
  3. Neural Network Training and Inference
     [Diagram: labeled input fed forward; error = label − prediction backpropagated through the layers]
     Training phase (train model):
     • Supervised learning: feed a labeled input forward, compare the prediction against the label (ground truth), then backpropagate the error
     • epoch = 1 forward+backward pass over all training data
     • mini-batch = 1 forward+backward pass over a fraction of the training data
     • # iterations per epoch = training size / mini-batch size
     Inference phase (use model):
     • Feed an unseen input forward through the trained NN model (weights) to produce a prediction
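A hedged tf.keras sketch of this training/inference split and the epoch/mini-batch arithmetic; the dataset, layer sizes, and optimizer are illustrative choices, not from the slides:

```python
import tensorflow as tf

# Illustrative labeled data: MNIST digits flattened to 784 features
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

batch_size = 256
print("iterations per epoch:", len(x_train) // batch_size)  # training size / mini-batch size

# Training phase: each step feeds one mini-batch forward, measures the error
# against the labels, and backpropagates it to update the weights
model.fit(x_train, y_train, epochs=5, batch_size=batch_size)

# Inference phase: feed unseen input forward through the trained weights
predictions = model.predict(x_test[:1])   # highest-valued output neuron is the answer
```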
  4. Bad Autoencoder
     [Diagram: identity(x) maps the input feature vector straight to the same output feature vector]
     • Unsupervised learning
     • Reconstruction loss = sum (output − input)²
     • A network that merely learns identity(x) gets zero reconstruction loss yet compresses nothing and learns nothing, which is why it is "bad"
  5. Autoencoder
     [Diagram: encoder/compressor f(x) squeezes the input feature vector through a "bottleneck" coding (latent space); decoder/generator "f⁻¹"(x) reconstructs the same input feature vector]
     • Unsupervised learning
     • Compresses the input
     • Learns important features
     • NLP's word2vec is a latent space
     • ½-hour sit-coms 😉
     • How much compression?
     • Auto-generate new sit-coms?
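A minimal NumPy sketch of the reconstruction loss shared by these two slides; the vectors are made up:

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    # Reconstruction loss = sum (output - input)^2
    return np.sum((x_hat - x) ** 2)

x = np.array([0.6, 0.1, 0.9, 0.5])
print(reconstruction_loss(x, x))  # identity(x): a perfect 0.0, but nothing was learned
print(reconstruction_loss(x, np.array([0.5, 0.2, 0.8, 0.5])))  # an imperfect reconstruction
```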
  6. Autoencoder Learns Handwritten Digits
     [Figure: MNIST dataset (sample) and autoencoder reconstructions]
     • 784 neurons in input layer (= 28×28 pixels)
     • 256 neurons in hidden layer
     • 128 neurons in latent space (middle layer)
     • 256 neurons in hidden layer
     • 10 neurons in output layer (1 for each digit)
     • 30,000 MNIST training images
     • Batch size = 256 images
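A tf.keras sketch of this architecture. Since a reconstruction autoencoder's output must match its 784-pixel input, the sketch assumes a symmetric 784-unit output layer rather than the 10-neuron layer listed above:

```python
import tensorflow as tf

(x_train, _), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_train = x_train[:30000]                                # 30,000 MNIST training images

encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),                        # 784 = 28x28 pixels
    tf.keras.layers.Dense(256, activation="relu"),       # hidden layer
    tf.keras.layers.Dense(128, activation="relu"),       # latent space (middle layer)
])
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(128,)),
    tf.keras.layers.Dense(256, activation="relu"),       # hidden layer
    tf.keras.layers.Dense(784, activation="sigmoid"),    # assumed: 784-unit reconstruction
])
autoencoder = tf.keras.Sequential([encoder, decoder])

# Unsupervised: the input doubles as its own target
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256)  # batch size = 256 images
```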
  7. Compressor / Dimensionality Reduction
     [Diagram: encoder/compressor only; input feature vector → "bottleneck" coding (latent space)]
     • Save the compressed version
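Continuing the slide-6 sketch, only the encoder half is needed to save the compressed version; the file name is hypothetical:

```python
import numpy as np

# 784 floats per image in, 128 floats out: roughly 6x smaller
codings = encoder.predict(x_train)
np.save("codings.npy", codings)   # persist only the compressed representation
```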
  8. Generate Faux Output
     [Diagram: decoder/generator only]
     • Random input → decoder → faux output
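Continuing the same sketch, the decoder half alone acts as a generator; drawing codings from a unit normal is an assumption about the latent distribution:

```python
import numpy as np

# Random point in the 128-dimensional latent space (assumed roughly unit scale)
random_coding = np.random.default_rng(1).normal(size=(1, 128)).astype("float32")
faux_image = decoder.predict(random_coding)   # faux 28x28 digit, flattened to 784 values
```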
  9. Features for Another Analytic
     [Diagram: encoder/compressor g(x); input feature vector → latent space (code), optionally joined with other features → another analytic]
     • Autoencoder features are input to another analytic
       • Classification analytic
       • Image analytic
       • Whatever
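A sketch of handing the codings to a downstream analytic; the scikit-learn classifier and the reuse of MNIST labels are illustrative choices, not from the slides:

```python
from sklearn.linear_model import LogisticRegression
import tensorflow as tf

(x_raw, y_raw), _ = tf.keras.datasets.mnist.load_data()
x = x_raw.reshape(-1, 784).astype("float32") / 255.0

# Codings from the trained encoder become the feature matrix for another analytic
features = encoder.predict(x[:30000])        # latent space, shape (30000, 128)
clf = LogisticRegression(max_iter=1000)      # e.g. a classification analytic
clf.fit(features, y_raw[:30000])
print("train accuracy:", clf.score(features, y_raw[:30000]))
```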
  10. Anomaly Detector
     [Diagram: encoder/compressor → "bottleneck" coding (latent space) → decoder/generator; input feature vector reconstructed as the same input feature vector]
     • If the reconstruction loss is too big, the input can't be represented by a coding ==> anomaly
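A sketch of that anomaly rule, continuing the slide-6 autoencoder; the 99th-percentile threshold is an assumed tuning choice:

```python
import numpy as np

def anomaly_scores(x, autoencoder):
    # Per-sample reconstruction loss: sum (output - input)^2
    x_hat = autoencoder.predict(x)
    return np.sum((x_hat - x) ** 2, axis=1)

scores = anomaly_scores(x_train, autoencoder)
threshold = np.percentile(scores, 99)   # assumption: flag the worst 1% of normal data
is_anomaly = scores > threshold         # too big ==> not representable by a coding
```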
  11. Autoencoder
     • An autoencoder has to
       • compress the input to codings, and
       • reconstruct the output given ONLY the codings
     • Small reconstruction loss ==> the input space was successfully compressed to just the codings
     • Expect: a smaller coding => a larger reconstruction loss
  12. IoT Analytics for Eldercare
  13. Impact & Business Opportunity of a Global Demographic Shift
     • US: estimated assets for this demographic of $8.4 to $11.6 trillion
     • China: the "silver hair" market is estimated to rise to $17 trillion by 2050, amounting to a third of the Chinese economy
     • Japan: estimated 65+ financial assets of $9.1 trillion
     • Rising eldercare costs will disrupt economies: social service costs for the elderly will account for 6% of US GDP and 4 to 8% of EU GDP
     [Figure: percentage of population 65 years and older, 2017, for Japan, Italy, Germany, Ireland, China, Australia, Brazil, US, India, Egypt]
     Sources:
     • http://www.icis.com/blogs/chemicals-and-the-economy/2015/03/worlds-demographic-dividend-turns-deficit-populations-age/
     • https://www.metlife.com/assets/cao/mmi/publications/studies/2010/mmi-inheritance-wealth-transfer-baby-boomers.pdf
     • http://blogs.ft.com/ftdata/2014/02/13/guest-post-adapting-to-the-aging-baby-boomers/
     • http://www.marketsandmarkets.com/Market-Reports/healthcare-data-analytics-market-905.html
     • http://www.bloomberg.com/bw/articles/2014-09-25/chinas-rapidly-aging-population-drives-652-billion-silver-hair-market
     • Asian Journal of Gerontology & Geriatrics, on centenarians: according to the National Institute of Population and Social Security Research, Japan had 67,000 centenarians in 2014, but that number is forecast to reach 110,000 in 2020, 253,000 in 2030, and peak at 703,000 in the year 2051.
  14. ADLs (Activities of Daily Living)
     • Activities we normally do; they determine the level of care needed:
       • Bathing and showering
       • Personal hygiene and grooming (including brushing/combing/styling hair)
       • Dressing
       • Toileting (getting to the toilet, cleaning oneself, and getting back up)
       • Eating (self-feeding, not including cooking or chewing and swallowing)
       • Functional mobility, often referred to as "transferring", as measured by the ability to walk, get in and out of bed, and get into and out of a chair; the broader definition (moving from one place to another while performing activities) is useful for people with different physical abilities who are still able to get around independently
     • We expect to see additional ADLs in our data: sleeping, watching TV, …
     https://en.wikipedia.org/wiki/Activities_of_daily_living
  15. Avamere: High-Density Sensor Deployment
     • Instrumenting 20 patient rooms in a skilled nursing facility and 5 independent living apartments
     • Over 1,000 sensors deployed
  16. Autoencoder
     [Diagram: encoder/compressor f(x) → "bottleneck" coding (latent space) → decoder/generator "f⁻¹"(x); input feature vector reconstructed as the same feature vector]
     Input:
     • 30 sensors
     • 1-minute windows
     • sensor fire counts
     • 3 adjacent time windows
     • 3 × 30 features
     Output:
     • 3 × 30 features (the reconstructed input)
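A sketch of assembling those 3 × 30 = 90-dimensional feature vectors from per-minute sensor fire counts; the counts array is simulated stand-in data:

```python
import numpy as np

# counts[t, s] = number of times sensor s fired during 1-minute window t
# Simulated stand-in: 96,000 windows x 30 sensors of sparse fire counts
counts = np.random.default_rng(2).poisson(0.1, size=(96000, 30))

def window_features(counts, t):
    # Concatenate 3 adjacent time windows -> 3 x 30 = 90 features
    return counts[t - 1:t + 2].reshape(-1)

features = np.stack([window_features(counts, t)
                     for t in range(1, len(counts) - 1)]).astype("float32")
print(features.shape)   # (95998, 90): one feature vector per interior window
```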
  17. Coding layer = 10 dimensions [figure]
  18. [figure-only slide]
  19. [figure-only slide]
  20. [figure-only slide]
  21. Questions
  22. Backup
  23. Conclusions
     • Tuning (a PySpark sketch follows this slide)
       • Time window: 1 minute is good (5 minutes was too long)
       • Alpha (# of concurrent ADLs)
         • Ideal: small alpha (0.1, 0.01, …)
         • But Spark ML LDA doesn't allow alpha < 1.0
       • Iterations: 100 is good (35 was too few)
       • Choose the # of ADLs up front: 6? 7? 10? …
     • No learned ADL looks like "dressing" or "grooming"
     • Found a non-standard "watch TV" ADL
     • Interpretation
       • Must manually characterize the sensor sets (ADLs)
       • How to transfer learning across apartments (different sensors)?
     • Encouraging results, but more work is needed
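A hedged PySpark sketch of the LDA setup these conclusions describe; the input path, column assembly, and k=7 are assumptions (docConcentration is Spark's alpha):

```python
from pyspark.sql import SparkSession
from pyspark.ml.clustering import LDA

spark = SparkSession.builder.getOrCreate()

# Assumed input: one "document" per 1-minute time window, with a sensor-fire
# count vector in a "features" column (e.g. built with CountVectorizer)
windows = spark.read.parquet("sensor_windows.parquet")   # hypothetical path

lda = LDA(
    k=7,                     # choose the # of ADLs up front: 6? 7? 10? ...
    maxIter=100,             # 100 iterations was good; 35 was too few
    docConcentration=[1.1],  # alpha: small values are ideal, but < 1.0 is rejected
)
model = lda.fit(windows)

# Each topic is a distribution over sensors: the learned "ADL" that must
# then be characterized by hand
model.describeTopics().show(truncate=False)
```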
  24. One Neuron in a Neural Network
     • A neuron (perceptron) computes a weighted sum of its inputs, then an activation function:
       $a_j = \sigma\left(\sum_k w_{kj}\, a_k\right)$
     • Activation function
       • Differentiable (nearly everywhere)
       • Sigmoid: $\sigma(x) = \frac{\exp(x)}{1+\exp(x)}$
       • Soft-max: $\operatorname{softmax}(x)_k = \frac{\exp(x_k)}{\sum_j \exp(x_j)}$
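The two activations above in NumPy, matching those formulas:

```python
import numpy as np

def sigmoid(x):
    # sigma(x) = exp(x) / (1 + exp(x)), written in the numerically stable form
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # softmax(x)_k = exp(x_k) / sum_j exp(x_j); shifting by max(x) avoids overflow
    e = np.exp(x - np.max(x))
    return e / np.sum(e)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())   # the outputs sum to 1, so they behave like probabilities
```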
  25. Activation Functions
     [Figure: activation function curves, with the saturation regions marked]
     • Linear activation => linear network
     • Non-linear activation => general function
     • Often little difference between activation functions
  26. Learning in Neural Networks
     • Backpropagation pushes errors from outputs (layer i) back to inputs (layer i−1):
       $\frac{\partial E_n}{\partial a_j} = \sigma'(a_j) \sum_k w_{kj}\, \frac{\partial E_n}{\partial a_k}$
     • $E_n$ is the error of the n-th sample
     • [Diagram: errors propagate backwards, from the layer-i error to the layer i−1 error]
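A NumPy sketch of one such backward step for a single fully connected layer; the toy sizes and the sigmoid choice are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_layer(pre_act, W, dE_da_next):
    # dE_n/da_j = sigma'(a_j) * sum_k w_kj * dE_n/da_k
    # i.e. push the layer-i error vector back to layer i-1
    sigma_prime = sigmoid(pre_act) * (1.0 - sigmoid(pre_act))   # sigmoid derivative
    return sigma_prime * (W @ dE_da_next)

rng = np.random.default_rng(3)
W = rng.normal(size=(4, 3))      # w[j, k] connects neuron j (layer i-1) to neuron k (layer i)
pre_act = rng.normal(size=4)     # weighted sums a_j at layer i-1
error_next = rng.normal(size=3)  # dE_n/da_k at layer i
print(backprop_layer(pre_act, W, error_next))   # dE_n/da_j at layer i-1
```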
  27. Elder's Apartment
     [Floor-plan figure: sensor placement, with a 10 ft ruler; sensor counts M&R = 17, A&C = 19, W = 5, B = 1, P = 3, HUB = 1; labels M1–M13, R1–R4, C1–C9, A1–A9, A99, W1–W5, P1–P3, B1]
  28. ADL/Sensor Distribution
     • Learn sensor => ADL
     • Unsupervised ML
     • Spark ML LDA
     (each row carries seven values but only six topic headers survived extraction; the seventh column is marked "?", and its position among the headers is not recoverable)

     SensorId             cooking transceiving?  toileting     bathing  TV watching    sleeping           ?
     I01BBB-b-nw----         0.16         0.13        0.18        0.18     21165.05        0.14        0.16
     I01BBB-b-smar2md      100.97     40366.36     4002.56        5.99         0.39        0.32        0.41
     I01BBB-b-smcl010        0.56        38.24     3051.03        0.85         0.71        0.29    55928.33
     I01BBB-b-smcl020        0.27         2.27    39292.91        0.34         0.27        0.36        0.58
     I01BBB-b-smclbed        0.19         0.23        0.57        0.27         0.38        0.15    24340.21
     I01BBB-c-scdoor2        0.08         0.09    15012.48        0.09         0.11        0.07        0.09
     I01BBB-dkscdoor-        0.19         0.15        0.23        0.21      4634.85        0.16        0.21
     I01BBB-dnsachar1        0.13     15921.06        0.16        0.15         0.23        0.13        0.14
     I01BBB-fyscdoor-    14182.51         0.08        0.09        0.08         0.10        0.07        0.08
     I01BBB-fysmar3md    13814.08         2.98        2.20    20673.86         0.31        0.28        0.28
     I01BBB-fysmclent    21057.01         0.84        0.68    15147.70         0.27        0.25        0.24
     I01BBB-ktnw-----        0.14         0.12        0.17        0.17     21199.14        0.13        0.14
     I01BBB-ktsccplat        0.11         0.10        0.13        0.12     11546.33        0.10        0.11
     I01BBB-ktscfrez-        0.25         0.16        0.22    15388.63         0.37        0.18        0.20
     I01BBB-ktscfrig-    49370.50         0.08        0.08        0.09         0.10        0.07        0.07
     I01BBB-ktscutenz        0.13         0.09        0.12     3670.31         0.15        0.09        0.10
     I01BBB-ktsmcl---     6637.83         0.20        0.23        0.35         0.15        0.11        0.13
     I01BBB-ktspmicrw        0.24         0.16        0.26        0.37         2.14        0.45        0.39
     I01BBB-ldsawashr        0.07         0.06        0.08        0.08         0.10        0.06    24336.54
     I01BBB-ldscdoor1        0.08         0.08    15140.45        0.10         0.12        0.08        0.09
     I01BBB-ldsmcl---    11246.61     29988.13     5645.84      511.05         0.49        0.37        0.51
     I01BBB-lrsachar1        0.06         0.06        0.08        0.06         0.07    37033.62        0.05
     I01BBB-lrsmar4md        0.32         0.32        1.29        0.29         0.34    21564.24        0.21
     I01BBB-lrsmcl000        1.54     40986.87     2191.49        1.04         0.32        0.42        0.32
     I01BBB-lrsmcl100        4.25        48.19    24648.51        1.04         0.62     5647.11        0.27
     I01BBB-lrsmcl200        0.41         0.62       68.51        0.37         0.50    40365.33        0.26
     I01BBB-lrsptv---        0.17         0.14        0.22        0.22         0.92        0.15        0.18
     I01BBB-rrnw-----        0.14         0.12        0.17        0.16     21185.14        0.13        0.14
     I01BBB-rrscdoor-        0.14         0.12        0.17        0.16         0.24    16324.03        0.14
     I01BBB-rrsmar1md        0.24         0.16        0.19    25992.93         0.21        0.12        0.15
     I01209-rrsmclshw        0.16         0.14        0.21        0.39     10417.79        0.13        0.18
     I01209-rrsmclsnk        0.88         3.46       12.06      240.63        60.90        0.60    11303.46
     I01209-rrsmcltoi        0.17         0.14        0.18    26898.94         0.29        0.13        0.16
     Grand Total        116420.59    127361.95   109073.76   108537.23     90219.09   120939.84   115914.54
  29. ADL by Time Window
     [Figure: per-window ADL assignments over 96K time windows; panels: cooking, transferring, toileting, bathing, TV watching, sleeping, max ADL]