SenSec: Mobile Application Security through Passive Sensing

Published in: Technology

  1. Jiang Zhu and Sean Wang, Dec 5th, 2011
  2. •  Monitor and track user behavior on smartphones using various on-device sensors
     •  Convert sensory traces and other context information into Personal Behavior Features
     •  Build Risk Analysis Trees from these features and use them to calculate Certainty Scores
     •  Trigger various Authentication Schemes when certain applications are launched
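
The last bullet, triggering an authentication scheme from a Certainty Score, could look roughly like the sketch below. The slides do not say how scores map to schemes, so the sensitivity table, the risk formula, the thresholds, and the names `AuthScheme` and `choose_auth_scheme` are all assumptions made for illustration.

```python
from enum import Enum


class AuthScheme(Enum):
    NONE = "no extra authentication"
    PIN = "PIN or password prompt"
    STRONG = "stronger multi-step authentication"


# Hypothetical per-application sensitivity in [0, 1]; not taken from the slides.
APP_SENSITIVITY = {"weather": 0.1, "email": 0.6, "banking": 0.9}


def choose_auth_scheme(app: str, certainty: float) -> AuthScheme:
    """Pick a scheme when `app` is launched.

    `certainty` is the current Certainty Score, assumed to lie in [0, 1],
    where 1 means the device is almost surely in the owner's hands.
    """
    sensitivity = APP_SENSITIVITY.get(app, 0.5)  # unknown apps get a middle value
    risk = sensitivity * (1.0 - certainty)       # assumed risk formula
    if risk < 0.1:                               # assumed thresholds
        return AuthScheme.NONE
    if risk < 0.4:
        return AuthScheme.PIN
    return AuthScheme.STRONG
```

With these assumed numbers, a low-sensitivity application opens without a prompt even at low certainty, while a sensitive one demands stronger authentication as soon as certainty drops.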
  5. •  “The 329 organizations polled had collectively lost more than 86,000 devices … with average cost of lost data at $49,246 per device, worth $2.1 billion or $6.4 million per organization.”
     "The Billion Dollar Lost-Laptop Study," conducted by Intel Corporation and the Ponemon Institute, analyzed the scope and circumstances of missing laptop PCs.
     [Chart: Mobile Device Loss or theft, percentage axis from 0% to 60%]
     Strategy One Survey conducted among a U.S. sample of 3017 adults age 18 years and older, September 21-28, 2010, with an oversample in the top 20 cities (based on population).
  6. •  Application: Different applications may have different sensitivities
     •  Password: A major source of security vulnerabilities. Easy to guess, reused, forgotten, shared
     •  Usability: Authentication too often or sometimes too loose
  8. Application Access Control
  9. •  MobiSens app collects sensor data
        •  Motion sensors
        •  GPS and WiFi scanning
        •  In-use applications and their traffic patterns
     •  SenSec module builds user behavior models
        •  Unsupervised activity segmentation, with the sequence modeled by a language model
        •  Build Risk Analysis Tree (DT) to detect anomalies
        •  Combine the above to estimate risk (online): certainty score
     •  SenSec broadcasts the certainty score to other applications
     •  Application Access Control Module uses a broadcast receiver
  10. •  A feature vector calculated from a step window represents the behavior state within a given time window
         •  surrounding environment: GPS location, WiFi signal
         •  activity: motions, applications in use
         •  communication: network traffic
      •  Using Decision Tree to detect anomalies in behavior
         •  Each node represents a feature dimension
         •  Leaves can be one of the following
            •  Owner Detection: owner [0,1], 0: Anomaly, 1: Normal
            •  User Identification: user id [0,1,…, N], user’s identification, i.e. IMEI
      •  Multiple trees can be built with subsets of the feature space
         •  Weighted average
         •  Voting
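
The two ways of combining several trees, weighted average and voting, can be sketched as follows. The three hand-written "trees", their feature names, and the weights are hypothetical stand-ins for trained decision trees; only the combination step reflects the slide.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

Features = Dict[str, float]
Tree = Callable[[Features], int]  # 1 = owner / normal, 0 = anomaly


# Hypothetical one-node trees, each built on a different feature subset.
def tree_location(f: Features) -> int:
    return 1 if f["gps_cluster"] == 3 else 0


def tree_wifi(f: Features) -> int:
    return 1 if f["top_rssi"] > -70 else 0


def tree_motion(f: Features) -> int:
    return 1 if f["accel_std"] < 2.5 else 0


def weighted_average(trees: Sequence[Tree], weights: Sequence[float], f: Features) -> float:
    """Soft score in [0, 1]: weight-normalized sum of the tree outputs."""
    return sum(w * t(f) for t, w in zip(trees, weights)) / sum(weights)


def majority_vote(trees: Sequence[Tree], f: Features) -> int:
    """Hard decision: the label predicted by the most trees."""
    return Counter(t(f) for t in trees).most_common(1)[0][0]
```

The weighted average yields a graded score that can feed a certainty estimate, while voting yields a single label; using an odd number of binary trees avoids ties in the vote.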
  11. •  Convert feature vector series to label streams (dimension reduction)
      •  Use n-grams to model the sequence of the label stream for each sensory dimension, capturing the current state and transitions
      •  Step window with assigned length
      [Figure: label streams per sensory dimension (labels such as A1, A2, A4, G2, G5, W1, W2, P1, P3, P6) grouped by step window]
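
A minimal sketch of this conversion, assuming each feature vector is labeled by its nearest cluster centroid (the centroids and label names here are made up) and that n-grams are then read off the resulting label stream:

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def nearest_label(vec: Vector, centroids: Dict[str, Vector]) -> str:
    """Return the label of the centroid closest to `vec` (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(vec, centroids[name]))


def to_label_stream(vectors: Sequence[Vector], centroids: Dict[str, Vector]) -> List[str]:
    """Replace every feature vector by a single discrete label."""
    return [nearest_label(v, centroids) for v in vectors]


def ngrams(labels: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All runs of n consecutive labels, in order of appearance."""
    return [tuple(labels[i:i + n]) for i in range(len(labels) - n + 1)]
```

Each n-gram holds the current label together with the n-1 labels before it, which is how the current state and the transition into it end up in one unit.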
  12. •  User behavior at time t depends only on the last n-1 behaviors
      •  A sequence of behaviors can be predicted from n consecutive locations in the past
      •  Maximum Likelihood Estimation from training data by counting
      •  MLE assigns zero probability to unseen n-grams
         •  Incorporate a smoothing function (Katz)
         •  Discount probability for observed grams
         •  Reserve probability for unseen grams
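
For bigrams (n = 2), estimation by counting is the standard ratio P(b_t | b_{t-1}) = count(b_{t-1}, b_t) / count(b_{t-1}). The sketch below shows that estimate and the zero-probability problem it causes. For brevity it uses add-k smoothing rather than the Katz smoothing named on the slide; add-k is a simpler stand-in that also moves probability mass from observed to unseen grams. The class name and the toy label stream are illustrative.

```python
from collections import Counter
from typing import Sequence


class BigramModel:
    """Bigram model over behavior labels; k=0 gives the plain MLE."""

    def __init__(self, labels: Sequence[str], k: float = 0.0) -> None:
        self.k = k
        self.vocab = sorted(set(labels))
        self.context = Counter(labels[:-1])            # count(b_{t-1})
        self.pairs = Counter(zip(labels, labels[1:]))  # count(b_{t-1}, b_t)

    def prob(self, prev: str, cur: str) -> float:
        num = self.pairs[(prev, cur)] + self.k
        den = self.context[prev] + self.k * len(self.vocab)
        return num / den if den else 0.0


train = ["A1", "A2", "A1", "A2", "A1", "A4"]
mle = BigramModel(train)              # unseen bigrams get probability 0
smoothed = BigramModel(train, k=1.0)  # unseen bigrams get a small share
```

Here `mle.prob("A1", "A1")` is zero because that transition never occurs in `train`, whereas the smoothed model assigns it a small positive value and the probabilities for each context still sum to one.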
  13. •  Feed the sequence of past behaviors in a stepping window of size N to the n-gram model for testing
      •  For a testing sequence of behavior labels, estimate the average log probability that this sequence is generated from the n-gram model
      •  If this likelihood drops below a threshold, flag an anomaly alert
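
The test procedure can be sketched as below with a bigram model. The add-k smoothing, the extra slot reserved for labels never seen in training, the window size, and the threshold are assumptions for illustration, not values from the slides.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence

Prob = Callable[[str, str], float]


def train_bigram(labels: Sequence[str], k: float = 1.0) -> Prob:
    """Return P(cur | prev) with add-k smoothing (k must be > 0)."""
    vocab_size = len(set(labels)) + 1  # +1 reserves mass for unknown labels
    context = Counter(labels[:-1])
    pairs = Counter(zip(labels, labels[1:]))

    def prob(prev: str, cur: str) -> float:
        return (pairs[(prev, cur)] + k) / (context[prev] + k * vocab_size)

    return prob


def avg_log_prob(prob: Prob, window: Sequence[str]) -> float:
    """Average log probability of the transitions inside one window."""
    steps = list(zip(window, window[1:]))
    return sum(math.log(prob(p, c)) for p, c in steps) / len(steps)


def detect(prob: Prob, labels: Sequence[str], window_size: int, threshold: float) -> List[bool]:
    """Slide a window over the stream; True marks an anomaly alert."""
    return [
        avg_log_prob(prob, labels[i:i + window_size]) < threshold
        for i in range(len(labels) - window_size + 1)
    ]
```

Averaging over the window keeps the score comparable across window sizes, so a single threshold can be tuned on held-out owner data.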
  15. [Diagram: MobiSens Sensing; Preprocessing (Behavior Text Generation, Extract Trace Features); Anomaly Detection (N-gram Model, Decision Trees); Fusion; Threshold; Anomaly Y/N]
  17. Dataset
      Number of users: 50
      Device: Android phones
      Location: Bay area
      Average period: 30 days
      Number of data types: 7
      Finest sampling interval (motion sensors): 200 ms

      •  Total data set size: 4GB
      •  Remove 2 heavy users
      •  Remove users with very limited data duration
      •  Remove users that don’t have application and traffic data due to older MobiSens version
      •  25 users with comparable dataset size
      •  Data duration: 4 hours ~ 2.5 days
  18. •  Motion Sensors (100)
         •  Used to summarize acceleration stream
         •  Calculated separately for each dimension [x,y,z,m]
      •  GPS: location label via density based clustering (1)
      •  WiFi: (SSIDs, RSSIs) pairs ranked by signal strength (6)
      •  Applications: Bitmap of well-known applications (60 + 1)
      •  Application Traffic Pattern: Tx/Rx traffic vectors (120 + 2)
      •  Step Window Size: 5 seconds
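
As a rough illustration of per-window feature extraction, the sketch below summarizes one step window of accelerometer samples separately for x, y, z, and m, and ranks WiFi scan results by signal strength. It assumes m is the magnitude of the acceleration vector, that the summary consists of mean, standard deviation, minimum, and maximum, and that the three strongest access points are kept; the slide does not list the actual statistics or counts.

```python
import math
import statistics
from typing import Dict, List, Sequence, Tuple


def motion_features(samples: Sequence[Tuple[float, float, float]]) -> Dict[str, float]:
    """Summarize the (x, y, z) readings of one step window.

    At a 200 ms sampling interval, a 5 second window holds about 25 samples.
    """
    xs, ys, zs = zip(*samples)
    ms = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]  # assumed meaning of m
    feats: Dict[str, float] = {}
    for name, vals in zip("xyzm", (xs, ys, zs, ms)):
        feats[f"{name}_mean"] = statistics.fmean(vals)
        feats[f"{name}_std"] = statistics.pstdev(vals)
        feats[f"{name}_min"] = min(vals)
        feats[f"{name}_max"] = max(vals)
    return feats


def top_wifi(scan: Sequence[Tuple[str, int]], k: int = 3) -> List[Tuple[str, int]]:
    """Keep the k (SSID, RSSI) pairs with the strongest signal (RSSI in dBm)."""
    return sorted(scan, key=lambda pair: pair[1], reverse=True)[:k]
```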
  19. •  User Identification Test and Owner Detection Test for randomly selected partial data set (4 users) with 1:1 training/test split
         •  ~99% accuracy
         •  number of leaves: 56, size of tree: 111
      •  Using non-motion attributes yields lower accuracy (96%)
         •  Significant tree size reduction, number of leaves: 3, size of tree: 5
         •  Cross entropy may be significant to easily distinguish users using some features
      •  Using only motion attributes can distinguish different users
         •  ~98% accuracy
         •  very large tree, number of leaves: 267, size of tree: 533
         •  may cause performance issues on mobile platform
  20. •  Apply cross-entropy filter to remove users that could be identified easily using a small set of features
      •  12 users with 210k data instances
      •  User identification: train the RAT model on 66% of the instances and use the rest for testing
      [Chart: Accuracy and Size Factor for the All, Non-Motion, and Motion-Only feature sets; accuracy values shown: 84.8%, 83.5, 79.3; size values shown: 7649, 221, 35]
  22. •  Experiments detect anomalous usage with ~80% accuracy using only days of training data
  23. •  Extended data set for feature construction
         •  TCP, UDP traffic; sound; ambient lighting; battery status, etc.
      •  Data and Modeling
         •  Gain more insights into the data, features and factorized relationships among various sensors
         •  Try other classification methods and compare results: LR, SVM, Random Forest, etc.
      •  Enhanced security of SenSec components
         •  Integration with Android security framework and other applications
      •  Privacy challenges
         •  Data collection, model training, privacy policy, etc.
      •  Energy efficiency
  25. Thank you.
  28. •  Data Collection
         •  Running app list
         •  Per-app traffic pattern
      •  IPC Interface
         •  Certainty Score broadcast mechanism
      •  Offline-Model Push via Data Exchange API
         •  Risk Analysis Tree can be trained using global data on the MobiSens Server and pushed back to the mobile device
      [Architecture diagram with panels (a) MobiSens Mobile Application, (b) Tier 1, (c) Tier 2. Labels include: MobiSens Mobile Sensing, MobiSens Backend, Lifelogger, Device Controller, Activity Segmentation, Activity Summarization, Behavior Modeling, Model Storage, Data Aggregator, Data Preprocessor, Data Exchange API, Web Service, Widgets]
  29. •  MobiSens Server
         •  Offline Clustering
            •  K-means package from Weka Data Mining Toolkit
            •  Using aggregated data from all users
         •  Offline RAT training
            •  Decision Tree package from Weka Data Mining Toolkit
            •  Construct training data set and design evaluation strategy
      •  MobiSens Client
         •  Retrieve RAT model from MobiSens Server
         •  On-device n-gram label sequence construction (n=1,2,3; window size = 5s)
         •  RAT inference using Weka Toolkit on device
         •  Status bar notification based on certainty value
  30. •  Reactive API to Team Access
         •  API call from Team Access to SenSec to retrieve the current Certainty Score given the context
         •  getCertaintyScore(SenSecContextType ctx, count)
      •  Proactive API to Team Access and other equivalent modules
         •  Broadcast Receiver on Certainty Score
         •  certaintyScore{ CertaintyScoreType scores[]; WindowSizeType window_size; SenSecContextType ctx; }
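
The two API styles can be mimicked in a few lines. This is an in-process Python sketch, not the Android implementation: the class `SenSec`, the methods `register_receiver` and `publish`, and the use of plain strings for contexts are hypothetical. Only the record fields (`scores`, `window_size`, `ctx`) and the `getCertaintyScore(ctx, count)` call shape come from the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CertaintyScore:
    scores: List[float]  # most recent score last
    window_size: float   # step window length in seconds
    ctx: str             # context the scores apply to


class SenSec:
    def __init__(self, window_size: float = 5.0) -> None:
        self.window_size = window_size
        self._history: Dict[str, List[float]] = {}
        self._receivers: List[Callable[[CertaintyScore], None]] = []

    def register_receiver(self, callback: Callable[[CertaintyScore], None]) -> None:
        """Proactive style: the callback runs on every newly published score."""
        self._receivers.append(callback)

    def publish(self, ctx: str, score: float) -> None:
        """Record a new score and broadcast it to all registered receivers."""
        self._history.setdefault(ctx, []).append(score)
        message = CertaintyScore([score], self.window_size, ctx)
        for receiver in self._receivers:
            receiver(message)

    def get_certainty_score(self, ctx: str, count: int) -> CertaintyScore:
        """Reactive style: return the last `count` scores for `ctx`."""
        if count < 1:
            raise ValueError("count must be at least 1")
        return CertaintyScore(self._history.get(ctx, [])[-count:], self.window_size, ctx)
```

An access-control module could register a receiver to lock an application when a low score arrives, and fall back to `get_certainty_score` when it needs the recent history at launch time.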