A Software Design and Algorithms for Multicore Capture in Data Center Forensics

With rapid dissemination of cloud computing, data centers are quickly turning into platforms that host highly heterogeneous collections of services. Traditional approach to security and performance management finds it difficult to cope in such environments. Specifically, it is becoming increasingly difficult to capture and process all the necessary information at data centers in real time, where packet capture at data center gateways can serve as a practical example. This paper proposes a generic design for capturing and processing information on multicore architectures. The two main parts of the proposal are (1) the optimization formulation for distributing tasks across cores and (2) practical design and implementation of a shared memory which can be used for communication between processes in a non-traditional way that does not require memory locking or message passing.

Published in: Technology

  1. On the Way IN: DC Forensics
     M.Zhanikeev -- maratishe@gmail.com -- Design and Algorithms for Multicore Capture in Data Center Forensics -- http://bit.do/marat140603
  2. Forensics Basics
     (Traditional) forensics stages are collection, examination, analysis, and reporting.
     • many challenges in data centers
     • collection: real-time collection is extremely difficult
     • examination: you can't examine what you can't collect; flexibility is also important
     • analysis: a deeper form of examination, with the same problems
     • reporting: this part is actually easy, but DCs have no standards
       ◦ one standard is offered later in this presentation
  3. Forensics: All is Traffic
     Statement: all information in data centers can be reduced to traffic form.
     • logs are information carried in packets
     • logging, storage, etc. are distributed -- they have to communicate using traffic
     • a corollary: if something is not traffic, it might be useful to convert it into traffic
  4. Practical DC Forensics
     • we want Deep Packet Inspection (DPI) back on the table
     • we want to capture everything rather than sample
     • we want to vary the attention spent on different classes of traffic
       ◦ called context-based sampling
       ◦ probability of capture/inspection depends on the current context
     • note: all of these have gradually been dropped from practice as infeasible
  5. Conventional Multicore
  6. Generic Multicore Design
  7. Generic Multicore Capture
     • 2 roles: manager and core
     • traditional parallel processing: message passing or shared memory [05] [06]
     [05] M.Aldinucci+2 "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport (2009)
     [06] R.Brightwell "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems (2008)
  8. Conventional Shortcomings
     Reality is that traditional parallel processing designs are extremely inefficient on multicore.
     • overhead from parallelization is too high
     • unit of processing is too small
     • streamline designs are rare but have recently been discussed in Big Data [08]
     The solution is to use a lockfree (message-less) parallelization design.
     [08] R.Chen+2 "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT (2010)
  9. Conventional → Proposed
     • spawn, but don't wait to merge
     • collect results from cores continuously to avoid lumps
     • get used to not being able to communicate with cores (no messages)
       ◦ relatively short tasks diminish this effect [02]
     [02] myself+0 "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS (2013)
  10. Proposal: the New Multicore
  11. Proposal: Mission Statement
      Proposal components:
      • lockfree design
      • tasks-into-cores packing problem and optimization
      • implementation that supports lockfree design
      • remember: the easiest way to aggregate traffic is to use IP address prefixes
      • again, the design is generic, so we do not care about the contents
  12. Proposal: Shared Memory
      • communication happens over shared memory [04]
      • C/C++ implementation is common, but the design will work in other languages as well
      • shared memory is persistent, but cores come and go
      [04] M.Kerrisk "The Linux Programming Interface" No Starch Press (2010)
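
The shared-memory slide can be illustrated with a minimal Python sketch. It is not the layout used by the MCoreMemory project; the slot format and the names `create_region`, `core_report`, and `manager_collect` are assumptions made for illustration. The point it shows is that the region outlives any single core process, and that each core writes only to its own slot, so no lock or message is needed for reporting.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical per-core slot: (packets seen, bytes seen), two unsigned 64-bit integers.
SLOT = struct.Struct("<QQ")


def create_region(n_cores):
    """Manager side: create a region with one slot per core and zero it."""
    size = SLOT.size * n_cores
    shm = shared_memory.SharedMemory(create=True, size=size)
    shm.buf[:size] = bytes(size)
    return shm


def core_report(name, core_id, packets, nbytes):
    """Core side: attach by name, write only this core's slot, then detach."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        SLOT.pack_into(shm.buf, core_id * SLOT.size, packets, nbytes)
    finally:
        shm.close()  # the core goes away; the region itself persists


def manager_collect(shm, n_cores):
    """Manager side: read every slot without asking the cores anything."""
    return [SLOT.unpack_from(shm.buf, i * SLOT.size) for i in range(n_cores)]
```

In a real deployment `core_report` would run in a separate process per core; the manager would call `close()` and `unlink()` on the region only when the whole capture session ends.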
  13. Proposal: DLL is Key
      DLL stands for Doubly Linked List.
      • common in C/C++ designs
      • extremely flexible -- you can swap elements by reassigning pointers
      • sideways DLL is a method to avoid collisions in hashing
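
One possible reading of the "sideways DLL" bullet is a hash table in which entries that collide in a bucket hang off that bucket as a doubly linked chain. The Python sketch below assumes that reading; the class names `Node` and `ChainedTable` are hypothetical and the slides do not confirm that this is the author's structure. Removal shows the flexibility mentioned above: an element is unlinked purely by reassigning its neighbours' pointers.

```python
class Node:
    """One entry in a bucket's doubly linked chain."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key, value):
        self.key, self.value = key, value
        self.prev = self.next = None


class ChainedTable:
    """Hash table whose colliding entries form a doubly linked list per bucket."""

    def __init__(self, n_buckets=8):
        self.heads = [None] * n_buckets

    def _bucket(self, key):
        return hash(key) % len(self.heads)

    def find(self, key):
        node = self.heads[self._bucket(key)]
        while node is not None and node.key != key:
            node = node.next
        return node

    def put(self, key, value):
        node = self.find(key)
        if node is not None:          # existing key: update in place
            node.value = value
            return node
        b = self._bucket(key)
        node = Node(key, value)
        node.next = self.heads[b]     # link the new node in front of the chain
        if node.next is not None:
            node.next.prev = node
        self.heads[b] = node
        return node

    def remove(self, key):
        node = self.find(key)
        if node is None:
            return False
        if node.prev is not None:     # bypass the node from the left
            node.prev.next = node.next
        else:
            self.heads[self._bucket(key)] = node.next
        if node.next is not None:     # bypass the node from the right
            node.next.prev = node.prev
        return True
```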
  14. Optimization
  15. Optimization Targets
      • few cores, many data units
      • need to pack the latter into the former
      • moreover: a scheduling problem, which is packing but along the timeline
      • moreover (2): when packing, do you randomize the input (hashing) or not
  16. Prefix Packing Problem
      minimize $w_1\,\mathrm{count}(P) + w_2\,\max(M) + w_3\,\mathrm{var}(C)$
      subject to $k_1 < p_i < k_2 \quad \forall p_i \in P$.
      [Figure: a 32-bit hashkey with an effective range between k1 (shortest) and k2 (longest); prefixes p1 to p8 packed into Core0, Core1, Core2, ...; labels m (max) and n]
      • prefix length is between k1 and k2
        ◦ hashkey or raw
        ◦ fixed in each run in this paper
      • pi is a pack (group) of items
      • n total items, mapped to set M of prefixes in each of m cores
      • C is a set of item counts c across prefixes
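
To make the objective above concrete, here is a small Python sketch that evaluates it for a given packing. The interpretation of the terms is an assumption based on the bullet list: count(P) is taken as the number of prefixes in use, max(M) as the largest number of prefixes on any one core, and var(C) as the population variance of item counts across prefixes. The function names and the default weights are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def pack_items(keys, prefix_len):
    """Group 32-bit keys (raw addresses or hashkeys) by their leading prefix_len bits."""
    return Counter(key >> (32 - prefix_len) for key in keys)


def packing_cost(cores, counts, w1=1.0, w2=1.0, w3=1.0):
    """Evaluate w1*count(P) + w2*max(M) + w3*var(C) under the assumptions above.

    cores  -- list of cores, each a list of the prefixes assigned to it
    counts -- mapping prefix -> number of items (packets) in that prefix
    """
    prefixes = [p for core in cores for p in core]
    count_p = len(prefixes)                              # count(P)
    max_m = max(len(core) for core in cores)             # max(M)
    var_c = pvariance([counts[p] for p in prefixes])     # var(C)
    return w1 * count_p + w2 * max_m + w3 * var_c
```

For example, six addresses grouped by 16-bit prefixes into three packs of sizes 2, 1, and 3, placed on two cores as `[[0x0A00, 0x0A01], [0xC0A8]]`, give count(P) = 3, max(M) = 2, and var(C) = 2/3.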
  17. Prefix Packing GA Heuristic
      • Genetic Algorithm (GA) [12]
      • a chromosome is a tuple of prefixes packed into one core:
        $g_i = \langle p_{i,1}, p_{i,2}, \ldots, p_{i,m} \rangle$  (1)
      • one gene (whole solution) is a tuple containing all chromosomes:
        $G_j = \langle g_1, g_2, \ldots, g_n \rangle$  (2)
      [12] D.Knysh+1 "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS (2010)
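
The encoding in (1) and (2) can be sketched in Python as a tuple of per-core prefix tuples. The search loop below is a deliberately simplified, mutation-only evolutionary search with elitist selection, not the author's GA (the slides do not describe its operators). The fitness function, the maximum item load on any core, is also an assumption, chosen because it matches the quantity plotted in the analysis slides. All names are hypothetical.

```python
import random


def max_core_load(solution, counts):
    """Fitness to minimize: the largest number of items assigned to any core."""
    return max(sum(counts[p] for p in core) for core in solution)


def mutate(solution, rng):
    """Move one randomly chosen prefix from one core to another (possibly the same)."""
    cores = [list(core) for core in solution]
    donors = [i for i, core in enumerate(cores) if core]
    src = rng.choice(donors)
    dst = rng.randrange(len(cores))
    prefix = cores[src].pop(rng.randrange(len(cores[src])))
    cores[dst].append(prefix)
    return tuple(tuple(core) for core in cores)


def evolve(counts, n_cores, generations=200, pop_size=20, seed=0):
    """Return the best packing found; counts maps prefix -> item count."""
    rng = random.Random(seed)
    prefixes = sorted(counts)

    def random_solution():
        cores = [[] for _ in range(n_cores)]
        for p in prefixes:
            cores[rng.randrange(n_cores)].append(p)
        return tuple(tuple(core) for core in cores)

    population = [random_solution() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: max_core_load(s, counts))
        survivors = population[: pop_size // 2]          # elitism: keep the best half
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return min(population, key=lambda s: max_core_load(s, counts))
```

Because the best half of each generation is carried over unchanged, the best fitness never gets worse from one generation to the next. A fuller GA would add crossover between solutions.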
  18. Analysis
  19. Analysis Setup
      • actual packet traces -- trace-based simulation [16]
      • input: 2 cases -- hashing versus raw
      • items are individual packets
        ◦ packets are packed into prefixes
        ◦ prefixes are packed into cores
      • optimization uses the GA heuristic above
      [16] myself "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi (2014)
  20. Analysis (1) Cores
      [Figure: log(max item count / core), ranging from 4.6 to 5.5, versus time sequence (0 to 9); one curve each for 1 to 7 cores]
  21. Analysis (2) Hashing
      [Figure: number of unique prefixes (0 to 240) versus increasing cutoff parameter (0 to 1); curves for hashed and raw input]
  22. Forensics 2.0
  23. Forensics 2.0
      • reporting part: let's use sketches from data streaming [11]
      [Figure: manager and cores (Core 1 ... Core X) over a Big Data timeline; labels: TABID, Now (replay), cursor, time direction, read/prepare shared memory, one sketch per Start-End interval]
      [11] M.Sung+3 "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE (2006)
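
The slide does not say which sketch is meant, so the following Python example swaps in a Count-Min sketch, a standard data-streaming summary, purely as an illustration. It shows the property that makes sketches attractive for per-core reporting: each core can fill its own fixed-size sketch independently, and a manager can merge them by element-wise addition. The class name, the default width and depth, and the use of prefix strings as keys are assumptions.

```python
import hashlib


class CountMin:
    """Count-Min sketch: fixed-size counters that never underestimate a count."""

    def __init__(self, width=256, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One independent hash per row, obtained by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(key, digest_size=8,
                                 salt=row.to_bytes(2, "little")).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key):
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))

    def merge(self, other):
        """Combine two sketches of the same shape, e.g. from two cores."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketch shapes differ")
        merged = CountMin(self.width, self.depth)
        merged.rows = [[a + b for a, b in zip(ra, rb)]
                       for ra, rb in zip(self.rows, other.rows)]
        return merged
```

An estimate from the merged sketch is at least the true total across cores; hash collisions can only push it higher.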
  24. Wrapup
      • a natively multicore technology is proposed
      • performance is optimized using a packing heuristic
      • raw input is found to be preferable to randomization
      • future topics:
        1. variable-length prefixes
        2. optimization along the timeline
        3. jitter minimization (fewer reassignments)
        4. further lookup optimization -- fast hashing
  25. That’s all, thank you ...
  26. References (slides 26-29)
      [01] myself+0 "...community-based architecture for measuring E2E QoS at DCc" IJCSE (2013)
      [02] myself+0 "Experiments with Practical On-Demand Multi-Core Packet Capture" APNOMS (2013)
      [03] myself+1 "A Graphical Method for Detection of Flash Crowds in Traffic" Telecom. Systems (TM) (2013)
      [04] M.Kerrisk "The Linux Programming Interface" No Starch Press (2010)
      [05] M.Aldinucci+2 "FastFlow: Efficient Parallel Streaming Applications on Multi-core" U.Pisa Techreport (2009)
      [06] R.Brightwell "Workshop on Managed Many-Core Systems" 1st Managed Many-Core Systems (2008)
      [07] X.Sui+3 "Parallel Graph Partitioning on Multicore Architectures" 23rd LCPC (2010)
      [08] R.Chen+2 "Tiled-MapReduce: Optimizing Resource Usages ... on Multicore with Tiling" 19th PACT (2010)
      [09] I.Machdi+2 "Executing parallel TwigStack algorithm on a multi-core system" 11th IIWAS (2009)
      [10] S.Stoichev+1 "Parallel Algorithm for Integer Sorting with Multi-Core Processors" IT and Control (2009)
      [11] M.Sung+3 "Scalable and Efficient Data Streaming Algorithms for Detecting Common Content..." ICDE (2006)
      [12] D.Knysh+1 "Parallel Genetic Algorithms: a Survey and Problem State of the Art" IJCSS (2010)
      [13] Luca Deri "Modern Packet Capture and Analysis: Multi-Core, Multi-Gigabit, and Beyond" IM (2009)
      [14] myself "MCoreMemory project page" https://github.com/maratishe/mcorememory (2014)
      [15] myself "Rings-on-Cores project page" https://github.com/maratishe/ringsNcores (2013)
      [16] myself "MAWI Working Group Traffic Archive" http://mawi.wide.ad.jp/mawi (2014)
  30. Extras (1) Per-Unit Cost
      [Figure: increasing per-unit cost across Stage 1, Stage 2, and Stage 3; labels: Hashing, Manager, Prefix Matching, Cores that do not match, Process]
  31. Extras (2) Shared Memory Trick