CCNxCon2012: Session 5: Denial of Service Attacks Evaluation
1. (D)DoS in CCN: Evaluation & Countermeasures
September 12, 2012
Ongoing collaborative work between PARC, UCLA and UCI
Presenter: Ersin Uzun
2. What is a DoS attack?
• Goal: Prevent legitimate resource usage
– i.e., attack on availability
• Resources, e.g.,
– Memory, CPU, Bandwidth, Storage etc.
• Distributed DoS (DDoS) attacks are common on today’s Internet
PARC | 2
3. (D)DoS in CCN: IP vs. CCN
• CCN is fundamentally different from IP
– Not push based
• Data transmission must be preceded by a request for that data.
• Most DoS attacks in IP are possible because unsolicited data can be sent anywhere
– Reliable Forwarding Plane
• Interest and data follow the same path (i.e., immediate feedback to routers)
• Easy to secure routing (remains challenging in BGP)
• Better resiliency with multi-path routing
– Most current DoS attacks on IP are not applicable to CCN.
• What about new DoS attacks, specific to CCN?
4. (D)DoS in CCN: Two Major Threats
• Content Poisoning:
– Adversary introduces junk or fraudulent content
• Pollutes router caches and consumes bandwidth
• Invalid signatures or valid signatures by invalid producers
– Not easy to implement: cannot unilaterally push content
• There will likely be trust mechanisms to register namespaces, etc.
• Interest flooding:
– Adversary injects a large number of spurious interests
• Nonsensical, distinct Interests: not collapsible by routers
• Consume PIT state in intervening routers as well as bandwidth
• Legitimate CCN traffic suffers!
– Easy to implement
– Current CCNx has no countermeasures implemented
This talk focuses on Interest flooding.
5. Interest Flooding Attacks
• Why can Interest packets be used for DoS?
– Interests are unsolicited
– Each non-collapsible interest consumes state (distinct PIT entry) in intervening routers
– Interests requesting distinct data cannot be collapsed
– Interests (usually) routed towards data producer(s)
• Can such attacks be prevented?
– Short Answer: Yes
– Unlike IP routers, CCN routers maintain rich state information that can be used to detect and react to interest flooding
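A toy sketch of why distinct Interests are costly: a pending-interest table (PIT) keyed by content name absorbs repeated requests for the same name in one entry, but needs a new entry for every distinct name. The `ToyPit` class and the content names are hypothetical and are not taken from the CCNx code base.

```python
from collections import defaultdict


class ToyPit:
    """Toy pending-interest table: one entry per distinct content name."""

    def __init__(self):
        # name -> set of incoming faces waiting for that content
        self.entries = defaultdict(set)

    def on_interest(self, name, in_face):
        """Record an Interest; return True if it must be forwarded upstream."""
        is_new = name not in self.entries
        self.entries[name].add(in_face)
        return is_new  # a collapsed Interest is not forwarded again

    def on_data(self, name):
        """Data consumes the entry and yields the faces to deliver it to."""
        return self.entries.pop(name, set())


pit = ToyPit()

# 100 Interests for the same name collapse into a single PIT entry.
for face in range(100):
    pit.on_interest("/parc/video/seg1", face)
legit_entries = len(pit.entries)

# 100 spurious Interests with distinct names each occupy their own entry.
for i in range(100):
    pit.on_interest(f"/parc/video/junk-{i}", 0)
total_entries = len(pit.entries)

print(legit_entries, total_entries)
```

In this toy model the legitimate burst costs one entry, while the attacker's burst costs one entry per Interest, which is the state exhaustion the slide describes.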
7. Exploring the solution space
• Small simulation-based experiments
– ndnSIM modular NDN simulator
• http://ndnsim.net
– different scale topologies
• binary trees (3, 31, 128 nodes)
• 10Mbps links
• propagation delays randomized from range 1-10ms
• No caching (worst case scenario)
– simple attacker model
• Sends targeted interests (common prefix) for non-existing content
• up to 50% attacker population
– various mitigation techniques
• Emulation-based verification
• Large scale simulations for promising mitigation techniques
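The small topologies above could be described along these lines before being handed to a simulator. This is a plain-Python sketch, not ndnSIM code: the function name `binary_tree_links`, the dictionary layout, and the choice of uniformly distributed integer delays are assumptions made for illustration.

```python
import random


def binary_tree_links(num_nodes, seed=0):
    """Return the links of a binary tree whose nodes are numbered 0..num_nodes-1.

    Node 0 is the root; the parent of node i is (i - 1) // 2.
    """
    rng = random.Random(seed)
    links = []
    for child in range(1, num_nodes):
        links.append({
            "parent": (child - 1) // 2,
            "child": child,
            "bandwidth_bps": 10_000_000,     # 10 Mbps links, as on the slide
            "delay_ms": rng.randint(1, 10),  # assumed: uniform integer in 1-10 ms
        })
    return links


links = binary_tree_links(31)
print(len(links), links[0])
```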
9. Respecting physical (bandwidth) limits
• Current CCNx code does not limit the PIT size or the number of pending Interests for any interface
– Downstream nodes can send more Interests than it is physically possible to satisfy.
• CCN has balanced flow between Interests & Data
– Number of Interests defines upper limit on Data packets
• The number of pending Interests to fully utilize a link with data packets is:
$$\text{Interest limit} = \text{delay (s)} \times \frac{\text{bandwidth (Bytes/s)}}{\text{avg. data packet size (Bytes)}}$$
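A worked example of the limit, reading the formula as delay multiplied by bandwidth and divided by the average data packet size (the dimensionally consistent reading, which yields a packet count). The 20 ms delay and the 1000-byte packet size are assumed values chosen for illustration; only the 10 Mbps link speed comes from the simulation setup.

```python
def interest_limit(delay_s, bandwidth_bytes_per_s, avg_data_size_bytes):
    """Pending Interests needed to keep a link full of returning Data packets."""
    return delay_s * bandwidth_bytes_per_s / avg_data_size_bytes


# 10 Mbps link expressed in Bytes/s.
bandwidth = 10_000_000 / 8

# Assumed values: 20 ms until an Interest is satisfied, 1000-byte Data packets.
limit = interest_limit(delay_s=0.020, bandwidth_bytes_per_s=bandwidth,
                       avg_data_size_bytes=1000)
print(f"pending Interest limit: about {limit:.0f}")
```

Any pending Interest beyond this number asks for more Data than the link can carry back within the delay window.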
10. That limit alone is not sufficient
• In small topologies, the limit prevents attackers from injecting an excessive number of Interests.
• As expected, it does not work in big topologies
– No differentiation between good and bad traffic.
11. Utilizing the state information in routers
• Theoretically, CCN routers have all the information needed to differentiate good interests from bad ones.
– To be effective in DoS, bad interests must be insuppressible and must request non-existing content.
– On the other hand, good interests will likely be satisfied with content
• Keep per incoming interface, per prefix (FIB entry) interest satisfaction statistics in routers
• Use the statistics to detect and control bad traffic.
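The bookkeeping described above can be sketched as a pair of counters per (incoming face, prefix). The class name `SatisfactionStats`, its methods, and the use of `None` for "no traffic observed yet" are assumptions for illustration, not part of CCNx.

```python
from collections import defaultdict


class SatisfactionStats:
    """Counts forwarded and satisfied Interests per (incoming face, prefix)."""

    def __init__(self):
        self.forwarded = defaultdict(int)
        self.satisfied = defaultdict(int)

    def on_interest_forwarded(self, face, prefix):
        self.forwarded[(face, prefix)] += 1

    def on_data_returned(self, face, prefix):
        self.satisfied[(face, prefix)] += 1

    def ratio(self, face, prefix):
        """Fraction of forwarded Interests that brought Data back, or None."""
        sent = self.forwarded.get((face, prefix), 0)
        if sent == 0:
            return None  # nothing observed on this face/prefix yet
        return self.satisfied.get((face, prefix), 0) / sent


stats = SatisfactionStats()
for _ in range(10):
    stats.on_interest_forwarded("face1", "/good")
for _ in range(9):
    stats.on_data_returned("face1", "/good")
for _ in range(10):
    stats.on_interest_forwarded("face2", "/good")  # never satisfied

print(stats.ratio("face1", "/good"), stats.ratio("face2", "/good"))
```

A face that mostly requests non-existing content drifts toward a ratio of 0, which is the signal the later mitigation techniques act on.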
12. Weighted round-robin on interest queues
• When an Interest arrives
– If (per-prefix/per-face) pending Interest limit is not reached
• accept Interest and create PIT entry
– If limit is reached
• “buffer” Interest in per-outgoing face/prefix queue (within per-incoming face sub-queue)
• set weight for per-incoming face sub-queue proportional to observed interest satisfaction ratio
– When new PIT slot becomes available
• accept and create PIT entry for an Interest from queues based on weighted round robin sampling
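The queueing steps above can be sketched for a single outgoing face/prefix as follows. This is an illustration under stated assumptions, not the evaluated implementation: the class `InterestQueues` is hypothetical, weighted random selection via `random.choices` stands in for "weighted round robin sampling", and the weight floor of 0.01 (so that a sub-queue with ratio 0 is not starved forever) is an added assumption.

```python
import random
from collections import deque


class InterestQueues:
    """Pending-Interest limit plus per-incoming-face sub-queues."""

    def __init__(self, limit, seed=0):
        self.limit = limit    # pending Interest limit for this face/prefix
        self.pending = 0      # PIT entries currently in use
        self.subqueues = {}   # incoming face -> deque of buffered names
        self.weights = {}     # incoming face -> sampling weight
        self.rng = random.Random(seed)

    def on_interest(self, name, in_face, satisfaction_ratio):
        if self.pending < self.limit:
            self.pending += 1  # accept: a PIT entry is created
            return "accepted"
        self.subqueues.setdefault(in_face, deque()).append(name)
        # Weight proportional to the observed satisfaction ratio (assumed floor).
        self.weights[in_face] = max(satisfaction_ratio, 0.01)
        return "buffered"

    def on_pit_slot_freed(self):
        """Release one PIT slot and refill it from the sub-queues, if any."""
        self.pending -= 1
        faces = [f for f, q in self.subqueues.items() if q]
        if not faces:
            return None
        chosen = self.rng.choices(
            faces, weights=[self.weights[f] for f in faces])[0]
        self.pending += 1
        return chosen, self.subqueues[chosen].popleft()


queues = InterestQueues(limit=1)
print(queues.on_interest("/a/1", "faceA", 1.0))  # limit not reached
print(queues.on_interest("/a/2", "faceA", 1.0))  # limit reached
print(queues.on_pit_slot_freed())
```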
13. Weighted round-robin results
• Partially works
– Fairer sharing of resources
– Not very effective at differentiating bad and good traffic (no-cache scenario)
– Setting queue sizes and lifetimes can get tricky
– Will most likely improve if supplemented with NACKs (under testing)
14. Probabilistic Interest acceptance/drops
• When an Interest arrives
– “accept” if the outgoing face is utilized under a threshold
– Otherwise, accept with probability proportional to the satisfaction ratio for Interests on this face and per-prefix
– Even if satisfaction ratio is 0: “accept” with a low (“probe”) probability
• All “accepted” Interests are still subject to (per-prefix/per-face) pending Interest limit
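The acceptance rule above can be sketched as a single decision function. The function name, the 0.5 utilization threshold, the 0.05 probe probability, and the choice to use the satisfaction ratio directly as the acceptance probability (a proportionality constant of 1) are all assumptions for illustration; the slides do not give the values used.

```python
import random


def accept_interest(utilization, satisfaction_ratio, rng,
                    threshold=0.5, probe_probability=0.05):
    """Decide whether to accept an Interest for a given face and prefix.

    utilization        -- fraction of the outgoing face's capacity in use
    satisfaction_ratio -- observed ratio for this incoming face and prefix
    """
    if utilization < threshold:
        return True  # outgoing face is lightly loaded: always accept
    # Never drop to zero, so a recovering face can still be probed.
    probability = max(satisfaction_ratio, probe_probability)
    return rng.random() < probability


rng = random.Random(42)
trials = 10_000
good = sum(accept_interest(0.9, 0.95, rng) for _ in range(trials))
bad = sum(accept_interest(0.9, 0.0, rng) for _ in range(trials))
print(f"accepted: good face {good}/{trials}, bad face {bad}/{trials}")
```

Accepted Interests would still have to pass the pending Interest limit, which this sketch leaves out.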
15. Probabilistic Interest acceptance Results
• Parameter selection is important but may not be easy due to topology variances.
• May result in link under-utilization
• Works in general
• Might perform better with NACKs (more accurate statistics)
16. Dynamic Interest limit adjustments
• Incorporate “active” PIT management
– Periodically
• for every FIB prefix
– for all faces
» Announce NoPI limit proportional to the satisfaction ratio
– Min limit is 1 and sum of all announced limits is at least equal to sum of output limits:
$$\sum_{\text{face}} \left(\text{ratio}_{\text{face}} \cdot \text{Limit}_{\text{out}}\right) \ge \text{Limit}_{\text{out}}$$
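One way to compute announced limits that satisfy both constraints (a minimum of 1 per face, and a total of at least the output limit) is sketched below for a single prefix. The function `announced_limits`, the rescaling of ratios when they sum to less than 1, the even split when every ratio is 0, and rounding up with `math.ceil` are assumptions; the slides state the constraints but not how they are enforced.

```python
import math


def announced_limits(ratios, limit_out):
    """Map each incoming face to the pending-Interest limit announced to it.

    ratios    -- incoming face -> satisfaction ratio in [0, 1]
    limit_out -- pending Interest limit of the outgoing face
    """
    total = sum(ratios.values())
    if total == 0:
        # No face has any satisfied Interests: split evenly (assumption).
        shares = {face: 1.0 / len(ratios) for face in ratios}
    elif total < 1.0:
        # Rescale so that the announced limits still add up to limit_out.
        shares = {face: r / total for face, r in ratios.items()}
    else:
        shares = dict(ratios)
    # Every face keeps at least one slot so that it can recover later.
    return {face: max(1, math.ceil(share * limit_out))
            for face, share in shares.items()}


print(announced_limits({"good": 1.0, "attacker": 0.0}, 50))
print(announced_limits({"left": 0.5, "right": 0.5}, 50))
```

With these inputs, a face whose Interests are never satisfied is throttled to a single pending Interest while a fully satisfied face keeps the whole output limit.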
17. Illustration of Dynamic limits
[Figure: per-face satisfaction rates and announced limits in an example topology, shown in three panels: just before the attack, immediately after the attack starts, and stabilized after the attack.]
18. Dynamic limits results
• Does not require much parameter tweaking
• Works with all topologies tested
19. Large scale experimental setup
• Rocketfuel Sprint topology
• 7337 routers and 10 000 links
• Only adjacency information (no link characteristics)
• Extracted:
– 535 backbone routers
– 3339 gateway routers
– 3463 customer routers
• Backbone <-> Backbone links are 100 Mbps with 70 ms delay
• Backbone <-> Gateway links are 10 Mbps with 20 ms delay
• Gateway <-> Customer links are 1 Mbps with 20 ms delay
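Because the topology provides adjacency only, link characteristics have to be assigned from the classes of the two endpoint routers. A minimal sketch of that assignment follows; the table and function names are hypothetical, link speeds are read as Mbps, and pairs of classes that the setup does not list return `None` rather than a guessed value.

```python
# (bandwidth in Mbps, delay in ms) keyed by the unordered pair of router classes
LINK_PARAMS = {
    frozenset(["backbone"]): (100, 70),
    frozenset(["backbone", "gateway"]): (10, 20),
    frozenset(["gateway", "customer"]): (1, 20),
}

# Router counts extracted from the topology.
ROUTER_COUNTS = {"backbone": 535, "gateway": 3339, "customer": 3463}


def link_params(class_a, class_b):
    """Return (bandwidth_mbps, delay_ms) for a link, or None if not specified."""
    return LINK_PARAMS.get(frozenset([class_a, class_b]))


print(link_params("gateway", "backbone"))
print(sum(ROUTER_COUNTS.values()))
```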