SMART Seminar: Massively Interacting Systems



This talk discusses advanced computationally assisted reasoning about large interaction-dominated systems and addresses the role that the details of huge numbers and levels of intricate interactions play in current fields of research. It was delivered at the SMART Infrastructure Facility by Professor Chris Barrett on September 26, 2012. For more detail, see

Published in: Education, Technology

  • In general, there is a theoretical form of the question that is semantics-free. Predecessor existence and reachability, “validation” and “prediction” are both very subtle; the systems branch. The theory is new, connects deeply to the theory of computation, and maps onto HPC.
  • Neighborhoods have social, functional (graphical) and spatial (a different graphical) structure, and these all interact.
  • Neither the gene nor the embryonic state supervenes with respect to the morphology of the pigmentation.
  • Normally the gene is presumed to create the color pattern. Here the gene in contact with the last color pattern makes a new one. One aspect does not supervene on the other with respect to the morphological state of the mouse. Only in the interaction is it possible to create the phenomenon.
  • Stiglitz housing patterns are a similar excitable medium with diffusive communication. Terrible story of Belousov-Zhabotinsky.
  • T=0. State that expert opinion was used to create this slide/information.
  • Buildings from DTRA, with red indicating areas of high casualty probability (upper right); street network (NAVTEQ) and people positions at the time of detonation in the detailed study area (lower right). Bottom left picture shows the power outage area as a light purple polygon. Locations in the power outage area are plotted as red; locations with power are green. 730,833 persons in the DSA at time of detonation, T=0. 146,337 locations (includes transportation nodes). Small labels: people, built infrastructure, position of people in the DSA and, at left, power outages.
  • Note: Green: no collapse; Yellow: sideways collapse; Red: 100% collapse. The blue ring is a 2.2 km circle; different colors on the buildings represent the level of collapse. If needed, I can generate another one quickly tomorrow morning, maybe using the data for a 3.2 km circle. T=0. Buildings DTRA says had collapsed.
  • Roadway network from NAVTEQ with damage (upper left). Road network zoomed, with level of damage included (lower left). Walk network with damage (lower right).
  • CloseAlive_Pairs.mov. Point: you can look at the data this way. The transportation system is the same in both cases. Lots of bars on the roads, because that is where people are… Building points are the front door, thus the bar on the street. Distribution of the population -
  • Blue – Cell 1 greater. Purple – Cell 2 greater. Title: transportation link demand and/or density.
  • Green is bad, red is good. Average level of health state (i.e., a high number, red, is Full Health) per location, based on inside vs. outside. All health levels are shown, so the uninjured are averaged in. Move to t=0. Blank spots are the sparse areas where we have very few to 0 people at the time of the blast.
  • Database table sizes: input data tables 3.55 GB; output dynamic data for 1 cell (126 iterations, 80 hours of simulated time) 8.06 GB location tables, 19 GB person tables. Disk usage: input data 1.16 GB; dynamic data for 1 cell (126 iterations, 80 hours of simulated time) 15 GB. Computation time for Run 1413: the Behavior Module runs in about 2 minutes but uses 96 cores, so time spent in computation is roughly 2*96 = 192 minutes/iteration. Router execution time varies depending on the number of routes. To compute approximately 200,000 routes, the runtime is about 8 minutes and uses 6696 nodes/12 threads node for all 124 iterations. The router uses approximately 40 nodes with 12 threads/node, so computation time is roughly 40*12*10 = 4800 minutes.
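The core-time figures quoted in the notes above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `core_minutes` is a hypothetical helper, and applying the 124-iteration count (which the notes state for the router) to the Behavior Module is an assumption, not something the notes say.

```python
def core_minutes(wall_minutes: float, cores: int) -> float:
    """Core-time consumed: wall-clock minutes multiplied by cores in use."""
    return wall_minutes * cores


# Behavior Module: about 2 minutes of wall time on 96 cores per iteration.
behavior_per_iteration = core_minutes(2, 96)

# Assumption: 124 iterations, the count the notes give for the router.
behavior_total_hours = behavior_per_iteration * 124 / 60

# Router: the product written in the notes (40 nodes * 12 threads * 10 minutes).
# The factor 10 is taken from the notes' formula, although the text quotes
# a runtime of about 8 minutes.
router_core_minutes = core_minutes(10, 40 * 12)

# Transportation row of the resource slide: 13.75 hr wall time on 648 cores.
transportation_core_hours = 13.75 * 648

print(behavior_per_iteration, behavior_total_hours)
print(router_core_minutes, transportation_core_hours)
```

Under that assumption the Behavior Module total comes to about 396.8 core-hours, close to the 397 hr listed for Behavior on slide 62, and 13.75 hr on 648 cores gives 8910 core-hours, close to the 8911 hr listed for Transportation.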

    1. 1. Massively Interacting Systems: Thinking & deciding in the age of Big Data. Chris Barrett, Scientific Director, Virginia Bioinformatics Institute
    2. 2. Part 1
    3. 3. What is interaction? What’s the issue? • A finite undirected graph Y • A sequence of local maps • An ordering π of the vertex set of Y • $[F_Y, \pi] = \prod_i F_{\pi(i)}$
    4. 4. Hypergraph
    5. 5. “Genuine” social entities & interactions “ .. [usual causal] hierarchy collapses when causality crosses across units and levels….human behavior in social setting is interdependent …. although … not a new insight, social life is interdependent in … spatial forms – things “go together” in and across distinct places …. which might be better described as neighborhood causal processes…” Robert J. Sampson, The Great American City, 2012
    6. 6. What interacts in an evolving city? • People are entities that have purposes, needs, capacities and interact • Neighborhoods are entities that have purposes, needs, capacities and interact, “have their own logic and causality.” • Causes, causal interactions, occur across “normal” causal boundaries – People interact with people and neighborhoods – Neighborhoods interact with neighborhoods and people – E.g., self-selection bias, extra-neighborhood proximity processes, etc. are within- and among-network processes that do not supervene on one another.
    7. 7. This is not entirely unique to neighborhood selection • Traffic and transportation • Motives/goals, activities, transport resource, transport infrastructure, resource competition, form and function of infrastructure, traffic, communicated dynamics, time delays, goal failure/success, etc. loop and evolve • Genetic predisposition, homophily, family and peer mimicry, other social functionalities affect • Success, variously • Suicide • Smoking • Obesity • Healthful behaviors, etc.
    8. 8. In fact, it is seen in biology • Suzuki, et al, 2003 • The pigmentation control gene Fox1 is defective in a mutant mouse and shuts down the normal process by which pigmentation patterns are stabilized in the skin/hair of the mouse. • It creates moving waves of color striping • This gene normally can produce all solid, spot and striped patterns by simply activating at particular times in embryonic development; the morphology of the embryo at that time determines the pattern created by the gene • In the mutant case, the continued malfunction, given the most recent color morphology, generates a new pattern, and so on.
    9. 9. Traveling Stripes - Suzuki. These dynamics are of the same kind as in the Belousov-Zhabotinskii reaction of nonlinear waves in excitable physical media
    10. 10. Are they the same causal class? • The Chemical Basis of Morphogenesis, Turing 1952 • Usually, diffusion processes (local communication) stabilize a mixed system, but under “excitable” media conditions, structure appears and evolves • It is seen in physical chemistry and excitable media • It is, essentially, a rewrite computational THEORY of interaction that is just now being really discovered
    11. 11. Beyond traditional genomics morphology morphology gene Maybe
    12. 12. Beyond traditional social modeling Social context Social context individual behavior Definitely
    13. 13. And even in physical systems• Chemical morphogenesis, B-Z dynamics
    14. 14. The “inside-outside” problem is a related issue of non-supervenience • What is an organ? – Biome • What is in an organism, what is outside of it? • What/where is a thought? – Extended mind – Distributed algorithmic accounts of causes • What is an agent? – Is agency necessarily encapsulated? – Driver behavior • What is an urban agent?
    15. 15. Part 2
    16. 16. Where does Big Data come from? Metric, declarative, procedural sources & integration
    17. 17. What is Big? • The world’s technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, every day 2.5 quintillion (2.5×10^18) bytes of data were created [stored]. Wikipedia, “Big Data”, August 2012
    18. 18. Unstructured data growing
    19. 19. Storage shortfall: need synthesis
    20. 20. Virtualized storage is exponential; not enough.
    21. 21. Unit price down, total investment up
    22. 22. In genomic sequencing, bp/$ is down exponentially
    23. 23. Moore’s Law is not enough
    24. 24. New post-genomic reality: The cost drivers have shifted to analytical computing
    25. 25. Data creation, deletion & storage • We will know what, of all that data, it is possible to “forget” only when we know how to summarize what is possible • That’s a big analytical problem, as we will see • Use of graph theory and graphical dynamical systems (networks) is essential – Computationally very intensive
    26. 26. Part 3
    27. 27. Who cares?
    28. 28. Massively Interacting Systems • These things produce branching processes • Sometimes they are periodic, sometimes they are not • They do not explore the entire possible state space (not all morphologies are expressed) • So even with the immense amount of underlying data necessary, decision analysis must produce infinitely more • This complicates measurement as well as theory making, in the sense of acceptable explanations of observations • It makes observed, metric data, declarative data and procedural information all essential • It is the effects of processes of composition of complex interactions that ultimately generate so much data, both measured and synthesized.
    29. 29. Q: How can we support human analytical capabilities in this situation?
    30. 30. The end of the great man theories of….. decision making • Many stakeholder synthetic information • The analysis environment is not separated from “the world” • An entirely new interaction medium will create NEW REALITIES • The approach must involve human expertise and context; it must be a cognitive augmentation system • It must involve distributed, social cognition • It must follow context & allow information deletion • It will change scientific process and assumptions
    31. 31. People are interconnected properties Age 26 26 7 Income $27k $16k $0 Status worker worker student Automobile
    32. 32. Extra-household connectivities also influence/reflect motives, activities and behavior [Figure: Jill, Shawn, John, Joe, Mary, Ron, Jane and Tim joined by office links, friendship links and family links]
    33. 33. Interdependent motives & activity structures of individuals underlie observable behavior
    34. 34. Unencapsulated Agency: The inside/outside problem
    35. 35. Built, functional, locational structure defines where activities occur and influences movement/comms • Synthetic activity locations, such as homes, are placed with probability proportional to location geo-functional weights: (type: home location – # people, cost, etc.) California Illinois
    36. 36. Bipartite map of people with activities onto appropriate locations with functional capacities • Motivated people, vertex attributes: age, household size, gender, income • Activity-appropriate locations, vertex attributes: coordinates, type • Edge attributes: activity type (shop, work, school), (start time 1, end time 1), (start time 2, end time 2)
    37. 37. Many social contact networks: this one is physical proximity × duration
    38. 38. Example: large scale socio-physical interaction • Attack in Washington DC – NPS1, a 2006-based unclassified study scenario with lots of people publishing and even putting lectures on YouTube • Basically we wanted to know if there really might be significant social behavior options in the immediate aftermath that could be imagined & that might have long term influence • Disaggregate, detailed socially-coupled simulation used combined with physical modeling
    39. 39. Technical Perspective: Socially-coupled systems • Massively interacting systems generating arbitrarily much data • Want a general, re-usable approach. Many examples: transport, Facebook, biosystems, economic systems • Generally, HPC-based data-centric methods and network science/network dynamics are central • Socially-coupled systems display a lack of symmetries => problems for usual dimensional reduction approaches • Systems are huge, details matter • Detailed disaggregate modeling, appropriate abstractions, novel HPC simulation methods & statistical approaches are necessary • Necessary source information is diverse, including process knowledge • Totally different view of decision analysis necessary
    40. 40. A: Synthetic decision informatics for large, complex socially coupled systems
    41. 41. Contextual Synthetic Information• The information platform is the interaction medium• The only way to really deal with the massively interactive, branching—thus extreme data— world.
    42. 42. Part 3.1
    43. 43. Physical Event in a Social Context • Event put “on top of” a normally functioning day’s population dynamics • National Planning Scenario 1 • Unannounced detonation • Time: 11:15 EDT • Date: May 15, 2006
    44. 44. Time 0:00: Damage to power network and long-term power outage area • Probability of damage to individual substations (high/medium/low) • Aggregated outage area • Long-term outage area devised by geographically relating the location of substations in the city to the blast damage zones. • Loss of a substation has a much more widespread impact on the power provided to customers.
    45. 45. Time 0:00: Infrastructure: initial laydown • Positions and demographic identities of individual synthetic people in the DC region were calculated at the time of detonation. • Street addresses mapped to geo-functional data • Persons traveling to destinations were placed outside on transportation networks: walk, roadway, metro, bus. • Power outage, damage, collapse, rubble, blast temp, radiation dose rate assigned to each location and transportation network node • Panels: Built Infrastructure, Power Outages, Position of People
    46. 46. Time 0:00: Building Collapse Distribution
    47. 47. Time 0:00: Damage to transportation networks • Red: completely damaged • Orange: highly damaged; reduced travel speed • Green: medium damage • Blue: light damage • White: no damage • Panels: Road network, Walk network
    48. 48. Part 3.2
    49. 49. Social-behavioral Event in a Physical Context • No communication – green • Partial communication restoration – blue • First 29 hours
    50. 50. Time +0:00 to +0:10: Transportation load comparison • Blue – higher load in No Restoration case • Purple – higher load in Partial Restoration case
    51. 51. Composite behavior differences w & w/o early restored comms
    52. 52. CIIMS Avatars automatically create realistic individual behaviors through large scale interaction, local machine intelligence • New timeline feature: Scenario displays details connected to timeline • New use of timeline: detailed analysis of interdependent individual behaviors
    53. 53. Interdependent, contextual, intentional individual avatar behaviors induce social level effects w/o scripting
    54. 54. A drama in machine intelligence: Reuniting a family after the disaster • Clair and Denise: mother and infant daughter; +0:00 – Home; both uninjured • Cliff: father; +0:00 – At work; uninjured • Theo: son; +0:00 – Daycare; uninjured
    55. 55. Calls finally go through • Clair and Denise: +3:05 – Evacuate city; doesn’t know where Theo is • Cliff: +3:00 – Call to Clair successful; stops panicking and finds shelter; +3:10 – Call to Theo (i.e., daycare worker) successful • Theo: continues shelter in daycare
    56. 56. Initial Panic • Clair and Denise: +0:00 – Shelter at home; repeatedly calls 911; both exposed to 10 cGy in first 10 minutes • Cliff: +0:00 – Panics, abandons car, heads to nearest hospital; exposed to 0.4 cGy in first 50 minutes • Theo: +0:10 – Workers bring children to nearby building for shelter; no exposure
    57. 57. Family Reconstitution • Cliff: +44:30 – Leaves shelter • Theo: remains at daycare
    58. 58. Evacuation • Cliff: +45:00 – Arrives at daycare; evacuates city with Theo
    59. 59. Aggregate behavioral details & exposure to injury • Each individual’s daily or event-context-driven activities take them inside and outside periodically; the details affect their injury level at the time of, as well as after, the blast. • Injury traversing rubble • Delay of access to care, etc. • Panels: Outdoors, Indoors
    60. 60. Socio-technical influences on individual behavior • If communication is provided earlier and contact made: less panic and unstructured behavior, more sheltering, less searching, etc. • There are hundreds of thousands of these avatars and many different specific motivations, or perhaps, different complex contextual embodiments of similar generic motivations • The composite effect on many things, including exposure to injury, cannot always be calculated in aggregate in particular scenarios from data obtained elsewhere. • Supporting problem evolution and the extreme importance of sparse sequential analysis is a major conclusion of this study. • The first 72 hrs is not the same problem as what follows. Saturated performance from initial behavioral models as situation evolves. • These methods do more than better answer a given question:
    61. 61. Part 3.3
    62. 62. Data Intensive Computing Resources
        Module           Wall time   Compute time   Cores
        Transportation   13.75 hr    8911 hr        648
        Behavior         3.92 hr     397 hr         96
        Communication    9.53 hr     9.53 hr
        Health           4.3 hr      4.3 hr
        Infrastructure   1.4 hr      1.4 hr
        *Summary over all iterations, r1413
        Data       Initial    Dynamic (1 run)   Complete design (2 cells, 30 replicates)   20 weeks, 2M individuals, full design
        Database   3.55 GB    27 GB             25 TB                                      250 TB
        Disk       1.16 GB    15 GB             20 TB                                      175 TB
    63. 63. Thanks
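
The ingredients listed on slide 3 (a finite undirected graph Y, a sequence of local maps, an ordering of the vertex set) can be sketched in code. The following is a minimal Python illustration, not the speaker's implementation: the four-vertex circle graph, the NOR local rule, and the names `sds_map`, `nor` and `circle4` are all assumptions chosen for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Graph = Dict[int, List[int]]   # adjacency lists of a finite undirected graph Y
State = Tuple[int, ...]        # one binary value per vertex


def sds_map(graph: Graph,
            local_fn: Callable[[List[int]], int],
            order: Sequence[int]) -> Callable[[State], State]:
    """Compose local maps in the given vertex order into one global map."""
    def update(state: State) -> State:
        x = list(state)
        for v in order:
            # Each local map reads the current states of v and its neighbors
            # and rewrites only the state of v.
            neighborhood = [x[v]] + [x[u] for u in sorted(graph[v])]
            x[v] = local_fn(neighborhood)
        return tuple(x)
    return update


def nor(values: List[int]) -> int:
    """Hypothetical local rule: 1 only if every input is 0."""
    return int(not any(values))


# Hypothetical example graph: a circle on four vertices.
circle4: Graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

forward = sds_map(circle4, nor, order=[0, 1, 2, 3])
backward = sds_map(circle4, nor, order=[3, 2, 1, 0])

start: State = (0, 0, 0, 0)
print(forward(start))
print(backward(start))
```

Applying the two maps to the same all-zero state gives different results, which is the sense in which the ordering is part of the system's definition and not an implementation detail.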