Big problems



Just some presentation I gave a few years ago.

Published in: Technology, Business
  • Questions: How many people have been in a data center at any point in their career? How many people have been in a data center in the last year? How many people have been part of the construction, staging and turn-up of a data center in their career? How many have in the last year?
  • Streamed or buffered audio and video (RTSP, RTP, RTMP, Flash), peercasting (PPStream, Octoshape), placeshifting (Slingbox, home media servers)
  • Architect must distill patterns to find a common way of testing for rational justification
  • Architected over hundreds of years. Scale evolved over several generations. Purpose and intent are left to interpretation, but it is believed this was a place to bury highly important people in the culture, possibly the architects themselves. 3000 BC: a ditch, a bank and a ring of 56 pits (the Aubrey Holes) were dug into the chalk, possibly to hold bluestones from Wales. 500 years later the sarsen stones were put up and the bluestones were moved. Avenue to the river. Many generations, abandoning one form and moving to another. There were not many scaling problems here, but lots of mathematics and astronomical knowledge (moon and sun trajectories). Some stones, the "bluestones", weighed four tons each and were brought a distance of 150 miles from Pembrokeshire, Wales. With the introduction of copper and gold, personal wealth led to individual burials.
  • BCE: Before Common Era. They were excellent at dealing with wood and stone.
  • The Rosetta Stone, among other things, is a public notice, part of which says “with regard to the priests, that they should pay no more as the tax for admission to the priesthood than what was appointed them throughout his father's reign and until the first year of his own reign; and has relieved the members of the priestly orders from the yearly journey to Alexandria;” basically easing the priests' tax burden.
  • Next slide: into Neolithic architecture. We are going to dabble in some ancient architectures and see how they can be related.
  • This is the low level
  • MESIF (Modified, Exclusive, Shared, Invalid, Forward); CAP (Consistency, Availability, Partition tolerance); REST (Representational State Transfer); pNFS (parallel NFS, NFSv4.1); DHT (Distributed Hash Table); NoSQL (Not Only SQL); DSL (Domain Specific Language); ORM (Object Relational Mapper); PCM (Phase Change Memory); TSV (Through Silicon Via)
  • This is the high level
  • Scale goes from simple structures to whole cities. Imhotep. Invention of writing around 3100 BCE. The Sumerians were the first society to create the city itself as a built form.
  • Imhotep. Appears in late Neolithic.
  • Also Gunther. Our focus is to model Poisson arrival rates and service times even though Ethernet exhibits some self-similar behavior (i.e. long-range dependence, LRD). Contention (i.e. spinlock, row lock, etc.). Coherency = consistency. “The problem of characterizing Internet traffic is not one that can be solved easily, once and for all. As the Internet increases in size and the technologies connected to it change, we must constantly monitor and reevaluate our assumptions to ensure that our conceptual models correctly represent reality.” [1]
  • Serialized contention: Hyper-Threading (SMT), spinlocks, mutexes. There is a field of study around lockless algorithms. As parallel processes increase, serialized contention becomes the prominent dependency. While there are other ways of modeling data, what is important to recognize is that a completely Poisson model is what allows us to balance out the load. The more self-similar or LRD the traffic, the more problematic it becomes to model behavior. Ethernet actually exhibits LRD behavior on the output; how much of this will cause bad architectural strategies? Like the conservation of mass, you have the conservation of bottlenecks: bottlenecks are neither created nor destroyed, they simply move from one point to another. Why should we pay attention to these models? Any architecture which is not based on these simple mathematics will be difficult to model correctly, and thus capacity planning will be ineffective. People always place the burden on the application to deal with bottlenecks, but only so many implementations allow for a significant change, for instance the use of GPUs for vectorization. This is the classical “speed-up” model, in which we reduce execution time by adding more SIMD-capable computation engines. Scale-up, by contrast, allows application demand to grow while keeping the serialized overhead the same (i.e. the same service rate) in order to protect customer expectations of service level. Other models: geometric model, quadratic model, exponential model. Think of a and k as state and federal taxes.
  • Gunther: coherency overhead. The two variables are sigma (serialized contention) and kappa, the coherency (consistency) overhead. “Brawny cores still beat wimpy cores, most of the time”, Urs Hölzle, Google: software development costs often dominate a company's overall technical expenses, so forcing programmers to parallelize more code can cost more than we'd save on the hardware side.
  • The drawings on the left were found by the French at the quarries of Gebel Abu Feida in 1789. These pillar capitals, destined for a temple at Denderah being built by Cleopatra, were sketched with red ochre on the rock face at half the natural size. Inventions in Ancient Egypt: black ink, first ox-drawn plows, 365-day calendar and leap year, paper, first triangular-shaped pyramids, organized labor, hieroglyphics as an early system of writing, sails.
  • NFSV1 file striping
  • Number of elements to a set (find largest match)
  • An ancient mechanical computer designed to calculate astronomical positions. The device, they say, is technically more complex than any known device for at least a millennium afterwards. The text is astronomical, with many numbers that could be related to planetary motions, and the gears are a mechanical representation of a second-century theory that explained the irregularities of the Moon's motion across the sky caused by its elliptical orbit.
  • “Memory trespass vulnerabilities are software weaknesses that allow memory accesses outside of the semantics of the programming language in which the software was written.” Fuzzing is used to uncover unknown application behaviors, which can then be used to create an exploit.
  • We can see what they were able to accomplish but don’t know how or why. The architecture remains and can be studied even though it has no use today. Different scales, moving towards defined purposes: a burial ground, but for individuals with great wealth. From monuments for the group to monuments for the most wealthy and powerful. Each architecture develops to serve a purpose and then may be discarded or refactored for other purposes.
  • It wasn’t the intent of the ancient architects to define their architectural period; it is for us to analyze the history of design and how its patterns change. The pharaohs wanted to do something “godlike”, like live forever. It was the architect who had to figure out a way of explaining it, even though it required massive engineering skill. Maybe Imhotep's intent was to build the pyramid and he got a buyer for it.
  • Architected over hundreds of years. Scale evolved over thousands of years, from smaller stones to bigger stones. Many iterations, many stages. 500 years after the bluestones, the sarsen stones appeared. 3000 BC: a ditch, a bank and a ring of 56 pits (the Aubrey Holes) were dug into the chalk to hold bluestones. The "bluestones" weighed four tons each and were brought a distance of 150 miles from Pembrokeshire, Wales. Many generations, abandoning one form and moving to another. There were not many scaling problems here, but lots of mathematics and astronomical knowledge (moon and sun trajectories). Ended with the introduction of copper and gold; personal wealth led to individual burials.
  • Other models: geometric model, quadratic model, exponential model. Think of a and k as state and federal taxes.
  • Cardinality: Measure of the number of elements to a set (find largest match)

    1. 1. Gary Berger, Technical Leader, Engineering, Office of the CTO, DSSG. Biggest Problems in Cloud Design Today. Source:
    2. 2. Internet being dominated by real-time entertainment. Source: Sandvine, 2010 Global Internet Phenomena Report
    3. 3. What is an Architect? IMHOTEP: DOCTOR, ARCHITECT, HIGH PRIEST, SCRIBE AND VIZIER TO KING DJOSER. “An architect does not arrive at his finished product solely by a sequence of rationalizations, like a scientist, or through the workings of the Zeitgeist. Nor does he reach them by uninhibited intuition, like a musician or painter. He thinks of forms intuitively, and then tries to justify them rationally.” Peter Collins, 1966. “Good architecture has been seen largely as either working within a context or circumventing it, depending on which principles are adopted and where the cutting edge is perceived.” Theory of Architecture, Paul-Alan Johnson, 1994
    4. 4. Neolithic Architecture, 6800 BCE – 3200 BCE. Stonehenge, circa 3000 BCE. Secrets of Stonehenge, Nova 2009
    5. 5. Neolithic Technology: The Lever, Ball Bearings
    6. 6. Why is Architecture hard to understand? “Whereof one cannot speak, one must pass over in silence.” Wittgenstein
    7. 7. Tacit Knowledge (Informal Knowledge) • Knowledge that is difficult to transfer to another person by means of writing it down or verbalizing it. • Knowledge which cannot be codified, but can only be transmitted via training or gained through personal experience. • Inherent “know-how”, as opposed to “know-what” (facts), “know-why” (science), or “know-who” (networking). It involves learning and skill but not in a way that can be written down. Source: adapted from The Tacit Dimension, philosopher-chemist Michael Polanyi. W.T. Wallington walks a 21,600 lb stone
    8. 8. Explicit Knowledge
    9. 9. “Knowledge as the Competitive Resource” • “Knowledge is not just another resource alongside the traditional factors of production (labor, capital and land) but the only meaningful resource today” [Drucker, 1993] • “Knowledge is the source of the highest quality power and is the key to the power-shift that lies ahead. Knowledge is not merely an adjunct of money power and muscle power but eventually will be the ultimate replacement of other resource” [Toffler, 1990] • “The economic and producing power of a modern corporation lies more in its intellectual and service capabilities than in its hard assets such as land, plant and equipment” [Quinn, 1992]
    10. 10. Caution: low flying cloud... into the fog
    11. 11. ExalogicVCEXMLESXKVMLinux 2.7FusionIBMRFlashFPGAWebsphereHTML5CLOSButterflyHypercubeCayley TreeSpringSourceCongestionPUECisco
    13. 13. Cloud Architecture. Adrian Colyer, CTO SpringSource
    14. 14. Data Center Blueprint (diagram labels: Independent Compute POD, Data Network, Unified I/O 10GE, Data Snooping/Migration, Capacity Scaling, Block Store, I/O Scaling, POD Services Tier, Client Access Tier, HTTP, Compute/Data Grid)
    15. 15. Things we are going to talk about • Dealing with Scalability • Dealing with Data • Dealing with Security
    16. 16. Sumerian Architecture, 3600 BCE – 2300 BCE. Pyramid of Djoser, 2630 BCE – 2611 BCE
    17. 17. Sumerian Technology: The Wheel, circa 3500 BCE; Adobe brick
    18. 18. What is Scalability? Mechanical and biological systems all have limits. • All systems reach a limit relative to their size. • Understanding where these limitations arise gives us a clue where to look for performance bottlenecks. • Architects typically find limitations through trial and error. Scaling Factors: • Concurrency = the interaction between processors • Contention = the degree of serialization on shared writable data • Coherency = penalty incurred for maintaining consistency of shared writable data
    19. 19. Processor Scalability: What happens when you break a bottleneck!
    20. 20. Nominal Computer Access Times. Source: Analyzing Computer Systems with Perl, Gunther. Source: Jeff Dean, Google
    21. 21. Scalability Can Be Measured. Guerrilla Capacity Planning, Gunther, 2007. Universal Scalability Law • C(p) = scale-up | scale-out • p = number of processors • a = serialized fraction (contention) • k = coherency, k >= 0 • Scalability is not infinite but a concave function. We are making an assumption here that arrivals are Poisson, with exponentially distributed interarrival and service times.
    22. 22. Why Scale-Up is Important: Beyond Wimpy Cores (chart labels: max capacity at p*; asymptotic maximum ceiling; coherency, k, starts to dominate; Amdahl, k = 0)
    23. 23. Conclusion: We Need Models • Effectively modeling some of these characteristics is a top-of-mind problem for current application architects • Eric Brewer's CAP Theorem challenges architects to deal with latency as a proxy for strong consistency • Much work is going on in understanding these problems and building a balance between availability and consistency (i.e. adaptive consistency) • Some patterns make it difficult to model mathematically. Moore’s Impact [1] • Technologist’s Moore’s Law: double transistors per chip every 2 years; slows or stops: TBD • Microarchitect’s Moore’s Law: double performance per core every 2 years; slowed or stopped: early 2000s • Multicore’s Moore’s Law: double cores per chip every 2 years; double parallelism per workload every 2 years, aided by architectural support for parallelism; double performance per chip every 2 years. Or GAME OVER? [1] Amdahl’s Law in the Multicore Era, Hill, Marty, Wisconsin Multifacet Project
    24. 24. Ancient Egyptian Architecture, 3000 BCE – 300 CE. Pyramids at Giza, 2575 BCE to 2150 BCE. Hatshepsut’s Temple, circa 1482 BCE
    25. 25. Ancient Egyptian Technology: Schematics (Denderah and the Temple of Hathor being built by Cleopatra, circa 30 BCE – 14 CE), Process Documentation, Rope Making
    26. 26. Data Management. Data management is the development, execution and supervision of plans, policies, programs and practices that control, protect, deliver and enhance the value of data and information assets. What are the two most important commands in the data center today? (NFS Read/Write) Source: Data Management International
    27. 27. Data Management: Models and Practices • Request level parallelism • Data level parallelism • Persistence model: durable, volatile, transient • Caching eviction policies • Synchronous/asynchronous updates • Denormalization of data • Caching trees: anti-cache spoilers • Distributed hash tables (NoSQL): key/value, column, document, graph • Messaging and serialization (IPC): lightweight interfaces (PB, Thrift, HC) • Distributed transactions: opportunistic locking, vector clocks, Paxos protocols
    28. 28. Flash Crowds: demand spike on a singular resource. Jason McHugh, Principal Engineer, Amazon • In 69.6 seconds, received 31K requests for a single object • Cache spoilers • Cache trees and a coherency protocol built in to relax consistency to protect availability
    29. 29. Data Structures: Set Theory. Source: Big Data in Real-Time at Twitter, Nick Kallen, QCONSF, 2010
    30. 30. Classical Architecture, 850 BCE – 475 CE. Parthenon
    31. 31. Classical Technology, 150 BCE – 100 BCE. Antikythera Machine
    32. 32. The “Illusion” of Security • Perimeter defense seals off the data center, so the attack surface moves to the client • Attackers find the path of least resistance: email addresses, social websites, standard naming practices (i.e. firstname.lastname@company.com). “Simply keeping out bad code is not sufficient to keep out bad computation”, Stefan Savage, UC San Diego. The Apple I, recently sold for $210,000
    33. 33. Modern Attacks: Easy to 0wn. Examples: • Memory Trespass • Rogue AV through mass mailings • Injection Flaws (SQL, OS, LDAP) • Cross Site Scripting • Broken Authentication and Session Management • Insecure Direct Object References • Cross-site Request Forgery. Summary: • Normal processing leads to code execution: receive packet/request, parse display/data. Mitigation Strategies: • ASLR (Address Space Layout Randomization) • DEP (Data Execution Prevention) • Stack Cookies • Sandboxing • Need to understand strategy more than tactics
    34. 34. Workstation Attack Surface. Source: Dino A. Dai Zovi, Memory Corruption, Exploitation and You
    35. 35. Zero Day Attacks • The price of disclosure? • There are 1419 Researchers working at ZDI? • ZDI can be used to launch a new Aurora attack
    36. 36. Modern Browser Attack Graph. Source: Dino A. Dai Zovi, Memory Corruption, Exploitation and You
    37. 37. Architectural Ladders, 3000 BCE to 300 CE: Neolithic Architecture, Sumerian Architecture, Ancient Egyptian Architecture, Classical Architecture
    38. 38. Architecture • Architecture is created to express some intent but is not the purpose itself, therefore architecture must serve a purpose • Architectures must evolve or die, sometimes at the expense of the intent and function • Architectures can be rediscovered, refactored and reused for a new purpose or function • Architectures may not realize their full potential • Architectures do not replace fundamentals in engineering and science but establish a pattern from which to describe its effectiveness. Foote, Yoder, 1999, The Big Ball of Mud. ZIGGURAT: Dubai’s Carbon Neutral Pyramid Will House 1 Million
    39. 39. Conclusion • Some of the problems today were recognized over a decade ago but lacked the economic justification for change • History is repeating as we move to refactoring architectures of the past (“Engineered Solutions”), just at different scales • New architectures are being proposed based on empirical evidence, prototyping and experimentation; others are just a horrible guess • Architects need to quickly establish new patterns with the goal of pushing the bottlenecks to the least-cost contributor (i.e. Energy Proportional Computing) • Architecture should help us to describe the intent of the product or function, not merely as a generalization • Architectures today are agile • Architecture for efficient computing maximizes processing power per joule of energy
    40. 40. Uggh.. Predictions? • By 2012, 20 percent of businesses will own no IT assets • By 2012, India-centric IT services companies will represent 20 percent of the leading cloud aggregators in the market (through cloud service offerings) • By 2012, Facebook will become the hub for social network integration and Web socialization • By 2013, mobile phones will overtake PCs as the most common Web access device worldwide • By 2014, most IT business cases will include carbon remediation cost • By 2014, over 3 billion of the world's adult population will be able to transact electronically via mobile or Internet technology • By 2015, context will be as influential to mobile consumer services and relationships as search engines are to the Web • By 2016, all Global 2000 companies will use public cloud services.
    41. 41. Thank You
    42. 42. Backup
    43. 43. Stonehenge – Woodhenge – Bluehenge. Around 3 miles
    44. 44. Meta Structures to scale (diagram labels: Service Directory, MetaData, Content)
    45. 45. Persistency. pNFS (RFC 5661): • Parallel opens by filehandle • Asynchronous notification on lock availability • Commands linearized in slot table • Support for File, Object and Block targets. HoneyComb 2: • Automated data management • Extreme data mobility • Ability to run 3rd party storage apps • Highly reliable with self healing • Flat name space • Single management entity • Multi-cell architecture • Programmatic APIs • Immutable • Automatic load balancing • Transparent node upgrades • Meta-data support • Storage apps support • Deferred maintenance model • Open-Source Software only
    46. 46. Clustered Scalability. Guerrilla Capacity Planning, Gunther, 2007. Universal Scalability Law • C(p) = intranode scalability • n = nodes • p,n = processors/node • az = global internode contention • kz = global internode coherency
    47. 47. Impact on application. Source: Big Data in Real-Time at Twitter, Nick Kallen, QCONSF, 2010
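
Slides 21 and 22 refer to the Universal Scalability Law without spelling it out. As a minimal sketch, assuming the form Gunther gives in Guerrilla Capacity Planning, C(p) = p / (1 + a(p - 1) + k p (p - 1)), with a and k as defined on slide 21, the capacity curve and its peak p* could be computed as below. The function names and the sample values of a and k are hypothetical, chosen only for illustration.

```python
import math


def usl_capacity(p, a, k):
    """Relative capacity C(p) for p processors under the USL.

    a: serialized fraction (contention), 0 <= a < 1
    k: coherency penalty, k >= 0 (k = 0 reduces to Amdahl's law)
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    return p / (1.0 + a * (p - 1) + k * p * (p - 1))


def usl_peak(a, k):
    """Processor count p* at which C(p) is maximal.

    Setting dC/dp = 0 gives p* = sqrt((1 - a) / k). With k = 0 there is
    no peak: capacity keeps rising toward the Amdahl ceiling 1 / a.
    """
    if k <= 0:
        return math.inf
    return math.sqrt((1.0 - a) / k)


if __name__ == "__main__":
    a, k = 0.05, 0.001  # illustrative values, not measurements from the talk
    for p in (1, 8, 16, 32, 64, 128):
        print(f"p={p:4d}  C(p)={usl_capacity(p, a, k):6.2f}")
    print(f"peak near p* = {usl_peak(a, k):.1f}")
```

With k = 0 the curve has no peak and approaches 1/a, the Amdahl case noted on slide 22; any k > 0 makes capacity fall once p passes p*, which is where coherency starts to dominate.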