Current Trends in HPC
Speaker notes
  • CUDA is an architecture with several entry points. Today, developers program in C for CUDA using NVIDIA's compilers; programming-language support for Fortran and other languages is coming soon. CUDA also supports emerging API programming standards such as OpenCL. Because the OpenCL and CUDA constructs for parallelism are so similar, applications written in C for CUDA can easily be ported to OpenCL if desired; OpenCL applications sit on top of the CUDA architecture.
  • Not just WSDLs on things, but common abstractions that apply across many resources and services. (A work in progress.)
  • The sources of information are expanding, and many new sources are machine generated. Big data means both very large files (a seismic scan can be 5 TB per file) and massive numbers of small files (email, social media). Leading companies have for decades sought to leverage new sources of data, and the insights gleaned from them, as sources of competitive advantage: more detailed structured data, new unstructured data, and device-generated data. But big data isn't only about data; a comprehensive big data strategy also needs to consider the role and prominence of new, enabling technologies such as scale-out storage, MPP database architectures, Hadoop and the Hadoop ecosystem, in-database analytics, in-memory computing, data virtualization, and data visualization.
  • Content and service providers, as well as global organizations that need to distribute large content files, are challenged to manage these distributed systems and ensure their performance. A new approach, using a single storage pool in the cloud that provides policies for content placement, multi-tenancy, and self-service, can therefore benefit their business.
  • Transcript

    • 1. Current Trends in High Performance Computing Dr. Putchong Uthayopas Department Head, Department of Computer Engineering, Faculty of Engineering, Kasetsart University Bangkok, Thailand. pu@ku.ac.th
    • 2. I am pleased to be here!
    • 3. Introduction
      • High Performance Computing – an area of computing involving the hardware and software that help solve large, complex problems fast
      • Many applications
        – Science and engineering research: CFD, genomics, automobile design, drug discovery
        – High-performance business analysis: knowledge discovery, risk analysis, stock portfolio management
        – Business is moving more toward analysis of data from data warehouses
    • 4. Why do we need HPC?
      • Change in scientific discovery – from experiment to simulation and visualization
      • Critical need to solve ever larger problems – global climate modeling, life science, global warming
      • Modern business needs – design of more complex machinery and electronics, analysis of complex, large-scale financial systems, more complex data analysis
    • 5. Top 500: Fastest Computers on Our Planet
      • A list of the 500 most powerful supercomputers, generated twice a year (June and November)
      • The latest list was announced in June 2012
    • 6. Sequoia @ Lawrence Livermore Lab
      • IBM Blue Gene/Q
      • 34 login nodes – 48 CPUs/node, 64 GB
      • 98,304 compute nodes – 16 cores/node, 16 GB
      • 1,572,864 cores in total, 1.6 PB RAM
      • Peak performance 20,132 TFlops
    • 7. Performance Development
    • 8. Projected Performance Development
    • 9. Top 500: Application Area
    • 10. Processors are just not running faster
      • Processor speed kept increasing for the last 20 years
      • Common techniques
        – Smaller process technology
        – Increased clock speed
        – Improved microarchitecture: Pentium, Pentium II, Pentium III, Pentium IV, Centrino, Core
    • 11. Pitfalls
      • Smaller process technology leads to denser transistors, but heat dissipation and noise force a reduced voltage
      • Increasing the clock speed uses more power, since CMOS consumes power mainly when switching
      • Improving the microarchitecture gives only small gains for a much more complex design
      • The only solution left is concurrency: doing many things at the same time
    • 12. Parallel Computing
      • Speeding up execution by splitting a task into many independent subtasks and running them on multiple processors or cores
        – Break the large task into many small subtasks
        – Execute these subtasks on multiple cores or processors
        – Collect the results together
    • 13. How to achieve concurrency
      • Add more concurrency to hardware: processors, I/O, memory
      • Add more concurrency to software: how to better express parallelism
      • Add more concurrency to algorithms: how to do many things at the same time, and how to make people think in parallel
    • 14. The coming (back) of multicore
    • 15. Hybrid Architecture (nodes joined by an interconnection network)
    • 16. Rationale for Hybrid Architecture
      • Most scientific applications have fine-grained parallelism inside: CFD, financial computation, image processing
      • Energy efficiency: employing a large number of slower processors in parallel can lower power consumption and heat
    • 17. Two main approaches
      • Multithreaded, scaled-down processors compatible with conventional processors – Intel MIC
      • A very large number of small processor cores in a SIMD model, evolving from graphics technology – NVIDIA GPU, AMD Fusion
    • 18. Many Integrated Core (MIC) Architecture
      • An effort by Intel to add a large number of cores to a computing system
    • 19. Multithreading Concept
    • 20. Challenges
      • A large number of cores must divide the memory among them: much less memory per core, and high demand on memory bandwidth
      • Still need an effective fine-grained parallel programming model
      • No free lunch – programmers have to do some work
    • 21. What is GPU Computing?
      • Computing with CPU (a few cores) + GPU (many cores): heterogeneous computing
    • 22. Not 2x or 3x – speedups are 20x to 150x
      – Medical imaging (U of Utah): 146X
      – Molecular dynamics (U of Illinois, Urbana): 36X
      – Video transcoding (Elemental Tech): 18X
      – Matlab computing (AccelerEyes): 50X
      – Astrophysics (RIKEN): 100X
      – Financial simulation (Oxford): 149X
      – Linear algebra (Universidad Jaime): 47X
      – 3D ultrasound (Techniscan): 20X
      – Quantum chemistry (U of Illinois, Urbana): 130X
      – Gene sequencing (U of Maryland): 30X
    • 23. CUDA Parallel Computing Architecture
      • Parallel computing architecture and programming model
      • Includes a C compiler plus support for OpenCL and DX11 Compute
      • Architected to natively support all computational interfaces (standard languages and APIs)
    • 24. Compiling C for CUDA applications
      • NVCC splits a C for CUDA application into the key kernels (CUDA code) and the rest of the C application (CPU code)
      • The CUDA object files and CPU object files are then linked into a single CPU-GPU executable
    • 25. Simple "C" description for parallelism

      Standard C code:

          void saxpy_serial(int n, float a, float *x, float *y)
          {
              for (int i = 0; i < n; ++i)
                  y[i] = a*x[i] + y[i];
          }

          // Invoke serial SAXPY kernel
          saxpy_serial(n, 2.0, x, y);

      Parallel C for CUDA code:

          __global__ void saxpy_parallel(int n, float a, float *x, float *y)
          {
              int i = blockIdx.x*blockDim.x + threadIdx.x;
              if (i < n)
                  y[i] = a*x[i] + y[i];
          }

          // Invoke parallel SAXPY kernel with 256 threads/block
          int nblocks = (n + 255) / 256;
          saxpy_parallel<<<nblocks, 256>>>(n, 2.0, x, y);
    • 26. Computational Finance
      • Financial computing software vendors
        – SciComp: derivatives pricing modeling
        – Hanweck: options pricing & risk analysis
        – Aqumin: 3D visualization of market data
        – Exegy: high-volume tickers & risk analysis
        – QuantCatalyst: pricing & hedging engine
        – Oneye: algorithmic trading
        – Arbitragis Trading: trinomial options pricing
      • Ongoing work: LIBOR Monte Carlo market model, callable swaps, and continuous-time finance (Sources: SciComp, CUDA SDK)
    • 27. Weather, Atmospheric, & Ocean Modeling
      • CUDA-accelerated WRF available; other kernels in WRF being ported
      • Ongoing work: tsunami modeling, ocean modeling, several CFD codes (Sources: Michalakes, Vachharajani; Matsuoka, Akiyama, et al.)
    • 28. New emerging standards
      • OpenCL
        – Supported by many vendors, including Apple
        – Targets both GPU-based SIMD and multithreading
        – More complex to program than CUDA
      • OpenACC
        – A programming standard for parallel computing developed by Cray, CAPS, NVIDIA, and PGI
        – Simplifies parallel programming of heterogeneous CPU/GPU systems
        – Directive-based
    • 29. Cluster computing
      • The use of a large number of servers, linked by a high-speed local network, as one single large supercomputer
      • A popular way of building supercomputers
      • Software
        – Cluster-aware OS: Windows Compute Cluster Server 2008, NPACI Rocks Linux
        – Programming systems such as MPI
      • Used mostly in computer-aided design, engineering, and scientific research
    • 30. Comment
      • Cluster computing is a very mature discipline
      • We know how to build a sizable cluster very well
        – Hardware integration
        – Storage integration: Lustre, GPFS
        – Schedulers: PBS, Torque, SGE, LSF
        – Programming: MPI
        – Distribution: ROCKS
      • Clusters are the foundation fabric for grids and clouds
    • 31. TERA Cluster
      • 2.5 Gbps link to UniNet; KU fiber backbone (1 Gbps); 48 TB storage
      • 1 frontend (HP ProLiant DL360 G5) and 192 compute nodes
        – Intel Xeon 3.2 GHz (dual-core, dual-processor)
        – Memory 4 GB (8 GB for frontend & InfiniBand nodes)
        – 70x4 GB SCSI HDD (RAID1)
      • 4 storage servers
        – Lustre file system for TERA cluster storage
        – Attached Smart Array P400i controller for 5 TB of space
      • 200-port Gigabit Ethernet switch
      TGCC 2008, Khon Kaen University, August 29, 2008, Thailand
    • 32. Grid Computing Technology• Grid computing enables the virtualization of distributed computing and data resources such as processing, network bandwidth and storage capacity to create a single system image, granting users and applications seamless access to vast IT capabilities.• Just as an Internet user views a unified instance of content via the Web, a grid user essentially sees a single, large virtual computer.
    • 33. Grid Architecture (layers, bottom to top: Fabric, Connectivity, Resources, Collective, Application)
      • Fabric layer – protocols and interfaces that provide access to computing resources such as CPU and storage
      • Connectivity layer – protocols for grid-specific network transactions, such as GSI security
      • Resources layer – protocols to access a single resource from an application: GRAM (Grid Resource Allocation Management), GridFTP (data access), Grid Resource Information Service
      • Collective layer – protocols that manage and access groups of resources
    • 34. Globus as Service-Oriented Infrastructure
      • Uniform interfaces, security mechanisms, Web-service transport, and monitoring connect user applications and tools to services such as Reliable File Transfer, MDS-Index, MyProxy, DAIS, GRAM, and GridFTP, running over specialized computers, storage, and database resources
    • 35. Introduction to ThaiGrid
      • A national project under the Software Industry Promotion Agency (Public Organization), Ministry of Information and Communication Technology
      • Started in 2005 with 14 member organizations
      • Expanded to 22 organizations in 2008
    • 36. ThaiGrid Infrastructure
      • 19 sites, about 1000 CPU cores
      • Inter-site links ranging from 155 Mbps to 2.5 Gbps
    • 37. ThaiGrid Usage
      • ThaiGrid provides about 290 years of computing time for members: 9 years on the grid, 280 years on TERA
      • 41 projects from 8 areas are being supported on the teraflop machine, with more small projects on each machine
    • 38. Medicinal Herb Research
      • Partner: Cheminformatics Center, Kasetsart University (Chak Sangma and team)
      • Objective: use a 3D molecular database and virtual screening to verify traditional medicinal herbs
      • Benefits
        – Scientific proof of ancient traditional drugs
        – Benefits poor people who still rely on drugs made from medicinal herbs
        – Potential benefit for the local pharmaceutical industry
      • Workflow: virtual screening on the infrastructure, then lab test
    • 39. NanoGrid
      • Objective: a platform supporting computational nanoscience research
      • Technology used: AccelRys Materials Studio; cluster schedulers Sun Grid Engine and Torque; MS-Gateway nodes in front of the ThaiGrid computing resources
    • 40. Challenges
      • Size and scale
      • Manageability: deployment, configuration, operation
      • Software and hardware compatibility
    • 41. Grid System Architecture
      • Clusters
        – Satellite sets
          • 16 clusters delivered by ThaiGrid for initial members
          • Composed of 5 nodes of IBM eServer xSeries 336: Intel Xeon 2.8 GHz (dual processor), x86_64 architecture, 4 GB memory (DDR2 SDRAM)
        – Other sets
          • Various types of servers and numbers of nodes
          • Provided by member institutes of ThaiGrid
    • 42. Grid as a Super Cluster
      • A grid scheduler dispatches work over the research and education network to head nodes and compute nodes at each site
    • 43. Is grid still alive?
      • Yes, grid is a useful technology for certain tasks
        – BitTorrent as a massive file-exchange infrastructure
        – The European Grid uses it to share LHC data
      • Pitfalls of the grid
        – The network is still not reliable and fast enough for long-term operation
        – The multi-site, multi-authority concept makes system management, security, and actual use of the system very complex
      • The recent trend is to move to centralized clouds
    • 44. What is Cloud Computing?
      • Major providers: Google, Salesforce, Amazon, Microsoft, Yahoo (Source: Wikipedia, cloud computing)
    • 45. Why Cloud Computing?• The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud Computing users to plan far ahead for provisioning.• The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs.• The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful. Source: “Above the Clouds: A Berkeley View of Cloud Computing”, RAD lab, UC Berkeley
    • 46. (Source: "Above the Clouds: A Berkeley View of Cloud Computing", RAD Lab, UC Berkeley)
    • 47. Cloud Computing Explained
      • SaaS (Software as a Service): applications delivered over the Internet as services (e.g. Gmail)
      • The cloud is the massive server and network infrastructure that serves SaaS to large numbers of users
      • The service being sold is called utility computing (Source: "Above the Clouds: A Berkeley View of Cloud Computing", RAD Lab, UC Berkeley)
    • 48. Enabling Technologies for Cloud Computing
      • Cluster and grid technology – the ability to build a highly scalable computing system consisting of 100,000-1,000,000 nodes
      • Service-oriented architecture – everything is a service: easy to build, distribute, and integrate into large-scale applications
      • Web 2.0 – a powerful and flexible user interface for an Internet-enabled world
    • 49. Cloud Service Model
    • 50. Cloud Computing Software Stack
    • 51. Architecture of Service-Oriented Cloud Computing Systems (SOCCS)
      • SOCCS can be constructed by combining CCR/DSS cloud application software to form a scalable service for a client application
      • Cloud Service Management (CSM) acts as a resource management system that keeps track of the availability of services on the cloud
    • 52. Cloud System Configuration
      • A cloud user interface (Excel) talks to the cloud application and Cloud Service Management (CSM), which manage services running on OS/hardware nodes over an interconnection network
    • 53. A Proof-of-Concept Application
      • The Pickup and Delivery Problem with Time Windows (PDPTW) is the problem of serving a number of transportation requests with a limited number of vehicles
      • The objective is to minimize both the total distance traveled by the vehicles and the total time spent by each vehicle
    • 54. PDPTW on the cloud using SOCCS
      • The master/worker model is adopted as the framework for service interaction
      • The algorithm is partitioned using a domain decomposition approach
      • The cloud application controls the decomposition of the problem, sending each subproblem to a worker service and collecting the results back into the best answer
    • 55. Results: speedup on a single node with 4 cores
    • 56. Results: speedup and efficiency derived from average runtime on 1, 2, 4, 8, and 16 compute nodes
    • 57. We are living in a world of data
      • Video surveillance, social media, mobile sensors, gene sequencing, smart grids, geophysical exploration, medical imaging
    • 58. Big Data“Big data is data that exceeds the processing capacity ofconventional database systems. The data is too big,moves too fast, or doesn’t fit the strictures of yourdatabase architectures. To gain value from this data, youmust choose an alternative way to process it.” Reference: “What is big data? An introduction to the big data landscape.”, Edd Dumbill, http://radar.oreilly.com/2012/01/what-is-big-data.html
    • 59. The Value of Big Data• Analytical use – Big data analytics can reveal insights hidden previously by data too costly to process. • peer influence among customers, revealed by analyzing shoppers’ transactions, social and geographical data. – Being able to process every item of data in reasonable time removes the troublesome need for sampling and promotes an investigative approach to data.• Enabling new products. – Facebook has been able to craft a highly personalized user experience and create a new kind of advertising business
    • 60. 3 Characteristics of Big Data
    • 61. Big Data Challenges
      • Volume – how to process data so big that it cannot be moved or stored
      • Velocity – a lot of data arriving so fast that it cannot be stored (web usage logs, Internet and mobile messages); stream processing is needed to filter out unused data or extract knowledge in real time
      • Variety – so many types of unstructured data that conventional databases become useless
    • 62. How to deal with big data
      • Integration of storage, processing, analysis algorithms, and visualization
      • Massive data streams pass through stream processing before storage, processing, analysis, and visualization
    • 63. A New Approach for Distributed Big Data
      • From storage islands (e.g. L.A., Boston, London) to a single storage pool across locations
        – Disparate systems → a single system across locations
        – Manual administration → automated policies
        – One tenant, many systems → many tenants, one system
        – IT-provisioned storage → self-service access
    • 64. Hadoop
      • Hadoop is a platform for distributing computing problems across a number of servers; first developed and released as open source by Yahoo
        – Implements the MapReduce approach pioneered by Google in compiling its search indexes
        – A dataset is distributed among multiple servers and operated on in the "map" stage; the partial results are then recombined in the "reduce" stage
      • Hadoop uses its own distributed filesystem, HDFS, which makes data available to multiple computing nodes
      • The usual Hadoop usage pattern involves three stages: loading data into HDFS, MapReduce operations, and retrieving results from HDFS
    • 65. What Facebook Knows
      • Cameron Marlow calls himself Facebook's "in-house sociologist"; he and his team can analyze essentially all the information the site gathers (http://www.facebook.com/data)
    • 66. The Links of Love
      • Young women often specify that they are "in a relationship" with their "best friend forever"
        – Roughly 20% of all relationships for the 15-and-under crowd are between girls
        – This number dips to 15% for 18-year-olds and is just 7% for 25-year-olds
      • Among anonymous US users who were over 18 at the start of the relationship, the average shortest path from any one U.S. user to any other is 16.7 steps
        – This is much higher than the 4.74 steps needed to get from any Facebook user to another through friendship, as opposed to romantic, ties
      (http://www.facebook.com/notes/facebook-data-team/the-links-of-love/10150572088343859)
    • 67. Why?
      • Facebook can improve the user experience
        – Make useful predictions about user behavior
        – Make better guesses about which ads you might be more or less open to at any given time
      • Right before Valentine's Day this year, a blog post from the Data Science Team listed the songs most popular with people who had recently signaled on Facebook that they had entered or left a relationship
    • 68. Data Tsunami
      • The data flood is coming; there is nowhere to run now
        – Data is generated anytime, anywhere, by anyone
        – Data is coming in fast
        – Data is too big to move, too big to store
      • Better be prepared: use this to enhance your business and offer better services to customers
    • 69. The Opportunities and Challenges of Exascale Computing
      • A summary of findings from many workshops in the US
      • Lists the issues that need to be overcome; we present only some of the challenges
    • 70. Hardware Challenges
      • Major improvements in hardware are needed
    • 71. Power Challenge
      • Power consumption is the largest hardware research challenge
      • Today, power costs for the largest petaflop systems are in the range of $5-10M annually
      • For an exascale system using current technology, the annual power cost would be above $2.5B per year, and the power load would be over a gigawatt
      • The target of 20 megawatts, identified in the DOE Technology Roadmap, is primarily based on keeping the operational cost of the system in a feasible range
    • 72. Memory Challenge• Memory subsystem is too slow
    • 73. Data Movement Challenge
    • 74. System Resiliency Challenge
      • For exascale systems, the number of system components will be increasing faster than component reliability, with projected mean times between failures in minutes or seconds
      • Exascale systems will experience faults of various kinds many times per day; systems running 100 million cores will continually see core failures, and the tools for dealing with them will have to be rethought
    • 75. “Co-Design” Challenge
    • 76. The Computer Science Challenges
      • A programming model effort is a critical component
        – Clock speeds will be flat or even dropping to save energy; all performance improvements within a chip will come from increased parallelism, while the amount of memory per arithmetic unit shrinks
        – This creates a need for fine-grained parallelism and a programming model other than message passing or coarse-grained threads
    • 77. Under the radar
      • Mobile processors running supercomputers
      • The hybrid war: GPU vs. MIC
      • I/O goes solid state
      • The programming standards war: CUDA / OpenCL / OpenMP / OpenACC
    • 78. Summary
      • We are in a challenging world
      • Demand for HPC systems and applications will increase; software tools, technology, and hardware are changing to catch up
      • The greatest challenge is how to quickly develop software for the next generation of computing systems
    • 79. THANK YOU
