  1. RISC Awareness Raising Workshop, Mexico City, Nov. 23-24. High Performance Computing in Petroleum Exploration and Production: IMP experience. Dr. Leonid Sheremetov, sher@imp.mx, Mexican Petroleum Institute
  2. Outline: HPC challenges in the Petroleum Industry • Mexican Petroleum Institute (IMP): IMP profile; Research Program for Applied Mathematics and Computing (MAyC); High performance computing in IMP • Research agenda of HPC in MAyC: Grid-based Simulation (dynamic data driven applications); Grid-based Distributed Data Mining; Task assignment in desktop grids; Adaptive grain parallelism; Dynamic task distribution in multi-core clusters • Conclusions
  3. O&G Exploration and Production in Mexico and in the world. PEMEX – Mexican Oil Company: 4 regions, 14 assets, 2,488 oil fields, 24,645 wells. PEMEX technology strategy: technical innovation and advanced decision-making support. Research funds: CONACYT-SENER Hidrocarburos and Energías Renovables. Decline of the principal reservoirs (decreased recovery). Increased technical complexity of all processes (increased cost). Increased gap between acquired and utilized data.
  4. O&G industry: transparent data and shared information; transactional processes; remote operations; virtual processes; collaboration; immersion technologies; knowledge management; integrated operation; supply chain optimization; operational and financial reports; real-time information about reservoirs; reservoir modeling.
  5. O&G industry (continued): 3-D seismic/simulation; multiple SCADA systems; PEMEX corporate DB systems (ADITEP, SIOPDV, SAP); HF Data Historians. These span computationally intensive tasks, data-intensive applications, and sensor-intensive applications (i-Field). Solving grand challenge applications using HPC.
  6. Outline: HPC challenges in the Petroleum Industry • Mexican Petroleum Institute (IMP): IMP profile; Research Program for Applied Mathematics and Computing (MAyC); High performance computing in IMP • Research agenda of HPC in MAyC: Grid-based Simulation (dynamic data driven applications); Grid-based Distributed Data Mining; Task assignment in desktop grids; Adaptive grain parallelism; Dynamic task distribution in multi-core clusters • Conclusions
  7. Mexican Petroleum Institute (IMP) • IMP is a public research centre • IMP was founded on August 23, 1965 • Annual budget of about $300 million USD • IMP objectives: Research and Development; Application Technologies for PEMEX – Mexican Oil Company; Consulting; Education and Training (postgraduate program opened in 2003)
  8. Research Program for Applied Mathematics and Computing (MAyC) • Founded in 2001 • Contains: researchers, developers, Scientific Computing Lab • Main research areas: Distributed Intelligent Computing (data mining, computational intelligence, expert systems, agent technology); Multiobjective Optimization (logistics, supply chain management); Simulation (partial differential equations, numerical methods)
  9. Supercomputing in Mexico
  10. HPC in Mexico • CINVESTAV Xiuhcóatl: Intel/AMD/GPGPU processors, 3,480 CPU cores, real performance 24.97 TFlops • UAM AITZALOA: 270 nodes (135 twin), Intel Xeon Quad-Core at 3 GHz, 2,160 cores (540 quad-core CPUs), 16 GB RAM per node, real performance 18.4 TFlops • UNAM KAN BALAM (HP CP 4000): 342 nodes, AMD Opteron processors, 1,368 CPU cores, 3 TB memory, real performance 7.1 TFlops • For comparison, the Fujitsu K computer: SPARC64 VIIIfx 2.0 GHz, Tofu interconnect, 705,024 cores, 10,510 TFlops
  11. Mexico and IMP in the Top500 (chart annotation: MAyC created)
  12. Evolution of High Performance Platforms in the IMP • 1968: IBM 1130 • 1972: IBM 360/44 for the Computer Centre and the Centre for Geophysical Processing (analysis of seismic data for reservoir characterization) • 1980: UNIVAC 1106 (design of oil platforms) • 1982: UNIVAC 1100/82 (multiprocessor), VAX 750 • 1982: first distributed DB in Mexico • 2000: Cray Origin 2000 • 2001: Research Program on Applied Mathematics and Computing (PIMAyC) • 2001: Lufac cluster with 256 nodes (2 CPUs each) • 2009: Lufac cluster (Villahermosa) • 2011: Supercomputing Lab: Xeon X7500 CPUs, 250 cores (in progress) • Estimated server & cluster capacity (2011): 0.4 TFlops
  13. Applications of HPC in the IMP • Computation-intensive tasks: reservoir simulation; oceanographic modeling for offshore exploration; atmospheric modeling • Data-intensive tasks: 3-D seismic cube pre-stack and multi-attribute analysis; data mining • Computation- and communication-intensive tasks: collaborative engineering; nano-characterization in 2D and 3D and nanochemical analysis (shared lab)
  14. Improved exploration and production (E&P) performance. HPC seismic-to-simulation technologies seamlessly integrate geophysics, geology, and reservoir engineering in a unified earth model. Schlumberger's Petrel™ Seismic Server analyzes terabytes of seismic survey data represented in 2-D and 3-D displays using Petrel Geophysics. ECLIPSE® reservoir simulation software uses the power of HPC clusters to generate animated 3-D simulation models. Schlumberger's software is optimized for Intel's Xeon multi-core architecture, building on advanced compiler and communication technology, such as the Intel® MPI Library 3.1 and the Intel® Compiler Suite, for high-performance cluster software available on Microsoft® Windows® Compute Cluster Server.
  15. Real-time remote control of a JEOL JEM 2200FS microscope using Internet2. The IMP Ultra High Resolution Electron Microscopy Laboratory is one of the first shared labs in Mexico, promoting, in collaboration with the UNAM Institute of Physics, the creation of national and international networks for multidisciplinary scientific research by sharing technological infrastructure through Internet2. It provides nano-characterization in 2D and 3D and nanochemical analysis. The work involves both computationally and communication-intensive tasks (12 Mb Internet2 link). Head of Lab: Vicente Garibay Febles, vgaribay@imp.mx
  16. High-resolution image of Pd catalyst nanoparticles and their chemical analysis (EDS).
  17. Project CONACYT SENER-Hidrocarburos: Data Mining. Methods and Techniques of Computational Intelligence and Data Mining for Decision Making in Exploitation of Mature Fields. Project coordinator: Instituto Mexicano del Petróleo (MAyC). Project collaborators: CINVESTAV, CIC-IPN, CIMAT, IIE, INAOE. Project dates: March 08, 2011 – March 07, 2013. [Figure: scatterplot of gauged production (Prod. Aforos) and average daily production (Prod. Diaria Prom) against date for well JUJO-2A, each fitted with distance-weighted least squares.]
  18. Project CONACYT SENER-Hidrocarburos: Data Mining. Objective: develop and apply data mining and computational intelligence (DM&CI) techniques for the analysis of technical data on hydrocarbon exploitation, to support decision making and solution identification and to increase the efficiency of exploitation of mature fields. Novel approach: top-down (inverse) modeling based on the analysis of dynamic oilfield data and reconstruction of the static characterization and hydro-geological reservoir models, applying DM&CI to select poorly drained areas and recovery methods. Data: one oilfield comprises 9,464 files, > 50 GB (without seismic and simulation models); PEMEX has 2,488 oil fields in total.
  19. Outline: HPC challenges in the Petroleum Industry • Mexican Petroleum Institute (IMP): IMP profile; Research Program for Applied Mathematics and Computing (MAyC); High performance computing in IMP • Research agenda of HPC in MAyC: Grid-based Simulation (dynamic data driven applications); Grid-based Distributed Data Mining; Task assignment in desktop grids; Adaptive grain parallelism; Dynamic task distribution in multi-core clusters • Conclusions
  20. What grids would we need? Data grid: support for large, distributed data repositories. Computational grid: execution of high-end simulation models in a parallel and distributed fashion. Knowledge grid: adds basic knowledge discovery mechanisms to a grid; a grid architecture specialized for data mining.
  21. Grid-based Simulation: Dynamic Data Driven Application Systems (DDDAS), formalized by Frederica Darema. Data is fed into an executing application either as it is collected or from a data archive. The simulation can then make predictions about how the entity will change and what its future state will be. The simulation is continuously adjusted with data gathered from the entity, and its predictions can in turn influence how and where future data will be gathered, in order to focus on areas of uncertainty. Production history data can be fed to the reservoir simulator to determine the reservoir description parameters from the observed performance and to predict the performance of an oil field. Intelligent agents are well suited to deciding which data to absorb, and when and how it should be absorbed. (A minimal sketch of such a loop follows below.)
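To make the feedback loop concrete, here is a minimal Python sketch of a DDDAS-style history-matching cycle. It only illustrates the idea described on the slide; it is not the IMP implementation or any simulator's API, and every function in it (fetch_production_data, run_simulation, update_parameters, plan_next_measurements) is a hypothetical placeholder.

```python
# Minimal sketch of a DDDAS-style loop for reservoir history matching.
# All functions below are hypothetical placeholders, not a real simulator API.

import numpy as np

def fetch_production_data(step):
    """Placeholder: return observed production rates for the given time step."""
    return np.random.rand(10)                      # stands in for field measurements

def run_simulation(params, step):
    """Placeholder: run the reservoir simulator with the current description parameters."""
    return params.mean() * np.ones(10)             # stands in for simulated rates

def update_parameters(params, observed, predicted, rate=0.1):
    """Nudge the reservoir description parameters to reduce the data misfit."""
    misfit = (observed - predicted).mean()
    return params + rate * misfit

def plan_next_measurements(observed, predicted):
    """Steer future data acquisition towards the wells with the largest uncertainty."""
    return np.argsort(np.abs(observed - predicted))[-3:]   # worst-matched wells

params = np.ones(5)                                # initial reservoir description parameters
for step in range(100):
    observed = fetch_production_data(step)         # data injected into the running application
    predicted = run_simulation(params, step)       # simulation predicts the reservoir state
    params = update_parameters(params, observed, predicted)   # model continuously adjusted
    focus = plan_next_measurements(observed, predicted)       # predictions steer data gathering
```

The essential point is the circular dependency: the measurements adjust the model, and the model's misfit decides where to measure next.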
  22. Distributed Data Mining on Knowledge Grids • TeraGrid (San Diego Supercomputer Center, National Center for Supercomputing Applications, Caltech, Argonne National Lab: mining of scientific data sets) • Knowledge Grid (Università di Catanzaro and DEIS, Università della Calabria, running over the MIUR SP3 Italian national grid) • Terra Wide Data Mining Testbed (National Center for Data Mining at the University of Illinois at Chicago) • ADaM (University of Alabama in Huntsville: hydrology data mining) • IMP & PEMEX: data mining algorithms and knowledge discovery processes are both compute and data intensive, therefore the Grid can offer a computing and data management infrastructure to support decentralized and parallel data analysis. Adapted from: M. Cannataro, A. Congiusta, A. Pugliese, D. Talia, and P. Trunfio. Distributed data mining on grids: Services, tools, and applications. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(6), 2004.
  23. Distributed Data Mining for Modelling of Hydraulic Communication between Wells: principal component analysis; fuzzy clustering (fuzzy K-means); MAP transform (trend analysis); See5 (decision trees and rulesets); WizWhy® (association rule mining), etc. (An illustrative sketch of the fuzzy clustering step follows below.)
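As an illustration of the fuzzy clustering step listed above, the following is a generic textbook fuzzy c-means implementation in NumPy. It is a sketch under simple assumptions (Euclidean feature space, synthetic stand-in data in X); it is not the IMP workflow, which may rely on entirely different tools.

```python
# Illustrative fuzzy c-means clustering (the "fuzzy K-means" listed above),
# applied to synthetic well-response features. Generic textbook sketch only.

import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, eps=1e-9, seed=0):
    """Return cluster centers and the fuzzy membership matrix U (n_samples x c)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)              # memberships of each sample sum to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]            # weighted cluster centers
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)                  # updated memberships
    return centers, U

# Stand-in data: e.g., per-well features such as pressure and production trends.
X = np.vstack([np.random.randn(50, 2) + off for off in ([0, 0], [4, 4], [0, 5])])
centers, U = fuzzy_c_means(X, c=3)
labels = U.argmax(axis=1)     # hard assignment; U itself gives degrees of membership
```

The soft membership matrix U is what distinguishes fuzzy clustering from plain K-means: a well can belong partially to several hydraulic-communication groups.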
  24. Scientific grids: research agenda. In recent years, computational speed has been increasing geometrically, while communication speed has increased only linearly. The complexity of contemporary scientific applications, with their increased demand for computing power and access to larger datasets, is setting a trend towards greater use of grids of desktop personal computers. Combining many multicore computers in scientific grids demands a combination of fine-grained and coarse-grained parallelization.
  25. Desktop grids (in collaboration with CINVESTAV). The network topology depends on the bandwidth available for parallel processes to communicate. A novel task assignment scheme that takes the dynamic network topology into consideration has been developed*. The approach is based on a bandwidth-aware Bulk Synchronous Parallel (BSP) computer model; the force-field method is used for synchronisation. The algorithm was tested on grids composed of 1K nodes. (The standard BSP cost model is recalled below.) *E. Wilson García and G. Morales-Luna, LNCS 3795
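For context, the cost of a single superstep in the standard BSP model, on which any bandwidth-aware variant builds, is usually written as follows. This is the textbook formula, not the modified cost function of the cited LNCS paper.

```latex
% Standard BSP cost of a single superstep over p processes:
%   w_i  -- local computation performed by process i
%   h_i  -- maximum number of words sent or received by process i
%   g    -- communication cost per word (inverse bandwidth)
%   l    -- cost of the barrier synchronisation
T_{\text{superstep}} \;=\; \max_{1 \le i \le p} w_i \;+\; g \cdot \max_{1 \le i \le p} h_i \;+\; l
```

In a desktop grid the parameters g and l are not constants but fluctuate with the shared network, which is what motivates tracking the dynamic topology during task assignment.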
  26. Task assignment algorithm. Three types of applications were studied: (1) high computation, low communication cost; (2) high computation, medium communication cost; (3) high computation, high communication cost. Many parallel applications fall into the second category; distributed-data applications fall into the third.
  27. Adaptive grain parallelism. Gmandel (http://gmandel.sf.net/) is a benchmark for computing infrastructure that generates images of fractals from the Mandelbrot set. The basic unit of measure is the MMIPS (millions of Mandelbrot iterations per second). Gmandel runs on Linux (or equivalent) on a single computer, a multiprocessor computer (most new multi-core PCs), or an Infiniband/Myrinet computer cluster. In each case, Gmandel can take advantage of multi-core technology through shared-memory, fine-grained and distributed-memory, coarse-grained parallel computing techniques. The former is accomplished with POSIX threads and the latter by means of MPI message passing (currently tested with MPICH2, from ANL). (A simplified sketch of the workload follows below.)
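The sketch below illustrates the Mandelbrot-iteration workload that Gmandel benchmarks. It is not Gmandel itself (which is built on POSIX threads and MPI); it is a simplified Python version that shows the coarse-grained strategy of splitting image rows across processes and reporting an MMIPS-style throughput figure.

```python
# Conceptual Mandelbrot workload, coarse-grained across processes.
# Reports a rate in millions of Mandelbrot iterations per second (MMIPS-style).

import time
from multiprocessing import Pool

MAX_ITER = 256

def mandel_row(args):
    """Compute one image row; return the total number of iterations performed."""
    y, width, x0, x1, cy = args
    total = 0
    for i in range(width):
        c = complex(x0 + (x1 - x0) * i / width, cy)
        z, n = 0j, 0
        while abs(z) <= 2.0 and n < MAX_ITER:
            z = z * z + c
            n += 1
        total += n
    return total

if __name__ == "__main__":
    width, height = 800, 600
    x0, x1, y0, y1 = -2.0, 1.0, -1.2, 1.2
    rows = [(y, width, x0, x1, y0 + (y1 - y0) * y / height) for y in range(height)]
    start = time.time()
    with Pool() as pool:                           # coarse grain: one task per image row
        iterations = sum(pool.map(mandel_row, rows, chunksize=16))
    elapsed = time.time() - start
    print(f"{iterations / elapsed / 1e6:.1f} million Mandelbrot iterations/second")
```

In Gmandel the same split is done with MPI ranks across cluster nodes, while the fine-grained, shared-memory level is handled with POSIX threads inside each node.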
  28. Dynamic task distribution in HPC (in collaboration with CIC-IPN). The increased complexity of embedded devices has led to their verification consuming up to 70% of human and computational resources. Dynamic planning and distribution of HDL models over a parallel simulation platform (clusters with multi-core nodes) is a challenging task. Such a simulation platform is being developed by PhD student Josué Rangel González, in collaboration with the Embedded Systems Lab of CIC-IPN. (A generic sketch of the dynamic distribution pattern follows below.)
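The following is a generic master-worker sketch of dynamic task distribution: workers pull the next simulation job as soon as they become free, instead of receiving a static partition up front. It is only an illustration of the general pattern, not the HDL simulation platform described above; run_model is a hypothetical placeholder with artificial, variable runtimes.

```python
# Generic master-worker pattern for dynamic task distribution on a multi-core node.
# Not the HDL platform mentioned above; job contents are dummy placeholders.

import multiprocessing as mp
import random
import time

def run_model(job_id):
    """Placeholder for simulating one HDL model; runtimes vary widely in practice."""
    time.sleep(random.uniform(0.01, 0.1))
    return job_id

def worker(task_queue, result_queue):
    while True:
        job = task_queue.get()
        if job is None:                    # sentinel: no more work
            break
        result_queue.put((job, run_model(job)))

if __name__ == "__main__":
    n_workers = mp.cpu_count()
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results)) for _ in range(n_workers)]
    for p in procs:
        p.start()
    for job_id in range(100):              # enqueue simulation jobs
        tasks.put(job_id)
    for _ in procs:                        # one sentinel per worker
        tasks.put(None)
    done = [results.get() for _ in range(100)]
    for p in procs:
        p.join()
    print(f"completed {len(done)} jobs on {n_workers} workers")
```

The shared queue is what makes the distribution dynamic: long-running models do not stall the other workers, which matters when simulation times per model are hard to predict.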
  29. Outline: HPC challenges in the Petroleum Industry • Mexican Petroleum Institute (IMP): IMP profile; Research Program for Applied Mathematics and Computing (MAyC); High performance computing in IMP • Research agenda of HPC in MAyC: Grid-based Simulation (dynamic data driven applications); Grid-based Distributed Data Mining; Task assignment in desktop grids; Adaptive grain parallelism; Dynamic task distribution in multi-core clusters • Conclusions
  30. Conclusions • Infrastructure next steps: cluster installation in the Supercomputing Lab; 12 Mb Internet2 connection; integration into the Delta Metropolitana HPC Grid initiative • Software taking advantage of the infrastructure: current platform includes Landmark, Petrel, Eclipse, OFM, and open-source tools • Research agenda: novel tasks and approaches enabled by HPC, increasing the efficiency of R&D to satisfy the needs of PEMEX
  31. MAyC at the IMP: publications & events. March 12-14, 2012, Cancun, Mexico.
  32. Thank You! Any Questions? Leonid Sheremetov, sher@imp.mx
