This document discusses trends in high-performance computing (HPC) and big data analytics. It notes that while HPC and big data have traditionally had different resource needs and programming models, they are converging as big data workloads require more real-time processing and HPC workloads incorporate more data-driven analytics. The document outlines challenges in both HPC and big data, such as system bottlenecks, energy efficiency, and barriers to wider usage, and advocates for more integrated solutions that combine storage, networking, processing, and memory to address these challenges.
High Performance Computing and Big Data Geoffrey Fox
We propose a hybrid software stack in which large-scale data systems for both research and commercial applications run on the commodity (Apache) Big Data Stack (ABDS), with High Performance Computing (HPC) enhancements used typically to improve performance. We give several examples taken from bio- and financial informatics.
We look in detail at parallel and distributed run-times, including MPI from HPC and Apache Storm, Heron, Spark, and Flink from ABDS, stressing that one needs to distinguish the different needs of parallel (tightly coupled) and distributed (loosely coupled) systems.
We also study "Java Grande", the principles that allow Java codes to perform as fast as those written in more traditional HPC languages. We also note the differences between capability (individual jobs using many nodes) and capacity (many independent jobs) computing.
We discuss how this HPC-ABDS concept allows one to discuss convergence of Big Data, Big Simulation, Cloud and HPC Systems. See http://hpc-abds.org/kaleidoscope/
Building an Enterprise Data Platform with Azure Databricks to Enable Machine ...Databricks
At Sam's Club we have a long history of using Apache Spark and Hadoop. Projects from all parts of the company use Apache Spark, from fraud detection to product recommendations. Because of the scale of our business, with billions of transactions and trillions of events, it is often essential to use big data technologies. Until recently all of this work ran on several large on-premises Hadoop clusters. As part of our transition to the public cloud we needed to build out an enterprise-scale data platform. Azure Databricks is a key component of this platform, giving our data scientists, engineers, and business users the ability to easily work with the company's data. We will discuss the architecture considerations that led to using multiple Databricks workspaces and external Azure blob storage. We will also discuss how we move massive amounts of data to Azure on a daily basis with Airflow, as well as the self-service tools we created to help users get their data to Azure and for us to manage the platform. Finally we will discuss our security considerations and how they played out in our architecture.
Authors: Andrew Ray, Craig Covey
State of the Art Robot Predictive Maintenance with Real-time Sensor DataMathieu Dumoulin
Our Strata Beijing 2017 presentation slides, in which we show how to use data from a movement sensor, in real time, to do anomaly detection at scale using standard enterprise big data software.
An Introduction to the MapR Converged Data PlatformMapR Technologies
Listen to the webinar on-demand: http://info.mapr.com/WB_Partner_CDP_Intro_EMEA_DG_17.05.31_RegistrationPage.html
In this 90-minute webinar, we discuss:
- The MapR Converged Data Platform and its components
- Use cases for the Converged Data Platform
- MapR Converged Partner Program
- How to get started with MapR
- Becoming a partner
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationCesare Cugnasco
Data visualization can be a tricky problem, even more so if the dataset is made of several billion 3-dimensional particles moving over time. The talk will focus on some simple indexing and data-thinning techniques and on how (and how not) to implement them with Cassandra and Spark.
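As an illustration of the kind of simple indexing the abstract alludes to, here is a minimal sketch of a Z-order (Morton) key for 3-D particle positions, plus naive thinning. Interleaving the bits of quantized x/y/z coordinates yields a single integer that keeps spatially close particles close in key space, so it could serve as a Cassandra partition or clustering key. The 10-bit-per-axis resolution and the thinning stride are illustrative assumptions, not values from the talk.

```python
def part1by2(n: int) -> int:
    """Spread the low 10 bits of n so two zero bits separate each data bit."""
    n &= 0x3FF
    n = (n ^ (n << 16)) & 0xFF0000FF
    n = (n ^ (n << 8)) & 0x0300F00F
    n = (n ^ (n << 4)) & 0x030C30C3
    n = (n ^ (n << 2)) & 0x09249249
    return n

def morton3d(x: int, y: int, z: int) -> int:
    """Interleave three 10-bit coordinates into one 30-bit Morton key."""
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

def thin(particles, stride: int):
    """Naive data thinning: keep every stride-th particle after sorting by
    Morton key, so the retained sample stays spatially balanced."""
    ordered = sorted(particles, key=lambda p: morton3d(*p))
    return ordered[::stride]
```

Sorting or range-partitioning on such a key is one common way to give a key-value store like Cassandra spatial locality without a dedicated spatial index.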
Watch this recorded webcast and listen to Infochimps CSO and Co-Founder, Dhruv Bansal, and Think Big Analytics Principal Architect, Douglas Moore, share successful use cases and recommendations for building real-time predictive analytics in your enterprise.
Delivered this talk as part of Spark & Kafka Summit 2017 organized by Unicom Learning Conference.
Big data processing is undoubtedly one of the most exciting areas in computing today, and remains an area of fast evolution and introduction of new ideas. Apache Spark is at the cusp of overtaking MapReduce to emerge as the de facto standard for big data processing. Thanks to its multi-functional capabilities (SQL, Structured Streaming, ML Pipelines, and GraphX) under one unified platform, Spark is now a dominant compute technology across various industry use cases and real-time analytics applications. In the past few years Apache Spark has seen successful production and commercial deployments across the e-commerce, healthcare, and travel industries.
The session gave the audience an understanding of the latest and upcoming trends in big data analytics and the role of Spark in enabling future use cases of advanced analytics.
The session explored the latest concepts from Apache Spark 2.x and introduced various ML/DL frameworks that can run on Spark, along with some real-life use cases and applications from the retail and IoT verticals.
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq AbdullahDatabricks
Insnap, a hyper-personalized ML-based platform acquired by The Honest Company, has been used to build a real-time data platform based on Apache Spark, Cassandra and Redshift. Users’ behavioral and transactional data have been used to build data models and ML models, and to drive use cases for marketing, growth, finance and operations.
Learn how The Honest Company has used Spark as a workhorse for: 1) collecting, transforming (ETL), and storing data from various sources including MySQL, MongoDB, JDE, Google Analytics, Facebook, Localytics, and REST APIs; 2) building data models and aggregating and generating reports on revenue, order-fulfillment tracking, data-pipeline monitoring, and subscriptions; 3) using ML to build models for user-acquisition, LTV, and recommendation use cases. Spark replaced the monolithic codebase with flexible, scalable, and robust pipelines, and Databricks helped The Honest Company focus on data instead of maintaining infrastructure. While Honest users got delightful recommendations that improved their experience, data users at Honest came to understand users much better, segmenting them with behavioral information and advanced ML models, leading to increased revenue and retention.
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDatabricks
Does more data always improve ML models? Is it better to use distributed ML instead of single node ML?
In this talk I will show that while more data often improves DL models in high-variance problem spaces (with semi- or unstructured data) such as NLP, image, and video, more data does not significantly improve high-bias problem spaces where traditional ML is more appropriate. Additionally, even in the deep learning domain, single-node models can still outperform distributed models via transfer learning.
Data scientists face pain points when running many models in parallel, automating the experimental setup, and getting others (especially analysts) within an organization to use their models. Databricks addresses these problems using pandas UDFs, the ML runtime, and MLflow.
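The "many models in parallel" pattern the talk refers to can be sketched with the standard library alone. This is a hypothetical stand-in for a Spark grouped-map pandas UDF: partition the data by key, then fit an independent model (here a trivial least-squares line) per partition. With PySpark, each per-group fit would run on a separate executor; here we simply map over the groups.

```python
from collections import defaultdict
from statistics import mean

def fit_line(points):
    """Ordinary least squares for y = a*x + b over (x, y) pairs."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    a = sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def fit_per_group(rows):
    """rows: iterable of (group_key, x, y) records.
    Returns {group_key: (slope, intercept)}, one model per group."""
    groups = defaultdict(list)
    for key, x, y in rows:
        groups[key].append((x, y))
    return {key: fit_line(pts) for key, pts in groups.items()}
```

In the distributed version, the dictionary comprehension is what a grouped-map UDF parallelizes, and a tool like MLflow would log each group's fitted parameters as a separate run.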
Risk Management Framework Using Intel FPGA, Apache Spark, and Persistent RDDs...Databricks
Analytics for risk management is applied in many fields, especially financial services. We present a framework for accelerated risk analytics and show a large-scale financial-sector application where this framework is used to run backtesting algorithms on risk-based securities such as options. These applications require highly computationally intensive operations on extremely large data sets with objects numbering in the tens of billions.
Intel FPGA and the FinLib library for financial applications are used to offload the computation; however, another challenging problem (which we have resolved) is how to feed data to the FPGA at optimal speed without customized coding. A combination of Apache Spark and Levyx's persistent dataframes addresses this problem: these dataframes absorb the computation from Spark and offload it to FinLib in an automated way. This example can be expanded to many other areas of risk management, such as insurance and cybersecurity.
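To give a flavor of the per-option computation being offloaded, here is a minimal sketch of a Monte Carlo price for a European call under geometric Brownian motion. This is purely illustrative: FinLib's actual API is not published in this abstract, and every parameter and function name below is an assumption.

```python
import math
import random

def mc_call_price(spot, strike, rate, vol, maturity, n_paths=100_000, seed=42):
    """Monte Carlo price of a European call: simulate terminal prices under
    GBM, average the discounted payoff max(S_T - K, 0)."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol * vol) * maturity
    diff = vol * math.sqrt(maturity)
    total_payoff = 0.0
    for _ in range(n_paths):
        s_t = spot * math.exp(drift + diff * rng.gauss(0.0, 1.0))
        total_payoff += max(s_t - strike, 0.0)
    return math.exp(-rate * maturity) * total_payoff / n_paths
```

Backtesting tens of billions of such objects is exactly the workload where per-path loops like this one get pushed down to an accelerator, with the host framework (Spark plus persistent dataframes, in the talk's design) handling only data movement.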
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
Data warehouses have been the standard tool for analyzing data created by business operations. In recent years, increasing data volumes, new types of data formats, and emerging analytics technologies such as machine learning have given rise to modern data lakes. Connecting application databases, data warehouses, and data lakes using real-time data pipelines can significantly improve the time to action for business decisions. More: http://info.mapr.com/WB_MapR-StreamSets-Data-Warehouse-Modernization_Global_DG_17.08.16_RegistrationPage.html
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...inside-BigData.com
DK Panda from Ohio State University presented this deck at the Switzerland HPC Conference.
"This talk will provide an overview of challenges in accelerating Hadoop, Spark and Mem- cached on modern HPC clusters. An overview of RDMA-based designs for multiple com- ponents of Hadoop (HDFS, MapReduce, RPC and HBase), Spark, and Memcached will be presented. Enhanced designs for these components to exploit in-memory technology and parallel file systems (such as Lustre) will be presented. Benefits of these designs on various cluster configurations using the publicly available RDMA-enabled packages from the OSU HiBD project (http://hibd.cse.ohio-state.edu) will be shown."
Watch the video presentation: https://www.youtube.com/watch?v=glf2KITDdVs
See more talks in the Swiss Conference Video Gallery: http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionEtu Solution
Speaker: Senior Product Consultant, Informatica | 尹寒柏
Session abstract: In the Big Data era, the competition is not about the quantity of data but about the depth with which you understand it. Now that big data technology has matured, CXOs without an IT background can turn CI (Customer Intelligence), once just a piece of jargon, into practice: moving from BI to CI, tracking the pulse of the consumer economy and gaining insight into customer intent. One mindset to keep in the Big Data era is that, in the end, the competition is not only about growing data volumes but about who understands the data more deeply. Informatica is the answer to this challenge. With Informatica, enterprises relieve the enormous pressure of delivering trustworthy data in a timely manner; and as data volume and complexity keep rising, Informatica can aggregate data faster, making it meaningful and usable for improving efficiency, raising quality, ensuring certainty, and exploiting advantages. Informatica offers a faster and more effective way to reach this goal and is 精誠集團 (SYSTEX)'s tool of choice for the Big Data era.
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
So you built your Hadoop cluster. How do you get data from hundreds of database tables, streaming Kafka sources, and data shared by 20-year-old COBOL programs all in there, working together quickly, efficiently, and securely? With many customers asking this same question, Hortonworks recently expanded its partnership with Syncsort to provide optimized ETL onboarding for Hadoop. During this talk, we'll discuss how a next-generation ETL tool, built on contributions to the open source community and natively integrated with Hadoop, can drive lasting value for your organization: 1) Seamlessly onboard data from all your enterprise sources, batch and streaming, into Hadoop for fast and easy analytics. 2) Stay agile and simplify your environment with a "design once, deploy anywhere" approach that minimizes disruption and risk in the face of a rapidly evolving big data ecosystem. 3) Secure, govern, and manage your data with full integration with Apache Ambari, Apache Ranger, and more. These benefits come to life with real customer case studies. Learn how a national insurance company and a global hotel chain are using Hortonworks HDP and Syncsort DMX-h to get bigger insights from their enterprise data, securely, efficiently, and cost-effectively, without spending hundreds of man-hours.
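The "design once, deploy anywhere" idea can be sketched in a few lines: express the transformation as a plain function over records, then reuse it unchanged for a batch source (a list or file) or a streaming source (a generator). The record field names below are invented for illustration and not taken from the talk.

```python
def transform(record):
    """One onboarding step: normalize field names and types."""
    return {
        "customer_id": int(record["CUST_ID"]),
        "amount_usd": round(float(record["AMT"]), 2),
    }

def run_pipeline(source, sink):
    """Works identically whether `source` is a finite batch or an
    unbounded stream, because the pipeline only iterates over it."""
    for record in source:
        sink.append(transform(record))
    return sink
```

A production tool generalizes this: the same declared transformation graph is compiled to a MapReduce/Spark batch job or to a streaming job, so the logic is not rewritten when the runtime changes.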
Enabling Real-Time Business with Change Data CaptureMapR Technologies
Machine learning (ML) and artificial intelligence (AI) enable intelligent processes that can autonomously make decisions in real-time. The real challenge for effective ML and AI is getting all relevant data to a converged data platform in real-time, where it can be processed using modern technologies and integrated into any downstream systems.
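As a minimal sketch of what change data capture does, assuming a polling-style approach for simplicity: diff the current table snapshot against the previous one and emit insert/update/delete events that a downstream consumer (such as a stream processor feeding the converged platform) can apply. Real CDC tools usually read the database's transaction log rather than polling snapshots.

```python
def capture_changes(previous, current):
    """previous/current: {primary_key: row}. Returns ordered change events."""
    events = []
    for pk, row in current.items():
        if pk not in previous:
            events.append(("insert", pk, row))
        elif previous[pk] != row:
            events.append(("update", pk, row))
    for pk in previous:
        if pk not in current:
            events.append(("delete", pk, None))
    return events
```

Applying these events in order to a downstream store keeps it continuously in sync with the source, which is the property that makes real-time ML and AI pipelines possible.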
The Synapse IoT Stack: Technology Trends in IOT and Big DataInMobi Technology
This is the presentation from Big Data November Bangalore Meetup 2014.
http://technology.inmobi.com/events/bigdata-meetup
Talk Outline:
- What does THE HIVE provide?
- Goals of Synapse Tech Stack
- THE HIVE Startups
- Demystifying IoT Market
- Synapse Stack for IoT
- Big Data Challenge
- Synapse Lambda Architecture
- Synapse Components
- Synapse Internals
- AKILI – Synapse Machine Learning
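The Lambda Architecture item in the outline above can be sketched as follows: a batch layer periodically recomputes an exact view from the master dataset, a speed layer keeps incremental state for events that arrived since the last batch, and queries merge the two. The event shape (simple keyed counts) is an invented simplification, not Synapse's actual schema.

```python
from collections import Counter

class LambdaCounts:
    """Toy Lambda Architecture: batch view + speed view, merged at query time."""

    def __init__(self):
        self.batch_view = Counter()   # recomputed in full from the master dataset
        self.speed_view = Counter()   # incremental, covers only recent events

    def ingest(self, key):
        self.speed_view[key] += 1     # speed layer: visible immediately

    def rebuild_batch(self, master_events):
        self.batch_view = Counter(master_events)  # full batch recompute
        self.speed_view.clear()       # recent events now covered by the batch view

    def query(self, key):
        return self.batch_view[key] + self.speed_view[key]
```

In a real deployment the batch view would be rebuilt by a Hadoop/Spark job and the speed view maintained by a stream processor; the merge-at-query-time step is what hides the batch layer's latency.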
H2O Machine Learning with KNIME Analytics Platform - Christian Dietz - H2O AI...Sri Ambati
This talk was recorded in London on October 30, 2018.
KNIME Analytics Platform is an easy-to-use and comprehensive open source data integration, analysis, and exploration platform, enabling data scientists to visually compose end-to-end data analysis workflows. The over 2,000 available modules ("nodes") cover each step of the analysis workflow, including blending heterogeneous data types, data transformation, wrangling and cleansing, advanced data visualization, and model training and deployment.
Many of these nodes are provided through open source integrations (why reinvent the wheel?). This provides seamless access to large open source projects such as Keras and Tensorflow for deep learning, Apache Spark for big data processing, Python and R for scripting, and more. These integrations can be used in combination with other KNIME nodes meaning that data scientists can freely select from a vast variety of options when tackling an analysis problem.
The integration of H2O in KNIME offers an extensive number of nodes encapsulating the functionality of the H2O open source machine learning libraries, making it easy to use H2O algorithms from a KNIME workflow without touching any code: each of the H2O nodes looks and feels just like a normal KNIME node, and the data scientist benefits from H2O's high-performance libraries and proven quality during execution. For prototyping, these algorithms are executed locally; however, training and deployment can easily be scaled up using a Sparkling Water cluster.
In our talk we give a short introduction to KNIME Analytics Platform and then demonstrate how data scientists benefit from using KNIME Analytics Platform and H2O Machine Learning in combination by using a real world analysis example.
Bio: Christian received a Master’s degree in Computer Science from the University of Konstanz. Having gained experience as a research software engineer at the University of Konstanz, where he developed frameworks and libraries in the fields of bioimage analysis and machine learning, Christian moved on to become a software engineer at KNIME. He now focuses on developing new functionalities and extensions for KNIME Analytics Platform. Some of his recent projects include deep learning integrations built upon Keras and Tensorflow, extensions for image analysis and active learning, and the integration of H2O Machine Learning and H2O Sparkling Water in KNIME Analytics Platform.
Real-Time Robot Predictive Maintenance in ActionDataWorks Summit
Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, the reality is that some of the most advanced industrial makers in the world are barely getting started making use of this data, with relatively rudimentary, bespoke monitoring systems built at tremendous cost.
We believe that it is now possible, using a well-chosen selection of enterprise open source big data projects, to successfully deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech makers. We propose to show a working prototype of just such a system, and explain in some detail how it was made.
Our presentation describes a working real-time ML-based anomaly detection system. We show a working industrial robot analog fitted with a wireless movement sensor. Our system scores the data in a cloud-based cluster. For added realism, the system we demonstrate live includes a working augmented-reality headset that can show the real-time status overlaid on the working robot.
This talk is about demonstrating a concrete example of a real-time predictive maintenance system, built as a series of microservices connected by Kafka streams and powered by the excellent H2O distributed Machine Learning tool. Our goal is for our attendees to get a feel for what can be realistically achieved by a few non-genius-level engineers in a few months of effort using the best in open source technology for real-time streams (Kafka) and Machine learning (H2O).
Where appropriate, we’ll mention how our choice of using the MapR Converged Data Platform made the development easier thanks to some of its unique features.
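As a concrete illustration of the scoring step in such a pipeline, here is a minimal sketch, assuming a simple rolling z-score detector in place of the H2O model the talk actually uses (whose details are not given in this abstract). Each sensor reading is flagged when it deviates from the recent window by more than `threshold` standard deviations; the window size and threshold are illustrative defaults.

```python
from collections import deque
from statistics import mean, pstdev

class AnomalyDetector:
    """Rolling z-score anomaly detector over a stream of sensor readings."""

    def __init__(self, window=50, threshold=3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def score(self, x):
        """Return True if x is anomalous relative to the current window,
        then fold x into the window."""
        anomalous = False
        if len(self.window) >= 10:  # wait for a minimal baseline
            mu, sigma = mean(self.window), pstdev(self.window)
            if sigma > 0 and abs(x - mu) > self.threshold * sigma:
                anomalous = True
        self.window.append(x)
        return anomalous
```

In the microservice design the talk describes, each Kafka message would carry one reading, and `score` would be the body of the consumer loop, with flagged events published to a downstream alerting topic.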
Speaker
Cao Yi, MapR
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsDatabricks
Designing a streaming application that has to process data from one or two streams is easy; any streaming framework that provides scalability, high throughput, and fault tolerance will do. But when the number of streams grows into the hundreds or thousands, managing them can be daunting. How would you share resources among thousands of streams that all run 24×7, manage their state, apply advanced streaming operations, and add or delete streams without restarting? This talk explains common scenarios and shows techniques that can handle thousands of streams using Spark Structured Streaming.
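One way to frame the multiplexing idea in miniature, under the assumption of simple keyed running state: route every event through a single physical pipeline and keep per-stream state in a keyed map, so adding or deleting a logical stream is a dictionary operation rather than a job restart. Structured Streaming's techniques operate at a much larger scale, but the shape is the same.

```python
from collections import defaultdict

class MultiplexedPipeline:
    """Thousands of logical streams sharing one physical pipeline."""

    def __init__(self):
        self.state = defaultdict(int)  # per-stream running state (here: a sum)
        self.deleted = set()

    def process(self, stream_id, value):
        if stream_id in self.deleted:
            return None                # stream was removed: drop, no restart
        self.state[stream_id] += value
        return self.state[stream_id]

    def delete_stream(self, stream_id):
        """Remove a logical stream while all others keep running."""
        self.deleted.add(stream_id)
        self.state.pop(stream_id, None)
```

The design choice this illustrates: resources (here one process loop, in Spark one set of executors) are shared across all streams, while isolation lives entirely in the keyed state.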
High Performance Computing - A Key Factor for the Competitiveness of the Country, ...Igor José F. Freitas
Video: https://www.youtube.com/watch?v=8cFqNwhQ7uE
A key factor for the competitiveness of the country, of science, and of industry.
Talk delivered during Intel Innovation Week 2015.
This talk is about demonstrating a concrete example of a real-time predictive maintenance system, built as a series of microservices connected by Kafka streams and powered by the excellent H2O distributed Machine Learning tool. Our goal is for our attendees to get a feel for what can be realistically achieved by a few non-genius-level engineers in a few months of effort using the best in open source technology for real-time streams (Kafka) and Machine learning (H2O).
Where appropriate, we’ll mention how our choice of using the MapR Converged Data Platform made the development easier thanks to some of its unique features.
Speaker
Cao Yi, MapR
Blue Pill/Red Pill: The Matrix of Thousands of Data StreamsDatabricks
Designing a streaming application which has to process data from 1 or 2 streams is easy. Any streaming framework which provides scalability, high-throughput, and fault-tolerance would work. But when the number of streams start growing in order 100s or 1000s, managing them can be daunting. How would you share resources among 1000s of streams with all of them running 24×7? Manage their state, Apply advanced streaming operations, Add/Delete streams without restarting? This talk explains common scenarios & shows techniques that can handle thousands of streams using Spark Structured Streaming.
Computação de Alto Desempenho - Fator chave para a competitividade do País, d...Igor José F. Freitas
Vídeo: https://www.youtube.com/watch?v=8cFqNwhQ7uE
Fator chave para a competitividade do País, da Ciência e da Indústria.
Palestra ministrada durante o Intel Innovation Week 2015 .
If you're like most of the world, you're on an aggressive race to implement machine learning applications and on a path to get to deep learning. If you can give better service at a lower cost, you will be the winners in 2030. But infrastructure is a key challenge to getting there. What does the technology infrastructure look like over the next decade as you move from Petabytes to Exabytes? How are you budgeting for more colossal data growth over the next decade? How do your data scientists share data today and will it scale for 5-10 years? Do you have the appropriate security, governance, back-up and archiving processes in place? This session will address these issues and discuss strategies for customers as they ramp up their AI journey with a long term view.
Intel's Data Center & Connected Systems Group and Diane Bryant shares the latest news on the latest Intel Xeon E5v2 family of processors and technologies like Intel Network Builders to enable the re-architecture of the Data Center.
Streamline End-to-End AI Pipelines with Intel, Databricks, and OmniSciIntel® Software
Preprocess, visualize, and Build AI Faster at-Scale on Intel Architecture. Develop end-to-end AI pipelines for inferencing including data ingestion, preprocessing, and model inferencing with tabular, NLP, RecSys, video and image using Intel oneAPI AI Analytics Toolkit and other optimized libraries. Build at-scale performant pipelines with Databricks and end-to-end Xeon optimizations. Learn how to visualize with the OmniSci Immerse Platform and experience a live demonstration of the Intel Distribution of Modin and OmniSci.
IBM POWER - An ideal platform for scale-out deploymentsthinkASG
IBM Power Systems is the ideal platform for scale-out deployments such as Big Data, SAP HANA and anything else the requires heavy compute to achieve business goals, faster.
Intel, en el corazón del Software Defined Datacenter:
La nueva familia de procesadores Intel Xeon E5 v3
y la visión de Intel en relación con la nube híbrida y el Software Defined Infrastructure
Python Data Science and Machine Learning at Scale with Intel and AnacondaIntel® Software
Python is the number 1 language for data scientists, and Anaconda is the most popular python platform. Intel and Anaconda have partnered to bring scalability and near-native performance to Python with simple installations. Learn how data scientists can now access oneAPI-optimized Python packages such as NumPy, Scikit-Learn, Modin, Pandas, and XGBoost directly from the Anaconda repository through simple installation and minimal code changes.
Introduction to Software Defined Visualization (SDVis)Intel® Software
Software defined visualization (SDVis) is an open-source initiative from Intel and industry collaborators. Improve the visual fidelity, performance, and efficiency of prominent visualization solutions, while supporting the rapidly growing big data use on workstations through high-performance computing (HPC) on supercomputing clusters without memory limitations and cost of GPU-based solutions.
How to optimize Hortonworks Apache Spark ML workloads on Power - POWER 8/9 architecture is the latest offering from IBM and OpenPower foundation. It is the perfect platform for optimizing Hortonworks Spark's performance. During this presentation we will walk the audience through steps required to optimize YARN, HDFS, and Spark on a Power cluster.
Step required:
1) Classify workload into CPU, Memory, IO or mixed (CPU, memory, IO) intensive
2) Characterize "out-of-box" Hortonworks spark workload to understand CPU, Memory, IO and Network performance characteristics
3) Floor Plan cluster resources
4) Tune "out-of-box" workload to navigate "Roofline" Performance space in the above named dimensions
5) If workload is Memory / IO/ Network intensive bound then tune SPARK to increase operational intensity operations/byte as much as possible to make it CPU bound
6) Divide search space into regions and perform exhaustive search.
7) Identify Performance bottlenecks by resource monitoring and tune the System, JVM or application layer by profiling application and hardware counters if required.
Accelerate Machine Learning Software on Intel Architecture Intel® Software
This session presents performance data for deep learning training for image recognition that achieves greater than 24 times speedup performance with a single Intel® Xeon Phi™ processor 7250 when compared to Caffe*. In addition, we present performance data that shows training time is further reduced by 40 times the speedup with a 128-node Intel® Xeon Phi™ processor cluster over Intel® Omni-Path Architecture (Intel® OPA).
Join us for an exciting and informative preview of the broadest range of next-generation systems optimized for tomorrow’s data center workloads, Powered by 4th Gen Intel® Xeon® Scalable Processors (formerly codenamed Sapphire Rapids).
Experts from Supermicro and Intel will discuss how the upcoming Supermicro X13 systems will enable new performance levels utilizing state-of-the-art technology, including DDR5, PCIe 5.0, Compute Express Link™ 1.1, and Intel® Advanced Matrix Extensions (Intel AMX).
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
Adjusting OpenMP PageRank : SHORT REPORT / NOTESSubhajit Sahu
For massive graphs that fit in RAM, but not in GPU memory, it is possible to take
advantage of a shared memory system with multiple CPUs, each with multiple cores, to
accelerate pagerank computation. If the NUMA architecture of the system is properly taken
into account with good vertex partitioning, the speedup can be significant. To take steps in
this direction, experiments are conducted to implement pagerank in OpenMP using two
different approaches, uniform and hybrid. The uniform approach runs all primitives required
for pagerank in OpenMP mode (with multiple threads). On the other hand, the hybrid
approach runs certain primitives in sequential mode (i.e., sumAt, multiply).
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...2023240532
Quantitative data Analysis
Overview
Reliability Analysis (Cronbach Alpha)
Common Method Bias (Harman Single Factor Test)
Frequency Analysis (Demographic)
Descriptive Analysis
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
Trends towards the merge of HPC + Big Data systems
1. WSCAD 2016 - XVII Simpósio em Sistemas Computacionais de Alto Desempenho
Aracaju, Sergipe, Brazil
October 7th, 2016
Igor Freitas
igor.freitas@intel.com
2. WSCAD 2016
Big Data Analytics: HPC != Big Data?
*Other brands and names are the property of their respective owners.
How the two stacks differ, layer by layer (HPC vs. Big Data):
- Programming model: FORTRAN / C++ applications with MPI (high performance) vs. Java, Python, Go, etc.* applications with Hadoop* (simple to use)
- Resource manager: SLURM (supports large-scale startup) vs. YARN* (more resilient to hardware failures)
- File system: Lustre* (remote storage) vs. HDFS*, Spark* (local storage)
- Hardware: compute- and memory-focused, high-performance components vs. storage-focused, standard server components
- Infrastructure: server storage on SSDs with a fabric switch vs. server storage on HDDs with an Ethernet switch
3. WSCAD 2016
Varied Resource Needs: Typical HPC vs. Typical Big Data Workloads
Workloads fall along two axes, data volume and compute intensity:
- Small data + small compute: e.g., data analysis
- Big data + small compute: e.g., search, streaming, data preconditioning
- Small data + big compute: e.g., mechanical design, multi-physics
- Big data + big compute ("Big Data Analytics / HPC in real time"): e.g., high-frequency trading, numeric weather simulation, oil & gas seismic, video traffic monitoring, personal digital health
The system cost balance across processor, memory, interconnect, and storage differs accordingly for each workload class.
4. WSCAD 2016
Trends in HPC + Big Data
- Performance: code modernization (vector instructions), many-core, FPGA
- Usability: faster time-to-market, lower costs (HPC in the cloud?), better products, HW & SW that are easy to maintain
- Portability: open, common environments
- Standards and business viability
- Integrated solutions: storage + network + processing + memory
- Public investments
6. WSCAD 2016
HPC is Foundational to Insight
Application domains: aerospace, biology, brain modeling, chemistry/chemical engineering, climate, computer-aided engineering, cosmology, cybersecurity, defense, pharmacology, particle physics, metallurgy, manufacturing/design, life sciences, government labs, geosciences/oil & gas, genomics, fluid dynamics, digital content creation, EDA, economics/financial services, fraud detection, social sciences (literature, linguistics, marketing), university/academic, weather.
- Business innovation - high ROI: $515 average return per $1 of HPC investment1
- A new science paradigm - data-driven analytics joins theory, experimentation, and computational science
- Fundamental discovery - advancing science and our understanding of the universe
1 Source: IDC HPC and ROI Study Update (September 2015)
2 Source: IDC 2015 Q1 Worldwide x86 Server Tracker vs. IDC 2015 Q1 Worldwide HPC Server Tracker
7. WSCAD 2016
Growing Challenges in HPC: "The Walls"
- System bottlenecks: memory | I/O | storage; energy-efficient performance; space | resiliency | unoptimized software
- Divergent infrastructure: resources split among modeling and simulation | big data analytics | machine learning | visualization, each wanting an HPC-optimized system
- Barriers to extending usage: democratization at every scale | cloud access | exploration of new parallel programming models
8. WSCAD 2016
HPC & the Competitiveness of Industry & Science in the USA
Public Investments
- Executive order from President Obama establishing a national supercomputing program
- HPC named a "top priority" to leverage USA competitiveness
"In order to maximize the benefits of HPC for economic competitiveness and scientific discovery, the United States Government must create a coordinated Federal strategy in HPC research, development, and deployment" - Executive Order, Barack Obama
Source: The White House, Office of the Press Secretary
9. WSCAD 2016
HPC & the Competitiveness of Industry & Science in the USA
Public Investments
- The U.S. makes a Top 10 supercomputer available to anyone who can "boost" America*, aiming to:
  - Boost American competitiveness
  - Accelerate advances in science and technology
  - Develop the country's skilled high-performance computing (HPC) workforce
Source: The White House, Office of the Press Secretary
10. WSCAD 2016
China's New Supercomputer Puts the US Even Further Behind*
Public Investments
- Sunway TaihuLight officially became the fastest supercomputer in the world
- What it really means for HPC:
  - Innovation through HPC
  - Government recognition of HPC's competitive value
  - Software is the key: performance, productivity, programmability
*Source: https://www.wired.com/2016/06/fastest-supercomputer-sunway-taihulight/
12. WSCAD 2016
Democratizing HPC for Big Data Workloads
Performance: Vector Instructions
- In the 70s and 80s, vector machines were the rule
- Why did they become "old stuff" in the 90s? According to Eugene D. Brooks, the reason was simple: they were custom machines*
- The near future? Vectors again - but now in general-purpose CPUs: affordable, easy to code, and associated with multi-threaded programming
*Source: https://www.hpcwire.com/2016/09/26/vectors-old-became-new-supercomputing/
Chart: Gigaflops from vector machines vs. parallel machines*
14. WSCAD 2016
Vectorization and Threading Are Critical on Modern Hardware
Performance: Vector Instructions
Chart key, fastest to slowest: vectorized & threaded; threaded; vectorized; serial.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance. Configurations at the end of this presentation.
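The gap between the "serial" and "vectorized" bars above can be sketched in plain Python with NumPy (the library used elsewhere in this deck). This is an analogy, not the Intel toolchain: NumPy's dispatch to compiled, SIMD-capable kernels stands in for compiler-level vectorization of a scalar loop.

```python
# Illustrative only: an interpreted scalar loop vs. the same arithmetic
# dispatched to NumPy's compiled, vectorizable kernel (a stand-in for
# compiler SIMD vectorization such as AVX-512).
import numpy as np
import time

def saxpy_scalar(a, x, y):
    # One multiply-add per interpreted loop iteration.
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out

def saxpy_vectorized(a, x, y):
    # Same arithmetic, executed by a single compiled array operation.
    return a * x + y

n = 200_000
x = np.random.rand(n)
y = np.random.rand(n)

t0 = time.perf_counter()
ref = saxpy_scalar(2.0, x, y)
t_scalar = time.perf_counter() - t0

t0 = time.perf_counter()
vec = saxpy_vectorized(2.0, x, y)
t_vec = time.perf_counter() - t0

assert np.allclose(ref, vec)   # identical results, very different speed
print(f"scalar: {t_scalar:.4f}s  vectorized: {t_vec:.4f}s")
```

The speedup observed this way is library-level; the slides' point is that the same principle applies one level down, inside compiled loops.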
16. WSCAD 2016
Intel® DAAL Overview
Industry-leading performance: a C++/Java/Python library for machine learning and deep learning optimized for Intel® architectures, covering the full pipeline from pre-processing and transformation through analysis, modeling, validation, and decision making, for scientific/engineering, web/social, and business data.
Algorithms include: (de-)compression; PCA; statistical moments; variance matrix; QR, SVD, and Cholesky decompositions; Apriori; linear regression; Naïve Bayes; SVM; classifier boosting; K-means; EM for GMM; collaborative filtering; neural networks.
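To make the "analysis" stage of that pipeline concrete, here is a minimal NumPy stand-in for two of the primitives listed (statistical moments and the variance-covariance matrix). It mirrors what such library calls compute, not DAAL's API; the function names are this sketch's own.

```python
# Minimal NumPy stand-in for DAAL-style analysis primitives: low-order
# statistical moments and the variance-covariance matrix of a
# samples-by-features table. Illustrative functions, not the DAAL API.
import numpy as np

def low_order_moments(data):
    """Per-column moments for a samples-by-features matrix."""
    return {
        "minimum": data.min(axis=0),
        "maximum": data.max(axis=0),
        "mean": data.mean(axis=0),
        "variance": data.var(axis=0, ddof=1),   # sample variance
    }

def covariance_matrix(data):
    """Sample variance-covariance matrix (features x features)."""
    return np.cov(data, rowvar=False)

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))
moments = low_order_moments(data)
cov = covariance_matrix(data)
print(moments["mean"], cov.shape)
```

A library like DAAL computes the same quantities, but in blocked, vectorized, multi-threaded kernels that also handle out-of-core and distributed inputs.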
18. WSCAD 2016
Python* Landscape
Adoption of Python continues to grow among domain specialists and developers for its productivity benefits.
- Challenge #1: domain specialists are not professional software programmers.
- Challenge #2: Python performance limits migration to production systems.
Intel's solution is to accelerate Python performance, enable easy access, and empower the community.
20. WSCAD 2016
PCA Performance Boosts Using Intel® DAAL vs. Spark* MLlib on Intel® Architectures
PCA (correlation method) on an 8-node Hadoop* cluster based on Intel® Xeon® Processors E5-2697 v3; speedup over Spark MLlib by table size:
  1M x 200: 4X | 1M x 400: 6X | 1M x 600: 6X | 1M x 800: 7X | 1M x 1000: 7X
Configuration info - Versions: Intel® Data Analytics Acceleration Library 2016, CDH v5.3.1, Apache Spark* v1.2.0; Hardware: Intel® Xeon® Processor E5-2699 v3, 2 eighteen-core CPUs (45MB LLC, 2.3GHz), 128GB of RAM per node; Operating System: CentOS 6.6 x86_64. Benchmark source: Intel Corporation. * Other brands and names are the property of their respective owners.
Optimization notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Notice revision #20110804.
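The algorithm benchmarked above, PCA by the correlation method, reduces to an eigendecomposition of the correlation matrix. The sketch below is a plain NumPy illustration of that computation on a toy table, not the DAAL or Spark MLlib implementation, and far smaller than the 1M-row benchmark inputs.

```python
# Sketch of PCA via the correlation method: build the correlation matrix
# of the feature columns, eigendecompose it, keep the leading components.
# Plain NumPy illustration only, not the DAAL/MLlib code under benchmark.
import numpy as np

def pca_correlation(data, n_components):
    # 1) Correlation matrix of the columns (features x features).
    corr = np.corrcoef(data, rowvar=False)
    # 2) Eigendecomposition; eigh is appropriate for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(corr)
    # 3) eigh returns eigenvalues ascending; keep the largest n_components.
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(42)
table = rng.normal(size=(10_000, 10))
eigvals, components = pca_correlation(table, n_components=3)
print(eigvals.shape, components.shape)
```

The costly steps at scale are the correlation matrix build (a large matrix product over the table) and the eigensolve, which is where optimized BLAS/LAPACK kernels such as DAAL's earn the speedups shown above.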
21. WSCAD 2016
What's New: Intel® DAAL 2017
- Neural networks
- Python* API (a.k.a. PyDAAL), with easy installation through Anaconda or pip
- New data source connector for KDB+
- Open source project on GitHub: https://github.com/01org/daal
22. WSCAD 2016
Intel® VTune™ Amplifier Performance Profiler
New for 2017: Python*, FLOPS, Storage, and More
- Profile Python* and mixed Python / C++ / Fortran*
- Tune the latest Intel® Xeon Phi™ processors
- Quickly see three keys to HPC performance
- Optimize memory access
- Storage analysis: I/O bound or CPU bound?
- Enhanced OpenCL* and GPU profiling
- Easier remote and command-line usage
- Add custom counters to the timeline
- Preview: application and storage performance snapshots
- Intel® Advisor: optimize vectorization for Intel® AVX-512 (with or without the hardware)
23. WSCAD 2016
Optimize Memory Access
Memory Access Analysis: Intel® VTune™ Amplifier 2017 (improved)
- Tune data structures for performance: attribute cache misses to data structures (not just the code causing the miss); support for custom memory allocators
- Optimize NUMA latency and scalability: true- and false-sharing optimization; auto-detect max system bandwidth; easier tuning of inter-socket bandwidth
- Easier install, latest processors: no special drivers required on Linux*; Intel® Xeon Phi™ processor MCDRAM (high-bandwidth memory) analysis
24. WSCAD 2016
Storage Device Analysis (HDD, SATA, or NVMe SSD)
Intel® VTune™ Amplifier (new)
Are you I/O bound or CPU bound?
- Explore imbalance between I/O operations (async and sync) and compute
- Storage accesses mapped to the source code
- See when the CPU is waiting for I/O
- Measure bus bandwidth to storage
Latency analysis:
- Tune storage accesses with the latency histogram
- Distribution of I/O over multiple devices
Screenshot: a slow task with I/O wait highlighted; sliders set thresholds for I/O queue depth.
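A profiler answers the "I/O bound or CPU bound?" question with hardware and OS counters, but a crude first approximation needs only the standard library: compare wall-clock time with CPU time, since time spent blocked on I/O shows up in the gap between the two. This is a rough heuristic sketch, not how VTune works.

```python
# Crude stdlib approximation of "I/O bound or CPU bound?": if the process
# accumulated CPU time for most of the wall-clock interval, it was
# computing; if not, it was mostly waiting (on I/O, locks, or sleep).
import time

def classify(workload, threshold=0.5):
    wall0, cpu0 = time.perf_counter(), time.process_time()
    workload()
    wall = time.perf_counter() - wall0
    cpu = time.process_time() - cpu0
    return "cpu-bound" if cpu / wall > threshold else "io-bound"

def compute_task():
    sum(i * i for i in range(1_000_000))   # pure computation

def waiting_task():
    time.sleep(0.2)                        # stands in for a blocking read

print(classify(compute_task))   # cpu-bound
print(classify(waiting_task))   # io-bound
```

A real tool goes much further, attributing the wait time to specific storage accesses and source lines, which is exactly what the slide's latency histogram provides.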
25. WSCAD 2016
Intel® Performance Snapshots
Three Fast Ways to Discover Untapped Performance
Is your application making good use of modern computer hardware? Run a test case during your coffee break: a high-level summary shows which apps can benefit most from code modernization and faster storage.
Pick a performance snapshot:
- Application: for non-MPI apps
- MPI: for MPI apps
- Storage: for systems, servers, and workstations with directly attached storage
Free download: http://www.intel.com/performance-snapshot
Also included with Intel® Parallel Studio and Intel® VTune™ Amplifier products.
26. WSCAD 2016
Python API (a.k.a. PyDAAL)
- Sticks closely to DAAL's overall design: object-oriented, namespace hierarchy, plug & play
- Seamless interfacing with NumPy
- Anaconda package: http://anaconda.org/intel/
- Co-exists with the proprietary version
- Apache 2.0 license
- Lives on github.com
...
# Create a NumPy array as our input
a = np.array([[1, 2, 4],
              [2, 1, 23],
              [4, 23, 1]])
# Create a DAAL Matrix using our NumPy array
m = daal.Matrix(a)
# Create an algorithm object for Cholesky decomposition using the default method
algorithm = cholesky.Batch()
# Set input arguments of the algorithm
algorithm.input.set(cholesky.data, m)
# Compute the Cholesky decomposition
res = algorithm.compute()
# Get the computed Cholesky factor
tbl = res.get(choleskyFactor)
# Get and print the NumPy array
print(tbl.getArray())
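The same decomposition can be cross-checked without DAAL using NumPy alone. One caveat worth noting: Cholesky factorization requires a symmetric positive-definite input, and the 3x3 example matrix on the slide is not positive definite (its second leading minor is negative), so this sketch substitutes a matrix that is.

```python
# Cholesky decomposition without DAAL, as a cross-check. NumPy's
# linalg.cholesky returns the lower-triangular factor L with a = L @ L.T.
# The matrix below is symmetric positive definite, as Cholesky requires.
import numpy as np

a = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])

L = np.linalg.cholesky(a)        # lower-triangular Cholesky factor
assert np.allclose(L @ L.T, a)   # the factor reproduces the input matrix
print(L)                         # rows: [2,0,0], [1,2,0], [1,1,2]
```

For small dense problems the two paths are interchangeable; DAAL's value shows up on large tables, where its batch/streaming/distributed execution modes apply.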
28. WSCAD 2016
Growing Need for a New Class of Memory
Performance & lower costs: integrated solutions
Segments with that need: virtualization, big data & cloud, in-memory DB, OLTP, workstation, supply chain management, enterprise ERP, database, storage, HPC.
What they ask for:
- "Give me a faster storage interface"
- "Allow in-memory data to survive soft reset or hard reboot"
- "Minimal latency for huge memory"
- "Make large-memory servers less expensive"
29. WSCAD 2016
Bridging the Memory-Storage Gap
Intel® Optane™ Technology Based on 3D XPoint™
- SSD: Intel® Optane™ SSDs deliver 5-7x the IOPS of current flagship NAND-based SSDs1
- DRAM-like performance: Intel® DIMMs based on 3D XPoint™, 1,000x faster than NAND1 with 1,000x the endurance of NAND2
- Hard-drive-like capacities: 10x more dense than conventional memory3
1 Performance difference based on comparison between 3D XPoint™ technology and other industry NAND
2 Endurance difference based on comparison between 3D XPoint™ technology and other industry NAND
3 Density difference based on comparison between 3D XPoint™ technology and other industry DRAM
Intel® Scalable System Framework
30. WSCAD 2016
CPU
DDR
INTEL®DIMMS
Intel®Optane™SSD
NAND SSD
Hard Disk Drives
1000Xfaster
Than NAND1
1000Xendurance
Of NAND2
10Xdenser
Than DRAM3
30
Intel® Scalable
System Framework
Bridging the Memory-Storage Gap
Intel® Optane™ Technology
Performance & Lower costs: Integrated solutions
1Performancedifferencebasedoncomparisonbetween3DXPoint™Technologyandotherindustry NAND
2 Density differencebasedoncomparisonbetween3DXPoint™Technologyandotherindustry DRAM
2 Endurancedifferencebasedoncomparisonbetween3DXPoint™Technologyandother industryNAND
Data granularity:
64B cacheline
31. WSCAD 2016
Yesterday Today Near Future
31
Storage Evolution
Performance & Lower costs: Integrated solutions
Memory
&
Storage
Storage
NAND based Intel P3700
(Fultondale) for NVMe
3D XPoint™ based
Coldstream SSD for NVMe
3D XPoint™ based
Apache Pass (AEP) for DDR4
Revolutionary
Storage
Class Memory
World’s
Fastest
NVMe SSD
3D XPoint enables world’s fastest NVMe SSD and
revolutionary storage class memory
32. WSCAD 2016
Code Modernization
Democratizing HPC performance for Big Data workloads
Tools, from ease of use to fine tuning:
- Intel® Math Kernel Library and Intel® Data Analytics Acceleration Library
- Array notation: Intel® Cilk™ Plus
- Auto vectorization
- Semi-auto vectorization: #pragma (vector, ivdep, simd)
- C/C++ vector classes (F32vec16, F64vec8)
Knights Landing server processor (with memory, fabric, and coprocessor heritage):
- Memory bandwidth: ~500 GB/s STREAM
- Memory capacity: over 25x* KNC
- Resiliency: systems scalable to >100 PF
- Power efficiency: over 25% better than the card1
- I/O: up to 100 GB/s with integrated fabric
- Cost: less costly than discrete parts1
- Flexibility: limitless configurations
- Density: 3+ KNL with fabric in 1U3
*Comparison to 1st Generation Intel® Xeon Phi™ 7120P Coprocessor (formerly codenamed Knights Corner)
1 Results based on internal Intel analysis using estimated power consumption and projected component pricing in the 2015 timeframe. This analysis is provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
2 Comparison to a discrete Knights Landing processor and discrete fabric component.
3 Theoretical density for air-cooled system; other cooling solutions and configurations will enable lower or higher density.
33. WSCAD 2016
Three Knights Landing Products
- KNL and KNL-F processors - "self-boot" Intel® Xeon Phi™ processor platform: Knights Landing IS the host processor and boots standard off-the-shelf OSs. Benefits: higher performance density for highly parallel applications2, reduced system power consumption2, higher perf/Watt and perf/$3.
- KNL coprocessor - requires an Intel® Xeon® processor host (Adams Pass platform): a solution for general-purpose servers and workstations. Benefits: targeted at applications with larger sections of serial work1; upgrade path from Knights Corner as a PCIe card.
1 Projections based on early product definition and as compared to prior-generation Intel® Xeon Phi™ coprocessors.
2 Based on Intel internal analysis. Lower power based on power consumption estimates between (2) HCAs compared to 15W additional power for KNL-F. Higher density based on removal of PCIe slots and associated HCAs populated in those slots.
3 Results have been estimated based on internal Intel analysis using estimated theoretical Flops/s for KNL processors, along with estimated system power consumption and component pricing in the 2015 timeframe, and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. See backup for complete system configurations.
34. WSCAD 2016
DDR4
x4 DMI2 to PCH
36 Lanes PCIe* Gen3 (x16, x16, x4)
MCDRAM MCDRAM
MCDRAM MCDRAM
DDR4
TILE:
(up to
36)
Tile IMC (integrated memory controller)EDC (embedded DRAM controller) IIO (integrated I/O controller)
KNL
Package
Enhanced Intel® Atom™ cores based on
Silvermont Microarchitecture
2D Mesh Architecture
Out-of-Order Cores
3X single-thread vs. KNC
ISA
Intel® Xeon® Processor Binary-Compatible (w/Broadwell)
On-package memory
Up to 16GB, ~465 GB/s STREAM at launch
Fixed Bottlenecks
Platform Memory
Up to 384GB (6ch DDR4-2400 MHz)
2VPU
Core
2VPU
Core
1MB
L2
HUB
KNL Architecture Overview
Bi-directional
tile connections
(same bit width
as Xeon core
interconnect)
34
35. WSCAD 2016
F
CONNECTOR
Lower cost: cost adder expected to be lower than (2) adapters or on-board controllers
Lower power: only ~15W TDP adder, which expected to be less than (2) adapters
Higher density: enables denser form factor – no slots, adapters, on-board controllers
Future-ready: sets stage for future hetero clusters (future Intel® Xeon® processor w/ int fabric)
1. KNL with TWO Fabric Adapters
(2) x16 PCIe slots
(2) x16 PCIe lanes
2. KNL w/ TWO on-board controllers
3. KNL-F with Storm Lake
Fabric
controller
Fabric
controller
QSPF
connector
QSPF
connector
1 Based on Intel internal estimates. Lower cost based on expected price delta between KNL and KNL-F processor, compared to two InfiniBand* or Storm Lake HCA via PCIe Express slots. Lower power based on power
consumption estimates between (2) HCAs (~20W)compared to 15W additional power for KNL-F over a comparable KNL processor.. Higher density based on removal of PCIe slots and HCAs populated in those slots.
QSPF
connector
QSPF
connector
QSFP
module
Same socket for KNL and KNL-F
Design common platform with keep-out zone
and to support additional 15W TDP.
KNL-F Benefits:1
Why KNL-F? (Integrated Fabric)
Dual-Port
100 GB/s bi-
directional
35
36. WSCAD 2016
Integrated Fabric CPU Requirements
Components required to support a CPU with integrated fabric (two ports):
- (1) IFP (Internal-to-Faceplate Processor) cable supporting two ports
- (1) 2-port "carrier card", with two main options:
  - a PCB that plugs into a PCIe slot (aka "PCIe carrier card")
  - a custom OEM PCB with power and sideband cables
- The PCIe "carrier card" implementation requires: PCB, (2) IFT (Internal Faceplate Transition) connectors, (2) IFT cages, and a sideband cable
- Each port requires: (1) IFT connector and (1) IFT cage
On the 2-port PCIe carrier board, the sideband cable, IFT connectors, and cages sit on the underside of the card. The IFT carrier card design kit (including BOM and design guide) is now posted on IBL (Doc #558210).
37. WSCAD 2016
Tighter Component Integration
Benefits
Bandwidth
Density
Latency
Power
Cost
Cores
Graphics
Fabric
FPGA
I/O
Memory
Intel® Scalable
System Framework
37
38. WSCAD 2016
Source: IDC 2014 (Worldwide High-Performance Systems Revenue by Applications) and https://software.intel.com/en-us/file/xeonphi-catalogpdf/download
CAE
Geosciences
Weather
Other
Mechanical Design
DCC & Distrib
Defense
University /
Academic
Government Lab
Bio-Sciences
EDA / IT / ISV
Economics /
Financial
Chem
Engineering
Balanced ApplicationsMemory Bandwidth Intensive Compute Intensive
CAE
Altair RADIOSS*
Ansys* Mechanical
Matevo MinFE
SIMULIA Abaqus*
Financial Services
Binomial Options Pricing Model
Binomial SP and DP
BlackScholes Merton Formula
BlackScholes SP and DP
Monte Carlo European Options Pricing
Monte Carlo RND SP and DP
Monte Carlo SP and DP
STAC A2
Xcelerit
Bioinformatics
BLAST
Bowtie 2
Burrows Wheeler Alignment (BWA)
Cryo-EM techniques
MPI-HMMER 2.3
Computational Chemistry
DiRAC Codes
GAMESS
Integral Calculation Library
NEURON
NWChem
Molecular Dynamics
AMBER
BUDE
DL_POLY
GROMACS
LAMMPS
NAMD
Geophysics
ELMER/Ice
SeisSol
SPECFEM3D Cartesian
UTBench
Climate/Weather
ADCIRC
CAMS
CFSv2
COSMO
ECHAM6
HARMONIE
HBM
MPAS
NOAA NIM
WRF
Digital Content Creation
EMBREE
Superresolution processing
Energy
Acceleware* AxRTM
DownUnder GeoSolutions
ISO3DFD
RTM Petrobras
TTI 3DFD
CFD
AVBP
FrontFlow/Blue code
LBS3D
NASA Overflow
OpenFOAM
OpenLB
ROTORSIM
SU2
TAU and TRACE
software.intel.com/XeonPhiCatalog
Intel® Xeon Phi™ Application Catalog
Over 100 applications to date listed as available or in-flight
39. WSCAD 2016
Developer Tools for the Knights Landing Platform
Intel Parallel Studio XE components and supported features in PSXE 2016 Gold:
• Intel® C/C++ and Fortran compilers 16.0: (1) the -xMIC-AVX512 compiler option enables KNL-specific optimizations, including loop optimizations and vectorization; (2) use the Intel® Fortran compiler to build for MCDRAM.
• Intel® Math Kernel Library 11.3: partial optimizations for all major MKL domains (BLAS, FFT, Sparse BLAS, VML, VSL), delivered via AVX-512 optimizations.
• Intel® MPI 5.1.1 and ITAC 9.1: support for the KNL platform and initial performance tuning are part of Intel MPI 5.1.1.
• VTune Amplifier XE 2016 (NDA package): collection on KNL targets (advanced hotspots and custom event collection based on SEP and perf; user API); analysis types for KNL profiling (advanced hotspots with full OpenMP analysis, custom core and uncore events, Intel MPI spins, general exploration); HBM profiling on Xeon with KNL bandwidth modeling.
• Advisor XE 2016 (NDA package): survey analysis for AVX-512 (includes hotspot collection and compiler static data).
• Data Analytics Acceleration Library 2016: includes KNL-specific performance optimizations.
• Intel® Integrated Performance Primitives 9.0: more than 70% of hot-list functions have AVX-512 optimizations.
44. WSCAD 2016
Current State of System Software Efforts in the HPC Ecosystem
THE REALITY: We, the HPC ecosystem, will not be able to get to where we want to go without a major change in system software development.
• With system margins under pressure, there is unwillingness to invest in system software.
• There is a desire to get exascale performance & speed up software adoption of HW innovation.
• Efforts are fragmented across the ecosystem: “Everyone is building their own solution.”
• New, complex workloads (ML, Big Data, etc.) drive more complexity into the software stack.
45. WSCAD 2016
Desired Future State
Stable HPC system software, built around a shared repository, that:
• fuels a vibrant and efficient HPC software ecosystem
• takes advantage of hardware innovation & drives revolutionary technologies
• eases traditional HPC application development and testing at scale
• extends to new workloads (ML, analytics, big data)
• accommodates new environments (e.g. cloud)
46. WSCAD 2016
Official Members as of 6/1/2016
Goal: A common system software platform for the HPC community that works across
multiple segments and on which ecosystem partners can collaborate and innovate
48. WSCAD 2016
OpenHPC to Intel® HPC Orchestrator system software products
• OpenHPC: an open source community for HPC software. Intel seeded the community with a pre-integrated, pre-tested, and validated HPC system software stack & will continue contributions along with other members of the community.
• Intel will offer Intel-supported products based on the open source OpenHPC software.
• Intel HPC Orchestrator products: premium software, advanced testing, and support.
Intel HPC Orchestrator products are the realization of the software portion of the Intel® Scalable System Framework.
49. WSCAD 2016
Open source accelerating HPC + Big Data
Open standards:
• PBS Pro is now open source
• OpenHPC
• Cloud for HPC
And how about Brazil?
• Intel Innovation Center at Rio – partnership with AMT (www.amt.com.br)
Pay less + ease of use = democratizing HPC for Big Data
51. WSCAD 2016
Intel’s HPC initiatives in Brazil
Code Modernization – open source software
• Modernizing applications to increase parallelism and
scalability
• Leverage cores, caches, threads, and vector capabilities of
microprocessors and coprocessors.
• Current centers in Brazil
52. WSCAD 2016
Intel Modern Code Partner program
Intel Modern Code Partners
Code Modernization – driving developers to develop modern code to modern hardware
• Create Faster Code…Faster
• High Performance Scalable Code
• C++, C, Fortran*, Python* and Java*
• Standards-driven parallel models:
• OpenMP*, MPI, and TBB
• To teach developers how to fully exploit Xeon and Xeon Phi performance: vectors + multi-threading
More at: http://software.intel.com/moderncode
Free HPC & Big Data workshops across Brazil
53. WSCAD 2016
Code Modernization initiatives in the Brazilian HPC Ecosystem
• Oil & Gas – reservoir simulator at PETROBRAS: up to 10.5x performance gains in their reservoir simulator software¹ (white-paper link).
• LNCC – National Laboratory for Scientific Computing, the largest HPC cluster in Latin America: up to 30x performance gain in Oil & Gas applications² (initial results – white-paper link).
• INPE/CPTEC – code modernization of BRAMS: up to 3.4x speedup via AVX (vector instructions); article link.
• Health & Life Sciences – up to 11x speedup in molecular dynamics, NCC/UNESP & LNCC³ (white-paper link):
  – Xeon only, original code vs. modernized code: up to 11x speedup
  – Xeon + 1 Xeon Phi (same optimized code): 1.14x speedup
Authors:
¹ CENPES team and Gilvan Vieira – gilvandsv@gmail.com
² LNCC – Frederico Cabral – fredluiscabral@gmail.com
³ NCC/UNESP – Silvio Stanzani – silvio.stanzani@gmail.com
55. WSCAD 2016
Conclusions
• As with other products, technologies, and services, “lower cost + scale + ease of use” will drive HPC to the masses:
1st wave: near bare-metal in the cloud (lower cost + scale)
2nd wave: frameworks offering “free performance” to unlock insights (usability)
3rd wave: even small and medium businesses will rely on HPC / Big Data to drive business
56. WSCAD 2016
Big Data Analytics
Integrated solutions: HPC && Big Data
[Diagram: a converged stack, layer by layer.]
• Programming model – HPC: FORTRAN / C++ applications with MPI (high performance); Big Data: Python, frameworks, Java* applications, and others on Hadoop* / Spark / others (simple to use).
• Resource manager: an HPC & Big Data-aware resource manager.
• File system: Lustre* with Hadoop* adapter – remote storage that is both compute and big-data capable.
• Hardware infrastructure: scalable performance components – servers, storage (SSDs and burst buffers), and the Intel® Omni-Path Architecture.
*Other names and brands may be claimed as the property of others
57. WSCAD 2016
Next steps for HPC & Big Data: a new paradigm in memory and storage
Today, from compute node through I/O node to remote storage, the hierarchy is: processor caches → local memory → SSD storage → parallel file system (hard-drive storage), with bandwidth increasing and latency and capacity decreasing toward the processor.
In the future: processor caches → in-package high-bandwidth memory* → non-volatile memory → burst-buffer storage → parallel file system (hard-drive storage):
• Some remote data moves onto the I/O node.
• I/O-node storage moves to the compute node.
• Local memory is now faster & in the processor package.
*cache, memory, or hybrid mode
58. WSCAD 2016
Conclusions
A holistic architectural approach is required.
[Diagram: performance/capability grows over time through innovative technologies and tighter integration of compute, memory, fabric, and storage (cores, graphics, fabric, FPGA, I/O, memory), together with system software and modernized application code from the community, ISVs, and proprietary system vendors.]
60. WSCAD 2016
Intel® Modern Code Developer Community – a global online community
software.intel.com/moderncode
Topics:
- Vectorization / Single Instruction, Multiple Data (SIMD)
- Multi-threading
- Multi-node / clustering
- Taking advantage of on-package high-bandwidth memory
- Increasing memory and power efficiency
Developer zone:
- Modern Code Zone
- Software tools, training webinars
- How-to guides, parallel programming BKMs
- Remote access to hardware
- Support forums
Experts:
- Black Belts & Intel engineer experts
- Technical content, training: webinars, F2F, forum support
- Conferences and tradeshows: keynotes, presentations, BOFs, demos, tutorials
61. WSCAD 2016
Machine/Deep Learning | Resources
Training Classes:
U.Oxford Class on Deep Learning
Stanford Class on Machine Learning
Google Class on Deep Learning
Intel Caffe Repo: (Support for Multi-node Training)
https://github.com/intelcaffe/caffe
Spark MLLib Repo:
http://spark.apache.org/mllib/
Intel Machine Learning Blog Posts:
Myth Busted - CPUs and Neural Network Training
Caffe Scoring on Xeon Processors
Caffe Training on Multi-node Distributed Memory Systems
Trusted Analytics Platform:
http://trustedanalytics.org/
Performance Libraries:
MKL for Neural Networks - Technical Preview
Math Kernel Library
MKL Community License
Data Analytics Acceleration Library