Andi Bahar's presentation material from the seminar Protecting Mission-Critical Application Against Downtime at Grha Datacomm, Jakarta, Thursday, March 8, 2018
SQL on Hadoop benchmarks using TPC-DS query set (Kognitio)
Sharon Kirkham, VP Analytics & Consulting at Kognitio, ran the TPC-DS query set on Impala, SparkSQL and Kognitio to test the speed, reliability and concurrency of different SQL-on-Hadoop solutions. Standard Hive was originally investigated as part of this benchmark, but its limited SQL support and poor single-threaded performance meant it was removed.
Big Data Day LA 2016/ Use Case Driven track - From Clusters to Clouds, Hardwa... (Data Con LA)
Today’s Software Defined environments attempt to remove the weaknesses of computing hardware from the operational equation. There is no doubt that this is a natural progression away from overpriced, proprietary compute and storage layers. However, even at the heart of any Software Defined universe is an underlying hardware stack that must be robust, reliable and cost-effective. Our 20+ years of experience delivering over 2,000 clusters and clouds has taught us how to properly design and engineer the right hardware solution for Big Data, cluster and cloud environments. This presentation will share this knowledge, allowing users to make better design decisions for any deployment.
Part 3: Models in Production: A Look From Beginning to End (Cloudera, Inc.)
The document discusses the different roles involved in developing machine learning models from beginning to end. It describes the typical workflow as including data engineering to prepare data, exploratory data science to develop models, and operational model deployment to production applications. It provides examples of tasks for each role such as data engineers ingesting and transforming sensor data, data scientists building and evaluating predictive models, and model deployment engineers validating models and creating APIs.
Big Data Day LA 2016/ Use Case Driven track - Reliable Media Reporting in an ... (Data Con LA)
OnPrem Solution Partners worked with NBCU to profile in-house data to determine data quality, and recommend process and quality improvements. We present our process for data import, improvements we want to make, and lessons learned regarding various tools used, including MariaDB, ElasticSearch, Cassandra, and others.
The document discusses how Sparklyr allows data scientists to access and work with data stored in Cloudera Enterprise using the popular RStudio IDE. It describes the challenges data scientists face in accessing secured Hadoop clusters and limitations of notebook environments. Sparklyr integration with RStudio provides a familiar environment for data scientists to access Hadoop data and compute using Spark, enabling distributed data science workflows directly in R. The presentation demonstrates how to analyze over a billion records using Spark and R through Sparklyr.
The document discusses optimizing a data warehouse by offloading some workloads and data to Hadoop. It identifies common challenges with data warehouses like slow transformations and queries. Hadoop can help by handling large-scale data processing, analytics, and long-term storage more cost effectively. The document provides examples of how customers benefited from offloading workloads to Hadoop. It then outlines a process for assessing an organization's data warehouse ecosystem, prioritizing workloads for migration, and developing an optimization plan.
Debunking Common Myths of Hadoop Backup & Test Data Management (Imanis Data)
These slides are from a webinar where Hari Mankude, CTO at Talena, discussed key concepts associated with Hadoop data management processes around scalable backup, recovery and test data management.
Spark in the Wild: An In-Depth Analysis of 50+ Production Deployments-(Arsala... (Spark Summit)
Spark is being used widely in production across different industries and company sizes. The top reasons for choosing Spark included ease of ETL/data pipelines (60% of existing Hadoop users), production machine learning and data science at scale (40% of non-Hadoop users). Nearly all deployments used Spark for ETL and most (over 60%) accessed multiple data sources. While Spark has made progress in usability, configuration and performance tuning remains challenging. Other issues included difficulty sizing environments and collaboration between different roles. Security approaches also need to evolve as data becomes more distributed.
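For a concrete flavor of the dominant ETL use case the survey reports, here is a minimal, hypothetical PySpark sketch that reads from two different sources, joins and aggregates them, and writes a curated dataset; the paths, table layout, and column names are illustrative assumptions, not details from the survey.

```python
# Minimal Spark ETL sketch: extract from two sources, transform, load as Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-pipeline").getOrCreate()

# Source 1: JSON event logs; Source 2: a CSV dimension table (hypothetical paths).
events = spark.read.json("hdfs:///raw/events/")
users = spark.read.option("header", "true").csv("hdfs:///raw/users.csv")

# Transform: keep click events, enrich with user attributes, aggregate per day.
daily_counts = (
    events.filter(F.col("event_type") == "click")
          .join(users, on="user_id", how="left")
          .groupBy("country", F.to_date("ts").alias("day"))
          .count()
)

# Load: write a partitioned Parquet dataset for downstream analytics.
daily_counts.write.mode("overwrite").partitionBy("day").parquet(
    "hdfs:///curated/daily_clicks/"
)
```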
A Community Approach to Fighting Cyber Threats (Cloudera, Inc.)
3 Things to Learn About:
*Infinitely scale data storage, access, and machine learning
*Provide community defined open data models for complete enterprise visibility
*Open up application flexibility while building on a future-proofed architecture
Big Data at Geisinger Health System: Big Wins in a Short Time (DataWorks Summit)
Geisinger Health System is well known in the healthcare community as a pioneer in data and analytics. We have had an Electronic Health Record (EHR) since 1996, and an Electronic Data Warehouse (EDW) since 2008. Much of daily and weekly operational reporting, as well as an abundance of ad hoc analytics, come from the EDW.
Approximately 18 months ago, the Data Management team implemented Hadoop in the Hortonworks Data Platform (HDP), and successes in implementation and development have proven to the organization that we should abandon the traditional EDW in favor of the Big Data (HDP) platform.
In less than 18 months, we stood up the platform, created a data ingestion pipeline, duplicated all source feeds from the EDW into HDP, and had several analytics developed with HDP and Tableau. Furthermore, we have exploited the new capabilities of the platform, where we use Natural Language Processing (NLP) to interrogate valuable (but previously hidden) clinical notes. The new platform has data that is modeled and governed, setting the stage to push Geisinger Health System from a pioneer to a leader in Big Data and Analytics.
This session will focus on Hortonworks Data Platform, covering data architecture, security, data process flow, and development. It is geared toward Data Architects, Data Scientists, and Operations/I.T. audiences.
How does SolrCloud ensure that replicated data remains consistent? How does Solr avoid data loss when hardware inevitably fails? In this talk, we will cover how Solr addresses failures and what recovery steps the cluster can automatically perform.
BigDataBx #1 - Atelier 1 Cloudera Datawarehouse Optimisation (Excelerate Systems)
The document discusses how Cloudera can optimize an enterprise data warehouse. It addresses challenges with existing complex architectures that include many specialized systems and data silos. This leads to issues with visibility, time to access data, and high costs of analytics. Cloudera proposes solutions like using their platform for a multi-workload analytic environment, active data archiving at a tenth the cost, faster and cheaper data transformations, and self-service business intelligence. Case studies show customers saving tens of millions through solutions like offloading processing, avoiding expansion costs, and getting insights from more extensive data exploration.
New Performance Benchmarks: Apache Impala (incubating) Leads Traditional Anal... (Cloudera, Inc.)
Recording Link: http://bit.ly/LSImpala
Author: Greg Rahn, Cloudera Director of Product Management
In this session, we'll review the recent set of benchmark tests the Apache Impala (incubating) performance team completed that compare Apache Impala to a traditional analytic database (Greenplum), as well as to other SQL-on-Hadoop engines (Hive LLAP, Spark SQL, and Presto). We'll go over the methodology and results, and we'll also discuss some of the performance features and best practices that make this performance possible in Impala. Lastly, we'll look at some recent advancements in Impala over the past few releases.
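As one concrete example of the kind of best practice such sessions cover, Impala's cost-based planner benefits from table statistics gathered with COMPUTE STATS. Below is a small, hypothetical sketch using the impyla Python client; the host, port, and TPC-DS-style table name are assumptions for illustration, not from the benchmark report.

```python
# Sketch: compute table statistics, then run an analytic query against Impala.
from impala.dbapi import connect  # pip install impyla

conn = connect(host="impala-coordinator.example.com", port=21050)
cur = conn.cursor()

# Gather table and column stats used by Impala's cost-based planner.
cur.execute("COMPUTE STATS store_sales")

# Run an analytic query; EXPLAIN would show the stats-informed plan.
cur.execute("""
    SELECT ss_store_sk, SUM(ss_net_paid) AS revenue
    FROM store_sales
    GROUP BY ss_store_sk
    ORDER BY revenue DESC
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```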
Building a scalable analytics environment to support diverse workloads (Alluxio, Inc.)
Data Orchestration Summit 2020 organized by Alluxio
https://www.alluxio.io/data-orchestration-summit-2020/
Building a scalable analytics environment to support diverse workloads
Tom Panozzo, Chief Technology Officer (Aunalytics)
About Alluxio: alluxio.io
Engage with the open source community on slack: alluxio.io/slack
Combat Cyber Threats with Cloudera Impala & Apache Hadoop (Cloudera, Inc.)
Learn how you can use Cloudera Impala to:
- Operate with all data in your domain
- Address cyber security analysis and forensics needs
- Combat fraud, waste, and abuse
Cloudera can help optimize Splunk deployments by providing more cost-effective scalability, increased data flexibility, and enhanced analytics capabilities. Cloudera can ingest data from Splunk indexes and apply enrichment using open-source machine learning before storing the data in its data hub. This provides a single platform for advanced analytics like SQL and Python/R scripts across both historical and new data. Initial use cases include offloading event data from Splunk to reduce costs and loading additional context sources to gain better insights.
The Big Picture: Learned Behaviors in Churn (Cloudera, Inc.)
The Big Picture webinar series explores how industries define their strategies for understanding their consumers better using data. From delivering better healthcare to smarter product and service recommendations, data is the foundation fueling success. Cloudera is a modern platform that gives analytic access to users who need to understand their customers across multiple touch points and multiple enterprise systems. Cloudera not only unlocks the promise of a true customer 360 but also leverages advanced capabilities for data science and machine learning.
In this webinar, we take a look at how data scientists can leverage Cloudera to identify and predict a common customer loyalty use case in telecommunications. We will explore the data, design our features, and then leverage Apache Spark to help us make predictions and assess the accuracy of our findings, all within a secure and collaborative environment utilizing the Cloudera Data Science Workbench.
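As a rough illustration of the workflow described above, here is a minimal, hypothetical sketch using Spark ML (PySpark): assemble usage features, fit a churn classifier, and evaluate it on a held-out split. The input path, column names, and model choice are assumptions for illustration, not details from the webinar.

```python
# Sketch of a telecom churn model: feature assembly + logistic regression.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("churn-demo").getOrCreate()
df = spark.read.parquet("hdfs:///telecom/subscribers/")  # label column: churned (0/1)

assembler = VectorAssembler(
    inputCols=["minutes_used", "dropped_calls", "support_tickets", "tenure_months"],
    outputCol="features",
)
lr = LogisticRegression(featuresCol="features", labelCol="churned")
train, test = df.randomSplit([0.8, 0.2], seed=42)

model = Pipeline(stages=[assembler, lr]).fit(train)
predictions = model.transform(test)

# Area under ROC as a quick quality proxy for the binary churn label.
auc = BinaryClassificationEvaluator(labelCol="churned").evaluate(predictions)
print(f"Test AUC: {auc:.3f}")
```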
1) HPE InfoSight provides predictive analytics through artificial intelligence to help eliminate unplanned downtime and optimize infrastructure performance.
2) It collects telemetry data from across the data center stack, including over 250B sensor values daily, and uses this to detect issues before customers notice them and to prevent problems across installations.
3) The platform provides recommendations to optimize resources, improve performance, and avoid future issues through its global learning and predictive models.
Cray cluster supercomputers are designed to enable high value Hadoop by providing a turnkey solution that supports more data, users, and complexity with reliable performance. The solution integrates Hadoop with Cray's high performance computing technologies and expertise to provide a validated, optimized system that delivers rapid return on investment without surprises.
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench (NOVA DATASCIENCE)
This document discusses Cloudera's Data Science Workbench (CDSW) product. It begins with an introduction and agenda. It then discusses challenges with data science projects and how CDSW aims to help by providing a shared platform for data access, analytics and model deployment. The document outlines CDSW's architecture built on Docker and Kubernetes. It demonstrates CDSW's capabilities and integrations with Cloudera's Data Hub platform before concluding with information about Cloudera's research team.
DRAFT - Enterprise Data and Analytics Architecture Overview for Electric Utility (Prajesh Bhattacharya)
This document provides an overview of an enterprise data and analytics architecture for an electric utility. It describes a data warehouse containing only production reporting data, a data lake containing all data, and a data translation layer. It discusses implementing the architecture in phases, beginning with new projects and eventually converting existing systems. The data lake is proposed to handle increased data loads and allow for analytics like machine learning. Various analytics tools are described for production, project, and ad-hoc use cases.
Enterprise Data and Analytics Architecture Overview for Electric Utility (Prajesh Bhattacharya)
How would you go about creating an enterprise data and analytics architecture for an electric utility that 1) will be relevant in the long run, 2) will be easy to implement, and 3) will start bringing value to the organization fairly quickly? What will the components be? Who will the users be? The operation of electric utilities will change significantly by 2025. How will you future-proof the architecture?
Transforms Document Management at Scale with Distributed Database Solution wi... (DataStax Academy)
SpringCM will discuss how they achieve massive scalability and blazing performance with DataStax Enterprise and HP Moonshot Solutions to transform enterprise document management forever. They will dive into how they broke through their capacity limits to run millions of workloads per hour, why relational database technologies simply cannot handle the scalability demands of document management SaaS, and how they deliver up to 70% more energy savings than a traditional rack server architecture.
Webinar: The Performance Challenge: Providing an Amazing Customer Experience ... (DataStax)
The document discusses challenges with cloud applications and provides an overview of DataStax Enterprise (DSE) as a solution. Key points include: DSE is based on Apache Cassandra and provides multiple data models, extensions for production use, and management tools. It addresses challenges like performance, scalability, and availability. The latest DSE 5.0 release adds support for graph and improves development and management experiences. Real-world customer examples needing massive scale are also presented.
3 Things to Learn About:
-How Kudu is able to fill the analytic gap between HDFS and Apache HBase
-The trade-offs between real-time transactional access and fast analytic performance
-How Kudu provides an option to achieve fast scans and random access from a single API (see the sketch below)
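The sketch below illustrates that "single API" point with the Apache Kudu Python client (kudu-python): the same client performs a keyed, random-access insert and a full analytic scan. The master address and table name are assumptions for illustration.

```python
# Sketch: random access and analytic scans through one Kudu client.
from datetime import datetime

import kudu  # pip install kudu-python
from kudu.client import Partitioning

client = kudu.connect(host="kudu-master.example.com", port=7051)

# Define a simple schema with a primary key (this is what enables random access).
builder = kudu.schema_builder()
builder.add_column("key").type(kudu.int64).nullable(False).primary_key()
builder.add_column("ts_val", type_=kudu.unixtime_micros, nullable=False)
schema = builder.build()

partitioning = Partitioning().add_hash_partitions(column_names=["key"], num_buckets=3)
client.create_table("python-example", schema, partitioning)

table = client.table("python-example")
session = client.new_session()

# Random access: insert (and update/delete) individual rows by key.
session.apply(table.new_insert({"key": 1, "ts_val": datetime.utcnow()}))
session.flush()

# Fast analytic scan: read the table back through the same client.
scanner = table.scanner()
print(scanner.open().read_all_tuples())
```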
1) Traditional server architectures struggle to handle growing data volumes whose value declines rapidly over time. In-memory computing is needed for performance, but current approaches like sharding data reduce accuracy.
2) Software-defined servers address these issues by allowing servers to expand in size on demand using standard hardware. This provides in-memory performance at large scale with self-optimizing servers that require no changes to applications or operating systems.
3) TidalScale uses machine learning to transparently optimize resource allocation across multiple physical servers operating as a single large virtual machine. This provides up to 200x faster performance than a single server for memory-intensive workloads like machine learning.
Java EE 7 with Apache Spark for the world's largest credit card core systems, ... (Rakuten Group, Inc.)
Financial industry companies need Java EE to power their business today. Rakuten Card, one of the largest credit card companies in Japan, adopted Java EE 7 for its credit card core systems architecture, migrating from one of the oldest COBOL-based mainframes in Japan. Additionally, we chose Apache Spark as a super-rapid batch execution platform. We completed this big core system migration project successfully.
You will learn why we chose Java EE and Apache Spark for super-rapid batch execution, along with our experiences and the lessons we learned. How do you start such a big project? Why did we choose this stack, how did we port the system, how did we use Apache Spark for performance improvements, and how did we launch? We'll answer these questions and any that you may have.
Additionally, we are going to unveil our future roadmap for expanding our systems with cutting-edge technology and standards.
Fundamentals of Big Data, Hadoop project design, and a case study/use case.
General planning considerations and the essentials of the Hadoop ecosystem and Hadoop projects.
This provides the basis for choosing the right Hadoop implementation, integrating Hadoop technologies, driving adoption and creating an infrastructure.
Building applications using Apache Hadoop is illustrated with a real-life use case of Wi-Fi log analysis (a minimal sketch follows).
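As a taste of that Wi-Fi use case, here is a minimal, hypothetical PySpark sketch that parses access-point logs from HDFS and ranks the busiest access points; the log path and the space-separated line format ("timestamp ap_id client_mac event") are assumptions, not details from the material.

```python
# Sketch: parse Wi-Fi access-point logs and rank APs by unique connected clients.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("wifi-log-analysis").getOrCreate()

logs = spark.read.text("hdfs:///logs/wifi/")  # one raw log line per row

parsed = logs.select(
    F.split("value", " ").getItem(0).alias("timestamp"),
    F.split("value", " ").getItem(1).alias("ap_id"),
    F.split("value", " ").getItem(2).alias("client_mac"),
    F.split("value", " ").getItem(3).alias("event"),
)

# Busiest access points by distinct clients with successful connection events.
(parsed.filter(F.col("event") == "CONNECT")
       .groupBy("ap_id")
       .agg(F.countDistinct("client_mac").alias("unique_clients"))
       .orderBy(F.desc("unique_clients"))
       .show(10))
```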
Postgres Vision 2018: How to Consume your Database Platform On-premises (EDB)
The usual model for an on-premises database platform is to run it the way IT is usually operated: siloed, and capital- and labor-intensive. In the cloud, consumption means that you pay for what you use, with less heavy lifting to operate the platform. Presented at Postgres Vision 2018, this covers how HPE can deliver EDB Postgres in the data center or on the edge in a consumption model that is pay-per-use, elastic IT, operated for you, migrated, and integrated.
The document discusses modernizing a data warehouse using the Microsoft Analytics Platform System (APS). APS is described as a turnkey appliance that allows organizations to integrate relational and non-relational data in a single system for enterprise-ready querying and business intelligence. It provides a scalable solution for growing data volumes and types that removes limitations of traditional data warehousing approaches.
The document describes the BlueDBM architecture, which aims to optimize performance for big data analytics workloads by using FPGAs to manage flash storage and network communication in order to enable in-network processing and reduce data movement between system components. The prototype system demonstrates low latency and high bandwidth by removing data transport bottlenecks. Future work includes improving the flash controller, developing a distributed file system accelerator, and exploring medical and Hadoop applications.
HPE Hadoop Solutions - From use cases to proposal (DataWorks Summit)
Hadoop now does much more than storage and MapReduce, and it keeps improving and innovating. It brings near-real-time, interactive and cost-efficient capabilities to Big Data.
Join us to hear about solutions based on Hadoop, how they respond to specific customer needs, which components from the Hadoop ecosystem they use, and which HPE Reference Architectures for the platform they are based on.
Hadoop solutions such as ETL offloading, predictive analytics, ad hoc query, complex event processing, stream processing, search, machine learning, deep learning, …
Based on software components such as Spark, Hive, HBase, Kafka, Storm, Flume, Impala and Elasticsearch.
Speaker
John Osborn, SA, Hewlett Packard Enterprise
Modern data management using Kappa and streaming architectures, including discussion by eBay's Connie Yang about the Rheos platform and the use of Oracle GoldenGate, Kafka, Flink, etc.
The document discusses the challenges of handling massive data volumes from various sources and the need for big data analytics platforms to manage, store, and analyze this data. It then describes the key requirements of an effective big data analytics solution, such as managing huge data volumes, delivering fast analytics, supporting legacy tools and data scientists, and providing advanced analytics capabilities. The remainder of the document focuses on introducing the HPE Vertica Analytics Platform as a next-generation big data analytics solution that can scale limitlessly, perform analytics very fast, and be deployed on-premises, on Hadoop, or in the cloud.
Analytics and Lakehouse Integration Options for Oracle Applications (Ray Février)
The document discusses various options for extracting data from Oracle Fusion and Oracle EPM Cloud applications for analytics purposes. It outlines using the Business Intelligence Cloud Connector (BICC) to extract data to object storage, which can then be loaded into Oracle Analytics Cloud (OAC) or Autonomous Data Warehouse (ADW) for analysis. For EPM Cloud, it notes using the EPM Automate REST API wrapper or Oracle Data Integrator Marketplace connector. The document provides an overview of tools like OAC, ADW, ODI, and OCI Data Integration that can help transform and model the data for analytics and machine learning.
Differentiate Big Data vs Data Warehouse use cases for a cloud solution (James Serra)
It can be quite challenging keeping up with the frequent updates to the Microsoft products and understanding all their use cases and how all the products fit together. In this session we will differentiate the use cases for each of the Microsoft services, explaining and demonstrating what is good and what isn't, in order for you to position, design and deliver the proper adoption use cases for each with your customers. We will cover a wide range of products such as Databricks, SQL Data Warehouse, HDInsight, Azure Data Lake Analytics, Azure Data Lake Store, Blob storage, and AAS as well as high-level concepts such as when to use a data lake. We will also review the most common reference architectures (“patterns”) witnessed in customer adoption.
Deploying Apache Spark and testing big data applications on servers powered b... (Principled Technologies)
To get the most out of the heaps of data your company is sitting on, you’ll need a platform such as Apache Spark to sort through the noise and get meaningful conclusions you can use to improve your services. If you need to get results from such an intense workload in a reasonable amount of time, your company should invest in a solution whose power matches your level of work.
This proof of concept has introduced you to a new solution based on the AMD EPYC line of processors. Based on the new Zen architecture, the AMD EPYC line of processors offers resources and features worth considering. In the Principled Technologies datacenter, we set up a big data solution consisting of whitebox servers powered by the AMD EPYC 7601—the top-of-the-line offering from AMD. We ran an Apache Spark workload and tested the solution with three components of the HiBench benchmarking suite. The AMD systems maintained a consistent level of performance across these tests.
This document discusses business continuity challenges related to increasing data growth and insufficient data protection solutions. It presents Microsoft solutions for addressing these challenges, including Azure Site Recovery for orchestrated replication and recovery across on-premises and Azure environments. The solutions aim to automate processes, eliminate tape management, increase protection breadth and depth, and provide testable disaster recovery.
Virtualizing SAP implementations can significantly reduce hardware costs, increase server utilization rates, and ease the burden of system upgrades and platform migrations. Virtualization allows organizations to streamline testing environments, accelerate upgrade planning, and meet availability and performance SLAs. It enables dynamic, on-demand provisioning of resources to build a more flexible and cost-effective IT infrastructure.
PerfCap offers an integrated performance and capacity planning software solution called PAWZ. PAWZ collects data from nodes, analyzes performance trends, and uses modeling to predict capacity needs and identify systems at risk of saturation. It helps answer questions like how much workload growth an existing configuration can support and what configuration changes would enable more growth. A case study showed PAWZ accurately modeled an Itanium system and identified hardware options to support 200% workload growth. PAWZ automates the capacity planning process.
Cephalocon APAC 2018
March 22-23, 2018 - Beijing, China
Lars Marowsky-Brée, SUSE Distinguished Engineer and Ceph Advisory Board member
Marc Koderer, SAP OpenStack Evangelist
This document discusses Hortonworks and its mission to enable modern data architectures through Apache Hadoop. It provides details on Hortonworks' commitment to open source development through Apache, engineering Hadoop for enterprise use, and integrating Hadoop with existing technologies. The document outlines Hortonworks' services and the Hortonworks Data Platform (HDP) for storage, processing, and management of data in Hadoop. It also discusses Hortonworks' contributions to Apache Hadoop and related projects as well as enhancing SQL capabilities and performance in Apache Hive.
SQL Server 2016 provides a consistent platform for hybrid cloud environments with built-in in-memory capabilities, high performance, and enterprise-grade security and availability features. New capabilities in SQL Server 2016 include enhanced AlwaysOn availability groups for increased scalability, manageability and failover support. The document discusses SQL Server 2016's position as a leader in key industry analyses and outlines new features in high availability, in-memory technologies, and mobile and hybrid cloud capabilities.
Redefining ETL Pipelines with Apache Technologies to Accelerate Decision-Maki... (Eran Chinthaka Withana)
Pharmaceutical and medical device makers spend over $130bn each year collecting and analyzing new data, mostly through clinical trials. It costs over $1.8bn to bring a new drug to market, and over $4bn when factoring in the cost of failures. By more efficiently understanding and analyzing this data, new drugs can reach patients quicker, safer, and at a lower cost.
In this presentation, Eran will discuss how ETL pipelines can be built using the Apache and other open source projects to improve clinical trial development. We will examine how the system is built, the challenges we faced and how we are able to reduce cost, accelerate execution time, and improve results. We will also demonstrate how reliable resource allocation, scalable data ingestion adapters, on-demand and fault tolerant job deployments, and monitoring benefit clinical trial decision-making and execution.
RainStor Isilon EMC - Examine the Real Cost of Storing & Analyzing Your M... (RainStor)
This document discusses the real costs of storing and analyzing big data. It summarizes that unstructured data is growing rapidly, with over 70% growth expected between 2013 and 2017. Hadoop has become a popular platform for big data analytics. The document examines the total cost of ownership for big data storage and finds that costs can be significantly reduced by using scale-out NAS solutions like EMC Isilon combined with RainStor's analytical archive software. Case studies show banks and financial institutions saving over 90% on storage costs and getting faster query performance using this approach.
Similar to How HPE 3PAR Can Help Your Mission Critical on Cloud : Seminar Protecting Mission-Critical Application Against Downtime (20)
If you are an application developer, you certainly want application development to be easy and to reach customers. Building a CI/CD pipeline is common practice in application development, serving as a bridge between development and operations so that work stays well organized. A CI/CD pipeline can drive the application development process while reducing risk at every stage of development. CI/CD also helps developers and testers release and update applications faster and more safely, because CI/CD runs in a structured environment. Although each step of a CI/CD pipeline can be executed manually, the true value of a CI/CD pipeline is realized through automation.
A cloud-native development platform can empower you to respond to market trends and quickly turn ideas into products and services.
Cloud Native technologies are used to develop applications built with services packaged in containers, deployed as microservices, and managed on elastic infrastructure through agile DevOps processes and continuous delivery workflows.
Disaster Recovery Cookbook - Secret recipes for hybrid-cloud success.
In the digital era, organizations must depend on systems to operate, but they sometimes face downtime and data loss, which are expensive threats.
In this webinar you will learn how to select the right solution to protect, move, and recover mission-critical applications with near-0% data loss in a cost-effective model.
Converting Your Existing SAP Server Infrastructure to a Modern Cloud-Based Ar... (PT Datacomm Diangraha)
Achieve maximum productivity by running your SAP systems on the first and only local infrastructure certified directly by SAP.
Learn how to embark on digital transformation with minimal complexity and high flexibility, yet with a competitive TCO, on Datacomm Cloud.
The document discusses the history of industrial revolutions from the first to the fourth revolution. It outlines the key technologies and transformations that occurred in each revolution. The first revolution involved mechanization powered by steam engines. The second revolution brought about mass production using assembly lines and electricity. The third revolution saw the rise of automation and computer technology. The fourth and current revolution is characterized by cyber-physical systems, cloud computing, additive manufacturing, the internet of things, autonomous robots, and enhanced cybersecurity.
This document discusses driving digital transformation through a future-proof digital platform. The platform allows organizations to rapidly create new value from applications, gain insights from data, and enable business innovation and continuity. It reduces costs while helping organizations become platform companies and develop new revenue streams. The platform connects internal and external systems and data to power new applications and insights in real-time. It also helps organizations address challenges of accelerating growth versus maintaining existing systems, and achieving agile transformation versus dealing with non-optimized cloud and on-premise systems.
This document discusses implementing mobile solutions across various business functions. It outlines plans to deploy basic QAD modules, integrate systems, and perform business analysis. Mobile applications are highlighted for sales, finance, and production functions to provide mobile access from tablets and laptops. Fixed asset auditing will also utilize a mobile app. The overall message is that "Mobile is now everything" and mobility needs to be a key part of the business strategy.
Digital twins represent physical objects in the digital world through 3D models and data from IoT sensors, CAD files, and product lifecycle management systems. They support Industry 4.0 goals by enabling connectivity between physical machines and digital representations to optimize operations and provide instructions to service technicians through augmented reality. Caterpillar aims to connect people and products with their IoT and AR strategy to make split-second decisions across entire mine sites. The Bosch Rexroth CytroPac hydraulic power unit is a revolutionary Industry 4.0 solution that monitors smart, connected products and visualizes digital insights from those products in their physical environment.
Drive Your Company Digital Transformation
Create Live Business by establishing a seamless, connected, and data-driven digital platform to rapidly create value from applications, gain insights, and enable live business. Migrate your SAP infrastructure to the cloud to accelerate growth and innovation, drive IT transformation, set up processes quickly and reduce costs and risks.
SAP on Datacomm Cloud offers a fully managed cloud service for SAP infrastructure to simplify maintenance and operations. It addresses challenges such as managing complexity, capacity, security, and availability. The service provides infrastructure as a service, platform as a service, software as a service, backup and disaster recovery, and full management of servers, storage, networking, and applications. Customers benefit from pay-as-you-use pricing, 99.9% service level agreements, and a single point of contact for support.
The document describes different approaches to hosting web applications and services, including a monolithic approach, a microservices approach, and a containerization approach using Kubernetes for container orchestration. It shows illustrations of a monolithic application with static content, REST APIs, and various services and databases hosted together on one server. It then shows how those components can be broken out into separate microservices and containers that are orchestrated by Kubernetes for improved scalability, availability, and maintenance.
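To make the orchestration step concrete, here is a minimal sketch using the official Kubernetes Python client to declare one of those microservices (say, the REST API) as a Deployment that Kubernetes keeps scaled out; the image name, labels, port, and replica count are hypothetical, not taken from the document.

```python
# Sketch: declare a microservice as a Kubernetes Deployment with three replicas.
from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()  # use the current kubeconfig context

deployment = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="rest-api"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # Kubernetes restores this count if a pod or node fails
        selector=client.V1LabelSelector(match_labels={"app": "rest-api"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "rest-api"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="rest-api",
                        image="registry.example.com/rest-api:1.0",
                        ports=[client.V1ContainerPort(container_port=8080)],
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```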
Kubernetes as a Service provides benefits to executives, operators, and developers. For executives, it allows organizations to deliver new software and features more quickly, enabling faster time to market and multi-cloud operations for agility and resilience. Operators see more productive development teams with fewer impediments to development and faster software deployment times. Developers experience easier multi-cloud operations, more services delivered with less infrastructure, and a reduction in manual work and operator errors.
This document provides an overview of cloud-native development and Red Hat OpenShift:
- It discusses moving to cloud-native development through optimizing existing applications, developing new applications faster, and automating infrastructure.
- Red Hat OpenShift is positioned as the enterprise solution for running Kubernetes in production, as it addresses limitations of "raw" Kubernetes through features like developer tools, operations automation, and additional services.
- New features are highlighted for OpenShift 4.6, including improved application topology and monitoring, a new log forwarding API, and enhancements to the developer experience.
The document discusses how cloud computing can help Indonesia achieve its goal of becoming Indonesia 4.0 by facing the pandemic with technology. It outlines Datacomm's cloud and security products and services, how they can help SMEs overcome challenges in adopting digital technologies, and examples of technologies that can solve social issues like telehealth.
Sutedjo gives a brief overview of the SAP ERP system, along with the benefits it delivers, such as cutting inventory costs by 20%, speeding up business processes by up to 50% and reducing TCO by up to 30%. Sutedjo also explains how responsibility is divided between provider and customer when running SAP on-premise, hosted, or with the entire system in the cloud.
Deri explains the process of migrating SAP from on-premises to the cloud. The material also covers the stages of migration, the priority activities during migration, and the benefits gained once all stages of the migration to the cloud are complete.
Disaster Recovery: Understanding Trend, Methodology, Solution, and Standard (PT Datacomm Diangraha)
Disaster Recovery (DR)
Provides the technical ability to maintain critical services in the event of any unplanned incident that threatens these services or the technical infrastructure required to maintain them.
This document summarizes Disaster Recovery services from Datacomm using Zerto's virtual replication technology. Key points include:
- Datacomm offers DR services hosted in Jakarta and Bandung data centers with 99.9% availability SLA and up to zero data loss protection.
- Zerto's block-level replication technology allows for real-time data protection of both physical and virtual systems with checkpoints in seconds, and recovery of individual VMs or entire application groups.
- Consistency groups ensure related VMs and applications are recovered together at the same recovery point, improving scalability and recoverability compared to individual VM recovery.
- Customers can test disaster recovery with isolated VMs on demand without
Monitoring and Managing Anomaly Detection on OpenShift.pdf (Tosin Akinosho)
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system (a combined Kafka-and-Prometheus sketch follows this list).
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
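To tie items 5 and 8 together, here is a minimal, hypothetical sketch of the monitoring path: consume readings from a Kafka topic and expose anomaly counters for Prometheus to scrape. The topic name, broker address, message schema, and simple threshold rule are illustrative stand-ins for the tutorial's trained model.

```python
# Sketch: Kafka consumer feeding Prometheus metrics for anomaly monitoring.
import json

from kafka import KafkaConsumer  # pip install kafka-python
from prometheus_client import Counter, Gauge, start_http_server  # pip install prometheus-client

READINGS = Counter("readings_total", "Total sensor readings consumed")
ANOMALIES = Counter("anomalies_total", "Readings flagged as anomalous")
LAST_VALUE = Gauge("last_sensor_value", "Most recent sensor value seen")

THRESHOLD = 42.0  # hypothetical anomaly threshold, stand-in for a trained model


def main() -> None:
    # Expose metrics on :8000 so Prometheus can scrape them.
    start_http_server(8000)
    consumer = KafkaConsumer(
        "sensor-readings",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
    )
    for message in consumer:
        value = float(message.value["value"])
        READINGS.inc()
        LAST_VALUE.set(value)
        if value > THRESHOLD:  # flag the reading as anomalous
            ANOMALIES.inc()


if __name__ == "__main__":
    main()
```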
Introduction of Cybersecurity with OSS at Code Europe 2024 (Hiroshi SHIBATA)
I develop the Ruby programming language, as well as RubyGems and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
FREE A4 Cyber Security Awareness Posters - Social Engineering part 3 (Data Hops)
Free A4 downloadable and printable cyber security and social engineering safety training posters. Promote security awareness in the home or workplace. Lock them out. From training provider datahops.com.
A Comprehensive Guide to DeFi Development Services in 2024 (Intelisync)
DeFi represents a paradigm shift in the financial industry. Instead of relying on traditional, centralized institutions like banks, DeFi leverages blockchain technology to create a decentralized network of financial services. This means that financial transactions can occur directly between parties, without intermediaries, using smart contracts on platforms like Ethereum.
In 2024, we are witnessing an explosion of new DeFi projects and protocols, each pushing the boundaries of what’s possible in finance.
In summary, DeFi in 2024 is not just a trend; it’s a revolution that democratizes finance, enhances security and transparency, and fosters continuous innovation. As we proceed through this presentation, we'll explore the various components and services of DeFi in detail, shedding light on how they are transforming the financial landscape.
At Intelisync, we specialize in providing comprehensive DeFi development services tailored to meet the unique needs of our clients. From smart contract development to dApp creation and security audits, we ensure that your DeFi project is built with innovation, security, and scalability in mind. Trust Intelisync to guide you through the intricate landscape of decentralized finance and unlock the full potential of blockchain technology.
Ready to take your DeFi project to the next level? Partner with Intelisync for expert DeFi development services today!
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, a free SAP software asset management tool for customers.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframePrecisely
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Trusted Execution Environment for Decentralized Process MiningLucaBarbaro3
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
Generating privacy-protected synthetic data using Secludy and MilvusZilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
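To make the Milvus side concrete, here is a minimal sketch of storing and searching embeddings with pymilvus's embedded local mode; the collection name, dimension, and random vectors are placeholders, not Secludy's actual pipeline.

```python
# Minimal sketch: storing and searching embeddings with Milvus via
# pymilvus's embedded local mode. All names and data are placeholders.
import random

from pymilvus import MilvusClient

client = MilvusClient("demo.db")  # embedded Milvus Lite database file
client.create_collection(collection_name="synthetic_embeddings", dimension=8)

# Insert a few toy embedding vectors keyed by id.
rows = [
    {"id": i, "vector": [random.random() for _ in range(8)]}
    for i in range(100)
]
client.insert(collection_name="synthetic_embeddings", data=rows)

# Nearest-neighbor search for a query embedding.
query = [random.random() for _ in range(8)]
hits = client.search(
    collection_name="synthetic_embeddings", data=[query], limit=3
)
print(hits)
```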
HCL Notes and Domino license cost reduction in the world of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX models have been a hot topic for many in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new type of licensing works and what benefit it brings you. Above all, you certainly want to stay within your budget and save costs wherever possible. We understand that, and we want to help!
We will explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove redundant or unused accounts to save money. There are also some approaches that can lead to unnecessary expenses, e.g. when a person document is used instead of a mail-in for shared mailboxes. We will show you such cases and their solutions. And of course we will explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It will give you the tools and the know-how to keep an overview. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
Topics covered:
- Reducing license costs by finding and fixing misconfigurations and redundant accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Real-world examples and best practices you can apply immediately
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for the life science domain, where you retrieve information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
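As a rough sketch of that pattern (not the presenter's code): retrieve facts from the knowledge graph, then ground the LLM prompt in them. The Neo4j connection details, schema, and Cypher query below are illustrative assumptions.

```python
# Illustrative GraphRAG sketch: ground an LLM answer in facts retrieved
# from a biomedical knowledge graph. Schema, query, and credentials are
# hypothetical; a live Neo4j instance is assumed.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def retrieve_facts(gene: str) -> list[str]:
    query = (
        "MATCH (g:Gene {symbol: $gene})-[:ASSOCIATED_WITH]->(d:Disease) "
        "RETURN d.name AS disease"
    )
    with driver.session() as session:
        return [rec["disease"] for rec in session.run(query, gene=gene)]

facts = retrieve_facts("TP53")
prompt = (
    "Using only these graph facts, answer the question.\n"
    f"Facts: TP53 is associated with {', '.join(facts)}.\n"
    "Question: Which diseases is TP53 linked to?"
)
# The grounded prompt would then be sent to an LLM of choice.
print(prompt)
```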
How HPE 3PAR Can Help Your Mission Critical on Cloud: Seminar Protecting Mission-Critical Application Against Downtime
1. How HPE 3PAR can help your mission critical on Cloud
Andi Bahar
HPE Country Product Manager - Storage
2. Apps and data
It's driven by a new generation of apps and data:
- New business models
- Improve operational efficiencies
- Increase productivity
- Enhance customer experiences
- Differentiated products and services
- New sources of data and insight (“things”)
3. But I&O Leaders Must Look ‘Beyond the Box’
- Predictive: Predictive analytics to anticipate and prevent issues across the infrastructure stack
- Cloud Ready: Ubiquitous scale with data mobility between systems, sites, and the public cloud
- Timeless: Architectural and financial flexibility to go anywhere with futureproofed platforms and guaranteed availability
4. Business Impact is Severe
You can’t afford an app data gap
5. There is no one cause… and it’s too complex for humans to fix
Issues span the whole stack (storage, network, compute, VM/container/DB): 46% are storage-related and 54% are non-storage, spanning cross-stack best practices, cross-stack resource contention, and cross-stack interoperability
Source: InfoSight analysis of the HPE customer base
6. Overcoming Complexity and Eliminating Anxiety
- Support you actually like: 54% of cases solved outside of storage*
- Global visibility and learning: >99.9999% measured availability*
- Predicts and prevents problems: 86% of issues automatically opened and solved*
7. HPE InfoSight Extends to 3PAR
First release includes:
- Foundation for Predictive Support: Predict problems & automate resolution for 3PAR
- Cloud-based Visibility: Advanced visualizations, system health, and predictive analytics
- VMvision: Cross-stack analytics for virtualized environments
8. Eliminating the Guesswork in Managing Infrastructure
Preemptive recommendations for your data center:
- Prevent issues before they occur. Example: change network setting at port 3 to avoid failover issue
- Improve performance proactively. Example: move VM3 to Host2 as Host1 is oversubscribed
- Optimize available resources. Example: apply QoS to vol1 to improve performance on vol2
9. Prescriptive Insights Beyond Monitoring
Traditional Monitoring… Creates Questions: Should I be concerned? What’s causing it? How do I fix it?
HPE InfoSight… Gives Answers: performance impact scores, correlated performance factors, clear recommendations, automated cases with resolution
(Chart: IOPS and latency over time, with machine learning correlation highlighting the problematic period)
10. Cross-Stack Analytics for VMware Environments
Spanning storage, network, compute, and VMs/DBs/apps:
- Latency Attribution: Identify root cause across host, storage, or SAN
- Noisy Neighbor: Determine if VMs are hogging resources from another VM
- Host & Memory Analytics: Visibility into host CPU and memory metrics
- Top Performing VMs: Visibility into top 10 VMs by IOPS and latency
- Inactive VMs: Visibility into inactive VMs to repurpose/reclaim resources
11. Diagnoses Abnormal Latency with VMVision
(HPE InfoSight VMVision chart: latency in ms from Jan 24 to March 20, 2017, broken down into host, network, and storage contributions; a latency spike on Monday, March 6, 2017 attributes 0.3 to host, 0.13 to network, and 0.24 to storage)
12. Diagnoses Abnormal Latency with VMVision
(Same chart, drilled into the VDI and SharePoint workloads on datastore esxi289-1x: I/O total 5,278,000; average latency 23.31 ms; the spike again attributes 0.3 to host, 0.13 to network, and 0.24 to storage)
13. Takes the Guesswork Out of Planning
Predicts upgrade needs and sizes infrastructure based on applications.
Sizing input (example): Application: Oracle Database; Oracle data (GB): 1600; Apply compression: Yes; Percent reads: 50; …
Sizing recommendation:
- AF5000, quantity 1: Recommended
- AF7000, quantity 1: Recommended
- AF1000, quantity 2: Not Recommended
- …
HPE InfoSight answers:
- How has my data usage trended?
- When am I going to run out of capacity?
- What’s the right SKU that I need for Oracle?
- What if I ran these apps... on the same array?
(Capacity forecast: likely to exceed threshold Sept 25)
14. Acceleration of established and emerging applications
App Acceleration: application workload integration across databases, virtualization, containers, and block/file.
The 3PAR ASIC offloads core data-path functions from the CPU, separating control data from the data payload: inter-node & cache IO, RAID calculations, RAID rebuilds, data deduplication, space reclamation, replication, snapshots, compression, thin provisioning, sub-LUN tiering, and mixed workloads.
Multi-tenant and parallelized IO processing; every controller can access every disk, port, and resource.
#1 for every workload in Gartner’s Critical Capabilities report 2 years in a row.
15. HPE 3PAR data-centric architecture – I/O path flash optimizations
- Adaptive Sparing: One technology to improve flash performance, capacity and endurance
- Express Layout: Drives performance and efficiency to maximize utilization of SSDs
- Persistent Checksum: Guarantee host-to-disk consistency, eliminating data corruption
- System-Wide Striping: All resources are used evenly, from front-end ports through CPUs to SSDs
(Diagram: host servers 1-4 striped through the 3PAR ASIC across all solid state disks)
16. Non-disruptive migration and technology refreshes to 3PAR – Array-level tools to migrate data quickly and easily
- Legacy platforms: Non-disruptively* migrate existing workloads from 3rd-party legacy arrays onto HPE 3PAR using Online Import
- 3PAR Federation: Federate HPE 3PAR systems (single array or federated systems) together to form a single pool of resources for management and provisioning
- Technology refresh: Migrate workloads from existing legacy 3PAR systems onto new 3PAR platforms with no disruption to services, using Peer Motion
17. HPE RMC – Combining the performance of replication with the protection of backups
- Fast: Deliver on SLAs with fast, non-disruptive, application-consistent snapshots, backup and recovery
- Efficient: Reduce cost & complexity with backup direct from 3PAR to StoreOnce
- Simple: Control backup & recovery directly and seamlessly from native hypervisor and application interfaces
- Reliable: Protect applications with the availability of snapshots and the protection of backups
23x faster backup, 15x faster restore*
*compared to traditional server-based backup environments
18. Architected for Efficiency Today & for What’s Next
HPE 3PAR Adaptive Data Reduction: (1) Zero Detect, (2) Deduplication, (3) Compression, (4) Data Packing (only HPE: max system efficiency with near-zero garbage)
Guaranteed data compaction; always in-line and selective; ready for Storage Class Memory; optimize investment
HPE 3PAR 3D Cache – first with SCM: 50% lower latency, 60% increased IOPS, 100% future proof
An extension of DRAM cache based on Storage Class Memory (Intel® Optane™) over NVMe to accelerate flash itself
NEW: HPE 3PAR Flash Now – All-flash
What happens when flash is a given and everyone’s a ‘leader’?
InfoSight delivers 3 primary benefits. It predicts and prevents problems across the stack. It provides customers the visibility they need to quickly resolve complex issues. And it delivers a support experience that customers actually like. Let’s unpack this.
1st. Our goal’s not just to show you that you have a problem and how to resolve it, but to prevent you from having a problem to begin with. As InfoSight analyzes the installed base, it’s predicting and preventing in every customer’s environment. And if it uncovers a problem, it proactively resolves it. These are problems that exist even outside of storage. For instance, we’ve prevented thousands of systems from experiencing issues across the network, servers, and hypervisor.
2nd. InfoSight sees what others can’t, providing you clear insights up and down the infrastructure stack, across your environments, in the past and into the future. Traditional tools cause more trouble than good, requiring you to be your own data scientist and interpret what the data means. Instead, InfoSight has embedded data science and machine learning to just give the right answers. And with all the systems connected in the cloud, the infrastructure gets smarter as systems learn from each other.
3rd. Because of InfoSight, we’ve transformed the support experience... one that customers actually like. InfoSight automates the tasks handled by traditional level 1 and 2 support staff. This has allowed Nimble to build a support organization made up entirely of level 3 experts. So in the rare case that customers need support, in less than a minute they speak directly with a level 3 expert who will quickly resolve the problem, even if it’s outside of storage. Customers can forget the mundane questions and painful escalations.
And the proof’s in the numbers... we’re at six nines across our installed base for all systems, including the first array that shipped | 86% | 54%
HPE InfoSight Tells What to Do to:
(read slide)
So our AI recommendations eliminate the guesswork in managing infrastructure by making decisions for IT that ensure the optimal environment all the time. This paves the path for an autonomous data center: today we tell our customers what to do, and tomorrow we can just do it for them.
InfoSight picks up where monitoring falls short. One of the big limitations with monitoring is that it just creates more questions (read bullets)…
and looking at latency is misleading because different workloads carry different inherent amounts of latency, so you end up chasing after the wrong events and get false alarms…
whereas InfoSight gives answers. It applies a machine learning algorithm that filters the noise and simply highlights problematic periods based on the underlying I/O signatures – and shows what’s causing it and how to resolve it.
Now when it comes to performance diagnostics, you have to have a complete view and be able to correlate storage with other infrastructure systems to diagnose problems. Take for instance latency: it’s easy to identify that there’s excessive latency, but diagnosing the root cause – be it in the host, VMs, network, or array – that’s the challenge. So we’ve built in cross-stack correlations made available through our customer portal, and this gives customers a single source of truth, which enables customers to… (read bullets)
Take for instance this example here with VM latency. Traditional monitoring tools can show you that latency occurred on a time-series graph, but that doesn’t help you diagnose the problem so you can fix it. With VMvision, a feature in the InfoSight customer portal, we pull data from your VMware environment and correlate this with the data we have across the infrastructure to clearly tell customers whether the host, network, or storage array is causing the latency.
But we don’t stop there. Here we’re able to see that the server is contributing the most to the latency, and then drill even further to see that it is the VDI application holding it up. And based on the root cause, customers can improve their application performance by moving to a different datastore, applying QoS controls, optimizing resources, or upgrading components.
InfoSight also takes the guesswork out of planning. It accurately predicts future capacity, performance, and bandwidth needs and right-sizes your new infrastructure through app-centric modeling. Our forecasting models are statistically highly confident, using autoregressive models and Monte Carlo simulations. And planning for new infrastructure is simple, because InfoSight lets customers model different scenarios, applications, users, and performance requirements using a machine learning model that continuously improves.
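The flavor of that approach can be sketched in a few lines. This is a toy model under simplified assumptions (a linear trend as a stand-in for the autoregressive fit), not HPE InfoSight's actual implementation: fit a trend to capacity history, then Monte Carlo-sample residual paths to estimate when usage crosses a threshold.

```python
# Toy capacity forecast: trend fit plus Monte Carlo residual sampling.
# Illustrative only -- not HPE InfoSight's actual models.
import numpy as np

rng = np.random.default_rng(0)
history = np.cumsum(rng.normal(1.0, 0.5, 180)) + 500   # daily used GB
capacity = 750.0

# Fit a linear trend (a simple stand-in for an autoregressive model).
days = np.arange(len(history))
slope, intercept = np.polyfit(days, history, 1)
residual_std = np.std(history - (slope * days + intercept))

# Monte Carlo: simulate future paths and record when each exceeds capacity.
horizon, n_paths = 365, 2000
future = np.arange(len(history), len(history) + horizon)
trend = slope * future + intercept
paths = trend + rng.normal(0, residual_std, (n_paths, horizon)).cumsum(axis=1)
days_to_full = np.argmax(paths > capacity, axis=1)
hit = paths.max(axis=1) > capacity

print(f"median days until full: {np.median(days_to_full[hit]):.0f}")
print(f"probability full within a year: {hit.mean():.0%}")
```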
All of the infrastructure we talk about is simply there to help your business and much of that comes from the applications that power your innovation agenda.
How can the right infrastructure support your application transformation?
First: the underlying architecture
- Built for speed… designed for IO, not for a specific flash medium such as NAND flash
- Built for massive consolidation of mixed application loads… which is what Hybrid IT is
- ASIC to offload processing of many core functions so the system can maintain quality of service
Second: the application ecosystem
- Mode 1 apps… Oracle, HANA, Microsoft, VMware… deep application integration
An example is VMware vVols technology… they picked HPE as the design partner and reference platform
- Mode 2… partners like Docker and Mesosphere, which HPE is actually investing in at a corporate level
3D Cache works as an extension of DRAM cache
and combines HPE 3PAR’s intelligent caching algorithms with Storage Class Memory (in this case Intel® Optane™) and NVMe to deliver extreme application acceleration