Apache Geode (incubating) is the core of Pivotal GemFire, now available as an open source project governed by the Apache Software Foundation Incubator. The legacy of Pivotal GemFire and the ASF community uniquely position Geode as a secret ingredient for modern-day data management architectures.
These architectures require a robust in-memory data grid to handle a variety of use cases, ranging from enterprise-wide caching to real-time transactional applications at scale. In addition, as growth in memory size and network bandwidth continues to outpace that of disk, the importance of managing large pools of RAM at scale increases, and it is essential to innovate at the same pace.
Apache Geode (incubating) has all the right ingredients to do for RAM what HDFS has done for direct-attached disks. The excitement (and funding!) in this area of the big data ecosystem is palpable, and the ASF is where the innovation is happening. Come to this session to understand a brief history of Geode, its architecture and use cases, and its design philosophy and principles, but most importantly: how you too can participate in the in-memory data center revolution.
An Introduction to Apache Geode (incubating) - Anthony Baker
Geode is a data management platform that provides real-time, consistent access to data-intensive applications throughout widely distributed cloud architectures.
Geode pools memory (along with CPU, network and optionally local disk) across multiple processes to manage application objects and behavior. It uses dynamic replication and data partitioning techniques for high availability, improved performance, scalability, and fault tolerance. Geode is both a distributed data container and an in-memory data management system providing reliable asynchronous event notifications and guaranteed message delivery.
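To make that programming model concrete, here is a minimal, hypothetical Java client sketch (not from the talk): it assumes a running Geode cluster with a locator on localhost:10334 and a server-side region named "customers".

import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class GeodeClientSketch {
    public static void main(String[] args) {
        // Connect through a locator; host and port are placeholders for a running cluster.
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)
                .create();

        // A PROXY region keeps no local state; every operation goes to the servers,
        // where the data is replicated or partitioned across members.
        Region<String, String> customers = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("customers");

        customers.put("c-42", "Ada Lovelace");     // write into the grid
        System.out.println(customers.get("c-42")); // consistent read back

        cache.close();
    }
}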
Pivotal GemFire has had a long and winding journey, starting in 2002, winding through VMware and Pivotal, and finding its way to Apache in 2015. Companies using GemFire have deployed it in some of the most mission-critical, latency-sensitive applications in their enterprises, making sure tickets are purchased in a timely fashion, hotel rooms are booked, trades are made, and credit card transactions are cleared. This presentation discusses:
- A brief history of GemFire
- Architecture and use cases
- Why we are taking GemFire Open Source
- Design philosophy and principles
But most importantly: how you can join this exciting community to work on the bleeding-edge in-memory platform.
Next Generation Scheduling for YARN and K8s: For Hybrid Cloud/On-prem Environ... - DataWorks Summit
The scheduler of a container orchestration system, such as YARN or K8s, is a critical component that users rely on to plan resources and manage applications.
If we assess where we are today, YARN effectively has two powerful schedulers (the Fair Scheduler and the Capacity Scheduler), and both serve many strong use cases in the big data ecosystem. YARN can scale up to 50k nodes per cluster, schedule 20k containers per second, and is extremely efficient at managing batch workloads.
The K8s default scheduler is an industry-proven solution for efficiently managing long-running services. But as more big data apps move to K8s and the cloud, many features, such as hierarchical queues for better multi-tenancy, fair resource sharing, and preemption, are either missing or not yet mature enough to support big data apps running on K8s.
At this point, no solution exists that provides a unified resource scheduling experience across platforms, which makes it extremely difficult to manage workloads running in different environments, from on-premise to cloud.
Hence, we are evolving a common scheduler that builds on the capabilities of YARN and K8s and improves on them for the cloud, focusing on use cases like:
Better bin-packing scheduling (and gang scheduling)
Autoscale up and shrink policy management
Effectively run batch workloads and services with clear SLA’s
In summary, as a separate initiative we are improving core scheduling capabilities to manage both K8s and YARN clusters in a cloud-aware way, and the above-mentioned cases will be the core focus of this initiative. More details of our work will be presented in this talk.
Scale Out Your Big Data Apps: The Latest on Pivotal GemFire and GemFire XD - VMware Tanzu
Companies across all industries and sizes are investing in strategic custom applications to enhance their competitive advantages. Developing these applications requires continuous improvement, based on insights gleaned from collecting and analyzing the data that they generate.
Big Data for high-performing, scalable and reliable applications requires a new set of tools and technologies. Pivotal GemFire is a distributed in-memory NoSQL data management solution for creating high-scale custom applications. Pivotal GemFire XD supports structured data as part of the industry’s first Hadoop-based platform for creating closed-loop analytics solutions – enabling businesses to continuously optimize real-time automation in their applications.
In this webinar, we will discuss different open-source models and different ways open source communities are organized. Understanding these key concepts is essential when selecting a strategic open-source platform. We will explore how the PostgreSQL community ensures that it stays independent, remains vibrant, drives innovation, and provides a reliable long-term platform for strategic IT projects.
YARN Containerized Services: Fading The Lines Between On-Prem And Cloud - DataWorks Summit
Apache Hadoop YARN is the modern distributed operating system for big data applications. In Apache Hadoop 3.1.0, YARN added a service framework that supports long-running services. This new capability goes hand in hand with the recent improvements in YARN to support Docker containers. Together these features have made it significantly easier to bring new applications and services to YARN.
In this talk you will learn about YARN service framework, its new containerization capabilities and how it lays the foundation for a hybrid and uniform architecture for compute and storage across on-prem and multi-cloud environments. This will include examples highlighting how easy it is to bring applications to the YARN service framework as well as how to containerize applications.
Here's what to expect in this talk:
- Motivation for YARN service framework and containerization
- YARN service framework overview
- YARN service examples
- Containerization overview
- Containerization for Big Data and non-Big Data workloads - wait, that's everything
How-To: Zero Downtime Migrations from Oracle to a Cloud-Native PostgreSQL - YugabyteDB
Presentation by Rajkumar Sen, Chief Technical Architect, Founder - BlitzzIO, recorded at Distributed SQL Summit on Sept 20, 2019.
https://vimeo.com/362348541
distributedsql.org/
Hadoop {Submarine} Project: Running Deep Learning Workloads on YARN - DataWorks Summit
Deep learning is useful for enterprise tasks in fields such as speech recognition, image classification, AI chatbots and machine translation, just to name a few.
In order to train deep learning/machine learning models, applications such as TensorFlow, MXNet, Caffe and XGBoost can be leveraged, and sometimes these applications are used together to solve different problems.
To make distributed deep learning/machine learning applications easy to launch, manage, and monitor, the Hadoop community has introduced the Submarine project along with other improvements such as first-class GPU support, container-DNS support, scheduling improvements, etc. These improvements make running distributed deep learning/machine learning applications on YARN as simple as running them locally, which lets machine-learning engineers focus on algorithms instead of worrying about underlying infrastructure. With these improvements, YARN can also better manage a shared cluster that runs deep learning/machine learning alongside other services and ETL jobs.
In this session, we will take a closer look at the Submarine project as well as the other improvements, and show how to run these deep learning workloads on YARN with demos. Audience members can start running these workloads on YARN after this talk.
Speakers:
Sunil Govindan, Staff Engineer
Hortonworks
Zhankun Tank, Staff Engineer
Hortonworks
Hive LLAP: A High Performance, Cost-effective Alternative to Traditional MPP ... - DataWorks Summit
At Walmart Labs, we get close to 200 million customers every week across our 11000+ stores & online all over the world. As part of our data lake initiatives, we started a full-fledged migration to Hadoop based solutions for all our data needs at lower cost than traditional RDBMS/MPP solutions. While we have seen significant success in migrating to Hadoop based Data Lake solutions from traditional RDBMS based data warehouses, one challenge that we have faced is around migrating end users to Hadoop due to query latency issues. To solve this problem and to reduce the cost of the solution, Walmart Labs started using Hive LLAP.
In this session, we will introduce you to Hive LLAP, its architecture, best practices for deployment to achieve sub-second query performance and its cost comparison with traditional RDBMS systems for the same use case.
Traditionally, database systems were optimized either for OLAP or for OLTP workloads. Mainstream DBMSs like Postgres and MySQL are mostly used for OLTP, while Greenplum, Vertica, ClickHouse and Spark SQL are oriented toward analytic queries. But today many companies do not want to maintain two different data stores for OLAP and OLTP and need to perform analytic queries on the most recent data. I want to discuss which features should be added to Postgres to efficiently handle HTAP workloads.
#GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Gree... - PivotalOpenSourceHub
In this session we explore a case study of a large-scale government fraud detection program that prevents billions of dollars in fraudulent payments each year leveraging the beta release of the GemFire+Greenplum Connector, which is planned for release in GemFire 9. Topics will include an overview of the system architecture and a review of the new GemFire+Greenplum Connector features that simplify use cases requiring a blend of massively parallel database capabilities and accelerated in-memory data processing.
New Integration Options with Postgres Enterprise Manager 8.0 - EDB
Postgres Enterprise Manager (PEM) is a comprehensive, customizable solution providing a GUI to control, monitor, and optimize your PostgreSQL deployment. The newly released PEM 8.0 makes it easier to integrate with other parts of your infrastructure. Key features now include more API endpoints and the introduction of webhooks, providing more options for integrating with popular ITSMs like ServiceNow.
In this webinar we will cover:
- Why you need PostgreSQL specific tooling
- Overview of Postgres Enterprise Manager
- Manage, Monitor, and Tune PostgreSQL
- A look at recent REST API updates
- Examples of common tasks utilizing the API
- What’s new in Postgres Enterprise Manager 8.0
- Webhooks for event-based alerting
- Enhanced details on why an alert triggered
- Security fixes from pen testing
- Examples of API and webhooks in action
Pulsar is used by a portfolio of products at Splunk for stream processing of different types of data, including metrics and logs. In this talk, Karthik Ramasamy will share how Splunk helped a flagship customer scale a Pulsar deployment to handle 10 PB/day in a single cluster. He will talk about the journey, the challenges faced, and the trade-offs made to scale Pulsar and operate it reliably and stably in Google Cloud Platform (GCP).
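As a rough illustration of the kind of pipeline described above (not Splunk's actual code), the sketch below uses the standard Pulsar Java client to publish and consume a metric event; the service URL, topic, and subscription names are placeholders.

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.Message;
import org.apache.pulsar.client.api.Producer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.Schema;
import org.apache.pulsar.client.api.SubscriptionType;

public class PulsarMetricsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder broker URL; a real deployment points at the cluster's proxy or brokers.
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Producer side: publish a metric sample to a topic.
        Producer<String> producer = client.newProducer(Schema.STRING)
                .topic("metrics")
                .create();
        producer.send("host=web-01 cpu.load=0.42");

        // Consumer side: a shared subscription lets many consumers split the load.
        Consumer<String> consumer = client.newConsumer(Schema.STRING)
                .topic("metrics")
                .subscriptionName("metrics-ingest")
                .subscriptionType(SubscriptionType.Shared)
                .subscribe();
        Message<String> msg = consumer.receive();
        System.out.println("received: " + msg.getValue());
        consumer.acknowledge(msg);

        producer.close();
        consumer.close();
        client.close();
    }
}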
Always upgrade! There are hundreds of fixes between each PostgreSQL release, and an important number of them are security fixes! Logical replication allows making major upgrades with minimal downtime and acceptable trade-offs.
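As a minimal sketch of what such a logical-replication upgrade looks like in practice (hostnames, database names, and credentials below are placeholders, and the schema must already exist on the new server), the standard publication/subscription commands can be driven from any client, for example over JDBC:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LogicalReplicationUpgradeSketch {
    public static void main(String[] args) throws Exception {
        // On the old primary (the release you are upgrading from): publish the tables to migrate.
        try (Connection oldPrimary = DriverManager.getConnection(
                "jdbc:postgresql://old-host:5432/appdb", "postgres", "secret");
             Statement st = oldPrimary.createStatement()) {
            st.execute("CREATE PUBLICATION upgrade_pub FOR ALL TABLES");
        }

        // On the new server (the release you are upgrading to), after restoring the schema:
        // subscribe, let it copy and stream changes, then switch the application over.
        try (Connection newServer = DriverManager.getConnection(
                "jdbc:postgresql://new-host:5432/appdb", "postgres", "secret");
             Statement st = newServer.createStatement()) {
            st.execute("CREATE SUBSCRIPTION upgrade_sub "
                    + "CONNECTION 'host=old-host dbname=appdb user=replicator password=secret' "
                    + "PUBLICATION upgrade_pub");
        }
    }
}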
Highly available databases are essential to organizations depending on mission-critical, 24/7 access to data. Postgres is widely recognized as an excellent open-source database, with critical maturity and features that allow organizations to scale and achieve high availability.
This webinar will explore:
- Evolution of replication in Postgres
- Streaming replication
- Logical replication
- Replication for high availability
- Important high availability parameters
- Options to monitor high availability
- HA infrastructure to patch the database with minimal downtime
- EDB Postgres Failover Manager (EFM)
- EDB tools to create a highly available Postgres architecture
One key feature that differentiates HBase from other distributed databases is its support for coprocessors. Bloomberg develops and manages some very low-latency systems that service real-time requests. In order to achieve real-time speeds, it was necessary to utilize coprocessors, which are similar to traditional stored procedures. As a result, we were able to match the average latency of an HBase cluster with that of a traditional database. This was done by using coprocessors to parallelize a lot of data computation and reduce the number of round-trips to the cluster by a factor of 5, thereby lowering the amount of data sent over the wire by a factor of 5 as well. However, there are also significant challenges to managing coprocessors in a production environment. In this talk, I will review the use case for HBase coprocessors and some practical tips on how to properly develop and deploy them. Some of the key topics covered in this talk are:
Types of coprocessors
Development challenges
Deployment challenges
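For readers unfamiliar with the API, here is a minimal observer-style coprocessor sketch against the HBase 2.x coprocessor interfaces (illustrative only, not Bloomberg's code); it is loaded on the region servers, so the hook runs next to the data rather than on the client.

import java.io.IOException;
import java.util.List;
import java.util.Optional;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessor;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.coprocessor.RegionObserver;
import org.apache.hadoop.hbase.util.Bytes;

// Runs inside each region server once loaded (e.g. via the table descriptor or
// hbase.coprocessor.region.classes), so per-row work happens next to the data.
public class AuditingRegionObserver implements RegionCoprocessor, RegionObserver {

    @Override
    public Optional<RegionObserver> getRegionObserver() {
        return Optional.of(this);
    }

    @Override
    public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> ctx,
                          Get get, List<Cell> results) throws IOException {
        // Server-side hook invoked after every Get. An endpoint coprocessor (the
        // other flavor) would instead expose a protobuf service for batched,
        // parallel computation across regions, which is what cuts round-trips.
        System.out.println("Get for row " + Bytes.toString(get.getRow())
                + " returned " + results.size() + " cells");
    }
}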
Speakers
Amit Anand, Senior Software Developer, Bloomberg LP
Esther Kundin, Senior Software Engineer, Bloomberg LP
Lean engineering for lean/balanced teams: lessons learned (and still learning... - Balanced Team
Bill Scott, PayPal
How do you take a gigantic organization and begin to transform its products? One key is to change the way teams work together to build experiences by following a Lean UX methodology. However, it is essential to have engineering fully on board as an integrated partner in the process. In this talk, Bill Scott will share 6 principles gleaned from the last two years of transforming engineering and the technology stack to support this working model.
Understanding how to run microservices at scale is becoming a key success factor for organisations. Mesos makes it easy to deploy robust architectures in the cloud. Today's technologies offer simple solutions to create RESTful services, containerize them and deploy them in Mesos, but is this the best way to expose microservices? As the number of microservices increases, the inter-communication between them becomes more complicated, and we soon realize we have new questions awaiting answers: How do microservices authenticate? How do we monitor who's using their APIs? How do we protect them from attacks? How do we set throttling and rate-limiting rules across a cluster? How do we control which services allow public access and which are private? Come and learn a scalable architecture for managing microservices in Mesos by integrating an API management layer inside your Mesos clusters.
Pivotal Big Data Suite: A Technical Overview - VMware Tanzu
How and why companies like Uber, Netflix and Airbnb are so successful, what you need to do in order to become successful in the same way that they are, and how Pivotal can help you with that.
Speaker: Les Klein, EMEA CTO Data, Pivotal
Driving Real Insights Through Data Science - VMware Tanzu
Major changes in industries have been brought about by the emergence of data-driven discoveries and applications. Many organizations are bringing together their data and looking to drive change. But the ability to generate new insights in real time from massive sets of data is still far from commonplace.
At this event, data technology experts and data scientists from Pivotal provided the latest business perspective on how data science and engineering can be used to accelerate the generation of new insights.
For information about upcoming Pivotal events, please visit: http://pivotal.io/news-events/#events
Troubleshooting App Health and Performance with PCF Metrics 1.2 - VMware Tanzu
Join Allen Duet and Pieter Humphrey from Pivotal, to learn how PCF Metrics enhances the developer experience on Pivotal Cloud Foundry, with a simple and powerful way to troubleshoot app health and performance issues. You will see how, with a single, unified interface for events, logs, and metrics, app devs can easily navigate graphs to identify problems and then view logs for that time slice.
My presentation slides from Hadoop Summit, San Jose, June 28, 2016. See live video at http://www.makedatauseful.com/vid-solving-performance-problems-hadoop/ and follow along for context.
Moving analytic workloads into production - specific technical challenges and best practices for engineering SQL in Hadoop solutions. Highlighting the next generation engineering approaches to the secret sauce we have implemented in the Actian VectorH database.
Digital Transformation (Implications for the CXO) - Anant Desai
Digital transformation refers to the organizational change that occurs through the use of digital technologies and business models to improve organizational performance.
Data technology experts from Pivotal give the latest perspective on how big data analytics and applications are transforming organizations across industries.
This event provides an opportunity to learn about new developments in the rapidly-changing world of big data and understand best practices in creating Internet of Things (IoT) applications.
Learn more about the Pivotal Big Data Roadshow: http://pivotal.io/big-data/data-roadshow
Why Domain-Driven Design and Reactive Programming? - VMware Tanzu
Enterprise software development is hard.
A poorly designed enterprise software application can result in exorbitant costs and overall project failure. Traditional approaches have had difficulty with promoting good design practices, resulting in applications that don’t meet the needs of the business and are costly and difficult to change. Ultimately, this severely limits the value of these applications.
Domain-Driven Design (DDD) and Reactive Programming are design patterns that address these issues head on. Both approaches address application development complexity by breaking your big problems into smaller problems.
DDD puts the focus on the core business domain ensuring that the highest business value areas are addressed first. DDD operates on the premise that your business needs will change, and your applications need to change accordingly. Working closely together, your business domain experts and technical team can deliver apps that evolve with your business.
Reactive Programming promotes simplicity by focusing on only a few important concepts. It reduces the complexity of building a big application by viewing it as a collection of smaller applications that respond to events. The stream of events that occur as part of your business operations can instantly trigger responses from the application, making Reactive Programming real-time, interactive, and engaging.
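As a tiny illustration of this reactive style (using Project Reactor here purely as one example library; the webinar does not prescribe a specific implementation), the application subscribes to a stream of business events and reacts as each one arrives:

import java.time.Duration;

import reactor.core.publisher.Flux;

public class ReactiveOrderFeedSketch {
    public static void main(String[] args) throws InterruptedException {
        // A stream of business events; in a real system this would be fed by a
        // message broker or a data grid's continuous queries rather than a fixed list.
        Flux<String> orderEvents = Flux.just("order-1", "order-2", "order-3")
                .delayElements(Duration.ofMillis(100));

        // The application reacts to each event as it arrives instead of polling for state.
        orderEvents
                .map(id -> "processed " + id)
                .subscribe(System.out::println);

        Thread.sleep(500); // keep the demo alive until the asynchronous stream completes
    }
}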
In this webinar, we will answer five key questions:
What causes software projects to lack well-designed domains?
What is a good domain model and how does it help with reducing complexity?
What is the Reactive model and how does it help developers solve complex application and integration problems?
How can you use these techniques to reduce time-to-market and improve quality as you build software that is more flexible, more scalable, and more tightly aligned to business goals?
How can in-memory data grids like open source Apache Geode and GemFire (Pivotal’s product based on Apache Geode) fit with these modern concepts?
Using the awesome power of Spring Boot with Spring Data Geode to build highly-scalable, distributed Spring/Java applications using Apache Geode or Pivotal GemFire.
In this session we review the design of the current capabilities of the Spring Data GemFire API that supports Geode, and explore additional use cases and future direction that the Spring API and underlying Geode support might evolve.
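As a rough sketch of what that Spring Boot plus Spring Data Geode/GemFire programming model looks like (the entity, region, and repository names here are illustrative, and the client is assumed to connect to an existing cluster):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.annotation.Id;
import org.springframework.data.gemfire.config.annotation.ClientCacheApplication;
import org.springframework.data.gemfire.config.annotation.EnableEntityDefinedRegions;
import org.springframework.data.gemfire.mapping.annotation.Region;
import org.springframework.data.gemfire.repository.GemfireRepository;
import org.springframework.data.gemfire.repository.config.EnableGemfireRepositories;

// Application entity mapped onto a server-side region named "Customers".
@Region("Customers")
class Customer {
    @Id Long id;
    String name;
    Customer(Long id, String name) { this.id = id; this.name = name; }
}

// Spring Data repository backed by the Geode/GemFire region; CRUD methods are generated.
interface CustomerRepository extends GemfireRepository<Customer, Long> {}

@SpringBootApplication
@ClientCacheApplication                                          // this app is a Geode/GemFire client
@EnableEntityDefinedRegions(basePackageClasses = Customer.class) // create client regions from @Region entities
@EnableGemfireRepositories
public class CrmApplication {
    public static void main(String[] args) {
        SpringApplication.run(CrmApplication.class, args);
    }
}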
Gruter_TECHDAY_2014_03_ApacheTajo (in Korean) - Gruter
Apache Tajo: A Big Data Warehouse System on Hadoop
- presented by Jae-hwa Jeong, Apache Tajo committer and Gruter research engineer
at Gruter TECHDAY 2014 (Oct. 29 Seoul, Korea)
[db tech showcase Tokyo 2016] E34: Oracle SE - RAC, HA and Standby are Still ... - Insight Technology, Inc.
Standard Edition (SE) is alive and well – maybe it had some growing pains over the last year, BUT it is here to stay! SE is a powerful database, albeit with some limitations, whether it is used in a cloud-based environment or on premises. In this session we will discuss Oracle SE and review some of the recent changes and the introduction of the new kid on the block – Standard Edition 2 (SE2). Topics that will be discussed include moving between Editions, High Availability, Disaster Recovery, as well as Backup and Recovery.
How do we collect analytics at Rounds? We have built a pipeline that starts at the user's mobile device and flows via our collecting servers (written in Go) until it is inserted into BigQuery and an Elasticsearch cluster. We will share our experience and the journey we have made to reach our current system.
zData Inc. Big Data Consulting and Services - Overview and Summary - zData Inc.
This slide deck is a summary of zData Inc., a leading Big Data Consulting and Services Provider. zData focuses on commercial and enterprise corporations, employing experts in all areas of the field from software engineers to data scientists. They work with top hardware and software providers for on-site and off-site consulting, managed services, trainings, and long term scalable data solutions.
Open Data Science Conference Big Data Infrastructure – Introduction to Hadoop... - DataKitchen
The main objective of this workshop is to give the audience hands-on experience with several Hadoop technologies and jump-start their Hadoop journey. In this workshop, you will load data and submit queries using Hadoop! Before jumping into the technology, the founders of DataKitchen review Hadoop and some of its technologies (MapReduce, Hive, Pig, Impala and Spark), look at performance, and present a rubric for choosing which technology to use when.
NOTE: To complete the hands-on portion in the time allotted, attendees should come with a newly created AWS (Amazon Web Services) account and complete the other prerequisites found on the DataKitchen blog.
Cloud Foundry Diego, Lattice, Docker and more - cornelia davis
Colorado Cloud Foundry Meetup
May 19, 2015
Lattice and Docker with Cornelia Davis
Starting with a comparison of the current core runtime of the Cloud Foundry Elastic Runtime to the new Diego rewrite, we take a tour through how Linux containers can run a variety of image formats, including Docker. We talk about one way that you can get the Diego functionality in Lattice, a container scheduler that runs on a laptop or as a cluster in the cloud. We talk about ways of creating container images, including Cloud Rocker, and we draw it all together with a bunch of demos.
Abstract from the meetup:
What is Lattice (www.lattice.cf)?
Lattice is an open source project for running containerized workloads on a cluster. A Lattice cluster is comprised of a number of Lattice Cells (VMs that run containers) and a Lattice Coordinator that monitors the Cells.
Lattice includes built-in http load-balancing, a cluster scheduler, log aggregation with log streaming and health management.
Lattice containers are described as long-running processes or temporary tasks. Lattice includes support for Linux Containers expressed either as Docker Images or by composing applications as binary code on top of a root file system. Lattice's container pluggability will enable other backends such as Windows or Rocket in the future.
This presentation reviews the key methodologies that all the members of the team should consider, such as:
- How to prioritize the right application or project for your first Oracle migration
- Tips to execute a well-defined, phased migration process to minimize risk and increase time to value
- Handling the common concerns and pitfalls related to a migration project
- What resources you can leverage before, during and after your migration
- Suggestions on how you can achieve independence from an Oracle database – without sacrificing performance.
Target audience: This presentation is intended for IT Decision-Makers and Leaders on the team involved in Database decisions and execution.
For more information, please email sales@enterprisedb.com
Similar to SpringCamp 2016 - Apache Geode and Spring Data Gemfire
A review on techniques and modelling methodologies used for checking electrom... - nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from discrete devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry, and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI, and sensors give misleading values, which can prove fatal in the case of automobiles. In this paper, the authors have non-exhaustively tried to review research work concerned with the investigation of EMI in ICs and the prediction of this EMI using various modelling methodologies and measurement setups.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
6th International Conference on Machine Learning & Applications (CMLA 2024) - ClaraZara1
The 6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in the theory, methodology and applications of Machine Learning & Applications.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines - Christina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Water billing management system project report.pdf - Kamal Acharya
Our project, entitled “Water Billing Management System”, aims to generate water bills with all charges and penalties. The manual system currently employed is extremely laborious and quite inadequate; it only makes the process more difficult.
The aim of our project is to develop a system that partially computerizes the work performed in the Water Board, such as generating the monthly water bill, recording the units of water consumed, and storing customer records and previous unpaid records.
We used HTML/PHP as the front end and MySQL as the back end for developing our project. HTML is primarily a visual design environment: we can create an application by designing the forms that make up the user interface, adding application code to the forms and to the objects such as buttons and text boxes on them, and adding any required support code in additional modules.
MySQL is a free, open-source database that facilitates the effective management of databases by connecting them to the software. It is a stable, reliable and powerful solution with advanced features and advantages, such as data security.