In this session we review the design of the current capabilities of the Spring Data GemFire API that supports Geode, and explore additional use cases and the future direction in which the Spring API and the underlying Geode support might evolve.
In this session we review the design of the newly released off heap storage feature in Apache Geode, and discuss use cases and potential direction for additional capabilities of this feature.
#GeodeSummit - Integration & Future Direction for Spring Cloud Data Flow & Geode (PivotalOpenSourceHub)
In this session we review the design of the current state of support for Apache Geode by Spring Cloud Data Flow, and explore additional use cases and the future direction in which Spring Cloud Data Flow and Apache Geode might evolve.
How Southwest Airlines Uses Geode
Distributed systems and fast data require new software patterns and implementation skills. Learn how Southwest Airlines uses Apache Geode, organizes team responsibilities, and approaches design tradeoffs. Drawing inspiration from real whiteboard conversations, we’ll explore: common development pitfalls, environment capacity planning, streaming data patterns like consumer checkpointing, support roles, and production lessons learned.
Every day, Apache Geode improves how Southwest Airlines schedules nearly 4,000 flights and serves over 500,000 passengers. It’s an essential component of Southwest’s ability to reduce flight delays and support future growth.
Apache Apex and Apache Geode are two of the most promising incubating open source projects. Combined, they promise to fill gaps of existing big data analytics platforms. Apache Apex is an enterprise-grade, native-YARN, big-data-in-motion platform that unifies stream and batch processing. Apex is highly scalable, performant, fault tolerant, and strong in operability. Apache Geode provides a database-like consistency model, reliable transaction processing and a shared-nothing architecture to maintain very low latency performance with high concurrency processing. We will also look at some use cases where these two projects can be used together to form a distributed, fault-tolerant, reliable in-memory data processing layer.
#GeodeSummit - Large Scale Fraud Detection using GemFire Integrated with Gree... (PivotalOpenSourceHub)
In this session we explore a case study of a large-scale government fraud detection program that prevents billions of dollars in fraudulent payments each year, leveraging the beta release of the GemFire+Greenplum Connector, which is planned for release in GemFire 9. Topics will include an overview of the system architecture and a review of the new GemFire+Greenplum Connector features that simplify use cases requiring a blend of massively parallel database capabilities and accelerated in-memory data processing.
#GeodeSummit - Where Does Geode Fit in Modern System Architectures (PivotalOpenSourceHub)
In this talk, Eitan Suez explores the question: Where does Geode fit in an organization's system architecture? Geode is a unique and feature-rich product that perhaps hasn't seen as much adoption as it deserves. Today's apps are no longer the straightforward, database-backed web applications we used to build a few years ago. Applications have become more sophisticated, as they've had to meet the need to scale, to be reliable, fault-tolerant, and to integrate with other systems. In this talk, Eitan will suggest one particular fit for Geode in the context of a CQRS architecture, and welcomes you to attend, and to contribute by sharing how you've put Geode to use in your organization.
Apache Apex & Apache Geode are two very promising incubating open source projects; combined, they promise to fill gaps of existing big data analytics platforms.
Apache Geode provides a database-like consistency model, reliable transaction processing and a shared-nothing architecture to maintain very low latency performance with high concurrency processing.
In this session we will talk about use cases and ongoing efforts to integrate Apex and Geode in order to build scalable, fault-tolerant, real-time streaming applications that ingest from various sources and egress to Geode (a minimal sketch follows the use cases below).
Use case 1 - Geode as a data store for stream-processed data computed by Apex, powering user applications or dashboards.
Use case 2 - An Apex application reading data from the Geode cache and using it for data processing.
Use case 3 - The Apex platform checkpointing operator state in Geode to improve the performance of Apex batch operations.
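A minimal sketch of use case 1, assuming a recent Apache Geode release (org.apache.geode packages), a locator on localhost:10334, and a region named "results" (all hypothetical): an output writer pushes processed values into a Geode region through the standard client API, which is essentially what an Apex output operator would do per tuple.

```java
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class GeodeResultWriter {
    public static void main(String[] args) {
        // Connect to the Geode cluster through a locator (host/port are assumptions).
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334)
                .create();

        // PROXY means all data lives on the servers; the client keeps no local copy.
        Region<String, Double> results = cache
                .<String, Double>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("results");

        // An output operator would call put() once per processed tuple.
        results.put("sensor-42", 98.6);

        cache.close();
    }
}
```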
Presented by Ashish Tadose at Apex Meetup on 03/17/16
An introduction into Spark ML plus how to go beyond when you get stuck (Data Con LA)
Abstract:-
This talk will introduce Spark's new machine learning framework (Spark ML) and how to train basic models with it. A companion Jupyter notebook will be provided for people to follow along with. Once we've got the basics down, we'll look at what to do when we find we need more than the tools available in Spark ML (and I'll try and convince people to contribute to my latest side project -- Sparkling ML).
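To make "basic models" concrete, here is a hedged Java sketch of a minimal Spark ML pipeline; the input file, column names, and parameter values are placeholders, not the talk's actual notebook.

```java
import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineModel;
import org.apache.spark.ml.PipelineStage;
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class BasicSparkML {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("spark-ml-intro")
                .master("local[*]")   // local mode, for experimentation only
                .getOrCreate();

        // Hypothetical input: a CSV with numeric columns f1, f2 and a 0/1 "label" column.
        Dataset<Row> data = spark.read()
                .option("header", "true")
                .option("inferSchema", "true")
                .csv("data/training.csv");

        // Spark ML estimators expect features packed into a single vector column.
        VectorAssembler assembler = new VectorAssembler()
                .setInputCols(new String[]{"f1", "f2"})
                .setOutputCol("features");

        LogisticRegression lr = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.01);

        // A Pipeline chains the feature step and the model into one estimator.
        Pipeline pipeline = new Pipeline()
                .setStages(new PipelineStage[]{assembler, lr});
        PipelineModel model = pipeline.fit(data);

        model.transform(data).select("label", "prediction").show(5);
        spark.stop();
    }
}
```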
Bio:-
Holden Karau is a transgender Canadian, Apache Spark committer, an active open source contributor, and coauthor of Learning Spark and High Performance Spark. When not in San Francisco working as a software development engineer at IBM’s Spark Technology Center, Holden speaks internationally about Spark and holds office hours at coffee shops at home and abroad. She makes frequent contributions to Spark, specializing in PySpark and machine learning. Prior to IBM, she worked on a variety of distributed, search, and classification problems at Alpine, Databricks, Google, Foursquare, and Amazon. She holds a bachelor of mathematics in computer science from the University of Waterloo. Outside of computers she enjoys scootering and playing with fire.
Low latency high throughput streaming using Apache Apex and Apache Kudu (DataWorks Summit)
True streaming is fast becoming a necessity for many business use cases. On the other hand, data set sizes and volumes are also growing exponentially, compounding the complexity of data processing pipelines. There exists a need for true low latency streaming coupled with very high throughput data processing. Apache Apex as a low latency and high throughput data processing framework and Apache Kudu as a high throughput store form a nice combination which solves this pattern very efficiently.
This session will walk through a use case which involves writing a high throughput stream using Apache Kafka, Apache Apex and Apache Kudu. The session will start with a general overview of Apache Apex and the capabilities of Apex that form the foundation for a low latency and high throughput engine, with Apache Kafka as an example input source of streams. Subsequently we walk through the Kudu integration with Apex via patterns like end-to-end exactly-once processing, selective column writes, and timestamp propagation for out-of-band data. The session will also cover additional patterns that this integration enables for enterprise-level data processing pipelines.
The session will conclude with latency and throughput metrics for the use case presented.
Speaker
Ananth Gundabattula, Senior Architect, Commonwealth Bank of Australia
In this session we review the design of the current capabilities of a partially completed feature in Apache Geode - the ability to act as a backend for Redis client applications. We’ll explore potential use cases and the direction in which this capability might evolve.
Spark Tuning For Enterprise System Administrators, Spark Summit East 2016 (Anya Bida)
by Anya Bida and Rachel Warren from Alpine Data
https://spark-summit.org/east-2016/events/spark-tuning-for-enterprise-system-administrators/
Spark offers the promise of speed, but many enterprises are reluctant to make the leap from Hadoop to Spark. Indeed, System Administrators will face many challenges with tuning Spark performance. This talk is a gentle introduction to Spark Tuning for the Enterprise System Administrator, based on experience assisting two enterprise companies running Spark in yarn-cluster mode. The initial challenges can be categorized in two FAQs. This introduction to Spark Tuning will enable enterprise system administrators to overcome common issues quickly and focus on more advanced Spark Tuning challenges. The audience will understand the “cheat-sheet” posted here: http://techsuppdiva.github.io/
Key takeaways:
FAQ 1: With so many Spark Tuning parameters, how do I know which parameters are important for which jobs?
Solution 1: The Spark Tuning cheat-sheet! A visualization that guides the System Administrator to quickly overcome the most common hurdles to algorithm deployment. http://techsuppdiva.github.io/
FAQ 2: Once I know which Spark Tuning parameters I need, how do I enforce them at the user level? Job level? Algorithm level? Project level? Cluster level?
Solution 2: We’ll approach these challenges using job & cluster configuration, the Spark context, and 3rd party tools – of which Alpine will be one example. We’ll operationalize Spark parameters according to user, job, algorithm, workflow pipeline, or cluster levels.
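As a hedged illustration of FAQ 2, Spark tuning parameters can be enforced at the job level through SparkConf; the property names below are real Spark 1.x-era settings, but the values are placeholders, not recommendations.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class TunedJob {
    public static void main(String[] args) {
        // Values are illustrative; the right numbers depend on cluster size and workload.
        SparkConf conf = new SparkConf()
                .setAppName("tuned-etl")
                .set("spark.executor.memory", "4g")        // per-executor heap
                .set("spark.executor.cores", "4")          // concurrent tasks per executor
                .set("spark.default.parallelism", "200")   // partitions for shuffles
                .set("spark.yarn.executor.memoryOverhead", "512"); // off-heap headroom (yarn mode, MB)

        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... job logic ...
        sc.stop();
    }
}
```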
Getting Started with Apache Cassandra and Apache Zeppelin (DuyHai DOAN, DataS...) (DataStax)
If you are interested in Big Data, you surely already know Apache Spark or Apache Cassandra, but do you know Apache Zeppelin? Did you know that it is possible to draw beautiful graphs from your Spark RDDs and Cassandra queries using a user-friendly interface?
In this session, I will introduce Zeppelin through live coding examples and highlight its modular architecture, which allows you to plug in an interpreter for the back-end of your choice.
Then we'll dig into the Apache Cassandra interpreter and show how to use it as a default front-end to display your Cassandra data.
About the Speaker
DuyHai DOAN Apache Cassandra Evangelist, DataStax
DuyHai DOAN is an Apache Cassandra Evangelist at DataStax. He splits his time between technical presentations/meetups on Cassandra, coding on open source projects like Achilles and Apache Zeppelin to support the community, and helping companies using Cassandra to make their projects successful. Previously, he worked as a freelance Java/Cassandra consultant.
January 2015 HUG: Using HBase Co-Processors to Build a Distributed, Transacti... (Yahoo Developer Network)
Monte Zweben, Co-Founder and CEO of Splice Machine, will discuss how to use HBase co-processors to build an ANSI-99 SQL database with 1) parallelization of SQL execution plans, 2) ACID transactions with snapshot isolation and 3) consistent secondary indexing.
Transactions are critical in traditional RDBMSs because they ensure reliable updates across multiple rows and tables. Most operational applications require transactions, but even analytics systems use transactions to reliably update secondary indexes after a record insert or update.
In the Hadoop ecosystem, HBase is a key-value store with real-time updates, but it does not have multi-row, multi-table transactions, secondary indexes or a robust query language like SQL. Combining SQL with a full transactional model over HBase opens a whole new set of OLTP and OLAP use cases for Hadoop that were traditionally reserved for RDBMSs like MySQL or Oracle. However, a transactional HBase system has the advantage of scaling out with commodity servers, leading to a 5x-10x cost savings over traditional databases like MySQL or Oracle.
HBase co-processors, introduced in release 0.92, provide a flexible and high-performance framework to extend HBase. In this talk, we show how we used HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source. We will discuss how endpoint transactions are used to serialize SQL execution plans over to regions so that computation is local to where the data is stored. Additionally, we will show how observer co-processors simultaneously support both transactions and secondary indexing.
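As a toy illustration of the observer mechanism described above (not Splice Machine's actual implementation), here is a hedged HBase 1.x sketch of an observer co-processor that maintains a secondary index after every put; the table and column names are hypothetical.

```java
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical observer: after each write to the base table, mirror a row
// into an index table keyed by the indexed column's value.
public class SecondaryIndexObserver extends BaseRegionObserver {

    private static final byte[] CF = Bytes.toBytes("d");
    private static final byte[] INDEXED_COL = Bytes.toBytes("email");

    @Override
    public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                        Put put, WALEdit edit, Durability durability) throws IOException {
        List<Cell> cells = put.get(CF, INDEXED_COL);
        if (cells.isEmpty()) {
            return; // this put did not touch the indexed column
        }
        // Index row key = column value; the index cell points back at the base row.
        Put indexPut = new Put(CellUtil.cloneValue(cells.get(0)));
        indexPut.addColumn(CF, Bytes.toBytes("row"), put.getRow());
        try (Table index = ctx.getEnvironment().getTable(TableName.valueOf("email_idx"))) {
            index.put(indexPut);
        }
    }
}
```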
The talk will also discuss how Splice Machine extended the work of Google Percolator, Yahoo Labs’ OMID, and the University of Waterloo on distributed snapshot isolation for transactions. Lastly, performance benchmarks will be provided, including full TPC-C and TPC-H results that show how Hadoop/HBase can be a replacement for traditional RDBMS solutions.
Presto on Apache Spark: A Tale of Two Computation Engines (Databricks)
The architectural tradeoffs between the map/reduce paradigm and parallel databases have been a long and open discussion since the dawn of MapReduce more than a decade ago. At Facebook, we have spent the past several years independently building and scaling both Presto and Spark to Facebook-scale batch workloads, and it is now increasingly evident that there is significant value in coupling Presto’s state-of-the-art low-latency evaluation with Spark’s robust and fault-tolerant execution engine.
Using HBase Co-Processors to Build a Distributed, Transactional RDBMS - Splic... (Chicago Hadoop Users Group)
John Leach, Co-Founder and CTO of Splice Machine, with 15+ years of software development and machine learning experience, will discuss how to use HBase co-processors to build an ANSI-99 SQL database with 1) parallelization of SQL execution plans, 2) ACID transactions with snapshot isolation and 3) consistent secondary indexing.
Transactions are critical in traditional RDBMSs because they ensure reliable updates across multiple rows and tables. Most operational applications require transactions, but even analytics systems use transactions to reliably update secondary indexes after a record insert or update.
In the Hadoop ecosystem, HBase is a key-value store with real-time updates, but it does not have multi-row, multi-table transactions, secondary indexes or a robust query language like SQL. Combining SQL with a full transactional model over HBase opens a whole new set of OLTP and OLAP use cases for Hadoop that were traditionally reserved for RDBMSs like MySQL or Oracle. However, a transactional HBase system has the advantage of scaling out with commodity servers, leading to a 5x-10x cost savings over traditional databases like MySQL or Oracle.
HBase co-processors, introduced in release 0.92, provide a flexible and high-performance framework to extend HBase. In this talk, we show how we used HBase co-processors to support a full ANSI SQL RDBMS without modifying the core HBase source. We will discuss how endpoint transactions are used to serialize SQL execution plans over to regions so that computation is local to where the data is stored. Additionally, we will show how observer co-processors simultaneously support both transactions and secondary indexing.
The talk will also discuss how Splice Machine extended the work of Google Percolator, Yahoo Labs’ OMID, and the University of Waterloo on distributed snapshot isolation for transactions. Lastly, performance benchmarks will be provided, including full TPC-C and TPC-H results that show how Hadoop/HBase can be a replacement for traditional RDBMS solutions.
To view the accompanying slide deck: http://www.slideshare.net/ChicagoHUG/
Ansible is an open-source configuration management and deployment tool, which can be used to manage servers and software installations. This talk will briefly cover Ansible itself, and then explain how Ansible is used to install and configure PostgreSQL on a server. Examples will round out the talk.
Slides from my talk at the 2016 Apache: Big Data conference.
Resource managers like Apache YARN and Mesos have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low-level. We present Apache REEF, a powerful yet simple framework that helps developers of big data systems retain fine-grained control over cloud resources and address common problems of fault tolerance, task scheduling and coordination, caching, interprocess communication, and bulk data transfers. We will guide developers through a simple REEF application and discuss the current state of the Apache REEF project and its place in the Hadoop ecosystem.
#GeodeSummit: Architecting Data-Driven, Smarter Cloud Native Apps with Real-T... (PivotalOpenSourceHub)
This talk introduces an open-source solution that integrates cloud native apps running on Cloud Foundry with an open-source hybrid transactions + analytics real-time solution. The architecture is based on a fast, scalable, highly available and fully consistent in-memory data grid (Apache Geode / GemFire), natively integrated with the first open-source massively parallel data warehouse (Greenplum Database) in a hybrid transactional and analytical architecture that is extremely fast, horizontally scalable, highly resilient and open source. This session also features a live demo running on Cloud Foundry, showing a real case of real-time closed-loop analytics and machine learning using the featured solution.
#GeodeSummit - Wall St. Derivative Risk Solutions Using Geode (PivotalOpenSourceHub)
In this talk, Andre Langevin discusses how Geode forms the core of many Wall Street derivative risk solutions. By externalizing risk from trading systems, Geode-based solutions provide cross-product risk management at speeds suitable for automated hedging, while simultaneously eliminating the back office costs associated with traditional trading system based solutions.
#GeodeSummit: Democratizing Fast Analytics with Ampool (Powered by Apache Geode) (PivotalOpenSourceHub)
Today, if events change the decision model, we wait until the next batch model build for new insights. By extending fast “time-to-decisions” into the world of Big Data analytics to get fast “time-to-insights”, apps will get what used to be batch insights in near real time. The technology enabling this includes smart in-memory data storage, new storage-class memory, and products designed to do one or more parts of an analysis pipeline very well. In this talk we describe how Ampool is building on Apache Geode to allow Big Data analysis solutions to work together with a scalable, smart storage-class-memory layer, so that fast and complex end-to-end pipelines can be built -- closing the loop and providing dramatically lower time to critical insights.
#GeodeSummit: Combining Stream Processing and In-Memory Data Grids for Near-R... (PivotalOpenSourceHub)
The financial sector presents an exciting mix of challenges regarding throughput and high availability, as well as specific constraints on latency and consistency. In the continuous evolution of its platform, Murex relies on open source technologies like Apache Geode and Apache Storm in a "kind of" lambda architecture to ensure storage, near-real-time (millisecond-scale) aggregation of thousands of events per second, advanced notification mechanisms and on-demand deployments. This talk will focus on the technical architecture, the underlying principles, and the technologies used to support this mix of functional and non-functional requirements.
#GeodeSummit - Using Geode as Operational Data Services for Real Time Mobile ... (PivotalOpenSourceHub)
One of the largest retailers in North America is considering Apache Geode for its new mobile loyalty application, to support its digital transformation effort. It would use Geode to provide operational data services for its mobile cloud service. This retailer needs to replace sluggish response times with sub-second responses, which will improve conversion rates. It also wants to be able to close the loop between data science findings and the app experience, so that the right customer interaction is suggested when it is needed - such as when customers are looking at their mobile app while walking in the store, or when sending notifications at the times individuals are most likely to shop. The final benefits of using Geode will include faster development cycles, increased customer loyalty, and higher revenue.
#GeodeSummit - Modern manufacturing powered by Spring XD and Geode (PivotalOpenSourceHub)
Wondering how to improve your production yield, increase asset life and activate reliability-centered maintenance? TEKsystems has developed a “Golden Batch” recommendation engine to help you realize your goals of modern manufacturing. This is a predictive analytics framework built on top of a manufacturing data lake for the analysis and training of machine learning algorithms, and the subsequent processing of streaming sensor data to detect or predict failures. We’ll present a solution architecture featuring Spring XD for data pipelining, Apache Geode for in-memory processing, Hadoop as a data lake, and R for machine learning.
#GeodeSummit Keynote: Creating the Future of Big Data Through "The Apache Way" (PivotalOpenSourceHub)
Keynote at Geode Summit 2016 by Dr. Justin Erenkrantz, Bloomberg LP: creating the future of Big Data through "The Apache Way", and why this matters to the community.
Design Tradeoffs in Distributed Systems - How Southwest Airlines Uses Geode (VMware Tanzu)
SpringOne Platform 2016
Speaker: Brian Dunlap, Tech Lead, Southwest Airlines.
Distributed systems and fast data require new software patterns and implementation skills. Learn how Southwest Airlines uses Apache Geode, organizes team responsibilities, and approaches design tradeoffs. Drawing inspiration from real whiteboard conversations, we’ll explore:
-Common development pitfalls
-Environment capacity planning
-Streaming data patterns, like consumer checkpointing
-Support roles
-Production lessons learned
Every day, Apache Geode improves how Southwest Airlines schedules nearly 4,000 flights and serves over 500,000 passengers. It’s an essential component of Southwest’s ability to reduce flight delays and support future growth.
Wall Street Derivative Risk Solutions Using Geode (VMware Tanzu)
SpringOne Platform 2016
Speaker: Andre Langevin, Consultant, CIBC
In this talk, I will discuss how Geode forms the core of many Wall Street derivative risk solutions. By externalizing risk from trading systems, Geode-based solutions provide cross-product risk management at speeds suitable for automated hedging, while simultaneously eliminating the back office costs associated with traditional trading system based solutions.
Asynchronous API in Java 8: how to use CompletableFuture (José Paumard)
Slides of my talk at Devoxx 2015. How to set up asynchronous data processing pipelines using the CompletionStage / CompletableFuture API, including how to control threads and how to handle exceptions.
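A small self-contained example in the spirit of the talk, showing thread control via a supplied executor and exception handling via exceptionally(); the pipeline stages and fallback value are illustrative.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncPipeline {
    public static void main(String[] args) {
        // A dedicated pool: the *Async variants let you choose which threads run each stage.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        CompletableFuture<Integer> result =
                CompletableFuture.supplyAsync(() -> fetchOrder("order-42"), pool) // stage 1 on our pool
                        .thenApply(String::length)                                 // stage 2, same thread
                        .thenApplyAsync(len -> len * 2, pool)                      // stage 3, back on our pool
                        .exceptionally(ex -> {
                            // Recover from any upstream failure with a fallback value.
                            System.err.println("pipeline failed: " + ex.getMessage());
                            return -1;
                        });

        System.out.println(result.join());
        pool.shutdown();
    }

    private static String fetchOrder(String id) {
        return "order payload for " + id; // stand-in for a remote call
    }
}
```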
Using the awesome power of Spring Boot with Spring Data Geode to build highly-scalable, distributed Spring/Java applications using Apache Geode or Pivotal GemFire.
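A hedged sketch of what such an application can look like with Spring Boot and Spring Data Geode annotations; exact package locations vary across Spring Data Geode versions, and the entity, region name, and repository below are hypothetical.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.annotation.Id;
import org.springframework.data.gemfire.config.annotation.ClientCacheApplication;
import org.springframework.data.gemfire.config.annotation.EnableEntityDefinedRegions;
import org.springframework.data.gemfire.mapping.annotation.Region;
import org.springframework.data.gemfire.repository.config.EnableGemfireRepositories;
import org.springframework.data.repository.CrudRepository;

@Region("Customers")                       // maps this entity onto a Geode region
class Customer {
    @Id Long id;
    String name;
}

interface CustomerRepository extends CrudRepository<Customer, Long> {}

@SpringBootApplication
@ClientCacheApplication                    // boots a Geode/GemFire ClientCache
@EnableEntityDefinedRegions(basePackageClasses = Customer.class)
@EnableGemfireRepositories
public class GeodeApp {
    public static void main(String[] args) {
        SpringApplication.run(GeodeApp.class, args);
    }
}
```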
Part 4: Custom Buildpacks and Data Services (Pivotal Cloud Platform Roadshow) (VMware Tanzu)
Custom Buildpacks & Data Services
The primary goals of this session are to:
Give an overview of the extension points available to Cloud Foundry users.
Provide a buildpack overview with a deep focus on the Java buildpack (my target audience has been Java conferences)
Provide an overview of service options, from user-provided to managed services, including an overview of the V2 Service Broker API.
Provide two hands-on lab experiences:
Java Buildpack Extension:
-Via customization (add a new framework component)
-Via configuration (upgrade to Java 8)
Service Broker Development/Management:
-Deploy a service broker for “HashMap as a Service (HaaSh)”
-Register the broker and make the plan public
-Create an instance of the HaaSh service
-Deploy a client app, bind to the service, and test it
Pivotal Cloud Platform Roadshow is coming to a city near you!
Join Pivotal technologists and learn how to build and deploy great software on a modern cloud platform. Find your city and register now http://bit.ly/1poA6PG
Google App Engine has become a powerful PaaS provider in the last couple of years. With Java support enabled, its scope has further increased. This session presents the options we have for testing on Google App Engine. Following is an overview of the proposed agenda (a local-testing sketch follows the list):
Running test suites of applications not built for Google App Engine
An overview of Google Cloud Cover
Advantages & Limitations of Cloud Cover
Integration of Existing Test Suites in Google Cloud Cover
Local Testing of Applications built for Google App Engine
DataStore Testing
Memcache Testing
Task-Queue Testing
Authentication API Testing
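As promised above, a hedged sketch of local Datastore testing using the App Engine SDK's testing helpers; the entity kind and assertion are illustrative, not part of the talk.

```java
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.tools.development.testing.LocalDatastoreServiceTestConfig;
import com.google.appengine.tools.development.testing.LocalServiceTestHelper;

import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class DatastoreSmokeTest {

    // In-process Datastore stub; no running App Engine instance is required.
    private final LocalServiceTestHelper helper =
            new LocalServiceTestHelper(new LocalDatastoreServiceTestConfig());

    @Before
    public void setUp() { helper.setUp(); }

    @After
    public void tearDown() { helper.tearDown(); }

    @Test
    public void putThenCount() {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
        ds.put(new Entity("Greeting")); // write through the real API against the stub
        assertEquals(1, ds.prepare(new Query("Greeting"))
                .countEntities(FetchOptions.Builder.withLimit(10)));
    }
}
```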
Caching is a common and effective pattern for scaling applications in near real-time to handle increased load & demand. This is particularly important for cloud-native applications that serve a multitude of users, devices and even other applications. Caching can be applied in several tiers of the application and (HTTP) session state is no exception.
Spring Session offers a robust, yet portable, solution that avoids container lock-in without sacrificing robust features like clustering. It seamlessly allows you to plug in different provider implementations, including Spring Session for Apache Geode, to cluster session state in a reliable, consistent, and scalable manner.
Following this webinar, you will understand how session state caching using Spring Session with Apache Geode provides a reliable experience to end-users—an experience that is resilient to system failures, constant reconfiguration and continuous changes in both the underlying infrastructure and applications as they evolve.
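A hedged configuration sketch: assuming the spring-session-data-geode module is on the classpath, a single annotation swaps the container's HttpSession for one clustered in Geode; the timeout value is illustrative.

```java
import org.springframework.context.annotation.Configuration;
import org.springframework.data.gemfire.config.annotation.ClientCacheApplication;
import org.springframework.session.data.gemfire.config.annotation.web.http.EnableGemFireHttpSession;

// Replaces the servlet container's HttpSession with one backed by Geode regions,
// so sessions survive restarts and are shared across application instances.
@Configuration
@ClientCacheApplication
@EnableGemFireHttpSession(maxInactiveIntervalInSeconds = 1800)
public class SessionConfig {
}
```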
Presenters: John Blum and Jagdish Mirani, Pivotal
Building microservices for APIs with Helidon and a CI/CD pipeline (DonghuKIM2)
An overview of Oracle Helidon, a framework based on Eclipse MicroProfile, and a demonstration of microservice development with it.
• Demonstrates automating API document validation, service builds, testing, and deployment to Oracle Kubernetes Engine using Oracle Wercker, a CI/CD service for microservices.
o Building microservices with Helidon MP and Helidon SE
o Validating API document against backend microservice with Dredd and Apiary
o Building CI/CD pipeline with Wercker and Oracle Kubernetes Engine
Modular Test-driven SPAs with Spring and AngularJS (Gunnar Hillert)
Single-page Applications (SPA) are all the rage these days and with them there is an avalanche of new tools, libraries and frameworks we need to know. But what does this mean for us as Spring developers? In this session we will give you an overview of the current landscape and illustrate the choices the Spring XD team has made for its user interface. What do I use to write SPA applications? How do I integrate them into existing Spring-based backends? How do I build them? Can I integrate them into my existing Gradle or Maven build processes in order to achieve complete build automation? How do I integrate realtime messaging using Spring's SockJS/WebSocket support? In this talk we will answer these and many more questions. We will cover frameworks such as AngularJS, Bootstrap, RequireJS; tools like Bower, Grunt, Gulp; and also talk about testing using Karma and Protractor.
PHP is the top platform for building and modernizing IBM i applications. In this webinar, Erwin discusses how features of the application server can be leveraged to streamline the development process as well as fast-tracking the management of the PHP environment.
A complete boot camp for beginners who want to learn Spring Boot.
In this course, you'll learn how we can create web services and cover all the topics of Spring Boot, Spring Framework, and many others.
If you have some experience in Java and want to become a Software Engineer or Java Developer using Spring, you're on the right track.
Just read and practice; by the end of this course you'll have a solid knowledge of Spring Boot and backend development.
Course outline:
JPA, Hibernate, Spring, Spring Framework, H2 Database, PostgreSQL, MySQL.
#SpringBoot
#SpringFramework
#MySQL
#PostgreSQL
#H2
#JPA/Hibernate
#Webservices
Timings API: Performance Assertion during the functional testing (PetrosPlakogiannis)
Most teams at GoDaddy follow a CI/CD process where Selenium is commonly used for test automation. When they realized that the Selenium WebDrivers provide access to the same APIs as real browsers, including the widely supported W3C performance API, the concept for the cicd-perf-api webservice was born! By inserting some JavaScript code into the WebDriver object, performance data can be collected and posted back to the webservice. The response from the webservice includes a boolean field that testers can use for assertion - just like they would with functional checks! The field indicates whether performance was above or below the baseline. In this talk Petros will give an introduction to the concept of the Timings API, and Marcel will join online to tell us about its future.
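A hedged Java/Selenium sketch of the underlying idea: pull W3C navigation timing out of the browser through the WebDriver's JavaScript executor and assert against a baseline. The URL and threshold are placeholders, and the real Timings API posts the measurement to a webservice rather than asserting locally.

```java
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class PerfCheck {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://www.example.com");

            // Read navigation timing straight from the browser via the W3C performance API.
            long pageLoadMs = (Long) ((JavascriptExecutor) driver).executeScript(
                    "var t = window.performance.timing;"
                    + "return t.loadEventEnd - t.navigationStart;");

            // Hypothetical baseline assertion, in the spirit of the Timings API:
            // fail the functional test when load time regresses past the threshold.
            long baselineMs = 3000; // placeholder baseline
            if (pageLoadMs > baselineMs) {
                throw new AssertionError("page load " + pageLoadMs
                        + "ms exceeded baseline " + baselineMs + "ms");
            }
        } finally {
            driver.quit();
        }
    }
}
```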
Here are the slides for Greenplum Chat #8. You can view the replay here: https://www.youtube.com/watch?v=FKFiyJDgdQk
The increased frequency and sophistication of high-profile data breaches and malicious hacking is putting organizations at continued risk of data theft and significant business disruption. Complicating this scenario is the unbounded growth of Big Data and petabyte-scale data storage, new open source database and distribution schemes, and the continued adoption of cloud services by enterprises.
Pivotal Greenplum customers often look for additional encryption of data-at-rest and data-in-motion. The massively parallel processing (MPP) architecture of Pivotal Greenplum provides an architecture that is unlike traditional OLAP on RDBMS for data warehousing, and encryption capabilities must address the scale-out architecture.
The Zettaset Big Data Encryption Suite has been designed for optimal performance and scalability in distributed Big Data systems like Greenplum Database and Apache HAWQ.
Here is a replay of our recent Greenplum Chat with Zettaset:
00:59 What is Greenplum’s approach for encryption and why Zettaset?
02:17 Results of field testing Zettaset with Greenplum
03:50 Introduction to Zettaset, the security company
05:36 Overview of Zettaset and their solutions
14:51 Different layers for encrypting data at rest
16:50 Encryption key management for big data
20:51 Zettaset BD Encrypt for data at rest and data in motion
22:19 How to mitigate encryption overhead with an MPP scale-out system
24:12 How to deploy BD Encrypt
25:50 Deep dive on data at rest encryption
30:44 Deep dive on data in motion encryption
36:72 Q: How does Zettaset deal with encrypting Greenplums multiple interfaces?
38:08 Q: Can I encrypt data for a particular column?
40:26 How Zettaset fits into a security strategy
41:21 Q: What is the performance impact on queries by encrypting the entire database?
43:28 How Zettaset helps Greenplum meet IT compliance requirements
45:12 Q: How authentication for keys is obtained
48:50 Q: How can Greenplum users try out Zettaset?
50:53 Q: What is a ‘Zettaset Security Coach’?
How to use the WAN Gateway feature of Apache Geode to implement multi-site and active-active failover, disaster recovery, and global-scale applications.
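A minimal sketch of the sending side, assuming a recent Apache Geode release; the site ids, locator address, and region name are hypothetical, and a matching gateway receiver would be configured at the remote site.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;
import org.apache.geode.cache.wan.GatewaySender;
import org.apache.geode.cache.wan.GatewaySenderFactory;

public class WanSite {
    public static void main(String[] args) {
        // Each site gets its own distributed-system-id; the sender targets the remote id.
        Cache cache = new CacheFactory()
                .set("distributed-system-id", "1")
                .set("remote-locators", "ldn-locator[10334]") // hypothetical remote-site locator
                .create();

        // Parallel sender: each partitioned-region bucket ships its own events.
        GatewaySenderFactory senderFactory = cache.createGatewaySenderFactory();
        senderFactory.setParallel(true);
        GatewaySender toLondon = senderFactory.create("ny-to-ldn", 2); // 2 = remote ds id

        // Attach the sender to a region; writes here replicate to the remote site.
        Region<String, String> trades = cache
                .<String, String>createRegionFactory(RegionShortcut.PARTITION)
                .addGatewaySenderId(toLondon.getId())
                .create("trades");

        trades.put("T-1001", "AAPL 100@190.25");
    }
}
```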
Building Apps with Distributed In-Memory Computing Using Apache Geode (PivotalOpenSourceHub)
Slides from the Meetup on Monday, March 7, 2016, just before the beginning of #GeodeSummit, covering an introduction to the technology and community of Apache Geode, the in-memory data grid.
GPORCA is a newly open-sourced advanced query optimizer that is a subproject of the Greenplum Database open source project. GPORCA is the query optimizer used in commercial distributions of both Greenplum and HAWQ. In these distributions GPORCA has achieved a 1000x performance improvement across TPC-DS queries by focusing on three distinct areas: Dynamic Partition Elimination, Subquery Unnesting, and Common Table Expressions.
Now that GPORCA is open source, we are looking for collaborators to help us realize the ultimate dream for GPORCA - to work with any database.
The new breed of data management systems in Big Data has to process so much data that optimization mistakes are magnified in traditional optimizers. Furthermore, coding and manual optimization of complex queries have proven to be hard.
In this session, Venkatesh will discuss:
- Overview of GPORCA
- How to add GPORCA to HAWQ with a build option
- How GPORCA could be made to work with any database
- Future vision for GPORCA and more immediate plans
- How to work with GPORCA, and how to contribute to GPORCA
Pivoting Spring XD to Spring Cloud Data Flow with Sabby Anandan (PivotalOpenSourceHub)
Pivoting Spring XD to Spring Cloud Data Flow: a microservice-based architecture for stream processing
Microservice-based architectures are not just for distributed web applications! They are also a powerful approach for creating distributed stream processing applications. Spring Cloud Data Flow enables you to create and orchestrate standalone executable applications that communicate over messaging middleware such as Kafka and RabbitMQ and that, when run together, form a distributed stream processing application. This allows you to scale, version and operationalize stream processing applications following microservice-based patterns and practices on a variety of runtime platforms such as Cloud Foundry, Apache YARN and others.
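A hedged sketch of one such standalone application using the Spring Cloud Stream annotations of that era (@EnableBinding/@StreamListener); in Spring Cloud Data Flow, a stream could then be composed from apps like this, e.g. something in the spirit of "http | uppercase | log". The transform itself is a placeholder.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;

// A standalone processor app: consumes from the middleware-bound input channel,
// transforms the payload, and publishes to the output channel. SCDF wires such
// apps together over Kafka or RabbitMQ to form a distributed stream.
@SpringBootApplication
@EnableBinding(Processor.class)
public class UppercaseProcessor {

    @StreamListener(Processor.INPUT)
    @SendTo(Processor.OUTPUT)
    public String transform(String payload) {
        return payload.toUpperCase();
    }

    public static void main(String[] args) {
        SpringApplication.run(UppercaseProcessor.class, args);
    }
}
```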
About Sabby Anandan
Sabby Anandan is a Product Manager at Pivotal. Sabby is focused on building products that eliminate the barriers between application development, cloud, and big data.
Motivation and goals for off-heap storage
Off-heap features and usage
Implementation overview
Preliminary benchmarks: off-heap vs. heap
Tips and best practices
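A minimal sketch of how the off-heap feature is switched on, assuming a recent Apache Geode release; the memory size and region name are placeholders.

```java
import org.apache.geode.cache.Cache;
import org.apache.geode.cache.CacheFactory;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.RegionShortcut;

public class OffHeapExample {
    public static void main(String[] args) {
        // Reserve a slab of memory outside the JVM heap at startup;
        // values stored in off-heap regions avoid GC pressure.
        Cache cache = new CacheFactory()
                .set("off-heap-memory-size", "4g")
                .create();

        Region<String, byte[]> blobs = cache
                .<String, byte[]>createRegionFactory(RegionShortcut.PARTITION)
                .setOffHeap(true)   // keep this region's values off-heap
                .create("blobs");

        blobs.put("k1", new byte[1024]);
    }
}
```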
Zeppelin Interpreters
PSQL (to become JDBC in 0.6.x)
Geode
SpringXD
Apache Ambari
Zeppelin Service
Geode, HAWQ and Spring XD services
Webpage Embedder View
Securing your Kubernetes cluster: a step-by-step guide to success! (KatiaHIMEUR1)
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps means. We also held a lovely workshop in which participants explored different ways to think about quality and testing in different parts of the DevOps infinity loop.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...) (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Misconceptions about Spring…
Spring is a Web Application Framework
Spring’s programming model is unique and Spring uses its own conventions
Built on fundamental OO principles (POJO)
Software Design Patterns (IoC/DI, AOP) and…
Open Standards (OSS)
Apache Geode is a complex technology…
Too many configuration options and settings.
Inconsistent behavior between XML configuration (i.e. cache.xml) and API.