This document discusses data locality, latency, and caching. It provides background on early problems with web applications that motivated the use of caching technology. It then discusses key considerations for caching solutions, including accommodating developers, offloading databases, scaling, and buffering load variability. The document reviews the history of data locality problems and explores latency from mathematical and physics perspectives. It introduces concepts around consistency, availability, and partitioning for distributed systems. Finally, it introduces the Java Caching (JSR-107) API as a standard solution and discusses transactional capabilities and consistency models supported.
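The caching pattern this summary describes (offloading the database by serving repeated reads from memory) can be sketched in a few lines. This is a hedged, language-agnostic illustration of the cache-aside idea that JCache-style APIs standardize, not the JSR-107 API itself; every name in it is made up:

```python
# Minimal cache-aside sketch: serve repeat reads from memory and fall
# back to the backing store only on a miss. Illustrative names only.

class CacheAside:
    def __init__(self, loader):
        self._store = {}        # in-process cache
        self._loader = loader   # fallback to the "database"
        self.misses = 0

    def get(self, key):
        if key not in self._store:          # miss: load once, then cache
            self.misses += 1
            self._store[key] = self._loader(key)
        return self._store[key]             # hit: no database round trip

db = {"user:1": "Alice"}                    # stand-in for a real database
cache = CacheAside(db.__getitem__)
first = cache.get("user:1")                 # loads from the "database"
second = cache.get("user:1")                # served from the cache
```

The database is touched once; every later read for the same key is local, which is the offloading effect the document motivates.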
The document discusses SQL*Net waits and how to diagnose them. SQL*Net waits can indicate issues with the network, client tools, or applications. Specific wait events like "SQL*Net more data from client" suggest adjusting SDU, RECV_BUF_SIZE and SEND_BUF_SIZE. Errors are identified by waits like "SQL*Net break/reset to client". Network timings can be analyzed using tools like ping, tnsping, and network sniffers.
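For reference, the parameters the summary mentions are typically set in `sqlnet.ora`. The fragment below is a sketch with placeholder sizes only, not tuning advice; appropriate values depend on the workload and network:

```ini
# Hypothetical sqlnet.ora fragment; sizes are examples, tune per workload.
DEFAULT_SDU_SIZE=32767
RECV_BUF_SIZE=262144
SEND_BUF_SIZE=262144
```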
The top five questions to ask about NoSQL - Jonathan Ellis at Big Data Spain 2012 (Big Data Spain)
Session presented at Big Data Spain 2012 Conference
16th Nov 2012
ETSI Telecomunicación, UPM, Madrid
www.bigdataspain.org
More info: http://www.bigdataspain.org/es-2012/conference/top-five-questions-about-nosql/jonathan-ellis
Running with the Devil: Mechanical Sympathetic Networking (Todd Montgomery)
The document discusses various challenges and tradeoffs in networking protocols like TCP and UDP. It shows how factors like packet size, OS page size, and network effects like Nagle's algorithm can impact performance. Even when applications batch data appropriately, low-level protocols can still introduce problems, so the author argues that applications should be able to control batching for their own needs at the time it matters. Accounting for request/response idiosyncrasies can also help.
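The application-controlled batching the talk argues for can be sketched simply: accumulate small payloads and flush one larger write, rather than relying on transport-level coalescing such as Nagle's algorithm. All names here are illustrative, not from the talk:

```python
# Sketch: application-level batching. Tiny writes are buffered and sent
# as one larger payload once a threshold is reached.

class Batcher:
    def __init__(self, flush_threshold, send):
        self.flush_threshold = flush_threshold
        self.send = send              # e.g. a socket write in real code
        self.buf = bytearray()

    def write(self, payload: bytes):
        self.buf += payload
        if len(self.buf) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.buf:
            self.send(bytes(self.buf))  # one larger write on the wire
            self.buf.clear()

sent = []
b = Batcher(8, sent.append)
for chunk in (b"ab", b"cd", b"ef", b"gh"):  # four tiny writes...
    b.write(chunk)
b.flush()
# ...coalesced into a single send
```

The application decides when the batch goes out, which is exactly the control the summary says low-level protocols take away.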
The document contains log output from the StarBurn Development Kit software. It lists the software version and license information, and then outputs details about detected optical drives, including device IDs, supported media formats, and status information. The software appears to be initializing and probing the capabilities of two optical drives, located at addresses 0:1:0:0 and 255:255:255:0.
Jan-Piet Mens gave a presentation on DNSSEC and securing DNS with cryptography. He discussed how DNSSEC works by adding digital signatures to DNS records to authenticate their source and integrity. It provides data origin authentication and integrity but does not encrypt data. He demonstrated how to configure and use DNSSEC with Unbound and PowerDNS to enable validating queries. Attendees were encouraged to set up test environments to better understand DNSSEC.
This document describes research into developing a discovery method to ensure DNSSEC information can be delivered to end hosts. Measurements using RIPE ATLAS probes found that 64% of recursive resolvers could perform basic DNSSEC queries, while only 40% could process authenticated wildcard information. The proposed discovery method has stub resolvers first try the default recursive resolver, then the ISP resolver, a public resolver, or full recursion if needed, to balance functionality and efficiency.
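The fallback order the proposed discovery method uses can be sketched as a simple chain; the resolver names and probe interface below are hypothetical, standing in for the real probing logic:

```python
# Sketch of the discovery order: try the default recursive resolver
# first, then fall back step by step until a validating path is found.

FALLBACK_ORDER = ["default", "isp", "public", "full-recursion"]

def pick_resolver(supports_dnssec):
    """Return the first resolver in the chain that validates DNSSEC.

    `supports_dnssec` maps resolver name -> bool (a probe result)."""
    for resolver in FALLBACK_ORDER:
        if supports_dnssec.get(resolver, False):
            return resolver
    return None  # no validating path found

# e.g. if the default resolver fails validation but the ISP one works,
# the stub settles on the ISP resolver.
```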
Oracle Enterprise Data Quality for Siebel provides data quality services for Siebel CRM. It uses Siebel's universal connector interface to connect to EDQ web services for standardization, matching, and duplicate identification. Records are passed between Siebel and EDQ in real-time or batch jobs. EDQ matches records and returns possible matches to Siebel without storing the working data. Templates are provided for common data quality tasks like contact and account matching, verification, and standardization.
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere... (Principled Technologies)
Backing up business data with deduplication has the potential to clog network resources and slow down performance for your customers. Servers with powerful processors are crucial to running an efficient and secure infrastructure. We found that a Dell PowerEdge R720xd server powered by the Intel Xeon processor E5-2600 v2 Series successfully overcame this hurdle in our tests. When we used NetVault Backup to dedupe and back up data while running database workloads, performance did not slow down. The Dell solution using source-based deduplication reduced backup storage space by a factor of 39 and reduced application server network traffic by 70.5 percent compared with a traditional backup method. Strong processor and server combinations, such as the one we tested, can keep your business moving swiftly and save resources as you back up your data to keep it safe.
Supporting HBase - Jeff, Jon, Kathleen (Cloudera, Inc.)
This document outlines a presentation given at HBaseCon 2012 on supporting HBase. It introduces the presenters and their backgrounds in supporting HBase. The presentation covers preventative steps to maintain a healthy HBase cluster, such as monitoring, sizing the cluster properly, and avoiding swap. It also discusses common issues that require "triage", such as connection resets, running out of file descriptors, and long garbage-collection pauses. Finally, it addresses more severe issues that may require "surgery" to repair, such as an "unable to load database" error or a downed HBase master or RegionServers. The presentation gives examples of diagnosing issues using logs and event trails and offers recommendations for resolving different problems.
This document provides a summary of updates to MySQL between October 2020 and May 2021. It discusses three releases of MySQL 8.0 (versions 8.0.23, 8.0.24, and 8.0.25) and new features including invisible columns, asynchronous replication connection failover, improved load/dump functionality in MySQL Shell, and the new MySQL Database Service on Oracle Cloud Infrastructure with HeatWave for accelerated analytics.
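The invisible-columns feature mentioned above (introduced in MySQL 8.0.23) can be shown in a short DDL sketch; the table and column names here are made up:

```sql
-- Invisible column example (MySQL 8.0.23+); names are illustrative.
ALTER TABLE orders ADD COLUMN audit_tag VARCHAR(32) INVISIBLE;

SELECT * FROM orders;          -- audit_tag is omitted from SELECT *
SELECT audit_tag FROM orders;  -- but available when named explicitly
```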
English version of the presentation we gave at Devoxx FR 2012.
In-depth analysis of how the Java garbage collector works and how to minimise pauses in your application.
The document discusses new Cassandra drivers that use the CQL binary protocol and provide an asynchronous, failover-capable architecture. Key points include support for CQL's denormalized data model, request pipelining, notifications, configurable load balancing and retry policies, and automatic downgrading of consistency levels when necessary.
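The "downgrade consistency on failure" retry idea can be illustrated without the driver itself. This is a minimal sketch of the policy's decision logic, not the real driver API; the level names follow Cassandra convention but the replica-count model is simplified:

```python
# Sketch of a downgrading-consistency retry policy: try the strongest
# consistency level the live replica count can satisfy.

LEVELS = ["QUORUM", "TWO", "ONE"]   # strongest to weakest

def execute_with_downgrade(query, alive_replicas, required):
    """`required` maps level -> replicas needed; returns (result, level)."""
    for level in LEVELS:
        if alive_replicas >= required[level]:
            return f"executed {query} at {level}", level
    raise RuntimeError("not enough replicas alive at any level")

required = {"QUORUM": 2, "TWO": 2, "ONE": 1}
# With a single live replica, a QUORUM request downgrades to ONE
# instead of failing outright.
```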
Class lecture by Prof. Raj Jain on Application Delivery Networking. The talk covers Application Delivery in a Data Center, Application Delivery Controllers (ADCs), Load Balancing Concepts, Load Balancing Using DNS, Global Server Load Balancing (GSLB), Load Balancing Modes, Network Address Translation (NAT), NAT Example, NAT64 and NAT46, Carrier Grade NAT, Dual NAT, Server NAT, Transparent Mode, Firewall Load Balancing, Reverse Proxy Load Balancing, SSL Offload, TCP Multiplexing ADC, HTTP Compression, ADC Proliferation in Data Centers, Virtual ADCs, Multi-Function ADCs. Video recording available in YouTube.
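The load-balancing modes the lecture covers all build on one core idea; a minimal round-robin sketch (server addresses are made up) shows the baseline behavior that DNS-based and ADC-based balancing elaborate on:

```python
import itertools

# Minimal round-robin load balancer: rotate through the server pool.

class RoundRobin:
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def next_server(self):
        return next(self._cycle)

lb = RoundRobin(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [lb.next_server() for _ in range(4)]
# wraps around to the first server after the third pick
```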
The Very Very Latest in Database Development - Oracle Open World 2012 (Lucas Jellema)
The document discusses using virtual columns in Oracle databases to implement business rules and uniqueness constraints across tables in a declarative way. Virtual columns allow expressing attributes as SQL expressions of real columns, enabling indexing and foreign key constraints that check rules involving multiple tables or columns. Business rules that were previously only possible through procedural logic can now be enforced at the database level through virtual columns.
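A short DDL sketch shows the mechanism described above; the table and rule are illustrative, not from the talk:

```sql
-- Virtual column sketch (Oracle 11g+); names and rule are made up.
CREATE TABLE line_items (
  price    NUMBER,
  quantity NUMBER,
  total    NUMBER GENERATED ALWAYS AS (price * quantity) VIRTUAL
);

-- Virtual columns can be indexed and carry declarative constraints:
CREATE INDEX li_total_idx ON line_items (total);
ALTER TABLE line_items ADD CONSTRAINT total_nonneg CHECK (total >= 0);
```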
Benchmark and User Experience in Sahara (2015-07-04) (Wei Ting Chen)
Sahara provides a way to deploy and manage Hadoop clusters within an OpenStack cloud. It addresses common customer needs like providing an elastic environment for data processing jobs, integrating Hadoop with the existing private cloud infrastructure, and reducing costs. Key challenges include speeding up cluster provisioning times, supporting complex data workflows, optimizing storage architectures, and improving performance when using remote object storage.
This document provides an overview of the Domain Name System (DNS) including how DNS works, the record types stored in DNS, caching and authoritative servers, and delegation. It discusses how DNS queries work by recursively searching through the DNS hierarchy from the root servers down. Key record types like A, AAAA, NS, MX and SOA are described. The roles of caching servers, which forward queries on behalf of clients, and authoritative servers, which serve data from their own zone files, are compared. Delegation allows separate administration of different domains through the hierarchical structure of DNS.
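The caching-server behavior described above hinges on honoring record TTLs; this is a toy sketch of that bookkeeping (the record and TTL are made up, and real caches track much more):

```python
# Sketch of what a caching DNS server does with TTLs: serve a record
# until it expires, then fall back to re-querying authoritative servers.

class DnsCache:
    def __init__(self):
        self._entries = {}          # name -> (record, expires_at)

    def put(self, name, record, ttl, now):
        self._entries[name] = (record, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None             # miss: would forward the query
        record, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # expired: must re-query
            return None
        return record

cache = DnsCache()
cache.put("example.org", "93.184.216.34", ttl=300, now=0)
```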
MySQL Cluster: Scaling to a Billion Queries (Bernd Ocklin)
MySQL Cluster is a distributed database that provides extreme scalability, high availability, and real-time performance. It uses an auto-sharding and auto-replicating architecture to distribute data across multiple low-cost servers. Key benefits include scaling reads and writes, 99.999% availability through its shared-nothing design with no single point of failure, and real-time responsiveness. It supports both SQL and NoSQL interfaces to enable complex queries as well as high-performance key-value access.
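The auto-sharding idea can be sketched as stable hashing: every client computes the same placement for a key, so reads and writes for that key always land on the same node. This is a generic illustration, not MySQL Cluster's actual partitioning function:

```python
import hashlib

# Auto-sharding sketch: a stable hash routes each key to exactly one
# of the data nodes. Node count and key format are illustrative.

def shard_for(key: str, num_nodes: int) -> int:
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes

node = shard_for("user:42", 4)   # deterministic: same key, same node
```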
The document discusses data partitioning and distribution across multiple machines in a cluster. It explains that data replication does not scale well, but data partitioning, where each record exists on only one machine, allows write latency to scale with the number of machines in the cluster. Coherence provides a distributed cache that partitions data and offers functions for server-side processing near the data through tools like entry processors.
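The entry-processor idea mentioned above (shipping a small function to the partition that owns the key, instead of read-modify-write over the network) can be sketched like this; the class and names are illustrative, not Coherence's API:

```python
# Sketch of server-side processing near the data: the partition that
# owns a key applies the mutation locally, in one round trip.

class Partition:
    def __init__(self):
        self.data = {}

    def invoke(self, key, processor):
        """Run `processor` against the locally stored value for `key`."""
        self.data[key] = processor(self.data.get(key))
        return self.data[key]

inventory = Partition()
inventory.data["sku-1"] = 10
# decrement stock where the data lives, atomically w.r.t. this partition:
remaining = inventory.invoke("sku-1", lambda v: v - 1)
```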
IT Infrastructure Through The Public Network: Challenges And Solutions (Martin Jackson)
Identifying the challenges that companies face when they wish to adopt Infrastructure as a Service offerings like those from Amazon and Rackspace, and possible solutions to those problems. This presentation seeks to provide insight and possible solutions, covering the areas of security, availability, cloud standards, interoperability, vendor lock-in and performance management.
1. The difficulties in deploying and managing the life cycle of data-heavy applications
2. A review of the Kubernetes landscape with respect to data-heavy applications
3. The Robin approach to orchestrating data-heavy applications
Dataservices: Processing Big Data the Microservice Way (QAware GmbH)
O'Reilly Software Architecture Conference 2018, New York (USA): Talk by Mario-Leander Reimer (@LeanderReimer, Principal Software Architect at QAware)
Abstract:
Big data processing, microservices, and cloud-native technology are a match made in computing heaven, enabling microservices to be used to build a flexible, scalable, and distributed system of loosely coupled data processing tasks, called data services.
Mario-Leander Reimer explores key JEE technologies that can be used to build JEE-powered data services and walks you through implementing the individual data processing tasks of a simplified showcase application. You’ll then deploy and orchestrate the individual data services using OpenShift, illustrating the scalability of the overall processing pipeline. The context and content are taken from a real-world project for a major German car manufacturer, implementing a microservices-based processing pipeline that uses car-related event data (sensor data, traffic events, and other real-time data) for a traffic information management and route optimization system.
Speaker: Tom Spitzer, Vice President, Engineering, EC Wise, Inc.
Session Type: 40 minute main track session
Level: 200 (Intermediate)
Track: Security
MongoDB Community Server provides a wide range of capabilities for securing your MongoDB installation. In this session, we will focus on access control features, including authentication and authorization mechanisms, that enable you to enforce a least privilege model on user accounts. We will also discuss strategies for enabling and maintaining service and application accounts. Next we will present the encryption capabilities that are available in the community edition and discuss their benefits and possible shortcomings. Finally, we will talk about application level protections your developers can implement to keep risky code from getting to your MongoDB instance.
What You Will Learn:
- The workings of the MongoDB User Management Interface, the Authentication Database, basic Authentication mechanisms (SCRAM-SHA-1 and certificates), Roles, and Role Based Access controls – plus best practices for using these features to improve the security of your database.
- How to use TLS/SSL for transport encryption, application encryption options, and field level redaction.
- How injection attacks work and how to minimize the risk of injection attacks.
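One application-level guard against the injection attacks mentioned in the last bullet is to reject user-supplied filter documents containing MongoDB operator keys before they reach the driver. This is a hedged sketch of that idea, not an official MongoDB mechanism, and the helper name is made up:

```python
# Sketch: refuse user-supplied filters that smuggle in query operators
# (keys beginning with '$'), e.g. {"username": {"$ne": None}}.

def reject_operators(filter_doc):
    """Raise ValueError if any key in the (possibly nested) dict is an operator."""
    for key, value in filter_doc.items():
        if isinstance(key, str) and key.startswith("$"):
            raise ValueError(f"operator key not allowed: {key}")
        if isinstance(value, dict):
            reject_operators(value)     # check nested documents too
    return filter_doc

safe = reject_operators({"username": "alice"})   # passes through unchanged
```

Real deployments would combine this with schema validation and parameterized query construction rather than rely on a single filter.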
Actionable Metrics at Production Scale - LSPE Meetup June 27, 2012 (SOASTA)
Presentation to Large Scale Production Engineering meetup at Yahoo campus on June 27, 2012. Dan Bartow, VP Product Management at SOASTA discussed critical performance metrics.
Apache Hadoop 3.x State of the Union and Upgrade Guidance - Strata 2019 NY (Wangda Tan)
The document discusses Apache Hadoop 3.x updates and provides guidance for upgrading to Hadoop 3. It covers community updates, features in YARN, Submarine, HDFS, and Ozone. Release plans are outlined for Hadoop, Submarine, and upgrades from Hadoop 2 to 3. Express upgrades are recommended over rolling upgrades for the major version change. The session summarizes that Hadoop 3 is an eagerly awaited release with many successful production uses, and that now is a good time for those not yet upgraded.
The Good, the Bad and the Ugly of Migrating Hundreds of Legacy Applications ... (Josef Adersberger)
Running applications on Kubernetes can provide a lot of benefits: more dev speed, lower ops costs, and higher elasticity and resiliency in production. Kubernetes is the place to be for cloud-native apps. But what do you do if you have no shiny new cloud-native apps, just a whole bunch of JEE legacy systems? No chance to leverage the advantages of Kubernetes? Yes you can!
We’re facing the challenge of migrating hundreds of JEE legacy applications of a major German insurance company onto a Kubernetes cluster within one year. We're now close to the finish line and it worked pretty well so far.
The talk will be about the lessons we've learned - the best practices and pitfalls we've discovered along our way. We'll provide our answers to life, the universe and a cloud native journey like:
- What technical constraints of Kubernetes can be obstacles for applications and how to tackle these?
- How to architect a landscape of hundreds of containerized applications, with their surrounding infrastructure like DBs, MQs, and IAM, under heavy security requirements?
- How to industrialize and govern the migration process?
- How to leverage the possibilities of a cloud native platform like Kubernetes without challenging the tight timeline?
Migrating Hundreds of Legacy Applications to Kubernetes - The Good, the Bad, ... (QAware GmbH)
CloudNativeCon North America 2017, Austin (Texas, USA): Talk by Josef Adersberger (@adersberger, CTO at QAware)
This document provides a summary of Sanjay Arya's qualifications and work experience. It includes over 24 years of experience in database administration, cloud computing, and software development. Recent work has focused on Oracle and AWS cloud administration, including database migration, backup/recovery, and database replication between Oracle and SQL Server. Previous roles involved administration of Oracle RAC, Exadata, high availability clusters, disaster recovery, and storage administration.
Partha Seetala is the CTO of Robin Systems, which provides a Kubernetes platform for running big data, NoSQL, database, and AI/ML workloads. Robin addresses challenges with containerizing these applications, such as resource management and storage and networking issues. Robin's solution allows applications to drive infrastructure configuration for improved user experience with capabilities like one-click provisioning, scaling, cloning, backup, and migration of applications across clouds.
The Future of Hadoop: A deeper look at Apache SparkCloudera, Inc.
Jai Ranganathan, Senior Director of Product Management, discusses why Spark has experienced such wide adoption and provide a technical deep dive into the architecture. Additionally, he presents some use cases in production today. Finally, he shares our vision for the Hadoop ecosystem and why we believe Spark is the successor to MapReduce for Hadoop data processing.
Hadoop is an open-source software framework written in Java for distributed storage and processing of large datasets across clusters of computers. It allows for the reliable, scalable, and distributed processing of large data sets across clusters of commodity hardware. Hadoop features include a distributed file system called HDFS that stores data across compute nodes, and a programming model called MapReduce that processes data in parallel across the cluster.
Similar to Data Locality, Latency and Caching: JSR-107 and the new Java JCACHE Standard (20)
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
2. Early (Major) Problems:
Winter 1997: high-stakes problems encountered by those enterprises that had trusted the "Web applications are ready!" promise.
CASE: The week before Christmas 1997, the web-facing OMS at www.amazon.com repeatedly crashed. Availability was severely impacted. Millions of dollars in potential revenue were lost. Credibility with customers was damaged. Put simply, Amazon.com cannot go down the week before Christmas!
An urgent, immediate solution was required to correct legacy application architectures so that they could operate reliably and performantly at "Internet scale".
Enter Caching Technology!
Data Locality, Latency and Caching: JSR-107 and the new Java Standard
3. What is at stake for Caching Solution Providers and their Customers?
Caching technology providers had a huge incentive to deliver a 100% sound and 100% complete solution. At stake? The World Wide Web as a reliable and performant application platform.
NEEDED:
1. Accommodate application architects & developers with the capability to rapidly implement Caching technology.
2. Offload the database (ULTRA URGENT!)
3. Scale UP (How do I make the cache's in-process heap scale?)
4. Scale OUT (How do I distribute my cache to multiple processes?)
5. Buffer against load variability
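Need 2 above, offloading the database, is what the classic cache-aside (read-through) pattern delivers. A minimal JDK-only sketch, with a hypothetical loadFromDatabase() standing in for the real data source (no vendor caching API involved):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger databaseHits = new AtomicInteger();

    // Hypothetical stand-in for the expensive, remote data source.
    private String loadFromDatabase(String key) {
        databaseHits.incrementAndGet();
        return "value-for-" + key;
    }

    // Read-through: consult the cache first; load and populate only on a miss.
    public String get(String key) {
        return cache.computeIfAbsent(key, this::loadFromDatabase);
    }

    public static void main(String[] args) {
        CacheAside c = new CacheAside();
        c.get("order:42");   // miss -> one trip to the database
        c.get("order:42");   // hit  -> served from the local heap
        System.out.println("database hits: " + c.databaseHits.get()); // 1
    }
}
```

Every repeat read is absorbed by the in-process heap, which is exactly the "offload" the slide demands; scale-up and scale-out (needs 3 and 4) are then about growing that heap and distributing it.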
4. A Brief History of the Data Locality & Latency Problem
5. The very first data locality problem: Ma Bell and the FCC assignment of Area Codes, 1947.
NYC: 212
LA: 213
Chicago: 312
…
Memphis: 901
Raleigh: 919
(The densest call markets received the codes that were fastest to dial on a rotary phone.)
6. The Data Locality Problem
Big Decision: How much am I willing to "pay" to have my data as close as possible to processing resources?
7. Moving the data further away from the processing …
[Diagram: processes P1 and P2 on the same host, communicating through the local Unix kernel (IPC messages, semaphores, shared memory)]
8. A little further …
[Diagram: P1 and P2 talking to an RDBMS on the same host: local client-server]
9. A little further … LAN Client-Server
[Diagram: P1 on Client Host 1 and P2 on Client Host 2 reaching an RDBMS on a separate data server across the LAN]
10. And further still … WAN Client-Server
[Diagram: P1 and P2 crossing the Internet to reach an RDBMS at a public data source]
11. Latency Problem = Given that data is available, how fast can I access it for processing?
Forget networked client-server!! All low-latency apps use highly optimized data feeds to bring the data "local".
12. Latency Math …
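The slide's figures are not reproduced in this transcript, but the governing arithmetic is short: a signal in fiber propagates at roughly two-thirds the speed of light, so distance alone puts a hard floor under round-trip time. A sketch (the ~1,150 km NYC-Chicago distance is an approximation):

```java
public class LatencyMath {
    static final double C_KM_PER_S = 299_792.0;   // speed of light in vacuum
    static final double FIBER_FACTOR = 2.0 / 3.0; // typical refractive-index penalty

    // Theoretical minimum round-trip time in milliseconds over fiber.
    static double minRttMillis(double distanceKm) {
        double oneWaySeconds = distanceKm / (C_KM_PER_S * FIBER_FACTOR);
        return oneWaySeconds * 2 * 1000;
    }

    public static void main(String[] args) {
        // Approximate NYC-Chicago straight-line distance: ~11.5 ms minimum RTT.
        System.out.printf("min RTT NYC-Chicago: %.2f ms%n", minRttMillis(1_150));
        // An in-process heap read is on the order of 100 ns:
        // locality wins by roughly five orders of magnitude.
    }
}
```

No protocol tuning can beat this floor, which is why the deck's answer is to move the data, not to speed up the wire.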
13. Latency Physics …
2,000 PhD physicists now working on Wall St.
U.S. colleges and universities cannot produce nearly enough physics PhDs to meet Wall St. demand.
Many of these PhD physicists are from former Soviet-Bloc countries.
COLD-WAR priority work = building defenses that could stop Western advances in stealth technology (largely unsuccessful)
POST-COLD-WAR priority work = designing low-latency, high-frequency capital markets trading algorithms (very successful!)
14. Latency Physics …
Famous LL/HF capital markets trading algorithms:
"Cherokee Nation"
"Marco Polo"
"Boston Shuffler"
"Frog Pond"
"Carnival"
"Twilight"
15. Why these names, e.g. why "Cherokee Nation"?
16. Latency Physics …
Bottom Line: Great Latency iff Great Location.
FAMOUS CASE STUDY = the NYC Carrier Hotel
Algorithms "check in" but … they don't "check out"!
(details to follow)
http://www.carrierhotels.com/news/September2001/60hudson.shtml
22. The full story of the legend of the NYC Carrier Hotel, and how it became the place where "Wall Street trading algorithms check in … but they don't check out", is very compellingly told here:
http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world.html
23. Beyond LL/HF Caching's mathematics, physics and (dramatic) case studies, we can sum up the general definition of Caching performance in one simple expression ratio:
How can we best make this happen with Java Caching technology?
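The ratio itself was on a slide image that does not survive in this transcript; the conventional figure of merit it alludes to is the cache hit ratio, hits / (hits + misses). A minimal sketch (the latency figures in main are hypothetical):

```java
public class HitRatio {
    // Fraction of accesses served from the cache rather than the backing store.
    static double hitRatio(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        // 900 of 1,000 lookups answered locally.
        System.out.println(hitRatio(900, 100)); // 0.9
        // Effective access time blends cache and source latency by this ratio
        // (hypothetical: 100 ns cache hit, 10 ms database round trip):
        double effectiveMs = 0.9 * 0.0001 + 0.1 * 10.0;
        System.out.printf("effective access: %.3f ms%n", effectiveMs);
    }
}
```

The blended-latency line is the whole caching argument in one expression: even a modest hit ratio collapses the average access time toward the local figure.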
26. CAP: The simplest view of simultaneous service guarantee capabilities and tradeoffs.
[Diagram: DataNode 1 and DataNode 2 separated by an abstract communication partition]
1. Allowing 1 or both available nodes to be UPDATED means you lose simultaneous Consistency.
2. To guarantee simultaneous Consistency you must make 1 node unAvailable.
3. If you want both C & A then all nodes must communicate and synchronize, but then you gain Partition risk (communication could break).
27. Ben starts a "Birthday Reminder" service
@t=1 Call Ben … put("name","DOB");
@t=2 Call Ben … get("name","DOB");
// lacks availability … what if 2 calls arrive at the same time?

Ben adds his ex-wife to answer phones:
Call svc … put("name","DOB");
Call svc … get("name","DOB");
// lacks consistency … what if
@t=1 Ben answers = put("name","DOB")
@t=2 call back … but the ex-wife answers … get("name","DOB") fails?

Ben & ex-wife add a coherency and synchronization protocol:
Call svc … put("name","DOB");
Ben & ex-wife get in a fight … stop talking. // coherency sol'n useless!
Call svc … get("name","DOB");
// lacks partition tolerance
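Ben's story maps directly onto code. A toy sketch, where two in-process maps stand in for the two "nodes" and the partitioned flag is the broken phone line (all names here are illustrative): during a partition you must choose between an available-but-stale answer and a consistent-but-refused one.

```java
import java.util.HashMap;
import java.util.Map;

public class CapSketch {
    final Map<String, String> ben = new HashMap<>();    // replica 1
    final Map<String, String> exWife = new HashMap<>(); // replica 2
    boolean partitioned = false;                        // "they stop talking"

    void put(String name, String dob) {
        ben.put(name, dob);
        if (!partitioned) exWife.put(name, dob);        // synchronous replication
    }

    // Favor Availability: always answer, possibly with stale data.
    String getAvailable(String name) {
        return exWife.getOrDefault(name, "unknown");
    }

    // Favor Consistency: refuse to answer while the replicas cannot sync.
    String getConsistent(String name) {
        if (partitioned) throw new IllegalStateException("unavailable during partition");
        return exWife.get(name);
    }

    public static void main(String[] args) {
        CapSketch svc = new CapSketch();
        svc.put("Alice", "1990-01-01");
        svc.partitioned = true;
        svc.put("Bob", "1985-06-15");                // only Ben's copy updates
        System.out.println(svc.getAvailable("Bob")); // stale but available
    }
}
```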
29. Protocols for Cache Replacement/Consistency
• Belady's Algorithm (aka Clairvoyant)
• LRU
• MRU
• Random
• LFU
• LIRS
• CLOCK – FIFO/LIFO/FILO/LILO

Protocols for Cache Partition (Node) Health/Healing
• Cardiology Protocol
  - Nodes obliged to produce a "Heart Beat"
  - A central "Cardiologist" consumes HB pulses and treats "illness"
• Gossip Protocol
  - Nodes obliged to produce "rumors" about neighboring nodes
  - A central "Gossip columnist" collects rumors and treats "illness" based on a Gossip-credibility policy
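Of the replacement policies listed above, LRU is the one the JDK nearly gives away: java.util.LinkedHashMap in access order, evicting the eldest entry once a fixed capacity is exceeded. A minimal sketch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(capacity, 0.75f, true); // accessOrder = true -> LRU iteration order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;     // evict the least-recently-used entry
    }

    public static void main(String[] args) {
        LruCache<String, Integer> c = new LruCache<>(2);
        c.put("a", 1);
        c.put("b", 2);
        c.get("a");     // touch "a": now "b" is least recently used
        c.put("c", 3);  // exceeds capacity -> evicts "b"
        System.out.println(c.keySet()); // [a, c]
    }
}
```

Production caches add concurrency and size-in-bytes accounting on top, but the eviction decision is exactly this policy.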
30. Java Caching API "Join Points"
[Diagram of the API's join points not reproduced in this transcript]
31. Enough of the theory of Caching … on to the standard Java Caching API. On to JSR-107!
Before JSR-107, all Java Caching vendors provided proprietary APIs, so caching code suffered from vendor lock-in.
Just like JDBC solved this problem for the RDBMS community, JSR-107 solves it for the JCACHE community.
JSR-107 is completely open: it is not hosted on a private jcp.org EG forum, but on the public EG forum at http://groups.google.com/group/jsr107.
38. A JSR-107 compliant Caching implementation is "Transactional" in 2 important (different) senses:
1. Compound access/mutate operations against a Cache Entry are guaranteed to be EXCLUSIVE and ATOMIC. (see the javax.cache.Cache.EntryProcessor interface)
2. Compound operations against a Cache are guaranteed to preserve transactional ACID properties. (see JTA and the javax.transaction.UserTransaction interface)
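Sense 1 above is what the javax.cache.Cache.EntryProcessor interface standardizes: an exclusive, atomic compound access/mutate on a single entry. A vendor-free analogue of that guarantee, sketched with java.util.concurrent.ConcurrentHashMap.compute, which is likewise atomic per key (the account/balance names are illustrative):

```java
import java.util.concurrent.ConcurrentHashMap;

public class AtomicEntryUpdate {
    private final ConcurrentHashMap<String, Long> balances = new ConcurrentHashMap<>();

    // Compound access + mutate, atomic per entry: no other thread can
    // interleave between the read and the write of this key.
    long credit(String account, long amount) {
        return balances.compute(account, (k, v) -> (v == null ? 0L : v) + amount);
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicEntryUpdate bank = new AtomicEntryUpdate();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) bank.credit("acct", 1);
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(bank.credit("acct", 0)); // 20000 -- no lost updates
    }
}
```

Without the per-entry atomicity, the two threads' read-modify-write cycles would interleave and lose updates; that is precisely what sense 1 rules out for a compliant cache.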
39. Important to note re: JSR-107 specification of "Transactional" Java Caching …
If a Java Caching provider claims to be "JSR-107 Transactions compliant", then all of the following transaction isolation levels are guaranteed to be available in the provider's JSR-107 implementation:
TX_READ_UNCOMMITTED
TX_READ_COMMITTED – no 'Dirty Read' guaranteed
TX_REPEATABLE_READ – 'Repeatable Read' assurance guaranteed
TX_SERIALIZABLE – no 'Phantom Read' guaranteed
In other words, a 100% sound and 100% complete API capability for managing transaction isolation is guaranteed.
CRUCIAL for capital markets Java Caching use cases!!
40. JSR-107 => all Java Caches now guarantee TX capabilities:

@t    TX_THR_1              JCACHE_ENTRY    TX_THR_2
0     utx.begin();          $5
1     mutate(credit $1)     $6              utx.begin();
2                           $6              val = access() //$6
3     //more processing     $6              //more processing
4     utx.rollback()        $5
5                           $5              utx.commit() //$6 ??

// NOTE: @t=2 a DIRTY_READ is realized by TX_THR_2's access()
41. Can we use the JSR-107 API to prevent Dirty Reads?
Of course!! Just code TX_THR_2 like this:

javax.cache.CacheManager cm = javax.cache.Caching.getCacheManager(id);
javax.cache.Cache c = cm.createCacheBuilder("bank_cache")
    .setTransactionEnabled(
        javax.cache.transaction.IsolationLevel.READ_COMMITTED,
        javax.cache.transaction.Mode.LOCAL
    ).build();
javax.transaction.UserTransaction jsr107_utx = new JSR107ProviderJTAUserTransactionImpl(c);
jsr107_utx.begin();
try {
    // process
    BigDecimal cash = c.get(ACCOUNT_ID);
    // @t=2 … guaranteed to BLOCK until TX_THR_1 commits/rolls back
    // process more
    jsr107_utx.commit();
} catch (Exception e) {
    jsr107_utx.rollback();
}
// STILL NEEDED => OPTIMISTIC and PESSIMISTIC block policy (via API) for the @t=2 DIRTY_READ
42. JSR-107 => DIRTY READ prevention

@t    TX_THR_1              JCACHE_ENTRY    TX_THR_2
0     utx.begin();          $5
1     mutate(credit $1)     $6              utx.begin();
2                           $6              val = access() //BLOCK
3     //more processing     $6
4     utx.rollback()        $5
5                           $5              val = access() //$5

// DIRTY READ guaranteed to be PREVENTED
43. Can we use the JSR-107 API to accommodate all TX_ISOLATION tolerances (and intolerances)?
What about rare transactional use cases that have extreme TX_ISOLATION demands, e.g. intolerance to PHANTOM_READs? Can JSR-107 accommodate these rare (but important) use cases?
Answer = YES!!
What about multiple, dynamic views of the same Cache with different (possibly extreme) TX_ISOLATION demands?
Answer = YES
E.g. a common Bond Portfolio CACHE, a HY Credit Portfolio (3,000 bonds), shared by both the RETAIL and INSTITUTIONAL trading desks:

RETAIL desk
- 1000s of trades daily
- Each trade = a few millis to transact
- Avg trade = $5,000 USD
- Use JSR-107 TX_READ_COMMITTED

INSTITUTIONAL desk
- a dozen or so trades daily
- Each trade = 4-6 secs to transact
- Avg trade = $10,000,000 USD
- PHANTOM_READ intolerant
- Use JSR-107 TX_SERIALIZABLE
56. Questions (?)
Acknowledgment: Much of this PPT was taken from Greg Luck and Cameron Purdy's JavaOne 2012 presentation on JSR-107, see http://www.parleys.com/#st=5&id=3046