Elliptics is a distributed, fault-tolerant data storage system built on distributed hash table (DHT) principles. It is designed to store medium to large data records, from 1 KB to terabytes in size. Key features include no single point of failure, high availability during network or hardware failures, and fast read and write speeds through techniques such as asynchronous I/O, caching, and direct peer-to-peer data streaming for large files. Elliptics ensures data consistency through replication across data centers and automatic data repartitioning when nodes are added or removed. It provides simple interfaces for C/C++ and Go, and over HTTP/REST.
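The DHT placement idea behind this design can be sketched with a small consistent-hash ring in Python (the node names, virtual-node count, and hashing scheme here are illustrative; Elliptics' actual ring logic differs):

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: a key is owned by the first node clockwise."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets many virtual points on the ring for even spread.
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

    def node_for(self, key):
        # Find the first ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")
```

The property that matters for repartitioning is that adding or removing a node moves only a fraction of the keys, rather than reshuffling everything.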
Deep Dive into Automating Oracle GoldenGate Using the New Microservices - Kal BO
Oracle OpenWorld 2017; please download it.
In this session, learn from Oracle Development and Product Management how to automate and embed Oracle GoldenGate using the new Oracle GoldenGate microservices. Learn how to embed and orchestrate Oracle GoldenGate for your use case, similar to how Oracle Database sharding embeds and automates Oracle GoldenGate. Learn how to use the new conflict detection and resolution for active-active environments, using the new database integration to automate this functionality.
Integrated Cloud Platform: Database, Integration
Code: CON6569
Session Type: Conference Session
SPEAKERS
Nick Wagner, Oracle
Volker Kuhr, Senior Principal Product Manager, Oracle
Jing Liu, Director, Development, Oracle
How Many Ways to Monitor Oracle GoldenGate - Collaborate 14 - Bobby Curtis
The document provides contact information for Bobby Curtis, a senior technical consultant specializing in Oracle GoldenGate and Oracle Enterprise Manager 12c. It lists his location, affiliations, areas of expertise, and contact details including his Twitter, blog, and email addresses. The document also provides links to registration and location pages for an upcoming training event from Enkitec and an overview of the topics to be covered, including monitoring approaches for Oracle GoldenGate.
Slide 1 - Parallels Plesk Control Panel 8.6.0 - webhostingguy
The document discusses various maintenance items and PTFs for IBM DB2 including:
- PTFs for DB2 Version 8 and z/OS to fix various issues like performance problems, errors, and serviceability enhancements
- New features in recent DB2 releases including support for longer SQL statements in ODBC, improved monitoring of real storage usage, and preliminary support for IBM's Enterprise Workload Manager
- Details on fixes for specific problems like encrypting passwords for distributed data, diagnosing hung threads, and monitoring when dynamic SQL exceeds resource limits.
Improve PostgreSQL Replication with Oracle GoldenGate - Bobby Curtis
This document discusses using Oracle GoldenGate 19c to improve PostgreSQL replication. It provides an overview of RheoData, a global systems integrator, and then details the steps to configure GoldenGate for PostgreSQL replication, including prerequisites, installation, registering an extract, adding transaction data, adding an extract and replicat, and monitoring replication slots and statistics. It also covers using GoldenGate for on-premises to cloud replication with a remote apply to an AWS RDS PostgreSQL database.
VoltDB is a high performance database for real-time analytics that can be deployed on SoftLayer cloud infrastructure. The document outlines the process to install and run VoltDB on SoftLayer, including unpacking the VoltDB distribution, installing Java, exporting the VoltDB binaries to the path, and running VoltDB using the run.sh script. It also discusses how VoltDB enables real-time analytics by ingesting and exporting data to Netezza for deeper historical analysis in a closed loop system.
Oracle GoldenGate is Oracle's strategic solution for real-time data integration. It provides low-impact capture, routing, and delivery of transactional data across heterogeneous environments in real time. GoldenGate supports data replication between various database platforms for scenarios such as real-time reporting, zero-downtime migrations and upgrades, and data consolidation.
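The capture-route-deliver pattern described above can be illustrated with a toy change-data-capture sketch (this is a simplification for intuition only, not GoldenGate's actual trail-file mechanics; table layout and record format are invented):

```python
# Toy CDC pipeline: capture row changes from a source snapshot diff,
# route them as an ordered list of change records, deliver to a target.

def capture(old, new):
    """Diff two table snapshots into an ordered list of change records."""
    changes = []
    for key, row in new.items():
        if key not in old:
            changes.append(("INSERT", key, row))
        elif old[key] != row:
            changes.append(("UPDATE", key, row))
    for key in old:
        if key not in new:
            changes.append(("DELETE", key, None))
    return changes

def deliver(target, changes):
    """Apply change records to a target table in order."""
    for op, key, row in changes:
        if op == "DELETE":
            target.pop(key, None)
        else:
            target[key] = row
    return target

source_before = {1: {"name": "ann"}, 2: {"name": "bob"}}
source_after  = {1: {"name": "anne"}, 3: {"name": "carl"}}
trail = capture(source_before, source_after)
replica = deliver(dict(source_before), trail)
```

Real products capture from the transaction log rather than diffing snapshots, precisely so that capture stays low-impact on the source.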
The document describes a migration from an Oracle database topology to a PostgreSQL database topology at ACI. It discusses the starting Oracle topology with issues around operational complexity and non-ACID compliance. It then describes the target PostgreSQL topology with improved performance, availability and lower costs. The document outlines decisions around tools, extensions, code changes and testing approaches needed for the migration. It also discusses options for migrating the data and cutting over to the new PostgreSQL environment.
This document provides an overview and agenda for a presentation on Couchbase and its integration with Hadoop ecosystems. It begins with an introduction to Couchbase as a NoSQL database offering key-value, document, and query capabilities. It then discusses Couchbase's role in operational and analytical use cases as well as its architecture involving Couchbase nodes, SDKs, and clusters. The document outlines how Couchbase can integrate with Hadoop for streaming, batch processing, and serving merged results. It also provides an example of Couchbase's implementation at PayPal, including its use of Kafka to stream data from Couchbase to Hadoop clusters. The presentation concludes with a demo of the Couchbase Kafka connector.
Rapid Home Provisioning is a new feature in Oracle Grid Infrastructure 12c R2 that provides a simplified way to provision and patch Oracle software and databases. It uses a centralized management server and golden images stored on ACFS to deploy pre-packaged and patched Oracle homes to client nodes. Administrators can easily create working copies of golden images, deploy databases from the working copies, and seamlessly patch databases by moving them to a working copy based on a newer patched golden image with a single command.
This document discusses tuning Oracle GoldenGate for optimal performance. It begins with an overview of GoldenGate architecture and use cases, then discusses the importance of baseline monitoring. Key metrics to monitor are identified as lag times, checkpoint information, CPU usage, memory usage, and disk I/O. The document provides examples of commands to gather baseline data on these metrics. It then discusses configuring GoldenGate for parallel processing using multiple process groups to optimize performance. Overall it provides guidance on setting baselines and configuring GoldenGate to minimize lag times and resource utilization.
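A baseline script for the lag metric mentioned above might parse GGSCI-style `HH:MM:SS` lag values and flag process groups over a threshold (the group names and the report snapshot here are hypothetical, not actual GGSCI output):

```python
def lag_seconds(hhmmss):
    """Convert an HH:MM:SS lag string to seconds."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s

def over_threshold(lag_report, limit_s=60):
    """Return the process groups whose lag exceeds limit_s seconds."""
    return sorted(g for g, lag in lag_report.items() if lag_seconds(lag) > limit_s)

# Hypothetical lag snapshot keyed by process group name.
report = {"EXT1": "00:00:05", "PMP1": "00:02:30", "REP1": "00:00:45"}
slow = over_threshold(report)  # ["PMP1"]
```

Collecting such snapshots periodically gives the baseline against which parallel-processing changes can be judged.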
7 September 2017 - At ION Conference Durban, South Africa, Andrew Alston on how Liquid Telecom deployed IPv6 and how other organizations can do the same.
The University of Edinburgh is undergoing a large project to reprocure its campus networking infrastructure. The existing network, which has grown organically over many years, contains equipment that is up to 20 years old and no longer meets the university's needs. After an internal review in 2014 recommended a new network be procured, the university embarked on a multi-stage competitive dialogue procurement process that is still ongoing. The process involves pre-market engagement, shortlisting bidders, and multiple rounds of dialogue and evaluation to refine solutions before selecting a final vendor. The procurement has proven to be a large undertaking but may result in a network solution tailored to the university's unique requirements.
Oracle GoldenGate: First Acquaintance
Oracle's strategic solution for real-time data integration: Oracle GoldenGate provides low-impact capture, routing, transformation, and delivery of transactional data across heterogeneous environments in real time.
Oracle GoldenGate provides real-time data integration and replication capabilities. It uses non-intrusive change data capture to replicate transactional changes in real-time across heterogeneous database environments with sub-second latency. GoldenGate has over 500 customers across various industries and supports workloads involving terabytes of data movement per day. It extends Oracle's data integration and high availability capabilities beyond Oracle databases to other platforms like SQL Server and MySQL.
Effective Oracle Home Management in the New Release Model Era - Ludovico Caldara
How many companies can afford to patch their environments regularly?
Patching and maintaining a large number of Oracle databases is perceived as complex by most companies. Is there a way to make patching simpler and more controlled? What are the best (and worst) practices for Oracle Home maintenance?
What are the challenges of the new release model that will bring us one new major release per year?
In this session, we will explain some ideas to improve Oracle Home management and database patching, as well as practical examples of automated environments, live demos included!
What's New in Oracle Database 12c Release 12.1.0.2 - Connor McDonald
This document provides an overview of new features in Oracle Database 12c Release 1 (12.1.0.2). It discusses Oracle Database In-Memory for accelerating analytics, improvements for developers like support for JSON and RESTful services, capabilities for accessing big data using SQL, enhancements to Oracle Multitenant for database consolidation, and other performance improvements. The document also briefly outlines features like Oracle Rapid Home Provisioning, Database Backup Logging Recovery Appliance, and Oracle Key Vault.
Are your Oracle databases highly available? You have deployed Real Application Clusters (RAC), Data Guard, or Failover Clusters and are well protected against server failures? Great – the prerequisites for a highly available environment are in place. However, to ensure that backend infrastructure failures also remain transparent to the client, appropriate configuration is required.
This lecture will discuss the Oracle technologies that can be used to achieve automatic client failover. What are the advantages, but also the limitations, of these technologies?
PayPal has seen tremendous growth in recent years, processing over 7.8 billion payments transactions annually for over 227 million active customer accounts across 200+ markets and currencies. To support this scale, PayPal's data infrastructure includes over 2,000 database instances, 116 billion database calls per day, and over 74 petabytes of total storage. PayPal continues enhancing its data infrastructure to meet growing analytics and machine learning needs through technologies like Kafka, Hadoop, graph databases and real-time OLAP engines.
MySQL Cluster Performance Tuning - 2013 MySQL User Conference - Severalnines
Slides from a presentation given at Percona Live MySQL Conference 2013 in Santa Clara, US.
Topics include:
- How to look for performance bottlenecks
- Foreign Key performance in MySQL Cluster 7.3
- Sharding and table partitioning
- Efficient use of datatypes (e.g., BLOB vs. VARBINARY)
How to migrate AWS RDS Oracle DBs to OCI using OCI Backup Service. View how you can migrate your Oracle databases on AWS to OCI. View the recording at : https://asktom.oracle.com/pls/apex/asktom.search?oh=7575
Writing High-Performance Software by Arvid Norberg - bittorrentinc
The document discusses techniques for writing high-performance software, focusing on optimizing memory access and reducing context switching. It covers CPU memory hierarchies, data structures, and socket programming. Some key points include organizing data sequentially in memory to improve cache hits, batching work to amortize context switching costs, and using asynchronous I/O to avoid blocking threads on disk or network operations.
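The asynchronous-I/O point can be made concrete with a short sketch using Python's asyncio (a minimal illustration of overlapping waits, not code from the talk): ten simulated 0.1-second I/O operations complete in roughly 0.1 seconds total instead of one second, because no thread blocks while waiting.

```python
import asyncio
import time

async def fake_io(delay):
    """Stand-in for a disk or network wait (does not block the thread)."""
    await asyncio.sleep(delay)
    return delay

async def main():
    # Overlap ten 0.1 s "I/O" operations instead of running them serially.
    start = time.monotonic()
    results = await asyncio.gather(*(fake_io(0.1) for _ in range(10)))
    elapsed = time.monotonic() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
```

The same principle underlies batching: amortize a fixed per-operation cost (syscall, context switch) across many units of work.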
Handling Kernel Upgrades at Scale - The Dirty Cow Story - DataWorks Summit
Apache Hadoop at Yahoo is a massive platform: 36 different clusters spread across YARN, Apache HBase, and Apache Storm deployments, totaling 60,000 servers made up of hundreds of different hardware configurations accumulated over generations, presenting unique operational challenges and a variety of unforeseen corner cases. In this talk, we will share methods, tips, and tricks for dealing with large-scale kernel upgrades on heterogeneous platforms within tight timeframes, with 100% uptime and no service or data loss, through the Dirty COW use case (a privilege-escalation vulnerability found in the Linux kernel in late 2016).
We will dive deep into the three-phased approach that led to the eventual success of the program: pre-work, the kernel upgrade itself, and post-work/cleanup. We will share details on the automation tools, UIs, and reporting tools developed and used to achieve the stated objective of 800+ server upgrades per hour, track upgrade progress, validate and report data blocks, and recover quickly from bad blocks encountered. Throughout the talk, we will highlight the importance of process management, communicating with hundreds of customer teams to ensure they are on board and aware, and successful coordination tactics with SREs and Site Operations. We will also touch on some of the unique challenges we faced along the way, such as BIOS updates needed on over 20,000 hosts, and explain the rolling-upgrade support we added to HBase and Storm to avoid service disruption to low-latency customers during these upgrades.
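The wave-based rolling upgrade described above can be sketched as a scheduler that partitions hosts into batches so only a bounded slice of the fleet is ever out of service (host names and batch size are invented; the actual tooling at this scale is far richer):

```python
def waves(hosts, batch_size):
    """Split hosts into upgrade waves of at most batch_size each."""
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]

def upgrade(hosts, batch_size, do_upgrade):
    """Upgrade one wave at a time; hosts outside the wave keep serving."""
    upgraded = []
    for wave in waves(hosts, batch_size):
        for host in wave:
            do_upgrade(host)  # reboot into the new kernel, verify, rejoin
        upgraded.extend(wave)
    return upgraded

fleet = [f"host{i:03d}" for i in range(10)]
done = upgrade(fleet, 4, lambda h: None)
```

Choosing the batch size is the core trade-off: larger waves finish faster but remove more capacity (and more data replicas) at once.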
A5 oracle exadata-the game changer for online transaction processing data w... - Dr. Wilfred Lin (Ph.D.)
The document discusses Oracle Exadata and how it can transform online transaction processing, data warehousing, and database consolidation. It describes Exadata as a scale-out platform that integrates servers, storage, and networking optimized for Oracle Database. Exadata delivers extreme performance through special software that brings database intelligence to storage, flash, and networking. It is suitable for all database workloads including OLTP, data warehousing, and database clouds.
This white paper discusses various transition technologies that service providers need to support both IPv4 and IPv6 networks during the lengthy transition period to IPv6. It covers dual-stack networking, different types of network address translation (NAT44, NAT64, NAT444, etc.), and various tunneling methods like 6rd, DS-Lite, and IPv6 in MPLS. Dual-stack is preferred but requires maintaining both protocols. NAT extended the life of IPv4 by allowing private addressing but broke the end-to-end IP model. Transition technologies aim to provide a smooth path to the full deployment of IPv6 while still supporting legacy IPv4 devices and applications.
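As a concrete instance of these translation mechanisms, stateful NAT64 commonly synthesizes IPv6 addresses by embedding the IPv4 address in the well-known prefix 64:ff9b::/96 (RFC 6052). Python's ipaddress module makes the mapping easy to see:

```python
import ipaddress

WKP = ipaddress.IPv6Network("64:ff9b::/96")  # NAT64 well-known prefix

def synthesize(ipv4):
    """Embed an IPv4 address in the low 32 bits of the /96 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(WKP.network_address) | int(v4))

def extract(ipv6):
    """Recover the embedded IPv4 address from a /96-synthesized address."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF)

addr = synthesize("192.0.2.1")  # 64:ff9b::c000:201
back = extract(addr)            # 192.0.2.1
```

This is what a DNS64 resolver returns to IPv6-only clients for IPv4-only destinations, letting the NAT64 gateway recover the real IPv4 target from the packet's destination address.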
MOUG17 Keynote: Oracle OpenWorld Major Announcements - Monica Li
Midwest Oracle Users Group Training Day 2017 Presentation by Rich Niemiec, Chief Innovation Officer at Viscosity North America.
Catch up on OOW17's top announcements in this one-hour presentation.
Building a High-Performance Database with Scala, Akka, and Spark - Evan Chan
Here is my talk at Scala by the Bay 2016, Building a High-Performance Database with Scala, Akka, and Spark. Covers integration of Akka and Spark, when to use actors and futures, back pressure, reactive monitoring with Kamon, and more.
Java on Arm: Theory, Applications, and Workloads [DEV5048] - Aleksei Voitylov
This document discusses optimizing Java performance on Arm processors. It describes adding intrinsics and stubs to the HotSpot JVM compiler to generate optimized Arm assembly for key methods like String processing and math functions. Benchmark results show up to 78x speedups for microbenchmarks and improved performance on SPECjbb2015 from these changes. The goal is to improve the performance of typical enterprise Java workloads on Arm servers.
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi and Eri... - confluent
Do you know who is knocking on your network’s door? Have new regulations left you scratching your head about how to handle what is happening in your network? Network flow data helps answer many questions across a multitude of use cases, including network security, performance, capacity planning, routing, operational troubleshooting, and more. Today’s streaming data pipelines need tools that can scale to meet the demands of these service providers while continuing to provide responsive answers to difficult questions. In addition to stream processing, data needs to be stored in a redundant, operationally focused database to provide fast, reliable answers to critical questions. Kafka and Druid work together to create such a pipeline.
In this talk, Eric Graham and Rachel Pedreschi will discuss these pipelines and cover the following topics: network flow use cases and why this data is important; reference architectures from production systems at a major international bank; why Kafka, Druid, and other OSS tools suit network flows; and a demo of one such system.
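The kind of rollup a Druid ingestion spec performs on flow records, aggregating raw flows by dimension while summing metrics, can be sketched in a few lines (the field names are illustrative, not a real flow schema):

```python
from collections import defaultdict

def rollup(flows):
    """Aggregate raw flow records by (src, dst), summing bytes and packets."""
    totals = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for f in flows:
        key = (f["src"], f["dst"])
        totals[key]["bytes"] += f["bytes"]
        totals[key]["packets"] += f["packets"]
    return dict(totals)

flows = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "bytes": 1200, "packets": 3},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "bytes": 800,  "packets": 2},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "bytes": 400,  "packets": 1},
]
summary = rollup(flows)
```

In the real pipeline, Kafka carries the raw records and Druid does this aggregation at ingest time (plus time bucketing), which is what keeps queries over high-volume flow data responsive.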
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...confluent
Do you know who is knocking on your network’s door? Have new regulations left you scratching your head on how to handle what is happening in your network? Network flow data helps answer many questions across a multitude of use cases including network security, performance, capacity planning, routing, operational troubleshooting and more. Today’s modern day streaming data pipelines need to include tools that can scale to meet the demands of these service providers while continuing to provide responsive answers to difficult questions. In addition to stream processing, data needs to be stored in a redundant, operationally focused database to provide fast, reliable answers to critical questions. Together, Kafka and Druid work together to create such a pipeline.
In this talk Eric Graham and Rachel Pedreschi will discuss these pipelines and cover the following topics:
-Network flow use cases and why this data is important.
-Reference architectures from production systems at a major international Bank.
-Why Kafka and Druid and other OSS tools for Network Flows.
-A demo of one such system.
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedis Labs
This document summarizes RedisConf 2017, covering several topics:
1. Running Redis on Flash in a DBaaS model for improved performance and cost savings compared to other NoSQL databases.
2. Redis modules gaining momentum with over 50 created so far, and the importance of multi-threading for high performance. Useful modules highlighted include RediSearch, ReJSON, Redis-ML, and Redis-Graph.
3. Using Redis for IoT applications, with challenges around small edge devices and clusters, high throughput from thousands of devices, and varied functionality needs addressed through Redis modules.
Here are some useful GDB commands for debugging:
- break <function> - Set a breakpoint at a function
- break <file:line> - Set a breakpoint at a line in a file
- run - Start program execution
- next/n - Step over to next line, stepping over function calls
- step/s - Step into function calls
- finish - Step out of current function
- print/p <variable> - Print value of a variable
- backtrace/bt - Print the call stack
- info breakpoints/ib - List breakpoints
- delete <breakpoint#> - Delete a breakpoint
- layout src - Switch layout to source code view
- layout asm - Switch layout
MongoDB for Time Series Data Part 3: ShardingMongoDB
The document discusses sharding time series sensor data in MongoDB. It recommends modeling the application's read, write and storage patterns to determine the optimal sharding strategy. A good shard key has sufficient cardinality, distributes writes evenly and enables targeted reads. For time series data, a compound shard key of an arbitrary value and incrementing timestamp is suggested to balance hot spots and targeted queries. The document also covers configuring a sharded cluster and replica sets with tags to control data distribution.
This document discusses cloud data center network architectures and how to scale them using Arista switches. It describes the limitations of legacy data center designs and introduces the cloud networking model. The cloud networking model with Arista switches provides benefits like lower latency, no oversubscription between racks, and the ability to scale to hundreds of racks. The document then discusses how to scale the network using layer 2, layer 3, and VXLAN designs from thousands to over a million nodes. It provides examples of scaling the number of leaf and spine switches to achieve greater node counts in a non-blocking two-tier design.
This document provides an introduction to IPv6 including a discussion of IPv6 addresses, headers, autoconfiguration, DNS, and the transition from IPv4. It describes key aspects of IPv6 such as the 128-bit addresses, extension headers, stateless address autoconfiguration, neighbor discovery, and duplicate address detection. The document also discusses DNS records for IPv6, transition technologies like dual-stack and tunneling, and some security considerations for IPv6 deployment.
MySQL Cluster Scaling to a Billion QueriesBernd Ocklin
MySQL Cluster is a distributed database that provides extreme scalability, high availability, and real-time performance. It uses an auto-sharding and auto-replicating architecture to distribute data across multiple low-cost servers. Key benefits include scaling reads and writes, 99.999% availability through its shared-nothing design with no single point of failure, and real-time responsiveness. It supports both SQL and NoSQL interfaces to enable complex queries as well as high-performance key-value access.
This document discusses end-to-end processing of 3.7 million telemetry events per second using a lambda architecture at Symantec. It provides an overview of Symantec's security data lake infrastructure, the telemetry data processing architecture using Kafka, Storm and HBase, tuning targets for the infrastructure components, and performance benchmarks for Kafka, Storm and Hive.
The document discusses security issues with IPv6 and proposed mitigation techniques. It covers topics such as router advertisements, neighbor discovery protocol, and fragmentation. Specifically, it notes that router advertisements and neighbor solicitations are not authenticated by default, allowing for spoofing attacks. The document proposes several mitigation approaches including cryptographically generated addresses, router authorization, port access control lists, and host isolation to secure IPv6 networks.
The network layer routes packets between devices on a network through multiple hops. It must address scalability issues around representing addresses and routing packets as networks grow large. Routers connect multiple local area networks, which may use different link layer technologies. IP addresses use a hierarchical structure to improve routing scalability. Classless Inter-Domain Routing (CIDR) allows arbitrary allocation of addresses and subnets to minimize routing tables.
The document provides an overview of Apache Cassandra, including its key components, data replication, scalability, read/write operations, and tunable data consistency. It discusses how Cassandra is a distributed, decentralized database that provides high availability and horizontal scalability. The key components that enable these features are nodes, partitioners, snitches, gossip protocols, and the replication of data across multiple nodes.
Various Open Source Cryptographic Libraries are being used these days to implement the
general purpose cryptographic functions and to provide a secure communication channel over
the internet. These libraries, that implement SSL/TLS, have been targeted by various side
channel attacks in the past that result in leakage of sensitive information flowing over the
network. Side channel attacks rely on inadvertent leakage of information from devices
through observable attributes of online communication. Some of the common side channel
attacks discovered so far rely on packet arrival and departure times (Timing Attacks), power
usage and packet sizes. Our research explores novel side channel attack that relies on CPU
architecture and instruction sets. In this research, we explored such side channel vectors
against popular SSL/TLS implementations which were previously believed to be patched
against padding oracle attacks, like the POODLE attack. We were able to successfully extract
the plaintext bits in the information exchanged using the APIs of two popular SSL/TLS
libraries.
This document provides information about networking and Microsoft Certified Systems Engineer (MCSE) certification. It defines what a network is and discusses the benefits of networking such as sharing resources, software, and licenses. It also describes different types of networks including local area networks (LANs), metropolitan area networks (MANs), and wide area networks (WANs). Additionally, it discusses networking devices like hubs, switches, routers, and network interface cards. The document also covers topics such as network topologies, IP addressing, implementing TCP/IP, and Active Directory.
The document provides an overview of Redis, including its history, users, data model, commands, programming interfaces, architecture, data structures, persistence, replication approach, and performance benchmarks. Redis started in 2009 and has grown significantly in popularity as a fast in-memory database. It uses different data structures like hashes, lists, sets to store data efficiently and provides atomic operations to manipulate this data.
The document discusses data partitioning and distribution across multiple machines in a cluster. It explains that data replication does not scale well, but data partitioning, where each record exists on only one machine, allows write latency to scale with the number of machines in the cluster. Coherence provides a distributed cache that partitions data and offers functions for server-side processing near the data through tools like entry processors.
This document discusses sockets and their use in networking applications. It begins with an overview of sockets, including that they provide an endpoint for network connections and are identified by both an IP address and port number. It then covers socket details for TCP and UDP, such as how TCP provides reliable connections while UDP is connectionless. The document concludes with examples of TCP and UDP client-server code using common socket functions like bind(), listen(), accept(), connect(), send(), and recv().
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...shadow0702a
This document serves as a comprehensive step-by-step guide on how to effectively use PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It meticulously outlines several critical steps in the process, starting with the crucial task of enabling permissions, followed by the installation and configuration of WSL.
The guide then proceeds to explain how to set up the SSH service within the WSL environment, an integral part of the process. Alongside this, it also provides detailed instructions on how to modify the inbound rules of the Windows firewall to facilitate the process, ensuring that there are no connectivity issues that could potentially hinder the debugging process.
The document further emphasizes on the importance of checking the connection between the Windows and WSL environments, providing instructions on how to ensure that the connection is optimal and ready for remote debugging.
It also offers an in-depth guide on how to configure the WSL interpreter and files within the PyCharm environment. This is essential for ensuring that the debugging process is set up correctly and that the program can be run effectively within the WSL terminal.
Additionally, the document provides guidance on how to set up breakpoints for debugging, a fundamental aspect of the debugging process which allows the developer to stop the execution of their code at certain points and inspect their program at those stages.
Finally, the document concludes by providing a link to a reference blog. This blog offers additional information and guidance on configuring the remote Python interpreter in PyCharm, providing the reader with a well-rounded understanding of the process.
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Software Engineering and Project Management - Introduction, Modeling Concepts...Prakhyath Rai
Introduction, Modeling Concepts and Class Modeling: What is Object orientation? What is OO development? OO Themes; Evidence for usefulness of OO development; OO modeling history. Modeling
as Design technique: Modeling, abstraction, The Three models. Class Modeling: Object and Class Concept, Link and associations concepts, Generalization and Inheritance, A sample class model, Navigation of class models, and UML diagrams
Building the Analysis Models: Requirement Analysis, Analysis Model Approaches, Data modeling Concepts, Object Oriented Analysis, Scenario-Based Modeling, Flow-Oriented Modeling, class Based Modeling, Creating a Behavioral Model.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
4. In the 21st century we figured out ways to get around disk problems:
RAID, replication, Reed-Solomon coding, LDPC, and many others
*enterprise IBM hard drive, circa 1980:
1.7 or 3.4 GB capacity, price: 250,000 USD
5. What if it is “some master server”?
But what will happen when the server goes down?
6. What if the whole datacenter goes down?
Should you plan for this?
8. The probability of these events can be VERY small.
But what will happen to your business/systems if one of them occurs after all?
“Things always become obvious after the fact”
― Nassim Nicholas Taleb
9. Reasons for losses of servers, data centers, coherence
• Tornado, earthquake, flood
• Tech support applied a change to the wrong rack
• Errors made by NOCs
• A cat that got into the electrical transformer and burned together with the equipment
• Your virtual machine cluster got a really angry new neighbor
• The cloud provider suddenly went down (say hello, Amazon S3!)
• An excavator tearing an underground optical cable while digging a ditch
*all the above examples are from real life
10. You can fix anything… if you have enough time and money.
And if you have nothing else to do :)
11. Choosing the data storage system that is right for you.
You need to answer the following questions:
What is your record size: Bytes? KB? MB? GB? TB?
Do you need:
- transactions?
- replication?
- fastest access possible?
- query language?
- full-text search?
- CAP properties?
- scalability options?
…
12. To put it simply:
- Massively scalable - replica sets of DHTs
- Fault tolerant by design
- Fast - async I/O, caching, Eblob, bloom filters
- Ease of use: C, C++, Go, HTTP REST, WebDAV, (S3)
- One point of entry for the clients
13. Elliptics:
- a very fast, linearly scalable NoSQL (key/value) data storage
- based on DHT principles
- designed to store medium to large data records, from 1KB up to terabytes
Features:
- No transaction support, but a write to one replica is atomic
- CAP: Availability, Partition tolerance + Eventual consistency
- No metadata servers, true horizontal scaling
- Replication - geographically distributed replication
- Direct P2P data streaming (useful for large files)
- Access speed - true O(1) data read access + SLRU cache
- Automatic data repartitioning when storage nodes are added or removed
- Bulk writes
- Datacenter aware (cross-datacenter replication) and CDN
- and much more…
Open source (GPL), implemented in C/C++
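The SLRU cache mentioned in the feature list is a segmented LRU: new entries land in a probationary segment and are promoted to a protected segment only on a second hit, which keeps one-off scans from evicting hot keys. A minimal illustrative sketch of the idea (not Elliptics' actual cache code):

```python
from collections import OrderedDict

class SLRUCache:
    """Segmented LRU: new keys enter 'probation'; a second hit
    promotes them to 'protected', shielding hot keys from scans."""

    def __init__(self, probation_size, protected_size):
        self.probation = OrderedDict()
        self.protected = OrderedDict()
        self.probation_size = probation_size
        self.protected_size = protected_size

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)   # refresh LRU position
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)   # promote on second hit
            self._insert_protected(key, value)
            return value
        return None

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
            return
        self.probation[key] = value
        self.probation.move_to_end(key)
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)  # evict coldest probationary entry

    def _insert_protected(self, key, value):
        self.protected[key] = value
        if len(self.protected) > self.protected_size:
            # demote the oldest protected entry back to probation
            old_key, old_val = self.protected.popitem(last=False)
            self.put(old_key, old_val)
```

A scan of new keys can only churn the probationary segment, so keys that were read at least twice survive it.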
14. CAP theorem
Consistency - all clients see the same data at the same time
Availability - will always respond to a request, even if data is not completely consistent
Partition tolerance - works even in presence of node/network failures

CA - RDBMS: MySQL/MariaDB, Postgres, MSSQL, Oracle, Elastic Search, …
CP - HBase, MongoDB, Redis, Google Big Table, Ceph, …
AP - Elliptics, Cassandra, Riak, DynamoDB, CouchDB, …
26. To put it simply:
- Scalable - DHT
- Fault tolerant by design
- Fast - Eblob, async I/O, caching, bloom filters
- Simplicity of usage: C/C++/Go/HTTP REST
- One point of entry for the clients
27. Terminology:
1) Bucket - a set of replicas
2) Replica - one set of data (one DHT)
3) DHT - Distributed Hash Table
4) Hash ring - consistent hashing algorithm
5) Node - one of the nodes in the Elliptics network
28. 02048
Node 1
Hashring ranges
Node 2
Hashring ranges
Hash Ring
for simplification,
in reality 2^512
*this and following slides following is a simplification of
what’s actually happening
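The hash ring from this slide can be sketched in code. This illustrative Python version uses the slide's simplified 0..2048 ring (real Elliptics uses a 2^512 key space and its own hashing); the class and method names are mine, not an Elliptics API:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring. Each node owns several segment start
    points; a key belongs to the segment whose start point is the
    greatest one at or below the key's ring position (wrapping around)."""

    RING_SIZE = 2048  # as on the slide; real Elliptics uses a 2**512 space

    def __init__(self):
        self.points = []   # sorted segment start points
        self.owner = {}    # segment start point -> node name

    def add_node(self, name, segments):
        for p in segments:
            bisect.insort(self.points, p)
            self.owner[p] = name

    def position(self, key):
        # map a key onto the ring with a cryptographic hash
        digest = hashlib.sha512(key.encode()).digest()
        return int.from_bytes(digest, "big") % self.RING_SIZE

    def node_at(self, pos):
        # greatest segment start <= pos; index -1 wraps to the last segment
        i = bisect.bisect_right(self.points, pos) - 1
        return self.owner[self.points[i]]

    def node_for(self, key):
        return self.node_at(self.position(key))
```

With Node 1 owning segments 12, 90, 644 and Node 2 owning 44, 129, 1608 (the numbers used on the following slides), position 20 falls into Node 1's segment starting at 12, consistent with the hash("key1") == 20 example later in the deck.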
29. Start-up and DHT initialization
Node 1 routing table:
  IP addr | Hash ring segments
  Node 1  |
  Node 2  |
Node 2 routing table:
  IP addr | Hash ring segments
  Node 1  |
  Node 2  |
30. Start-up and DHT initialization
Node 1 routing table:
  IP addr | Hash ring segments
  Node 1  | 12, 90, 644
  Node 2  |
Node 2 routing table:
  IP addr | Hash ring segments
  Node 2  | 44, 129, 1608
  Node 1  |
31. Start-up and DHT initialization
Node 1 routing table:
  IP addr | Hash ring segments
  Node 1  | 12, 90, 644
  Node 2  | 44, 129, 1608
Node 2 routing table:
  IP addr | Hash ring segments
  Node 2  | 44, 129, 1608
  Node 1  | 12, 90, 644
42. To put it simply:
- Scalable - DHT
- Fault tolerant by design
- Fast - Eblob, async I/O, caching, bloom filters
- Simplicity of usage: C/C++/Go/HTTP REST
- One point of entry for the clients
43. Losing a node
Client
Node 1 routing table (before the loss):
  IP addr | Hash ring segments
  Node 1  | 12, 90, 644
  Node 2  | 44, 129, 1608
Node 2 routing table (after the loss, Node 1's segments are dropped):
  IP addr | Hash ring segments
  Node 1  |
  Node 2  | 44, 129, 1608
45. Writing data (with failed nodes)
Client: elliptics.write("key1", data1), hash("key1") == 20
Node 1 (owner of the segment covering position 20) is down, so the write is
redirected: node2.write("key1", data1); "key1" -> data is now stored on Node 2
Routing tables:
  Node 1 (failed) | 12, 90, 644
  Node 2          | 44, 129, 1608
46. Reading data (with failed nodes)
Client: elliptics.read("key1"), hash("key1") == 20
Node 1 is still down, so "key1" -> data is read from Node 2
Routing tables:
  Node 1 (failed) | 12, 90, 644
  Node 2          | 44, 129, 1608
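The failure handling on these slides amounts to the client walking the ring past a dead owner until it reaches an alive node. A hedged sketch of one such lookup policy (the function name and the backward-scan fallback are illustrative choices, not the actual Elliptics client logic):

```python
import bisect

def pick_alive_node(points, owner, alive, pos):
    """Return the owner of the segment covering `pos`; if that node is
    down, step to earlier segments around the ring (wrapping) until an
    alive node is found.

    points: sorted segment start points on the ring
    owner:  segment start point -> node name
    alive:  set of node names currently reachable
    """
    n = len(points)
    start = bisect.bisect_right(points, pos) - 1  # nearest start <= pos, wrapping
    for step in range(n):
        node = owner[points[(start - step) % n]]
        if node in alive:
            return node
    raise RuntimeError("no alive nodes in the ring")
```

With the slide's segment layout, a key at position 20 goes to Node 1 normally and falls back to Node 2 once Node 1 is marked dead, matching the write/read flow shown above.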
47. Reading data (with restored nodes)
Client: elliptics.read("key1"), hash("key1") == 20
Node 1 is back; both routing tables again list:
  Node 1 | 12, 90, 644
  Node 2 | 44, 129, 1608
The read now goes to the segment owner: node1.read("key1"),
while "key1" -> data still resides on Node 2
48. Merge - a special procedure that moves keys and data that do not
belong to the local node. Such keys are transferred to the nodes they
belong to, restoring consistency.
* Merge is FAST
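The merge procedure can be sketched as a scan that batches up the keys a node no longer owns (an illustrative sketch under my own names; the real merge streams eblob records between nodes rather than working on in-memory dicts):

```python
def merge(local_store, node_name, node_for):
    """Collect keys that do not belong to this node so they can be
    transferred to their owners, restoring consistency.

    local_store: dict of key -> data held by this node (mutated in place)
    node_name:   name of the local node
    node_for:    function key -> owning node name (the routing decision)
    Returns {owner_node: {key: data}} batches to send out.
    """
    outgoing = {}
    for key in list(local_store):          # snapshot keys; we pop while scanning
        owner = node_for(key)
        if owner != node_name:
            outgoing.setdefault(owner, {})[key] = local_store.pop(key)
    return outgoing
```

After the batches are delivered and acknowledged, every key again lives on the node its hash maps to, which is exactly the state the next slide shows.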
49. Merge
hash("key1") == 20, so "key1" -> data is moved back to its owner
Routing tables on both nodes:
  Node 1 | 12, 90, 644
  Node 2 | 44, 129, 1608
51. Elliptics backend — EBLOB
Eblob is an append-only, low-level I/O library that stores data in blob files.
Elliptics uses it as one of its low-level I/O backends.
Supported features:
- Fast append-only updates which do not require disk seeks
- Compact index to populate lookup information from disk
- Multi-threaded index reading during startup (gives you a fast storage start-up)
- O(1) data location lookup time (for in-memory indexes)
- Ability to lock in-memory lookup index (hash table) to eliminate memory swap
- Readahead games with data and index blobs for maximum performance
- Multiple blob files support (tested with single blob file on block device too)
- Optional sha512 on-disk checksumming
- Direct streaming from eblob to client, there’s an Nginx module for that
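The append-only design above can be sketched as follows: records are only ever appended to the blob, and an in-memory hash index maps each key to its (offset, size), which is what gives the O(1) location lookup. This is an illustrative toy, not the eblob file format or API:

```python
import io

class TinyBlob:
    """Append-only blob with an in-memory index: writes never seek
    backwards, reads are one index lookup plus one positioned read."""

    def __init__(self):
        self.blob = io.BytesIO()   # stands in for the on-disk blob file
        self.index = {}            # key -> (offset, size); O(1) lookup

    def write(self, key, data):
        offset = self.blob.seek(0, io.SEEK_END)  # always append at the end
        self.blob.write(data)
        self.index[key] = (offset, len(data))    # last write wins

    def read(self, key):
        offset, size = self.index[key]
        self.blob.seek(offset)
        return self.blob.read(size)
```

Note how an update simply appends a new copy and repoints the index, leaving the stale record behind; that is the garbage the defragmentation tool on the next slide exists to reclaim.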
52. Elliptics backend — EBLOB
Supported features:
- 2-stage write: prepare (which reserves the space) and commit (which calculates the
checksum and updates the in-memory and on-disk indexes). One can (re)write data using
pwrite() in between without locks
- Usual 1-stage write interface
- Flexible configuration of hash table size, flags, alignment
- Defragmentation tool: entries to be deleted are only marked as removed; eblob_check will
iterate over the specified blob files and actually remove those blocks
- Off-line blob consistency checker: eblob_check can verify checksums for all records
that have them enabled
- Run-time sync support: a dedicated thread runs fsync in the background on all files on a
timed basis
- Sorted data and indexes on disk: ideal for column creation, iteration, subkeys and range
requests
- In-memory index compression (up to 60%), ~64 bytes per key in RAM
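The 2-stage write above can be sketched like this: prepare reserves a region at the end of the blob, the caller fills it with positioned writes, and commit checksums the region and publishes it in the index. All names here are mine for illustration; the real eblob API is a C library:

```python
import hashlib
import io

class TwoStageBlob:
    """Toy prepare/commit write path: data becomes visible only after
    commit checksums it and publishes the index entry."""

    def __init__(self):
        self.blob = io.BytesIO()
        self.index = {}  # key -> (offset, size, checksum); committed entries only

    def prepare(self, key, size):
        """Reserve `size` bytes at the end of the blob; nothing visible yet."""
        offset = self.blob.seek(0, io.SEEK_END)
        self.blob.write(b"\x00" * size)
        return offset

    def pwrite(self, offset, data):
        """Fill the reserved region, like pwrite(): no locks needed because
        the region is not yet published."""
        self.blob.seek(offset)
        self.blob.write(data)

    def commit(self, key, offset, size):
        """Checksum the region and publish it in the in-memory index."""
        self.blob.seek(offset)
        data = self.blob.read(size)
        self.index[key] = (offset, size, hashlib.sha512(data).hexdigest())

    def read(self, key):
        offset, size, checksum = self.index[key]
        self.blob.seek(offset)
        data = self.blob.read(size)
        assert hashlib.sha512(data).hexdigest() == checksum  # verify on read
        return data
```

Because readers only ever see index entries created by commit, concurrent fills of a prepared region need no locking, which is the point the slide makes about pwrite() in between.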