Timely genome analysis requires a fresh approach to platform design for big data problems. Louisiana State University has tested enterprise cluster deployments of Redis with a unique solution that allows flash memory to act as extended RAM. Learn about how this solution allows large amounts of data to be handled with a fraction of the memory needed for a typical deployment.
Scaling Redis Cluster Deployments for Genome Analysis (featuring LSU) - Terry Leatherland, IBM
1. Revolutionizing the Datacenter
Join the Conversation #OpenPOWERSummit
Accelerating Genome Assembly with Power8
Seung-Jong Park, Ph.D.
School of EECS, CCT, Louisiana State University
2. Agenda
The Genome Assembly Problem
Accelerating Graph Construction with POWER8
Accelerating Graph Simplification with IBM CAPI® Flash and Redis NoSQL database
5/8/2016
7. Experimental Test Beds
System Type | IBM PKY Cluster | LSU SuperMikeII
Processor | Two 10-core IBM Power8 | Two 8-core Intel SandyBridge Xeon
Maximum #Nodes used in various experiments | 40 | 120
#Physical cores/node | 20 (8 Simultaneous Multi-Thread) | 16 (Hyper threading disabled)
#vcores/node | 160 | 16
RAM/node (GB) | 256 | 32
#Disks/node | 5 | 3
#Disks/node used for shuffled data | 3 | 1
Total Storage space/node used for shuffled data | 1.8 | 0.5
Network | 56Gbps InfiniBand (non-blocking) | 40Gbps InfiniBand (2:1 blocking)
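The per-node figures in the test bed table can be rolled up into cluster totals to see how different the two systems are. The sketch below is only arithmetic on the table values. The dictionary layout and function names are illustrative, and treating "Hyper threading disabled" as one thread per core is an assumption.

```python
# Values copied from the test bed table. "threads_per_core" is 8 for POWER8 SMT8
# and assumed to be 1 for the Xeon nodes (Hyper-Threading disabled).
TEST_BEDS = {
    "IBM PKY Cluster": {
        "max_nodes": 40, "physical_cores": 20, "threads_per_core": 8, "ram_gb": 256,
    },
    "LSU SuperMikeII": {
        "max_nodes": 120, "physical_cores": 16, "threads_per_core": 1, "ram_gb": 32,
    },
}


def vcores_per_node(bed):
    """Virtual cores per node = physical cores x hardware threads per core."""
    return bed["physical_cores"] * bed["threads_per_core"]


def cluster_totals(bed):
    """Aggregate vcores and RAM at the maximum node count used in the experiments."""
    return {
        "vcores": vcores_per_node(bed) * bed["max_nodes"],
        "ram_gb": bed["ram_gb"] * bed["max_nodes"],
    }


if __name__ == "__main__":
    for name, bed in TEST_BEDS.items():
        print(name, vcores_per_node(bed), cluster_totals(bed))
```

Under these assumptions the per-node vcore counts match the table (160 and 16).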
8. Datasets
Genome data set | Input size | Shuffle data size | Output size
Rice genome | 12GB | 70GB | 50GB
Bumble bee genome | 90GB | 600GB | 95GB
Metagenome | 3.2TB | 20TB | 8.6TB
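One way to read the dataset table is as a shuffle amplification factor, that is, how much intermediate data each input produces. The sketch below computes that ratio from the sizes on the slide. The names are illustrative, and 1 TB = 1000 GB is an assumption.

```python
# Sizes from the Datasets slide, in GB (assuming 1 TB = 1000 GB).
DATASETS = {
    "Rice genome": {"input": 12, "shuffle": 70, "output": 50},
    "Bumble bee genome": {"input": 90, "shuffle": 600, "output": 95},
    "Metagenome": {"input": 3200, "shuffle": 20000, "output": 8600},
}


def shuffle_amplification(sizes):
    """Ratio of shuffled (intermediate) data to input data."""
    return sizes["shuffle"] / sizes["input"]


def amplification_table(datasets=DATASETS):
    """Shuffle amplification per dataset, rounded to two decimals."""
    return {name: round(shuffle_amplification(s), 2) for name, s in datasets.items()}


if __name__ == "__main__":
    for name, ratio in amplification_table().items():
        print(f"{name}: shuffle data is {ratio}x the input size")
```

All three datasets shuffle roughly six times their input size, which is why disk and network capacity for shuffled data appear as separate rows in the test bed table.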
Input data set to stage 2: key-value stores with Redis NoSQL and IBM POWER8 CAPI Flash
10. Hadoop Scalability with POWER8 SMTs
Tested with small size rice genome data on 2 nodes
Almost linear scalability with increasing SMTs
11. Rice Genome
Analyzing small size (12GB) data
Eliminate the impact of network and disk I/O
7.5X performance improvement per server
12. Bumble Bee Genome
Analyzing Medium size (90GB) Bumble Bee genome
7.5x improvement in terms of Performance/server
13. Metagenome Stage 1
Analyzing huge (3.2TB) metagenome data
Only 6.5 hours on 40-node IBM Power8 cluster
More than 9x improvement in terms of performance per server
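For a sense of scale, the Stage 1 figures above translate into an input processing rate per cluster and per node. This is only arithmetic on the numbers stated on the slide. The function names are illustrative, and 1 TB = 1000 GB is an assumption.

```python
INPUT_GB = 3200    # 3.2TB metagenome input, assuming 1 TB = 1000 GB
WALL_HOURS = 6.5   # reported Stage 1 run time
NODES = 40         # IBM Power8 cluster size used for this run


def cluster_throughput_gb_per_hour(input_gb=INPUT_GB, hours=WALL_HOURS):
    """Input data processed per hour by the whole cluster."""
    return input_gb / hours


def per_node_throughput_gb_per_hour(input_gb=INPUT_GB, hours=WALL_HOURS, nodes=NODES):
    """Input data processed per hour by a single node, on average."""
    return cluster_throughput_gb_per_hour(input_gb, hours) / nodes


if __name__ == "__main__":
    print(f"cluster:  {cluster_throughput_gb_per_hour():.1f} GB/hour")
    print(f"per node: {per_node_throughput_gb_per_hour():.1f} GB/hour")
```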
14. IBM Data Engine for NoSQL
Performance and Value
Stage 2 requires large memory access that isn’t readily available via traditional compute processing.
15. POWER8 CAPI (Coherent Accelerator Processor Interface)
Diagram: POWER8 (CAPP, Coherence Bus) connected over PCIe Gen 3 to an FPGA or ASIC (PSL, Custom Hardware Application)
Customizable Hardware Application Accelerator
• Specific system SW, middleware, or user application
• Written to durable interface provided by PSL
PCIe Gen 3
• Transport for encapsulated messages
Processor Service Layer (PSL)
• Present robust, durable interfaces to applications
• Offload complexity / content from CAPP
Virtual Addressing
• Accelerator can work with same memory addresses that the processors use
• Pointers de-referenced same as the host application
• Removes OS & device driver overhead
Hardware Managed Cache Coherence
• Enables the accelerator to participate in “Locks” as a normal thread
• Lowers latency over IO communication model
16. Redis Labs Exploits the IBM Data Engine for NoSQL
Redis stores key-value pairs
• Key-value pairs may be variable size, in any format (Text, Document, JPEG, Video, etc.)
Basic operations are “SET” and “GET”
> SET 100001 "CAPI is Fast"
> GET 100001
"CAPI is Fast"
> ...
Database Characteristics
• 90 GB MAX Capacity, up to 10 GB RAM, and 80 GB Flash
• key-value pairs are 1,000 bytes of random data
• DB filled with ~50GB of data (42.5 million keys)
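The database figures above can be cross-checked with a little arithmetic: 42.5 million values of 1,000 bytes account for most, but not all, of the ~50GB. The sketch below estimates the remainder per key. Decimal gigabytes are an assumption, and since "~50GB" is approximate, the per-key overhead is only a rough figure, not a measured Redis number.

```python
KEYS = 42_500_000                  # keys loaded into the database
VALUE_BYTES = 1_000                # each value is 1,000 bytes of random data
DB_FILLED_BYTES = 50_000_000_000   # "~50GB", assuming decimal GB; approximate
MAX_CAPACITY_BYTES = 90_000_000_000


def value_payload_bytes(keys=KEYS, value_bytes=VALUE_BYTES):
    """Bytes taken by the values alone."""
    return keys * value_bytes


def implied_overhead_per_key(filled=DB_FILLED_BYTES, keys=KEYS, value_bytes=VALUE_BYTES):
    """Bytes per key not explained by the value (key name, metadata, rounding)."""
    return (filled - value_payload_bytes(keys, value_bytes)) / keys


def fill_fraction(filled=DB_FILLED_BYTES, capacity=MAX_CAPACITY_BYTES):
    """Share of the 90 GB maximum capacity in use."""
    return filled / capacity


if __name__ == "__main__":
    print(f"value payload: {value_payload_bytes() / 1e9:.1f} GB")
    print(f"implied overhead: {implied_overhead_per_key():.0f} bytes/key")
    print(f"capacity used: {fill_fraction():.0%}")
```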
Client Characteristics
• 288 clients, randomly issuing Redis GETs or SETs
• ~50% of keys from RAM, ~50% from CAPI-Accelerated Flash
Demo System:
• IBM Power System S812L
• 1 POWER8 Socket
• 2 IBM DataEngine for NoSQL CAPI Accelerators
• 1 FlashSystem 840
• Ubuntu 14.10
• Redis Labs Enterprise Cluster (Beta)
Demonstration Platform (POWER8 + CAPI Flash)
Diagram: WWW, 10Gb Uplinks, Power8 Server, Flash Array w/ up to 56TB; operations shown: Set Key = Value, Retrieve Key
Infrastructure Attributes
- up to 192 threads in 2U Server drawer
- up to 56 TB of memory based Flash per 2U Drawer
- Shared Memory & Cache for dynamic tuning
OpenPower Partner Redis Labs’s highly-differentiated product offering built on CAPI is available today.
17. IBM Data Engine for NoSQL + Redis Labs Value
Built on Open APIs
• Leverages IBM DataEngine for NoSQL APIs
Redis Labs Enterprise Cluster provides near Speed of RAM, with the Capacity of Flash
• Leverages IBM DataEngine for NoSQL CAPI Accelerator for high-speed, low-latency link to Flash
Controls use of Memory, Flash, and Cost!
• Hot Data Maintained in RAM
• Provides ISPs and MSPs up to 72% Cost Savings When 80% of Data is in Flash
Redis Labs Enterprise Cluster allows the user to select the ratio of RAM and flash with a simple slider, when using POWER8 with the IBM Data Engine for NoSQL.
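The idea of keeping hot data in RAM and the rest in flash, with a user-chosen ratio, can be illustrated with a toy two-tier store. This is a minimal sketch using a generic least-recently-used policy. It is not how Redis Labs Enterprise Cluster is implemented, and the class name, the `ram_ratio` parameter, and the plain dictionary standing in for flash are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredStore:
    """Toy key-value store: hot keys in a RAM tier, the rest in a 'flash' tier."""

    def __init__(self, capacity_items, ram_ratio):
        if not 0.0 < ram_ratio <= 1.0:
            raise ValueError("ram_ratio must be in (0, 1]")
        # Keep at least one RAM slot so every SET has somewhere to land.
        self.ram_slots = max(1, int(capacity_items * ram_ratio))
        self.ram = OrderedDict()  # ordered from least to most recently used
        self.flash = {}           # stand-in for the flash tier

    def _evict_cold(self):
        """Move least recently used entries to flash until RAM fits its budget."""
        while len(self.ram) > self.ram_slots:
            key, value = self.ram.popitem(last=False)
            self.flash[key] = value

    def set(self, key, value):
        self.flash.pop(key, None)  # drop any stale copy in flash
        self.ram[key] = value
        self.ram.move_to_end(key)  # mark as most recently used
        self._evict_cold()

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key]
        if key in self.flash:
            # Promote on access: the key is now considered hot.
            value = self.flash.pop(key)
            self.ram[key] = value
            self._evict_cold()
            return value
        return None


if __name__ == "__main__":
    store = TieredStore(capacity_items=10, ram_ratio=0.2)  # 2 RAM slots
    for i in range(5):
        store.set(100001 + i, f"value-{i}")
    print(store.get(100001), sorted(store.ram), sorted(store.flash))
```

Lowering `ram_ratio` in this sketch plays the role of moving the slider toward flash: fewer keys stay in the fast tier, and more reads pay the cost of the slower one.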
18. What CAPI Means for NoSQL Solutions
Today’s NoSQL in memory (x86)
Diagram: WWW, 10Gb Uplink, Load Balancer, five 500GB Cache Nodes, Backup Nodes
Infrastructure Requirements
- Large distributed (Scale out)
- Large memory per node
- Networking bandwidth needs
- Load balancing
Differentiated NoSQL (POWER8 + FlashSystem with CAPI)
Diagram: WWW, 10Gb Uplink, POWER8 Server, Flash Array w/ up to 56TB
Infrastructure Attributes
- 192 threads in 4U server drawer
- 56 TB of flash per 2U drawer
- Shared Memory & cache for dynamic tuning
- Elimination of I/O and network overhead
- Cluster solution in a box
Power CAPI-attached FlashSystem for NoSQL regains infrastructure control and reins in the cost to deliver services.
19. Big Redis w/ CAPI Flash Offers New Performance / Cost Points
Users pick the performance / cost point that meets their solution needs, be it IOPs Rate or Latency requirements.
Chart (*typical workload): Average Latency (ms), IOPS at 1 ms Latency, and IOPS at Max Throughput by DRAM / FLASH Ratio
DRAM / FLASH Ratio | 100% | 80% | 50% | 20% | 10%
% Implementation Savings | 0% | 18% | 45% | 72% | 81%
Average Latency (ms) axis: 1, 5, 8, 9, 10
IOPS at 1 ms Latency: 382K, 208K, 188K, 175K
Other IOPS values on the chart: 2.5M, 366-750K, 1.35M, 483-950K, 671-1250K
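The savings percentages on this slide are consistent with a simple linear cost model in which flash costs about one tenth as much per GB as DRAM. That 0.10 ratio is inferred from the slide's numbers and is not stated in the deck, so the sketch below is a reading of the figures, not the vendor's pricing model. All names are illustrative.

```python
# Assumption: flash costs FLASH_COST_RATIO times as much per GB as DRAM.
# The value 0.10 is inferred from the slide's savings figures, not stated in the deck.
FLASH_COST_RATIO = 0.10

# (DRAM fraction of the data set) -> (implementation savings shown on the slide)
SLIDE_POINTS = {1.00: 0.00, 0.80: 0.18, 0.50: 0.45, 0.20: 0.72, 0.10: 0.81}


def implementation_savings(dram_fraction, flash_cost_ratio=FLASH_COST_RATIO):
    """Savings relative to an all-DRAM deployment under a linear per-GB cost model."""
    if not 0.0 <= dram_fraction <= 1.0:
        raise ValueError("dram_fraction must be between 0 and 1")
    blended_cost = dram_fraction + (1.0 - dram_fraction) * flash_cost_ratio
    return 1.0 - blended_cost


if __name__ == "__main__":
    for dram, slide_value in SLIDE_POINTS.items():
        model = implementation_savings(dram)
        print(f"DRAM {dram:.0%}: model {model:.0%}, slide {slide_value:.0%}")
```

With this ratio the model matches all five points on the slide, including the 72% savings quoted for 80% of data in flash.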