The IARPA Machine Intelligence from Cortical Networks (MICrONS) program is a research endeavor created to improve neurally-plausible machine-learning algorithms by understanding data representations and learning rules used by the brain through structurally and functionally interrogating a cubic millimeter of mammalian neocortex. This effort requires efficiently storing, visualizing, and processing petabytes of neuroimaging data. The Johns Hopkins University Applied Physics Laboratory (APL) has developed an open-source, highly available service to manage these data, called the Boss. The Boss uses AWS to provide a cloud-native spatial database with an innovative storage hierarchy and auto-scaling capability to balance cost and performance. This system extensively uses serverless components to meet both scalability and cost requirements. In this session, we provide an overview of the Boss, and we focus on how the APL used Amazon DynamoDB, AWS Lambda, and AWS Step Functions for several high-throughput components of the system. We discuss both the challenges and successes with serverless technologies.
Evolution of Netflix's cloud security strategy. Includes cloud-based key management and hybrid security controls that span traditional datacenter and public cloud.
Healthcare systems around the world are looking to Precision Medicine -- care decisions tailored for the individual patient -- as a means to drive better care outcomes at lower cost. Today, the most promising technology that has made this possible in certain diseases like cancer is sequencing a patient's genome. For infectious diseases, sequencing has revolutionized our understanding of outbreaks and how they spread. Genome sequencing has progressed significantly in the past decade, improving throughput and lowering costs by 100X or more. It is a data- and compute-intensive endeavor, which most biomedical research and care delivery networks are not equipped to handle. This session features Dr. Swaine Chen from the Genome Institute of Singapore, and the Broad Institute Cromwell team, discussing the challenges of dealing with the scale of genomic data and how they solved them to deliver results.
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write... - Adrian Cockcroft
Presentation given in October 2011 at the High Performance Transaction Systems Workshop http://hpts.ws - describes how Netflix used AWS to run a set of highly scalable Cassandra benchmarks on hundreds of instances in only a few hours.
A presentation on the Netflix Cloud Architecture and NetflixOSS open source. For the All Things Open 2015 conference in Raleigh 2015/10/19. #ATO2015 #NetflixOSS
(ISM301) Engineering Netflix Global Operations In The Cloud - Amazon Web Services
Operating a massively scalable, constantly changing, distributed global service is a daunting task. We innovate at breakneck speed to attract new customers and stay ahead of the competition. This means more features, more experiments, more deployments, more engineers making changes in production environments, and ever-increasing complexity. Simultaneously improving service availability and accelerating rate of change seems impossible on the surface. At Netflix, operations engineering is both a technical and organizational construct designed to accomplish just that by integrating disciplines like continuous delivery, fault injection, regional traffic management, crisis response, best practice automation, and real-time analytics. In this talk, designed for technical leaders seeking a path to operational excellence, we'll explore these disciplines in depth and how they integrate and create competitive advantages.
Slide deck for a presentation at OSCON 2011 about why Netflix uses web technology for TV user interfaces and how we maximize performance for a broad range of devices.
Moonbot Studios Shoots for the Cloud to Meet Deadlines and Manage Costs
Threatened by deadlines for Academy Award submissions, Moonbot Studios faced a shortage of rendering capacity while working on Taking Flight, its newest animated short film, and other important projects. As a small studio with a matching budget, the team did what it does best: it got creative and solved the problem with what they first called “magic.”
In this webinar, the Moonbot team tells the tale of moving its rendering to Google Compute Engine and how it defied networking odds by caching data close to the animators with an Avere vFXT. Hear Moonbot’s pipeline supervisor explain how the team turned cloud data center distance into a non-issue, met deadlines, and gained quantitative benefits that sparked energy in this small team of creative aviators.
In this session, you will learn:
• What drove Moonbot Studios to move to the cloud
• How they moved complex renders to Google Compute Engine, overcoming data-access roadblocks
• Measurable results, including speed, economics, flexibility, and creative freedom
Moonbot Studios' flight to the cloud will be supported by Google Cloud Platform and Avere Systems for a complete overview of how the technologies help bring new ideas to life.
Who Needs Network Management in a Cloud Native Environment? - Eshed Gal-Or
(This talk was presented at OSS NA 2017 in Los Angeles.)
Network management (and virtual network in particular) is hard.
Cloud app developers find themselves dealing with too many options and too many settings that make no sense to them.
This is because Cloud APIs evolved from legacy IT management.
Cloud-Native apps are revolutionizing how software is developed and deployed.
Why do app developers need to deal with those legacy network knobs and gauges?
Why do we even need to care about IP addresses, routers, or load balancers, in a cloud-native world?
In this presentation, we will explore an alternative approach and how we could implement it *today* with K8S and Dragonflow (an open-source virtual network management project) to provide a more stable, better-performing, and truly scalable cloud-native infrastructure.
Siddhi: A Second Look at Complex Event Processing Implementations - Srinath Perera
Today, vast amounts of data are available from sources like sensors (RFID, Near Field Communication), web activity, transactions, social networks, etc. Making sense of this avalanche of data requires efficient and fast processing.
Processing high volumes of events to derive higher-level information is a vital part of making critical decisions, and Complex Event Processing (CEP) has become one of the most rapidly emerging fields in data processing. e-Science use cases, business applications, financial trading applications, operational analytics applications, and business activity monitoring applications are some use cases that directly employ CEP. This paper discusses different design decisions associated with CEP engines and proposes some approaches to improve CEP performance by using more stream-processing-style pipelines. Furthermore, the paper discusses Siddhi, a CEP engine that implements those suggestions. We present a performance study showing that the resulting CEP engine, Siddhi, has significantly improved performance. The primary contributions of this paper are a critical analysis of CEP engine design, the identification of suggestions for improvement, the implementation of those improvements in Siddhi, and the demonstration of the soundness of those suggestions through empirical evidence.
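The stream-processing-style pipeline the abstract advocates can be illustrated with a rough sketch (hypothetical code, not Siddhi's actual API): a sliding-window aggregate over an event stream, the kind of query a CEP engine compiles into a pipeline of operators.

```python
from collections import deque

def sliding_average(events, window_size, threshold):
    """Emit an alert whenever the average of the last `window_size`
    readings exceeds `threshold` -- a classic CEP-style window query."""
    window = deque(maxlen=window_size)  # the sliding window operator
    alerts = []
    for ts, value in events:
        window.append(value)
        if len(window) == window_size:
            avg = sum(window) / window_size
            if avg > threshold:  # the filter/alert operator
                alerts.append((ts, avg))
    return alerts

# Toy event stream of (timestamp, sensor reading) pairs
events = [(1, 10), (2, 20), (3, 40), (4, 50), (5, 5)]
print(sliding_average(events, window_size=3, threshold=25))
```

A real CEP engine expresses the same logic declaratively and optimizes the operator pipeline, but the window-then-filter shape is the same.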
Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind - Avere Systems
While cloud computing offers virtually unlimited capacity, harnessing that capacity in an efficient, cost effective fashion can be cumbersome and difficult at the workload level. At the organizational level, it can quickly become chaos.
You must make choices around cloud deployment, and these choices can have a long-lasting impact on your organization. It is important to understand your options and avoid incomplete, complicated, or locked-in scenarios. Data management and placement challenges make the ability to automate workflows and processes across multiple clouds a requirement.
In this webinar, you will:
• Learn how to leverage cloud services as part of an overall computation approach
• Understand data management in a cloud-based world
• Hear what options you have to orchestrate HPC in the cloud
• Learn how cloud orchestration works to automate and align computing with specific goals and objectives
• See an example of an orchestrated HPC workload using on-premises data
From computational research to financial back testing, and research simulations to IoT processing frameworks, decisions made now will not only impact future manageability, but also your sanity.
(CMP202) Engineering Simulation and Analysis in the Cloud - Amazon Web Services
Building great products, ones that are aesthetically appealing as well as functionally sound, requires cutting-edge design and engineering. Given the high cost of testing physical prototypes, engineering organizations are turning to simulation and analysis using digital models, but compute requirements for these have traditionally required expensive on-premises infrastructure. Now, engineering organizations can use high-performance computing services from AWS and solutions from AWS technology partners to innovate at scale globally, with no up-front capital infrastructure investment.
In this session, AWS Partner Ansys shares how they help customers of all sizes design and engineer better products through digital simulation and analysis using HPC on AWS.
Building a Just-in-Time Application Stack for Analysts - Avere Systems
Slide presentation from Webinar on February 17, 2016.
People in analytical roles are demanding more and more compute and storage to get their jobs done. Instead of building out infrastructure for a few employees or a department, systems engineers and IT managers can find value in creating a compute stack in the cloud to meet the fluctuating demand of their clients.
In this 45-minute webinar, you’ll learn:
- How to identify the right analytical workloads
- How to create a scalable compute environment using the cloud for analysts in under 10 minutes
- How to best manage costs associated with the cloud compute stack
- How to create dedicated client stacks with their own scratch space as well as general access to reference data
Health systems departments, research & development departments, and business analyst groups all face silos of these challenging, compute-intensive use cases. By learning how to quickly build this flexible workflow that can be scaled up and down (or off) instantly, you can support business objectives while efficiently managing costs.
Best Practices for Genomic and Bioinformatics Analysis Pipelines on AWS - Amazon Web Services
AWS is a great fit for both steady-state and episodic computational workloads. Here we present some common architecture patterns for analyzing genomic and other biomedical data on scalable, high-throughput computational clusters on AWS. This talk covers bootstrapping a traditional Beowulf compute cluster on Amazon EC2, data transfer, and storage strategies for Amazon S3.
Building Reliable Data Lakes at Scale with Delta Lake - Databricks
Most data practitioners grapple with data reliability issues—it’s the bane of their existence. Data engineers, in particular, strive to design, deploy, and serve reliable data in a performant manner so that their organizations can make the most of their valuable corporate data assets.
Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Built on open standards, Delta Lake employs co-designed compute and storage and is compatible with Spark APIs. It delivers high data reliability and query performance to support big data use cases, from batch and streaming ingest and fast interactive queries to machine learning. In this tutorial we will discuss the requirements of modern data engineering, the challenges data engineers face when it comes to data reliability and performance, and how Delta Lake can help. Through presentation, code examples, and notebooks, we will explain these challenges and the use of Delta Lake to address them. You will walk away with an understanding of how you can apply this innovation to your data architecture and the benefits you can gain.
This tutorial will be both an instructor-led and a hands-on interactive session. Instructions on how to get the tutorial materials will be covered in class.
What you’ll learn:
Understand the key data reliability challenges
How Delta Lake brings reliability to data lakes at scale
Understand how Delta Lake fits within an Apache Spark™ environment
How to use Delta Lake to realize data reliability improvements
Prerequisites
A fully-charged laptop (8-16GB memory) with Chrome or Firefox
Pre-register for Databricks Community Edition
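The ACID guarantee the tutorial centers on rests on an ordered transaction log. As a toy sketch of that idea in pure Python (a deliberately simplified illustration, nothing like Delta Lake's real implementation), each commit is a numbered JSON file published atomically, and readers reconstruct the table by replaying the log:

```python
import json
import os
import tempfile

class ToyTableLog:
    """Toy transaction log: each commit is a numbered JSON file;
    the table state is whatever replaying the commits yields."""

    def __init__(self, path):
        self.path = path
        os.makedirs(path, exist_ok=True)

    def commit(self, rows):
        version = len(os.listdir(self.path))
        target = os.path.join(self.path, f"{version:020d}.json")
        tmp = target + ".tmp"
        with open(tmp, "w") as f:
            json.dump(rows, f)
        os.rename(tmp, target)  # atomic publish: readers never see a partial commit

    def snapshot(self):
        rows = []
        for name in sorted(os.listdir(self.path)):  # replay commits in order
            if name.endswith(".json"):
                with open(os.path.join(self.path, name)) as f:
                    rows.extend(json.load(f))
        return rows

log = ToyTableLog(tempfile.mkdtemp())
log.commit([{"id": 1}])
log.commit([{"id": 2}])
print(log.snapshot())  # prints [{'id': 1}, {'id': 2}]
```

The atomic rename is the crux: a commit either appears in the log completely or not at all, which is what makes concurrent readers see consistent snapshots.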
Those who out-compute can often out-compete. The cloud gives you access to a massive amount of compute power when you need it. This talk presents an introduction to HPC in the cloud, including the benefits of HPC in the cloud, how to get started, some tools to use, and how you can manage data. We will showcase several examples of HPC in the cloud from a number of public sector and commercial customers.
Created by: Dr. Jeff Layton, Principal, Solutions Architect
Netflix designed a massive-scale, cloud-based media transcoding system from scratch for processing professionally produced studio content. We bucked the common industry trend of vertical scaling and instead designed a horizontally scaled, elastic system using AWS to meet the unique scale and time constraints of our business. Come hear how we designed this system, how it continues to get less expensive for Netflix, and how AWS represents a transformative opportunity in the wider media-owning industry.
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21... - Amazon Web Services
Amazon EC2 now offers a new GPU instance capable of running graphics and GPU compute workloads. In this session, we take a deeper look at the remote graphics capabilities of this new GPU instance, the tooling required to get started, and a live demo of applications streamed from our West Coast regions. We also explore the benefits of hosting your 3D graphics applications in the AWS cloud, where you can harness the vast compute and storage resources.
QCon London Presentation - 3/8/16
Abstract:
On December 24th, 2012, AWS US-EAST-1 experienced a region-wide failure that took down the Netflix service for almost 24 hours. Knowing that failure is inevitable in any complex system, we evolved our cloud-based microservice architecture to support multi-region traffic management and failover capabilities. With that foundation in place, we drove initiatives to achieve service ubiquity and rapid global expansion. The overarching theme is #NetflixEverywhere: an amazing, global, highly available movie and TV streaming experience for any member, anytime, on any device, anywhere in the world.
Building and evolving a pervasive, global service requires a multidisciplinary approach that balances requirements around service availability, latency, data replication, compute capacity, and efficiency. In this session, we’ll follow the Netflix journey of failure, innovation, and ubiquity. We'll review the many facets of globalization, then delve deep into the architectural patterns that enable seamless multi-region traffic management, reliable, fast data propagation, and efficient service infrastructure. The patterns presented are broadly applicable to internet services with global aspirations.
A Petascale Database for Large-Scale Neuroscience Powered by Serverless Advan... - Amazon Web Services
The IARPA Machine Intelligence from Cortical Networks (MICrONS) program is a research endeavor that seeks to improve neurally-plausible machine learning algorithms by developing an understanding of the data representations and learning rules employed by the brain through structurally and functionally interrogating a cubic millimeter of mammalian neocortex. This effort requires the efficient storage, visualization, and processing of petabytes of neuroimaging data. The Johns Hopkins University Applied Physics Laboratory has developed an open-source, highly-available service to manage these data called the Boss. The Boss leverages Amazon Web Services to provide a cloud-native spatial database with an innovative storage hierarchy and auto-scaling capability to balance cost and performance. The system leverages serverless components extensively to meet both scalability and cost requirements. In this session we will provide an overview of the Boss, and focus on how JHU/APL leveraged DynamoDB, Lambda, and Step Functions for several high-throughput components of the system. We'll discuss both the challenges faced and successes achieved with serverless technologies. Learn More: https://aws.amazon.com/government-education/
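Cloud-native spatial databases like the one the abstract describes often index volumetric chunks with a space-filling curve. As a hypothetical sketch (not the Boss's actual key scheme), a Morton (Z-order) key interleaves the bits of a chunk's coordinates so that spatially adjacent cuboids sort near each other in a key-value store such as DynamoDB:

```python
def morton_key(x, y, z, bits=21):
    """Interleave the bits of three chunk coordinates into a single
    Z-order key, so spatially adjacent chunks get nearby keys."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)       # x bits land at positions 0, 3, 6, ...
        key |= ((y >> i) & 1) << (3 * i + 1)   # y bits at positions 1, 4, 7, ...
        key |= ((z >> i) & 1) << (3 * i + 2)   # z bits at positions 2, 5, 8, ...
    return key

# Neighboring chunks along each axis differ only in low-order key bits
print(morton_key(3, 1, 0))  # prints 11 (binary 1011)
```

Keys like this make range scans over a spatial neighborhood cheap, which matters when a cutout request must fetch many adjacent cuboids at once.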
Why Scale Matters and How the Cloud is Really Different (at scale) - Amazon Web Services
Cloud computing gives you a number of advantages, such as being able to scale your application on demand. As a new business looking to use the cloud, you inevitably ask yourself, "Where do I start?" Join us in this session to understand best practices for scaling your resources from zero to millions of users. We will show you how to best combine different AWS services, make smarter decisions for architecting your application, and best practices for scaling your infrastructure in the cloud.
Presenter:
Santanu Dutt, Solution Architect, Amazon Internet Services
Vinayak Hegde, Vice President – Engineering, Helpshift
Sunny Saxena, Product Lead, Sprinklr
SRV318 - Research at PNNL: Powered by AWS (Serverless Breakout Session, AWS re:Invent 2017, 11/28/2017 1:00 PM, Tue; presenter: Giardinelli). Pacific Northwest National Laboratory's rich data sciences capability has produced novel solutions in numerous research areas including image analysis, statistical modeling, and social media (and many more!). See how PNNL software engineers utilize AWS to enable better collaboration between researchers and engineers, and to power the data processing systems required to facilitate this work, with a focus on Lambda, EC2, S3, Apache Nifi and other technologies. Several approaches will be covered including lessons learned.
AWS Cloud Kata 2013 | Singapore - Getting to Scale on AWS - Amazon Web Services
This session will focus on how to get from 'Minimum Viable Product' (MVP) to scale. It will also explain how to deal with unpredictable demand and how to build a scalable business. Attend this session to learn how to:
Scale web servers and app services with Elastic Load Balancing and Auto Scaling on Amazon EC2
Scale your storage on Amazon S3 and S3 Reduced Redundancy Storage
Scale your database with Amazon DynamoDB, Amazon RDS, and Amazon ElastiCache
Scale your customer base by reaching customers globally in minutes with Amazon CloudFront
When you're handling big data in the modern world, you reach a point where a "one size fits all" approach no longer works. However, to get the results you want, you also don't have to spend big money on fire-breathing hardware or expensive software. AWS offers a broad array of open and commercial database choices, from do-it-yourself to fully managed services that handle scaling, and gives you powerful tools to choose the right architecture. You could choose from MySQL, RDS, Oracle, SQL Server, MongoDB, DynamoDB, Cassandra, ElastiCache, Redis, and SimpleDB, and our customers use them for different use cases. Each has different strengths, and this session highlights when you would want to choose each, with examples of how we use each to solve our big data challenges and why we made those decisions. We profile some of the choices available to you - MySQL, RDS, ElastiCache, Redis, Cassandra, MongoDB, and DynamoDB - and three customer case studies on RDS, ElastiCache, and DynamoDB.
Microservices and serverless for MegaStartups - DLD TLV 2017 - Boaz Ziniman
Microservices and Serverless computing allow you to build and run simpler and more efficient applications, while improving your agility and saving a lot of money.
The ability to deploy your applications without provisioning or managing servers opens up new opportunities for startups to build web, mobile, and IoT backends; run stream processing or big data workloads; run chatbots; and more, without investing in hardware or the professional manpower to run that hardware.
In this session, we will learn how to get started with Microservices and Serverless computing with AWS Lambda, which lets you run code without provisioning or managing servers.
This is a must-read for all engineers interested in developing a microservices architecture. Turn your monolithic server into a prolific, multi-instance solution! Includes a well-known example: Netflix. Please contact me for more details.
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi... - Cisco DevNet
Data gravity is a reality when dealing with massive amounts of data in globally distributed systems. Processing this data requires distributed analytics processing across InterCloud. In this presentation we will share our real-world experience with storing, routing, and processing big data workloads on Cisco Cloud Services and Amazon Web Services clouds.
AWS Summit 2013 | India - Web, Mobile and Social Apps on AWS, Kingsley Wood - Amazon Web Services
Build your next-generation, internet-scale, applications with low upfront costs using on demand access to web and application servers with AWS. Start small and grow to any scale with automated scaling. Stop reinventing the wheel, offload the undifferentiated heavy-lifting, and accelerate time to market using scalable storage, databases, content delivery, cache, search and other application services that make it easier to build and run apps that deliver a great customer experience.
Estimating the Total Costs of Your Cloud Analytics Platform - DATAVERSITY
Organizations today need a broad set of enterprise data cloud services with key data functionality to modernize applications and utilize machine learning. They need a platform designed to address multi-faceted needs by offering multi-function Data Management and analytics to solve the enterprise’s most pressing data and analytic challenges in a streamlined fashion. They need a worry-free experience with the architecture and its components.
For the Computer Measurement Group workshop in San Diego, November 2013. Also presented to a student class at UC Santa Barbara. Covers: what is Cloud Native; capacity and performance benchmarks; cost optimization techniques - content co-developed with Jinesh Varia of AWS.
How to build forecasting services using ML and deep learn... - Amazon Web Services
Forecasting is an important process for many companies and is used in many areas to accurately predict the growth and distribution of a product, the resources required on production lines, financial projections, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we will show how to pre-process data containing a temporal component and then apply an algorithm that, based on the type of data analyzed, produces an accurate forecast.
Big Data for Startups: how to build serverless Big Data applications - Amazon Web Services
The variety and volume of data created every day is accelerating ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale Big Data clusters looks like an investment accessible only to established companies. But the elasticity of the cloud and, in particular, serverless services let us break through these limits.
Let's see how it is possible to develop Big Data applications quickly, without worrying about infrastructure, dedicating all our resources to developing our ideas and creating innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we will present the main features of the service and how to deploy your application in a few steps.
Twenty years ago, Amazon went through a radical transformation aimed at increasing its pace of innovation. Over that period we learned how changing our approach to application development dramatically increased our agility and release velocity and, ultimately, let us build more reliable and scalable applications. In this session we will explain how we define modern applications and how building modern apps affects not only application architecture, but also organizational structure, development release pipelines, and even the operating model. We will also describe common approaches to modernization, including the approach used by Amazon.com itself.
How to spend up to 90% less with containers and Spot Instances - Amazon Web Services
The use of containers keeps growing.
When properly designed, container-based applications are very often stateless and flexible.
AWS ECS, EKS, and Kubernetes on EC2 can take advantage of Spot Instances, leading to average savings of 70% compared to On-Demand Instances. In this session we will explore the characteristics of Spot Instances and how easily they can be used on AWS. We will also learn how Spreaker uses Spot Instances to run different types of applications, in production, at a fraction of the on-demand cost!
In recent months, many customers have been asking us the question – how to monetise Open APIs, simplify Fintech integrations and accelerate adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to Open Finance marketplace presentation on October 20th.
Event Agenda :
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Make your startup's offering unique in the market with Machine Lea... services - Amazon Web Services
To create value and build a differentiated, recognizable offering, successful startups know how to combine established technologies with innovative, purpose-built components.
AWS provides ready-to-use services and, at the same time, lets you customize and create the differentiating elements of your offering.
Focusing on Machine Learning technologies, we will see how to select the artificial intelligence services offered by AWS and, also through a demo, how to build custom Machine Learning models using SageMaker Studio.
OpsWorks Configuration Management: automate the management and deployment of... - Amazon Web Services
With the traditional approach to IT, implementing DevOps techniques was difficult for many years; they often involved manual activities, occasionally leading to application downtime that interrupted users' work. With the advent of the cloud, DevOps techniques are now within everyone's reach, at low cost, for any kind of workload, guaranteeing greater system reliability and significant improvements to business continuity.
AWS provides AWS OpsWorks as a Configuration Management tool that automates and simplifies the management and deployment of EC2 instances using Chef and Puppet.
Learn how to use AWS OpsWorks to guarantee the reliability of your application running on EC2 instances.
Microsoft Active Directory on AWS to support your Windows Workloads - Amazon Web Services
Want to know your options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support group policy management, authentication, and authorization. In this session, we will discuss options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and deploying Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment into the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to detecting fraud or manufacturing defects, image and video analysis powered by artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we will explore the possibilities offered by AWS services for applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are hosting a free virtual event next Wednesday, October 14th, from 12:00 to 13:00, dedicated to VMware Cloud™ on AWS, the on-demand service that lets you run applications in VMware vSphere®-based cloud environments and access a wide range of AWS services, fully exploiting the potential of the AWS cloud while protecting existing VMware investments.
Build your first serverless ledger-based app with QLDB and NodeJS - Amazon Web Services
Many companies today build applications with ledger-like functionality, for example to verify the history of credits and debits in banking transactions, or to track the flow of their products through the supply chain.
At the heart of these solutions are ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB eliminates the need to build complex custom systems by providing a fully managed, serverless ledger database.
In this session we will learn how to build a complete serverless application that uses QLDB's capabilities.
With the rise of microservices architectures and rich mobile and web applications, APIs are more important than ever for delivering a great user experience. In this session we will learn how to tackle modern API design challenges with GraphQL, an open-source API query language used by Facebook, Amazon, and others, and how to use AWS AppSync, a managed serverless GraphQL service on AWS. We will dive into several scenarios, understanding how AppSync can help solve these use cases by building modern APIs with real-time and offline data update capabilities.
We will also learn how Sky Italia uses AWS AppSync to deliver real-time sports updates to users of its web portal.
Oracle Databases and VMware Cloud™ on AWS: myths debunked - Amazon Web Services
Many organizations are reaping the benefits of the cloud by migrating their Oracle workloads and securing significant gains in agility and cost efficiency.
Migrating these workloads can create complexity during application modernization and refactoring, along with performance risks that can be introduced when moving applications out of on-premises data centers.
In these slides, AWS and VMware experts present simple, practical tips to facilitate and simplify the migration of Oracle workloads while accelerating the transformation to the cloud; they dive into the architecture and demonstrate how to take full advantage of VMware Cloud™ on AWS.
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies managing Docker containers through an orchestration layer controlling deployment and lifecycle. In this session we will present the main features of the service, reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.
4. Serverless characteristics
• No servers to provision or manage
• Scales with usage
• Never pay for idle
• Availability and fault tolerance built in
5. Serverless Computing - AWS Lambda
Run code without provisioning or managing servers - pay only for the compute time you consume.
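As a concrete reference point, here is a minimal Lambda handler sketch in Python (the event shape and function name are illustrative, not from the talk):

```python
# Minimal AWS Lambda handler sketch. Lambda invokes handler(event, context)
# once per trigger; the return value is the function's result.
import json

def handler(event, context):
    # Build a response from the incoming event payload.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"hello, {name}"})}
```

Locally, the handler can be exercised by calling it directly with a dict event and `None` for the context.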
6. Benefits of AWS Lambda
• No servers to manage - AWS Lambda handles operations and management, provisioning and utilization, scaling, and availability and fault tolerance
• Continuous scaling - automatically scales your application, running code in response to each trigger; your code runs in parallel, processing each trigger individually and scaling precisely with the size of the workload
• Subsecond metering - CPU and network scale with allocated RAM (128 MB to 1500 MB); pricing is $0.20 per 1M requests plus a charge per 100 ms of execution
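The pricing bullets above translate into a quick back-of-the-envelope cost estimate. The per-GB-second rate below is the public Lambda list price of this era and is an assumption; check current pricing:

```python
# Lambda cost estimator following the model on the slide: a per-request fee
# plus a per-100 ms compute fee that scales with allocated memory.
# Rates are assumed 2017 list prices.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # $0.20 per 1M requests
PRICE_PER_GB_SECOND = 0.00001667       # compute price per GB-second (assumed)

def lambda_cost(requests, avg_duration_ms, memory_mb):
    """Estimated cost in dollars, ignoring the free tier."""
    if requests == 0:
        return 0
    # Duration is metered in 100 ms increments (ceiling).
    billed_ms = -(-avg_duration_ms // 100) * 100
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# Example: 10M invocations at 250 ms average with 512 MB allocated.
cost = lambda_cost(10_000_000, 250, 512)
```

Because more memory also buys more CPU, the cheapest configuration is not always the smallest one - hence the slide's advice to tune memory per function.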
9. Benefits of AWS Step Functions
• Productivity - easy to connect and coordinate distributed components and microservices to quickly create apps
• Agility - diagnose and debug problems faster; adapt to change
• Resilience - manages the operations and infrastructure of service coordination to ensure availability at scale and under failure
10. Application Lifecycle in AWS Step Functions
Define in JSON → Visualize in the Console → Monitor Executions
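To make "define in JSON" concrete, here is a minimal Amazon States Language definition sketched as a Python dict (the Task resource ARN is a placeholder, not a real function):

```python
# A minimal Amazon States Language state machine built as a Python dict.
# The Resource ARN is a placeholder; in practice it points at a Lambda
# function or a Step Functions activity.
import json

state_machine = {
    "Comment": "Minimal two-state workflow",
    "StartAt": "DoWork",
    "States": {
        "DoWork": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT:function:do-work",
            "Retry": [{"ErrorEquals": ["States.ALL"],
                       "IntervalSeconds": 60,
                       "MaxAttempts": 4,
                       "BackoffRate": 2.0}],
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}

# This JSON string is what you would pass when creating the state machine.
definition_json = json.dumps(state_machine, indent=2)
```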
11. Amazon DynamoDB
Fast and Flexible NoSQL Database Service
• NoSQL database
• Seamless scalability
• Zero admin
• Single-digit millisecond latency
12. Amazon DynamoDB
Fully Managed NoSQL • Document or Key-Value • Scales to Any Workload • Fast and Consistent • Access Control • Event Driven Programming
14. IARPA MICrONS
Intelligence Advanced Research Projects Activity - Machine Intelligence from Cortical Networks
• MICrONS seeks to revolutionize machine learning by understanding the representations, transformations, and learning rules employed by the brain
• The program is expressly designed as a dialogue between computer science, data science, and neuroscience
[Diagram: a neurally plausible machine learning framework, informed by a loop of behavior experiments, functional imaging, structural imaging, and data analysis]
15. Why Is This Different?
• Current neural networks are "neurally inspired" but not considered biofidelic or neurally plausible
• Previous projects to build algorithms based on the brain exist, but have been focused on macro- and micro-scale information, or lower-fidelity statistics
• Little is known about the brain at the mesoscale
• A "cortical column" is theorized to be on the order of ~1 mm³
• In this program, structure and function co-registration provides a uniquely rich picture of computing circuits
• Researchers are directly measuring mesoscale activity and circuits
[Diagram: scales of brain mapping - microscale (1-100s of neurons), mesoscale (1k-1M neurons, the open question), macroscale (brain regions, e.g. the Human Connectome Project)]
16. Why Is This Different: Functional Imaging
Video Credit: Tianyu Wang (Xu Lab, Cornell University) & Jacob Reimer (Tolias Lab, Baylor College of Medicine)
17. Why Is This Different: Structural Imaging
• Peta-scale structural imaging
• A 1 mm³ region is large enough to contain meaningful circuits never before observed:
  • ~50k-100k neurons
  • ~100,000,000 synapses
  • ~4x4x30 nm voxels
  • ~2-2.5 PB
• Three different techniques:
  • Scanning Electron Microscopy (SEM)
  • Transmission Electron Microscopy (TEM)
  • Fluorescent in situ sequencing (FISSEQ) barcoding
Video credit: Kasthuri, et al. - Cell 2015; Bobby Kasthuri, Daniel Berger, Jeff Lichtman
18. Why Is This Different: Co-registered Data
• Co-registration links structure to function
• For the first time, researchers will measure in the same sample at scale:
  • Stimulus ("input")
  • Behavior ("output")
  • Connectome ("circuit diagram")
  • Neuronal activity ("voltages")
Calcium imaging data: Tolias Lab, Baylor College of Medicine
X-ray tomography and co-registration: Allen Institute for Brain Science
19. Why Can We Succeed Now?
• New imaging techniques and engineering advances allow for interrogation of mesoscale circuits
• Increased computing power has enabled automated analysis with machine learning
• Reduced storage costs have made collection of many petabytes of data possible
• The cloud provides the ability to scale when needed and facilitates sharing and collaboration
We can directly observe and densely reconstruct mesoscale neuronal circuits in vivo for the first time
20. The Boss
Block and Object Storage Service
• The Boss is a multi-dimensional spatial database, provided as a managed service on AWS
• The Boss stores annotation data co-registered to image data
• An annotation is a unique 64-bit identifier applied to a set of voxels, representing its spatial distribution
[Figure: example annotations labeled with IDs 1267, 345345, and 534534799]
22. Boss API Overview
The Boss is accessible through a versioned REST API
[Diagram: on-premise clients connect to the Boss services - Ingest Service, User Service, Group Service, Resource Service, Permission Service, Object Service, Tile Service, Downsample Service, Cutout Service, and Metadata Service]
24. The Boss Leverages Serverless Components
• DynamoDB: experimental metadata, annotation index, cuboid index, tile index
• Lambda: downsampling, ingest, cache page-in and page-out operations, DNS updates
• SQS: ingest upload tasks, reliable Lambda processing
• Step Functions: downsample workflow, ingest workflow, asynchronous delete workflow
• S3: cuboid storage, tile storage, static hosting
26. Heaviside
Python library and DSL for working with AWS Step Functions
• The Step Functions state machine language, while flexible, is hard to write and maintain
• Heaviside is a Python package that provides several components to make Step Functions easy to use:
  • DSL and compiler - greatly simplifies writing and maintaining Step Function JSON definitions
  • Library for creating and executing Step Functions in AWS
  • A framework for running Activities
https://github.com/jhuapl-boss/heaviside
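To see the kind of boilerplate a DSL can factor out, note that the compiled JSON on the next slide repeats an identical Retry block for nearly every task. A small helper in the spirit of Heaviside's compiler (this sketch is not Heaviside's actual API) generates each Task-with-retries state from one call:

```python
# Generate an Amazon States Language Task state with standard retries,
# replacing ~15 lines of hand-written JSON per task. Illustrative helper,
# not Heaviside's real API.
import json

def task(name, resource, next_state=None, interval=60, attempts=4):
    """Build a Task state with the Retry policy used throughout the example."""
    state = {
        "Type": "Task",
        "Comment": name.replace("_", " "),
        "Resource": resource,
        "Retry": [{"ErrorEquals": ["States.ALL"],
                   "IntervalSeconds": interval,
                   "MaxAttempts": attempts,
                   "BackoffRate": 2.0}],
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state

# One line per task instead of a hand-maintained JSON block:
states = {
    "delete_metadata": task(
        "delete_metadata",
        "arn:aws:states:REGION:ACCOUNT:activity:delete_metadata"),
}
definition = json.dumps({"StartAt": "delete_metadata", "States": states}, indent=2)
```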
27. Heaviside Example

{
"Comment": "Delete Cuboid\nRemoves all of the different data related to a given cuboid,\nremoves the actual cuboid data, and then cleans up the final\nbookkeeping for the cuboid\n",
"States": {
"Line7": {
"Next": "merge_parallel_outputs",
"Branches": [
{
"States": {
"delete_metadata": {
"Comment": "deletes metadata",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:delete_metadata-integration-boss",
"End": true,
"Retry": [
{
"IntervalSeconds": 60,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
],
"Type": "Task"
}
},
"StartAt": "delete_metadata"
},
{
"States": {
"delete_id_count": {
"Comment": "deletes from dynamodb table idcount",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:delete_id_count-integration-boss",
"End": true,
"Retry": [
{
"IntervalSeconds": 60,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
],
"Type": "Task"
}
},
"StartAt": "delete_id_count"
},
{
"States": {
"delete_id_index": {
"Comment": "deletes from dyanmodb table idindex",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:delete_id_index-integration-boss",
"End": true,
"Retry": [
{
"IntervalSeconds": 60,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
],
"Type": "Task"
}
},
"StartAt": "delete_id_index"
}
],
"Type": "Parallel"
},
"merge_parallel_outputs": {
"Comment": "merges the outputs of all the parallel activities into a single dictionary",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:merge_parallel_outputs-integration-boss",
"Next": "find_s3_index",
"Retry": [
{
"IntervalSeconds": 60,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
],
"Type": "Task"
},
"find_s3_index": {
"Comment": "finds data to delete from s3index and s3",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:find_s3_index-integration-boss",
"Next": "delete_s3_index",
"Retry": [
{
"IntervalSeconds": 60,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
],
"Type": "Task"
},
"delete_s3_index": {
"Comment": "deletes data from s3index and s3",
"Catch": [
{
"ErrorEquals": [
"States.ALL"
],
"ResultPath": "$.error",
"Next": "notify_admins"
}
],
"Resource": "arn:aws:states:us-east-1:451493790433:activity:delete_s3_index-integration-boss",
"Next": "delete_clean_up",
"Type": "Task",
"Retry": [
{
"IntervalSeconds": 120,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
]
},
"notify_admins": {
"Comment": "sends SNS message to microns topic",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:notify_admins-integration-boss",
"Next": "delete_clean_up",
"Type": "Task"
},
"delete_clean_up": {
"Comment": "cleans up the delete s3 table.",
"Resource": "arn:aws:states:us-east-1:451493790433:activity:delete_clean_up-integration-boss",
"End": true,
"Retry": [
{
"IntervalSeconds": 120,
"MaxAttempts": 4,
"ErrorEquals": [
"States.ALL"
],
"BackoffRate": 2.0
}
],
"Type": "Task"
}
},
"StartAt": "Line7"
}
(The JSON above is the output of the Heaviside compiler.)
28. Downsample Deep Dive: Overview
• Problem description
  • Need to iteratively downsample a dataset to build a resolution hierarchy
  • Enables "zooming out" for large-scale visualization and analysis
  • Workflow is run infrequently and on demand by users
  • Workflow needs to scale from 2 GB to 2 PB of data
• Implementation
  • Use a Step Function to manage failures and iterate processing
  • Since downsampling is "embarrassingly parallel", invoke Lambda in parallel to perform the image processing
• Serverless benefit
  • Can massively scale processing for a short period of time, on demand, without an administrator in the loop
  • Don't need to worry about high availability
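As a rough sketch of the per-invocation work, a 2x downsample can be expressed as averaging 2x2x2 voxel blocks. Pure Python for clarity - this is an illustration, not the Boss's implementation, which operates on compressed 3D cuboids:

```python
# Reduce a 3D volume's resolution by 2x in each dimension by averaging
# 2x2x2 blocks of voxels. Illustrative sketch of the downsample step.
def downsample_2x(volume):
    """volume: nested lists [z][y][x] with even dimensions; returns half-size volume."""
    zs, ys, xs = len(volume), len(volume[0]), len(volume[0][0])
    out = []
    for z in range(0, zs, 2):
        plane = []
        for y in range(0, ys, 2):
            row = []
            for x in range(0, xs, 2):
                # Average the 8 voxels of the 2x2x2 block.
                block = [volume[z + dz][y + dy][x + dx]
                         for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]
                row.append(sum(block) / 8)
            plane.append(row)
        out.append(plane)
    return out
```

Applying `downsample_2x` iteratively is what builds the resolution hierarchy the slide describes; the Step Function drives one iteration per level.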
30. Downsample Deep Dive: Process
• User requests a channel to be downsampled via the API
• Step Function invokes Lambdas in parallel to downsample data while providing status to the API
• Step Function iterates automatically to build the resolution hierarchy
31. Ingest Deep Dive: Overview
• Problem description
  • Need to transfer large amounts of on-premise image data into the Boss
  • Support both data transfer to the cloud and "ingest" into the Boss format
  • Workflow is run infrequently and on demand by users, but often in "bursts" as teams deliver data for the same deadlines
  • Workflow needs to scale from 2 GB to 2 PB of data
• Implementation
  • Use SQS, S3, Lambda, and DynamoDB to provide a high-throughput, reliable upload and processing pipeline
• Serverless benefits
  • Don't need to keep servers up when the workflow is not running
  • Can massively scale processing for a short period of time, on demand
32. Ingest Deep Dive: Create an Ingest Job
• The ingest process is on demand and can be started at any time
• User uploads a configuration file
• Boss API creates a temporary task queue
33. Ingest Deep Dive: Populate Upload Queue
• Step Function invoked to populate the Upload Task Queue
• First, Lambda is called in parallel to upload messages to SQS
• Next, the Step Function waits to allow SQS to become consistent
• Finally, the Step Function verifies the number of messages in the queue is correct
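The fan-out arithmetic behind this step can be sketched as splitting the full list of tile-upload tasks across parallel Lambda invocations and then checking the expected message count (names are illustrative, not the Boss's actual code):

```python
# Split upload-task population across parallel Lambda invocations, then
# verify no messages were lost. Illustrative sketch of the queue-population
# and verification states.
import math

def partition_tasks(num_tiles, lambdas):
    """Return (start, count) ranges, one per Lambda invocation."""
    per_lambda = math.ceil(num_tiles / lambdas)
    ranges = []
    start = 0
    while start < num_tiles:
        count = min(per_lambda, num_tiles - start)
        ranges.append((start, count))
        start += count
    return ranges

def verify(ranges, expected):
    # Mirrors the final Step Function state: queue depth must match the
    # number of tasks that should have been enqueued.
    return sum(count for _, count in ranges) == expected

ranges = partition_tasks(10_000, 3)   # three parallel Lambdas
ok = verify(ranges, 10_000)
```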
34. Ingest Deep Dive: Upload Tiles
• The ingest client operates distributed and in parallel, uploading tiles as fast as possible to Amazon S3
• Amazon S3 PUT events invoke Lambda to track tiles
• When enough tiles arrive, a second Lambda is asynchronously invoked to ingest the tiles
35. Ingest Deep Dive: Ingest Cuboids
• Lambda function converts image tiles into compressed 3D matrices
• Processed data is written to the final S3 bucket and indexed
• Temporary image files are deleted
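The tile-tracking logic of slides 34-35 can be sketched as a counter that fires once a cuboid's tiles are complete. A dict stands in for the DynamoDB tile index, and the tile count and names are assumptions for illustration:

```python
# Sketch of tile tracking: each S3 PUT event marks a tile as arrived; once
# all tiles of a cuboid are present, ingestion is triggered. An in-memory
# dict stands in for the DynamoDB tile index.
TILES_PER_CUBOID = 16          # assumed number of tiles per cuboid

tile_index = {}                # cuboid_key -> set of arrived tile ids

def on_tile_uploaded(cuboid_key, tile_id, ingest):
    """Called per S3 PUT event; `ingest` runs once the cuboid is complete."""
    arrived = tile_index.setdefault(cuboid_key, set())
    arrived.add(tile_id)
    if len(arrived) == TILES_PER_CUBOID:
        # In the Boss, this is an asynchronous invocation of a second Lambda.
        ingest(cuboid_key)
        del tile_index[cuboid_key]

# Simulate all 16 tiles of one cuboid arriving:
triggered = []
for i in range(TILES_PER_CUBOID):
    on_tile_uploaded("chan1/x0y0z0", i, triggered.append)
```

Because S3 events can arrive out of order and in parallel, the real index lives in DynamoDB, where a conditional update makes the "last tile" check safe across concurrent Lambdas.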
36. Ingest Deep Dive: Benefit of Serverless Ingest
• Ingest rate is limited only by the user's local resources and bandwidth
• Supports multiple users ingesting in parallel
• Does not impact the rest of the system's performance
• Scales automatically
Current transfers have reached >4 Gbps
37. Lambda Design Considerations
• Duration and memory limitations
  • The 5-minute duration and 1.5 GB memory maximums can limit applications
  • More memory = more CPU: your Lambda will run FASTER but cost MORE per 100 ms
  • Optimize allocated memory independently for each Lambda function to minimize cost
  • Code and dependencies (virtualenv) limited to 250 MB
• Lambda capacity is tied to execution duration
  • If your Lambda calls external services (e.g. DynamoDB, S3), network and external latencies WILL affect execution time
  • This can result in interesting failure modes and cascading failures
  • As your Lambda starts to throttle and automatically retry, things can continue to back up even more
  • Circuit breakers and other resilient design patterns are useful
38. DynamoDB Design Considerations
• Object size drives capacity
  • As a read or write grows in size, consumed capacity increases
  • The largest record size uses 400x the capacity of the smallest
• When you pay for capacity you are actually paying for partitions
  • If you need to deal with a hot partition you need to DOUBLE your capacity
• Beware of the hot partition
  • Happens when you read/write heavily to keys in the same partition
  • Can be very confusing, as you have provisioned plenty of capacity but still get throttled
  • Be sure to "spread" your keys across partitions - prepend a hash!
Units of capacity required for writes = number of item writes per second x item size in 1 KB blocks
Units of capacity required for reads = number of item reads per second x item size in 4 KB blocks
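The two capacity formulas above, plus the "prepend a hash" key-sharding trick, can be written as small helpers (illustrative names, not the Boss's code; the read formula here assumes strongly consistent reads):

```python
# DynamoDB capacity math from the slide, plus a hash-prefix key sharder
# to spread hot keys across partitions.
import hashlib
import math

def write_capacity_units(writes_per_sec, item_size_bytes):
    # Writes are billed in 1 KB blocks.
    return writes_per_sec * math.ceil(item_size_bytes / 1024)

def read_capacity_units(reads_per_sec, item_size_bytes):
    # Strongly consistent reads are billed in 4 KB blocks.
    return reads_per_sec * math.ceil(item_size_bytes / 4096)

def sharded_key(key, shards=16):
    """Prepend a stable hash prefix so hot keys spread across partitions."""
    prefix = int(hashlib.md5(key.encode()).hexdigest(), 16) % shards
    return f"{prefix:02d}#{key}"

# 100 writes/s of 2.5 KB items round up to 3 KB blocks each -> 300 WCU.
wcu = write_capacity_units(100, 2500)
```

The hash prefix must be derivable from the key itself so reads can compute the same partition key; randomized prefixes would force a scatter-gather read.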
39. Scaling Up Lambda and DynamoDB
• If you want to scale Lambda you must raise your limits
  • Increase your Lambda concurrency limit
  • If you deploy your Lambda function into a VPC, make sure your network architecture can handle the bandwidth, and increase your ENI limit to match your Lambda capacity
  • If interacting with S3 heavily, pre-shard your bucket
• Use DynamoDB Auto Scaling!
  • DynamoDB can scale up infinitely, but only scales down four times a day
• TEST, then TEST, and then TEST again
  • Attempt to model user behavior with end-to-end regression tests
  • Update your model of user behavior over time
  • Look into error and log aggregators - when things go bad, they go pretty bad, so it's hard to debug
41. Acknowledgements
JHU/APL: Denise D'Angelo, Tim Gion, Sandy Hider, Priya Manavalan, Jordan Matelsky, Derek Pryor, Will Gray Roncal, Brock Wester
IARPA: David A. Markowitz, R. Jacob Vogelstein
JHU: Alex Baden, Kunal Lillaney, Randal Burns
Team 1: David Cox, Hanspeter Pfister, Jeff Lichtman
Team 2: Andreas Tolias, Sebastian Seung, R. Clay Reid, Nuno da Costa
Team 3: George Church, Sandra Kuhlman, Tai Sing Lee, Alan Yuille