Your easy move to serverless computing and
radically simplified data processing
Dr. Gil Vernik, IBM Research
About myself
• Gil Vernik
• IBM Research from 2010
• PhD in mathematics. Post-doc in Germany
• Architect, 25+ years of development experience
• Active in open source
• Recent interest – Cloud. Hybrid cloud. Big Data. Storage. Serverless
Twitter: @vernikgil
https://www.linkedin.com/in/gil-vernik-1a50a316/
Agenda
What problem we solve
Why serverless computing
How to make an easy move to serverless
Use cases
This project has received funding from the European Union’s Horizon 2020 research and innovation
programme under grant agreement No 825184.
http://cloudbutton.eu
The motivation..
Simulations
• Alice is working in the risk
management department at the bank
• She needs to evaluate a new contract
• She decided to run a Monte-Carlo
simulation to evaluate the contract
• About 100,000,000 calculations are needed to get a good
estimate
The challenge
How and where to scale the code of Monte Carlo simulations?
Business logic
Data processing
• Maria needs to run face detection using TensorFlow over millions of
images. The process requires raw images to be pre-processed
before used by TensorFlow
• Maria wrote code and tested it on a single image
• Now she needs to execute the same code at massive scale, with
parallelism, on terabytes of data stored in object storage
Raw image Pre-processed image
The challenge
How to scale the code to run in parallel on terabytes of data without becoming
a systems expert in scaling code and learning storage semantics?
IBM Cloud Object
Storage
Mid summary
• How and where to scale the code?
• How to process massive data sets without becoming a storage
expert?
• How to scale certain flows from the existing applications
without major disruption to the existing system?
VMs, containers and the rest
• The naïve way to scale an application: provision highly resourced virtual
machines and run the application there
• Complicated, time consuming, expensive
• A recent trend is to leverage container platforms
• Containers have better granularity compared to VMs, better resource
allocation, and so on
• Docker containers became popular, yet many challenges remain in how to
”containerize” existing code or applications
• Comparing VMs and containers is beyond the scope of this talk…
• Leverage Function as a Service platforms
FaaS - "Hello Strata NY”
Deploy the code
(as specified by the FaaS provider)
Invoke “helloStrata”
“Hello Strata NY”
Invoke “helloStrata”
“Hello Strata NY”
# main() will be invoked when you Run This Action.
#
# @param Cloud Functions actions accept a single parameter,
#        which must be a JSON object.
#
# @return a JSON object, which will be the output of this action.

import sys

def main(dict):
    if 'name' in dict:
        name = dict['name']
    else:
        name = 'Strata NY'
    greeting = 'Hello ' + name + '!'
    print(greeting)
    return {'greeting': greeting}
IBM Cloud Functions
”helloStrata”
FaaS
code()
Event Action
Deploy
the code
Input
Output
• Unit of computation is a function
• Function is a short lived task
• Smart activation, event driven, etc.
• Usually stateless
• Transparent auto-scaling
• Pay only for what you use
• No administration
• All other aspects of the execution are
delegated to the Cloud Provider
Function as a Service
IBM Cloud Functions
Are there still challenges?
• How to integrate FaaS into existing applications and frameworks
without major disruption?
• Users need to be familiar with the APIs of the storage and FaaS platforms
• How to control and coordinate invocations?
• How to scale the input and generate the output?
Push to the Cloud
• Occupy the Cloud: Distributed Computing for the 99%,
(Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, Benjamin Recht, 2017)
• Why is it still ”complicated” to move workflows to the Cloud?
Users need to be familiar with the cloud provider’s API, use deployment
tools, write code according to the cloud provider’s spec, and so on.
• Can FaaS be used for a broad scope of flows? (RISELab at UC Berkeley, 2017)
PyWren - an open source framework released
Push to the cloud with PyWren
• Serverless for more use cases
(not just event based or “Glue” for services)
• Push to the Cloud experience
• Designed to scale Python applications at massive scale
Python code → Serverless action 1, Serverless action 2, …, Serverless action 1000
Cloud Button Toolkit
• PyWren-IBM ( aka CloudButton Toolkit) is a novel Python
framework extending PyWren
• ~800 commits to PyWren-IBM on top of PyWren
• Being developed as part of CloudButton project
• Led by IBM Research Haifa
• Open source https://github.com/pywren/pywren-ibm-cloud
PyWren-IBM example

import pywren_ibm_cloud as cbutton

data = [1, 2, 3, 4]

def my_map_function(x):
    return x + 7

cb = cbutton.ibm_cf_executor()
cb.map(my_map_function, data)
print(cb.get_result())
# [8, 9, 10, 11]

(runs on IBM Cloud Functions)
https://www.youtube.com/watch?v=EYa95KyYEtg
PyWren-IBM example

import pywren_ibm_cloud as cbutton

data = "cos://mybucket/year=2019/"

def my_map_function(obj, boto3_client):
    # business logic
    return obj.name

cb = cbutton.ibm_cf_executor()
cb.map(my_map_function, data)
print(cb.get_result())
# [d1.csv, d2.csv, d3.csv, ...]

(runs on IBM Cloud Functions)
Unique differentiations of PyWren-IBM
• Pluggable implementation for FaaS platforms
• IBM Cloud Functions, Apache OpenWhisk, OpenShift by Red Hat, Kubernetes
• Supports Docker containers
• Seamless integration with Python notebooks
• Advanced input data partitioner
• Data discovery to process large amounts of data stored in IBM Cloud Object
Storage, chunking of CSV files, support for user-provided partition logic
• Unique functionalities
• Map-Reduce, monitoring, retry, in-memory queues, authentication token reuse,
pluggable storage backends, and many more..
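The input partitioner can be pictured with a small sketch. This is not PyWren-IBM's actual implementation, only an illustration of the idea: chunk CSV text on line boundaries so that no row is split across two serverless invocations.

```python
# Illustrative sketch only (not PyWren-IBM's real partitioner): split CSV
# text into byte-sized chunks, extending each chunk to the next newline so
# that rows are never split across invocations.
def partition_csv(text, chunk_size):
    chunks, start = [], 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # extend the chunk until it ends on a line boundary
        while end < len(text) and text[end - 1] != '\n':
            end += 1
        chunks.append(text[start:end])
        start = end
    return chunks

csv_text = "a,1\nb,2\nc,3\nd,4\n"
chunks = partition_csv(csv_text, 5)
print(chunks)  # every chunk ends at a newline; joining them restores the file
```

Each chunk could then be handed to one invocation of the user's map function.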
What is PyWren-IBM good for?
• Batch processing, UDFs, ETL, HPC and Monte Carlo simulations
• Embarrassingly parallel workloads or problems - often cases where there is little or no
dependency or need for communication between parallel tasks
• A subset of map-reduce flows
Input Data → Tasks 1, 2, 3, …, n → Results
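These map and map-reduce flows can be sketched with a tiny in-process stand-in. The names echo PyWren-IBM's style, but this is only a sketch, not the library's code:

```python
# A minimal in-process stand-in for the embarrassingly parallel
# map / map-reduce pattern described above.
from concurrent.futures import ThreadPoolExecutor

def local_map(map_function, iterdata, workers=4):
    # embarrassingly parallel: no communication between tasks
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(map_function, iterdata))

def local_map_reduce(map_function, iterdata, reduce_function):
    return reduce_function(local_map(map_function, iterdata))

print(local_map(lambda x: x + 7, [1, 2, 3, 4]))           # [8, 9, 10, 11]
print(local_map_reduce(lambda x: x * x, range(10), sum))  # 285
```

With PyWren-IBM, the same map function fans out as serverless invocations instead of local threads.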
What does PyWren-IBM require?
Function as a Service platform
• IBM Cloud Functions,
Apache OpenWhisk
• OpenShift, Kubernetes, etc.
Storage accessible from the Function as a Service platform through the S3 API
• IBM Cloud Object Storage
• Red Hat Ceph
PyWren-IBM and HPC
What is HPC?
• High Performance Computing
• Mostly used to solve advanced problems such as
simulations, analysis, research problems, etc.
• Is HPC well defined? It depends whom you ask
• Supercomputers, highly parallel processing, or both?
• MPI (Message Passing Interface) for communication, or
just a need to exchange results between simulations?
• Data locality or ”fast” access to the data?
• Super fast? ”Fast” enough? Or good enough?
HPC and “super” computers
• Dedicated HPC super computers
• Designed to be super fast
• Calculations usually rely on Message
Passing Interface (MPI)
• Pros : HPC super computers
• Cons: HPC super computers
Dedicated HPC supercomputers
HPC simulations
HPC and VMs
• No need to buy expensive machines
• Frameworks to run HPC flows over VMs
• Flows usually depend on MPI and data locality
• Recent academic interest
• Pros : Virtual Machines
• Cons: Virtual Machines
Virtual machines (private, cloud, etc.)
HPC simulations
HPC and Containers
Containers
• Good granularity, parallelism, resource
allocation, etc.
• Research papers, frameworks
• Singularity / Docker containers
• Pros: containers
• Cons: many efforts focus on moving entire
applications into containers, which
usually requires re-designing the applications
HPC simulations
HPC and FaaS with PyWren-IBM
HPC simulations
Containers
• FaaS is a perfect platform to scale code and
applications
• Many FaaS platforms allow users to use
Docker containers
• Code can contain any dependencies
• PyWren-IBM is a natural fit for many HPC
flows
• Pros : the easy move to serverless
• Try it yourself…
PyWren-IBM over FaaS
Use cases and demos..
IBM Cloud
Object Storage
PyWren-IBM framework
https://github.com/pywren/pywren-ibm-cloud
IBM Cloud
Functions
Monte Carlo and PyWren-IBM
PyWren is a natural fit for scaling Monte Carlo
computations across a FaaS platform
Users need only write the business logic and PyWren does
the rest
Monte Carlo methods are a broad class of computational algorithms
used to evaluate risk and uncertainty (e.g. investments in projects);
they are popular methods in finance
Stock price prediction
• A mathematical approach for stock price modelling. More accurate for
modelling prices over longer periods of time
• We run Monte Carlo stock prediction over IBM Cloud Functions with
PyWren-IBM
• With PyWren-IBM the total code is ~40 lines. Without PyWren-IBM,
running the same computation requires hundreds of additional lines of code
Number of forecasts | Local run (1 CPU, 4 cores) | IBM CF      | Total number of CF invocations
100,000             | 10,000 seconds             | ~70 seconds | 1000
• We ran 1000 concurrent invocations, each consuming 1024MB of memory
• Each invocation predicted a forecast of 1080 days and used 100 random samples per prediction. In total we did 108,000,000 calculations
About 2500 forecasts predicted a stock price of around $130
https://www.youtube.com/watch?v=vF5HI2q5VKw
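The forecast step above can be sketched as follows. This is a hedged illustration, not the talk's actual code: one function simulates geometric Brownian motion over 1080 days with 100 samples, and the parameters (s0, mu, sigma) are made-up values for demonstration.

```python
# Sketch of one Monte Carlo stock-forecast "invocation": average final
# price of `samples` geometric-Brownian-motion walks over `days` days.
# s0, mu and sigma are illustrative assumptions, not the talk's values.
import math
import random
import statistics

def forecast(seed, s0=130.0, mu=0.0002, sigma=0.01, days=1080, samples=100):
    rng = random.Random(seed)
    finals = []
    for _ in range(samples):
        price = s0
        for _ in range(days):
            price *= math.exp((mu - 0.5 * sigma ** 2) + sigma * rng.gauss(0, 1))
        finals.append(price)
    return statistics.mean(finals)

# Locally this is a plain map; with PyWren-IBM the same function would be
# handed to cb.map(forecast, seeds) to fan out as concurrent invocations.
seeds = range(8)  # the talk used 1000 concurrent invocations
results = [forecast(s) for s in seeds]
print(round(statistics.mean(results), 2))
```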
Protein Folding
• Proteins are biological polymers that carry out most
of the cell’s day-to-day functions.
• Protein structure leads to protein function
• Proteins are made from a linear chain of amino acids
and folded into a variety of 3-D shapes
• Protein folding is a complex process
that is not yet completely understood
Replica exchange
• Monte Carlo simulations are popular methods to predict protein folding
• ProtoMol is a framework specially designed for molecular dynamics
• http://protomol.sourceforge.net
• A highly parallel replica exchange molecular dynamics (REMD) method
is used with Monte Carlo exchanges for efficient sampling
• A series of tasks (replicas) are run in parallel at various temperatures
• From time to time the configurations of neighboring tasks are exchanged
• Various HPC frameworks allow running protein folding
• They depend on MPI
• VMs or dedicated HPC machines
Protein folding with PyWren-IBM
PyWren-IBM submits a job of X invocations, each running ProtoMol
PyWren-IBM collects the results of all invocations
The REMD algorithm uses the output of the invocations as the input to the next job
IBM Cloud Functions
Each invocation runs the ProtoMol library to perform
Monte Carlo simulations. ProtoMol uses MPI for
communication between threads
Our experiment – 99 jobs
• Each job executes many IBM CF invocations
• Each invocation runs 100 Monte Carlo steps
• Each step runs 10,000 molecular dynamics steps
• REMD exchanges the results of the completed job,
which are used as the input to the following job
• Our approach doesn’t use MPI
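The job loop above can be sketched as a driver that fans out replicas, collects their results, and feeds them to the next job. `run_replica` and its fields are hypothetical stand-ins for the ProtoMol invocation, not the experiment's actual code:

```python
# Hypothetical sketch of the REMD driver loop: each job fans out replicas,
# collects their results, and uses them as input to the next job. No MPI.
import random

def run_replica(state):
    # stand-in for an invocation running 100 Monte Carlo steps of
    # 10,000 molecular-dynamics steps each via ProtoMol
    rng = random.Random(state["step"] * 1000 + state["replica"])
    return {"replica": state["replica"],
            "step": state["step"] + 1,
            "energy": state["energy"] + rng.uniform(-1.0, 1.0)}

def remd_driver(n_jobs=3, n_replicas=4):
    states = [{"replica": i, "step": 0, "energy": 100.0} for i in range(n_replicas)]
    for _ in range(n_jobs):
        # with PyWren-IBM this would be cb.map(run_replica, states) + get_result()
        states = [run_replica(s) for s in states]
        # exchange: neighboring replicas swap configurations between jobs
        for i in range(0, n_replicas - 1, 2):
            states[i], states[i + 1] = states[i + 1], states[i]
    return states

final = remd_driver()
print(len(final))  # 4
```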
PyWren-IBM for batch data processing
PyWren-IBM for data processing
Face recognition experiment with PyWren-IBM over IBM Cloud
• Align faces, using open-source code, in 1000 images stored in IBM Cloud
Object Storage
• Given Python code that knows how to extract a face from a single image
• Run from any Python notebook
Processing images without PyWren-IBM
import logging
import os
import sys
import time
import shutil
import cv2
from openface.align_dlib import AlignDlib

logger = logging.getLogger(__name__)
temp_dir = '/tmp'

def preprocess_image(bucket, key, data_stream, storage_handler):
    """
    Detect face, align and crop :param input_path. Write output to :param output_path
    :param bucket: COS bucket
    :param key: COS key (object name) - may contain delimiters
    :param storage_handler: can be used to read / write data from / into COS
    """
    crop_dim = 180
    sys.stdout.write(".")
    # key of the form /subdir1/../subdirN/file_name
    key_components = key.split('/')
    file_name = key_components[len(key_components) - 1]
    input_path = temp_dir + '/' + file_name
    if not os.path.exists(temp_dir + '/' + 'output'):
        os.makedirs(temp_dir + '/' + 'output')
    output_path = temp_dir + '/' + 'output/' + file_name
    with open(input_path, 'wb') as localfile:
        shutil.copyfileobj(data_stream, localfile)
    if not os.path.isfile(temp_dir + '/' + 'shape_predictor_68_face_landmarks'):
        res = storage_handler.get_object(bucket, 'lfw/model/shape_predictor_68_face_landmarks.dat',
                                         stream=True)
        with open(temp_dir + '/' + 'shape_predictor_68_face_landmarks', 'wb') as localfile:
            shutil.copyfileobj(res, localfile)
    align_dlib = AlignDlib(temp_dir + '/' + 'shape_predictor_68_face_landmarks')
    image = _process_image(input_path, crop_dim, align_dlib)
    if image is not None:
        cv2.imwrite(output_path, image)
        f = open(output_path, "rb")
        processed_image_path = os.path.join('output', key)
        storage_handler.put_object(bucket, processed_image_path, f)
        os.remove(output_path)
    os.remove(input_path)

def _process_image(filename, crop_dim, align_dlib):
    image = _buffer_image(filename)
    if image is None:
        raise IOError('Error buffering image: {}'.format(filename))
    return _align_image(image, crop_dim, align_dlib)

def _buffer_image(filename):
    logger.debug('Reading image: {}'.format(filename))
    image = cv2.imread(filename)
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

def _align_image(image, crop_dim, align_dlib):
    bb = align_dlib.getLargestFaceBoundingBox(image)
    aligned = align_dlib.align(crop_dim, image, bb,
                               landmarkIndices=AlignDlib.INNER_EYES_AND_BOTTOM_LIP)
    if aligned is not None:
        aligned = cv2.cvtColor(aligned, cv2.COLOR_BGR2RGB)
    return aligned

import ibm_boto3
import ibm_botocore
from ibm_botocore.client import Config
from ibm_botocore.credentials import DefaultTokenManager

t0 = time.time()
client_config = ibm_botocore.client.Config(signature_version='oauth',
                                           max_pool_connections=200)
api_key = config['ibm_cos']['api_key']
token_manager = DefaultTokenManager(api_key_id=api_key)
cos_client = ibm_boto3.client('s3', token_manager=token_manager,
                              config=client_config,
                              endpoint_url=config['ibm_cos']['endpoint'])

try:
    paginator = cos_client.get_paginator('list_objects_v2')
    page_iterator = paginator.paginate(Bucket="gilvdata", Prefix='lfw/test/images')
    print(page_iterator)
except ibm_botocore.exceptions.ClientError as e:
    print(e)

class StorageHandler:
    def __init__(self, cos_client):
        self.cos_client = cos_client

    def get_object(self, bucket_name, key, stream=False, extra_get_args={}):
        """
        Get object from COS with a key. Throws StorageNoSuchKeyError if the given key does not exist.
        :param key: key of the object
        :return: data of the object
        :rtype: str/bytes
        """
        try:
            r = self.cos_client.get_object(Bucket=bucket_name, Key=key, **extra_get_args)
            if stream:
                data = r['Body']
            else:
                data = r['Body'].read()
            return data
        except ibm_botocore.exceptions.ClientError as e:
            if e.response['Error']['Code'] == "NoSuchKey":
                raise StorageNoSuchKeyError(key)
            else:
                raise e

    def put_object(self, bucket_name, key, data):
        """
        Put an object in COS. Override the object if the key already exists.
        :param key: key of the object
        :param data: data of the object
        :type data: str/bytes
        :return: None
        """
        try:
            res = self.cos_client.put_object(Bucket=bucket_name, Key=key, Body=data)
            status = 'OK' if res['ResponseMetadata']['HTTPStatusCode'] == 200 else 'Error'
            try:
                logger.debug('PUT Object {} size {} {}'.format(key, len(data), status))
            except Exception:
                logger.debug('PUT Object {} {}'.format(key, status))
        except ibm_botocore.exceptions.ClientError as e:
            if e.response['Error']['Code'] == "NoSuchKey":
                raise StorageNoSuchKeyError(key)
            else:
                raise e

temp_dir = '/home/dsxuser/.tmp'
storage_client = StorageHandler(cos_client)
for page in page_iterator:
    if 'Contents' in page:
        for item in page['Contents']:
            key = item['Key']
            r = cos_client.get_object(Bucket='gilvdata', Key=key)
            data = r['Body']
Business logic vs. boilerplate
• Loop over all images
• Close to 100 lines of “boilerplate”
code to find the images, read and
write the objects, etc.
• The data scientist needs to be
familiar with the S3 API
• Execution time: approximately 36
minutes!
Processing images with PyWren-IBM
(The business-logic code is identical to the previous slide.)
pw = pywren.ibm_cf_executor(config=config, runtime='pywren-dlib-runtime_3.5')
bucket_name = 'gilvdata/lfw/test/images'
results = pw.map_reduce(preprocess_image, bucket_name, None, None).get_result()
Business logic vs. boilerplate
• Under 3 lines of “boilerplate”!
• The data scientist does not
need to use the S3 API!
• Execution time is 35 seconds
• 35 seconds as compared
to 36 minutes!
Input
Application Code
or Docker Image
Raw Satellite
Imagery
IBM Cloud Functions
With PyWren-IBM Processed Raster Data
Processed Raster Data
Meta Data
Processed Vector Data
• Horizontally Scalable
• Unified Imagery Ingestion Pipeline
• Efficient Output Query on Both Raster Data and
Vector Data
Satellite batch data processing
Output
PyWren-IBM for spatial metabolomics
Big data, mass spectrometry, computational biology
Organism (1 m), tissues, microbial plates (1 cm), single cells (1 μm)
Alexandrov team at EMBL Heidelberg
Spatial metabolomics across scales
Protsyuk et al, Nature Protocols 2018
Bouslimani et al, PNAS 2017
Palmer et al, Nature Methods 2017
Alexandrov et al, BioRxiv 2019
Rappez et al, BioRxiv 2019
Interdisciplinary: single-cell biology, -omics, computational applications
(inflammation, immunity, cancer), methods development, molecular image
analysis, ML, AI, big data
METASPACE use case
• Alexandrov team at EMBL develops novel computational
biology tools to reveal the spatial organization of
metabolic processes https://www.embl.de/research/units/scb/alexandrov/
• “Imaging mass spectrometry” application
• Uses Apache Spark, deployed across VMs in the cloud
Customer uploads medical image → customer chooses molecular databases →
data pre-processing and segmentation → molecular scan of datasets with
dedicated algorithms → generation of output images with results
Metabolomics with PyWren-IBM
Metabolomics application with PyWren-IBM
• We use PyWren-IBM to provide a prototype that deploys the metabolite
annotation engine as serverless actions in the IBM Cloud
• https://github.com/metaspace2020/pywren-annotation-pipeline
Benefits of PyWren-IBM
• Better control of data partitions
• Speed of deployment, no need for VMs
• Elasticity and automatic scaling
• And many more…
Molecular Databases
up to 100M molecular strings
Dataset Input
up to 50GB binary file
Behind the scenes
Molecular annotation engine
Image processing
IBM Cloud Functions
Results
Metabolite annotation engine
deployed by PyWren-IBM
tumor / brain
A whole-body section of a
mouse model showing
localization of glutamate.
Glutamate is a well-known
neurotransmitter abundant in
the brain. It however is linked to
cancer where it supports
proliferation and growth
of cancer cells. Both facts are
supported by the detected
localization, obtained using
METASPACE.
Data provided by Genentech.
glutamate
Annotation results
Summary
PyWren-IBM is a novel framework with advanced capabilities to run
user code in the cloud
We demonstrated the benefit of PyWren-IBM for HPC, molecular
biology and batch data pre-processing
For more use cases and examples visit our project page
https://github.com/pywren/pywren-ibm-cloud
All is open source
Gil Vernik
gilv@il.ibm.com
Thank you
gilv@il.ibm.com
Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...
Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...
CodeOps Technologies LLP
 
Move existing middleware to the cloud
Move existing middleware to the cloudMove existing middleware to the cloud
Move existing middleware to the cloud
Arthur De Magalhaes
 

Similar to Your easy move to serverless computing and radically simplified data processing (20)

Virtualization and cloud computing
Virtualization and cloud computingVirtualization and cloud computing
Virtualization and cloud computing
 
A Complete Guide Cloud Computing
A Complete Guide Cloud ComputingA Complete Guide Cloud Computing
A Complete Guide Cloud Computing
 
Unleashing Apache Kafka and TensorFlow in the Cloud

Unleashing Apache Kafka and TensorFlow in the Cloud
Unleashing Apache Kafka and TensorFlow in the Cloud

Unleashing Apache Kafka and TensorFlow in the Cloud

 
Java Agile ALM: OTAP and DevOps in the Cloud
Java Agile ALM: OTAP and DevOps in the CloudJava Agile ALM: OTAP and DevOps in the Cloud
Java Agile ALM: OTAP and DevOps in the Cloud
 
Serverless brewbox
Serverless   brewboxServerless   brewbox
Serverless brewbox
 
Right scale enterprise solution
Right scale enterprise solution Right scale enterprise solution
Right scale enterprise solution
 
Right scale enterprise solution
Right scale enterprise solution Right scale enterprise solution
Right scale enterprise solution
 
A non-technical introduction to Cloud Computing
A non-technical introduction to Cloud ComputingA non-technical introduction to Cloud Computing
A non-technical introduction to Cloud Computing
 
Accelerate Digital Transformation with IBM Cloud Private
Accelerate Digital Transformation with IBM Cloud PrivateAccelerate Digital Transformation with IBM Cloud Private
Accelerate Digital Transformation with IBM Cloud Private
 
Protecting Your Power Systems with Cloud-based HA/DR
Protecting Your Power Systems with Cloud-based HA/DRProtecting Your Power Systems with Cloud-based HA/DR
Protecting Your Power Systems with Cloud-based HA/DR
 
[Capitole du Libre] #serverless -  mettez-le en oeuvre dans votre entreprise...
[Capitole du Libre] #serverless -  mettez-le en oeuvre dans votre entreprise...[Capitole du Libre] #serverless -  mettez-le en oeuvre dans votre entreprise...
[Capitole du Libre] #serverless -  mettez-le en oeuvre dans votre entreprise...
 
RightScale Webinar: Operationalize Your Enterprise AWS Usage Through an IT Ve...
RightScale Webinar: Operationalize Your Enterprise AWS Usage Through an IT Ve...RightScale Webinar: Operationalize Your Enterprise AWS Usage Through an IT Ve...
RightScale Webinar: Operationalize Your Enterprise AWS Usage Through an IT Ve...
 
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflowsCloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Un-clouding the cloud
Un-clouding the cloudUn-clouding the cloud
Un-clouding the cloud
 
Docker Orchestration: Welcome to the Jungle! JavaOne 2015
Docker Orchestration: Welcome to the Jungle! JavaOne 2015Docker Orchestration: Welcome to the Jungle! JavaOne 2015
Docker Orchestration: Welcome to the Jungle! JavaOne 2015
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
How to Manage Your Cloud by Drupal (DrupalCon CPH 2010)
How to Manage Your Cloud by Drupal (DrupalCon CPH 2010)How to Manage Your Cloud by Drupal (DrupalCon CPH 2010)
How to Manage Your Cloud by Drupal (DrupalCon CPH 2010)
 
Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...
Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...
Evolve or Fall Behind: Driving Transformation with Containers - Sai Vennam - ...
 
Move existing middleware to the cloud
Move existing middleware to the cloudMove existing middleware to the cloud
Move existing middleware to the cloud
 

Recently uploaded

A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
AlessioFois2
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
oaxefes
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
Cell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docxCell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docx
vasanthatpuram
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
ytypuem
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
exukyp
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
eoxhsaa
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
asyed10
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理
一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理
一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理
1tyxnjpia
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 

Recently uploaded (20)

A presentation that explain the Power BI Licensing
A presentation that explain the Power BI LicensingA presentation that explain the Power BI Licensing
A presentation that explain the Power BI Licensing
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
Cell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docxCell The Unit of Life for NEET Multiple Choice Questions.docx
Cell The Unit of Life for NEET Multiple Choice Questions.docx
 
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
一比一原版(曼大毕业证书)曼尼托巴大学毕业证如何办理
 
UofT毕业证如何办理
UofT毕业证如何办理UofT毕业证如何办理
UofT毕业证如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
一比一原版美国帕森斯设计学院毕业证(parsons毕业证书)如何办理
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理
一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理
一比一原版(Sheffield毕业证书)谢菲尔德大学毕业证如何办理
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 

Your easy move to serverless computing and radically simplified data processing

  • 1. Your easy move to serverless computing and radically simplified data processing Dr. Gil Vernik, IBM Research
  • 2. About myself • Gil Vernik • IBM Research from 2010 • PhD in mathematics. Post-doc in Germany • Architect, 25+ years of development experience • Active in open source • Recent interest – Cloud. Hybrid cloud. Big Data. Storage. Serverless Twitter: @vernikgil https://www.linkedin.com/in/gil-vernik-1a50a316/
  • 3. Agenda What problem we solve Why serverless computing How to make an easy move to serverless Use cases
  • 4. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825184. http://cloudbutton.eu
  • 6. Simulations • Alice works in the risk management department at a bank • She needs to evaluate a new contract • She decided to run a Monte Carlo simulation to evaluate the contract • About 100,000,000 calculations are needed to get a good estimate
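To make the scale concrete, here is a toy sketch of such a simulation (the contract model and all numbers are hypothetical, not Alice's actual model): each estimate averages many independent random samples, and the error shrinks only like 1/sqrt(n), which is why sample counts in the hundreds of millions arise.

```python
import math
import random

def contract_payoff(rng):
    # Hypothetical contract: pays max(S - 100, 0), where the underlying
    # price S is lognormally distributed around 100.
    s = 100.0 * math.exp(rng.gauss(0.0, 0.2))
    return max(s - 100.0, 0.0)

def monte_carlo_estimate(n_samples, seed=42):
    # Average of n independent samples; standard error ~ 1/sqrt(n),
    # so tight estimates need very large n.
    rng = random.Random(seed)
    total = sum(contract_payoff(rng) for _ in range(n_samples))
    return total / n_samples

estimate = monte_carlo_estimate(100_000)
```

Because the samples are independent, the work splits trivially across workers, which is exactly what the rest of the talk exploits.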
  • 7. The challenge How and where to scale the code of Monte Carlo simulations? Business logic
  • 8. Data processing • Maria needs to run face detection using TensorFlow over millions of images. The process requires raw images to be pre-processed before being used by TensorFlow • Maria wrote code and tested it on a single image • Now she needs to execute the same code at massive scale, with parallelism, on terabytes of data stored in object storage Raw image → Pre-processed image
  • 9. The challenge How to scale the code to run in parallel on terabytes of data without becoming a systems expert in scaling code and learning storage semantics? IBM Cloud Object Storage
  • 10. Mid summary • How and where to scale the code? • How to process massive data sets without becoming a storage expert? • How to scale certain flows from existing applications without major disruption to the existing system?
  • 11. VMs, containers and the rest • A naïve way to scale an application is to provision highly resourced virtual machines and run the application there • Complicated, time consuming, expensive • A recent trend is to leverage container platforms • Containers have better granularity compared to VMs, better resource allocation, and so on • Docker containers became popular, yet many challenges remain in how to "containerize" existing code or applications • Comparing VMs and containers is beyond the scope of this talk… • Leverage Function as a Service platforms
  • 12. FaaS - "Hello Strata NY" Deploy the code (as specified by the FaaS provider). Invoke "helloStrata" → "Hello Strata NY"

# main() will be invoked when you Run This Action.
#
# @param Cloud Functions actions accept a single parameter,
#        which must be a JSON object.
#
# @return which must be a JSON object.
#         It will be the output of this action.
import sys

def main(dict):
    if 'name' in dict:
        name = dict['name']
    else:
        name = 'Strata NY'
    greeting = 'Hello ' + name + '!'
    print(greeting)
    return {'greeting': greeting}

IBM Cloud Functions - "helloStrata" FaaS
  • 13. Function as a Service • Unit of computation is a function • A function is a short-lived task • Smart activation, event driven, etc. • Usually stateless • Transparent auto-scaling • Pay only for what you use • No administration • All other aspects of the execution (deploy the code, event → action, input → output) are delegated to the Cloud Provider IBM Cloud Functions
  • 14. Are there still challenges? • How to integrate FaaS into existing applications and frameworks without major disruption? • Users need to be familiar with the APIs of the storage and the FaaS platform • How to control and coordinate invocations • How to scale the input and generate output
  • 15. Push to the Cloud • Occupy the Cloud: Distributed Computing for the 99% (Eric Jonas, Qifan Pu, Shivaram Venkataraman, Ion Stoica, Benjamin Recht, 2017) • Why is it still "complicated" to move workflows to the Cloud? Users need to be familiar with the cloud provider's API, use deployment tools, write code according to the cloud provider's spec, and so on • Can FaaS be used for a broad scope of flows? PyWren - an open source framework released by RISELab at UC Berkeley, 2017
  • 16. Push to the cloud with PyWren • Serverless for more use cases (not just event-based or "glue" for services) • A push-to-the-Cloud experience • Designed to scale Python applications at massive scale Python code → Serverless action 1, Serverless action 2, … Serverless action 1000
  • 17. CloudButton Toolkit • PyWren-IBM (aka the CloudButton Toolkit) is a novel Python framework extending PyWren • ~800 commits to PyWren-IBM on top of PyWren • Being developed as part of the CloudButton project • Led by IBM Research Haifa • Open source https://github.com/pywren/pywren-ibm-cloud
  • 18. PyWren-IBM example

import pywren_ibm_cloud as cbutton

data = [1, 2, 3, 4]

def my_map_function(x):
    return x + 7

cb = cbutton.ibm_cf_executor()
cb.map(my_map_function, data)
print(cb.get_result())   # [8, 9, 10, 11]

(runs on IBM Cloud Functions)
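For readers without an IBM Cloud account, the map semantics can be previewed locally with the standard library; this stand-in (not the PyWren-IBM API) runs the same my_map_function, with each task playing the role of one serverless invocation:

```python
from concurrent.futures import ThreadPoolExecutor

data = [1, 2, 3, 4]

def my_map_function(x):
    return x + 7

# Each submitted task stands in for one serverless action;
# pool.map preserves input order, like cb.get_result().
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(my_map_function, data))

print(results)  # [8, 9, 10, 11]
```

The point of PyWren-IBM is that the same two calls (map, get_result) fan out to thousands of cloud invocations instead of a local pool.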
  • 20. PyWren-IBM example

import pywren_ibm_cloud as cbutton

data = "cos://mybucket/year=2019/"

def my_map_function(obj, boto3_client):
    # business logic
    return obj.name

cb = cbutton.ibm_cf_executor()
cb.map(my_map_function, data)
print(cb.get_result())   # [d1.csv, d2.csv, d3.csv, …]

(runs on IBM Cloud Functions)
  • 21. Unique differentiations of PyWren-IBM • Pluggable implementations for FaaS platforms • IBM Cloud Functions, Apache OpenWhisk, OpenShift by Red Hat, Kubernetes • Supports Docker containers • Seamless integration with Python notebooks • Advanced input data partitioner • Data discovery to process large amounts of data stored in IBM Cloud Object Storage, chunking of CSV files, supports user-provided partition logic • Unique functionalities • Map-Reduce, monitoring, retry, in-memory queues, authentication token reuse, pluggable storage backends, and many more…
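The CSV chunking idea can be sketched in miniature (an in-memory toy; `chunk_csv` is a hypothetical helper, not part of PyWren-IBM, which works on byte ranges in object storage): cut near a target chunk size, then extend each cut to the next newline so no row is split across invocations.

```python
def chunk_csv(text, target_size):
    """Split CSV text into chunks of roughly target_size bytes,
    always cutting on a line boundary so rows stay intact."""
    chunks = []
    start = 0
    while start < len(text):
        end = start + target_size
        if end >= len(text):
            chunks.append(text[start:])
            break
        # extend the cut to the end of the current row
        nl = text.find('\n', end)
        if nl == -1:
            chunks.append(text[start:])
            break
        chunks.append(text[start:nl + 1])
        start = nl + 1
    return chunks

csv_data = "a,1\nb,2\nc,3\nd,4\ne,5\n"
parts = chunk_csv(csv_data, 6)
print(parts)  # ['a,1\nb,2\n', 'c,3\nd,4\n', 'e,5\n']
```

Each chunk can then be handed to a separate function invocation, which is what the partitioner automates.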
  • 22. What is PyWren-IBM good for? • Batch processing, UDFs, ETL, HPC and Monte Carlo simulations • Embarrassingly parallel workloads or problems - often the case where there is little or no dependency or need for communication between parallel tasks • A subset of map-reduce flows Input Data → Tasks 1, 2, 3, … n → Results
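The map-reduce subset mentioned above can be sketched locally (hypothetical stand-in code, not the PyWren-IBM API): independent map tasks run in parallel over partitions, and a single reduce folds the partial results.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_task(chunk):
    # Each task processes one partition independently -
    # no communication between tasks, hence "embarrassingly parallel".
    return sum(x * x for x in chunk)

def reduce_task(a, b):
    return a + b

partitions = [[1, 2], [3, 4], [5, 6]]
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(map_task, partitions))
total = reduce(reduce_task, partials)
print(total)  # 91
```

In PyWren-IBM the same shape is expressed with `map_reduce(map_func, data, reduce_func, ...)`, with the maps running as serverless invocations.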
  • 23. What does PyWren-IBM require? A Function as a Service platform • IBM Cloud Functions, Apache OpenWhisk • OpenShift, Kubernetes, etc. Storage accessible from the Function as a Service platform through the S3 API • IBM Cloud Object Storage • Red Hat Ceph
  • 24. PyWren-IBM and HPC
  • 25. What is HPC? • High Performance Computing • Mostly used to solve advanced problems - simulations, analysis, research problems, etc. • Is HPC well defined? - depends whom you ask • Super computers or highly parallel processing or both? • MPI (Message Passing Interface) for communication, or only a need to exchange results between simulations? • Data locality or "fast" access to the data? • Super fast? "Fast" enough? Or good enough?
  • 26. HPC and "super" computers • Dedicated HPC super computers • Designed to be super fast • Calculations usually rely on the Message Passing Interface (MPI) • Pros: HPC super computers • Cons: HPC super computers Dedicated HPC super computers → HPC simulations
  • 27. HPC and VMs • No need to buy expensive machines • Frameworks to run HPC flows over VMs • Flows usually depend on MPI, data locality • Recent academic interest • Pros: Virtual Machines • Cons: Virtual Machines Virtual Machines (private, cloud, etc.) → HPC simulations
  • 28. HPC and Containers • Good granularity, parallelism, resource allocation, etc. • Research papers, frameworks • Singularity / Docker containers • Pros: containers • Cons: many efforts focus on how to move an entire application into containers, which usually requires re-designing the application Containers → HPC simulations
  • 29. HPC and FaaS with PyWren-IBM • FaaS is a perfect platform to scale code and applications • Many FaaS platforms allow users to use Docker containers • Code can contain any dependencies • PyWren-IBM is a natural fit for many HPC flows • Pros: the easy move to serverless • Try it yourself… PyWren-IBM over FaaS → HPC simulations
  • 30. Use cases and demos.. IBM Cloud Object Storage PyWren-IBM framework https://github.com/pywren/pywren-ibm-cloud IBM Cloud Functions
  • 31. Monte Carlo and PyWren-IBM • Monte Carlo methods are a broad class of computational algorithms used to evaluate risk and uncertainty, e.g. investments in projects; popular methods in finance • PyWren is a natural fit to scale Monte Carlo computations across a FaaS platform • Users write the business logic and PyWren does the rest
  • 32. Stock price prediction • A mathematical approach to stock price modelling, more accurate for modelling prices over longer periods of time • We ran Monte Carlo stock prediction over IBM Cloud Functions with PyWren-IBM • With PyWren-IBM the total code is ~40 lines; without PyWren-IBM, running the same code requires 100s of additional lines of code • Number of forecasts: 100,000; local run (1 CPU, 4 cores): 10,000 seconds; IBM CF: ~70 seconds; total number of CF invocations: 1000 • We ran 1000 concurrent invocations, each consuming 1024 MB of memory • Each invocation predicted a forecast of 1080 days and used 100 random samples per prediction. In total we did 108,000,000 calculations • About 2500 forecasts predicted a stock price around $130
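One invocation's forecast can be sketched with a standard geometric Brownian motion walk (a generic textbook model with hypothetical parameters; the deck does not state the exact constants used):

```python
import math
import random

def simulate_path(s0, mu, sigma, days, rng):
    """One GBM price path with daily steps (dt = 1/365)."""
    dt = 1.0 / 365.0
    price = s0
    for _ in range(days):
        z = rng.gauss(0.0, 1.0)
        price *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
    return price

def forecast(s0, mu, sigma, days=1080, samples=100, seed=7):
    # One invocation's work in the experiment: a 1080-day forecast
    # averaged over 100 random paths.
    rng = random.Random(seed)
    return sum(simulate_path(s0, mu, sigma, days, rng) for _ in range(samples)) / samples

prediction = forecast(100.0, 0.05, 0.2)
```

Running 1000 such forecasts concurrently, each as its own invocation, is exactly the fan-out that PyWren-IBM handles.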
  • 34. Protein Folding • Proteins are biological polymers that carry out most of the cell's day-to-day functions • Protein structure leads to protein function • Proteins are made from a linear chain of amino acids and folded into a variety of 3-D shapes • Protein folding is a complex process that is not yet completely understood
  • 35. Replica exchange • Monte Carlo simulations are popular methods to predict protein folding • ProtoMol is a framework specially designed for molecular dynamics • http://protomol.sourceforge.net • A highly parallel replica exchange molecular dynamics (REMD) method is used with the Monte Carlo process for efficient sampling • A series of tasks (replicas) are run in parallel at various temperatures • From time to time the configurations of neighboring tasks are exchanged • Various HPC frameworks allow running protein folding • Depends on MPI • VMs or dedicated HPC machines
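The neighbor-exchange step can be sketched with the standard Metropolis acceptance rule used in replica exchange (a generic textbook form, not ProtoMol's implementation; k_B is set to 1):

```python
import math
import random

def accept_swap(energy_i, energy_j, temp_i, temp_j, rng):
    """Metropolis criterion for swapping configurations between two
    neighboring replicas at temperatures temp_i and temp_j (k_B = 1):
    accept with probability min(1, exp((1/T_i - 1/T_j) * (E_i - E_j)))."""
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    return delta >= 0 or rng.random() < math.exp(delta)

rng = random.Random(0)
# A swap that gives the colder replica the lower energy is always accepted.
always = accept_swap(energy_i=5.0, energy_j=1.0, temp_i=1.0, temp_j=2.0, rng=rng)
```

This is why only the configurations (or results) of neighboring tasks need to be exchanged between rounds, which maps naturally onto batches of independent invocations.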
  • 36. Protein folding with PyWren-IBM • PyWren-IBM submits a job of X invocations, each running ProtoMol • PyWren-IBM collects the results of all invocations • The REMD algorithm uses the output of the invocations as the input to the next job • Each invocation runs the ProtoMol library to run Monte Carlo simulations; ProtoMol uses MPI for communication between threads • Our experiment - 99 jobs • Each job executes many IBM CF invocations • Each invocation runs 100 Monte Carlo steps • Each step runs 10,000 Molecular Dynamics steps • REMD exchanges the results of the completed job, which are used as input to the following job • Our approach doesn't use MPI
  • 37. PyWren-IBM for batch data processing
  • 38. PyWren-IBM for data processing Face recognition experiment with PyWren-IBM over IBM Cloud • Align faces using open source tools, over 1000 images stored in IBM Cloud Object Storage • Given Python code that knows how to extract a face from a single image • Run from any Python notebook
  • 39. Processing images without PyWren-IBM

import logging
import os
import sys
import time
import shutil
import cv2
from openface.align_dlib import AlignDlib

logger = logging.getLogger(__name__)
temp_dir = '/tmp'

def preprocess_image(bucket, key, data_stream, storage_handler):
    """
    Detect face, align and crop input_path. Write output to output_path.
    :param bucket: COS bucket
    :param key: COS key (object name) - may contain delimiters
    :param storage_handler: can be used to read / write data from / into COS
    """
    crop_dim = 180
    sys.stdout.write(".")
    # key of the form /subdir1/../subdirN/file_name
    key_components = key.split('/')
    file_name = key_components[len(key_components) - 1]
    input_path = temp_dir + '/' + file_name
    if not os.path.exists(temp_dir + '/' + 'output'):
        os.makedirs(temp_dir + '/' + 'output')
    output_path = temp_dir + '/' + 'output/' + file_name
    with open(input_path, 'wb') as localfile:
        shutil.copyfileobj(data_stream, localfile)
    if not os.path.isfile(temp_dir + '/' + 'shape_predictor_68_face_landmarks'):
        res = storage_handler.get_object(bucket,
                                         'lfw/model/shape_predictor_68_face_landmarks.dat',
                                         stream=True)
        with open(temp_dir + '/' + 'shape_predictor_68_face_landmarks', 'wb') as localfile:
            shutil.copyfileobj(res, localfile)
    align_dlib = AlignDlib(temp_dir + '/' + 'shape_predictor_68_face_landmarks')
    image = _process_image(input_path, crop_dim, align_dlib)
    if image is not None:
        cv2.imwrite(output_path, image)
        f = open(output_path, "rb")
        processed_image_path = os.path.join('output', key)
        storage_handler.put_object(bucket, processed_image_path, f)
        os.remove(output_path)
    os.remove(input_path)

def _process_image(filename, crop_dim, align_dlib):
    image = _buffer_image(filename)
    if image is None:
        raise IOError('Error buffering image: {}'.format(filename))
    return _align_image(image, crop_dim, align_dlib)

def _buffer_image(filename):
    logger.debug('Reading image: {}'.format(filename))
    image = cv2.imread(filename)
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

def _align_image(image, crop_dim, align_dlib):
    bb = align_dlib.getLargestFaceBoundingBox(image)
    aligned = align_dlib.align(crop_dim, image, bb,
                               landmarkIndices=AlignDlib.INNER_EYES_AND_BOTTOM_LIP)
    if aligned is not None:
        aligned = cv2.cvtColor(aligned, cv2.COLOR_BGR2RGB)
    return aligned

import ibm_boto3
import ibm_botocore
from ibm_botocore.client import Config
from ibm_botocore.credentials import DefaultTokenManager

t0 = time.time()
client_config = ibm_botocore.client.Config(signature_version='oauth',
                                           max_pool_connections=200)
api_key = config['ibm_cos']['api_key']
token_manager = DefaultTokenManager(api_key_id=api_key)
cos_client = ibm_boto3.client('s3', token_manager=token_manager,
                              config=client_config,
                              endpoint_url=config['ibm_cos']['endpoint'])
try:
    paginator = cos_client.get_paginator('list_objects_v2')
    page_iterator = paginator.paginate(Bucket="gilvdata", Prefix='lfw/test/images')
    print(page_iterator)
except ibm_botocore.exceptions.ClientError as e:
    print(e)

class StorageHandler:
    def __init__(self, cos_client):
        self.cos_client = cos_client

    def get_object(self, bucket_name, key, stream=False, extra_get_args={}):
        """
        Get object from COS with a key. Throws StorageNoSuchKeyError
        if the given key does not exist.
        :param key: key of the object
        :return: Data of the object
        :rtype: str/bytes
        """
        try:
            r = self.cos_client.get_object(Bucket=bucket_name, Key=key, **extra_get_args)
            if stream:
                data = r['Body']
            else:
                data = r['Body'].read()
            return data
        except ibm_botocore.exceptions.ClientError as e:
            if e.response['Error']['Code'] == "NoSuchKey":
                raise StorageNoSuchKeyError(key)
            else:
                raise e

    def put_object(self, bucket_name, key, data):
        """
        Put an object in COS. Override the object if the key already exists.
        :param key: key of the object
        :param data: data of the object
        :type data: str/bytes
        :return: None
        """
        try:
            res = self.cos_client.put_object(Bucket=bucket_name, Key=key, Body=data)
            status = 'OK' if res['ResponseMetadata']['HTTPStatusCode'] == 200 else 'Error'
            try:
                log_msg = 'PUT Object {} size {} {}'.format(key, len(data), status)
                logger.debug(log_msg)
            except:
                log_msg = 'PUT Object {} {}'.format(key, status)
                logger.debug(log_msg)
        except ibm_botocore.exceptions.ClientError as e:
            if e.response['Error']['Code'] == "NoSuchKey":
                raise StorageNoSuchKeyError(key)
            else:
                raise e

temp_dir = '/home/dsxuser/.tmp'
storage_client = StorageHandler(cos_client)
for page in page_iterator:
    if 'Contents' in page:
        for item in page['Contents']:
            key = item['Key']
            r = cos_client.get_object(Bucket='gilvdata', Key=key)
            data = r['Body']

Business logic vs. boiler plate:
• Loop over all images
• Close to 100 lines of "boiler plate" code to find the images, read and write the objects, etc.
• The data scientist needs to be familiar with the S3 API
• Execution time approximately 36 minutes!
  • 40. Processing images with PyWren-IBM

import logging
import os
import sys
import time
import shutil
import cv2
from openface.align_dlib import AlignDlib

logger = logging.getLogger(__name__)
temp_dir = '/tmp'

def preprocess_image(bucket, key, data_stream, storage_handler):
    """
    Detect face, align and crop.
    :param bucket: COS bucket
    :param key: COS key (object name) - may contain delimiters
    :param storage_handler: can be used to read / write data from / into COS
    """
    crop_dim = 180
    sys.stdout.write(".")
    # key of the form /subdir1/../subdirN/file_name
    key_components = key.split('/')
    file_name = key_components[len(key_components) - 1]
    input_path = temp_dir + '/' + file_name
    if not os.path.exists(temp_dir + '/' + 'output'):
        os.makedirs(temp_dir + '/' + 'output')
    output_path = temp_dir + '/' + 'output/' + file_name
    with open(input_path, 'wb') as localfile:
        shutil.copyfileobj(data_stream, localfile)
    exists = os.path.isfile(temp_dir + '/' + 'shape_predictor_68_face_landmarks')
    if not exists:
        res = storage_handler.get_object(bucket, 'lfw/model/shape_predictor_68_face_landmarks.dat', stream=True)
        with open(temp_dir + '/' + 'shape_predictor_68_face_landmarks', 'wb') as localfile:
            shutil.copyfileobj(res, localfile)
    align_dlib = AlignDlib(temp_dir + '/' + 'shape_predictor_68_face_landmarks')
    image = _process_image(input_path, crop_dim, align_dlib)
    if image is not None:
        cv2.imwrite(output_path, image)
        with open(output_path, "rb") as f:
            processed_image_path = os.path.join('output', key)
            storage_handler.put_object(bucket, processed_image_path, f)
        os.remove(output_path)
    os.remove(input_path)

def _process_image(filename, crop_dim, align_dlib):
    aligned_image = None
    image = _buffer_image(filename)
    if image is not None:
        aligned_image = _align_image(image, crop_dim, align_dlib)
    else:
        raise IOError('Error buffering image: {}'.format(filename))
    return aligned_image

def _buffer_image(filename):
    logger.debug('Reading image: {}'.format(filename))
    image = cv2.imread(filename)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image

def _align_image(image, crop_dim, align_dlib):
    bb = align_dlib.getLargestFaceBoundingBox(image)
    aligned = align_dlib.align(crop_dim, image, bb, landmarkIndices=AlignDlib.INNER_EYES_AND_BOTTOM_LIP)
    if aligned is not None:
        aligned = cv2.cvtColor(aligned, cv2.COLOR_BGR2RGB)
    return aligned

pw = pywren.ibm_cf_executor(config=config, runtime='pywren-dlib-runtime_3.5')
bucket_name = 'gilvdata/lfw/test/images'
results = pw.map_reduce(preprocess_image, bucket_name, None, None).get_result()

Business logic vs. boilerplate:
• Under 3 lines of "boilerplate"!
• The data scientist does not need to use the S3 API!
• Execution time is 35 seconds, compared to 36 minutes!
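The map_reduce call hides all of the fan-out and fan-in. As a rough mental model only (a local stand-in built on Python's concurrent.futures, not the PyWren-IBM implementation, which runs each map task as a separate serverless action against object storage):

```python
from concurrent.futures import ThreadPoolExecutor

def map_reduce(map_func, keys, reduce_func=None, workers=8):
    # Fan out: apply map_func to every key in parallel.
    # (PyWren-IBM launches one cloud function per object instead.)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_func, keys))
    # Optional fan-in: combine the partial results.
    return reduce_func(mapped) if reduce_func else mapped

# Toy usage: "process" three keys and sum the per-key results.
keys = ['img1.jpg', 'img2.jpg', 'img3.jpg']
total = map_reduce(lambda k: len(k), keys, reduce_func=sum)
```

The user writes only map_func (here, preprocess_image in the slide); discovering the objects, partitioning them, launching workers and collecting results is the framework's job.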
  • 41. Satellite batch data processing
Input: application code or Docker image; raw satellite imagery
Processing: IBM Cloud Functions with PyWren-IBM
Output: processed raster data, metadata, processed vector data
• Horizontally scalable
• Unified imagery ingestion pipeline
• Efficient output queries on both raster data and vector data
  • 42. PyWren-IBM for spatial metabolomics
  • 43. Spatial metabolomics across scales — Alexandrov team at EMBL Heidelberg
[Slide figure: mass spectrometry meets big data and computational biology across scales, from whole organisms (1 m) through tissues and microbial plates (1 cm) down to single cells (1 μm); interdisciplinary work spanning single-cell biology, -omics, molecular image analysis, methods development and ML/AI, with computational applications in inflammation, immunity and cancer]
References: Protsyuk et al., Nature Protocols 2018; Bouslimani et al., PNAS 2017; Palmer et al., Nature Methods 2017; Alexandrov et al., bioRxiv 2019; Rappez et al., bioRxiv 2019
  • 44. METASPACE use case
• The Alexandrov team at EMBL develops novel computational biology tools to reveal the spatial organization of metabolic processes https://www.embl.de/research/units/scb/alexandrov/
• "Imaging mass spectrometry" application
• Uses Apache Spark, deployed across VMs in the cloud
Pipeline: customer uploads a medical image → customer chooses molecular databases → data pre-processing and segmentation → molecular scan of datasets with dedicated algorithms → generation of output images with results
  • 45. Metabolomics with PyWren-IBM
Metabolomics application with PyWren-IBM:
• We use PyWren-IBM to provide a prototype that deploys the metabolite annotation engine as serverless actions in the IBM Cloud
• https://github.com/metaspace2020/pywren-annotation-pipeline
Benefits of PyWren-IBM:
• Better control of data partitions
• Speed of deployment, no need for VMs
• Elasticity and automatic scaling
• And many more...
  • 46. Behind the scenes
Metabolite annotation engine deployed by PyWren-IBM:
• Molecular databases: up to 100M molecular strings
• Dataset input: up to 50 GB binary file
• The molecular annotation engine and image processing run on IBM Cloud Functions and produce the results
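One way to picture the "better control of data partitions" benefit: a large input object is split into fixed-size byte ranges, one per serverless invocation, so a 50 GB file fans out across hundreds of functions. A minimal sketch of such a partitioner (illustrative only, not the actual METASPACE or PyWren-IBM code):

```python
def byte_ranges(total_size, chunk_size):
    """Split an object of total_size bytes into (start, end) ranges,
    one range per parallel worker; the last range may be shorter."""
    ranges = []
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size)
        ranges.append((start, end))
        start = end
    return ranges

# A 50 GB dataset split into 64 MB partitions -> 800 parallel tasks.
parts = byte_ranges(50 * 1024**3, 64 * 1024**2)
```

Each worker would then issue a ranged GET for its (start, end) slice of the object, so no single machine ever has to hold the whole 50 GB input.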
  • 47. Annotation results
[Slide image: glutamate localization in tumor and brain regions]
A whole-body section of a mouse model showing localization of glutamate. Glutamate is a well-known neurotransmitter abundant in the brain. It is, however, also linked to cancer, where it supports proliferation and growth of cancer cells. Both facts are supported by the detected localization, obtained using METASPACE. Data provided by Genentech.
  • 48. Summary
• PyWren-IBM is a novel framework with advanced capabilities to run user code in the cloud
• We demonstrated the benefits of PyWren-IBM for HPC, molecular biology and batch data pre-processing
• For more use cases and examples, visit our project page https://github.com/pywren/pywren-ibm-cloud — all open source
Gil Vernik
gilv@il.ibm.com
Thank you