Fannie Mae leverages Amazon SageMaker for machine learning applications to more accurately value properties and reduce mortgage risk. Amazon SageMaker provides a fully managed service that enables Fannie Mae to focus on modeling while ensuring data security, self-service access, and end-to-end governance through techniques like private subnets, encryption, IAM policies, and operating zones. The presentation demonstrates how to get started with TensorFlow on Amazon SageMaker.
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (December 2019)
4. TensorFlow
https://www.tensorflow.org
• Main API in Python, with support for JavaScript, Java, and C++
• TensorFlow 1.x: symbolic execution
  • ‘Define then run’: build a graph, optimize it, feed data, and compute
  • Low-level API: variables, placeholders, tensor operations
  • High-level API: tf.estimator.*
  • Keras library: Sequential and Functional APIs, predefined layers
• TensorFlow 2.0: imperative execution (aka eager execution)
  • ‘Define by run’: normal Python code, similar to NumPy
  • Run it, inspect it, debug it
  • Keras is the preferred API
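The ‘define by run’ style can be seen in a few lines of TensorFlow 2.x: operations execute immediately and return concrete values you can inspect, and Keras stays the preferred high-level API. A minimal sketch, assuming TensorFlow 2.x is installed:

```python
import tensorflow as tf  # TensorFlow 2.x: eager execution is on by default

# 'Define by run': this runs immediately, no graph or session needed
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, x)   # executes right away
print(y.numpy())      # inspect the result like a NumPy array

# Keras remains the preferred high-level API
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```

In TensorFlow 1.x the same matrix product would only yield a symbolic tensor until evaluated inside a session; here `y` holds the actual numbers as soon as the line runs.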
5. AWS: The platform of choice for TensorFlow
https://aws.amazon.com/tensorflow/
• 85% of all TensorFlow workloads in the cloud run on AWS
• 89% of all deep learning workloads in the cloud run on AWS
6. TensorFlow: a first-class citizen on Amazon SageMaker
• Built-in TensorFlow containers for training and prediction
• Code available on GitHub: https://github.com/aws/sagemaker-tensorflow-containers
  • Build it, run it on your own machine, customize it, etc.
• Versions: 1.4.1 to 1.15 (2.0 coming soon)
• Not just TensorFlow
  • Standard tools: TensorBoard, TensorFlow Serving
  • SageMaker features: Local Mode, Script Mode, Model Tuning, Spot Training, Pipe Mode, Amazon EFS & Amazon FSx for Lustre, Amazon Elastic Inference, etc.
  • Performance optimizations: GPUs and CPUs (AWS, Intel MKL-DNN library)
  • Distributed training: Parameter Server and Horovod
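A hedged sketch of how these pieces fit together from the SageMaker Python SDK (v2 parameter names): Script Mode runs your own `train.py` inside the built-in TensorFlow container, with Spot Training and distributed training enabled by constructor arguments. The role ARN, bucket paths, and entry-point script below are placeholders, and actually launching the job requires an AWS account:

```python
from sagemaker.tensorflow import TensorFlow

# Script Mode: bring your own training script; it runs inside the
# built-in TensorFlow container. All names below are placeholders.
estimator = TensorFlow(
    entry_point="train.py",             # your TensorFlow training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_count=2,                   # distributed training across 2 instances
    instance_type="ml.p3.2xlarge",
    framework_version="1.15",           # one of the built-in container versions
    py_version="py3",
    use_spot_instances=True,            # Spot Training to reduce cost
    max_run=3600,                       # max training time, in seconds
    max_wait=7200,                      # max time to wait for Spot capacity
    hyperparameters={"epochs": 10, "batch-size": 256},
)

# Each channel maps to an S3 prefix; the container exposes them to train.py
# via SM_CHANNEL_TRAINING and SM_CHANNEL_VALIDATION environment variables.
# estimator.fit({
#     "training": "s3://my-bucket/train",
#     "validation": "s3://my-bucket/validation",
# })
```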
7. Amazon SageMaker: re:Invent 2019 announcements
• SageMaker Studio
• SageMaker Notebooks (preview)
• SageMaker Debugger
• SageMaker Experiments
• SageMaker Model Monitor
• SageMaker Autopilot
9. Fannie Mae is a leading source of financing for mortgage lenders
• Provide access to affordable mortgage financing in all markets at all times
• Effectively manage and reduce risk to our business, taxpayers, and the housing finance system
10. Accurate property valuation reduces mortgage risk
It is used in all stages of the loan lifecycle:
• Origination and underwriting, where a lender determines whether a borrower's loan application is an acceptable risk
• Post-purchase quality control
• Portfolio risk management, financial reporting, and regulatory reporting
• Loss mitigation
Fannie Mae's credit portfolio is ~$3 trillion.
Mortgage lifecycle: Origination, Servicing, Securitization, Foreclosure
11. Machine learning example: Property valuation
Property appraisal by certified/licensed appraiser
• Quantitative valuation based on comparable property sale prices and market trends
• Adjustments for unobservable inputs
Fannie Mae is leveraging machine learning
• Automated home price valuation model based on observables (XGBoost, KNN)
• Automated review of the adjustments based on visual inspection (TensorFlow – CNN)
Fannie Mae receives ~40,000 appraisal reports, with 500,000+ property images, every day
12. Technology challenges in machine learning
• Limited CPU/GPU resources to train and run models
• No streamlined approach for model development
• Process of packaging and hosting models is complex and time consuming
• Difficult to connect machine learning and analytics tools to data
13. Amazon SageMaker fits our needs
• Flexible and self-service machine learning platform
• Easy access to compute resources and data
• Streamlined model training and deployment
• Built-in governance procedures and audit trail
14. Automated property image classification
Three multi-layer convolutional neural network models with transfer learning:
• 1st model fixes image orientation
• 2nd model identifies room type
• 3rd model predicts marketability score
15. Benefits of Amazon SageMaker
Effective cost management
• Never pay for idle; the cost is based on actual vCPU/GPU usage, not the maximum processing capacity of the infrastructure
• Designed to enable performance improvement at zero cost
Rapid time to market
• Instant access to dedicated computing resources
• Ability to focus on business needs; no servers to manage and no complex code to write for distributed model training, hyperparameter tuning, or model deployment
AWS breadth and depth
• Streamlined integration with big data analytics platforms
• Automated version control, governance, audit trails, and secured workloads
• Business resiliency
16. Considerations for provisioning Amazon SageMaker
Implementing governance is as important as developing business capabilities
• InfoSec risk management
• Data governance
• Model governance
• Technology risk management
Establish guiding principles at the start
• Technology and software
• Models and analytics
Consider data gravity
• Co-locate the machine learning platform with data sources
We engaged with the Amazon SageMaker team early
17. A special shout-out to the Fannie Mae Digital Incubator team for
developing the property image classification machine learning model:
Hamid Reza Khakpour, Timur Fatykhov, and Felix Meale
19. Three very important goals
+ Non-negotiable data security
+ Self-service access
+ End-to-end governance with traceability
Is it realistic to achieve all of the above with a fully managed service such as Amazon SageMaker?
20. Given these conditions …
• Amazon SageMaker infrastructure is deployed in AWS-managed, multi-tenant VPCs and subnets
• Data scientists work with highly sensitive data using powerful dev tools
How do we keep data absolutely secure?
21. Keeping data secure: Harden network security
+ How do we prevent data exfiltration?
+ How do we avoid exposure to the internet?
[Architecture diagram: the Amazon SageMaker notebook runs in a private subnet of the Fannie Mae VPC, isolated by a security group and attached via ENIs. It reaches Amazon S3 through an S3 gateway endpoint, and AWS KMS, AWS CodeCommit, Amazon ECR, and the Amazon SageMaker service (training and hosting) through interface VPC endpoints in a dedicated endpoint subnet with its own security group.]
23. Keeping data secure: Encrypt everywhere
• Use customer-managed CMKs for volumes and S3 encryption
• Enable Amazon S3 default encryption; additionally, use deny policies to prevent unencrypted uploads
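The deny-policy point can be sketched as a bucket policy that rejects any PutObject request not using SSE-KMS. A minimal example; the bucket name is a placeholder:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "DenyUnencryptedUploads",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-ml-bucket/*",
    "Condition": {
      "StringNotEquals": { "s3:x-amz-server-side-encryption": "aws:kms" }
    }
  }]
}
```

Combined with S3 default encryption, this makes encryption at rest enforced rather than merely enabled.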
24. With the greater flexibility of self-service access …
• How do we ensure users comply with security controls?
• How do we ensure users do not interfere with one another?
26. Enabling governance: Operating zones
Create guardrails early: Establish zones to manage the ML lifecycle
[Diagram: three operating zones. In the Research zone, data scientists use Amazon SageMaker notebooks to train and test models on NPI and non-NPI data. Controlled code/model migration leads to the Production zone, where an automated process (Lambda, API Gateway) retrains, approves, and deploys models. In the Application zone, application developers run application CI/CD (AWS CodeCommit, Lambda, API Gateway) to test and deploy across Dev, UAT, and Prod environments.]
27. Machine learning orchestration with auditing: Example
+ Reproducible and reusable pipeline
+ Built-in audit tracking capability
+ Other options: AWS Step Functions, Apache Airflow
[Diagram: AWS CodePipeline orchestrates the flow. Source code in AWS CodeCommit and the dataset in Amazon S3 feed an AWS CodeBuild training step; Lambda functions create the model (container image from Amazon ECR), pass an approval gate, run a batch transform, and deploy the endpoint; model artifacts land in Amazon S3, and model metadata is written to Amazon DynamoDB for audit tracking.]
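To illustrate the Step Functions option mentioned above, a state machine can chain a managed SageMaker training job with a DynamoDB write for audit metadata. A minimal sketch; the role ARN, image URI, bucket, and table name are all placeholders:

```json
{
  "StartAt": "TrainModel",
  "States": {
    "TrainModel": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
      "Parameters": {
        "TrainingJobName.$": "$$.Execution.Name",
        "RoleArn": "arn:aws:iam::123456789012:role/sagemaker-train",
        "AlgorithmSpecification": {
          "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/train:latest",
          "TrainingInputMode": "File"
        },
        "OutputDataConfig": { "S3OutputPath": "s3://my-ml-bucket/artifacts/" },
        "ResourceConfig": {
          "InstanceType": "ml.m5.xlarge",
          "InstanceCount": 1,
          "VolumeSizeInGB": 50
        },
        "StoppingCondition": { "MaxRuntimeInSeconds": 3600 }
      },
      "Next": "RecordMetadata"
    },
    "RecordMetadata": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:putItem",
      "Parameters": {
        "TableName": "model-audit",
        "Item": { "ModelId": { "S.$": "$$.Execution.Name" } }
      },
      "End": true
    }
  }
}
```

The `.sync` integration blocks until the training job completes, and every execution leaves a timestamped audit record, which is the traceability property the pipeline is built for.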
28. Fannie Mae’s Enterprise Data Lake (EDL) at a glance
Build machine learning capability with a fully functional data lake as a foundation
• 3,000+ datasets
• 100+ applications
• 500+ AWS Glue Data Catalog databases
• 1,000+ users
... and growing
29. Amazon SageMaker in EDL: Reference architecture
[Diagram: the enterprise data lake platform. Ingestion from the corporate data center (data warehouses, RDBMS, file systems, third-party data) over AWS Direct Connect, REST/CLI, ADFS, and Zscaler lands in encrypted Amazon S3 objects, with Amazon Kinesis for streaming and the AWS Glue metastore for cataloging; workflow runs on AWS Step Functions; ETL/analytics/ML use Amazon EMR, Amazon SageMaker, Amazon ECS, AWS Batch, AWS Lambda, Amazon Athena, and Amazon Redshift; data visualization uses Amazon QuickSight and Amazon ES; auditing, security and governance, and logging and monitoring span the platform.]
Platform built with 100% native AWS services => fewer integration challenges
30. Key takeaways
+ New IAM context keys are valuable
+ Restrict access to buckets; use S3 endpoint policies
+ Amazon SageMaker has full support for PrivateLink endpoints; enabling and enforcing them is crucial
+ Data is a first-class primitive in ML workflows; keep track of data collection and preparation
+ Make predictions traceable to the original training record
+ Introduce segregation of duties; establish operating zones
+ Leverage the data lake pattern
Build a highly secure, self-service, and end-to-end traceable ML capability with Amazon SageMaker
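The endpoint-policy takeaway can be made concrete with a bucket policy that denies any access not arriving through the VPC endpoint. A minimal sketch; the bucket name and `vpce-` ID are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "AllowAccessOnlyFromVpcEndpoint",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:*",
    "Resource": [
      "arn:aws:s3:::my-ml-bucket",
      "arn:aws:s3:::my-ml-bucket/*"
    ],
    "Condition": {
      "StringNotEquals": { "aws:sourceVpce": "vpce-0123456789abcdef0" }
    }
  }]
}
```

Together with the private-subnet setup shown earlier, this closes the data-exfiltration path: even valid credentials cannot reach the bucket from outside the endpoint.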
With these new capabilities, Amazon SageMaker now covers the complete machine learning workflow to build, train, and deploy machine learning models quickly, at scale.