GAB 2018: Security and Scaling in Azure Service Fabric

Alberto Diaz Martin
alberto.diaz@encamina.com - @adiazcan
Alberto Diaz has more than 15 years of experience in the IT industry, all of them working
with Microsoft technologies. He is currently Chief Technology Innovation Officer at ENCAMINA,
where he leads software development with Microsoft technology, and a member of the
management team.
For the community, he works as an organizer and speaker at the most relevant Microsoft
conferences in Spain, where he is one of the leading voices on SharePoint, Office 365, and
Azure. Author of several books and articles in professional magazines and blogs, in 2013 he
joined the management team of CompartiMOSS, a digital magazine about Microsoft
technologies.
He has been a Microsoft MVP since 2011, a recognition he has renewed for the seventh
consecutive year. He describes himself as a geek, a smartphone lover, and a developer.
Founder of TenerifeDev (www.tenerifedev.com), a .NET user group in Tenerife, and
coordinator of SUGES (the SharePoint User Group of Spain, www.suges.es).

#GlobalAzure
Intro to Service Fabric

[Diagram: a Photo Share Service and a Thumbnail Service, each running as several independent instances. Labels shown: Photo Share Service (node.js), Thumbnail Service (.NET), Photo Share Service V1, Thumbnail Service V1, Thumbnail Service V2.]

Azure Service Fabric
Any OS, Any Cloud: containers and microservices
[Diagram: Service Fabric running on Azure, other clouds, on-premises data centers, and a dev box.]

Service Fabric cluster
[Diagram: five VMs (VM #1 to VM #5), each running Service Fabric and your code. A web request arrives on port 80; port 19080 is also labeled.]

PowerShell Module
Azure CLI, Service Fabric CLI

New-AzureRmServiceFabricCluster -ResourceGroupName $RGname -Location $clusterloc `
    -ClusterSize $numNodes -VmPassword $pwd `
    -CertificateSubjectName $subname -CertificatePassword $pwd `
    -CertificateOutputFolder $pfxfolder

Demo: Create a secure cluster using PowerShell

Securing your cluster

Service Fabric Cluster
[Diagram: a cluster secured with Key Vault and AAD. Three load balancers (LB#1 to LB#3), each in front of a VM scale set (VMSS) of three VMs, with network security groups (NSGs) attached. Azure Storage is used for diagnostics, SF logs, and VHDs (or Managed Disks).]

Service Fabric Cluster
[Diagram: the cluster inside a VNET with LB#1 to LB#3, VMSS#1 to VMSS#3 (three VMs each), NSG#1 to NSG#3, and a Jump Server.]

Endpoint                                 Port(s)
ClientConnectionEndpoint (TCP)           19000
HttpGatewayEndpoint (HTTP/TCP)           19080
SMB support for Image Store              445, 134
ClusterConnectionEndpointPort (TCP)      1025
LeaseDriverEndpointPort (TCP)            1026
Ephemeral port range                     As needed, min. 256 ports
App ports                                As needed
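
A quick way to confirm that an NSG lets the client-facing endpoints through is to try a TCP connection from a client machine. The following is a minimal sketch, not part of the original deck: it assumes the default ports 19000 and 19080 from the list above, and the function names are hypothetical.

```python
import socket

# Client-facing ports from the list above; the keys are only labels.
SERVICE_FABRIC_PORTS = {
    "ClientConnectionEndpoint": 19000,
    "HttpGatewayEndpoint": 19080,
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_endpoints(host, ports=SERVICE_FABRIC_PORTS):
    """Map each endpoint label to whether its port accepted a connection."""
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Calling check_endpoints with the cluster's public address returns a dictionary of True/False per endpoint. A False result only says the connection failed; it does not tell an NSG block apart from a service that is not listening.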

Demo: Review of a cluster with NSG enabled in the Portal

Planning your cluster

[Diagram: replicas spread across fault domains FD1 to FD5; a PDU burnout takes out one FD.]
• The number of fault domains (FDs) determines the headroom needed for unplanned failures.
• Examples include a PDU failing or top-of-rack (TOR) maintenance, either of which can take out all machines in a rack.
• In terms of capacity, leave enough headroom to accommodate the failure of at least one FD.
• When an FD fails, SF moves or creates replicas on the available machines in the other FDs.

[Diagram: replicas spread across FD1 to FD5 and upgrade domains UD1 to UD10 while an SF upgrade takes one UD down.]
• The number of upgrade domains (UDs) determines the headroom needed for planned failures and downtime.
• For example, while a Service Fabric upgrade is in progress and a UD is down, you need room for additional replicas if required.

Plan your capacity so that your service can survive at least:
• The loss of one FD
• A UD being down because of an upgrade in progress
• An additional random node/VM failure
[Diagram: nodes laid out across FD1 to FD5 and UD1 to UD10.]
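
The three conditions above can be turned into a rough headroom estimate. This is a simplified sketch under stated assumptions, not a Service Fabric formula: nodes are assumed to be spread evenly across FDs and UDs, and the failed FD, the UD being upgraded, and the random node are assumed not to overlap (the most pessimistic case). The function names are hypothetical.

```python
import math


def worst_case_surviving_nodes(total_nodes, fault_domains, upgrade_domains):
    """Nodes left after losing one FD, one UD, and one extra node.

    Assumes an even spread of nodes and no overlap between the three
    failures, which gives a conservative (lowest) estimate.
    """
    fd_loss = math.ceil(total_nodes / fault_domains)    # one whole FD fails
    ud_loss = math.ceil(total_nodes / upgrade_domains)  # one UD is upgrading
    random_loss = 1                                     # one more node/VM fails
    return max(total_nodes - (fd_loss + ud_loss + random_loss), 0)


def usable_capacity_fraction(total_nodes, fault_domains, upgrade_domains):
    """Fraction of the cluster that can be loaded while keeping that headroom."""
    surviving = worst_case_surviving_nodes(total_nodes, fault_domains, upgrade_domains)
    return surviving / total_nodes
```

For a 10-node cluster with 5 FDs and 10 UDs, the estimate leaves 6 nodes, so the workload should fit in roughly 60% of the total capacity.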

# 1-node Ubuntu 16.04 cluster
New-AzureRmServiceFabricCluster -ResourceGroupName $RGname `
    -Location $clusterloc -ClusterSize 1 -VmPassword $pwd `
    -CertificateSubjectName $subname -CertificatePassword $pwd `
    -OS UbuntuServer1604

# 3-node Windows Server 2016 cluster with containers
New-AzureRmServiceFabricCluster -ResourceGroupName $RGname `
    -Location $clusterloc -ClusterSize 3 -VmPassword $pwd `
    -CertificateSubjectName $subname -CertificatePassword $pwd `
    -OS WindowsServer2016DatacenterwithContainers

Demo: Deploy test clusters through the Portal

Add-AzureRmServiceFabricNode -ResourceGroupName $RGname `
    -Name $clusterName -NodeType $nodeType -Number $addNumNodes
Remove-AzureRmServiceFabricNode -ResourceGroupName $RGname `
    -Name $clusterName -NodeType $nodeType -Number $addNumNodes

Add-AzureRmServiceFabricNodeType -ResourceGroupName $RGname `
    -Name $clusterName -NodeType $nodeType ...
Remove-AzureRmServiceFabricNodeType -ResourceGroupName $RGname `
    -Name $clusterName -NodeType $nodeType ...

Demo: Scale out a cluster using the PowerShell module

Business continuity planning

The Recovery Point Objective (RPO) determines the amount of data you can afford to lose in a disaster.
The Recovery Time Objective (RTO) is the maximum tolerable length of time that your service can be down after a disaster occurs.

Types of Disasters

Data center outages
  RPO and RTO = 0, write latency acceptable: Cross-regional SF cluster
  RPO and RTO > 0: Stand up a new cluster, restore from backup

Cluster down
  RPO and RTO = 0, write latency acceptable: (Very low probability for cross-regional clusters) Stand up a new cluster, restore from backup
  RPO and RTO > 0: Stand up a new cluster, restore from backup

Machine / node down
  RPO and RTO = 0, write latency acceptable: Deploy across 5+ FDs, 5+ UDs; design for write quorum losses
  RPO and RTO > 0: Deploy across 5+ FDs, 5+ UDs; design for write quorum losses

Other sources of data loss or "oops"
  RPO and RTO = 0, write latency acceptable: Restore from backup
  RPO and RTO > 0: Restore from backup
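
For the "stand up a new cluster, restore from backup" strategy, a backup schedule can be sanity-checked against the objectives with simple arithmetic. The sketch below is an illustration only and rests on two assumptions that are not in the deck: worst-case data loss equals the backup interval, and worst-case downtime equals cluster provisioning time plus restore time. The function name and parameters are hypothetical.

```python
from datetime import timedelta


def meets_objectives(backup_interval, new_cluster_time, restore_time, rpo, rto):
    """Check a backup-and-restore plan against RPO and RTO.

    Assumes the worst-case data loss is one full backup interval and the
    worst-case downtime is provisioning plus restore, done sequentially.
    """
    worst_case_data_loss = backup_interval
    worst_case_downtime = new_cluster_time + restore_time
    return {
        "rpo_met": worst_case_data_loss <= rpo,
        "rto_met": worst_case_downtime <= rto,
    }
```

With hourly backups, 30 minutes to create a cluster, and 20 minutes to restore, an RPO of 2 hours is met, while an RTO of 45 minutes is not (50 minutes of worst-case downtime).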

Monitoring and diagnostics

Cluster and node state
  Is the cluster healthy? Are all the nodes up?
  Detect and diagnose hardware and infrastructure issues.

Application and service state
  Upgrade status, number of services and replicas.
  Detect software and app issues, reduce service downtime.

Resource usage
  Do all the nodes need to be up? What is the average CPU usage?
  Understand resource consumption and drive better business decisions.

Performance tracking
  Is there any unexpected latency? Are the services responsive?
  Optimize application, service, and infrastructure performance.

Custom application metrics
  Is your app being used in the way that you expected? Is the solution effective?
  Generate business insights and improvements.
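
The first two questions (is the cluster healthy, are all nodes up) can be answered from the cluster health report exposed by the HTTP gateway on port 19080. The following is a hedged sketch, not the speaker's demo: it assumes the Service Fabric REST operation `GetClusterHealth` with `api-version=6.0` and a response containing `AggregatedHealthState`, `NodeHealthStates`, and `ApplicationHealthStates`. The fetch helper only works as written against an unsecured cluster; a secure cluster would need a client certificate, which is not shown.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url):
    """Fetch the cluster health report (assumes an unsecured HTTP gateway)."""
    url = f"{base_url}/$/GetClusterHealth?api-version=6.0"
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def summarize_health(health):
    """Reduce a health report to the overall state plus anything not 'Ok'."""
    nodes = health.get("NodeHealthStates") or []
    apps = health.get("ApplicationHealthStates") or []
    return {
        "cluster": health.get("AggregatedHealthState", "Unknown"),
        "unhealthy_nodes": [
            n["Name"] for n in nodes if n.get("AggregatedHealthState") != "Ok"
        ],
        "unhealthy_apps": [
            a["Name"] for a in apps if a.get("AggregatedHealthState") != "Ok"
        ],
    }
```

A typical use would be summarize_health(fetch_cluster_health("http://localhost:19080")) against a local development cluster, with the summary fed into whatever alerting is already in place.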

Demo #5: Setting up monitoring and diagnostics at cluster creation

alberto.diaz@encamina.com
@adiazcan
