Windows containers troubleshooting

Troubleshooting and best practices for Windows Container and K8S

Published in: Technology

  1. Windows containers troubleshooting. Alexey Bokov, Microsoft, Commercial Software Engineering
  2. Common problems
     1) Windows Pods fail to resolve DNS
     2) Version mismatches
     3) Pause image problems
  3. Windows Pods fail to resolve DNS
     After a Windows node reboots, the Host Network Service (HNS) policies need to be cleaned up:

       # On the Windows node
       Start-BitsTransfer -Source https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/windows/hns.psm1
       Import-Module .\hns.psm1
       Stop-Service kubeproxy
       Stop-Service kubelet
       Get-HnsNetwork | ? Name -eq l2Bridge | Remove-HnsNetwork
       Get-HnsPolicyList | Remove-HnsPolicyList
       Start-Service kubelet
       Start-Service kubeproxy
  4. Versions matching
     • The container image version must match the host version.
     • Windows version format: major.minor.build.revision (e.g. 10.0.14393.103).
     • The build number changes when a new Windows version is published; the revision changes when Windows updates are applied.
     • If the build numbers differ, the container is blocked from starting; if only the revision (patch level) differs, it might still start.
     How to check:
     1) Run ‘ver’ in a command prompt:
          C:\>ver
          Microsoft Windows [Version 10.0.16299.125]
     2) Read the registry:
          PS C:\Users\abokov> (Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion').BuildLabEx
          17763.1.amd64fre.rs5_release.180914-1434
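
The matching rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `WinVersion`, `parse_version` and `compare`, and the three result labels, are hypothetical and not part of any Windows or Docker API. It simply encodes "different build: blocked; same build, different revision: might start".

```python
from typing import NamedTuple


class WinVersion(NamedTuple):
    """A Windows version in major.minor.build.revision form (hypothetical helper type)."""
    major: int
    minor: int
    build: int
    revision: int


def parse_version(text: str) -> WinVersion:
    """Parse a string such as '10.0.14393.103' into its four numeric parts."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected major.minor.build.revision, got {text!r}")
    return WinVersion(*(int(p) for p in parts))


def compare(host: str, image: str) -> str:
    """Classify a host/image pair using the rule from the slide.

    Returns one of three illustrative labels:
      'blocked'   - major, minor or build differ, so the container will not start
      'may-start' - only the revision (patch level) differs
      'match'     - all four parts are identical
    """
    h, i = parse_version(host), parse_version(image)
    if (h.major, h.minor, h.build) != (i.major, i.minor, i.build):
        return "blocked"
    if h.revision != i.revision:
        return "may-start"
    return "match"
```

For example, `compare("10.0.16299.125", "10.0.17763.437")` is classified as blocked because the build numbers differ.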
  5. Choose version to use
     There is no longer a ‘latest’ tag for Microsoft Windows images. You need to specify a version:
       FROM mcr.microsoft.com/windows/nanoserver:1809-KB4493509
     or
       FROM mcr.microsoft.com/windows/nanoserver:10.0.17763.437
     For ServerCore:
       FROM mcr.microsoft.com/windows/servercore:ltsc2019
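
Since an untagged or ‘latest’ reference no longer works for these images, it can help to scan a Dockerfile for such FROM lines before building. The following is a minimal sketch, not a full Dockerfile parser: the function name `untagged_or_latest` is hypothetical, and it will also flag things like `FROM scratch` or a reference to an earlier build stage, which a real check would need to exclude.

```python
from typing import List


def untagged_or_latest(dockerfile_text: str) -> List[str]:
    """Return image references from FROM lines that have no tag or use ':latest'."""
    offenders = []
    for line in dockerfile_text.splitlines():
        words = line.split()
        if not words or words[0].upper() != "FROM":
            continue
        # Ignore optional flags such as --platform=windows/amd64.
        refs = [w for w in words[1:] if not w.startswith("--")]
        if not refs:
            continue
        image = refs[0]
        if "@" in image:
            # Pinned by digest; treated as explicit enough for this sketch.
            continue
        # A tag is whatever follows ':' in the last path segment, so that a
        # registry port (host:5000/name) is not mistaken for a tag.
        last_segment = image.rsplit("/", 1)[-1]
        tag = last_segment.split(":", 1)[1] if ":" in last_segment else ""
        if tag in ("", "latest"):
            offenders.append(image)
    return offenders
```

Calling it on a Dockerfile that contains `FROM mcr.microsoft.com/windows/servercore` would return that reference, while the fully tagged examples from the slide pass.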
  6. Windows Server servicing channels
     LTSC: Long-Term Servicing Channel (5 years of mainstream support plus 5 years of extended support), a release every 2-3 years. The current release is Windows Server 2019.
     SAC: Semi-Annual Channel (18 months of support), 2 releases per year. The current release is Windows Server 1903 (March 2019).

     Comparison: Long-Term Servicing Channel (Windows Server 2019) vs. Semi-Annual Channel (Windows Server)
     • Recommended scenarios
         LTSC: General purpose file servers, Microsoft and non-Microsoft workloads, traditional apps, infrastructure roles, software-defined Datacenter, and hyper-converged infrastructure
         SAC: Containerized applications, container hosts, and application scenarios benefiting from faster innovation
     • New releases
         LTSC: Every 2-3 years
         SAC: Every 6 months
     • Support
         LTSC: 5 years of mainstream support, plus 5 years of extended support
         SAC: 18 months
     • Editions
         LTSC: All available Windows Server editions
         SAC: Standard and Datacenter editions
     • Who can use
         LTSC: All customers through all channels
         SAC: Software Assurance and cloud customers only
     • Installation options
         LTSC: Server Core and Server with Desktop Experience
         SAC: Server Core for container host and image, and Nano Server container image
  7. Versions matching
     In k8s you can check the node version with ‘kubectl describe node 38519acs9010’:
       ..
       System Info:
        Machine ID:        38519acs9010
        System UUID:
        Boot ID:
        Kernel Version:    10.0 14393 (14393.1715.amd64fre.rs1_release_inmarket.170906-1810)
        OS Image:
        Operating System:  windows
        Architecture:      amd64
       ..
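
To use that output in a script, the build and revision can be pulled out of the Kernel Version line with a regular expression. This is a rough sketch that assumes the line always has the shape shown on the slide ("10.0 14393 (14393.1715. ...)"); the name `node_build` is hypothetical, and other kubelet versions may format the field differently.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Kernel Version: 10.0 14393 (14393.1715.amd64fre...)" and
# captures the build and revision found inside the parentheses.
KERNEL_RE = re.compile(r"Kernel Version:\s*\d+\.\d+\s+\d+\s+\((\d+)\.(\d+)\.")


def node_build(describe_output: str) -> Optional[Tuple[int, int]]:
    """Return (build, revision) for a Windows node, or None if the line is absent
    or does not look like a Windows kernel version (for example on a Linux node)."""
    match = KERNEL_RE.search(describe_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

The text passed in would typically be the captured output of `kubectl describe node <name>`; for the node on the slide the result is build 14393 with revision 1715.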
  8. Image naming
     Containers on Windows Server 1709 should use images with 1709 tags, e.g.
       microsoft/aspnet:4.7.2-windowsservercore-1709
       microsoft/windowsservercore:1709
       microsoft/iis:windowsservercore-1709
     Containers on Windows Server 1803 should use images with 1803 tags, e.g.
       microsoft/aspnet:4.7.2-windowsservercore-1803
       microsoft/windowsservercore:1803
       microsoft/iis:windowsservercore-1803
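
The naming rule can be automated by mapping the host build number to a release tag. The sketch below is an assumption-laden illustration: the build-to-release table (16299 for 1709, 17134 for 1803, 17763 for 1809) comes from public Windows release information rather than from the slides, and `RELEASE_BY_BUILD` and `image_ref` are hypothetical names.

```python
# Assumed mapping from host build number to Windows Server release tag.
RELEASE_BY_BUILD = {
    16299: "1709",
    17134: "1803",
    17763: "1809",
}


def image_ref(repository: str, host_build: int, tag_prefix: str = "") -> str:
    """Build an image reference whose tag ends with the release matching the host.

    tag_prefix covers images whose tag carries extra parts before the release,
    such as '4.7.2-windowsservercore-' for the aspnet image.
    """
    try:
        release = RELEASE_BY_BUILD[host_build]
    except KeyError:
        raise ValueError(f"no known release tag for build {host_build}") from None
    return f"{repository}:{tag_prefix}{release}"
```

For a host on build 17134, `image_ref("microsoft/aspnet", 17134, "4.7.2-windowsservercore-")` produces the 1803 aspnet image name listed on the slide.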
  9. Access to Windows ServerCore Container via RDP (dev/qa only!)
     Windows Server Core includes RDP, but it is disabled. To enable it, set this registry value to 1:
       HKLM\System\CurrentControlSet\Control\Terminal Server\TemporaryALiC

       FROM microsoft/windowsservercore:1709_KB4074588
       RUN net user /add abokov
       RUN net user abokov Abokov!2.718281828
       RUN net localgroup "Remote Desktop Users" abokov /add
       RUN net localgroup "Administrators" abokov /add
       RUN cmd /k reg add "HKLM\System\CurrentControlSet\Control\Terminal Server" /v TemporaryALiC /t REG_DWORD /d 1

     Or run:
       cscript C:\Windows\System32\Scregedit.wsf /ar 0
  10. RDP in K8s (dev/qa only!)

       # rdp.yaml
       apiVersion: v1
       kind: Service
       metadata:
         name: rdp
       spec:
         type: LoadBalancer
         ports:
         - protocol: TCP
           port: 3389
           targetPort: 3389
       ---
       kind: Endpoints
       apiVersion: v1
       metadata:
         name: rdp
       subsets:
       - addresses:
         - ip: <node-ip>
         ports:
         - port: 3389

       $ kubectl create -f rdp.yaml
       $ kubectl get svc rdp
       NAME   TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
       rdp    LoadBalancer   10.0.99.149   52.52.52.52   3389:32008/TCP   5m

     Connect via: mstsc.exe -v 52.52.52.52
  11. Open questions to discuss
     1) Configuring the pause image
     2) Debugging HTTP traffic (lost packets or the web server going down): are there any alternatives to tcpdump/Fiddler?
     3) Super common topic: vhd -> docker (or "containerize all the things")
  12. What has gone well
     • Microsoft leadership in sig-windows
     • Microsoft engineers respond to bug reports in a timely manner
     • Meetings with Windows container and container networking teams have been extremely productive
     Struggles we've had
     • Configuring HNS and CNI properly
     • Long-standing Windows platform issues with no timeline for resolution
     • Development process for Windows CNI plugins
     • Tracking Windows issues
     Troubleshooting
     • Debugging Kubernetes test failures is time-consuming
     • Often we can get something working or develop some workaround, but we don't understand why
