M|18 Getting Started with Analytics: MariaDB AX + Kubernetes

Published in: Data & Analytics

  1. 1. MariaDB AX on Containers Thomas Boyd Solutions Architect Getting Started with Analytics the Easy Way
  2. 2. Agenda • MariaDB AX • Kubernetes • MariaDB AX on Kubernetes • Q & A
  3. 3. MariaDB AX Analytics for the Agile Business
  4. 4. MariaDB AX • GPLv2 Open Source • Columnar, Massively Parallel MariaDB Storage Engine • Scalable, high-performance analytics platform • Built-in redundancy and high availability • Runs on premises or on AWS cloud • Full SQL syntax and capabilities regardless of platform [diagram: Big Data Sources, ELT Tools, MariaDB ColumnStore (Node 1, Node 2, Node 3, ..., Node N; Local / AWS® / GlusterFS® storage), BI Tools, Analytics Insight]
  5. 5. MariaDB AX Architecture • User Connections • User Module (1 to n): MariaDB Front End and Query Engine, processes SQL requests • Performance Module (1 to n): Distributed Processing Engine • Columnar Distributed Data Storage
  6. 6. Windowing Functions • Aggregate over a series of related rows • Simplified functions for complex statistical analytics over a sliding window per row - Cumulative, moving or centered aggregates - Simple statistical functions like rank, max, min, average, median - More complex functions such as distribution, percentile, lag, lead - Without running complex sub-queries • Functions: MAX, MIN, COUNT, SUM, AVG, VARIANCE, VAR_POP, VAR_SAMP, STD, STDDEV, STDDEV_POP, STDDEV_SAMP, ROW_NUMBER, RANK, DENSE_RANK, PERCENT_RANK, NTH_VALUE, FIRST_VALUE, LAST_VALUE, CUME_DIST, LAG, LEAD, NTILE, PERCENTILE_CONT, PERCENTILE_DISC, MEDIAN Source: InfiniDB SQL Syntax Guide
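The idea of aggregating over related rows without sub-queries can be tried locally. The following is a minimal sketch, assuming Python's bundled SQLite (3.25 or newer) as a stand-in engine; the `sales` table and its rows are invented for illustration, and it is an assumption that the same `OVER` clauses run unchanged on ColumnStore.

```python
import sqlite3

# SQLite is only a locally runnable stand-in here; the table name and
# the data are hypothetical and not taken from the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 10), (2, 30), (3, 20)])

QUERY = """
SELECT day,
       amount,
       SUM(amount) OVER (ORDER BY day)    AS running_total,
       RANK()      OVER (ORDER BY amount DESC) AS amount_rank
FROM sales
ORDER BY day
"""


def window_rows(connection):
    """Return one row per sale with a cumulative sum and a rank."""
    return connection.execute(QUERY).fetchall()


rows = window_rows(conn)
for row in rows:
    print(row)
```

Each output row carries the original columns plus a running total and a rank, computed in a single pass with no self-join or sub-query.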
  7. 7. Data Export and Data Import • Bulk Data Load: cpimport, LOAD DATA INFILE • Bulk Data Export: mysql client, odbc, jdbc • Integration with MariaDB ColumnStore: cpimport and sql interface
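A bulk load with cpimport usually starts from a delimited text file. The sketch below assumes cpimport's default pipe delimiter and the basic `cpimport dbName tblName loadFile` form; the database, table, file path, and rows are hypothetical, and the command is only printed, not executed.

```python
import csv
import io
import shlex


def to_pipe_delimited(rows):
    """Serialize rows as pipe-delimited text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def cpimport_command(database, table, load_file):
    """Build the command line: cpimport dbName tblName loadFile."""
    return shlex.join(["cpimport", database, table, load_file])


# Hypothetical sample data and target names, for illustration only.
data = to_pipe_delimited([(1, "widget", 9.5), (2, "gadget", 12.0)])
command = cpimport_command("analytics", "orders", "/tmp/orders.tbl")

print(data, end="")
print(command)
```

Fields that themselves contain a pipe character would be quoted by the `csv` module, so real data of that kind needs extra care before loading.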
  8. 8. MariaDB AX: a high-performance columnar storage engine that supports a wide variety of analytical use cases with SQL in highly scalable distributed environments • Faster, More Efficient Queries: parallel query processing for distributed environments • Easier Enterprise Analytics: single SQL interface for OLTP and analytics • Better Price Performance: power of SQL and freedom of open source for big data analytics
  9. 9. Customer Use Cases (Industry / Category: Use Case) • Gaming / Behavior Analytics: Projecting and predicting user behavior based on past and current data • Advertising / Customer Analytics: Customer behavior data for market segmentation and predictive analytics • Advertising / Loyalty Analytics: Customer analytics focusing on a person’s commitment to a product, company, or brand • Web, E-commerce / Click Stream Analytics: Web activity analysis, software testing, and market research with analytics on data about which areas of web pages users click while browsing [Deal News] • Marketing / Promotional Testing: Using marketing and campaign management data to identify the best criteria for a particular marketing offer • Social Network / Network Analytics: Relationship analytics among network nodes • Financial / Fraud Analytics: Monitoring user financial transactions and identifying patterns of behaviour to predict and detect abnormal or fraudulent activity, preventing damage to user and institution • Healthcare / Patient Analytics: Analyzing patient medical records to identify patterns to be used for improved medical treatment • Healthcare / Clinical Analytics: Analyzing clinical data and its impact on patients to identify patterns to be used for improved medical treatment • Telco / Network and Application Performance Analytics: Streaming data from network devices and applications, enriched with business operations data, to uncover actionable insights for network planning, operations and marketing analytics • Aviation / Flight Analytics: Proactively project parts replacement, maintenance and airplane retirement based on real-time and historically collected flight parameter data [Boeing]
  10. 10. Kubernetes Container orchestration moving mainstream
  11. 11. But First: What do Containers give me? • Encapsulation of Dependencies – O/S packages & patches – Execution environment (e.g. Python 2.7) – Application code & dependencies • Process Isolation – Isolate the process from anything else running • Faster, lightweight virtualization
  12. 12. Virtual Machines vs. Containers [diagram: virtual machine stack with App 1, App 2, App 3, each on its own Bins/Libs and Guest OS, above Hypervisor, Host Operating System, Infrastructure; container stack with App 1, App 2, App 3, each on its own Bins/Libs, above Docker Engine, Operating System, Infrastructure]
  13. 13. What about Orchestration and Management? Orchestration and management of containers and higher-level constructs (services, deployments, etc.) is evolving • Amazon ECS • Google Container Engine • Azure Container Service
  14. 14. Brilliant for Stateless Components Source: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
  15. 15. Brilliant for Stateless Components
  16. 16. Brilliant for Stateless Components
  17. 17. Brilliant for Stateless Components
  18. 18. Brilliant for Stateless Components
  19. 19. Containers + Distributed Database = Challenges • Data Durability – Ephemeral container storage • Cluster Formation – Configuration and Coordination of User Modules and Performance Modules • Cluster Maintenance & Changes – Planned and unplanned node failures – Scale-up and scale-down • Application Connections – Application tier (analytics tools) should not need to change connection information when DB topology changes
  20. 20. MariaDB AX + Kubernetes Getting started in Dev and Test Environments
  21. 21. Motivation • How to Test Complex, Scale-out Deployments • Monthly AWS Bill • Under-utilized Laptop
  22. 22. Aspirational: 100% “Cloud Native” • Practical: “Desire for Kubernetes to enable low-friction porting of apps from VMs to containers” source: https://kubernetes.io/docs/concepts/cluster-administration/networking/
  23. 23. Minikube for Single Node Kubernetes https://kubernetes.io/docs/getting-started-guides/minikube/
  24. 24. What kind of Experience is this going to be?
  25. 25. Minikube & Kubernetes Tips • Do the tutorials (https://kubernetes.io/docs/tutorials/) • Read (and re-read) the documentation • Visual cues (terminal colors) for layers • Hypervisor selection • VM location and size • Recent version of Kubernetes (--kubernetes-version=v1.8.5 --bootstrapper kubeadm) • Setting DOCKER env (eval $(minikube docker-env))
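The hypervisor, VM size, and Kubernetes version tips all come together in one `minikube start` invocation. A minimal sketch of assembling it follows; the version and bootstrapper flags come from the slide, while `--vm-driver`, `--memory`, and `--cpus` are flags from minikube releases of that period as best recalled (newer releases may name them differently), and the default values are assumptions.

```python
import shlex


def minikube_start_args(kubernetes_version="v1.8.5", vm_driver="virtualbox",
                        memory_mb=8192, cpus=4):
    """Build an argv list for `minikube start`.

    The driver, memory, and CPU defaults are illustrative assumptions,
    not values from the talk.
    """
    return [
        "minikube", "start",
        f"--kubernetes-version={kubernetes_version}",
        "--bootstrapper", "kubeadm",
        f"--vm-driver={vm_driver}",
        f"--memory={memory_mb}",
        f"--cpus={cpus}",
    ]


command_line = shlex.join(minikube_start_args())
print(command_line)
```

Keeping the arguments in a list makes it easy to hand them to `subprocess.run` later, or to vary only the VM size between test runs.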
  26. 26. What about Orchestration and Management? Orchestration and management of containers and higher-level constructs (services, deployments, etc.) is evolving • Amazon ECS • Google Container Engine • Azure Container Service
  27. 27. Kubernetes Components [diagram: kubernetes interfaces such as kubectl reach the kubernetes master(s) over REST; three nodes, each running docker and a kubelet, with containers and volumes]
  28. 28. Kubernetes Objects [diagram: pods (mysql: 3306), services (port: 3306), controllers (spec:), deployments (spec:); arrows labeled “provide access to”, “manage”, “manage”]
  29. 29. Mimicking Virtual Machines • Leverage KubeDNS Server for easy naming • Host sshd daemon on each container • Shared key for ssh as kubernetes secret • Utilize StatefulSets • “If there exists a headless service in the same namespace as the pod and with the same name as the subdomain, the cluster’s KubeDNS Server also returns an A record for the Pod’s fully qualified hostname.” Source: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/ Also: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ • https://github.com/WonkyWumpus/easy-sshd-ubuntu-1604
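The naming rule quoted above can be made concrete with a small sketch. It builds a headless Service manifest (a Service whose `clusterIP` is `"None"`) as a Python dict and derives the fully qualified hostname a pod would get. The service name `mdb`, the `app` label, and the `cluster.local` domain are assumptions for illustration; `m01` is the pod name shown in the SSH example.

```python
import json


def headless_service(name, namespace="default", port=22):
    """Return a headless Service manifest; kubectl also accepts JSON."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "clusterIP": "None",          # "None" is what makes it headless
            "selector": {"app": name},    # hypothetical label
            "ports": [{"name": "ssh", "port": port}],
        },
    }


def pod_fqdn(hostname, subdomain, namespace="default",
             cluster_domain="cluster.local"):
    """Fully qualified pod hostname when subdomain matches the service name."""
    return f"{hostname}.{subdomain}.{namespace}.svc.{cluster_domain}"


service = headless_service("mdb")
fqdn = pod_fqdn("m01", "mdb")

print(json.dumps(service, indent=2))
print(fqdn)
```

With a name like this resolvable inside the cluster, one container can reach another by hostname much as it would between virtual machines.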
  30. 30. SSH with Ease
      tboyd$ kubectl describe pod m01 | grep IP
      IP: 172.17.0.7
      tboyd$ minikube ssh
      $ ssh -i .ssh/easy-key root@172.17.0.7
      Welcome to Ubuntu 16.04.3 LTS (GNU/Linux 4.9.13 x86_64)
      root@m01:~# ssh root@m02
      root@m02:~#
  31. 31. MariaDB AX with StatefulSets • Builds on the easy-sshd github • MariaDB AX software staged to images • Manually run standard install process to create MariaDB AX Cluster https://github.com/WonkyWumpus/mdb-cs-easy-sshd-ubuntu-1604
  32. 32. MariaDB AX Prereqs and Staging Software
  33. 33. MariaDB AX: UM & PM StatefulSets, UM Service
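A rough sketch of what the UM and PM StatefulSets and the UM Service might look like is given below as Python dicts serialized to JSON. The image name, labels, replica counts, and the governing service name `mdb` are hypothetical; `apps/v1` is assumed for the StatefulSet API version, and older clusters such as v1.8 use a beta version instead. Port 3306 is the MySQL port shown on the Kubernetes Objects slide.

```python
import json

IMAGE = "example/mdb-cs-easy-sshd:latest"  # hypothetical image name


def statefulset(name, replicas, image, governing_service):
    """Return a minimal StatefulSet manifest for one module tier."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",  # assumption; depends on cluster version
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": governing_service,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def um_service(name="um-service", um_app="um", port=3306):
    """Return a Service that fronts the User Module pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": {"app": um_app}, "ports": [{"port": port}]},
    }


def pod_names(manifest):
    """StatefulSet pods are named <statefulset-name>-<ordinal>."""
    name = manifest["metadata"]["name"]
    return [f"{name}-{i}" for i in range(manifest["spec"]["replicas"])]


um = statefulset("um", 1, IMAGE, "mdb")
pm = statefulset("pm", 2, IMAGE, "mdb")
svc = um_service()

print(json.dumps([um, pm, svc], indent=2))
print(pod_names(um) + pod_names(pm))
```

The stable ordinal names are what make it possible to refer to a specific container such as pm-0 during installation.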
  34. 34. MariaDB AX: Standard Install & Config • Beware: execute install from pm-0 container! • Install .debs • Run /usr/local/mariadb/columnstore/bin/postConfigure • Access cluster through UM Service https://mariadb.com/kb/en/library/installing-and-configuring-a-multi-server-columnstore-system-11x/
  35. 35. MariaDB AX + Kubernetes: Possible Future Directions • Leverage persistent volumes and persistent volume claims • Cluster formation and config moved into docker images – MariaDB AX running and waiting to join cluster – Intelligent entrypoint script for automatic cluster join • User Module Tier automatic scaling • Performance Module Tier automatic scaling • Logic to tie DB Roots and Persistent External Storage – 24 DB Roots: Instantaneously burst from 1 to 24 PM nodes!
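The arithmetic behind the "24 DB Roots" idea can be sketched as follows: with 24 DB roots on external storage, the same data can be served by anywhere from 1 to 24 PM nodes by reassigning roots. The round-robin rule and the `pm-N` names below are illustrative assumptions, not ColumnStore's actual assignment logic.

```python
def assign_dbroots(num_dbroots, num_pms):
    """Spread DB roots 1..num_dbroots over PM nodes round-robin.

    Round-robin is a hypothetical policy chosen for illustration.
    """
    if not 1 <= num_pms <= num_dbroots:
        raise ValueError("need between 1 and num_dbroots PM nodes")
    assignment = {f"pm-{i}": [] for i in range(num_pms)}
    for dbroot in range(1, num_dbroots + 1):
        assignment[f"pm-{(dbroot - 1) % num_pms}"].append(dbroot)
    return assignment


for pms in (1, 4, 24):
    roots_on_first = len(assign_dbroots(24, pms)["pm-0"])
    print(pms, "PM node(s):", roots_on_first, "DB roots on pm-0")
```

Because 24 has many divisors, the roots split evenly across 1, 2, 3, 4, 6, 8, 12, or 24 nodes, which may be one reason such a number is attractive for bursting.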
  36. 36. Resources • Kubernetes Documentation • MariaDB ColumnStore Documentation • MariaDB AX Datasheet • IHME Customer Story • What’s New in MariaDB AX • 5 Simple Steps to get Started with MariaDB and Tableau • Extract more Value with MariaDB ColumnStore Analytics
  37. 37. Questions?