1. Ambari Dashboard

1) Ambari Dashboard UI

The Ambari page can be reached at http://[AMBARI_SERVER_HOST_NAME]:8080. The default credentials for the Admin account are ID: admin, PASS: admin.

* To change the Ambari server port, change or add the client.api.port property in /etc/ambari-server/conf/ambari.properties.
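As a sketch, the property change above can be scripted. The helper function below is illustrative, not an Ambari tool; on a real server the file is /etc/ambari-server/conf/ambari.properties and ambari-server must be restarted afterwards. The example runs against a scratch copy so it is safe to try anywhere.

```shell
# set_property FILE KEY VALUE
# Set or add a key=value entry in an Ambari-style .properties file.
set_property() {
  file=$1; key=$2; value=$3
  if grep -q "^${key}=" "$file"; then
    # Key already present: replace its value in place.
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    # Key missing: append it at the end of the file.
    echo "${key}=${value}" >> "$file"
  fi
}

# Exercise the helper against a scratch copy of ambari.properties:
f=$(mktemp)
echo "server.jdbc.database=postgres" > "$f"
set_property "$f" client.api.port 8081
grep client.api.port "$f"
```

After editing the real file, `ambari-server restart` is needed for the new port to take effect.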
2) Status Bar

Across the top of the Ambari UI is a status bar that contains the following information and controls:

• Ambari logo - opens the dashboard, which you can use to monitor the cluster.
• Cluster name # ops - shows the number of Ambari operations in progress. Select the cluster name or # ops to display the list of background operations.
• # alerts - shows warnings or critical alerts for the cluster, if any.
• Dashboard - displays the dashboard.
• Services - information and configuration settings for the services in the cluster.
• Hosts - information and configuration settings for the nodes in the cluster.
• Alerts - a log of informational, warning, and critical alerts.
• Admin - the software stacks/services installed on the cluster, service account information, and Kerberos security settings.
• Admin button - Ambari administration, user settings, and logout.
The Metrics tab of the dashboard contains a set of widgets that make it easy to monitor the state of the cluster at a glance. Several widgets, such as CPU Usage, provide additional information when clicked.
2. Alerts

1) Alerts UI

The Alerts menu includes the common alert states used by Ambari:

• OK
• WARNING
• CRITICAL
• UNKNOWN
2) Manage Alert Groups

Click the Actions menu and select Manage Alert Groups to manage alert groups.
3) Create Alert Notification

From the Actions menu you can also manage notification methods and create alert notifications. When a specific alert is raised, a notification can be sent via e-mail or SNMP.
3. Hosts

1) Hosts UI

Select Hosts for details about the nodes in the cluster. Then select a specific node of interest to view detailed information about that node and to perform operations such as restarting or adding components.
2) Host Actions

The Hosts tab lets you perform the following operations. Use the Actions menu to select the operation to perform:

• Start all components - starts all components on the host.
• Stop all components - stops all components on the host.
• Restart all components - stops and then starts all components on the host.
• Turn on maintenance mode - suppresses alerts for the host. Enable this mode before performing work that generates alerts, for example stopping and starting a service.
• Turn off maintenance mode - returns the host to its normal alert state.
• Stop - stops the DataNode or NodeManagers on the host.
• Start - starts the DataNode or NodeManagers on the host.
• Restart - stops and then starts the DataNode or NodeManagers on the host.
• Decommission - removes the host from the cluster.
• Recommission - returns a previously decommissioned host to the cluster.
4. Services

1) Services UI

The Services sidebar of the dashboard provides quick information about the status of the services running in the cluster. Various icons indicate the status of each service or actions that need to be taken.
Use the action button below the service list on the Dashboard or Services page to stop and start all services.
Selecting a specific service displays detailed information about that service.
2) Service Actions

While the Actions button can restart all services, you will often want to start, stop, or restart one specific service. Use the following menu to perform operations on an individual service:

1. Select a service on the Dashboard or Services page.
2. Use the Service Actions button at the top of the Summary tab to select the operation to perform, for example restarting the service on all nodes.
3) Services Quick Links

Some services display Quick Links at the top of the page. Use these to access service-related web UIs:

• Job History - MapReduce job history
• Resource Manager - YARN ResourceManager UI
• NameNode - HDFS (Hadoop Distributed File System) NameNode UI
• Oozie Web UI - Oozie UI
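The Quick Links above typically resolve to well-known default ports. As an illustrative sketch, the mapping below assumes HDP 2.x era defaults (note that Hadoop 3 moved the NameNode UI to 9870) and a hypothetical host name taken from this document's examples:

```shell
# Default web UI ports behind the Quick Links.
# Assumption: HDP 2.x defaults; HOST is an illustrative hostname.
HOST=hdm1

quick_link() {
  case $1 in
    jobhistory)      echo "http://${HOST}:19888" ;;       # MapReduce Job History
    resourcemanager) echo "http://${HOST}:8088"  ;;       # YARN ResourceManager UI
    namenode)        echo "http://${HOST}:50070" ;;       # HDFS NameNode UI
    oozie)           echo "http://${HOST}:11000/oozie" ;; # Oozie Web UI
  esac
}

# Print the ResourceManager quick link:
quick_link resourcemanager
```

If a quick link is unreachable, verify the actual port in the service's configuration tab, since these defaults can be overridden per cluster.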
5. View

1) View UI

A View is a way to extend Ambari by plugging in third-party APIs and UIs. Hortonworks Ambari provides views for HDFS, YARN, Hive, and others by default.

• YARN Queue Manager: the queue manager provides a simple UI for viewing and modifying YARN queues.
• Hive View: the Hive view lets you run Hive queries directly from a web browser. You can save queries, review results, and either store the results in cluster storage or download them to your local machine.
6. Manage

1) Manage UI

The Manage tab lets you view the cluster's version information and manage groups, roles, and views.
2) User + Group Management UI

The User + Group Management tab lets you manage Ambari groups and roles.
[Reference] Restarting Ambari Services

1) Restart the Ambari Server & postgresql

1. Restart the Ambari Server

# ssh hdm1
# su - root
# ambari-server restart
Using python /usr/bin/python
Restarting ambari-server
Waiting for server stop...
Ambari Server stopped
Ambari Server running with administrator privileges.
Organizing resource files at /var/lib/ambari-server/resources...
Ambari database consistency check started...
Server PID at: /var/run/ambari-server/ambari-server.pid
Server out at: /var/log/ambari-server/ambari-server.out
Server log at: /var/log/ambari-server/ambari-server.log
Waiting for server start.................................................
Server started listening on 8080

2. Restart postgresql

# ssh hdm1
# su - root
# service postgresql restart
2) Restart the Ambari Agent

## Run on every cluster host
# su - root
# ambari-agent restart
[hdm1] Restarting ambari-agent
[hdm1] Verifying Python version compatibility...
[hdm1] Using python /usr/bin/python
[hdm1] Found ambari-agent PID: 50225
[hdm1] Stopping ambari-agent
[hdm1] Removing PID file at /run/ambari-agent/ambari-agent.pid
[hdm1] ambari-agent successfully stopped
[hdm1] Verifying Python version compatibility...
[hdm1] Using python /usr/bin/python
[hdm1] Checking for previously running Ambari Agent...
[hdm1] Starting ambari-agent
[hdm1] Verifying ambari-agent process status...
[hdm1] Ambari Agent successfully started
[hdm1] Agent PID at: /run/ambari-agent/ambari-agent.pid
[hdm1] Agent out at: /var/log/ambari-agent/ambari-agent.out
[hdm1] Agent log at: /var/log/ambari-agent/ambari-agent.log
[Reference] Ambari REST API

1) Ambari REST API

This section explains how to operate Apache Ambari through its REST API.

Check service status
# curl -u admin:admin -X GET http://[SERVER_HOSTNAME]:8080/api/v1/clusters/[CLUSTER_NAME]/services/[SERVICE_NAME]
# curl -u admin:admin -X GET http://192.168.11.148:8080/api/v1/clusters/prum/services/HAWQ

Stop a service
# curl -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Stop Service via REST"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' http://[SERVER_HOSTNAME]:8080/api/v1/clusters/[CLUSTER_NAME]/services/[SERVICE_NAME]
# curl -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Stop Service via REST"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' http://192.168.11.148:8080/api/v1/clusters/prum/services/HAWQ

Start a service
# curl -u admin:admin -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Start Service via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://[SERVER_HOSTNAME]:8080/api/v1/clusters/[CLUSTER_NAME]/services/[SERVICE_NAME]
# curl -u admin:admin -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Start Service via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://hdm1.gphd.local:8080/api/v1/clusters/sdi/services/HDFS

Modify configuration
- Caution: delete removes the property from the Ambari UI entirely; it can be re-added with set.
- A property re-added with set appears under Custom rather than Advanced, even if it originally belonged under Advanced.
# /var/lib/ambari-server/resources/scripts/configs.sh [set|get|delete] [hostname] [clustername] [config_file_name] [config_key] [config_value]
# /var/lib/ambari-server/resources/scripts/configs.sh set localhost sdi mapred-site "mapreduce.map.memory.mb" "512"
# /var/lib/ambari-server/resources/scripts/configs.sh get localhost sdi mapred-site
# /var/lib/ambari-server/resources/scripts/configs.sh delete localhost sdi mapred-site "mapreduce.map.memory.mb"

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=41812517
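The REST calls above can be wrapped in small shell helpers so the endpoint and payload are built in one place. This is a minimal sketch, assuming the server address and cluster name from this document's examples; the actual curl invocations are left commented out so the sketch is inert without a live Ambari server.

```shell
# Assumptions: the Ambari server and cluster from the examples above.
AMBARI="http://192.168.11.148:8080"
CLUSTER="prum"
AUTH="admin:admin"

# Build the REST endpoint for one service, e.g.: service_url HAWQ
service_url() {
  echo "${AMBARI}/api/v1/clusters/${CLUSTER}/services/$1"
}

# Build the PUT body. Ambari stops a service by setting its state to
# INSTALLED and starts it by setting the state to STARTED.
# Usage: state_payload STATE CONTEXT
state_payload() {
  echo "{\"RequestInfo\":{\"context\":\"$2\"},\"Body\":{\"ServiceInfo\":{\"state\":\"$1\"}}}"
}

# Live invocations (commented out; they require a running Ambari server):
# curl -u "$AUTH" -X GET "$(service_url HAWQ)"
# curl -u "$AUTH" -H "X-Requested-By: ambari" -X PUT \
#   -d "$(state_payload INSTALLED 'Stop Service via REST')" "$(service_url HAWQ)"
# curl -u "$AUTH" -H "X-Requested-By: ambari" -X PUT \
#   -d "$(state_payload STARTED 'Start Service via REST')" "$(service_url HAWQ)"

# Show the URL the helpers produce:
service_url HAWQ
```

Note that the X-Requested-By header is mandatory on every state-changing (PUT/POST/DELETE) request, which is why it appears on the stop and start calls but not the GET.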