With the advent of Hadoop comes the need for professionals skilled in Hadoop administration, making Hadoop Admin skills imperative for better career, salary and job opportunities.
Hadoop 3.0 has been years in the making, and now it's finally arriving. Andrew Wang and Daniel Templeton offer an overview of new features, including HDFS erasure coding, YARN Timeline Service v2, YARN federation, and much more, and discuss current release management status and community testing efforts dedicated to making Hadoop 3.0 the best Hadoop major release yet.
Hadoop Operations - Best practices from the field (Uwe Printz)
Talk about Hadoop operations and best practices for building and maintaining Hadoop clusters.
Talk was held at the data2day conference in Karlsruhe, Germany on 27.11.2014
Hadoop Administrator online training course by Knowledgebee Trainings, covering Hadoop cluster planning and deployment, monitoring, performance tuning, security using Kerberos, HDFS High Availability using Quorum Journal Manager (QJM), and Oozie and HCatalog/Hive administration.
Contact: knowledgebee@beenovo.com
Hadoop meets Agile! - An Agile Big Data Model (Uwe Printz)
Big Data projects are a struggle, not only on the technical side but also on the organizational side. In this talk the author shares his experience and opinions from almost 5 years of Big Data projects and develops an Agile Big Data Model which reflects his ideas on how Big Data projects can be successful, even in large companies.
Talk held at the crossover meetup of the "Agile Stammtisch Rhein-Main" and the "Hadoop & Spark User Group Rhein-Main" at codecentric AG on 31.01.2017.
Introduction to Cloudera's Administrator Training for Apache Hadoop (Cloudera, Inc.)
Learn who is best suited to attend the full Administrator Training, what prior knowledge you should have, and what topics the course covers. Cloudera Senior Curriculum Manager, Ian Wrigley, will discuss the skills you will attain during Admin Training and how they will help you move your Hadoop deployment from strategy to production and prepare for the Cloudera Certified Administrator for Apache Hadoop (CCAH) exam.
This talk takes you on a rollercoaster ride through Hadoop 2 and explains the most significant changes and components.
The talk was held at the JavaLand conference in Brühl, Germany on 25.03.2014.
Agenda:
- Welcome Office
- YARN Land
- HDFS 2 Land
- YARN App Land
- Enterprise Land
HDFS Tiered Storage: Mounting Object Stores in HDFS (DataWorks Summit)
Most users know HDFS as the reliable store of record for big data analytics. HDFS is also used to store transient and operational data when working with cloud object stores, such as Azure HDInsight and Amazon EMR. In these settings, but also in more traditional on-premises deployments, applications often manage data stored in multiple storage systems or clusters, requiring a complex workflow for synchronizing data between filesystems to achieve goals for durability, performance, and coordination.
Building on existing heterogeneous storage support, we add a storage tier to HDFS to work with external stores, allowing remote namespaces to be "mounted" in HDFS. This capability not only supports transparent caching of remote data as HDFS blocks, it also supports synchronous writes to remote clusters for business continuity planning (BCP) and supports hybrid cloud architectures.
This idea was presented at last year’s Summit in San Jose. Lots of progress has been made since then and the feature is in active development at the Apache Software Foundation on branch HDFS-9806, driven by Microsoft and Western Digital. We will discuss the refined design & implementation and present how end-users and admins will be able to use this powerful functionality.
With Hadoop-3.0.0-alpha2 being released in January 2017, it's time to have a closer look at the features and fixes of Hadoop 3.0.
We will have a look at Core Hadoop, HDFS and YARN, and answer the emerging question of whether Hadoop 3.0 will be an architectural revolution like Hadoop 2 was with YARN & Co., or more of an evolution adapting to new use cases like IoT, Machine Learning and Deep Learning (TensorFlow).
A comprehensive overview of the security concepts in the open source Hadoop stack in mid 2015 with a look back into the "old days" and an outlook into future developments.
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su... (Abhiraj Butala)
The talk covers limitations of current Hadoop eco-system components in handling security (Authentication, Authorization, Auditing) in multi-tenant, multi-application environments. Then it proposes how we can use Apache Ranger and HDFS super-user connections to enforce correct HDFS authorization policies and achieve the required auditing.
Most users know HDFS as the reliable store of record for big data analytics. HDFS is also used to store transient and operational data when working with cloud object stores, such as Microsoft Azure or Amazon S3, and on-premises object stores, such as Western Digital’s ActiveScale. In these settings, applications often manage data stored in multiple storage systems or clusters, requiring a complex workflow for synchronizing data between filesystems for business continuity planning (BCP) and/or supporting hybrid cloud architectures to achieve the required business goals for durability, performance, and coordination.
To resolve this complexity, HDFS-9806 has added a PROVIDED storage tier to mount external storage systems in the HDFS NameNode. Building on this functionality, we can now allow remote namespaces to be synchronized with HDFS, enabling asynchronous writes to the remote storage and the possibility to synchronously and transparently read data back to a local application wanting to access file data which is stored remotely. In this talk, which corresponds to the work in progress under HDFS-12090, we will present how the Hadoop admin can manage storage tiering between clusters and how that is then handled inside HDFS through the snapshotting mechanism and asynchronously satisfying the storage policy.
Speakers
Chris Douglas, Microsoft, Principal Research Software Engineer
Thomas Denmoor, Western Digital, Object Storage Architect
Upgrade Without the Headache: Best Practices for Upgrading Hadoop in Production (Cloudera, Inc.)
Walk through some of the best practices to keep in mind when it comes to upgrading your cluster, and learn how to leverage new Upgrade Wizard features in Cloudera Enterprise 5.3.
For most mission critical workloads, downtime is never an option. Any downtime can have a direct impact on revenue and lead to frantic calls in the middle of the night. For this reason, upgrading the software that powers these workloads can often be a daunting task. It can cause unpredictable issues without access to support. That’s why an enterprise-grade administration tool is crucial for running Hadoop in production. Hadoop consists of dozens of components, running across multiple machines, all with their own configurations. That can lead to a lot of complexity and uncertainty - especially when taking the upgrade plunge.
Cloudera Manager makes it easy and is the only production-ready administration tool for Hadoop. Not only does Cloudera Manager feature zero-downtime rolling upgrades, but it also has a built-in Upgrade Wizard to make upgrades simple and predictable.
Talk held at a combined meeting of the Web Performance Karlsruhe (http://www.meetup.com/Karlsruhe-Web-Performance-Group/events/153207062) & Big Data Karlsruhe/Stuttgart (http://www.meetup.com/Big-Data-User-Group-Karlsruhe-Stuttgart/events/162836152) user groups.
Agenda:
- Why Hadoop 2?
- HDFS 2
- YARN
- YARN Apps
- Write your own YARN App
- Tez, Hive & Stinger Initiative
Data Protection in Hybrid Enterprise Data Lake Environment (DataWorks Summit)
In the current digital world, enterprises are drowning under the weight of data that they are required to store for customers, for corporate analysis, and for business forecasting. With the convergence of cloud, IoT, and big data technologies, data lakes are becoming the critical fuel for enterprise-wide digital transformations, which are proven to be cost-effective, self-service, and elastic in nature. This enterprise data is spread widely across numerous clusters and repositories residing in both the company's data centers and multiple cloud locations, posing a new “data protection” problem in hybrid environments. Protecting data is a critical part of every business continuity plan because data loss or corruption may have a huge impact on enterprise survival. Protecting data is more challenging than ever in complex hybrid enterprise data lake environments, since we need to answer questions such as:
- How do we move data seamlessly between enterprise data centers and cloud?
- How to secure enterprise data that resides in different locations with multiple authorization policies?
- How do we protect data from natural or accidental disasters to ensure operational continuity?
Not having immediate answers to these questions makes it very difficult for business users and platform operators to do their jobs in protecting data in hybrid enterprise data lake environments. Therefore enterprises require a unified data protection orchestration platform which seamlessly protects the data across multiple environments. In this talk, we will address the above challenges faced by enterprises using Apache Hadoop, Apache Hive, Apache Ranger and Apache Atlas.
Using a unified open source orchestration platform, we will outline how:
- You can protect mission-critical data, along with its security and governance policies, across multiple data lakes, and how change data capture works using Apache Hadoop, Apache Hive, Apache Ranger and Apache Atlas.
- You can monitor replication jobs and metric collections associated with the replicated data across hybrid enterprise data lake environments.
We will also showcase:
- How to seamlessly and securely replicate HDFS data and Hive databases between Hortonworks clusters, along with Apache Ranger policies and Apache Atlas metadata.
- How to securely move data between on-premises clusters and cloud storage.
Learn how to set up a Hadoop cluster with HDFS High Availability here: www.edureka.co/blog/how-to-set-up-hadoop-cluster-with-hdfs-high-availability/
The Hadoop Cluster Administration course at Edureka starts with the fundamental concepts of Apache Hadoop and Hadoop clusters. It covers how to deploy, manage, monitor, and secure a Hadoop cluster. You will learn to configure backup options and to diagnose and recover from node failures in a Hadoop cluster. The course also covers HBase administration. There are many challenging, practical and focused hands-on exercises for learners. Software professionals new to Hadoop can quickly learn cluster administration through technical sessions and hands-on labs. By the end of this six-week Hadoop Cluster Administration training, you will be prepared to understand and solve real-world problems that you may come across while working on a Hadoop cluster.
Forrester predicts that CIOs who are late to the Hadoop game will finally make the platform a priority in 2015. Hadoop has evolved into a must-know technology and has been a reason for better career, salary and job opportunities for many professionals.
5 Scenarios: When To Use & When Not to Use Hadoop (Edureka!)
Talend For Big Data : Secret Key to Hadoop (Edureka!)
Today, when data is mushrooming and coming in heterogeneous forms, there is a growing need for flexible, adaptable platforms. Talend fits perfectly in this space with a proven track record, opening up vast opportunities. If you understand how to manage, transform and store your organisation's data (retail, banking, airlines, research, insurance, cards etc.) and represent it effectively, then you are the resource organizations are looking for.
The following blogs will help you understand the significance of Hadoop Administration training:
http://www.edureka.co/blog/why-should-you-go-for-hadoop-administration-course/
http://www.edureka.co/blog/how-to-become-a-hadoop-administrator/
http://www.edureka.co/blog/hadoop-admin-responsibilities/
Introduction To Hadoop Administration - SpringPeople
The Hadoop framework is used by major players including Google, Yahoo and IBM, largely for applications involving search engines and advertising. The popularity of Hadoop is increasing rapidly.
An overview of securing Hadoop. Content primarily by Balaji Ganesan, one of the leaders of the Apache Argus project. Presented on Sept 4, 2014 at the Toronto Hadoop User Group by Adam Muise.
As organizations start migrating to the cloud to improve their operational efficiency and reduce cost, Amazon's cloud service is one of the most sought-after platforms for all cloud needs. The blog covers cloud computing, its benefits, best practices in AWS and so on.
Due to traffic growth and the need for large-scale data analysis, big data has become one of the most popular areas in IT, and many companies are now working on it by deploying clusters of the Hadoop project, which is currently the most popular platform for big data processing. The talk presents, in an accessible form, the issues of securing Hadoop, or more precisely its security principles, and demonstrates various attack vectors against a cluster.
Best Hadoop Institutes: Kelly Technologies is the best Hadoop training institute in Bangalore, providing Hadoop courses by real-time faculty in Bangalore.
Talk Abstract
The latest release of Accumulo has enhanced support for encryption at rest and in motion thanks to developments such as the pluggable key encryption strategy. Methods and practices to protect and manage encryption keys are left to the user. This talk details methods of protecting and managing encryption keys used for encryption at rest in Accumulo. Methods covered include file system and custom schemes as well as the leveraging of newly-released Hadoop features such as native HDFS encryption at rest and the Hadoop Key Management System (KMS) to protect and manage keys.
Speaker
Anthony Young-Garner
Software Engineer, Cloudera
Anthony Young-Garner joined Cloudera when it acquired the Austin-based security software company Gazzang. In the interim, he and the Gazzang team have been bringing their capabilities in encryption and key management to Hadoop.
Startup Case Study: Leveraging the Broad Hadoop Ecosystem to Develop World-Fi... (DataWorks Summit)
Back in 2014, our team set out to change the way the world exchanges and collaborates with data. Our vision was to build a single tenant environment for multiple organisations to securely share and consume data. And we did just that, leveraging multiple Hadoop technologies to help our infrastructure scale quickly and securely.
Today Data Republic’s technology delivers a trusted platform for hundreds of enterprise level companies to securely exchange, commercialise and collaborate with large datasets.
Join Head of Engineering, Juan Delard de Rigoulières and Senior Solutions Architect, Amin Abbaspour as they share key lessons from their team’s journey with Hadoop:
* How a startup leveraged a clever combination of Hadoop technologies to build a secure data exchange platform
* How Hadoop technologies helped us deliver key solutions around governance, security and controls of data and metadata
* An evaluation of the maturity and usefulness of some Hadoop technologies in our environment: Hive, HDFS, Spark, Ranger, Atlas, Knox, Kylin. We've used them all extensively.
* Our bold approach to expose APIs directly to end users; as well as the challenges, learning and code we created in the process
* Learnings from the front-line: How our team coped with code changes, performance tuning, issues and solutions while building our data exchange
Whether you’re an enterprise level business or a start-up looking to scale - this case study discussion offers behind-the-scenes lessons and key tips when using Hadoop technologies to manage data governance and collaboration in the cloud.
Speakers:
Juan Delard De Rigoulieres, Head of Engineering, Data Republic Pty Ltd
Amin Abbaspour, Senior Solutions Architect, Data Republic
What to learn during the 21 days Lockdown | Edureka
Register Here: https://resources.edureka.co/21-days-learning-plan-webinar/
In light of the complete national lockdown for 21 days, we invite you to join a FREE webinar by renowned Mentor and Advisor, Nitin Gupta as he helps you create a 21-day learning gameplan to maximize returns for your career.
The webinar will help freshers and experienced professionals to capitalize on these 21 days and figure out the best technologies to learn while confined to home.
You will also get all your questions and doubts resolved in real-time.
Follow us to never miss an update in the future.
YouTube: https://www.youtube.com/user/edurekaIN
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Castbox: https://castbox.fm/networks/505?country=in
Meetup: https://www.meetup.com/edureka/
Top 10 Dying Programming Languages in 2020 | Edureka
YouTube Link: https://youtu.be/LSM7hD6GM4M
Get Edureka Certified in Trending Programming Languages: https://www.edureka.co
In this highly competitive IT industry, everyone wants to learn programming languages that will keep them ahead of the game. But knowing what to learn so you gain the most out of your knowledge is a whole other ball game. So, we at Edureka have prepared a list of Top 10 Dying Programming Languages 2020 that will help you to make the right choice for your career. Meanwhile, if you ever wondered about which languages are slated for continuing uptake and possible greatness, we have a list for that, too.
Top 5 Trending Business Intelligence Tools | Edureka
YouTube Link: https://youtu.be/eEwq_mPd1iI
Edureka BI Certification Training Courses: https://www.edureka.co/bi-and-visualization-certification-courses
Receiving insights and finding trends is absolutely critical for businesses to scale and adapt as the years go on. This is exactly what business intelligence does and the best thing about these software solutions is that their potential uses are practically unlimited.
Tableau Tutorial for Data Science | EdurekaEdureka!
YouTube Link:https://youtu.be/ZHNdSKMluI0
Edureka Tableau Certification Training: https://www.edureka.co/tableau-certification-training
This Edureka PPT on "Tableau for Data Science" will help you use Tableau as a tool for Data Science, improving not only engagement but also comprehension efficiency. Through this PPT, you will learn to gain the maximum amount of insight with the least amount of effort.
YouTube Link:https://youtu.be/CVv8zhYEjUE
Edureka Python Certification Training: https://www.edureka.co/data-science-python-certification-course
This Edureka PPT on 'Python Programming' will help you learn Python programming basics with the help of interesting hands-on implementations.
YouTube Link:https://youtu.be/LvgqSMlIXFs
Get Edureka Certified in Trending Project Management Certifications: https://www.edureka.co/project-management-and-methodologies-certification-courses
Whether you want to scale up your career or switch your career path, Project Management Certifications seem to be a perfect choice in either case. So, we at Edureka have prepared a list of Top 5 Project Management Certifications that you must check out in 2020 for a major career boost.
Top Maven Interview Questions in 2020 | EdurekaEdureka!
YouTube Link: https://youtu.be/5iTcAR4fScM
**DevOps Certification Courses - https://www.edureka.co/devops-certification-training***
This video on 'Maven Interview Questions' discusses the most frequently asked Maven Interview Questions. This PPT will help give you a detailed explanation of the topics which will help you in acing the interviews.
YouTube Link: https://youtu.be/xHUiYEIcY_I
** Linux Administration Certification Training - https://www.edureka.co/linux-admin **
Linux Mint is the first operating system that people from Windows or Mac are drawn towards when they have to switch to Linux in their work environment. Linux Mint has been around since the year 2006 and has grown and matured into a very user-friendly OS. Do watch the PPT till the very end to see all the demonstrations.
How to Deploy Java Web App in AWS| EdurekaEdureka!
YouTube Link:https://youtu.be/Ozc5Yu_IcaI
** Edureka AWS Architect Certification Training - https://www.edureka.co/aws-certification-training**
This Edureka PPT shows how to deploy a java web application in AWS using AWS Elastic Beanstalk. It also describes the advantages of using AWS for this purpose.
YouTube Link:https://youtu.be/phPCkkWT76k
*** Edureka Digital Marketing Course: https://www.edureka.co/post-graduate/digital-marketing-certification***
This Edureka PPT on "Top 10 Reasons to Learn Digital Marketing" will help you understand why you should take up Digital Marketing
YouTube Link: https://youtu.be/R132INtDg9k
** RPA Training: https://www.edureka.co/robotic-process-automation-training**
This PPT on RPA in 2020 will provide a glimpse of the accomplishments and benefits provided by RPA. Also, it will list out the new changes and technologies that will collaborate with RPA in 2020.
YouTube Link: https://youtu.be/mb8WOHejlT8
**DevOps Certification Courses - https://www.edureka.co/devops-certification-training **
This PPT shows how to configure Jenkins to receive email notifications. It also includes a demo that shows how to do it in 6 simple steps in the Windows machine.
EM Algorithm in Machine Learning | EdurekaEdureka!
YouTube Link: https://youtu.be/DIADjJXrgps
** Machine Learning Certification Training: https://www.edureka.co/machine-learning-certification-training **
This Edureka PPT on 'EM Algorithm In Machine Learning' covers the EM algorithm along with the problem of latent variables in maximum likelihood and Gaussian mixture model.
YouTube Link: https://youtu.be/Zsl7ttA9Kcg
PGP in AI and Machine Learning (9 Months Online Program): https://www.edureka.co/post-graduate/machine-learning-and-ai
This Edureka PPT on "Cognitive AI" explains cognitive computing and how it helps in making better human decisions at work. Also, it explains the differences between cognitive computing and artificial intelligence.
YouTube Link: https://youtu.be/0djPrlaxx_U
Edureka AWS Architect Certification Training - https://www.edureka.co/aws-certification-training
This Edureka PPT on AWS Cloud Practitioner will provide a complete guide to your AWS Cloud Practitioner Certification exam. It will explain the exam details, objectives, why you should get certified and also how AWS certification will help your career.
Blue Prism Top Interview Questions | EdurekaEdureka!
YouTube Link: https://youtu.be/ykbRdUNIbyQ
** RPA Training: https://www.edureka.co/robotic-process-automation-certification-courses**
This PPT on Blue Prism Interview Questions will cover the Top 50 Blue Prism related questions asked in your interviews.
YouTube Link: https://youtu.be/ge4qhkl9uKg
AWS Architect Certification Training: https://www.edureka.co/aws-certification-training
This PPT will help you in understanding how AWS deals smartly with Big Data. It also shows how AWS can solve Big Data challenges with ease.
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaEdureka!
YouTube Link: https://youtu.be/amlkE0g-YFU
** Artificial Intelligence and Deep Learning: https://www.edureka.co/ai-deep-learni... **
This Edureka PPT on 'A Star Algorithm' teaches you about the A star Algorithm, its uses, advantages and disadvantages, and much more. It also shows you how the algorithm can be implemented in practice and compares it with Dijkstra's algorithm.
Check out our playlist for more videos: http://bit.ly/2taym8X
Kubernetes Installation on Ubuntu | EdurekaEdureka!
YouTube Link: https://youtu.be/UWg3ORRRF60
Kubernetes Certification: https://www.edureka.co/kubernetes-certification
This Edureka PPT will help you set up a Kubernetes cluster with 1 master and 1 node. Detailed step-by-step instructions are demonstrated in this PPT.
YouTube Link: https://youtu.be/GJQ36pIYbic
DevOps Training: https://www.edureka.co/devops-certification-training
This Edureka DevOps Tutorial for Beginners talks about What is DevOps and how it works. You will learn about several DevOps tools (Git, Jenkins, Docker, Puppet, Ansible, Nagios) involved at different DevOps stages such as version control, continuous integration, continuous delivery, continuous deployment, continuous monitoring.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating the uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux tools: Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns, and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Enhancing Performance with Globus and the Science DMZGlobus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
2. www.edureka.co/hadoop-adminSlide 2 Twitter @edurekaIN, Facebook /edurekaIN, use #askEdureka for Questions
Objectives
By the end of this module, you will have covered the following topics:
Hadoop Cluster introduction
Recommended Configuration for cluster
Hadoop cluster running modes
Hadoop Security with Kerberos
HDFS Security with ACLs (Access Control Lists)
Hadoop Admin Responsibilities
Demo on Security
4. Hadoop Cluster: A Typical Use Case
Active NameNode: RAM 64 GB; hard disk 1 TB; processor Xeon with 8 cores; Ethernet 3 x 10 GB/s; OS 64-bit CentOS; power: redundant power supply
Secondary NameNode: RAM 32 GB; hard disk 1 TB; processor Xeon with 4 cores; Ethernet 3 x 10 GB/s; OS 64-bit CentOS; power: redundant power supply
StandBy NameNode (optional): RAM 64 GB; hard disk 1 TB; processor Xeon with 8 cores; Ethernet 3 x 10 GB/s; OS 64-bit CentOS; power: redundant power supply
DataNode (each): RAM 16 GB; hard disk 6 x 2 TB; processor Xeon with 2 cores; Ethernet 3 x 10 GB/s; OS 64-bit CentOS
5. Slave Nodes: Recommended Configuration
Higher-performance vs lower-performance components: Save the Money, Buy more Nodes!
General Configuration ('base' configuration for a slave node; depends on requirement)
» 4 x 1 TB or 2 TB hard drives, in a JBOD (just a bunch of disks) configuration
» Do not use RAID!
» 2 x Quad-core CPUs
» 24-32 GB RAM
» Gigabit Ethernet
Special Configuration
» Multiples of (1 hard drive + 2 cores + 6-8 GB RAM) generally work well for many types of applications
“A cluster with more nodes performs better than one with fewer, slightly faster nodes”
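The "multiples of (1 hard drive + 2 cores + 6-8 GB RAM)" rule of thumb can be worked through as simple arithmetic. The sketch below is only an illustration: the names `SlaveNodeSpec` and `slave_node_spec` are hypothetical, and the rule itself is a rough guideline rather than a sizing formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveNodeSpec:
    """Hypothetical container for a slave node's hardware budget."""
    hard_drives: int
    cores: int
    ram_gb_min: int
    ram_gb_max: int


def slave_node_spec(units: int) -> SlaveNodeSpec:
    """Scale the (1 hard drive + 2 cores + 6-8 GB RAM) unit by `units`."""
    if units < 1:
        raise ValueError("units must be at least 1")
    return SlaveNodeSpec(
        hard_drives=units * 1,
        cores=units * 2,
        ram_gb_min=units * 6,
        ram_gb_max=units * 8,
    )


# Four units give 4 drives, 8 cores and 24-32 GB RAM.
print(slave_node_spec(4))
```

With four units the result lines up with the 'base' configuration on the slide: 4 hard drives, 2 x quad-core CPUs (8 cores) and 24-32 GB RAM.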
6. Hadoop Cluster Modes
Hadoop can run in any of the following three modes:
Standalone (or Local) Mode
- No daemons, everything runs in a single JVM
- Suitable for running MapReduce programs during development
- Has no DFS
Pseudo-Distributed Mode
- Hadoop daemons run on the local machine
Fully-Distributed Mode
- Hadoop daemons run on a cluster of machines
7. Slide 7 www.edureka.in/hadoop-admin
Security issues in Hadoop Cluster
Unauthorized clients can impersonate authorized users and access the cluster
Get the blocks directly from the Data nodes by bypassing the Name node
Eavesdropping of data packets being sent by Data nodes to client
Not all users should have access to sensitive data
No User verification for Map Reduce code execution, malicious users could submit a job
Insecure Network Transport
No Message level security
8. Hadoop security considerations
Authentication
Authorization
Access control
Data masking and encryption
Network security
Integrity
Confidentiality
Audits and event monitoring
10. Kerberos to the rescue
Network authentication protocol
Developed at MIT in the mid 1980s
Easy for administrators to manage passwords by storing them centrally
Enhance security by ensuring no clear text passwords are transmitted
Allow users to access different services with the same password
Available as open source or in supported commercial software
11. Kerberos Design Requirements
Interactions between hosts and clients should be encrypted.
Must be convenient for users (or they won’t use it).
Protect against intercepted credentials.
Kerberos is based on the Secret-Key Distribution Model:
- Keys are the basis of authentication in Kerberos.
- A key is typically a short sequence of bytes.
- The same key is used to both encrypt and decrypt.
12. Kerberos Components & Terminology
Kerberos Client
Kerberos Server
Kerberos Key Distribution Center (KDC), which comprises:
- Authentication Server (AS)
- Ticket-Granting Server (TGS)
Users and services in a Kerberos realm are known as Principals.
13. Kerberos to the rescue
Kerberos Integration
User Authentication
User and Group access control list at cluster level
Tokens: Delegation, Job, Block Access
Simple Authentication and Security Layer (SASL) with RPC digest mechanism
[Diagram: the Client talks to the Kerberos Key Distribution Center (Authentication Server and Ticket Granting Server) and then to the Server]
1: Authentication (Get TGT)
2: Authorization (Get Service Ticket)
3: Service Request (Start Service Session)
14. Kerberos to the rescue
[Diagram: message flow between the Client, the Kerberos Key Distribution Center (Authentication Server and Ticket Granting Server) and the Server]
1. Request TGT (Auth)
2. Responds with encrypted session key + TGT (TGT + Sk1)
3. Request Service ticket by providing TGT (Auth + TGT)
4. Encrypted session key and ticket granted for service access (TGT + Sk2)
5. Authenticates with Service Ticket (Auth + TGT)
6. Server responds with encrypted timestamp (Sk2 + Auth)
Legend:
Auth -> Authenticator
TGT -> Ticket Granting Ticket
Sk1, Sk2 -> Session Key
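The six-step exchange can be traced with a toy simulation. This is a minimal sketch under heavy assumptions, not real Kerberos: `Sealed` only imitates symmetric encryption (no actual cryptography), the principal names `alice` and `hdfs/namenode` are hypothetical, and ticket lifetimes, replay protection and timestamp validation are all left out.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Sealed:
    """Toy stand-in for symmetric encryption: opens only with the sealing key."""
    key: bytes
    payload: dict

    def open(self, key: bytes) -> dict:
        if not secrets.compare_digest(key, self.key):
            raise PermissionError("wrong key")
        return dict(self.payload)


def seal(key: bytes, payload: dict) -> Sealed:
    return Sealed(key, dict(payload))


class KDC:
    """Hypothetical Key Distribution Center holding every principal's key."""

    def __init__(self) -> None:
        self.principal_keys = {}
        self.tgs_key = secrets.token_bytes(16)

    def add_principal(self, name: str) -> bytes:
        key = secrets.token_bytes(16)
        self.principal_keys[name] = key
        return key

    def request_tgt(self, client: str):
        # Steps 1-2 (Authentication Server): return Sk1 sealed for the client, plus a TGT.
        sk1 = secrets.token_bytes(16)
        tgt = seal(self.tgs_key, {"client": client, "session_key": sk1})
        return seal(self.principal_keys[client], {"session_key": sk1}), tgt

    def request_service_ticket(self, tgt: Sealed, authenticator: Sealed, service: str):
        # Steps 3-4 (Ticket Granting Server): check TGT + authenticator, issue Sk2 + ticket.
        granted = tgt.open(self.tgs_key)
        sk1 = granted["session_key"]
        if authenticator.open(sk1)["client"] != granted["client"]:
            raise PermissionError("authenticator does not match TGT")
        sk2 = secrets.token_bytes(16)
        ticket = seal(self.principal_keys[service],
                      {"client": granted["client"], "session_key": sk2})
        return seal(sk1, {"session_key": sk2}), ticket


def serve(service_key: bytes, ticket: Sealed, authenticator: Sealed) -> Sealed:
    # Steps 5-6 (Server): open the ticket, check the authenticator, echo the timestamp.
    contents = ticket.open(service_key)
    sk2 = contents["session_key"]
    auth = authenticator.open(sk2)
    if auth["client"] != contents["client"]:
        raise PermissionError("authenticator does not match ticket")
    return seal(sk2, {"timestamp": auth["timestamp"]})


def run_flow() -> bool:
    kdc = KDC()
    alice_key = kdc.add_principal("alice")
    service_key = kdc.add_principal("hdfs/namenode")

    sealed_sk1, tgt = kdc.request_tgt("alice")
    sk1 = sealed_sk1.open(alice_key)["session_key"]

    auth_for_tgs = seal(sk1, {"client": "alice", "timestamp": time.time()})
    sealed_sk2, ticket = kdc.request_service_ticket(tgt, auth_for_tgs, "hdfs/namenode")
    sk2 = sealed_sk2.open(sk1)["session_key"]

    sent_at = time.time()
    auth_for_service = seal(sk2, {"client": "alice", "timestamp": sent_at})
    reply = serve(service_key, ticket, auth_for_service)
    # The client trusts the server only if it can echo the timestamp under Sk2.
    return reply.open(sk2)["timestamp"] == sent_at


print(run_flow())
```

Note that the client's long-term key is never sent anywhere in this flow: it is used only locally to open the sealed session key, which mirrors the point that passwords do not travel over the network.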
15. Kerberos advantages
A password never travels over the network. Only time-sensitive tickets travel over the network.
Passwords or secret keys are only known to the KDC and the principal.
Kerberos supports storing passwords or secret keys in a centralized credential store that is LDAP-compliant. This makes it easy for administrators to manage the system and the users.
Servers don't have to store any tickets or any client-specific details to authenticate a client.
17. HDFS Permissions (ACLs)
HDFS supports a permission model equivalent to traditional Unix permissions.
For each file or directory, permissions are managed for a set of 3 distinct user classes:
Owner
Group
Others
There are 3 different permissions controlled for each user class:
Read
Write
Execute
For files: the r permission is required to read the file, and the w permission is required to write or append to the file.
For directories: the r permission is required to list the contents of the directory, the w permission is required to create or delete files or directories, and the x permission is required to access a child of the directory.
18. HDFS Permissions (ACLs)
Each client process that accesses HDFS has a two-part identity composed of the user name and the groups list.
Whenever HDFS must do a permissions check for a file or directory foo accessed by a client process:
1. If the user name matches the owner of foo, then the owner permissions are tested.
2. Else if the group of foo matches any member of the groups list, then the group permissions are tested.
3. Otherwise the other permissions of foo are tested.
4. If a permissions check fails, the client operation fails.
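The check order above can be sketched in a few lines. This is an illustrative model only, not HDFS code: `Inode` and `is_allowed` are hypothetical names, the mode is a plain Unix-style octal value, and superuser handling and extended ACL entries are deliberately left out.

```python
from dataclasses import dataclass
from typing import Iterable

PERMISSION_BITS = {"r": 4, "w": 2, "x": 1}


@dataclass(frozen=True)
class Inode:
    """Hypothetical stand-in for a file or directory such as foo."""
    owner: str
    group: str
    mode: int  # Unix-style permission bits, e.g. 0o640


def is_allowed(inode: Inode, user: str, groups: Iterable[str], wanted: str) -> bool:
    """Apply the owner / group / other test order to a requested permission set."""
    if user == inode.owner:
        bits = (inode.mode >> 6) & 0o7   # 1. owner permissions
    elif inode.group in set(groups):
        bits = (inode.mode >> 3) & 0o7   # 2. group permissions
    else:
        bits = inode.mode & 0o7          # 3. other permissions
    needed = 0
    for flag in wanted:
        needed |= PERMISSION_BITS[flag]
    # 4. The operation is allowed only if every requested bit is present.
    return (bits & needed) == needed


foo = Inode(owner="alice", group="analysts", mode=0o640)
print(is_allowed(foo, "alice", ["staff"], "rw"))
print(is_allowed(foo, "bob", ["analysts"], "w"))
```

Exactly one class is tested per check. An owner who also belongs to the file's group is still evaluated against the owner bits only, so a mode such as 0o070 denies the owner even though the group has full access.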
19. ACLs Shell Commands
hdfs dfs -getfacl [-R] <path>
Displays the Access Control Lists (ACLs) of files and directories. If a directory has a default ACL, then getfacl also displays the default ACL.
hdfs dfs -setfacl [-R] [-b |-k -m |-x <acl_spec> <path>] |[--set <acl_spec> <path>]
Sets Access Control Lists (ACLs) of files and directories.
hdfs dfs -ls <args>
The output of ls will append a ‘+’ character to the permissions string of any file or directory that has an ACL.
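As a small illustration of how these commands might be assembled from a script, the sketch below builds the argument lists without running anything. The helper names, the path `/data/reports` and the user `alice` are hypothetical; only the `-getfacl`, `-setfacl`, `-R` and `-m` options come from the syntax shown above.

```python
import shlex
from typing import List


def getfacl_cmd(path: str, recursive: bool = False) -> List[str]:
    """Build the argv for displaying the ACLs of a path."""
    argv = ["hdfs", "dfs", "-getfacl"]
    if recursive:
        argv.append("-R")
    argv.append(path)
    return argv


def setfacl_modify_cmd(path: str, acl_spec: str, recursive: bool = False) -> List[str]:
    """Build the argv for adding or modifying ACL entries with -m."""
    argv = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        argv.append("-R")
    argv += ["-m", acl_spec, path]
    return argv


def has_acl(permission_string: str) -> bool:
    """ls marks entries that carry an ACL with a trailing '+'."""
    return permission_string.endswith("+")


print(shlex.join(setfacl_modify_cmd("/data/reports", "user:alice:r-x")))
print(shlex.join(getfacl_cmd("/data/reports", recursive=True)))
print(has_acl("drwxr-x---+"))
```

Keeping the commands as argument lists rather than one string avoids shell-quoting problems if a path contains spaces; the lists could be handed to `subprocess.run` on a machine where the `hdfs` client is installed.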
21. Hadoop Admin Responsibilities
Responsible for implementation and administration of Hadoop infrastructure.
Testing HDFS, Hive, Pig and MapReduce access for Applications.
Cluster maintenance tasks like Backup, Recovery, Upgrade, Patching.
Performance tuning and Capacity planning for Clusters.
Monitor Hadoop cluster and deploy security.
22. How it Works?
LIVE Online Class
Class Recording in LMS
24/7 Post Class Support
Module Wise Quiz
Project Work
Verifiable Certificate