The document outlines 11 steps to execute a word count program in Hadoop, including creating directories, writing Java code for a mapper and reducer, compiling the code, creating a JAR file, copying input files to HDFS, executing the program via Hadoop, and checking the output on the NameNode. It also provides an alternate execution command without creating a JAR file.
WORD COUNT PROGRAM EXECUTION STEPS IN HADOOP
Step 1: Create a directory called wordcount in /home/user/Documents/
cd /home/user/Documents/
sudo mkdir wordcount
cd wordcount
Step 2: Create a WordCount.java file in the wordcount directory
vi WordCount.java
Sample content of the WordCount.java file
//package org.myorg;
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    // Driver: args[0] is the input path, args[1] is the output path.
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        //conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
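To see what the mapper and reducer compute without starting a cluster, the same logic can be imitated in plain Java. This is only a sketch under stated assumptions: the class name LocalWordCount, its method names, and the sample lines are made up for illustration, it does not use any Hadoop API, and it needs Java 9 or later for List.of and Map.entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical class: imitates the job's data flow in one JVM, without Hadoop.
class LocalWordCount {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(Map.entry(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle and reduce: group the pairs by word and sum the ones.
    // A TreeMap keeps the words sorted, loosely mirroring the framework's sort.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted pairs.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        // Made-up lines standing in for the contents of emp.txt.
        List<String> lines = List.of("hadoop counts words", "hadoop runs jobs");
        count(lines).forEach((word, total) -> System.out.println(word + "\t" + total));
    }
}
```

In the real job the pairs emitted by many map tasks are grouped by the framework before the reducer sees them; here a single in-memory map plays that role.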
Step 3: Create a directory called wordcountc in /home/user/Documents/wordcount/
sudo mkdir wordcountc
Step 4: Create a directory on the Hadoop file system
hdfs dfs -mkdir /example1
Step 5: Copy the input file from the local system to Hadoop file system
hdfs dfs -copyFromLocal /home/user/Documents/emp.txt /example1/
Step 6: Compile WordCount.java, writing the class files to the wordcountc directory
sudo javac -classpath /usr/local/hadoop/share/hadoop/common/hadoop-common-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0.jar:/usr/local/hadoop/share/hadoop/common/lib/hadoop-annotations-2.6.0.jar -d wordcountc/ WordCount.java
Step 7: After compilation, 3 class files (WordCount.class, WordCount$Map.class and WordCount$Reduce.class) will be generated in the directory 'wordcountc'
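The three class files correspond to the outer class and its two static nested classes, because javac writes each nested class to its own file named Outer$Inner.class. A minimal sketch of that naming rule, using a hypothetical stand-in class NestedNames that has the same shape as WordCount but no Hadoop dependency:

```java
// Hypothetical stand-in: one outer class with two static nested classes.
class NestedNames {
    static class Map {}
    static class Reduce {}

    // Builds the file names from the binary class names reported by the JVM.
    static String[] classFileNames() {
        return new String[] {
            NestedNames.class.getName() + ".class",
            Map.class.getName() + ".class",
            Reduce.class.getName() + ".class"
        };
    }

    public static void main(String[] args) {
        for (String name : classFileNames()) {
            System.out.println(name);
        }
    }
}
```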
Step 8: Create jar file using the command
sudo jar -cvf wordcountj.jar -C /home/user/Documents/wordcount/wordcountc .
Step 9: Change to /usr/local/hadoop/ folder
cd /usr/local/hadoop/
Step 10: Execute using the below command
bin/hadoop jar /home/user/Documents/wordcount/wordcountj.jar WordCount /example1/emp.txt output
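In the command above, the two trailing values reach the program's main method as args[0] (the input path) and args[1] (the output path). The sample driver reads them without checking, so a missing argument only shows up as an ArrayIndexOutOfBoundsException. A possible guard is sketched below; the class name JobArgs, the method check, and the wording of the messages are assumptions for illustration and are not part of the original program.

```java
// Hypothetical helper: validates the two command-line arguments of the driver.
class JobArgs {

    // Returns null when the arguments look usable, otherwise a usage message.
    static String check(String[] args) {
        if (args.length != 2) {
            return "Usage: WordCount <input path> <output path>";
        }
        if (args[0].isEmpty() || args[1].isEmpty()) {
            return "Input and output paths must not be empty";
        }
        return null;
    }

    public static void main(String[] args) {
        String problem = check(args);
        if (problem != null) {
            System.err.println(problem);
            return;
        }
        System.out.println("input  = " + args[0]);
        System.out.println("output = " + args[1]);
    }
}
```

Note that Hadoop itself refuses to start a job whose output directory already exists, so a second run needs a new output name or the old directory removed first.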
Step 11: The output can be checked by opening http://localhost:50070 in a browser window. This is the NameNode web interface, which also displays data node information. In the menu, under 'Utilities', select 'Browse the file system' and find the result of the execution under the '/user/hdpuser/output' directory. The relative path 'output' given in Step 10 is resolved against the HDFS home directory of the user who ran the job, here hdpuser.
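With TextOutputFormat and its default settings, each line of the result file (typically named part-00000 for this older API) holds a word, a tab character, and its count. Once such a file has been downloaded, its lines could be read back roughly as follows. This is a sketch only: the class name OutputLines and the sample lines are made up, and no real job output is shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: parses "word<TAB>count" lines from a result file.
class OutputLines {

    // Keeps the order of the lines; blank lines and lines without a tab are skipped.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue;
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the word<TAB>count layout.
        List<String> sample = List.of("hadoop\t2", "jobs\t1", "words\t1");
        parse(sample).forEach((word, count) -> System.out.println(word + " -> " + count));
    }
}
```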
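Execution of the wordcount program without creating a jar file
Hadoop ships with an examples jar that already contains a wordcount program. From the /usr/local/hadoop/ folder, execute the following command
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /example1/emp.txt output1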
Prepared by
Jiju K Joseph, AP/CSE