BIG DATA LAB sahil today.docx

Submitted by –
Roll No –
Registration No –
Subject Code – CAP457
Subject Name - Big Data
Submitted to – Dr. Apash Roy

1. Document the steps for installing Cloudera in a virtual machine, with
screenshots.
Ans:
● Step1: Download the latest version of VirtualBox from the VirtualBox
website, https://www.virtualbox.org/wiki/Downloads, choosing the package for
your operating system (Windows, Mac or Linux).
● Step2: Create a new virtual machine (VM).

In the VirtualBox window, click the "New" button on the toolbar.
● Step3: Enter the name of the virtual machine, the type of the operating system and
the specific version of the operating system, then click the "Next" button.

● Step4: Enter the amount of memory required by the VM and click the "Next"
button.

● Step5: Select the option to create a new virtual hard disk, then click the
"Create" button.

● Step6: Select VDI (VirtualBox Disk Image), then click the "Next" button.

● Step7: Select the "Dynamically allocated" option, then click the "Next"
button.

● Step8: Enter the location, name and size of the virtual disk and click the
"Create" button.

● Installing Cloudera in Oracle VirtualBox:
● Step1: Download the Cloudera QuickStart VM zip file from this link:
https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.13.0-0-virtualbox.zip

● Step2: After the download finishes, open the zip file and click the Extract
button to extract it.
● Step3: Open the extracted folder and click the first file in it.

● Step4: The file opens in Oracle VirtualBox. Click the Import button. The import
takes some time (it is a local operation, so the time depends on your machine).

● Step5: Click the Cloudera QuickStart VM to start Cloudera. Booting takes some time.
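
The download and import steps above can also be sketched from a host terminal. This is a minimal sketch, assuming that VBoxManage (installed together with VirtualBox), curl and unzip are available on the PATH. The name of the .ovf file inside the archive is an assumption and should be adjusted after extraction. By default the script only prints the commands; setting DRY_RUN=0 executes them.

```shell
#!/usr/bin/env bash
# Sketch: download, extract and import the Cloudera QuickStart appliance.
# Commands are only printed by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

ZIP_URL="https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.13.0-0-virtualbox.zip"
ZIP_FILE="${ZIP_URL##*/}"            # file name part of the URL
# Assumption: the .ovf is named after the zip; adjust to the real path after extraction.
OVF_FILE="${ZIP_FILE%.zip}.ovf"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

import_quickstart() {
  run curl -L -O "$ZIP_URL"            # download the zip file
  run unzip "$ZIP_FILE"                # extract it
  run VBoxManage import "$OVF_FILE"    # same effect as the Import button
}

import_quickstart
```

After a real import, the VM appears in the VirtualBox window and can be started from there as in Step5.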

2. Explain how to share files from the host to the virtual machine and vice
versa.
Ans:
● Step1: First, open Cloudera in Oracle VirtualBox.

● Step2: Click Hue, then sign in to Hue.
USERNAME-cloudera
PASSWORD-cloudera

● Step3: Close Hue, create a new folder on the host Desktop and put some content in
this folder.

● Step4: Go back to the Cloudera VM window and click Devices -> Shared Folders ->
Shared Folders Settings.
● Step5: The Shared Folders dialog opens. Click the add folder option.

● Step6: Under Folder Path choose Other -> select the folder you just created in
its saved location (Desktop) -> check the Auto-mount option -> check the Make
Permanent option -> click the OK button.

● Step7: Create a new folder in Cloudera: right-click the desktop and choose the
Create Folder option.

● Step8: Mount the folder (host to virtual machine).
Go to the terminal.
Type the su command at the $ prompt.
Enter the password: cloudera.
Then type the following command to mount the folder:
mount -t vboxsf Debanjana /home/cloudera/Desktop/window_files

● Step9: Open the folder created in Cloudera (window_files). It shows the
content saved in the shared host folder. Files saved into window_files from the
virtual machine appear in the host folder as well, which covers the virtual
machine to host direction.
● Step10: Unmount the folder when you are done.
Go to the terminal.
Type the su command at the $ prompt.
Enter the password: cloudera.
Then type the following command to unmount the folder (umount takes only the
mount point):
umount /home/cloudera/Desktop/window_files
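
The mount and unmount steps can be wrapped in a small script. This is a minimal sketch, assuming the share name Debanjana and the mount point used above. The run helper and the DRY_RUN switch are illustrative additions: by default the commands are only printed, and with DRY_RUN=0 they are executed, which needs root inside the virtual machine.

```shell
#!/usr/bin/env bash
# Sketch: mount and unmount a VirtualBox shared folder inside the guest.
# Commands are only printed by default; set DRY_RUN=0 and run as root to execute.
DRY_RUN="${DRY_RUN:-1}"

SHARE_NAME="Debanjana"                              # share name from the Shared Folders dialog
MOUNT_POINT="/home/cloudera/Desktop/window_files"   # folder created on the Cloudera desktop

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

mount_share() {
  run mkdir -p "$MOUNT_POINT"                       # make sure the mount point exists
  run mount -t vboxsf "$SHARE_NAME" "$MOUNT_POINT"
}

unmount_share() {
  # umount takes only the mount point, not the share name or file system type.
  run umount "$MOUNT_POINT"
}

mount_share
unmount_share
```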

Q.3. Document how to create tables in Cloudera, with screenshots.
Ans:
● Step1: First, open Cloudera in Oracle VirtualBox.
● Step2: Go to the terminal.
Type the hive command at the $ prompt to enter the Hive shell, then press the
Enter key.

● Step3: Create a database using the create database command with the
database name, for example “create database debanjana;”, then press the
Enter key.

● Step4: Create the following tables in the database (debanjana):
1.hotel
2.room
3.booking
4.guest
Use the create table command with the table name and its attributes.
1. create hotel table:
create table debanjana.hotel (hotel_no int, name varchar(20),
address varchar(50));
Then describe the table with the desc command: desc debanjana.hotel;

2. create room table:
create table debanjana.room (room_no int, hotel_no int, type char(4),
price decimal(10,2));
Then describe the table with the desc command: desc debanjana.room;
3. create booking table:
create table debanjana.booking (hotel_no int, guest_no int, room_no int,
date_from date, date_to date);

Then describe the table with the desc command: desc debanjana.booking;
4. create guest table:
create table debanjana.guest (guest_no int, name varchar(20), address
varchar(50));
Then describe the table with the desc command: desc debanjana.guest;
● Step5: Check that all the tables were created in the database in Hive.
Open the Cloudera QuickStart VM → click Hue → click
Editor → click Hive → click the database you created
(debanjana) → check that all the tables are listed.

● Step6: Use the 'use' command to select a specific database and
perform operations in that database.
For example, type use debanjana; and press the Enter key.

● Step7: Type the show tables; command to list all the tables created in the
database (debanjana).
● Step8: Insert values into the created tables.
1.hotel
2.room
3.booking
4.guest
1. insert values into hotel table:
● a. insert into debanjana.hotel values(1, 'Dooars mountain', 'Dooars');
→ then press the Enter key.

b. insert into debanjana.hotel values(2, 'sinclairs', 'Dooars'); → then press the
Enter key.
● Then write select * from followed by the table name, for example
select * from hotel; It shows the rows inserted into the table.
● Then go to the Cloudera QuickStart VM → click Hue → click
Query → click Editor → click Hive → click the created
database (debanjana) → write the query in the Hive editor.

select * from hotel; → then press the Enter key. It shows the values inserted
into the table (hotel).
2. insert values into room table:
● a. insert into debanjana.room values(101,1,'V',3000.00);
● Then write select * from followed by the table name, for example
select * from room; It shows the rows inserted into the table.

select * from room; → then press the Enter key. It shows the values inserted
into the table (room).
3. insert values into booking table:
● a. insert into debanjana.booking values(1,1,101,'2022-03-26','2022-03-28');

select * from booking; → then press the Enter key. It shows the values
inserted into the table (booking).
4. insert values into guest table:
● a. insert into debanjana.guest values(1,'Anurag','kolkata');
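
The Hive statements from the steps above can also be collected into one script file and run non-interactively. This is a minimal sketch, assuming the hive command-line client is on the PATH inside the virtual machine. The file name hotel_schema.hql and the if not exists guards are additions for illustration and are not part of the original steps.

```shell
#!/usr/bin/env bash
# Sketch: write the Hive statements for the hotel database into one script file.
HQL_FILE="${HQL_FILE:-hotel_schema.hql}"   # hypothetical file name

cat > "$HQL_FILE" <<'EOF'
create database if not exists debanjana;
create table if not exists debanjana.hotel (hotel_no int, name varchar(20), address varchar(50));
create table if not exists debanjana.room (room_no int, hotel_no int, type char(4), price decimal(10,2));
create table if not exists debanjana.booking (hotel_no int, guest_no int, room_no int, date_from date, date_to date);
create table if not exists debanjana.guest (guest_no int, name varchar(20), address varchar(50));
insert into debanjana.hotel values(1, 'Dooars mountain', 'Dooars');
insert into debanjana.room values(101, 1, 'V', 3000.00);
insert into debanjana.booking values(1, 1, 101, '2022-03-26', '2022-03-28');
insert into debanjana.guest values(1, 'Anurag', 'kolkata');
use debanjana;
show tables;
EOF

# Run the script only where the hive client is installed (for example inside the VM).
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL_FILE"
else
  echo "hive not found; statements written to $HQL_FILE"
fi
```

Keeping the statements in a file makes it easy to recreate the same tables after the virtual machine is reset.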

Q.4. Document how to import data from Windows to Cloudera, with
screenshots.
Ans:
● Step1: First, open Cloudera in Oracle VirtualBox.
● Step2: Download the .txt file that ma'am shared on lpulive.
The shared file is named a.txt.

● Step3: In the Cloudera VM window → click Devices → click Shared
Clipboard → click Bidirectional.

● Step4: Create a file on the Cloudera desktop.
Right-click the Cloudera desktop → click the Create Document
option → click the Empty File option → rename the file if you
want (my file is named Data) → press the Enter key.
● Step5: Open the created file (Data) → write something in
the file → click the Save button.

● Step6: Open the Cloudera terminal → open the already
downloaded file (a.txt) on the host → copy its content → paste it into the
Cloudera terminal → press the Enter key.

● Step7: Type the hive command in the terminal to enter the Hive shell →
then press the Enter key.
● Step8: Type the show tables; command to list all the tables created in
Hive.

● Step9: Type the select count(*) from customers; command in the
Cloudera terminal. It counts the number of records in that table.
● Step10: Type select * from customers; to show the content of
the customers table.

● Step11: Create a table (test) with the attribute id (integer data type)
using this command:
create table test(id int);
● Step12: Use the desc command with the table name (test) to describe the table.
desc test;

● Step13: To add more columns to an existing table, use the
alter table statement with the table name (test) and the new column name with
its data type (here name, of string type).
alter table test add columns (name string);
Then use desc test; to describe the table.
● Step14: To rename a column in an existing table, use
the alter table statement with the table name (test) and change, giving the old
column name, the new column name and the data type.
alter table test change id test_id int;

● Step15: To replace the column names and data types of
an existing table (test), use alter table with the table name and replace
columns, listing the new column names and their data types.
alter table test replace columns (testid int, testname char(20));
● Step16: The row format delimited clause tells Hive that the table data is stored
as delimited text, one row per line.
In addition,
fields terminated by ','; specifies that the fields in each row are separated by a
single comma (,) in the created table.

● Step17: We have already created a file named Data on the Cloudera
Desktop (step 4). To load data from the Cloudera desktop into the
created table (test), use the command below:
load data local inpath '/home/cloudera/Desktop/Data' into table test;
● Step18: Go to the Cloudera QuickStart VM → click the Hue editor →
click the default database → click the refresh button to check that the tables
were created in the database.
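
To make the row format delimited and load data steps concrete, the following minimal sketch builds a small comma-separated file and a matching Hive script. The table name test_csv, the two sample rows and both file paths are hypothetical and chosen only for illustration. The split_rows helper previews how a comma delimiter splits each line into fields without needing Hive.

```shell
#!/usr/bin/env bash
# Sketch: a comma-delimited Hive table and a data file that matches it.
DATA_FILE="${DATA_FILE:-/tmp/test_data.csv}"   # hypothetical sample data file
HQL_FILE="${HQL_FILE:-/tmp/load_test.hql}"     # hypothetical script file

# Two sample rows: id and name separated by a single comma.
printf '1,alpha\n2,beta\n' > "$DATA_FILE"

# The table definition names the same delimiter that the data file uses.
cat > "$HQL_FILE" <<EOF
create table if not exists test_csv (id int, name string)
row format delimited
fields terminated by ',';
load data local inpath '$DATA_FILE' into table test_csv;
select * from test_csv;
EOF

# Preview how the delimiter splits each row into fields (no Hive needed).
split_rows() {
  awk -F',' '{ print "id=" $1 " name=" $2 }' "$1"
}
split_rows "$DATA_FILE"

# Run the script only where the hive client is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL_FILE"
fi
```

If the delimiter in the table definition does not match the one in the file, Hive cannot split the line into fields and the columns typically come back as NULL.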

5. Document the steps for interacting with the Hadoop file system of
Cloudera in a virtual machine, with screenshots.
Ans:
● Step1: First open Cloudera in the virtual machine.
● Step2: Create a file on the Cloudera desktop.
Right-click the Cloudera desktop → click the Create Document option
→ click the Empty File option → rename the file if you want
(my file is named Debanjana) → press the Enter key.

● Step3: Open the created file (Debanjana) → write
something in it → click the Save button.

● Step4: Open the Cloudera terminal.
1. To list the files in your HDFS home directory:
hadoop fs -ls or hdfs dfs -ls

2. To list the root directory of HDFS:
hdfs dfs -ls /
● Step5: The help command shows all the Hadoop file system commands.
hadoop fs -help

● Step6: To create a directory in HDFS, use the mkdir command followed by the
directory name (here Bigdata).
hadoop fs -mkdir Bigdata

● Step7: To remove an empty directory in HDFS, use the rmdir command followed
by the directory name (here Bigdata).
hadoop fs -rmdir Bigdata
● Step8: To display the content of a file, use the cat command followed by the
file name (here Debanjana).
hdfs dfs -cat Debanjana
● Step9: To copy a file to a new file, use the cp command with the old file
name followed by the new file name.
hdfs dfs -cp Debanjana New

● Step10: To copy a file from the Cloudera desktop (the local file system) into
HDFS, use the command below:
hadoop fs -copyFromLocal /home/cloudera/Desktop/DebanjanaNiogi
Now use the ls command (hadoop fs -ls) to show the file.

● Step11: Using the hadoop jar command.
The hadoop jar command runs a program contained in a jar file.
JAR stands for Java ARchive.
Users can bundle their MapReduce code in a jar file and execute it
using the command below.
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar

● Step12: Here we use the wordcount program to count how many times each
word occurs in the input file.
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar \
wordcount Debanjana DebanjanaNiogi

● Step13: A new output directory named DebanjanaNiogi is created in HDFS
(you can browse it from Hue).
● Step14: This directory contains two files → open
the part-r-00000 file.

It shows how many times each word occurs in your file.

● Step15: Now locate the jar file that was run:
1. First open Computer in Cloudera.

2. Then go to File System.
3. Then go to the usr folder.
4. Then go to the lib folder.

5. Then go to the hadoop-mapreduce folder in lib.
6. Then open the hadoop-mapreduce-examples.jar file.

7. Then click a file shown inside it to see the content of that file.
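
To get a feel for what the wordcount output in part-r-00000 looks like, the same counting can be approximated locally with standard tools. This is a minimal sketch and a local stand-in only, not the MapReduce program itself. It assumes that words are separated by whitespace and that each output line is a word, a tab and its count; the count_words function name and the sample text are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: a local stand-in for wordcount that prints "word<TAB>count" lines.
count_words() {
  # Split on whitespace, drop empty lines, sort bytewise, then count duplicates.
  tr -s '[:space:]' '\n' < "$1" \
    | sed '/^$/d' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

# Small sample input (hypothetical content).
sample="$(mktemp)"
printf 'big data big\nlab data big\n' > "$sample"

count_words "$sample"
# Prints:
# big     3
# data    2
# lab     1

rm -f "$sample"
```

The counting is case-sensitive and punctuation stays attached to words, so "Data" and "data," would be counted as separate words.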
