Student Data Management for Data Gyan
PROJECT PURPOSE:
To collect student data from the firm's internal records and from trusted external websites, and to update the database with the key attributes that indicate each student's potential.
Source of Data:
Internal student data from the firm and external data from online sources such as Facebook, LinkedIn, and Naukri.com.
Tools Used:
• Microsoft Excel
• Microsoft SQL Server Database
• Tableau Desktop
PROJECT TITLE
• Acquiring student data from different sources and updating it in the database.
• The overall strategy of the project is to draw insights from the data and to identify the pivotal features that bring out each student's potential.
• All the insights obtained from the analysis will be presented to the management team.
• Data collection was done in two parts:
1. Internal Data:
As the first step, the management team provided us with the internal data for the analytical process.
2. External Data:
External data was extracted from online sources such as LinkedIn, Facebook, and Naukri. Both the internal and the external data were recorded in Microsoft Excel for further processing.
Internal Data:
We received the internal data from the management team of Data Gyan. Its attributes were Name, Gender, Email, Contact No, Educational background, Age, Courses pursued, Fees, Installment (Y/N), and Preference (Weekend or Weekdays).
LinkedIn:
As an external data source, this data was gathered through the marketing team. Its attributes were Name, Gender, Email, Contact No, Age, Course pursued, Fees, Educational background, and Experience in years.
Naukri:
As an external data source, this data was gathered through the marketing team. Its attributes were Name, Gender, Email, Contact No, State, Fees, Educational background, Skills, and Experience in years.
Facebook:
As an external data source, this data was gathered through the marketing team. Its attributes were Name, Gender, Email, Contact No, State, Educational background, and Hobbies.
Once the data had been extracted from the internal and external sources, it was assembled in separate Excel sheets. These sheets were then merged into a single spreadsheet called Master data.
A detailed description of the steps followed to collect the data and collate it into the master database is as follows:
• Initially, a template for the master data sheet was prepared in MS Excel, and the important attributes to be captured in it, and subsequently in the database, were finalized after thorough discussions with the team.
• Attributes such as Name, Email, Contact no, Age, Gender, State, Course pursued, Skills, Pass out year, Pass out month, and Educational background were collected for 500+ students from the internal and external data into the Master data sheet.
Before the data was transferred into the master sheet, it was cleansed and modified to make it uniform.
The attributes whose data were modified are as follows:
LinkedIn: Name, Email, course, Skills, University & Experience.
Naukri: Name, Email, Qualifications, Specializations, Fees, Pass out year, Pass out
month, Skills, State, University, Experience in years.
Facebook: Name, Email, State, University, Hobbies.
Some sources contain only the full name, whereas others contain the first name and last name as separate fields. We therefore derived the full name where only the first and last names were given, and split the full name into first and last names where only the full name was given.
The Master data must contain only cleaned data, so we applied methods such as data cleansing, data profiling, and data mining to prepare it.
The methods used in this process are as follows:
CONCATENATION:
Concatenation is the process of merging two or more strings into a single output.
• In the example shown, concatenation was used to merge 'First name' and 'Last name' to get the 'Full Name'.
• Concatenation was also used to merge 'First name' and 'Email (from test table)' to get the final Email_Id.
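The name handling described above can be sketched in Python. This is only an analogue of the Excel step, not the formulas the project used (those are not shown in the deck); the function names `full_name` and `split_name` and the rule of splitting on the first space are assumptions made for illustration.

```python
from typing import Tuple


def full_name(first: str, last: str) -> str:
    """Join first and last name with a single space, skipping empty parts."""
    parts = [p.strip() for p in (first, last) if p and p.strip()]
    return " ".join(parts)


def split_name(full: str) -> Tuple[str, str]:
    """Split a full name on its first run of whitespace.

    Assumption: the first token is the first name and everything
    after it is the last name.
    """
    parts = full.strip().split(None, 1)
    if not parts:
        return "", ""
    first = parts[0]
    last = parts[1] if len(parts) > 1 else ""
    return first, last
```

Splitting on the first space is a simplification; names with multi-word first names would need a manual check.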
VLOOKUP (Vertical Lookup):
It is a function that makes Excel search for a value in the first column of a table range and return a value from a different column in the same row.
• In the examples shown, VLOOKUP was used to fill in the Skills and Course columns.
• VLOOKUP looked up a value in the test table, given the table array, the column number, and the Boolean value FALSE, which returns only an exact match for the given column.
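An exact-match lookup of this kind can be sketched in Python as follows. The table contents, the email keys, and the function name `vlookup_exact` are hypothetical; the sketch only mirrors the behaviour of VLOOKUP with FALSE as its last argument.

```python
# Hypothetical test table: first column is the lookup key, second is the skill.
test_table = [
    ("asha@example.com", "SQL"),
    ("ravi@example.com", "Tableau"),
]


def vlookup_exact(key, table, col_index):
    """Return the value in column col_index (1-based) of the first row
    whose first column equals key, or None when there is no match
    (Excel would show #N/A instead).

    Note: this comparison is case-sensitive, whereas Excel's text
    matching in VLOOKUP is not.
    """
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return None
```

For example, `vlookup_exact("ravi@example.com", test_table, 2)` picks the skill from the second column of the matching row.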
Nested IF:
A nested IF tests multiple conditions by placing one IF function inside another. Excel allows up to 64 IF functions to be nested.
• In the examples shown, nested IF was used to derive the data for a particular column.
• First, a condition on a related column was given in an IF statement, together with its value if true. A second condition was added in another IF statement, again with its value if true. In the final IF statement both the value if true and the value if false were given to produce the output.
IF Condition:
• IF is a logical function used for decision making: it tests a condition on a cell and returns one value if the condition is TRUE and another if it is FALSE.
• In the example shown, a condition was given in an IF statement, with a value if true and a value if false, to get the final output.
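The IF and nested IF logic maps directly onto `if` / `elif` / `else` in Python. The deck does not say which columns or thresholds were used, so the installment flag, the experience bands, and both function names below are purely illustrative assumptions.

```python
def opted_installment(flag: str) -> bool:
    """Single IF: True when the Installment (Y/N) cell holds 'Y'."""
    return flag.strip().upper() == "Y"


def experience_band(years: int) -> str:
    """Nested IF: each condition is tested only if the previous one failed.

    The band names and cut-offs are hypothetical, not taken from the project.
    """
    if years == 0:
        return "Fresher"
    elif years <= 2:
        return "Junior"
    else:
        return "Experienced"
```

The final `else` plays the role of the value-if-false in the innermost IF.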
After the data has been retrieved and combined from multiple sources (extracted), and cleaned and formatted (transformed), it is loaded into a storage system. In this case we used a Microsoft SQL Server database.
Steps involved in importing the data:
Step 1: Expand Databases > Capstone Project (database) > Tasks > Import Data.
Step 2: Choose the data source (Microsoft Excel), set the file path and Excel version, and click Next.
Step 3: Select the Master data sheet, optionally edit the column mappings, and click Next.
Step 4: The data is loaded into SQL Server, and a success message is displayed once it has reached the destination.
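The same load step can be sketched in code. The project used the SQL Server import wizard; the sketch below swaps in Python's built-in `sqlite3` with an in-memory database purely so that it runs anywhere, and the two sample rows, the reduced column set, and the function name `load_master` are assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical two-row extract standing in for the Master data sheet.
MASTER_CSV = """Name,Gender,State,Skills
Asha Rani,Female,Bihar,SQL
Ravi Kumar,Male,Punjab,Tableau
"""


def load_master(conn, csv_text):
    """Create the Masterdata table if needed and insert every CSV row.

    Returns the number of rows read from the CSV text.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Masterdata "
        "(Name TEXT, Gender TEXT, State TEXT, Skills TEXT)"
    )
    conn.executemany(
        "INSERT INTO Masterdata (Name, Gender, State, Skills) "
        "VALUES (:Name, :Gender, :State, :Skills)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the SQL Server database
loaded = load_master(conn, MASTER_CSV)
row_count = conn.execute("SELECT COUNT(*) FROM Masterdata").fetchone()[0]
```

Comparing `loaded` with `row_count` is a simple check that every row reached the destination, similar to the wizard's success message.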
Skills wise Student count:
Query :
select Skills, count(skills) as total_skills_count from Masterdata group by
Skills order by total_skills_count desc;
OUTPUT:
Age wise Student count:
Query :
select Age, count(age) as total_age_count from Masterdata group by
Age order by total_age_count desc;
OUTPUT:
Fees wise Student count:
Query :
select Fees, count(fees) as total_fees_count from Masterdata group by
Fees order by total_fees_count desc;
OUTPUT:
Preference wise Student count:
Query :
select Preference, count(preference) as total_preference_count from Masterdata
group by Preference order by total_preference_count desc;
OUTPUT:
Installments wise Student count:
Query :
select [Installment_Y/N], count([Installment_Y/N]) as total_installment_count
from Masterdata group by [Installment_Y/N] order by total_installment_count
desc;
OUTPUT:
Specializations wise Student count:
Query :
select Specializations, count(specializations) as total_specializations_count
from Masterdata group by Specializations order by total_specializations_count
desc;
OUTPUT:
Experience wise Student count:
Query :
select Experience_in_years, count(Experience_in_years) as total_experience_count
from Masterdata group by Experience_in_years order by total_experience_count
desc;
OUTPUT:
Course wise Student count:
Query :
select Course, count(course) as total_course_count from Masterdata group by
Course order by total_course_count desc;
OUTPUT:
University wise Student count:
Query :
select University, count(university) as total_university_count from Masterdata
group by University order by total_university_count desc;
OUTPUT:
Qualifications wise Student count:
Query :
select Qualifications, count(Qualifications) as total_qualifications_count from
Masterdata group by Qualifications order by total_qualifications_count desc;
OUTPUT:
State wise Student count:
Query :
select State, count(state) as total_state_count from Masterdata group by
State order by total_state_count desc;
OUTPUT:
Gender wise Student count:
Query :
select Gender, count(gender) as total_gender_count from Masterdata group by
Gender order by total_gender_count desc;
OUTPUT:
Hobbies wise Student count:
Query :
select Hobbies, count(hobbies) as total_hobbies_count from Masterdata group by
Hobbies order by total_hobbies_count desc;
OUTPUT:
Year wise Student count:
Query :
select Passout_year, count(Passout_year) as total_passout_year_count from
Masterdata group by Passout_year order by total_passout_year_count desc;
OUTPUT:
Month wise Student count:
Query :
select Passout_month, count(Passout_month) as total_passout_month_count from
Masterdata group by Passout_month order by total_passout_month_count desc;
OUTPUT:
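All of the queries above follow one pattern: group by a column, count it, and sort by the count in descending order. As a hedged sketch, the pattern can be wrapped in a single helper. It runs against an in-memory SQLite table rather than SQL Server, and the helper name `count_by`, the whitelist, and the three sample rows are assumptions for illustration.

```python
import sqlite3

# Only whitelisted column names are ever placed into the SQL text.
ALLOWED_COLUMNS = {"Skills", "Gender", "State"}  # hypothetical subset


def count_by(conn, column):
    """Return (value, count) pairs for one column, largest count first."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported column: {column}")
    sql = (
        f'SELECT "{column}", COUNT("{column}") AS total_count '
        f'FROM Masterdata GROUP BY "{column}" ORDER BY total_count DESC'
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Masterdata (Skills TEXT, Gender TEXT, State TEXT)")
conn.executemany(
    "INSERT INTO Masterdata VALUES (?, ?, ?)",
    [
        ("SQL", "Male", "Bihar"),
        ("SQL", "Female", "Bihar"),
        ("BI", "Male", "Punjab"),
    ],
)
skill_counts = count_by(conn, "Skills")
```

The whitelist is there because column names cannot be passed as bound parameters, so they must be validated before being formatted into the query.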
• In a SQL database it is easier to extract data according to our requirements.
• An organization may have a large number of master data files, so maintaining them in a database helps.
• An MS Excel worksheet is limited to 1,048,576 rows (about 10 lakh).
• Hence, when we need to deal with much larger volumes of data, importing into SQL Server is useful.
The data was then visualized to draw insights about potential business opportunities and target areas, so that Data Gyan can take important decisions such as:
• Identifying the places or areas where it can set up centers.
• Modifying or expanding its course portfolio.
• Identifying important areas for investment and coming up with appropriate marketing strategies.
• Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.
Skills wise Student count:
• There are 10 different skills: BI, C, C++, Excel, IT, Java, R, SQL, Tableau, and VBA.
• The text table shows that the largest numbers of students possess three skills: BI, SQL, and IT.
• The rest of the students possess the remaining 7 skills: C, C++, Excel, Java, R, Tableau, and VBA.
State wise Course:
• There are 5 courses available at Data Gyan: Data analytics, Business analytics, Software, Programming, and Database.
• The bar graph shows that Programming is the most preferred course among the students in Bihar.
• It also shows that Data analytics and Programming are the least preferred courses among the students of Rajasthan and Punjab respectively.
Installments wise Students:
• The circle map shows that 32 students have opted for the installment payment mode and 531 students have not, so we conclude that most students have opted for the one-time payment mode.
Qualifications wise Students:
• There are 8 different qualifications: BA, BBA, BCom, BSC, BTech, MBA, MCA, and MSC.
• The highlighted table shows that the largest numbers of students hold four qualifications: BA, BSC, BTech, and MCA.
• Fewer students hold BCom, MBA, or MSC.
• The fewest students hold a BBA.
Year wise Students:
• The students passed out in 3 academic years: 2018, 2019, and 2020.
• The bar graph shows that 57 students passed out in 2018, 224 in 2019, and 282 in 2020.
Month wise Students:
• The students passed out in 3 months: July, August, and October.
• The line graph shows that 57 students passed out in October, 224 in July, and 282 in August.
Specializations wise Students:
• The bubble chart shows 8 specializations: Physics, Software Development, Solid Mechanics, History, Accounting, Chemistry, Finance, and Entrepreneurship.
• The largest numbers of students have four specializations: Software Development, Solid Mechanics, History, and Chemistry.
• Fewer students specialize in Physics, Accounting, or Finance.
• The fewest students specialize in Entrepreneurship.
Experience wise Students:
• The bar graph shows that students have between 0 and 7 years of experience.
• The largest number of students have 2 years of experience.
• The smallest number of students have 4 years of experience.
Gender wise Students:
• The pie chart shows the split of the total number of students:
• Male: 388
• Female: 175
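As a quick arithmetic check, the counts reported for the installment, pass-out year, and gender charts can be totalled and converted to percentage shares. The counts below are the ones stated in the slides; the dictionary layout and the helper name `shares` are assumptions for this sketch.

```python
# Counts as reported in the slides.
breakdowns = {
    "installment": {"Yes": 32, "No": 531},
    "passout_year": {"2018": 57, "2019": 224, "2020": 282},
    "gender": {"Male": 388, "Female": 175},
}


def shares(counts):
    """Convert raw counts to percentage shares rounded to one decimal."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# Each breakdown should cover the same set of students.
totals = {name: sum(counts.values()) for name, counts in breakdowns.items()}
```

All three breakdowns add up to 563 students, which is consistent with the 500+ students collected in the Master data sheet.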
State wise Students:
• The map shows that students from all over India are interested in pursuing courses at the Data Gyan Institute.
• We conclude from the map that the largest number of interested students are from Bihar.
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
How to deliver Powerpoint Presentations.pptx
How to deliver Powerpoint  Presentations.pptxHow to deliver Powerpoint  Presentations.pptx
How to deliver Powerpoint Presentations.pptx
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 

Data management

  • 1. Student Data Management for Data Gyan. PROJECT PURPOSE: To focus on student data collected from various trusted sources and to update it in the database, highlighting the key aspects of each student's potential. Source of Data: Internal student data from the firm, and external data from online sources such as Facebook, LinkedIn, and Naukri.com. Tools Used: Microsoft Excel, Microsoft SQL Server Database, Tableau Desktop. PROJECT TITLE
  • 2. Acquiring student data from different sources and updating it in the database. The overall strategy of the project is to draw insights from the data and to identify the pivotal features that bring out each student's potential. All insights obtained from the analysis will be presented to the management team.
  • 4. Data collection was done in two parts: 1. Internal Data: As the first step, the management team provided us with the internal data for the analysis. 2. External Data: External data was extracted from online sources such as LinkedIn, Facebook, and Naukri. Both the internal and external data were recorded in Microsoft Excel for further processing.
  • 9. Internal Data: We received the internal data from the management team of Data Gyan. Its attributes were Name, Gender, Email, Contact No, Educational background, Age, Courses pursued, Fees, Installment (Y/N), and Preference (Weekend or Weekdays). LinkedIn: As an external data source, this data was gathered through the marketing team. Its attributes were Name, Gender, Email, Contact No, Age, Course pursued, Fees, Educational background, and Experience in years. Naukri: As an external data source, this data was gathered through the marketing team. Its attributes were Name, Gender, Email, Contact No, State, Fees, Educational background, Skills, and Experience in years. Facebook: As an external data source, this data was gathered through the marketing team. Its attributes were Name, Gender, Email, Contact No, State, Educational background, and Hobbies. Once the internal and external data had been extracted, each source was assembled in its own Excel sheet. The sheets were then merged into a single spreadsheet called Master data.
  • 11. The steps followed to collect the data and collate it into the master database were as follows: First, a template for the master data sheet was prepared in MS Excel, and the important attributes to be captured in it (and subsequently in the database) were finalized after thorough discussion with the team. Attributes such as Name, Email, Contact no, Age, Gender, State, Course pursued, Skills, Pass out year, Pass out month, and Educational background were then collected for 500+ students from the internal and external data into the master data sheet.
  • 12. Before the data was transferred into the master sheet, it was cleansed and modified to make it uniform. The attributes whose data was modified are as follows: LinkedIn: Name, Email, Course, Skills, University, and Experience. Naukri: Name, Email, Qualifications, Specializations, Fees, Pass out year, Pass out month, Skills, State, University, and Experience in years. Facebook: Name, Email, State, University, and Hobbies. Some sources contain only a full name, whereas others contain separate first and last names but no full name. We therefore derive the full name where first and last names are given, and split the full name into first and last names where only the full name is given.
  • 13. The master data must contain only cleaned data, so we applied methods such as data cleansing, data profiling, and data mining to prepare it. The methods used are as follows: CONCATENATION
  • 14. CONCATENATION: Concatenation is the process of merging two or more strings into a single output. In one referenced example, concatenation was used to merge 'First name' and 'Last name' to get the 'Full Name'. In another, concatenation was used to merge 'First name' and 'Email (from test table)' to get the final Email_Id.
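The name handling described on slides 12 and 14 can be sketched outside Excel as well. The Python below is only an illustration, not the project's own method (the project used Excel formulas): the function names `full_name` and `split_name` and the sample values are hypothetical, and splitting at the first space is an assumption about how multi-part names should be divided.

```python
def full_name(first: str, last: str) -> str:
    """Join first and last name with one space, skipping empty parts.

    Roughly what an Excel formula such as =A2 & " " & B2 produces.
    """
    parts = (first, last)
    return " ".join(p.strip() for p in parts if p and p.strip())


def split_name(full: str) -> tuple[str, str]:
    """Split a full name into (first, last) at the first space.

    Assumption: everything after the first space counts as the last name.
    """
    first, _, last = full.strip().partition(" ")
    return first, last.strip()


# Hypothetical sample values, not taken from the project data.
print(full_name("Asha", "Verma"))
print(split_name("Asha Kumari Verma"))
```

A record that arrives with only a full name would go through `split_name`, and one with separate first and last names through `full_name`, so every row ends up with all three fields.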
  • 15. VLOOKUP (Vertical Lookup): A function that makes Excel search for a value in the first column of a table array and return a value from a different column in the same row. In the referenced examples, VLOOKUP was used for the Skills and Course columns. VLOOKUP looked up a value in the test table, given the table array, the column number, and the Boolean value FALSE, which makes it return an exact match for that column.
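As a rough illustration of what an exact-match VLOOKUP does, here is a minimal Python sketch. The function `vlookup`, the table layout, and the sample rows are assumptions made for this example; the project itself used the Excel function.

```python
def vlookup(value, table, col_index):
    """Mimic =VLOOKUP(value, table, col_index, FALSE).

    Scans the first cell of each row for an exact match and returns the
    cell at col_index (1-based, as in Excel). Returns None where Excel
    would show #N/A. Note: Excel compares text case-insensitively, while
    this sketch uses Python's case-sensitive ==.
    """
    for row in table:
        if row[0] == value:
            return row[col_index - 1]
    return None


# Hypothetical lookup table: (email, skill, course).
test_table = [
    ("a@example.com", "SQL", "Database"),
    ("b@example.com", "Java", "Programming"),
]

print(vlookup("b@example.com", test_table, 3))  # course for the second row
print(vlookup("missing@example.com", test_table, 2))  # no match
```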
  • 16. Nested IF: A nested IF tests multiple conditions by placing one IF function inside another. Excel allows up to 64 nested IF levels.
  • 17. In the referenced examples, a nested IF was used to derive the data for a particular column. First, a condition on a related column was given in an IF statement together with its value if true. A second condition was then given in another IF statement, again with its value if true. In the final IF statement, both the value if true and the value if false were specified to produce the output.
  • 18. IF Condition: IF is a logical function used for decision making. It tests the content of a cell against a condition and returns one value if the condition is true and another if it is false. In the referenced example, a condition was given in an IF statement together with its true and false values to produce the final output.
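The IF and nested IF logic on slides 16 to 18 maps onto an ordinary if/elif chain. The sketch below is hypothetical: the slides do not say which column was derived, so the experience bands, their thresholds, and the function name `experience_band` are invented purely to show the shape of the logic.

```python
def experience_band(years: float) -> str:
    """Hypothetical rule, equivalent to the Excel formula
    =IF(A2=0, "Fresher", IF(A2<=2, "Junior", "Experienced")).
    """
    if years == 0:        # first IF: condition plus its value if true
        return "Fresher"
    if years <= 2:        # second (nested) IF with its value if true
        return "Junior"
    return "Experienced"  # value if false of the innermost IF


for years in (0, 2, 7):
    print(years, experience_band(years))
```

Each additional condition adds one more branch before the final fallback, just as each extra nested IF adds one more level in the Excel formula.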
  • 19. After data is retrieved and combined from multiple sources (extracted) and then cleaned and formatted (transformed), it is loaded into a storage system. In this case we used a SQL Server database. Steps involved in importing the data:
  • 23. Procedure for importing data: Step 1: Expand Databases > Capstone Project (database) > Tasks > Import Data. Step 2: Choose the data source (Microsoft Excel), set the file path and Excel version, and click Next. Step 3: Select the Master data sheet, optionally edit the mappings, and click Next. Step 4: The data is loaded into SQL Server, and a success message is displayed once it has reached the destination.
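The load step can also be scripted rather than run through the import wizard. The sketch below is an assumption-heavy stand-in: it uses Python's built-in `sqlite3` in place of SQL Server, CSV text in place of the Excel sheet, and only three of the master sheet's columns. The sample rows and the helper name `load_master_data` are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for the exported master sheet.
SAMPLE_CSV = """Name,Skills,State
Student A,SQL,Bihar
Student B,Java,Punjab
"""


def load_master_data(csv_text: str, conn: sqlite3.Connection) -> int:
    """Create the Masterdata table and insert every CSV row into it."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute("CREATE TABLE Masterdata (Name TEXT, Skills TEXT, State TEXT)")
    conn.executemany(
        "INSERT INTO Masterdata (Name, Skills, State) "
        "VALUES (:Name, :Skills, :State)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is saved
loaded = load_master_data(SAMPLE_CSV, conn)
print(loaded, "rows loaded")
```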
  • 24. Skill-wise student count: Query: SELECT Skills, COUNT(Skills) AS total_skills_count FROM Masterdata GROUP BY Skills ORDER BY total_skills_count DESC;
  • 25. Age-wise student count: Query: SELECT Age, COUNT(Age) AS total_age_count FROM Masterdata GROUP BY Age ORDER BY total_age_count DESC;
  • 26. Fees-wise student count: Query: SELECT Fees, COUNT(Fees) AS total_fees_count FROM Masterdata GROUP BY Fees ORDER BY total_fees_count DESC;
  • 27. Preference-wise student count: Query: SELECT Preference, COUNT(Preference) AS total_preference_count FROM Masterdata GROUP BY Preference ORDER BY total_preference_count DESC;
  • 28. Installment-wise student count: Query: SELECT [Installment_Y/N], COUNT([Installment_Y/N]) AS total_installment_count FROM Masterdata GROUP BY [Installment_Y/N] ORDER BY total_installment_count DESC;
  • 29. Specialization-wise student count: Query: SELECT Specializations, COUNT(Specializations) AS total_specializations_count FROM Masterdata GROUP BY Specializations ORDER BY total_specializations_count DESC;
  • 30. Experience-wise student count: Query: SELECT Experience_in_years, COUNT(Experience_in_years) AS total_experience_count FROM Masterdata GROUP BY Experience_in_years ORDER BY total_experience_count DESC;
  • 31. Course-wise student count: Query: SELECT Course, COUNT(Course) AS total_course_count FROM Masterdata GROUP BY Course ORDER BY total_course_count DESC;
  • 32. University-wise student count: Query: SELECT University, COUNT(University) AS total_university_count FROM Masterdata GROUP BY University ORDER BY total_university_count DESC;
  • 33. Qualification-wise student count: Query: SELECT Qualifications, COUNT(Qualifications) AS total_qualifications_count FROM Masterdata GROUP BY Qualifications ORDER BY total_qualifications_count DESC;
  • 34. State-wise student count: Query: SELECT State, COUNT(State) AS total_state_count FROM Masterdata GROUP BY State ORDER BY total_state_count DESC;
  • 35. Gender-wise student count: Query: SELECT Gender, COUNT(Gender) AS total_gender_count FROM Masterdata GROUP BY Gender ORDER BY total_gender_count DESC;
  • 36. Hobby-wise student count: Query: SELECT Hobbies, COUNT(Hobbies) AS total_hobbies_count FROM Masterdata GROUP BY Hobbies ORDER BY total_hobbies_count DESC;
  • 37. Year-wise student count: Query: SELECT Passout_year, COUNT(Passout_year) AS total_passout_year_count FROM Masterdata GROUP BY Passout_year ORDER BY total_passout_year_count DESC;
  • 38. Month-wise student count: Query: SELECT Passout_month, COUNT(Passout_month) AS total_passout_month_count FROM Masterdata GROUP BY Passout_month ORDER BY total_passout_month_count DESC;
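The queries on slides 24 to 38 all share one pattern: group by a column, count, and sort by the count in descending order. A small helper can express that pattern once. The sketch below runs against an in-memory SQLite table as a stand-in for the project's SQL Server database. The three sample rows, the reduced column set, and the names `count_by` and `ALLOWED_COLUMNS` are assumptions for illustration only.

```python
import sqlite3

# Column names cannot be bound as query parameters, so they are checked
# against a fixed whitelist before being placed into the SQL text.
ALLOWED_COLUMNS = {"Skills", "State", "Gender"}


def count_by(conn: sqlite3.Connection, column: str):
    """Return (value, count) pairs for one column, largest count first."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column!r}")
    sql = (
        f'SELECT "{column}", COUNT("{column}") AS total_count '
        f'FROM Masterdata GROUP BY "{column}" ORDER BY total_count DESC'
    )
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Masterdata (Skills TEXT, State TEXT, Gender TEXT)")
conn.executemany(
    "INSERT INTO Masterdata VALUES (?, ?, ?)",
    [  # hypothetical rows, not the project's data
        ("SQL", "Bihar", "F"),
        ("SQL", "Bihar", "M"),
        ("Java", "Punjab", "M"),
    ],
)

print(count_by(conn, "Skills"))
```

As in the original queries, `COUNT(column)` skips NULL values, so rows with a missing value in the grouped column are reported with a count of 0 rather than being tallied.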
  • 39. In a SQL database it is easier to extract data according to our requirements. An organization may have a large number of master data files, so maintaining a database helps. MS Excel can hold only about 10 lakh (1,048,576) rows per worksheet. Importing into SQL is therefore useful when we need to deal with much larger volumes of data. Next, the data is visualized to draw insights about potential business opportunities and target areas, so that Data Gyan can make important decisions such as: identifying places where it can set up centers; modifying or expanding its course portfolio; and identifying important areas for investment and devising appropriate marketing strategies.
  • 40. Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.
  • 43. Skill-wise student count: There are 10 different skills: BI, C, C++, Excel, IT, Java, R, SQL, Tableau, and VBA. The text table shows that most students possess one of three skills: BI, SQL, or IT. The remaining students possess the other seven skills: C, C++, Excel, Java, R, Tableau, and VBA.
  • 44. State-wise courses: Data Gyan offers 5 courses: Data analytics, Business analytics, Software, Programming, and Database. The bar graph shows that Programming is the most preferred course among students in Bihar. It also shows that Data analytics and Programming are the least preferred courses among students in Rajasthan and Punjab respectively.
  • 45. Installment-wise students: The circle map shows that 32 students opted for the installment payment mode and 531 did not, so the large majority of students chose one-time payment.
  • 46. Qualification-wise students: There are 8 different qualifications: BA, BBA, BCom, BSc, BTech, MBA, MCA, and MSc. The highlighted table shows that most students hold one of four qualifications: BA, BSc, BTech, or MCA. A few students hold BCom, MBA, or MSc. The fewest students hold a BBA.
  • 47. Year-wise students: The students passed out in 3 academic years: 2018, 2019, and 2020. The bar graph shows that 57 students passed out in 2018, 224 in 2019, and 282 in 2020.
  • 48. Month-wise students: The students passed out in 3 months: July, August, and October. The line graph shows that 57 students passed out in October, 224 in July, and 282 in August.
  • 49. Specialization-wise students: The bubble chart shows 8 specializations: Physics, Software Development, Solid Mechanics, History, Accounting, Chemistry, Finance, and Entrepreneurship. Most students have one of four specializations: Software Development, Solid Mechanics, History, or Chemistry. A few students specialize in Physics, Accounting, or Finance. The fewest students pursue Entrepreneurship.
  • 50. Experience-wise students: The bar graph shows that students have between 0 and 7 years of experience. The largest number of students have 2 years of experience. The smallest number have 4 years of experience.
  • 51. Gender-wise students: The pie chart shows that, out of the total number of students, 388 are male and 175 are female.
  • 52. State-wise students: The map shows that students from all over India are interested in pursuing courses at Data Gyan Institute. The largest number of interested students come from Bihar.
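The counts quoted on slides 45, 47, and 51 can be cross-checked against one another and turned into percentage shares. The short Python sketch below uses only the figures reported on those slides; the dictionary layout and the helper name `shares` are choices made for this illustration.

```python
# Counts as reported on slides 45 (installments), 47 (year), 51 (gender).
reported = {
    "installment": {"Yes": 32, "No": 531},
    "passout_year": {"2018": 57, "2019": 224, "2020": 282},
    "gender": {"Male": 388, "Female": 175},
}

# Each breakdown should add up to the same number of students.
totals = {name: sum(counts.values()) for name, counts in reported.items()}
consistent = len(set(totals.values())) == 1


def shares(counts: dict) -> dict:
    """Convert raw counts into percentages rounded to one decimal place."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


print(totals, "consistent:", consistent)
print(shares(reported["installment"]))
print(shares(reported["gender"]))
```

All three breakdowns sum to 563 students, which agrees with the "500+ students" stated on slide 11.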