Data Warehousing and Business Intelligence
Project
on
Women Empowerment and Gender Equality
Alekhya Bhupati
x18132634
MSc/PGDip Data Analytics – 2019/20
Submitted to: Sean Heeney
National College of Ireland
Project Submission Sheet – 2019/2020
School of Computing
Student Name: Alekhya Bhupati
Student ID: x18132634
Programme: MSc Data Analytics
Year: 2019/20
Module: Data Warehousing and Business Intelligence
Lecturer: Sean Heeney
Submission Due
Date:
12/04/2019
Project Title: Women Empowerment and Gender Equality
I hereby certify that the information contained in this (my submission) is information
pertaining to my own individual work that I conducted for this project. All information
other than my own contribution is fully and appropriately referenced and listed in the
relevant bibliography section. I assert that I have not referred to any work(s) other than
those listed. I also include my TurnItIn report with this submission.
ALL materials used must be referenced in the bibliography section. Students are
encouraged to use the Harvard Referencing Standard supplied by the Library. To use
other author’s written or electronic work is an act of plagiarism and may result in
disciplinary action. Students may be required to undergo a viva (oral examination) if there
is suspicion about the validity of their submitted work.
Signature:
Date: April 12, 2019
PLEASE READ THE FOLLOWING INSTRUCTIONS:
1. Please attach a completed copy of this sheet to each project (including multiple copies).
2. You must ensure that you retain a HARD COPY of ALL projects, both for
your own reference and in case a project is lost or mislaid. It is not sufficient to keep
a copy on computer. Please do not bind projects or place in covers unless specifically
requested.
3. Assignments that are submitted to the Programme Coordinator office must be placed
into the assignment box located outside the office.
Office Use Only
Signature:
Date:
Penalty Applied (if
applicable):
Table 1: Mark sheet – do not edit
Criteria Mark Awarded Comment(s)
Objectives of 5
Related Work of 10
Data of 25
ETL of 20
Application of 30
Video of 10
Presentation of 10
Total of 100
Project Check List
This section captures the core requirements that the project entails, represented as a check
list for convenience.
[x] Used LaTeX template
[x] Three Business Requirements listed in introduction
[x] At least one structured data source
[x] At least one unstructured data source
[x] At least three sources of data
[x] Described all sources of data
[x] All sources of data are less than one year old, i.e. released after 17/09/2017
[x] Inserted and discussed star schema
[x] Completed logical data map
[x] Discussed the high level ETL strategy
[x] Provided 3 BI queries
[x] Detailed the sources of data used in each query
[x] Discussed the implications of results in each query
[x] Reviewed at least 5-10 appropriate papers on topic of your DWBI project
Women Empowerment and Gender Equality
Alekhya Bhupati
x18132634
April 12, 2019
Abstract
Gender equality is a human right that allows all people to live with dignity and
freedom regardless of gender. Women's empowerment plays a crucial role in reducing
gender inequality, and it also paves the way for the economic development of
countries. Unfortunately, women have been victims of gender discrimination in our
society for centuries. This paper considers the status of women in fields such as
education, employment and politics in different countries. It presents the important
factors of inequality that exist in countries worldwide and how they have changed
over the past few years, to give an idea of the extent to which women are
empowered. Data science is a suitable approach for this analysis, because it
supports clear visualization and analysis of the present growth and of predictions
for improving gender equality in the future.
1 Introduction
Gender inequality has become one of the major concerns of society. Women have struggled
for centuries to achieve equality in all aspects of life, yet they still lag behind in
some areas of development. Women face gender discrimination in many areas, such as
education, senior positions in companies and membership of parliament. This
discrimination indirectly leads to the economic decline of countries: when women are
well educated and empowered, families improve, which in turn leads to the economic
development of society. Nowadays women are breaking out of the limits that society set
for them, and as a result we can see some growth in the position of women in society.
These concerns made me think about the necessity of gender equality, and so I chose
this topic for my project. My data sets contain data on the employability of women and
men, seats held by women in national parliaments, literacy rate, mean years of
schooling and the global gender gap index. Using these data and BI queries, we can
analyze how far the development of women has progressed and where it is lagging.
(Req-1) My first requirement is to analyze the employability of women in different sectors
(Agriculture, Industry and Services) and in politics, comparing the gap between
men and women by region for the years 2010 to 2018.
(Req-2) My second requirement is to compare the literacy rate and mean years of schooling
of women and men, and to analyze the gap between them over the years from 1990 to 2016.
Source | Type | Brief Summary
ILO (International Labour Organisation) | Structured | Provides the percentage of employment of men and women in different sectors in all countries from 2006 to 2018.
UN data | Structured | Provides the seats held by women in national parliaments for selected years from 1990 to 2018.
UNICEF | Structured | Contains the literacy rate of males and females from 1986 to 2016, from the 2018 statistical update.
UNDP | Structured | Two data sets were downloaded from this site, mean years of schooling of males and mean years of schooling of females for the years 1986 to 2016, from the 2018 statistical update. Both data sets are combined and structured using R.
Statista | Structured | Provides information about the ten most populous countries in the world for the year 2018.
The World Bank | Structured and Unstructured | One structured data set, the Global Gender Gap Index from 2006 to 2018, was downloaded from this source. A second, unstructured data set of countries classified by region and income level was scraped using R.
Table 2: Summary of sources of data used in the project
(Req-3) My third requirement is to analyze whether this gender gap depends on the region or
income level of the countries, or on the population of the country.
2 Data Sources
Sources of data used in the project.
2.1 Source 1: ILO
The ILO data set contains the employment of women and men in different sectors. It lets
us compare the employability rate of women and men and see in which sectors the
participation of women needs to improve; this is visualized in the first business
intelligence (BI) query. The link for the data is https://www.ilo.org/wesodata/
2.2 Source 2: UN data
The downloaded CSV file contains the percentage of seats held by women in national
parliaments. With this data we can analyze how far women are empowered in parliament
and how much growth takes place each year. Source 2 is combined with Source 1 and used
in our first BI query to show the growth rate of women in different sectors in
comparison with men. The source link is http://data.un.org/
2.3 Source 3: UNICEF
The link for the data source is https://data.unicef.org/topic/education/literacy/.
This site provides a structured data set with the literacy rate of men and women over
the years; literacy is one of the main components of the economic growth of countries.
This data is visualized in our second BI query.
2.4 Source 4: UNDP
UNDP stands for United Nations Development Programme; its site holds a large amount of
data related to the development of countries. From this site we took two structured
data sets: mean years of schooling of males and mean years of schooling of females.
Education helps to build a society that is strong both economically and
technologically. Source 4 is therefore used together with Source 3 in BI query 2 to
analyze how mean years of schooling affects the literacy rate and how far women lag
behind men in education. The link for the data is
http://hdr.undp.org/en/data
2.5 Source 5: Statista
Statista is the fifth source and is also used as structured data. The link for the data set is
https://www.statista.com/statistics/262879/countries-with-the-largest-population/.
This data set contains the ten most populous countries in the year 2018. We use the
population of these countries to find out whether gender equality depends on population
or on other factors, which we examine in our third BI query.
2.6 Source 6: The World Bank
The World Bank is the last source, from which two data sets are used, one structured
and one unstructured. The structured data set is available at
https://tcdata360.worldbank.org/indicators/af52ebe9?country=BRA&indicator=27959&viz=line_chart&years=2006,2018&indicators=944&compareBy=region
and contains the global gender gap index for different indicators for all countries,
from which we can analyze the extent of the gender inequality that women face.
The second data set from this source,
https://datahelpdesk.worldbank.org/knowledgebase/articles/906519, was web scraped
using R. It classifies the countries of the world by region and by income level. We
use these parameters to get a clearer idea of the areas and fields in which gender
inequality exists and of what should be done in particular fields to improve the
empowerment of women; this is visualized in our BI queries later.
3 Related Work
A number of studies have shown that sustainable development is impossible without
women’s empowerment and gender equality. Consequently, it is asserted that gender
equality is both a human rights issue and a precondition for, and indicator of, sustainable
development (Alvarez and Lopez, 2013).
Providing women and girls with equal access to education, health care, decent work,
and representation in political and economic decision-making processes will fuel
sustainable economies and benefit societies and humanity at large (United Nations (n.d.)
Sustainable Development). Gender inequality can be observed in several aspects of
daily life, such as access to education, job opportunities, and economic resources (United
Nations Development Programme [UNDP], 2015). Compared to women, men have
greater access to the use of force, greater access to resource control, and more
advantageous cultural ideologies.
According to the United Nations (n.d.) Sustainable Development report, in 18 countries
husbands can legally prevent their wives from working; in 39 countries,
daughters and sons do not have equal inheritance rights; and 49 countries lack laws
protecting women from domestic violence. Only 52 per cent of women who are married or in
a union freely make their own decisions about sexual relations, contraceptive use and
health care.
In 2011, only 20 percent of low-income nations had achieved gender parity in
primary education, and 66 percent of the world's 774 million illiterate adults were
women. There is consensus that gender equity is an important goal to be achieved (e.g.,
UN Women, 2011). By 2016, the women's literacy rate had increased to 89 percent (UNICEF).
Women are making sustained progress in education, but they are still not getting equal
opportunities to prove themselves.
Globally, the labour force participation rate for men and women aged 15 and over
continues its long-term decline; it stands at 61.8 per cent in 2018, down by 1.4
percentage points over the past decade. The decline in women's participation rate has been
slower than that of men, resulting in a slight narrowing of the gender gap. These trends
reflect different patterns across the life cycle, resulting from changes in both education
participation among youth and, at the other end of the scale, older workers' retirement
choices. The headline finding, however, is that, on average around the world, women
remain much less likely to participate in the labour market than men. At 48.5 per cent
in 2018, women's global labour force participation rate is 26.5 percentage points below
that of men (table 1). Since 1990, this gap has narrowed by 2 percentage points, with
the bulk of the reduction occurring in the years up to 2009. The rate of improvement,
which has been slowing since 2009, is expected to grind to a halt during 2018-21, and
possibly even reverse, potentially negating the relatively minor improvements in gender
equality in access to the labour market achieved over the past decade (International
Labour Organization, Trends for Women 2018).
Greater participation of women in social and political sphere is essential to make the
social and political institutions more representative. It serves as a tool for empowerment
of women and contributes to gender sensitive decision making. Globally, there are 29
States in which women account for less than 10 per cent of parliamentarians in single or
lower houses, as of November 2018, including 4 chambers with no women at all (ibid.).
Wide variations remain in the average percentages of women parliamentarians in each
region. As of November 2018, these were (single, lower and upper houses combined):
Nordic countries, 42.3 percent; Americas, 30 percent; Europe including Nordic countries,
27.7 percent; Europe excluding Nordic countries, 26.6 percent; sub-Saharan Africa, 23.6
percent; Asia, 19.4 percent; Arab States, 17.8 percent; and the Pacific, 17 percent (Inter-
Parliamentary Union, 2018).
Women's representation in local governments can make a difference. Research on
panchayats (local councils) in India found that the number of drinking water projects
in areas with women-led councils was 62 per cent higher than in those with men-led
councils. In Norway, a direct causal relationship between the presence of women in
municipal councils and childcare coverage was found (Chattopadhyay and Duflo, 2004;
Bratton and Ray, 2002). Women demonstrate political leadership by working across party
lines through parliamentary women's caucuses, even in the most politically combative
environments, and by championing issues of gender equality, such as the elimination of
gender-based violence, parental leave and childcare, pensions, gender-equality laws and
electoral reform.
4 Data Model
This section describes the dimensions created in SSIS. There are seven dimension tables
and one fact table, connected in a star schema as shown in Figure 1. The star schema
separates the business process data into facts, which hold the measurable data, and
dimensions, which describe the facts.
Figure 1: Star Schema
A detailed discussion of the facts and dimensions follows:
Dim Country: Dim Country contains the names of all the countries. This data was
obtained from The World Bank and Statista.
Dim Region: Dim Region contains the names of all the regions, obtained by selecting
the distinct regions so that the data can be analyzed by region. This data was gathered
from The World Bank.
Dim Income: Dim Income contains the income levels of the countries, obtained by
selecting the distinct income levels so that the data can be analyzed by income level.
This data was also gathered from The World Bank.
Dim Sector: Dim Sector contains the different sectors of occupation of men and
women. This data was collected from two sources, ILO and UN data.
Dim Gender: Dim Gender contains the gender values Female and Male. It is an
important dimension in our project, as every field is compared by gender to identify
gender inequalities. The data was obtained from ILO, UNICEF, UNDP and The World
Bank.
Dim Year: Dim Year contains the years, which are used to analyze how the growth
rate changes over time. The data was collected from ILO, UNICEF, UNDP and The
World Bank.
Dim Indicator: Dim Indicator contains the fields that make up the gender gap index,
such as the educational attainment index and the health and survival index. By combining
this dimension with other dimensions and the fact table, we can analyze the gender gap
between men and women.
Fact Women Empowerment: The fact table is linked to all the dimensions. It contains
the foreign keys of the connected dimensions (country id, region id, income level id,
sector id, gender id and year id) and the measures: percentage of women and men in
different sectors, percentage of literacy, mean years of schooling, global gender
gap subindex and number of inhabitants in millions.
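The relationship between the dimensions and the fact table described above can be sketched in a few lines of R. This is a minimal illustration only, not the SSIS implementation used in the project: the staging rows, the values and the helper make_dim are hypothetical, and only three of the seven dimensions are shown.

```r
# Toy staging rows; the country names and values are made up for illustration.
staging <- data.frame(
  Country = c("Country A", "Country A", "Country B", "Country B"),
  Gender = c("Female", "Male", "Female", "Male"),
  Year = c(2016, 2016, 2016, 2016),
  Literacy_Rate = c(80, 90, 70, 85),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per distinct value plus a surrogate key column.
make_dim <- function(values, name) {
  levels_found <- sort(unique(values))
  dim_table <- data.frame(seq_along(levels_found), levels_found,
                          stringsAsFactors = FALSE)
  names(dim_table) <- c(paste0(name, "_ID"), name)
  dim_table
}

dim_country <- make_dim(staging$Country, "Country")
dim_gender <- make_dim(staging$Gender, "Gender")
dim_year <- make_dim(staging$Year, "Year")

# Fact table: each foreign key is looked up in its dimension with match(),
# and the measure is carried over unchanged.
fact <- data.frame(
  Country_ID = dim_country$Country_ID[match(staging$Country, dim_country$Country)],
  Gender_ID = dim_gender$Gender_ID[match(staging$Gender, dim_gender$Gender)],
  Year_ID = dim_year$Year_ID[match(staging$Year, dim_year$Year)],
  Literacy_Rate = staging$Literacy_Rate
)
print(fact)
```

Each fact row then stores only ids and measures, while the descriptive text lives once in its dimension table.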
5 Logical Data Map
Table 3: Logical Data Map describing all transformations, sources and destinations for all components of the
data model illustrated in Figure 1
Source | Source Column | Destination | Destination Column | Type | Transformation
ILO | Location | Dim Country | Country | Dimension | Renamed the column to Country and removed commas in the column
ILO | 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 | Dim Year | Year | Dimension | All year columns are combined into a single Year column using the melt function in R
ILO | Gender | Dim Gender | Gender | Dimension | No transformation done
ILO | Sector | Dim Sector | Sector | Dimension | No transformation done
ILO | No Header | Fact Women Empowerment | Percentage | Fact | Created a new column Percentage with the percentage values of all years using the melt function in R
UN data | Region Country Area | Dim Country | Country | Dimension | Renamed the column to Country and removed commas in the column
UN data | Year | Dim Year | Year | Dimension | No transformation done
UN data | No Header | Dim Gender | Gender | Dimension | Added a new column Gender in R
UN data | Series | Dim Sector | Sector | Dimension | Renamed the column to Sector and replaced the column text with Parliament using the replace function in R
UN data | Value | Fact Women Empowerment | Percentage | Fact | Added a new column Percentage; the percentage of seats held by men is obtained as 100 minus the percentage of seats held by women, and each value is assigned according to gender
UNICEF | No Header | Dim Year | Year | Dimension | All year columns are combined into a single Year column using the melt function in R
UNICEF | Youth literacy rate, population 15-24 years, female and Youth literacy rate, population 15-24 years, male | Dim Gender | Gender | Dimension | Both columns are combined using the melt function, and a Gender column is added to distinguish the literacy rate values of men and women
UNICEF | No Header | Fact Women Empowerment | Literacy Rate | Fact | Created a new column Literacy Rate with the percentage values of all years using the melt function in R
UNDP | Country | Dim Country | Country | Dimension | No transformation done
UNDP | No Header | Dim Year | Year | Dimension | All year columns are combined into a single Year column using the melt function in R
UNDP | No Header | Dim Gender | Gender | Dimension | Data is taken from two files, containing male and female schooling years; both sets of values are combined using the melt function in R, the column is named Literacy, and a new Gender column is added to distinguish the values
UNDP | No Header | Fact Women Empowerment | Schooling | Fact | Created a new column Schooling with the values of all years using the melt function in R
Statista | No Header | Dim Country | Country | Dimension | Added the column name Country in R
Statista | Inhabitants in millions | Fact Women Empowerment | Inhabitants | Fact | Renamed the column to Inhabitants in R and removed commas in the values
Statista | No Header | Dim Year | Year | Dimension | Added a new column Year and assigned the year 2018 in R
The World Bank | Country Name | Dim Country | Country | Dimension | Renamed the column to Country in R
The World Bank | Indicator | Dim Indicator | Indicator | Dimension | Shortened the column text
The World Bank | 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 | Dim Year | Year | Dimension | All year columns are combined into a single Year column
The World Bank | No Header | Fact Women Empowerment | Subindex | Fact | Created a new column Subindex with the subindex values of all years using the melt function in R
The World Bank | No Header | Dim Country | Country | Dimension | Scraped the country data from this source, named the column Country in R and removed commas in the text
The World Bank | No Header | Dim Region | Region | Dimension | Scraped the region data from this source, named the column Region in R and removed commas in the text
The World Bank | No Header | Dim Income Level | Income Level | Dimension | Scraped the income level data from this source, named the column Income Level in R and removed commas in the text
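Two transformations recur in the logical data map: combining the year columns into a single Year column, and deriving the men's share of parliamentary seats as 100 minus the women's share. The sketch below illustrates both. It uses base R only, so that it runs without extra packages, whereas the project scripts use melt from reshape2; the data frame and its values are made up for illustration.

```r
# Illustrative wide table: one column per year, named as read.csv() names them.
wide <- data.frame(
  Country = c("Country A", "Country B"),
  X2016 = c(10.5, 20.0),
  X2018 = c(12.0, NA),
  stringsAsFactors = FALSE
)
year_cols <- c("X2016", "X2018")

# Stack the year columns into long format: one row per (Country, Year).
long <- data.frame(
  Country = rep(wide$Country, times = length(year_cols)),
  Year = rep(year_cols, each = nrow(wide)),
  Percentage = unlist(wide[year_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Strip the "X" prefix that read.csv() puts in front of numeric column names.
long$Year <- as.numeric(sub("^X", "", long$Year))

# Drop rows with missing values.
long <- na.omit(long)

# Treat the values as women's share of seats; men's share is 100 minus that.
women <- long
women$Gender <- "Female"
men <- long
men$Gender <- "Male"
men$Percentage <- 100 - men$Percentage
parliament <- rbind(women, men)
print(parliament)
```

The long format has one measure column and one row per country, year and gender, which matches the shape of the fact table.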
6 ETL Process
ETL is a process in which data is extracted, cleaned and transformed into a format
that is easy to use, and then loaded into a database through a staging process.
Extract
Extraction is the first step in every ETL process. Exploratory data analysis is used
to identify the data that serves our purpose. How the data is obtained depends on the
kind of data we are looking for: it can be scraped or downloaded from different
sources, and it must be cleaned before it can be used properly. Here six sources are
used: ILO, UN data, UNICEF, UNDP, Statista and The World Bank, part of whose data is
web scraped using R. After the data is obtained from these sources, it is cleaned in R
and converted into CSV file format for further processing.
Transform
After the data has been collected in the extraction step, the next step is to transform
it into the form required by the target destination. Cleaning the data before loading
is essential for our analysis. It involves removing commas and special characters from
the text, removing null values and any unnecessary data, transposing rows and columns,
and adding or renaming columns so that they match the other sources. After this
cleaning process, the data is stored in CSV file format for the loading process.
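The cleaning steps listed above can be sketched as follows. This is a minimal illustration under assumed inputs: the data frame, its values and the choice to keep only letters, digits and spaces are hypothetical, and the project's own scripts are listed in the Appendix.

```r
# Made-up raw rows showing the kinds of problems described above.
raw <- data.frame(
  Country = c("Country A, Rep. of", "Country B", "Country C*"),
  Inhabitants = c("1,234.5", "987.6", NA),
  stringsAsFactors = FALSE
)
clean <- raw

# Drop a comma and everything after it in the country names.
clean$Country <- gsub(",.*", "", clean$Country)

# Remove any remaining characters other than letters, digits and spaces.
clean$Country <- gsub("[^A-Za-z0-9 ]", "", clean$Country)

# Remove thousands separators, then convert the text to numbers.
clean$Inhabitants <- as.numeric(gsub(",", "", clean$Inhabitants))

# Drop rows that still contain missing values.
clean <- na.omit(clean)
print(clean)
```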
Load
Once the data has been cleaned and transformed, the next step is to load it into the
staging area through an automated process in R. With the R scripts automated within
SSIS, all the sources populate raw tables, which are then used to create the fact table
and the dimension tables. The fact table holds the foreign keys (country id, region id,
income level id, sector id, gender id and year id) and the measures: percentage of
women and men in different sectors, percentage of literacy, mean years of schooling,
global gender gap subindex and number of inhabitants in millions. Likewise, each
dimension table contains a primary key id, which is connected to the fact table as a
foreign key to form the star schema. After creating the star schema in SSIS, we deploy
the cube and create the hierarchies. The database is then ready for analyzing and
visualizing our BI queries in Tableau.
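The cleaned files in this project are written as CSV with quote = F, so a comma left inside a text value would shift the columns when the file is read into the staging area. A simple round-trip check before loading can catch this. The sketch below is an assumption about how such a check might look, not part of the original scripts; the data values and the temporary file location are hypothetical.

```r
# Made-up cleaned rows standing in for one cleaned source.
cleaned <- data.frame(
  Country = c("Country A", "Country B"),
  Year = c(2018, 2018),
  Inhabitants = c(1234.5, 987.6),
  stringsAsFactors = FALSE
)

# Guard: unquoted CSV output cannot represent text values that contain commas.
stopifnot(!any(grepl(",", cleaned$Country, fixed = TRUE)))

# Hypothetical staging location; the project writes to Cleaned_Data_Files.
staging_file <- tempfile(fileext = ".csv")
write.csv(cleaned, file = staging_file, quote = FALSE, row.names = FALSE)

# Read the file back and confirm that shape and values survived the round trip.
reloaded <- read.csv(staging_file, stringsAsFactors = FALSE)
round_trip_ok <- ncol(reloaded) == ncol(cleaned) &&
  identical(cleaned$Country, reloaded$Country) &&
  isTRUE(all.equal(cleaned$Inhabitants, reloaded$Inhabitants))
print(round_trip_ok)
```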
7 Application
Below are the three BI queries, corresponding to the requirements noted in Section 1,
that we analyze in our project.
7.1 BI Query 1: Is there any growth of women in different
employment sectors and also in the political sector?
For this query, the contributing sources of data are ILO, UN data and The World Bank.
The resulting visualization is shown in Figure 2. It shows the employability of women
and men in different sectors by region for the years 2010 to 2018. Women have equal or
better opportunities in the Services sector, but they are strongly under-represented in
parliament in all regions, and the growth over the years is negligible. Of the other
two sectors, men and women have more or less equal opportunities in Agriculture, while
in Industry women have fewer opportunities than men. Overall, the graph shows that
women still face gender inequality, most of all in parliament.
Figure 2: Results for BI Query 1
7.2 BI Query 2: What is the status of growth of women in educational
attainment?
The data sets used for my second query are from UNICEF and UNDP. The resulting
visualization is shown in Figure 3. Here we compare the literacy rate and mean
years of schooling of men and women. The analysis shows that the mean years of
schooling of women increased over the years from 1990 to 2016, which in turn raised
the literacy rate. Hence there is significant growth in the educational attainment of
women, but a gender gap still exists when compared to men.
Figure 3: Results for BI Query 2
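The gender gap discussed for this query can be computed directly from long-format data, as a difference between the male and female values for each year. The sketch below is illustrative only: the values are made up, and the column names follow those used in the literacy script in the Appendix.

```r
# Made-up long-format rows; not the UNICEF figures.
literacy <- data.frame(
  Year = c(1990, 1990, 2016, 2016),
  Gender = c("Female", "Male", "Female", "Male"),
  Literacy_Rate = c(70, 80, 88, 92),
  stringsAsFactors = FALSE
)

# Split by gender, then join the two halves on Year.
female <- literacy[literacy$Gender == "Female", c("Year", "Literacy_Rate")]
male <- literacy[literacy$Gender == "Male", c("Year", "Literacy_Rate")]
gap <- merge(female, male, by = "Year", suffixes = c("_Female", "_Male"))

# Gap in percentage points: positive values mean women lag behind men.
gap$Gap <- gap$Literacy_Rate_Male - gap$Literacy_Rate_Female
print(gap)
```

A gap that shrinks from one year to the next corresponds to the narrowing described above, while any positive value shows that a gap remains.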
7.3 BI Query 3: On which factors does the gender gap depend:
region, income level or population of the
countries?
The data sets from The World Bank and Statista are used in this BI query. The
resulting visualization is shown in Figure 4. Here we compare the population, region
and income level of the countries using the Global Gender Gap index. We can see that
population does not affect the global gender gap directly, whereas the income level
and the region of the countries do. Hence measures should be taken to improve the
low-income economies, so that significant economic growth can be achieved.
Figure 4: Results for BI Query 3
7.4 Discussion
In this section I discuss the BI queries further and how they relate to the present
situation of women in society. My first query concerns the growth rate of women in
different sectors. The visualization shows a wide gender gap between men and women, and
also differences among women across regions. In South Asia and Sub-Saharan Africa, the
employment of women is high in Agriculture and low in Services, which indicates that
women face greater gender inequality in these regions than in others. In every region,
the gap between men and women in the political sector is very high.
The second BI query shows that mean years of schooling grew drastically over the years
from 1990 to 2016 and that the literacy rate grew correspondingly. Even though the
literacy rate is increasing, there is still a gap between men and women, and women lag
behind men.
The third BI query asks on which factors the gender gap depends. The population of
China is far higher than that of Brazil, yet the economic opportunity index of the two
countries is equal. From this we can say that population does not affect the gender
gap directly. In the same way, the graph by income level shows that the educational
attainment index of low-income economies is low compared with the other income groups.
We can therefore say that the region and income level of the countries affect the
gender gap.
8 Conclusion and Future Work
The following measures are recommended: providing education for women in less developed
countries, providing self-empowerment and self-help groups, and providing equal
opportunities for women in all sectors. Women should be given more seats in parliament.
Women should be encouraged to develop in the fields in which they are good and to make
a career in them. Beyond this, society should change its attitude towards women.
From the discussion above, it can be concluded that all three BI queries have been
analyzed and that the data is up to date in comparison with the reports and papers
reviewed above. New laws and awareness programs should be started for women's
empowerment. Data warehousing and business intelligence is a suitable approach to
analyzing the present situation and to forecasting the programs needed in future for
the development of women.
References
- Alvarez, Michelle Lopez (2013). From unheard screams to powerful voices: a case study
of Women’s political empowerment in the Philippines. 12th National Convention on
Statistics (NCS), EDSA Shangri-la Hotel, Mandaluyong City, October 12, 2013.
- International Labour Organization (2018a). World Employment and Social Outlook:
Trends for Women 2018 (Geneva).
- Inter-Parliamentary Union. Women in national parliaments, as at 1 November 2018.
- R. Chattopadhyay and E. Duflo (2004). Women as Policy Makers: Evidence from a
Randomized Policy Experiment in India, Econometrica 72(5), pp. 1409-1443; K. A.
Bratton and L. P. Ray, 2002, Descriptive Representation: Policy Outcomes and Municipal
Day-Care Coverage in Norway, American Journal of Political Science, 46(2), pp. 428-437.
- UNESCO, Education for All Global Monitoring Report 2013/14: Teaching and
Learning: Achieving Quality for All, UNESCO, Paris, 2014.
- UNICEF, The State of the World's Children 2015: Reimagine the Future: Innovation
for Every Child, UNICEF, New York, 2014.
- United Nations (n.d.) Sustainable Development. Gender Equality: Why It Matters.
Available at: http://www.un.org/sustainabledevelopment/gender-equality/
- United Nations Development Programme [UNDP] (2015). Human Development Report
2015. Work for Human Development. Available at:
http://hdr.undp.org/sites/default/files/2015humandevelopmentreport.pdf
- UN Women (2011). The Women's Empowerment Principles: Equality Means Business.
Available at:
http://www.unwomen.org/-/media/headquarters/attachments/sections/library/publications/2011/10/women-s-empowerment-principlesen
Appendix
Video link
https://youtu.be/w0fHvbUe8Jw
R code
#### Global gender gap - source: the World Bank
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
gender1 <- read.csv("Gap.csv", header = TRUE, sep = ",")
#Melt
gender1.m1 <-
  melt(gender1, id.vars = c("Country.ISO3", "Country.Name",
                            "Indicator", "Subindicator.Type"),
       measure.vars = c("X2006", "X2007",
                        "X2008", "X2009", "X2010", "X2011",
                        "X2012", "X2013", "X2014",
                        "X2015", "X2016", "X2018"))
#Melt again with named output columns (this result replaces the one above)
gender1.m1 <-
  melt(gender1, measure.vars = c("X2006", "X2007", "X2008",
                                 "X2009", "X2010", "X2011", "X2012",
                                 "X2013", "X2014", "X2015", "X2016", "X2018"),
       variable.name = "Year", value.name = "subindex")
gender1.m2 <- gender1.m1
#Filter
gender1.m2 <- filter(gender1.m2, Indicator %in% c(
  "Global Gender Gap Economic Participation and Opportunity Subindex",
  "Global Gender Gap Educational Attainment Subindex",
  "Global Gender Gap Health and Survival Subindex",
  "Global Gender Gap Political Empowerment subindex"))
#Remove first letter in a value
gender1.m2$Year_1 <- as.numeric(gsub("X", "", gender1.m2$Year))
#Remove column
gender1.m2 <- gender1.m2[, -5]
#Rename
gender1.m3 <- gender1.m2
names(gender1.m3)[names(gender1.m3) == "Country.ISO3"] <- "Country_ID"
names(gender1.m3)[names(gender1.m3) == "Country.Name"] <- "Country"
names(gender1.m3)[names(gender1.m3) ==
                    "Subindicator.Type"] <- "Subindicator_Type"
names(gender1.m3)[names(gender1.m3) == "Year_1"] <- "Year"
#Reorder Columns
gender1.m3 <- gender1.m3[c(1, 2, 6, 3, 4, 5)]
#Remove column
gender1.m3 <- gender1.m3[, -1]
#Remove spaces in strings
gender1.m4 <- gender1.m3
#to add underscore between strings
gender1.m4$Indicator <- gsub(" ", "_", gender1.m4$Indicator)
#Remove unnecessary data
gender1.m4$Indicator <- gsub("Global_Gender_Gap_", "",
                             gender1.m4$Indicator)
#Remove Commas and rest of the value
gender1.m5 <- gender1.m4
gender1.m5$Country <- gsub(",.*", "", gender1.m5$Country)
gender1.m6 <- gender1.m5
# Omit NA values
gender1.m6 <- na.omit(gender1.m6)
write.csv(gender1.m6,
          file = "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Global_Gender_gap.csv",
          quote = FALSE, row.names = FALSE)
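The reshaping step in the script above turns one column per year into one row per country and year, then strips the "X" prefix that R adds to numeric column names. The following is a minimal sketch of that idea on a toy table. It uses base R `stack()` rather than `reshape2::melt()` so that it runs without extra packages; the table, its values and the name `wide` are invented for illustration and are not taken from Gap.csv.

```r
# Toy wide table: one row per country, one column per year.
# All values are invented for illustration only.
wide <- data.frame(
  Country = c("A", "B"),
  X2006 = c(0.60, 0.70),
  X2007 = c(0.62, 0.71),
  stringsAsFactors = FALSE
)
year_cols <- c("X2006", "X2007")

# stack() puts the year columns on top of each other:
# 'values' holds the cells, 'ind' holds the source column name.
stacked <- stack(wide[year_cols])

# Rebuild the long table: repeat the id column once per year column
# and turn "X2006" into the number 2006.
long <- data.frame(
  Country = rep(wide$Country, times = length(year_cols)),
  Year = as.numeric(sub("^X", "", as.character(stacked$ind))),
  subindex = stacked$values,
  stringsAsFactors = FALSE
)
print(long)
```

The long table has one row per country and year, which is the shape the later filter and rename steps work on.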
##### Youth literacy rate - source: UNICEF
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
lit <- read.csv("Youth_Literacy_Rate.csv", header = TRUE, sep = ",")
#Rename
lit.d1 <- lit
lit.d1 <- lit.d1[, -2]
str(lit.d1)
#The year column's original name is garbled in the source; only its trailing
#".." survives, so it is matched by that suffix here (an assumption)
names(lit.d1)[grepl("\\.\\.$", names(lit.d1))] <- "Year"
names(lit.d1)[names(lit.d1) ==
                "Youth.literacy.rate..population.15.24.years..female"] <- "Female"
names(lit.d1)[names(lit.d1) ==
                "Youth.literacy.rate..population.15.24.years..male"] <- "Male"
#melt
lit.d2 <- melt(lit.d1, id.vars = c("Year"),
               measure.vars = c("Female", "Male"))
#melt again with named output columns (this result replaces the one above)
lit.d2 <- melt(lit.d1, measure.vars = c("Female", "Male"),
               variable.name = "Gender", value.name = "Literacy_Rate")
#omit NA Values
lit.d2 <- na.omit(lit.d2)
write.csv(lit.d2, file =
            "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Literacy_Rate.csv",
          quote = FALSE, row.names = FALSE)
###### Population - source: Statista
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
population <- read.csv(
  "World_Population_ten_largest_countries.csv",
  header = TRUE, sep = ",")
#Rename
population.d1 <- population
names(population.d1)[names(population.d1) ==
                       "Countries.with.the.largest.population.2018"] <- "Country"
names(population.d1)[names(population.d1) == "X"] <- "Inhabitants"
#Remove row from table
population.d2 <- population.d1
population.d2 <- population.d2[-c(1, 2), ]
#Adding a column
population.d3 <- population.d2
population.d3$Year <- 2018
#Reorder columns
population.d3 <- population.d3[c(1, 3, 2)]
#Remove Commas
population.d4 <- population.d3
population.d4$Inhabitants <-
  as.numeric(gsub(",", "", population.d4$Inhabitants))
#Omit NA values
population.d4 <- na.omit(population.d4)
write.csv(population.d4,
          file = "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Ten_Largest_Populated_Countries.csv",
          quote = FALSE, row.names = FALSE)
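The comma removal in the population script is needed because figures exported with thousands separators are read as text, and `as.numeric()` cannot parse them until the commas are gone. A minimal sketch of that cleaning step follows, assuming a toy data frame `population_toy` whose figures are invented and do not come from the Statista file.

```r
# Toy input; the figures are invented and do not come from the report's data.
population_toy <- data.frame(
  Country = c("A", "B", "C"),
  Inhabitants = c("1,415.05", "1,354.05", "n/a"),
  stringsAsFactors = FALSE
)

# Drop the thousands separators, then convert to numeric.
# "n/a" cannot be parsed and becomes NA (the warning is expected here).
population_toy$Inhabitants <- suppressWarnings(
  as.numeric(gsub(",", "", population_toy$Inhabitants, fixed = TRUE))
)

# na.omit() removes the row whose value could not be converted.
population_clean <- na.omit(population_toy)
print(population_clean)
```

Converting before calling `na.omit()` matters: rows with unparseable text only become `NA`, and so only become removable, after the conversion.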
###### Employment of women - source: ILO
library(plyr) #provides revalue(); load before dplyr
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/DWBI/Cleaned_Data_Files")
#read a file
emp1 <- read.csv("Employment_by_sector_1.csv", header = TRUE, sep = ",")
emp2 <- read.csv("Employment_by_sector_2.csv", header = TRUE, sep = ",")
emp3 <- read.csv("Employment_by_sector_3.csv", header = TRUE, sep = ",")
#remove comma
emp1$Employment.by.Sector..Rate... <-
  gsub(",", "",
       emp1$Employment.by.Sector..Rate...)
emp2$Employment.by.Sector..Rate... <-
  gsub(",", "",
       emp2$Employment.by.Sector..Rate...)
emp3$Employment.by.Sector..Rate... <-
  gsub(",", "",
       emp3$Employment.by.Sector..Rate...)
#remove rows
emp1 <- emp1[-c(1, 2), ]
emp2 <- emp2[-c(1, 2), ]
emp3 <- emp3[-c(1, 2), ]
es <- rbind(emp1, emp2, emp3)
es.d1 <- es
#remove columns
es.d1 <- es.d1[, -c(16, 17, 18, 19, 20)]
#melt
es.d2 <- es.d1
es.d2 <- melt(es.d1, id.vars = c("Employment.by.Sector..Rate...", "X", "X.1"),
              measure.vars =
                c("X.2", "X.3", "X.4",
                  "X.5", "X.6", "X.7", "X.8",
                  "X.9", "X.10", "X.11", "X.12", "X.13"))
str(es.d2)
#melt again with named output columns (this result replaces the one above)
es.d2 <- melt(es.d1, measure.vars =
                c("X.2", "X.3", "X.4",
                  "X.5", "X.6", "X.7", "X.8", "X.9",
                  "X.10", "X.11", "X.12", "X.13"),
              variable.name = "Year", value.name = "Percentage")
#Rename column
es.d3 <- es.d2
names(es.d3)[names(es.d3) == "Employment.by.Sector..Rate..."] <- "Country"
names(es.d3)[names(es.d3) == "X"] <- "Gender"
names(es.d3)[names(es.d3) == "X.1"] <- "Sector"
#Reorder Columns
es.d3 <- es.d3[c(1, 4, 2, 3, 5)]
#Revalue data
es.d4 <- es.d3
es.d4$Year= revalue(es.d4$Year , c(X.2 = 2007, X.3 = 2008, X.4 = 20
, X.8 = 2013, X.9 = 2014, X.10 =
es.d4$Country <- gsub(",", "", es.d4$Country)
es.d4$Percentage <- gsub("%", "", es.d4$Percentage)
#Remove NA Values
es.d4 <- na.omit(es.d4)
write.csv(es.d4, file = C:/ Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Emp
##### Mean years of schooling - source: UNDP
library(plyr)
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
schm <- read.csv("Mean years of schooling_Male.csv", header = TRUE, sep = ",")
#remove column
schm.d1 - schm[, -which(names(schm) %in% c(Mean.years.of.schooling .. male ..
X.14,X.16,X.18,X.20,X.22,X
#remove row
schm.d1 <- schm.d1[-c(1), ]
names(schm.d1)[names(schm.d1) == "X"] <- "Country"
schm.d1 <- schm.d1[-c(173, 174, 175, 176, 177, 178, 179, 180), ]
#melt
schm.d2 <- melt(schm.d1, id.vars = c("Country"),
measure.vars = c(X.1,X.3,X.5,X.7,X.9,X.11,X.13,X.1
schm.d2= melt(schm.d1, measure.vars = c(X.1,X.3,X.5,X.7,X.9,X.11,
variable.name = "Year", value.name = "Mean_years_schooling")
#Rename column
schm.d3 <- schm.d2
schm.d3$Year= revalue(schm.d3$Year , c(X.1 = 1990, X.3 = 1995, X.5 =
#Add column
schm.d3$Gender <- "Male"
#Reorder columns
schm.d4 <- schm.d3[c(1, 2, 4, 3)]
#Remove comma
schm.d5 <- schm.d4
schm.d5$Country <- gsub(",", "", schm.d5$Country)
#Omit NA
schm.d5 <- na.omit(schm.d5)
# ############################################################################
#read a file
schf <- read.csv("Mean years of schooling_Female.csv", header = TRUE, sep = ",")
#remove column
schf.d1 - schf[, -which(names(schf) %in% c(Mean.years.of.schooling .. female .
X.14,X.16,X.18,X.20,X.22
#remove row
schf.d1 <- schf.d1[-c(1), ]
names(schf.d1)[names(schf.d1) == "X"] <- "Country"
schf.d1 <- schf.d1[-c(173, 174, 175, 176, 177, 178, 179, 180), ]
#melt
schf.d2 <- melt(schf.d1, id.vars = c("Country"),
measure.vars = c(X.1,X.3,X.5,X.7,X.9,X.11,X.13,X
schf.d2= melt(schf.d1, measure.vars = c(X.1,X.3,X.5,X.7,X.9,X.11,
variable.name = "Year", value.name = "Mean_years_schooling")
#Rename column
schf.d3 <- schf.d2
schf.d3$Year= revalue(schf.d3$Year , c(X.1 = 1990, X.3 = 1995, X.5 =
#Add column
schf.d3$Gender <- "Female"
#Reorder columns
schf.d4 <- schf.d3[c(1, 2, 4, 3)]
#Remove comma
schf.d5 <- schf.d4
schf.d5$Country <- gsub(",", "", schf.d5$Country)
#Omit NA
schf.d5 <- na.omit(schf.d5)
# ############################################################################
#Combine male and female data
Mean_years_of_schooling <- rbind(schm.d5, schf.d5)
#Write csv
write.csv(Mean_years_of_schooling , file = C:/ Users/MOLAP/Documents/DWBI/Clea
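The male and female schooling tables above go through the same three steps: placeholder column names are recoded to years, a Gender column is added, and the two tables are stacked with `rbind()`. The sketch below shows that pattern on toy data. It uses a named lookup vector in base R instead of `plyr::revalue()`, and the helper `recode_year()`, the data frames `male_toy` and `female_toy` and all schooling values are hypothetical. Only the two mappings visible in the script (X.1 to 1990 and X.3 to 1995) are used.

```r
# Lookup from placeholder column names to years.
# Only the two mappings visible in the script above are included.
year_lookup <- c("X.1" = 1990, "X.3" = 1995)

# Toy long tables; the schooling values are invented for illustration.
male_toy <- data.frame(
  Country = c("A", "A"),
  Year = c("X.1", "X.3"),
  Mean_years_schooling = c(5.0, 5.5),
  stringsAsFactors = FALSE
)
female_toy <- data.frame(
  Country = c("A", "A"),
  Year = c("X.1", "X.3"),
  Mean_years_schooling = c(4.0, 4.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: recode Year, tag the rows with a gender,
# and put the columns in the order Country, Year, Gender, value.
recode_year <- function(df, gender) {
  df$Year <- unname(year_lookup[df$Year])
  df$Gender <- gender
  df[c("Country", "Year", "Gender", "Mean_years_schooling")]
}

combined <- rbind(recode_year(male_toy, "Male"),
                  recode_year(female_toy, "Female"))
print(combined)
```

Putting the shared steps in one function means the male and female tables cannot drift apart in column order, which `rbind()` requires to match.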

Table 1: Mark sheet – do not edit

Criteria        Mark Awarded    Comment(s)
Objectives      of 5
Related Work    of 10
Data            of 25
ETL             of 20
Application     of 30
Video           of 10
Presentation    of 10
Total           of 100
Project Check List
This section captures the core requirements that the project entails, represented as a check list for convenience.
Used LaTeX template
Three Business Requirements listed in introduction
At least one structured data source
At least one unstructured data source
At least three sources of data
Described all sources of data
All sources of data are less than one year old, i.e. released after 17/09/2017
Inserted and discussed star schema
Completed logical data map
Discussed the high level ETL strategy
Provided 3 BI queries
Detailed the sources of data used in each query
Discussed the implications of results in each query
Reviewed at least 5-10 appropriate papers on topic of your DWBI project
  • 5. Women Empowerment and Gender Equality
Alekhya Bhupati
x18132634
April 12, 2019

Abstract
Gender equality is a human right: it lets all people live with dignity and freedom regardless of gender. Women's empowerment plays a crucial role in reducing gender inequality, and it also paves the way for the economic development of countries. Yet for centuries women have been the victims of gender discrimination in society. This paper considers the status of women in education, employment and politics in different countries. It presents important factors of inequality that exist worldwide and how they have changed over the past few years, to give an idea of the extent to which women are empowered. Data analysis and visualization suit this task because they show present growth and support forecasts that can guide improvements in gender equality.

1 Introduction
Gender inequality has become one of the major concerns of society. Despite centuries of struggle for equality in all aspects of life, women still lag in several areas of development. Women face discrimination in many areas, such as education, senior positions in companies and membership of parliament. This discrimination indirectly harms the economies of countries: when women are well educated and empowered, families improve, which in turn leads to the economic development of society. Nowadays women are breaking out of the limits that society set for them, and as a result we can see some growth in the position of women in society. These concerns made me think about the necessity of gender equality, and so I chose this topic for my project.
My data sets cover the employment of women and men, seats held by women in national parliaments, literacy rate, mean years of schooling and the Global Gender Gap Index. Using these data in BI queries, we can analyze how far the development of women has progressed and where it is lagging.
(Req-1) My first requirement is to analyze the employment of women in different sectors (Agriculture, Industry and Services) and in politics, and to compare the gap between men and women by region for the years 2010 to 2018.
(Req-2) My second requirement is to compare the literacy rate and mean years of schooling of women and men, and to analyze the gap between them over the years from 1990 to 2016.
  • 6. (Req-3) My third requirement is to analyze whether the gender gap depends on the region, the income level or the population of a country.

Source | Type | Brief Summary
ILO (International Labour Organization) | Structured | Provides the percentage of employment of men and women in different sectors in all countries from 2006 to 2018.
UN data | Structured | Provides the seats held by women in national parliament for selected years from 1990 to 2018.
UNICEF | Structured | Contains the literacy rate of males and females from 1986 to 2016 (2018 statistical update).
UNDP | Structured | Two data sets are downloaded from this site: mean years of schooling of males and mean years of schooling of females over the years 1986 to 2016 (2018 statistical update). Both data sets are combined and structured using R.
Statista | Structured | Provides the ten most populous countries in the world for the year 2018.
The World Bank | Structured and Unstructured | One structured data set of the Global Gender Gap Index from 2006 to 2018, and one unstructured data set of countries classified by region and income level, which is scraped using R.
Table 2: Summary of sources of data used in the project

2 Data Sources
Sources of data used in the project.
2.1 Source 1: ILO
The ILO data set contains the employment of women and men in different sectors. It lets us compare the employment rates of women and men and see in which sectors the position of women needs to improve; this is visualized in the first business intelligence (BI) query. The link for the data is https://www.ilo.org/wesodata/
2.2 Source 2: UN data
A CSV file was downloaded that contains the percentage of seats held by women in national parliaments.
From this data we can analyze how far women are empowered in parliament and how much growth takes place every year.
  • 7. Source 2 is combined with Source 1 and used in our first BI query to show the growth of women in different sectors compared with men. The source link is http://data.un.org/
2.3 Source 3: UNICEF
The link for the data source is https://data.unicef.org/topic/education/literacy/. This site provides a structured data set with the literacy rate of men and women over the years; literacy is one of the main components of the economic growth of countries. This data is visualized in our second BI query.
2.4 Source 4: UNDP
UNDP stands for United Nations Development Programme, which publishes a large amount of data related to the growth of countries. From this site we took two structured data sets: mean years of schooling (male) and mean years of schooling (female). Education helps build a society that is stronger both economically and technologically. We use Source 4 together with Source 3 in BI Query 2 to analyze how mean years of schooling affects the literacy rate and how far women lag behind men in education. The link for the data is http://hdr.undp.org/en/data
2.5 Source 5: Statista
Statista is the fifth source and is also used as structured data. The link for the data set is https://www.statista.com/statistics/262879/countries-with-the-largest-population/. This data set contains the ten most populous countries in the year 2018. We use the population of these countries in our third BI query to find out whether gender equality depends on population or on other factors.
2.6 Source 6: The World Bank
The World Bank is the last source, from which we used two data sets, one structured and one unstructured. The first, https://tcdata360.worldbank.org/indicators/af52ebe9?country=BRA&indicator=27959&viz=line_chart&years=2006,2018&indicators=944&compareBy=region, contains the Global Gender Gap Index for different indicators for all countries, from which we can analyze the extent of the gender inequality that women face. The second, https://datahelpdesk.worldbank.org/knowledgebase/articles/906519, was web-scraped using R. It classifies the countries of the world by region and by income level. We use these two parameters to get a clearer idea of the areas and fields in which gender inequality exists and of what should be done in particular fields to improve the empowerment of women, as visualized in our BI queries later.
  • 8. 3 Related Work
A number of studies have shown that sustainable development is impossible without women's empowerment and gender equality. Consequently, gender equality is asserted to be both a human rights issue and a precondition for, and indicator of, sustainable development (Alvarez and Lopez, 2013).
Providing women and girls with equal access to education, health care, decent work, and representation in political and economic decision-making processes will fuel sustainable economies and benefit societies and humanity at large (United Nations, n.d.). Gender inequality can be observed in several aspects of daily life, such as access to education, job opportunities and economic resources (United Nations Development Programme [UNDP], 2015). Compared to women, men have greater access to the use of force, greater access to resource control, and more advantageous cultural ideologies. According to the United Nations (n.d.), in 18 countries husbands can legally prevent their wives from working; in 39 countries daughters and sons do not have equal inheritance rights; and 49 countries lack laws protecting women from domestic violence. Only 52 per cent of women married or in a union freely make their own decisions about sexual relations, contraceptive use and health care.
In 2011 only 20 percent of low-income nations had achieved gender parity in primary education, and 66 percent of the world's 774 million illiterate adults were still women. There is consensus that gender equity is an important goal to be achieved (e.g., UN Women, 2011). In 2016, the literacy rate of women increased to 89 percent (UNICEF). Women are making sustained progress in education, but they are not getting equal opportunities to prove themselves.
Globally, the labour force participation rate for men and women aged 15 and over continues its long-term decline; it stands at 61.8 per cent in 2018, down by 1.4 percentage points over the past decade. The decline in women's participation rate has been slower than that of men, resulting in a slight narrowing of the gender gap. These trends reflect different patterns across the life cycle, resulting from changes in both education participation among youth and, at the other end of the scale, older workers' retirement choices. The headline finding, however, is that, on average around the world, women remain much less likely to participate in the labour market than men. At 48.5 per cent in 2018, women's global labour force participation rate is 26.5 percentage points below that of men. Since 1990, this gap has narrowed by 2 percentage points, with the bulk of the reduction occurring in the years up to 2009. The rate of improvement, which has been slowing since 2009, is expected to grind to a halt during 2018-21, and possibly even reverse, potentially negating the relatively minor improvements in gender equality in access to the labour market achieved over the past decade (International Labour Organization, World Employment and Social Outlook: Trends for Women 2018).
Greater participation of women in the social and political sphere is essential to make social and political institutions more representative. It serves as a tool for the empowerment of women and contributes to gender-sensitive decision making. Globally, there are 29 States in which women account for less than 10 per cent of parliamentarians in single or lower houses, as of November 2018, including 4 chambers with no women at all (ibid.).
Wide variations remain in the average percentages of women parliamentarians in each region. As of November 2018, these were (single, lower and upper houses combined):
  • 9. Nordic countries, 42.3 percent; Americas, 30 percent; Europe including Nordic countries, 27.7 percent; Europe excluding Nordic countries, 26.6 percent; sub-Saharan Africa, 23.6 percent; Asia, 19.4 percent; Arab States, 17.8 percent; and the Pacific, 17 percent (Inter-Parliamentary Union, 2018).
Women's representation in local governments can make a difference. Research on panchayats (local councils) in India discovered that the number of drinking water projects in areas with women-led councils was 62 per cent higher than in those with men-led councils (Chattopadhyay and Duflo, 2004). In Norway, a direct causal relationship between the presence of women in municipal councils and childcare coverage was found (Bratton and Ray, 2002). Women demonstrate political leadership by working across party lines through parliamentary women's caucuses, even in the most politically combative environments, and by championing issues of gender equality, such as the elimination of gender-based violence, parental leave and childcare, pensions, gender-equality laws and electoral reform.

4 Data Model
This section describes the dimensions created in SSIS. There are seven dimensions and one fact table, connected in a star schema as shown in Figure 1. The star schema separates the business process data into facts, which hold the measurable data, and dimensions, which hold the descriptive attributes.
Figure 1: Star Schema
A detailed discussion of the facts and dimensions follows.
Dim Country: contains the names of all the countries. This data was obtained from The World Bank and Statista.
Dim Region: contains the names of all the regions, taken by selecting distinct regions so that the data can be analyzed by region. This data was gathered from The World Bank.
  • 10. Dim Income: contains the names of all the income levels of countries, taken by selecting distinct income levels so that the data can be analyzed by income level. This data was also gathered from The World Bank.
Dim Sector: contains the different sectors of occupation of men and women. The data is collected from two sources, ILO and UN data.
Dim Gender: contains the gender values Female and Male. It is an important dimension in this project because every field is compared by gender to reveal gender inequalities. The data is obtained from ILO, UNICEF, UNDP and The World Bank.
Dim Year: contains the years used to analyze how the growth rate changes over time. The data is collected from ILO, UNICEF, UNDP and The World Bank.
Dim Indicator: contains the fields that make up the gender gap index, such as educational attainment and the health and survival index. Combining this dimension with the other dimensions and the fact table lets us analyze the gender gap between men and women.
Fact Women Empowerment: the fact table is linked to all the dimensions. It holds the foreign keys of the connected dimensions (country id, region id, income level id, sector id, gender id, year id) and the measures (percentage of women and men in different sectors, percentage of literacy, mean years of schooling, global gender gap subindex, number of inhabitants in millions).
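The relationship between dimensions and the fact table described above can be sketched in a few lines of R. This is only an illustration: the project built its star schema in SSIS, not in R, and the helper `make_dim`, the toy country names and the values below are all hypothetical.

```r
# Toy staging rows; names and values are illustrative, not project data.
staging <- data.frame(
  Country = c("A", "B", "A"),
  Gender = c("Female", "Female", "Male"),
  Year = c(2016, 2016, 2016),
  Literacy_Rate = c(80, 95, 90),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per distinct value plus a surrogate integer key.
make_dim <- function(values, id_name, value_name) {
  distinct <- sort(unique(values))
  dim <- data.frame(seq_along(distinct), distinct, stringsAsFactors = FALSE)
  names(dim) <- c(id_name, value_name)
  dim
}

dim_country <- make_dim(staging$Country, "Country_ID", "Country")
dim_gender <- make_dim(staging$Gender, "Gender_ID", "Gender")

# Fact table: descriptive columns are swapped for foreign keys, measures stay.
fact <- merge(staging, dim_country, by = "Country")
fact <- merge(fact, dim_gender, by = "Gender")
fact <- fact[, c("Country_ID", "Gender_ID", "Year", "Literacy_Rate")]
```

Each fact row then carries only keys and measures, and a BI tool resolves the keys back to names through the dimension tables.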
  • 11. 5 Logical Data Map
Table 3: Logical Data Map describing all transformations, sources and destinations for all components of the data model illustrated in Figure 1

Source | Column | Destination | Column | Type | Transformation
ILO | Location | Dim Country | Country | Dimension | Renamed the column to Country and removed commas in the column
ILO | 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 | Dim Year | Year | Dimension | All year columns are combined into a single Year column using the melt function in R
ILO | Gender | Dim Gender | Gender | Dimension | No transformation done
ILO | Sector | Dim Sector | Sector | Dimension | No transformation done
ILO | No Header | Fact Women Empowerment | Percentage | Fact | Created new column Percentage with the percentage values of all years using the melt function in R
UN data | Region Country Area | Dim Country | Country | Dimension | Renamed the column to Country and removed commas in the column
UN data | Year | Dim Year | Year | Dimension | No transformation done
UN data | No Header | Dim Gender | Gender | Dimension | Added new column Gender in R
UN data | Series | Dim Sector | Sector | Dimension | Renamed the column to Sector and replaced the column text with Parliament using the replace function in R
UN data | Value | Fact Women Empowerment | Percentage | Fact | Added new column Percentage; the share of seats held by men is computed as 100 minus the percentage of seats held by women, and each value is assigned according to gender
UNICEF | No Header | Dim Year | Year | Dimension | All year columns are combined into a single Year column using the melt function in R
UNICEF | Youth literacy rate, population 15-24 years, female and Youth literacy rate, population 15-24 years, male | Dim Gender | Gender | Dimension | Both columns are combined using the melt function, and a Gender column is added to distinguish the literacy rates of men and women
UNICEF | No Header | Fact Women Empowerment | Literacy Rate | Fact | Created new column Literacy Rate with the percentage values of all years using the melt function in R
UNDP | Country | Dim Country | Country | Dimension | No transformation done
UNDP | No Header | Dim Year | Year | Dimension | All year columns are combined into a single Year column using the melt function in R
UNDP | No Header | Dim Gender | Gender | Dimension | Data is taken from two files, male and female mean years of schooling; each is reshaped with the melt function in R, a Gender column is added to distinguish the values, and the two are combined
UNDP | No Header | Fact Women Empowerment | Schooling | Fact | Created new column Mean_years_schooling with the values of all years using the melt function in R
Statista | No Header | Dim Country | Country | Dimension | Added the column name Country in R
Statista | Inhabitants in millions | Fact Women Empowerment | Inhabitants | Fact | Renamed the column to Inhabitants in R and removed commas in the value
Statista | No Header | Dim Year | Year | Dimension | Added new column Year and assigned the year 2018 in R
The World Bank | Country Name | Dim Country | Country | Dimension | Renamed the column to Country in R
The World Bank | Indicator | Dim Indicator | Indicator | Dimension | Shortened the column text
The World Bank | 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 | Dim Year | Year | Dimension | All year columns are combined into a single Year column
The World Bank | No Header | Fact Women Empowerment | Subindex | Fact | Created new column Subindex with the subindex values of all years using the melt function in R
The World Bank | No Header | Dim Country | Country | Dimension | Scraped the countries from this source, named the column Country in R and removed commas in the text
The World Bank | No Header | Dim Region | Region | Dimension | Scraped the regions from this source, named the column Region in R and removed commas in the text
The World Bank | No Header | Dim Income Level | Income Level | Dimension | Scraped the income levels of countries from this source, named the column Income Level in R and removed commas in the text
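Most rows of the logical data map rely on the same step: collapsing one column per year into a single Year column with `melt` from the reshape2 package. A minimal sketch of that step follows, assuming a toy table whose country names and values are made up for illustration; only the column naming pattern (`X2006`, `X2007`) mirrors what `read.csv` produces for numeric headers.

```r
library(reshape2)

# Toy wide-format table; countries and values are illustrative only.
wide <- data.frame(
  Country = c("A", "B"),
  X2006 = c(0.61, 0.70),
  X2007 = c(0.63, 0.71),
  stringsAsFactors = FALSE
)

# Collapse the year columns into one Year column and one value column.
long <- melt(wide,
             id.vars = "Country",
             measure.vars = c("X2006", "X2007"),
             variable.name = "Year",
             value.name = "Subindex")

# read.csv prefixes numeric headers with "X"; strip it and convert to a number.
long$Year <- as.numeric(sub("^X", "", as.character(long$Year)))
```

The result has one row per country and year, which is the shape a Year dimension and a fact measure can be loaded from.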
  • 15. 6 ETL Process
ETL is a process in which data is extracted, cleaned and transformed into a format that is easy to use, and then loaded into a database through a staging area.
Extract
Extraction is the first step in every ETL process. Exploratory data analysis helps identify the data that serves our purpose. Depending on the kind of data needed, it can be scraped or downloaded from different sources, and it must then be cleaned so that it can be used properly. Six sources are used here: ILO, UN data, UNICEF, UNDP, Statista and The World Bank, the last of which includes data web-scraped using R. After the data is obtained from these sources it is cleaned in R and converted to CSV format for further processing.
Transform
The next step after extraction is to transform the data into the shape required by the target destination. Cleaning is essential before the data is moved, so that it serves the purpose of the analysis. It involves removing commas and special characters from the text, removing null values and unnecessary data, transposing rows and columns, and adding or renaming columns for consistency with the other sources. After cleaning, the data is stored in CSV format for the loading process.
Load
Once the data is cleaned and transformed, it is loaded into the staging area through a process automated with R and SSIS. All the sources populate raw tables, which are then used to create the fact table and the dimension tables. The fact table holds the foreign keys (country id, region id, income level id, sector id, gender id, year id) and the measures (percentage of women and men in different sectors, percentage of literacy, mean years of schooling, global gender gap subindex, number of inhabitants in millions).
Similarly, each dimension table contains a primary key id, which is referenced from the fact table as a foreign key to form the star schema. After creating the star schema in SSIS, we deploy the cube and create hierarchies. The database is then ready for analysis and for visualizing our BI queries in Tableau.

7 Application
Below are the three BI queries noted in Section 1 that we analyze in this project.
7.1 BI Query 1: Is there any growth of women in different employment sectors and also in the political sector?
For this query, the contributing sources of data are ILO and UN data, with the regions taken from The World Bank. The visualization obtained is illustrated in Figure 2. It shows the employment of women and men in different sectors by region for the years 2010 to 2018. Women have equal or greater opportunities in the Services sector, but very low representation in parliament in all regions, and the growth over the years is negligible.
  • 16. In agriculture, men and women have more or less equal opportunities, while in industry women have fewer opportunities than men. Overall, the graph shows that women face gender inequality, most markedly in parliament.
Figure 2: Results for BI Query 1
7.2 BI Query 2: What is the status of growth of women in educational attainment?
The data sets used for the second query are UNICEF and UNDP. The visualization obtained is illustrated in Figure 3. It compares the literacy rate and mean years of schooling of men and women. The analysis shows that the mean years of schooling of women increased over the years from 1990 to 2016, which in turn increased the literacy rate. Hence there is significant growth of women in educational attainment, but a gender gap still exists when compared to men.
Figure 3: Results for BI Query 2
7.3 BI Query 3: On which factors does the gender gap depend: region, income level or population of the countries?
The data sets from The World Bank and Statista are used in this BI query. The visualization obtained is illustrated in Figure 4.
  • 17. Here we compare the population, region and income level of the countries using the Global Gender Gap Index. Population does not affect the global gender gap directly, but the income level and the region of a country do. Hence measures should be taken to improve the low-income economies so that they can achieve significant economic growth.
Figure 4: Results for BI Query 3
7.4 Discussion
This section discusses the BI queries further and relates them to the present situation of women in society.
The first BI query examines the growth of women in different sectors. The visualization shows a wide gender gap between men and women, and also differences among women across sectors and regions. In South Asia and Sub-Saharan Africa, the employment of women is high in Agriculture and low in Services, which indicates that women face greater gender inequality in these regions than in others. In every region the gap between men and women in the political sector is very high.
The second BI query shows that mean years of schooling grew markedly over the years from 1990 to 2016, with corresponding growth in the literacy rate. Even though the literacy rate is increasing, a gap remains between men and women, and women are lagging behind men.
The third BI query asks on which factors the gender gap depends. The population of China is far higher than that of Brazil, yet the economic opportunity index of the two countries is equal. From this we can say that population does not affect the gender gap directly. In the same way, the graph of income level shows that the educational attainment index of low-income economies is lower than that of the other income groups. We can therefore say that the region and income level of a country affect the gender gap.
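The comparisons discussed above all reduce to a per-year difference between the male and female values of a measure. A minimal sketch of that calculation in base R is given here; the table layout (Year, Gender, Percentage) follows the cleaned files in the appendix, but the numbers are placeholders and not results from the project.

```r
# Toy long-format rows; the percentages are placeholders, not project results.
rates <- data.frame(
  Year = c(2010, 2010, 2018, 2018),
  Gender = c("Female", "Male", "Female", "Male"),
  Percentage = c(40, 60, 45, 62),
  stringsAsFactors = FALSE
)

female <- rates[rates$Gender == "Female", c("Year", "Percentage")]
male <- rates[rates$Gender == "Male", c("Year", "Percentage")]

# One row per year with the female and male values side by side.
gap <- merge(female, male, by = "Year", suffixes = c("_female", "_male"))

# Gap in percentage points; a positive value means men are ahead.
gap$Gap <- gap$Percentage_male - gap$Percentage_female
```

A shrinking `Gap` column over the years would indicate that the gender gap is narrowing for that measure.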
8 Conclusion and Future Work
The analysis points to several measures: providing education for women in less developed countries; supporting self-empowerment and self-help groups; providing equal opportunities for women in all sectors; giving women more seats in parliament; and encouraging women to build careers in the fields in which they excel. Beyond this, society should change its attitude toward women.
In conclusion, all three BI queries were analyzed, and the data is up to date compared with the reports and papers reviewed above.
  • 18. New laws and awareness programs should be started for the empowerment of women. Data warehousing and business intelligence is a suitable approach for analyzing the present situation and for planning future programs for the development of women.

References
- International Labour Organization (2018a). World Employment and Social Outlook: Trends for Women 2018. Geneva.
- Alvarez and Lopez, 2013: Alvarez, Michelle Lopez. From unheard screams to powerful voices: a case study of women's political empowerment in the Philippines. 12th National Convention on Statistics (NCS), EDSA Shangri-la Hotel, Mandaluyong City, October 12, 2013.
- Inter-Parliamentary Union (2018). Women in national parliaments, as at 1 November 2018.
- Chattopadhyay, R. and Duflo, E. (2004). Women as Policy Makers: Evidence from a Randomized Policy Experiment in India. Econometrica, 72(5), pp. 1409-1443.
- Bratton, K. A. and Ray, L. P. (2002). Descriptive Representation: Policy Outcomes and Municipal Day-Care Coverage in Norway. American Journal of Political Science, 46(2), pp. 428-437.
- UNESCO (2014). Education for All Global Monitoring Report 2013/14: Teaching and Learning: Achieving Quality for All. UNESCO, Paris.
- UNICEF (2014). The State of the World's Children 2015: Reimagine the Future: Innovation for Every Child. UNICEF, New York.
- United Nations (n.d.). Sustainable Development. Gender Equality: Why It Matters. Available at: http://www.un.org/sustainabledevelopment/gender-equality/
- United Nations Development Programme [UNDP] (2015). Human Development Report 2015: Work for Human Development. Available at: http://hdr.undp.org/sites/default/files/2015humandevelopmentreport.pdf
- UN Women (2011). The Women's Empowerment Principles: Equality Means Business. Available at: http://www.unwomen.org/-/media/headquarters/attachments/sections/library/publications/2011/10/women-s-empowerment-principlesen

Appendix
Video link: https://youtu.be/w0fHvbUe8Jw

R code
#### Global gender gap - source from the World Bank
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
  • 19. gender1 <- read.csv("Gap.csv", TRUE, ",")
#Melt: combine the year columns into a single Year column
gender1.m1 <- melt(gender1, measure.vars = c("X2006","X2007","X2008","X2009","X2010","X2011","X2012","X2013","X2014","X2015","X2016","X2018"), variable.name = "Year", value.name = "subindex")
gender1.m2 <- gender1.m1
#Filter: keep only the four subindex indicators
gender1.m2 <- filter(gender1.m2, Indicator %in% c("Global Gender Gap Economic Participation and Opportunity Subindex", "Global Gender Gap Educational Attainment Subindex", "Global Gender Gap Health and Survival Subindex", "Global Gender Gap Political Empowerment subindex"))
#Remove first letter in a value (the "X" prefix that read.csv adds to year headers)
gender1.m2$Year_1 <- as.numeric(gsub("X", "", gender1.m2$Year))
#Remove column (the original Year factor)
gender1.m2 <- gender1.m2[,-5]
#Rename
gender1.m3 <- gender1.m2
names(gender1.m3)[names(gender1.m3) == "Country.ISO3"] <- "Country_ID"
names(gender1.m3)[names(gender1.m3) == "Country.Name"] <- "Country"
names(gender1.m3)[names(gender1.m3) == "Subindicator.Type"] <- "Subindicator_Type"
names(gender1.m3)[names(gender1.m3) == "Year_1"] <- "Year"
#Reorder Columns
gender1.m3 <- gender1.m3[c(1,2,6,3,4,5)]
#Remove column
gender1.m3 <- gender1.m3[,-1]
#Remove spaces in strings
gender1.m4 <- gender1.m3
#to add underscore between strings
gender1.m4$Indicator <- gsub(" ", "_", gender1.m4$Indicator)
#Remove unnecessary data
gender1.m4$Indicator <- gsub("Global_Gender_Gap_", "", gender1.m4$Indicator)
#Remove Commas and rest of the value
gender1.m5 <- gender1.m4
gender1.m5$Country <- gsub(",.*", "", gender1.m5$Country)
gender1.m6 <- gender1.m5
# Omit NA values
gender1.m6 <- na.omit(gender1.m6)
  • 20. write.csv(gender1.m6, file = "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Global_Gender_gap.csv", quote = F, row.names = F)

##### Youth Literacy rate - Source from UNICEF
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
lit <- read.csv("Youth_Literacy_Rate.csv", TRUE, ",")
#Rename
lit.d1 <- lit
lit.d1 <- lit.d1[,-2]
#inspect the structure (lit.d2 does not exist yet at this point, so inspect lit.d1)
str(lit.d1)
names(lit.d1)[names(lit.d1) == ".."] <- "Year"
names(lit.d1)[names(lit.d1) == "Youth.literacy.rate..population.15.24.years..female"] <- "Female"
names(lit.d1)[names(lit.d1) == "Youth.literacy.rate..population.15.24.years..male"] <- "Male"
#melt: combine the Female and Male columns into Gender and Literacy_Rate
lit.d2 <- melt(lit.d1, measure.vars = c("Female","Male"), variable.name = "Gender", value.name = "Literacy_Rate")
#omit NA Values
lit.d2 <- na.omit(lit.d2)
write.csv(lit.d2, file = "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Literacy_Rate.csv", quote = F, row.names = F)

###### population - Source from statista
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")
#read a file
population <- read.csv("World_Population_ten_largest_countries.csv", TRUE, ",")
#Rename
population.d1 <- population
names(population.d1)[names(population.d1) == "Countries.with.the.largest.population.2018"] <- "Country"
names(population.d1)[names(population.d1) == "X"] <- "Inhabitants"
#Remove row from table
population.d2 <- population.d1
population.d2 <- population.d2[-c(1,2), ]
#Adding a column
population.d3 <- population.d2
population.d3$Year <- 2018
#Reorder columns
population.d3 <- population.d3[c(1,3,2)]
  • 21. #Remove Commas
population.d4 <- population.d3
population.d4$Inhabitants <- as.numeric(gsub(",", "", population.d4$Inhabitants))
#Omit NA values
population.d4 <- na.omit(population.d4)
write.csv(population.d4, file = "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Ten_Largest_Populated_Countries.csv", quote = F, row.names = F)

###### Employment of women - Source from ILO
#plyr provides revalue(); load it before dplyr
library(plyr)
library(dplyr)
library(reshape2)
getwd()
setwd("/Users/MOLAP/DWBI/Cleaned_Data_Files")
#read a file
emp1 <- read.csv("Employment_by_sector_1.csv", TRUE, ",")
emp2 <- read.csv("Employment_by_sector_2.csv", TRUE, ",")
emp3 <- read.csv("Employment_by_sector_3.csv", TRUE, ",")
#remove comma
emp1$Employment.by.Sector..Rate... <- gsub(",", "", emp1$Employment.by.Sector..Rate...)
emp2$Employment.by.Sector..Rate... <- gsub(",", "", emp2$Employment.by.Sector..Rate...)
emp3$Employment.by.Sector..Rate... <- gsub(",", "", emp3$Employment.by.Sector..Rate...)
#remove rows
emp1 <- emp1[-c(1,2),]
emp2 <- emp2[-c(1,2),]
emp3 <- emp3[-c(1,2),]
#stack the three files
es <- rbind(emp1, emp2, emp3)
es.d1 <- es
#remove columns
es.d1 <- es.d1[,-c(16,17,18,19,20)]
#melt: combine the year columns into Year and Percentage
es.d2 <- melt(es.d1, measure.vars = c("X.2","X.3","X.4","X.5","X.6","X.7","X.8","X.9","X.10","X.11","X.12","X.13"), variable.name = "Year", value.name = "Percentage")
str(es.d2)
  • 22. # Rename columns
    es.d3 <- es.d2
    names(es.d3)[names(es.d3) == "Employment.by.Sector..Rate..."] <- "Country"
    names(es.d3)[names(es.d3) == "X"] <- "Gender"
    names(es.d3)[names(es.d3) == "X.1"] <- "Sector"

    # Reorder columns
    es.d3 <- es.d3[c(1, 4, 2, 3, 5)]

    # Revalue data (revalue() comes from plyr, which this script does not load)
    es.d4 <- es.d3
    es.d4$Year = plyr::revalue(es.d4$Year, c("X.2" = "2007", "X.3" = "2008", "X.4" = "20…
                               , "X.8" = "2013", "X.9" = "2014", "X.10" = …
    es.d4$Country = gsub(",", "", es.d4$Country)
    es.d4$Percentage = gsub("%", "", es.d4$Percentage)

    # Remove NA values
    es.d4 <- na.omit(es.d4)
    write.csv(es.d4, file = "C:/Users/MOLAP/Documents/DWBI/Cleaned_Data_Files/Emp…

    ##### Mean years of schooling - Source: UNDP
    library(plyr)
    library(dplyr)
    library(reshape2)
    getwd()
    setwd("/Users/MOLAP/Documents/DWBI/Raw_Data_Files")

    # Read the file
    schm <- read.csv("Mean years of schooling_Male.csv", TRUE, ",")

    # Remove columns
    schm.d1 <- schm[, -which(names(schm) %in% c("Mean.years.of.schooling..male..…
                             "X.14", "X.16", "X.18", "X.20", "X.22", "X…

    # Remove rows
    schm.d1 <- schm.d1[-c(1), ]
    names(schm.d1)[names(schm.d1) == "X"] <- "Country"
    schm.d1 <- schm.d1[-c(173, 174, 175, 176, 177, 178, 179, 180), ]

    # Melt from wide to long (the second call overwrites the first)
    schm.d2 = melt(schm.d1, id.vars = c("Country"), measure.vars = c("X.1", "X.3", "X.5", "X.7", "X.9", "X.11", "X.13", "X.1…
    schm.d2 = melt(schm.d1, measure.vars = c("X.1", "X.3", "X.5", "X.7", "X.9", "X.11", …
                   variable.name = "Year", value.name = "Mean_years_schooling")

    # Revalue the Year column
    schm.d3 <- schm.d2
    schm.d3$Year = revalue(schm.d3$Year, c("X.1" = "1990", "X.3" = "1995", "X.5" = …

    # Add a Gender column
    schm.d3$Gender <- "Male"

    # Reorder columns
    schm.d4 <- schm.d3[c(1, 2, 4, 3)]

    # Remove commas
    schm.d5 <- schm.d4
    schm.d5$Country = gsub(",", "", schm.d5$Country)

    # Omit NA values
    schm.d5 <- na.omit(schm.d5)
  • 23. # ############################################################################
    # Read the file
    schf <- read.csv("Mean years of schooling_Female.csv", TRUE, ",")

    # Remove columns
    schf.d1 <- schf[, -which(names(schf) %in% c("Mean.years.of.schooling..female.…
                             "X.14", "X.16", "X.18", "X.20", "X.22…

    # Remove rows
    schf.d1 <- schf.d1[-c(1), ]
    names(schf.d1)[names(schf.d1) == "X"] <- "Country"
    schf.d1 <- schf.d1[-c(173, 174, 175, 176, 177, 178, 179, 180), ]

    # Melt from wide to long (the second call overwrites the first)
    schf.d2 = melt(schf.d1, id.vars = c("Country"), measure.vars = c("X.1", "X.3", "X.5", "X.7", "X.9", "X.11", "X.13", "X…
    schf.d2 = melt(schf.d1, measure.vars = c("X.1", "X.3", "X.5", "X.7", "X.9", "X.11", …
                   variable.name = "Year", value.name = "Mean_years_schooling")

    # Revalue the Year column
    schf.d3 <- schf.d2
    schf.d3$Year = revalue(schf.d3$Year, c("X.1" = "1990", "X.3" = "1995", "X.5" = …

    # Add a Gender column
    schf.d3$Gender <- "Female"

    # Reorder columns
    schf.d4 <- schf.d3[c(1, 2, 4, 3)]

    # Remove commas
    schf.d5 <- schf.d4
    schf.d5$Country = gsub(",", "", schf.d5$Country)

    # Omit NA values
    schf.d5 <- na.omit(schf.d5)

    # ############################################################################
    # Combine the male and female tables
    Mean_years_of_schooling <- rbind(schm.d5, schf.d5)

    # Write csv
    write.csv(Mean_years_of_schooling, file = "C:/Users/MOLAP/Documents/DWBI/Clea…
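The male and female schooling scripts above repeat the same steps: reshape wide to long, map column names to years, add a Gender column, reorder, and drop NA rows. A minimal sketch of how those steps could be shared in one function is given below. It uses base R only rather than reshape2 and plyr, and the function name to_long, the toy data frames and the two-entry year lookup are all hypothetical, chosen for illustration; only the X.1 = 1990 and X.3 = 1995 mappings come from the scripts.

```r
# Hypothetical toy inputs in the wide layout: one row per country,
# one column per year (X.1 = 1990, X.3 = 1995, as in the scripts above).
wide_male <- data.frame(Country = c("A", "B"),
                        X.1 = c(4.1, 6.0),
                        X.3 = c(4.5, NA),
                        stringsAsFactors = FALSE)
wide_female <- data.frame(Country = c("A", "B"),
                          X.1 = c(3.2, 5.8),
                          X.3 = c(3.9, 6.1),
                          stringsAsFactors = FALSE)

# Named vector: column name in the wide table -> year label
year_lookup <- c(X.1 = "1990", X.3 = "1995")

# Hypothetical helper: wide -> long, add Gender, drop rows containing NA
to_long <- function(df, lookup, gender) {
  cols <- names(lookup)
  long <- data.frame(
    Country = rep(df$Country, times = length(cols)),
    Year = rep(unname(lookup), each = nrow(df)),
    Gender = gender,
    Mean_years_schooling = unlist(df[cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  long <- na.omit(long)
  rownames(long) <- NULL
  long
}

# Stack both genders into one table, as the rbind() step above does
combined <- rbind(to_long(wide_male, year_lookup, "Male"),
                  to_long(wide_female, year_lookup, "Female"))
print(combined)
```

The column order produced here (Country, Year, Gender, Mean_years_schooling) matches the reordering step in the scripts, so the result could be passed to write.csv in the same way.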