Lego Case Study
Following on from the Economic Freedom and Titanic exercises,
we’ve covered Pyramid’s data analysis capabilities for
representing information both numerically and visually. We’ve also
explored how Discovery depends on certain settings (data types,
aggregates, etc.) that are accessible and adjustable within
Pyramid’s Modelling tab.
Task Objectives:
1. Apply skills and techniques learnt from Economic Freedom
and Titanic
2. Gain more understanding and make more sense of the ETL
process using Pyramid’s Data Flow
3. Join tables in order to make more sense of the data
4. Draw useful insights on Lego’s success from a sea of
seemingly irrelevant data
Background:
Lego System A/S is a Danish toy production company based in
Billund. It is best known for the manufacture of Lego-brand toys,
consisting mostly of interlocking plastic bricks. Believe it or
not, Lego hit a major bump in the road in the early 2000s but
managed to pull itself back out and rise to become the world’s top
toy company in 2012; that turnaround is the scope of our case
study. The source we will be working with consists of data up to
the year 2017.
Context:
The data sets available on the drive, when viewed separately,
give only a superficial picture of the state of the toy company.
To piece the puzzle together, we will use joins between tables so
that we can better see the turning point in the company’s history.
The specific goal of this case study is to draw insights from the
given data to make sense of the articles below:
- The end of the Lego crisis in 2008 -
https://www.liberation.fr/futurs/2011/10/03/lego-casse-des-briques_76%205189
- Lego's late take-off in the girls market in 2012 -
https://fortune.com/2015/12/30/lego-friends-girls/
Feel free to google around for more information/articles
regarding the above two points.
Source files:
For this tutorial, we will be working with sets.csv,
Inventories.csv, Inventory_Parts.csv, Color Hues.csv
and Lego Theme Groups, all of which should be on the drive
ready for you to download and work with.
Walkthrough:
Let’s start by loading in the Sets dataset in Pyramid using Smart
Modelling.
Look through and explore the data to see if there’s anything you
can learn from this data set alone. Note that you will need to
apply aggregates to make more sense of the data.
With Titanic, the tutorial covered setting aggregates in the
Modelling tab, but you can also obtain aggregates in Pyramid’s
Discovery: right-click the column you are interested in, select
Create Measure followed by Aggregate, and click on whichever
aggregate you require.
You will also want to requalify some of your columns back in
the Modelling tab in Pyramid (Hint: What
would the sum of the theme_id column give us?).
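To see why the choice of aggregate matters, here is a minimal Python sketch that works outside Pyramid. The rows are made-up placeholders rather than values from sets.csv, and the column names (year, theme_id, num_parts) are assumed from the columns this tutorial mentions.

```python
from collections import defaultdict

# Hypothetical rows shaped like sets.csv (illustrative values only).
sets = [
    {"set_num": "A-1", "year": 1999, "theme_id": 10, "num_parts": 120},
    {"set_num": "B-1", "year": 1999, "theme_id": 22, "num_parts": 80},
    {"set_num": "C-1", "year": 2000, "theme_id": 10, "num_parts": 200},
]

sets_per_year = defaultdict(int)   # Count aggregate: number of sets per year
parts_per_year = defaultdict(int)  # Sum aggregate: total parts per year
theme_id_sum = defaultdict(int)    # Sum of an identifier column

for row in sets:
    sets_per_year[row["year"]] += 1
    parts_per_year[row["year"]] += row["num_parts"]
    theme_id_sum[row["year"]] += row["theme_id"]

print(dict(sets_per_year))
print(dict(parts_per_year))
print(dict(theme_id_sum))
```

The first two results answer real questions (how many sets, how many parts). The third adds up labels, so the number it produces has no meaning, which is why an identifier such as theme_id is better treated as a category to count than as a measure to sum.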
With the information we have from this one file, unless you
know what you are looking for specifically
and ahead of time, trying to deduce valuable information from
this dataset alone is really akin to
looking for a needle in a haystack.
To enrich our data, in this tutorial as well as in real-world
cases, we bring in additional tables or databases via joins to
shed more light on the existing information. Gauthier has prepared
two homemade tables, Lego Theme Groups and Color Hues, which
will help pull all the other tables into one cohesive database
that is more comprehensive and easier to understand.
Let’s go ahead and join our existing sets table with the Lego
Theme Groups table. For starters we will
need to import Lego Theme Groups into our workspace. In the
Modelling tab under data flow, we will
perform the same ETL process that we’ve been working with for
all the previous exercises.
The Lego Theme Groups table is stored as an Excel file, so go
ahead and drag the Excel bar from the list of Sources into your
workspace.
Click on the Excel bar you just dropped and upload Lego Theme
Groups into the workspace. Once you upload the file, Pyramid does
some minor pre-processing before actually importing it into the
workspace.
Under Table Selection, make sure to uncheck the box for Sheet
2; it contains a draft list of movie names which is of no use to
us here. Click on Add Select Nodes and you should see the table
appear in your workspace. Use your mouse to drag and link the new
Select bar to Memory.
You can also rename your table for ease of reference by
selecting the table’s Select bar and renaming it under the
Properties tab. To avoid any confusion, we’ve renamed this table
Themes and will continue to refer to it as such.
Once your data flow is established, click on the Data Model tab,
and click on the Tables tab under
Elements.
Quick Run-through on Joins:
There are many join operations available to us, but the most
commonly used ones are the inner join and the left join (a right
join is simply a left join seen from the other table).
When we perform join operations, we need a reference column,
called a key, to match rows between tables. A primary key, often
expressed as a number, is an identifier that is unique for each
row of its table and that other tables can refer to.
The inner join extracts only the information that is common to
both datasets, that is, the rows whose key appears in both tables.
A left (or right) join extracts the overlapping information
between the two tables without losing any rows from the left (or
right) table.
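The difference between the two joins can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Pyramid performs joins; the rows and the category values are hypothetical.

```python
# Hypothetical rows; theme_id is the key shared by both tables.
sets = [
    {"set_num": "A-1", "theme_id": 1},
    {"set_num": "B-1", "theme_id": 2},
    {"set_num": "C-1", "theme_id": 9},  # no matching theme on purpose
]
themes = [
    {"theme_id": 1, "category": "Movies"},
    {"theme_id": 2, "category": "City"},
]

# Index the themes table by its key so each lookup is a dictionary access.
theme_by_id = {t["theme_id"]: t for t in themes}

# Inner join: keep only the sets whose theme_id exists in themes.
inner = [
    {**s, **theme_by_id[s["theme_id"]]}
    for s in sets
    if s["theme_id"] in theme_by_id
]

# Left join: keep every set; unmatched sets get category None.
left = [
    {**s, "category": theme_by_id.get(s["theme_id"], {}).get("category")}
    for s in sets
]

print(len(inner), len(left))
```

The inner join drops set C-1 because its key has no partner, while the left join keeps it with an empty category. With clean data where every key matches, both joins return the same rows.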
Coming back to our workspace, we would like to enrich our
existing data by joining our sets table with
our Lego Themes table. Do note that you can also drag and
move the tables around your workspace
which can be helpful in the future, especially if you work with
left/right joins.
In our case, if you have done some reading online, you will
have learnt that Lego introduced more mainstream themes (movies,
cartoons, etc.) in the early 2000s, which eventually led to it
becoming the top toy maker in the world by 2012. This is why we
want to bring in the Lego Themes dataset. The key that connects
the Sets and Lego Themes datasets is theme_id.
Start by dragging theme_id from the Sets table to the Lego
Themes table.
The Properties tab gives us some information regarding the Join
operation we want to perform. In
our case, the data we have is already clean (no missing values),
so it is safe to assume that we won’t
lose any information by performing the Join.
Verify that you are joining the tables on theme_id and check the
Bidirectional box. This just means that both tables will interact
with each other in both directions. Once that’s done, go ahead
and process your model.
Much like Titanic, do drop a mental pin here as you will likely
come back to perform more joins later on.
Data Analysis:
Let’s start by trying to enrich the charts and graphs we managed
to produce with the sets dataset
alone. Again, this alone doesn’t really show us anything beyond
the ordinary.
However, now that we have information from the Themes table
to work with, we can start enriching
our existing information by adding Category information into
the mix.
Right away, you can see the graph becomes more detailed and
comprehensive. However, the issue
you are probably having now is the fact that this information,
though exact, is not exactly easy to
understand.
Thankfully, Pyramid has tools that can help us visualise this
graph in a simpler fashion. If you click on Change Visual on your
tool bar, you will see a list of visualisation options.
Click on Columns followed by Stacked Column Chart.
In our case, the Stacked Column Chart option tells Pyramid to
stack the various categories vertically so we get a graph that is
easier to understand.
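The numbers behind a stacked column chart are simply counts per year and per category. A minimal Python sketch, using hypothetical joined rows and category names, shows how each column is built up from its segments:

```python
from collections import Counter

# Hypothetical rows from Sets joined to Themes, one row per set.
joined = [
    {"year": 1998, "category": "City"},
    {"year": 1999, "category": "City"},
    {"year": 1999, "category": "Movies"},
    {"year": 1999, "category": "Movies"},
]

# Count the sets for each (year, category) pair.
counts = Counter((row["year"], row["category"]) for row in joined)

# Each year is one column; each category is one segment of that column.
stacks = {}
for (year, category), n in sorted(counts.items()):
    stacks.setdefault(year, {})[category] = n

for year, segments in stacks.items():
    print(year, segments, "column height:", sum(segments.values()))
```

The height of a column is the total number of sets for that year, and the segments show how that total splits across categories.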
Right away, you’ll notice that Lego introduced a lot more
themed toys between the late 90s and the early 2000s, and this
was their saving grace; the increase in the theme count, as well
as in the variety of colors, is the indicator here.
You can also visualise the increase in Lego’s attempts to
include movie franchises by obtaining the required licenses.
Take some time to explore and see what else you can learn
about this case using these two datasets.
More Analysis (Homework):
This section walks through the process of joining all the
available datasets to bring out finer details of Lego’s history.
The following steps may seem tedious, so for those who want to
see the result without doing them by hand, there is a premade
file with everything already done at the end of this tutorial.
Let’s enrich our existing data by introducing the Colors Hues
data set. This data set is referenced by a
color id column, which is something to keep in mind when we
perform Joins later on.
For this part of the tutorial, we are trying to show through
analysis that aside from the introduction
of new themes and licensing, Lego also tried to diversify their
target audience by introducing color
schemes that are more alluring to girls (keep in mind, late 90s
early 2000s).
Disclaimer:
This analysis is not an expression of any opinion, wish or
recommendation. It is aimed only at searching for analytical
signals in a mass of data and cannot give rise, at least as far
as this exercise is concerned, to any hasty conclusion about
Lego’s diversity policy or to any prejudice linking colors with
sexes or consumer genders.
There is also a shortcut option on page 23 (see Short Cuts),
though not recommended, that covers the steps between pages 8
and 14 in less detail.
Go ahead and import the Colors Hues file into your workspace.
Remember that we are working with a text file (.csv), so make
sure you select the Text File bar from the Sources tab.
(Steps available on page 2)
When you go ahead and try to perform a join in the Data Model
tab, you should notice that there is a
slight problem; we have no key to join the Colors Hues table to
the existing two tables we have.
To recap, we currently have the sets dataset joined with the
themes dataset using theme_id as the key. To make sense of all
of this (Master Data Management), take some time to look through
the columns of all the datasets and understand what they mean.
Are there any other unique identifying columns (keys) that we can
use to join all the tables so that we can get a global view of
Lego’s history?
Go ahead and import the Inventories and Inventory_Parts data
sets into your workspace.
If you look at the Data Model tab, you’ll notice that Pyramid’s
software automatically tries to infer if
there are any possible joins that it can perform on the available
tables. In this case, the Sets and
Inventories tables are joined using set_num as the key.
To complete the joining of all tables to form your master data,
you will need to use the following keys:
set_num, inventory id and color id.
Make sure that all your joins are also set to have a
bidirectional relationship.
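To picture how the three keys chain the tables together, here is a small Python sketch. The rows are hypothetical, and the exact column names (id, inventory_id, color_id, hue) are assumptions based on the keys named above; check them against your own files.

```python
from collections import Counter

# Hypothetical rows; column names are assumed, values are illustrative.
sets = [{"set_num": "A-1", "year": 1995}, {"set_num": "B-1", "year": 2005}]
inventories = [{"id": 1, "set_num": "A-1"}, {"id": 2, "set_num": "B-1"}]
inventory_parts = [
    {"inventory_id": 1, "color_id": 4},
    {"inventory_id": 2, "color_id": 4},
    {"inventory_id": 2, "color_id": 5},
]
color_hues = [{"color_id": 4, "hue": "Red"}, {"color_id": 5, "hue": "Pink"}]

# One lookup per key: set_num, inventory id, color id.
year_by_set = {s["set_num"]: s["year"] for s in sets}
set_by_inventory = {i["id"]: i["set_num"] for i in inventories}
hue_by_color = {c["color_id"]: c["hue"] for c in color_hues}

# Follow the chain part -> inventory -> set -> year, and part -> hue.
hue_counts = Counter()
for part in inventory_parts:
    set_num = set_by_inventory[part["inventory_id"]]
    year = year_by_set[set_num]
    hue = hue_by_color[part["color_id"]]
    hue_counts[(year, hue)] += 1

print(dict(hue_counts))
```

Once every table is linked this way, a hue can be traced back to the year of the set it appears in, which is what makes a chart of hues by year possible.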
Do note that the data type selected for color id in the Colors
Hues table is set to Float. This may cause
some problems when you try to perform your join operation
because the data type selected for
color_id in the Inventory_Parts is set to integer. To fix this
problem, go back to the Data Flow tab,
select the Select bar for the Colors Hues table, and under
properties modify the data type under the
column selection tab.
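Why a type mismatch on the key matters can be illustrated with an analogy in plain Python. This is not a description of Pyramid's internals; it only shows that the same identifier written as a float and as an integer no longer compares as equal once it is treated as text, and that converting both sides to the same type restores the match. The values are hypothetical.

```python
# Hypothetical key values as they might be read from two text files.
hue_keys = ["4.0", "5.0"]    # color id written as a Float
part_keys = ["4", "5", "4"]  # color_id written as an Integer

# Compared as-is, none of the keys match.
matches_before = [k for k in part_keys if k in hue_keys]

# Convert both sides to integers, then compare again.
hue_ids = {int(float(k)) for k in hue_keys}
part_ids = [int(k) for k in part_keys]
matches_after = [k for k in part_ids if k in hue_ids]

print(matches_before)
print(matches_after)
```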
Also take some time to think about the columns in terms of
measures. Are there any specific
aggregates that are useful in our analysis?
You can always come back later and play about with various
measures and see if that helps with your
analysis.
Now that you have everything you need in your workspace, go
ahead and build the model and perform some more analysis to see
what else Lego did, and how, that eventually led to it being the
biggest toy company in the world in 2012.
The first thing we mentioned earlier was Lego’s move to
introduce more colors that would interest
girls. Below is a Column Chart of hues plotted against years.
Note that to better illustrate this point,
the column chart is stacked and the colors have been manually
modified to reflect the color they
represent.
If you apply a year filter to narrow down the scope, you will
notice that colors often associated with girls back in the day,
such as pink, orange and beige, started to make appearances in
the 90s and became more prevalent as Lego moved on.
You can also visualise Lego’s decision to become more
inclusive of both girls and boys using the Matrix Grid option.
There too you can notice a steady increase in the aforementioned
colors.
Go ahead and play about with other measures and attributes to
see if there is anything else that is
interesting you can find.
More Challenging:
If you read a little more into Lego’s history, another problem
the company faced was that, aside from a few enthusiasts who
were really interested in the toy, the vast majority of its
customers were children who did not have the attention span and
patience to sit through an entire set. There were too many parts,
which took more time to assemble, and many children lost
interest and gave up along the way.
Look through the data you have and see if there is a way to
visualise how Lego solved this problem.
When did Lego start to reduce the complexity of their
toys? (Answer: 1997-2003)
Hint: Lego decided to reduce the complexity of their puzzles as
part of their strategy to win back
popularity during the 1990s to the early 2000s. How can we
show this using our existing data?
Give it a go and see what you can find.
Solution:
Looking at the hints provided, what we know for sure is that we
have information on Lego’s history by year, with theme IDs and
the number of parts. To give us a more holistic view of what went
on in the company, we used aggregates, in this case Count, to
group the number of themes by year.
Given the issue on how Lego dealt with complexity, is there a
measure we can use to determine or
represent how complex the toys were on an annual basis? I.e.
Average Complexity per year.
The simplest way to show this is to define average complexity
as the total number of parts per year divided by the total
number of themes per year.
\text{Average Complexity} = \frac{\text{Total Number of Parts per Year}}{\text{Total Number of Themes per Year}}
This essentially gives you a ratio which should have a declining
trend over the years.
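The ratio described above can be checked by hand with a short Python sketch. The rows are hypothetical, and the denominator is taken here as the count of theme_id entries per year (one per set row), which is an assumption about how the count aggregate is applied.

```python
from collections import defaultdict

# Hypothetical rows shaped like the Sets table (illustrative values only).
sets = [
    {"year": 1996, "theme_id": 1, "num_parts": 300},
    {"year": 1996, "theme_id": 2, "num_parts": 500},
    {"year": 2002, "theme_id": 1, "num_parts": 100},
    {"year": 2002, "theme_id": 3, "num_parts": 150},
    {"year": 2002, "theme_id": 3, "num_parts": 50},
]

parts_sum = defaultdict(int)    # total number of parts per year
theme_count = defaultdict(int)  # count of theme_id entries per year

for row in sets:
    parts_sum[row["year"]] += row["num_parts"]
    theme_count[row["year"]] += 1

# Average complexity = sum of parts / count of theme IDs, per year.
average_complexity = {
    year: parts_sum[year] / theme_count[year] for year in sorted(parts_sum)
}

print(average_complexity)
```

With these made-up rows the ratio falls from 400 parts per entry in 1996 to 100 in 2002, which is the kind of downward movement to look for in the real data.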
Having established that, the real challenge is getting Pyramid
to perform the calculation for us and then visualise it. There
are two steps in this process: first, prepare our data in the ETL
process in the Data Flow tab; second, create the formula that
will allow us to visualise average complexity in Discovery.
Let’s start by going back to the Data Flow process in
Modelling.
Again, to recap, we need our data to reflect the count of theme
IDs and the sum of the number of parts, both grouped by year.
Pyramid allows for this kind of data transformation in the Data
Flow tab. Under the Elements tab, click on the Preparation
ribbon and then drag Summarize over to your workspace.
Also remember that the information and columns we need for
this are all contained within the Sets
dataset.
So, ensure first and foremost that the Select bar for Sets is
connected to the Summarize bar and that the Summarize bar is
connected to Memory. If possible, try to make these connections
without disconnecting any pre-existing links within the data
flow.
Under the Properties tab, you can rename your output table
(Summarize, like most of Pyramid’s built-in functions in the
Data Flow process, often produces an additional column, table,
etc.). For the purpose of this tutorial, we’ve used Complexity as
the name for the resulting table.
Next, go ahead and click on the Add New Column widget and
select the relevant columns with the respective aggregates we
need (if lost, see page 17). Once that is done, do a quick check
under the Data Model tab to make sure everything is in place. If
you accidentally disconnected some of the flows in the ETL
process, make sure that the joins you made earlier are still
valid.
Lastly, make sure the year_Groupby box is checked so that we
can analyse it later as a category. Once
everything is ready, go ahead and process your model.
Once you process your model, you should get something like
this. Granted, it is similar to what we already had, but now the
information is already grouped by year (constraints already
applied).
Next, to create our formula, click on the Create Calculation
tool under the Measures tab.
Pyramid will open its Logic workspace for you to create your
own formula, which you can later reuse in Discovery.
For this tutorial, we’ll work with Data Points but you are
welcome to explore the rest of the functions
in your own time.
Now under Data Point Properties, navigate through the Measures
hierarchy and check the box for the sum of parts.
Next, click on the Operators ribbon and select Divide. Repeat
the same process for the Theme ID count by dragging another Data
Point into your workspace. Hint: make sure your cursor is placed
to the right of the divide operator.
Once you’re done, click on the Test Formulation button to
ensure that your formula is valid. Save your
formula after that and return to our Discovery workspace (For
purposes of this tutorial, this is saved
as Average Complexity).
Back in the Discovery workspace, under the Measures tab, click
on the Show Business Logic tool. You should be able to see the
formula you created appear under the folder where it is saved.
With all of the above done, all that remains is to visualise your
findings. To make it visually simpler to understand and digest,
let’s also apply a filter to the years so we can see the trend
from the 1980s onwards.
As you can see, Lego did reduce the complexity of its sets in
an attempt to increase interest from its customer base. The
increase in complexity might be due to an increase in interest
and profit, which we can infer but can’t show without additional
information.
Short Cuts:
For your convenience, there is a Pyramid Analytics Model
already prepared with all 5 files joined. In Content Explorer,
under the Workgroups Content folder, click on Pioneer Content
and select the Lego Case file. You should be able to visualise
everything we talked about between pages 8 and 14, though not in
as much detail as if you follow the steps and do it manually.
Rebecca
Amid changing trends in societal expectations, technology, and
risk, many aspects of HR will be greatly affected and may
require substantial adjustments. One example, within the people
competency and the area of employee engagement and retention, is
employees’ growing expectation of added flexibility and autonomy
in the workplace. This change brings even greater challenges
because employees vary widely in how much they value or expect
it. Therefore, HR must be able, and willing, to obtain
information from employees to gauge how important this is to
them and which processes would have the greatest positive effect
on their engagement and retention. In other words, HR must be
flexible and adaptive to employees’ needs, while also being
willing to implement strategies and processes that meet those
needs.
Next, future trends also bring changes to risk management,
within the workplace competency, regarding privacy. Privacy and
cyber security are increasingly critical issues that must be
addressed. HR holds detailed personal information for all
employees within the organization, which can no longer be
protected by locking paperwork in a filing cabinet. Instead, this
information is increasingly stored in digital formats, which can
be hacked or compromised internally or externally. Therefore, HR
must rely on sufficient firewalls, password protections, and
other safeguards to ensure this information is secure.
Finally, beyond affecting the people and workplace
competencies, future trends can also affect performance
management within the organizational competency. One such trend
is the implementation of self-service technology. This trend
benefits HR by automating and standardizing the process, thereby
improving efficiency and equity, and it also meets employees’
needs by giving them more control and influence within the
appraisal process through more direct access to their
performance reviews and their contents.
Alexandra
What HR functional areas within the technical competency
domain of people do you believe will be impacted the most by
future trends? How?
I believe employee engagement will be impacted by future
trends such as the growing number of employees able to work from
home (WFH). Due to the pandemic outbreak that occurred early in
2020, many companies had no choice but to switch from working in
the office to working from home. Although WFH was the only
option employers had given the situation, some employers noticed
that workers were more productive when working from home. A
study conducted by Stanford of 16,000 workers over a 9-month
period concluded that WFH increased productivity by 13%
(“Surprising WFH Statistics”). Due to this increase in
productivity, many employers have decided to keep certain
positions permanently WFH or are willing to give their employees
the option of WFH or working in the office. As you can imagine,
giving employees this option provides flexibility and the
potential to boost retention, optimize productivity, enhance
remote work culture, nurture happiness, and improve fitness
(Bell, 2020).
What HR functional areas within the technical competency
domain of workplace do you believe will be impacted the most
by future trends? How?
Within the technical competency domain of the workplace, I
believe diversity and inclusion will be impacted by future
trends. For instance, with WFH becoming a common option in the
workplace, mothers returning from maternity leave can be part of
their children’s lives while maintaining their careers rather
than cutting down on hours to care for their children. Another
reason why diversity and inclusion will be most impacted is that
WFH opens the door to more accessibility. For instance, people
who have a disability can avoid commuting to the office and work
from the comfort of their own homes (Milanesi). On the other
hand, employers need to be mindful that not all employees have
the same resources at home, such as a strong Wi-Fi connection or
the computer(s) that are essential to working from home (if the
equipment is not being offered by the company). If companies do
not offer equipment for employees to take home, they need to
take into consideration that some employees may not have the
financial means to buy the equipment and may not have the
resources needed. Moreover, due to the flexibility of WFH,
companies will be able to attract more diverse talent from
outside the area.
What HR functional areas within the technical competency
domain of the organization do you believe will be impacted the
most by future trends? How?
I believe technology management will be impacted by future
trends as technology continues to evolve. I do not think the
impact of changes in technology will necessarily be negative;
however, it may cause some problems with adaptability, as some
employees are not very tech-savvy and may need more training or
shadowing to become more competent with the software.
References
Bell, A. (2020, November 9). 17 Must-Have Remote Employee
Perks You NEED To Know About. SnackNation.
https://snacknation.com/blog/remote-employee-perks/.
Milanesi, C. (2020, March 23). Working from home is great for
diversity. Let's keep it going. Fast Company.
https://www.fastcompany.com/90480008/working-from-home-is-great-for-diversity-lets-keep-it-going.