2. overview
• building the longitudinal firm-level database (LFLD)
• business demography: the big picture
• application (1): what do policymakers need to know about
business demography?
• application (2): matching the UK Innovation Survey
(UKIS)/Community Innovation Survey (CIS) to the LFLD
• drowning in data? building a longitudinal database from the
BSD local unit datasets
4. sources & building
1. sources
Inter-Departmental Business Register
(IDBR) is a ‘live’ register updated for jobs
from HMRC (VAT and PAYE) and the Business Register
and Employment Survey
the BSD comprises extracts from ‘snapshots’ of the IDBR
taken each March (1997 to 2013): there is no marker dating
the measurements
2. building the longitudinal database
focus on firm and job dynamics with firms linked year-to-year
by ID
appearance of first job ≡ birth of firm
disappearance of last job ≡ death of firm (secondary:
‘active’/‘inactive’ flag also used)
no firm can be re-born (with same ID)
assumptions of convenience: SIC at birth; location at birth
(latter currently under revision)
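The birth and death rules above can be sketched in a few lines. This is a minimal illustration, not the code used to build the LFLD: the row layout `(year, firm_id, jobs)`, the function name `firm_lifespans`, and the convention that the death year is the first snapshot after the last one with jobs are all assumptions made for the example.

```python
from collections import defaultdict

def firm_lifespans(snapshots, last_year):
    """Derive (birth_year, death_year) per firm from annual snapshot rows.

    snapshots: iterable of (year, firm_id, jobs) tuples (hypothetical layout).
    last_year: final snapshot year; firms with jobs in that year are still alive.
    """
    years_with_jobs = defaultdict(list)
    for year, firm_id, jobs in snapshots:
        if jobs > 0:
            years_with_jobs[firm_id].append(year)

    lifespans = {}
    for firm_id, years in years_with_jobs.items():
        birth = min(years)   # appearance of first job
        last = max(years)    # last year with any job
        # No re-birth with the same ID: gaps between first and last job are ignored.
        death = last + 1 if last < last_year else None
        lifespans[firm_id] = (birth, death)
    return lifespans

rows = [(1998, "A", 3), (1999, "A", 4), (2000, "A", 0),
        (1999, "B", 1), (2000, "B", 2)]
lifespans = firm_lifespans(rows, last_year=2000)
print(lifespans)
```

Taking the minimum and maximum year per ID is what enforces the "no firm can be re-born" rule in this sketch.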
14. 5 brutal facts
• every year a large number of private sector firms are born in the
UK ∼ typically between 200,000 and 250,000
• most newborn firms are very small ∼ around 90% have fewer
than 5 employees
• a decade later between 70% and 80% of those newborn firms
will be dead
• of those which have survived to age 10 ∼ around 75% of
those born with fewer than 5 employees will still have fewer than
5 employees
• the firms are born with about 1 million jobs ∼ a decade later
the survivors employ just half a million
15. putting the facts together
• a simple framework can be used to put these facts together
and it leads us to some (possibly) encouraging facts about job
growth
• we start with a table which tracks firms by size at birth from
birth to a date – here 10 years into the future
• this is called an origin/destination table: the rows are origins
– the size-band at birth; the destinations are size-bands 10
years later
• data here is an average of four successive birth cohorts, firms
born in four successive years: 1998, 1999, 2000, 2001
16. Table A: origin/destination table,
UK firms, birth to age 10, ’000
origin (birth)   destination (age 10) size-band
size-band           1-4     5-9   10-19     20+    dead     all
1-4                36.0     6.1     2.3     1.3   145.1   190.8
5-9                 1.8     1.4     0.8     0.5    11.4    15.9
10-19               0.4     0.3     0.4     0.4     3.9     5.5
20+                 0.2     0.1     0.1     0.7     2.6     3.7
all                38.4     7.9     3.6     3.0   163.1   216.0
Note: average of birth cohorts 1998 to 2001
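As a cross-check, the headline shares quoted in the "brutal facts" can be recomputed from the cells of Table A. The sketch below only re-enters the published cell values (in thousands); the variable names are illustrative, and row totals are summed from the rounded cells, so they can differ from the printed "all" column by 0.1.

```python
# Table A cells in thousands of firms; columns follow DEST order.
DEST = ["1-4", "5-9", "10-19", "20+", "dead"]
TABLE_A = {
    "1-4":   [36.0, 6.1, 2.3, 1.3, 145.1],
    "5-9":   [1.8, 1.4, 0.8, 0.5, 11.4],
    "10-19": [0.4, 0.3, 0.4, 0.4, 3.9],
    "20+":   [0.2, 0.1, 0.1, 0.7, 2.6],
}
DEAD = DEST.index("dead")

births = {origin: sum(cells) for origin, cells in TABLE_A.items()}
total_births = sum(births.values())
dead_by_10 = sum(cells[DEAD] for cells in TABLE_A.values())
small_survivors = births["1-4"] - TABLE_A["1-4"][DEAD]

share_small_at_birth = births["1-4"] / total_births
share_dead_by_10 = dead_by_10 / total_births
share_small_still_small = TABLE_A["1-4"][0] / small_survivors

print(f"born with 1-4 jobs:            {share_small_at_birth:.1%}")
print(f"dead by age 10:                {share_dead_by_10:.1%}")
print(f"small survivors still 1-4:     {share_small_still_small:.1%}")
```

The three shares come out close to the quoted "around 90%", "between 70% and 80%" and "around 75%".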
17. Table B: origin/destination table
UK jobs by size-band, at birth, ’000
origin (birth)   destination (age 10) size-band
size-band           1-4     5-9   10-19     20+    dead     all
1-4                54.1    11.5     4.5     2.6   218.9   291.4
5-9                11.3     8.8     5.2     3.7    73.1   102.1
10-19               5.8     4.1     5.9     6.2    52.4    74.3
20+                13.8     6.2     7.0   145.0   334.7   506.7
all                84.9    30.6    22.6   157.4   678.9   974.5
Note: average of birth cohorts 1998 to 2001
18. Table C: origin/destination table
UK jobs by size-band, at age 10, ’000
origin (birth)   destination (age 10) size-band
size-band           1-4     5-9   10-19     20+     all
1-4                47.2    33.3    28.7    80.2   189.5
5-9                 3.0     7.1     9.4    29.8    49.3
10-19               0.7     1.8     5.5    28.9    36.9
20+                 0.2     0.5     1.8   183.5   186.0
all                51.2    42.7    45.5   322.4   461.8
Note: average of birth cohorts 1998 to 2001
19. Table D: origin/destination table
UK jobs by size-band,
age 10 survivors, change
birth to age 10, ’000
origin (birth)   destination (age 10) size-band
size-band           1-4     5-9   10-19     20+     all
1-4                 9.7    23.6    24.4    77.8   135.5
5-9                -4.5     0.4     4.8    26.6    27.3
10-19              -3.4    -1.5     0.4    23.2    18.6
20+                -7.2    -3.8    -4.3    43.7    28.4
all                -5.4    18.6    25.3   171.4   209.9
Note: average of birth cohorts 1998 to 2001
20. two (possibly) encouraging facts
• a very small proportion ∼ less than 1% ∼ of the smallest (1 to
4 job) firms survive and make the transition to 20+ employees
• but this 1% makes a very large contribution to job growth ∼
accounting for around one third of all (net) jobs added by
survivors
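Both figures can be traced back to single cells of Tables A and D. The short calculation below is a sketch using the published values (thousands of firms and thousands of jobs); the variable names are illustrative only.

```python
# Table A: firms born with 1-4 jobs, and those of them with 20+ jobs at age 10.
firms_born_small = 190.8
firms_small_to_20plus = 1.3

# Table D: net jobs added by survivors, origin 1-4 -> destination 20+, and overall.
jobs_added_small_to_20plus = 77.8
jobs_added_all_survivors = 209.9

transition_share = firms_small_to_20plus / firms_born_small
job_growth_share = jobs_added_small_to_20plus / jobs_added_all_survivors

print(f"share of 1-4 job births reaching 20+: {transition_share:.2%}")
print(f"share of net jobs added by survivors: {job_growth_share:.1%}")
```

The first share is well under 1%; the second is a little over one third.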
22. what we have learned about the UKIS
from matching: qualitative
• the ‘businesses’ which respond to the UKIS are “reporting
units” (RUs) – this is an IDBR-defined category – for single
workplace firms RUs are just firms, but for multi-workplace
firms RUs are ‘groupings’ of workplaces
• more than one RU of a multi-workplace firm may be in the
sampling frame
• the RUs in the sampling frame (stratified by size-band and
sector) must have reported, in at least one ONS R&D survey,
that they undertook some R&D activity (but they may not
necessarily be active in the CIS survey period)
• the ‘grossed up’ population figures published in BIS reports on
the UKIS are also RUs – no attempt is made to convert RUs
into numbers of firms
23. what we have learned about the UKIS
from matching: quantitative (1)
duplicate firm IDs, waves of CIS,
numbers
CIS                      firms                          firms
wave  year  responses      all   unique  non-unique       swp     mwp
2     1997       2342     2291     2250          41      1227    1023
3     2001       8172     8075     8018          57      5536    2482
4     2005      16445    16113    15938         175     10589    5349
5     2007      14872    14591    14421         170      9681    4740
6     2009      14281    13994    13846         148      8997    4849
7∗    2011      13770    13556    13478         108      8767    4711
∗The data file had no firm ID, 572 respondents could not be matched
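A simple way to sanity-check counts like these is to test the adding-up identities they should satisfy. The sketch below assumes the columns read as responses, all firms, unique firms, non-unique firms, swp and mwp (that reading is an inference from the header layout, not something stated on the slide), and checks that all = unique + non-unique and unique = swp + mwp for each wave.

```python
# (wave, year, responses, firms_all, unique, non_unique, swp, mwp)
ROWS = [
    (2, 1997, 2342, 2291, 2250, 41, 1227, 1023),
    (3, 2001, 8172, 8075, 8018, 57, 5536, 2482),
    (4, 2005, 16445, 16113, 15938, 175, 10589, 5349),
    (5, 2007, 14872, 14591, 14421, 170, 9681, 4740),
    (6, 2009, 14281, 13994, 13846, 148, 8997, 4849),
    (7, 2011, 13770, 13556, 13478, 108, 8767, 4711),
]

def inconsistent_waves(rows):
    """Return (wave, failed identity) pairs for rows that do not add up."""
    problems = []
    for wave, _year, _resp, firms_all, unique, non_unique, swp, mwp in rows:
        if unique + non_unique != firms_all:
            problems.append((wave, "all != unique + non-unique"))
        if swp + mwp != unique:
            problems.append((wave, "unique != swp + mwp"))
    return problems

problems = inconsistent_waves(ROWS)
print(problems)
```

On the figures as transcribed here, every wave passes both checks except wave 7, where unique plus non-unique gives 13586 against a stated total of 13556; the source does not say why.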
24. what we have learned about the UKIS
from matching: quantitative (2)
consistency of employees & turnover
between CIS and LFLD, single workplace firms,
count of exact record matches
CIS           unique      swp
wave  year     firms    firms  employees  turnover
2     1997      2250     1227        187       127
3     2001      8018     5251        441       103
4     2005     15938    10584        870       104
5     2007     14421     9558       1142       645
6     2009     13846     8997        701        92
7     2011     13478     8767        408        37
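To see how rare exact agreement is, the match counts can be expressed as a share of single workplace firms in each wave. This is a sketch over the published counts only; dividing by the swp column is an assumption about the appropriate denominator.

```python
# (wave, year, swp_firms, exact employee matches, exact turnover matches)
ROWS = [
    (2, 1997, 1227, 187, 127),
    (3, 2001, 5251, 441, 103),
    (4, 2005, 10584, 870, 104),
    (5, 2007, 9558, 1142, 645),
    (6, 2009, 8997, 701, 92),
    (7, 2011, 8767, 408, 37),
]

# Map each wave to (employee match rate, turnover match rate).
match_rates = {
    wave: (emp / swp, turn / swp)
    for wave, _year, swp, emp, turn in ROWS
}

for wave, (emp_rate, turn_rate) in match_rates.items():
    print(f"wave {wave}: employees {emp_rate:.1%}, turnover {turn_rate:.1%}")
```

Under that assumption, the best wave for employees (wave 2) matches exactly for only about 15% of swp firms.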
25. drowning in data? building
a longitudinal database
from the BSD local unit
datasets
26. why are we interested in
longitudinal workplace-level data?
• in firm-level data jobs are ‘located’ at the firm’s HQ, so
spatial analysis needs workplace-level data to investigate:
job location and re-location
job creation and destruction
• workplace-level data permits a more fine-grained description
of firm-level job growth:
‘organic growth’ – expansion of an existing workplace or the
founding of a new workplace
‘growth by acquisition’ of an existing workplace from another
firm
‘death due to disposal’ – where the firm disappears as
workplaces are sold
firm death where all workplaces die
27. algorithm for building
the longitudinal workplace-level
database (1)
• starting with a birth cohort from the LFLD, use the live lu
marker to separate out firms which are only ever single workplace
(swp) firms
• call the rest multi-workplace (mwp) firms – though they are
not always so – this is the entref list
• the database is built birth cohort by birth cohort: the 1998 birth
cohort comprised 239,000 firms, of which about 2,000 were mwps
• match the BSD LU to the entref list year-by-year from 1998
onwards – this yields the luref list: all the workplaces ever
associated with the mwp firms – the ‘workplace history’ of
each mwp firm
• this list is about 16,000 records, since there are on average
eight workplaces per mwp
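The first steps of the algorithm can be sketched as follows. This is an illustration under stated assumptions, not the production code: the input layouts (`live_lu_by_firm` as a per-firm list of yearly live local-unit counts, `bsd_lu_by_year` as yearly lists of `(luref, entref)` pairs) and both function names are hypothetical.

```python
def split_cohort(live_lu_by_firm):
    """Split a birth cohort into always-single-workplace firms and the entref list.

    live_lu_by_firm: {entref: [live local-unit count for each year observed]}
    """
    swp, entref_list = set(), set()
    for entref, counts in live_lu_by_firm.items():
        if all(count <= 1 for count in counts):
            swp.add(entref)          # only ever one workplace
        else:
            entref_list.add(entref)  # multi-workplace in at least one year
    return swp, entref_list

def workplace_history(bsd_lu_by_year, entref_list):
    """Collect every luref ever associated with each firm on the entref list.

    bsd_lu_by_year: {year: [(luref, entref), ...]} from the BSD local unit files.
    """
    history = {entref: set() for entref in entref_list}
    for year in sorted(bsd_lu_by_year):
        for luref, entref in bsd_lu_by_year[year]:
            if entref in history:
                history[entref].add(luref)
    return history

cohort = {"E1": [1, 1, 1], "E2": [1, 2, 3]}
lu_files = {
    1998: [("L1", "E1"), ("L2", "E2")],
    1999: [("L1", "E1"), ("L2", "E2"), ("L3", "E2")],
    2000: [("L1", "E1"), ("L2", "E2"), ("L3", "E2"), ("L4", "E2")],
}
swp, entref_list = split_cohort(cohort)
luref_list = workplace_history(lu_files, entref_list)
print(swp, luref_list)
```

Restricting the year-by-year match to the entref list keeps the workplace-level files small, which is the point of separating out the always-single-workplace firms first.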
28. algorithm for building
the longitudinal workplace-level
database (2)
• check for consistency between the live lu series from LFLD
and the frequency count from the workplace history
• to determine the ‘firm history’ of each workplace – we merge
the LFLD year-by-year into the ‘luref list’
• we use year-to-year comparison of the firm history of each
workplace to determine whether it:
was born to the firm on the ‘entref list’
remains alive and is owned by the same firm (continuing)
is acquired from another firm – distinguishing firms on the
entref list from those in some other cohort
is disposed of to another firm – distinguishing firms on the
entref list from those in some other cohort
dies
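The year-to-year comparison amounts to a small decision rule per workplace. The sketch below is one possible reading of the five cases above; the function name, the use of `None` for "workplace absent this year", and the event labels are assumptions for illustration.

```python
def classify_transition(prev_owner, curr_owner, firm, entref_list):
    """Classify one workplace's change between two years, from `firm`'s viewpoint.

    prev_owner / curr_owner: entref owning the workplace in year t-1 / year t,
    or None if the workplace is absent from that year's data.
    Returns (event, counterparty_origin); the second element is set only
    for acquisitions and disposals.
    """
    def origin(other):
        # Is the other firm in the same birth cohort's entref list?
        return "entref list" if other in entref_list else "other cohort"

    if curr_owner == firm:
        if prev_owner is None:
            return ("born", None)
        if prev_owner == firm:
            return ("continuing", None)
        return ("acquired", origin(prev_owner))
    if prev_owner == firm:
        if curr_owner is None:
            return ("dies", None)
        return ("disposed", origin(curr_owner))
    return ("not owned", None)

entrefs = {"E2", "E5"}
print(classify_transition(None, "E2", "E2", entrefs))
print(classify_transition("E9", "E2", "E2", entrefs))
print(classify_transition("E2", "E5", "E2", entrefs))
```

Applying the function to each consecutive pair of years in a workplace's firm history gives the sequence of events for that workplace.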
29. algorithm: problems
• major discontinuity in 2003: missing around 90,000 records
• luref changes while the entref stays the same
• inconsistency between live lu in the LFLD and the number of
lurefs in the workplace-level BSD
30. algorithm: solutions
• 2003 discontinuity: imputation
if luref and entref are unchanged 2002 –> 2004 and death code
2004 == 0 (i.e. still alive), then set 2003 equal to 2002
(employees: average of 2002 and 2004?)
conservative solution: any luref changing entref will be dropped
• luref changes: if luref changes from being equal to entref to a
distinct luref with entref unchanged, match postcodes? conservative
solution: any luref changing entref will be dropped
• many live lu entries are missing from early years of the BSD, but
the scale of the problem is as yet unknown
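The 2003 imputation rule can be written down directly. The sketch below assumes a hypothetical record layout keyed by luref and year, with `entref`, `employees` and `death_code` fields; averaging employees is left as an option because the slide itself marks it as an open question.

```python
def impute_2003(records, average_employees=False):
    """Fill missing 2003 records for workplaces that bracket the gap unchanged.

    records: {luref: {year: {"entref": ..., "employees": ..., "death_code": ...}}}
    A 2003 record is created only if 2002 and 2004 both exist, the entref is
    the same in both, and the 2004 death code is 0 (still alive).
    """
    for by_year in records.values():
        if 2003 in by_year:
            continue  # nothing to impute
        r02, r04 = by_year.get(2002), by_year.get(2004)
        if r02 is None or r04 is None:
            continue
        if r02["entref"] == r04["entref"] and r04["death_code"] == 0:
            filled = dict(r02)  # default: set 2003 equal to 2002
            if average_employees:
                filled["employees"] = (r02["employees"] + r04["employees"]) / 2
            filled["imputed"] = True  # flag so imputed rows can be excluded later
            by_year[2003] = filled
    return records

data = {
    "L1": {2002: {"entref": "E1", "employees": 10, "death_code": 0},
           2004: {"entref": "E1", "employees": 14, "death_code": 0}},
    "L2": {2002: {"entref": "E1", "employees": 5, "death_code": 0},
           2004: {"entref": "E7", "employees": 5, "death_code": 0}},
}
impute_2003(data, average_employees=True)
print(data["L1"][2003], 2003 in data["L2"])
```

Workplaces whose entref changes across the gap (like `L2` here) are left unfilled, in line with the conservative solution of dropping them.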
32. The statistical data used here is from the Office for National
Statistics (ONS) and is Crown copyright and reproduced with the
permission of the controller of HMSO and Queen’s Printer for
Scotland. The use of the ONS statistical data in this work does not
imply the endorsement of the ONS in relation to the interpretation
or analysis of the statistical data.
33. coh98 survivors to age 10,
firms and jobs by workplace histories
counts        firms   jobs98   jobs08     jf98     jf08   growth
s always      39162   127192   233457     3.25     5.96    1.834
ms simple        52     5472     2814   105.23    54.12    0.514
m always         89    40266    45683   452.43   513.29    1.135
sm simple      1033    32895   146019    31.84   141.35    4.439
complex         500    17751    32321    35.50    64.64    1.821
all           40836   223576   460294     5.47    11.27    2.060
shares (%)    firms   jobs98   jobs08
s always       95.9     56.9     50.7
ms simple       0.1      2.4      0.6
m always        0.2     18.0      9.9
sm simple       2.5     14.7     31.7
complex         1.2      7.9      7.0
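The derived columns follow from the three count columns. The sketch below assumes that jf98 and jf08 are jobs per firm in 1998 and 2008 and that growth is jobs08 divided by jobs98; those definitions are inferred from the figures rather than stated on the slide.

```python
# history: (firms, jobs98, jobs08), copied from the counts panel
COUNTS = {
    "s always":  (39162, 127192, 233457),
    "ms simple": (52, 5472, 2814),
    "m always":  (89, 40266, 45683),
    "sm simple": (1033, 32895, 146019),
    "complex":   (500, 17751, 32321),
}

total_firms = sum(firms for firms, _, _ in COUNTS.values())

derived = {}
for history, (firms, jobs98, jobs08) in COUNTS.items():
    derived[history] = {
        "jf98": jobs98 / firms,           # jobs per firm at birth
        "jf08": jobs08 / firms,           # jobs per firm at age 10
        "growth": jobs08 / jobs98,        # ratio of closing to opening jobs
        "firm_share": 100 * firms / total_firms,
    }

for history, d in derived.items():
    print(f"{history:10s} jf98={d['jf98']:.2f} jf08={d['jf08']:.2f} "
          f"growth={d['growth']:.3f} firms={d['firm_share']:.1f}%")
```

Recomputing in this way reproduces the published columns to within rounding in the last digit.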
34. cohort98, survivors to 2008,
job creation and destruction accounts,
firms and jobs by swp/mwp status,
cumulated 1998 to 2008, ’000 (1)
(a) all
              always s  simple sm     other       all
opening          127.2       32.9      63.5     223.6
ownch            106.3       55.0      20.5     181.8
net trans                    58.1      -3.2      54.9
transinflow                  58.1      -3.2      54.9
transinstock                 64.4      80.4     144.0
transout                    -64.4     -80.4    -144.0
closing          233.5      146.0      80.8     460.3
35. cohort98, survivors to 2008,
job creation and destruction accounts,
firms and jobs by swp/mwp status,
cumulated 1998 to 2008, ’000 (2)
(b) swp
              always s  simple sm     other       all
opening          127.2       32.9      15.4     175.5
ownch            106.3       31.5      11.6     149.4
net trans                   -64.4      -4.1     -68.4
transinflow                   0.0     -19.5     -19.5
transinstock                  0.0      47.5      47.5
transout                    -64.4     -32.1     -96.5
closing          233.5        0.0      23.0     256.5
36. cohort98, survivors to 2008,
job creation and destruction accounts,
firms and jobs by swp/mwp status,
cumulated 1998 to 2008, ’000 (3)
(c) mwp
              always s  simple sm     other       all
opening                       0.0      48.1      48.1
ownch                        23.5       8.9      32.4
net trans                   122.5       1.0     123.5
transinflow                  58.1      16.4      74.5
transinstock                 64.4      32.1      96.5
transout                      0.0     -47.5     -47.5
closing                     146.0      57.8     203.8
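The accounts appear to obey a stock-flow identity: closing jobs equal opening jobs plus own change plus net transfers. That identity is inferred from the figures rather than stated on the slides, so the check below is a sketch: it uses panel (a), treats the blank net-transfer cell for always-single firms as zero (an assumption), and allows a small tolerance for rounding.

```python
# Panel (a), thousands of jobs: column -> (opening, ownch, net_trans, closing)
PANEL_A = {
    "always s":  (127.2, 106.3, 0.0, 233.5),   # blank net trans read as 0.0
    "simple sm": (32.9, 55.0, 58.1, 146.0),
    "other":     (63.5, 20.5, -3.2, 80.8),
    "all":       (223.6, 181.8, 54.9, 460.3),
}

def identity_gaps(panel):
    """Return closing - (opening + ownch + net_trans) for each column."""
    return {
        column: closing - (opening + ownch + net_trans)
        for column, (opening, ownch, net_trans, closing) in panel.items()
    }

gaps = identity_gaps(PANEL_A)
for column, gap in gaps.items():
    print(f"{column:10s} gap = {gap:+.2f}")
```

For panel (a) every gap is zero to within rounding of the published one-decimal figures.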