Session IV - Quality issues and generalized business processes - Bruno, Infante, Ruocco, Scannapieco, On the Design and Implementation of a Generalized Process for Business Statistics - Pierre Lavallée, Discussion

Istat event, Aula Magna
Workshop of the Advisory Committee on Statistical Methods
19 November 2018


  1. On the Design and Implementation of a Generalized Process for Business Statistics
     By Bruno, Infante, Ruocco and Scannapieco
     Discussion: Pierre Lavallée
     Rome, 19-20 November 2018
  2. Content
     1. Context
     2. Questions and issues
     3. GSBPM applied at Istat
     4. Generalized Process for Business Statistics
     5. GPBS Use Cases
     6. Some answers to questions and issues
     7. Conclusion
  3. 1. Context
     • Since 2014, Istat has been involved in a significant revision of its statistical production
     • Main objective: “enrich the supply and quality of the information produced, while improving the effectiveness and efficiency of the overall activity”
     • Generalized Process for Business Statistics (GPBS):
        • Designing and implementing a standardized system to support business statistics
        • Integrating the different steps of business surveys in a single environment
  4. 1. Context (cont’d)
     • Generic Statistical Business Process Model (GSBPM):
        • Describes and defines the set of business processes needed to produce official statistics
        • Provides a standard framework and harmonized terminology to help statistical organizations share methods and components, within and between different surveys
  5. 2. Questions and issues
     1. How can we manage the integration of ad-hoc legacy applications in a standard environment built on generalized functionalities? How can we cope with data processing parameters edited by the end user at runtime, through interactive graphical interfaces?
     2. How can we implement a user-friendly workflow that is easy to configure and manage for all types of users, regardless of their knowledge of software applications? Which is the most suitable environment?
     3. As the process chain has a low degree of complexity, is it better to develop an in-house solution or to evaluate a commercial product (e.g. enterprise service bus, orchestrator, etc.) to manage the workflow?
     4. The generalized statistical production process can potentially be abstracted as a “data-driven” workflow, in line with the scientific nature of Istat’s processes. Can we rely on the characteristics of data-driven workflows for designing and implementing more efficient solutions for the GPBS?
     5. Concerning the integration of various statistical services in the workflow, how should structural metadata be designed to describe the data structures passed as input/output to services and/or accessed as a central data repository?
     6. We do have process metadata describing, for instance, the process flow, which could be interpreted in an “active” way for execution. However, we also have other kinds of process metadata, related for instance to the naming of services and of the processes themselves. How do we manage such kinds of process metadata?
     Questions already answered in May 2018…
  6. 3. GSBPM applied at Istat
     • The GPBS project uses the GSBPM framework to standardise:
        • methods and tools for some phases of the statistical production process;
        • the workflow to manage the different phases of the production process
     • Current situation:
        • Highly customized methods and tools
        • Some processes tied to specific persons
        • “The strong connection between people and processes is a barrier towards sharing and standardization.”
     • Consequences:
        1. Duplicated work and inefficiencies in the production process
        2. No reuse of tools and competencies
  7. 3. GSBPM applied at Istat (cont’d)
     • First step:
        • Subset of GSBPM sub-processes, focusing on the Process and Analyze phases and their interdependencies
        • These sub-processes correspond to the standard methodological tasks
  8. 3. GSBPM applied at Istat (cont’d)
     • Second step:
        • Subset of Business Statistics surveys (5 short-term statistics and 3 structural business statistics) to start the modelling phase
        • For each selected survey:
           1. Interviewed the referring person, to obtain a current description of the processes
           2. Modelled and designed the future integrated system built with generalized tools, supporting the different process steps
           3. Missing item: discussion with the referring persons on implementation steps. This is necessary to get the “buy-in” of these persons/projects
  9. 3. GSBPM applied at Istat (cont’d)
     • Preliminary result:
        • To reduce duplication, the GSBPM sub-processes ‘Review & validate’ and ‘Edit & impute’ need to be harmonised
  10. 4. GPBS
     • Concepts:
        • Data Service
        • Process step
        • Statistical Service: a software providing one or more statistical business functions (extract sample, calculate weights, perform error checking, etc.) that can be invoked as a service → Performs the methodological tasks
        • Orchestrator
     • Metadata Management:
        • Structural metadata: data set name, variable names, variable types, unit identifier, classification identifier, etc.
        • Referential metadata: information objects necessary to run the process
        • Process metadata: No formal definition?…
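To make the structural metadata items listed above concrete, here is a minimal Python sketch. It is not the GPBS design: the class names `Variable` and `StructuralMetadata`, the `validate` method and the sample data set are all hypothetical, chosen only to show how a data set name, variable names and types, a unit identifier and a classification identifier could be declared once and then used to check a record passed to a statistical service.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Variable:
    """One variable of a data set (hypothetical structure)."""
    name: str
    dtype: type
    classification_id: Optional[str] = None  # e.g. a code list the values belong to


@dataclass(frozen=True)
class StructuralMetadata:
    """Describes the structure of a data set exchanged between services."""
    dataset_name: str
    unit_identifier: str
    variables: Tuple[Variable, ...]

    def validate(self, record: dict) -> List[str]:
        """Return a list of structural problems found in one record."""
        errors = []
        if self.unit_identifier not in record:
            errors.append(f"missing unit identifier '{self.unit_identifier}'")
        for var in self.variables:
            if var.name not in record:
                errors.append(f"missing variable '{var.name}'")
            elif not isinstance(record[var.name], var.dtype):
                errors.append(
                    f"variable '{var.name}' should be {var.dtype.__name__}"
                )
        return errors
```

A service could call `validate` on its input before running and refuse records that return a non-empty list, which keeps the structural description separate from the methodological code.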
  11. 4. GPBS (cont’d)
     • Assessment of GPBS performance:
        • Identification of indicators to assess (i) the efficiency and (ii) the quality improvement before and after the adoption of GPBS tools
     • Main expected outcomes:
        • Decrease of manual revisions, overlapping steps and overediting
        • “The reduction of the time lag to produce the final estimates should encourage the survey managers to use GPBS tools, instead of their own procedures”
           • Not necessarily the case for repeated surveys. We would rather say “The reduction of implementation costs and time to produce […]”.
  12. 4. GPBS (cont’d)
     • Assessment of GPBS performance (cont’d):
        • Target improvements to measure are related to the six dimensions of the ESS Quality Assurance Framework, especially accuracy, reliability, coherence and comparability
        • In case improvements are difficult to assess by explicit indicators: collect survey staff feedback and opinions concerning specific topics
           • Likely to happen in several survey steps
     • In the end, there might not be many measurable improvements, but rather a “general improvement” of methods, tools and workflow…
  13. 5. GPBS Use Cases
     5.1 Detection of influential suspicious units
     • GSBPM sub-process: “Review & validate”
     • Statistical Services can be invoked either in a synchronous or asynchronous mode
        • Synchronous: “Not suitable for huge amount of data”
        • Asynchronous: “Suitable for large amount of data”
        → Why?
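The slides do not answer the “Why?”. One common reading, offered here as an assumption, is that a synchronous call blocks the caller until the service has finished, whereas an asynchronous call returns a job handle immediately and the result is collected later. The sketch below illustrates only that difference. The detection rule (distance from the median measured in median absolute deviations) is a toy stand-in, not the method used in the GPBS, and every function name is hypothetical.

```python
import statistics
from concurrent.futures import ThreadPoolExecutor


def detect_influential(values, threshold=3.0):
    """Toy rule: flag positions further than threshold * MAD from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - med) > threshold * mad]


def invoke_sync(values):
    """Synchronous mode: the caller waits until the result is ready."""
    return detect_influential(values)


_executor = ThreadPoolExecutor(max_workers=2)
_jobs = {}


def submit_async(values):
    """Asynchronous mode: start the job and return a job id immediately."""
    job_id = len(_jobs) + 1
    _jobs[job_id] = _executor.submit(detect_influential, values)
    return job_id


def job_done(job_id):
    """Non-blocking status check that a caller can poll."""
    return _jobs[job_id].done()


def fetch_result(job_id):
    """Collect the result; blocks only if the job is still running."""
    return _jobs[job_id].result()
```

With a large input, `invoke_sync` ties up the caller (and possibly a connection that may time out) for the whole run, while `submit_async` lets the orchestrator continue and poll `job_done` before calling `fetch_result`.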
  14. 5. GPBS Use Cases (cont’d)
     5.2 Manual revision of influential suspicious units
     • Orchestration
     • “Assuming the output of the influential values detection […] is a subset of units to be manually reviewed, the survey staff may:
        1. Use the available information (other sources or historical survey data) to check the coherence of survey data;
        2. Contact the respondents to verify collected data.”
     • The second step should rather be:
        2. If time and budget are sufficient, contact the respondents to verify collected data. Otherwise, flag the data to be imputed
  15. 5. GPBS Use Cases (cont’d)
     5.2 Manual revision of influential suspicious units (cont’d)
     • In the technical remarks, we have: “The Process Orchestrator should perform the following tasks: Avoid overlapping of different users revisions (transactional support in coordination with the RDBMS);”
        → What does RDBMS mean?
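RDBMS is the usual abbreviation for relational database management system. As an illustration of what “avoid overlapping of different users revisions” could look like in practice, the sketch below uses optimistic locking with a version column in SQLite. This is one possible technique, not the one described in the paper, and the table layout and function names are hypothetical.

```python
import sqlite3


def create_demo_table(conn):
    """Create a tiny table of survey units with a version counter."""
    conn.execute(
        "CREATE TABLE units ("
        "unit_id TEXT PRIMARY KEY, turnover REAL, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO units VALUES ('A1', 1200.0, 1)")
    conn.commit()


def read_unit(conn, unit_id):
    """Return (turnover, version) as seen by a reviewer."""
    return conn.execute(
        "SELECT turnover, version FROM units WHERE unit_id = ?", (unit_id,)
    ).fetchone()


def revise(conn, unit_id, new_turnover, expected_version):
    """Apply a revision only if nobody else revised the unit in the meantime."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE units SET turnover = ?, version = version + 1 "
            "WHERE unit_id = ? AND version = ?",
            (new_turnover, unit_id, expected_version),
        )
    return cur.rowcount == 1
```

If two reviewers read the same version and both try to save, the first `revise` succeeds and the second returns `False`, so the second reviewer can reload the unit instead of silently overwriting the earlier revision.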
  16. 5. GPBS Use Cases (cont’d)
     5.3 Business demographic events
     • GSBPM sub-process: “Create frame & select sample”
     • “It would be useful to store and centrally manage all changes transmitted by the respondents of different surveys, in order to reduce the time gap between the Register and survey data.”
        • Good point
        • However, keep in mind that a repeated overlapping survey cannot be its own only source of updates to the Business Register. This can create serious biases in the results of such surveys
        • In practice, this problem seldom occurs because the Business Register has various update sources
  17. 5. GPBS Use Cases (cont’d)
     5.3 Business demographic events (cont’d)
     • Different update schedules:
        • Immediate updates for short-term surveys
        • Different update schedules for structural statistics
     • Adopted solution: hold two versions of key variables
        • “Current” version: updated as soon as new information is received
        • “Frozen” version: stable for a certain period and updated periodically from the current version
     • Challenge: “design the most appropriate data architecture to standardize the link between the ‘current’ version and the ‘frozen’ one.”
  18. 5. GPBS Use Cases (cont’d)
     5.3 Business demographic events (cont’d)
     • “Frozen” version:
        • Survey specific
        • Corresponds to the extracted frame for each survey
     • Statistical Service:
        1. Update “current” version
        2. Quality check before updating “frozen” version
        3. Update “frozen” version
     • This approach is likely to create coherence and scheduling problems
        • Example: once the “frozen” version is updated, how can a given survey go back for specific tasks such as nonresponse adjustments or calibration?
  19. 5. GPBS Use Cases (cont’d)
     5.3 Business demographic events (cont’d)
     • Proposed approach:
        1. Always have a “current” version of the Business Register
        2. As part of the frame creation process of a specific survey, extract a “frozen” version (frame) from the Business Register
        3. Keep the “frozen” version during the whole survey process
        4. Update the “current” version of the Business Register using survey feedback (being careful with possible bias problems)
        5. Extract a new “frozen” version (frame) for the next occurrence of the survey
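The five steps of the proposed approach can be sketched in a few lines of Python. This is only an in-memory illustration under simplifying assumptions: the class `BusinessRegister`, its methods and the variables used are hypothetical, and a real register would live in a database instead of a dictionary.

```python
import copy
from types import MappingProxyType


class BusinessRegister:
    """Holds the 'current' version; frames are independent snapshots."""

    def __init__(self, units):
        # Step 1: there is always exactly one 'current' version.
        self._current = copy.deepcopy(units)

    def extract_frame(self):
        """Steps 2 and 5: extract a 'frozen' version for one survey occurrence.

        The frame is a private deep copy behind a read-only top-level view,
        so later register updates never reach it.
        """
        return MappingProxyType(copy.deepcopy(self._current))

    def apply_feedback(self, unit_id, **changes):
        """Step 4: update the 'current' version from survey feedback."""
        self._current.setdefault(unit_id, {}).update(changes)

    def current(self, unit_id):
        """Return a copy of the current values for one unit."""
        return dict(self._current[unit_id])
```

Step 3 needs no code: the survey simply keeps using the frame it extracted, so nonresponse adjustment or calibration can always refer back to the same frame, while the next occurrence calls `extract_frame` again.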
  20. 5. GPBS Use Cases (cont’d)
     5.3 Business demographic events (cont’d)
     • Simple approach that solves the coherence and scheduling problems
        • No need to “evaluate the best timing that reduces the trade-off between timeliness and accuracy of the stored information and statistical outputs to be built from the Register.”
     • “The second issue concerning the workflow, arises from the need to inform and prioritize the different stakeholders about data changes and updates.”
        • The stakeholders only need to be informed about changes and updates between the two “frozen” versions (i.e., the two frames) used for two occurrences of their survey
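Informing stakeholders about the changes between two “frozen” versions amounts to comparing two frames. A minimal sketch follows, assuming each frame is a mapping from unit identifier to a dictionary of variables; the function name `frame_changes` and the report layout are hypothetical.

```python
def frame_changes(previous, new):
    """Summarise births, deaths and changed variables between two frames."""
    added = sorted(new.keys() - previous.keys())
    removed = sorted(previous.keys() - new.keys())
    changed = {}
    for unit_id in sorted(previous.keys() & new.keys()):
        old_vars, new_vars = previous[unit_id], new[unit_id]
        # Map each differing variable to its (old value, new value) pair.
        diffs = {
            name: (old_vars.get(name), new_vars.get(name))
            for name in sorted(old_vars.keys() | new_vars.keys())
            if old_vars.get(name) != new_vars.get(name)
        }
        if diffs:
            changed[unit_id] = diffs
    return {"added": added, "removed": removed, "changed": changed}
```

The report only ever compares the frame of one survey occurrence with the frame of the next, so intermediate updates to the “current” version do not need to be communicated individually.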
  21. 6. Some answers to questions and issues
     May come back to questions 5 and 6…
     5. Concerning the integration of various statistical services in the workflow, how should structural metadata be designed to describe the data structures passed as input/output to services and/or accessed as a central data repository?
        • SAS® already has PROC options that are more or less “structural metadata”. They could be used as examples for the GPBS
           • Example: some generalized software developed by Statistics Canada looks like a SAS® PROC.
        • Look at the developments of other members of the UNECE High-Level Group for the Modernisation of Official Statistics. The Common Statistical Production Architecture (CSPA) can help in standardising calls to other software (statistical services)
  22. 6. Some answers to questions and issues (cont’d)
     6. We do have process metadata describing, for instance, the process flow, which could be interpreted in an “active” way for execution. However, we also have other kinds of process metadata, related for instance to the naming of services and of the processes themselves. How do we manage such kinds of process metadata?
        • Again, SAS® PROCs can be used as examples of “process metadata” for the GPBS
  23. 7. Conclusion
     • Significant developments in the GPBS
     • Clearer view of the specific processes and workflows of business statistics, in accordance with the GSBPM
     • Key step: discussion with the referring persons on implementation steps. This is necessary to get the “buy-in” of these persons
  24. 7. Conclusion (cont’d)
     • Remember…