The Census Hub Project can currently be considered the most advanced project in which Internet technologies and SDMX solutions for data transmission come together for an ambitious goal: the dissemination of the Census 2011 results.
We analyze the Census Hub architecture, in which a central Hub on the Eurostat side manages the user interface and transforms all selections made by the user on the screen into an SDMX query. This query is sent to the web service on the NSI side, which parses it and transforms it into an SQL query that can be run against a database containing census data. The Hub queries the web service of each country involved in the answer. Finally, the Hub receives all the answers from the NSIs and builds a final table by putting them together. The importance of this implementation is that it is a completely new system that changes the way official data are disseminated and exchanged among organizations.
6. Overview: Flexibility, Harmonization
7. Harmonization: access to detailed Census data that are methodologically comparable among the Member States and structured in the same way.
8. Flexibility: final users should be able to cross-tabulate different variables.
9. The Goals: the dissemination of the results of the censuses in the EU should reflect those advantages to the highest possible extent.
10. The Traditional Approach: (1) Member States provide microdata to Eurostat. Eurostat aggregates the microdata and stores the resulting data in a central repository, which is then used for data dissemination. (2) Member States provide predefined tables to Eurostat. Eurostat publishes those tables on its website.
11. The Traditional Approach: Approach (1) maximises flexibility in offering data to final users. But:
– Aggregation functions on the central system could be very difficult to implement due to:
  • different confidentiality rules to be applied to microdata from different countries;
  • whether data come from a "full" census (conventional or register-based) or from a sample survey.
– Data maintenance could be very cumbersome because every time a revision is issued, an entire set of microdata needs to be updated or replaced.
12. The Traditional Approach: Approach (2) greatly simplifies the exercise. But it doesn't offer enough flexibility to final users, who would have limited possibilities to tailor data to their information needs.
14. Push and Pull: There are normally two different approaches to exchanging data: PUSH and PULL.
15. Push and Pull: PUSH mode means that the data provider takes action to send the data to the party collecting the data. PULL mode implies that the data provider makes the data available via the Internet; the data consumer then fetches the data on its own initiative.
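The difference between the two modes is mainly a question of who initiates the transfer. The following is a minimal sketch of that idea only; the `Provider` and `Collector` classes and their method names are hypothetical and do not come from the SDMX standards or from the Census Hub software.

```python
class Collector:
    """The party collecting the data (hypothetical name)."""

    def __init__(self):
        self.received = []

    def receive(self, dataset):
        # PUSH: called by the provider, on the provider's initiative.
        self.received.append(dataset)

    def fetch_from(self, provider):
        # PULL: the collector decides when to read what the provider published.
        self.received.append(provider.published())


class Provider:
    """The data provider (hypothetical name)."""

    def __init__(self, dataset):
        self._dataset = dataset

    def push_to(self, collector):
        # PUSH: the provider takes action and sends the data.
        collector.receive(self._dataset)

    def published(self):
        # PULL: the provider only makes the data available.
        return self._dataset


data = {"territory": "IT", "persons": 100}  # made-up figure for illustration

pushed = Collector()
Provider(data).push_to(pushed)       # provider initiates

pulled = Collector()
pulled.fetch_from(Provider(data))    # consumer initiates
```

Both collectors end up holding the same dataset; only the initiating party differs.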
16. Data Sharing Model: SDMX is primarily focused on the exchange and dissemination of statistical data and metadata. SDMX promotes a "data sharing" model to facilitate low-cost, high-quality statistical data and metadata exchange. Data Providers publish the availability of data/metadata to Data Consumers, and the latter are responsible for fetching the data/metadata at will.
20. The Census Hub Idea: The Census Hub is based on the concept of data sharing: a group of partners agree to provide access to their data according to standard processes, formats and technologies. Countries involved: IT, IE, DE, PT, MT, SI, EE, BG. Additional countries involved before the end of the year: GB, ES and GR.
22. may be retrieved from a database in response to an SDMX-conformant query. This architecture often also includes an SDMX registry that implements the general idea of a metadata registry. The Census Hub Idea
23. The Census Hub Idea: Each National Statistics Institute (NSI) creates a set of non-disclosure data. This data would be delivered via an information hub that enables data sharing on the Internet. Each NSI would provide web access to its data according to standard formats and technologies. A data user browses the hub to search for a dataset of interest using structural metadata (dimensions, attributes, code lists, etc.). Data is retrieved directly from the NSI system by the Hub.
30. The Pilot Project Architecture: The Query builder constructs one or more SDMX queries that are sent to the related NSIs' web services through the Web service client. When the Web service client receives the responses (in the format of an SDMX cross-sectional data message) from the queried web services, it forwards them to the Result aggregation manager. The Result aggregation manager puts together all the received SDMX data messages and sends the result to the Dissemination transformer, which transforms it from an XML format to HTML or CSV.
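The flow through the four Hub components can be sketched as follows. This is an illustration under loud assumptions, not the Eurostat implementation: plain Python dictionaries stand in for SDMX queries and SDMX cross-sectional messages, the NSI web services are replaced by local stub functions, and every function name and figure is hypothetical.

```python
import csv
import io


def build_queries(selection, countries):
    """Query builder: one query per selected country (dicts stand in for SDMX queries)."""
    return {c: {**selection, "territory": [c]} for c in countries}


def call_services(queries, services):
    """Web service client: send each query to that country's service, collect the responses."""
    return {c: services[c](q) for c, q in queries.items()}


def aggregate(responses):
    """Result aggregation manager: merge the per-country answers into one table."""
    rows = []
    for country in sorted(responses):
        rows.extend(responses[country])
    return rows


def to_csv(rows, columns):
    """Dissemination transformer: render the merged table as CSV."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def fake_service(country, persons):
    """Stub for an NSI web service; returns made-up figures for illustration only."""
    def service(query):
        return [{"territory": country, "sex": s, "persons": persons}
                for s in query["sex"]]
    return service


services = {"IT": fake_service("IT", 10), "PT": fake_service("PT", 20)}
queries = build_queries({"sex": ["F", "M"]}, ["PT", "IT"])
table = aggregate(call_services(queries, services))
csv_text = to_csv(table, ["territory", "sex", "persons"])
```

In this sketch the Hub never stores census data itself: it only fans the query out, merges the answers, and changes the output format.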
31. The Pilot Project Architecture (NSI side): The web service receives an SDMX query and forwards it to the SDMX query parser. The SDMX query parser breaks down the query and sends it to the SQL query builder. The SQL query builder creates one or more SQL queries and sends them to the Database. The result is assembled, by the SDMX-ML assembler, into an SDMX cross-sectional message that is sent, by the web service, to the central Hub.
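The NSI-side chain (parsed query, SQL query builder, database, assembler) can be illustrated with a small sketch. It assumes a single hypothetical table `census` whose columns are named after the pilot dimensions plus a `persons` count; the real NSI schemas and the SDMX-ML parsing and assembly steps are not shown in the slides, so a dictionary stands in for the parsed query and a list of dictionaries for the assembled message.

```python
import sqlite3

# Hypothetical column names; the actual NSI database schemas are not given.
DIMENSIONS = ("sex", "age", "activity_status", "territory")


def build_sql(query):
    """SQL query builder: turn {dimension: [codes]} into a parameterized statement."""
    unknown = set(query) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    group_cols = [d for d in DIMENSIONS if d in query]
    where, params = [], []
    for dim in group_cols:
        codes = list(query[dim])
        placeholders = ", ".join("?" for _ in codes)
        where.append(f"{dim} IN ({placeholders})")
        params.extend(codes)
    sql = "SELECT " + ", ".join(group_cols + ["SUM(persons) AS persons"]) + " FROM census"
    if where:
        sql += " WHERE " + " AND ".join(where)
    if group_cols:
        sql += " GROUP BY " + ", ".join(group_cols)
        sql += " ORDER BY " + ", ".join(group_cols)
    return sql, params


def answer(conn, query):
    """Run the SQL and assemble the rows (a stand-in for the SDMX-ML assembler)."""
    sql, params = build_sql(query)
    cur = conn.execute(sql, params)
    names = [col[0] for col in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]


def demo_database():
    """In-memory database with made-up figures, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE census (sex TEXT, age TEXT, activity_status TEXT, "
        "territory TEXT, persons INTEGER)"
    )
    conn.executemany(
        "INSERT INTO census VALUES (?, ?, ?, ?, ?)",
        [
            ("F", "Y15-29", "EMP", "IT", 100),
            ("F", "Y30-49", "EMP", "IT", 200),
            ("M", "Y15-29", "EMP", "IT", 150),
            ("M", "Y15-29", "UNE", "IT", 30),
        ],
    )
    return conn


conn = demo_database()
result = answer(conn, {"sex": ["F"], "territory": ["IT"]})
```

Codes from the query are passed as bound parameters rather than concatenated into the SQL text, and dimension names are checked against a fixed list, which keeps the generated statement safe even though the query arrives from outside the NSI.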
32. The Pilot Project Architecture Statistics Portugal Architecture Model
33. The Pilot Project Architecture Statistisches Bundesamt Architecture Model
35. Data should comprise the following dimensions: Sex, Age, Current Activity Status and Territory;
38. March 2008: preparation of requirement specification, functional and technical analysis;
39. April 2008: choice of one data hypercube and related breakdowns to use during the pilot; development of the Data Structure Definition (DSD);
40. June - September 2008: building of application modules (both Eurostat and NSI side); tests;
41. October 2008: evaluation report of the pilot; functional and technical analysis for the full 2011 Census Hub. The Pilot Project Roadmap
42. Results of the pilot project: Eurostat has developed the central Hub and, at the beginning of February 2009, it will be accessible in a test environment. Italy, Portugal, Germany and Ireland have already set up the architecture. Italy, Portugal and Ireland have produced documents (available on CIRCA) regarding their experience during the pilot phase (http://circa.europa.eu/Members/irc/dsis/x-dis-xensus-hub/library?l=/census_documents_1/case_studies).
44. Results of the pilot project: Moreover, the Census Hub Web Service Implementation Guidelines were produced; they explain how to build web services, using different IT technologies, capable of communicating correctly with the central hub (http://circa.europa.eu/Members/irc/dsis/x-dis-xensus-hub/library?l=/census_documents_1/documents). Finally, it is important to highlight how sharing experience and software among all the actors involved (Eurostat and NSIs) has reduced production costs and development time.
46. Participants will build an IT infrastructure useful not only for the pilot exercise but also for their 2011 census data warehouse using standards recognized at international level;
47. The same SDMX architecture could be used in other projects with few or no changes. Benefits of participating in the project
50. A wide selection of commercial IT applications and tools is available to work with XML-based data;
55. The architecture used represents the most advanced example of the data sharing model detailed in the SDMX standards
56. Volunteer NSIs can gain good experience in managing complex IT projects and a good knowledge of SDMX standards
57. As the Pilot has been planned to be as simple as possible so that all NSIs can participate with minor effort, this project is a good opportunity for all those who want to start using SDMX Conclusion