Most data integration software was built to run data through ETL servers. That worked well at the time for several reasons: there wasn't that much data (1TB was considered a large amount), most data was structured, and the turnaround time for that data was monthly. Even then, daily loads became a problem for most companies. Because of the limitations of the early tools, much of the work was hand-coded, undocumented, and lacked central management.
When Einstein published his ideas, he became a pivotal figure in shifting the way we think about physics, from the Newtonian model to the quantum model. In turn, this changed the way we think about the world and allowed us to develop new ways of engaging with it.
We are at a similar juncture. The development of computational technologies allows us to think about astronomical volumes of data and to make meaning of that data.
The mindshift that occurs is that 'the machine is our friend'. The computer, like all machines, extends our capabilities. As a consequence, the kinds of thinking now required in industry are those that move away from thinking like a computer and toward creative engagement with possibilities. Logical thinking is still necessary, but it starts to be driven by imagination.
Computational thinking and data science change the way we think about defining and solving problems.
We are entering an age of creativity, one whose impact increasingly extends from the arts to business, scientific, technological, entrepreneurial, political, and other contexts.
Data Center Computing for Data Science: an evolution of machines, middleware,... - Paco Nathan
Guest lecture 2013-08-27 at General Assembly in SF for the Data Science program taught by Jacob Bollinger and Thomson Nguyen https://generalassemb.ly/education/data-science/san-francisco
Many thanks to Thomson, Jacob, and the participants in the course. Excellent Q&A!
Received a bottle o' Cardhu (my fave Scotch) in payment for lecture, and since it's Burning Man Week, the city was emptied so we had enough to share with the class :)
Evidence:
https://plus.google.com/u/0/110794698656267747127/posts/GvjhhQ99CTs
Delivering on Standards for Publishing Government Linked Data - 3 Round Stones
Progress report on publishing open government data using Open Web Standards. Delivered by Bernadette Hyland, co-chair W3C Government Linked Data Working Group at the European Data Forum 2013, Dublin, Ireland.
An invited talk by Paco Nathan in the speaker series at the University of Chicago's Data Science for Social Good fellowship (2013-08-12) http://dssg.io/2013/05/21/the-fellowship-and-the-fellows.html
Learnings generalized from trends in Data Science:
a 30-year retrospective on Machine Learning,
a 10-year summary of Leading Data Science Teams,
and a 2-year survey of Enterprise Use Cases.
http://www.eventbrite.com/event/7476758185
A Roadmap Towards Big Data Opportunities, Emerging Issues and Hadoop as a Sol... - Rida Qayyum
The concept of Big Data has become extensively popular because of its wide use in emerging technologies. Complex and dynamic big data environments generate colossal amounts of data that are impossible to handle with traditional data processing applications. Nowadays, the Internet of Things (IoT) and social media platforms such as Facebook, Instagram, Twitter, WhatsApp, LinkedIn, and YouTube generate data in many formats, which creates a pressing need for technology to store and process this tremendous volume of data. This research outlines the fundamental literature required to understand the concept of big data, including its nature, definitions, types, and characteristics. The primary focus of the study is two fundamental issues: storing an enormous amount of data and processing it quickly. To that end, the paper presents Hadoop as a solution and discusses the Hadoop Distributed File System (HDFS) and the MapReduce programming framework for storing and processing Big Data efficiently. Future research directions are derived from the opportunities and emerging issues in the Big Data domain; they are intended to facilitate exploration of the field and the development of better solutions to Big Data storage and processing problems. The study thereby contributes to the existing body of knowledge by comprehensively addressing the opportunities and emerging issues of Big Data.
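The abstract above names the MapReduce programming framework as the processing side of the solution. As a rough illustration of that model (a minimal sketch in plain Python, not Hadoop's actual Java API; the functions and sample input are invented), a word count can be expressed as a map phase, a shuffle that groups by key, and a reduce phase:

```python
from collections import defaultdict
from typing import Iterable, Iterator, List, Tuple

# Map phase: emit (key, value) pairs for each input record.
def map_words(line: str) -> Iterator[Tuple[str, int]]:
    for word in line.lower().split():
        yield (word, 1)

# Shuffle phase: group all values by key, as the framework would do between map and reduce.
def shuffle(pairs: Iterable[Tuple[str, int]]) -> dict:
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: combine the grouped values into one result per key.
def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    return (key, sum(values))

if __name__ == "__main__":
    lines = ["big data needs new tools", "new tools for big data"]  # stand-in for an input split
    mapped = (pair for line in lines for pair in map_words(line))
    grouped = shuffle(mapped)
    results = [reduce_counts(k, v) for k, v in grouped.items()]
    print(sorted(results))
```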
Linked Water Data For Water Information Management - Edward Curry
The management of water consumption is hindered by low general awareness and the absence of precise historical and contextual information. Effective and efficient management of water resources requires a holistic approach that considers all the stages of water usage. A decision support tool for water management services requires access to a number of different data domains and different data providers. The design of next-generation water information management systems poses significant technical challenges in terms of information management, integration of heterogeneous data, and real-time processing of dynamic data. Linked Data is a set of web technologies that enables integration of different data sources. This work investigates the usage of Linked Data technologies in the Water Management domain, describes the fundamental concepts of the approach, details an architecture, and discusses possible water management applications.
Overview of data collection methods and a deep dive on data (primary vs. secondary, qualitative and quantitative). Bias. Data processing and structured, unstructured, semi-structured data. Database jargon.
Data Science For Social Good: Tackling the Challenge of Homelessness - Anita Luthra
A talk presented at the Champions Leadership Conference Series: leveraging data provided by New York City's Department of Homeless Services, software vendor Tibco partnered with SumAll.Org to help tackle the societal challenge of homelessness in New York City.
Making the invisible visible. Managing the digital footprint of development p... - UNDP Eurasia
Thanks to new technologies, now accessible even in remote places, development work - and development workers - have an increasing digital footprint. Quite literally, what was invisible can now become visible, with major implications for aid effectiveness, transparency and fundraising. Being able to manage that footprint effectively and analyse it to identify emerging trends is going to be a differentiating skill in the Development 2.0 world. This presentation illustrates some key concepts, examples and tools that development organisations can use to analyse and manage their digital footprint.
Big Data, Data Science, Machine Intelligence and Learning: Demystification, T... - Prof. Dr. Diego Kuonen
Keynote presentation given by Prof. Dr. Diego Kuonen, CStat PStat CSci, on March 14, 2017 at Eurostat's international conference `New Techniques and Technologies for Statistics (NTTS) 2017' in Brussels, Belgium.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
A Swiss Statistician's 'Big Tent' View on Big Data and Data Science (Version 10) - Prof. Dr. Diego Kuonen
Keynote talk given by Dr. Diego Kuonen, CStat PStat CSci, on October 21, 2015, at the `Austrian Statistics Days 2015' in Vienna, Austria.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional Swiss statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
We have entered an era of Big Data. Through better analysis of the large volumes of data that are becoming available, there is the potential for making faster advances in many scientific disciplines and for improving the profitability and success of many enterprises. However, many technical challenges described in this paper must be addressed before this potential can be realized fully. The challenges include not just the obvious issues of scale, but also heterogeneity, lack of structure, error-handling, privacy, timeliness, provenance, and visualization, at all stages of the analysis pipeline from data acquisition to result interpretation.
A Statistician's 'Big Tent' View on Big Data and Data Science (Version 9) - Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on October 1, 2015, at the `Joint SCITAS and Statistics Seminar' of the EPFL in Lausanne, Switzerland.
ABSTRACT
There is no question that big data have hit the business, government and scientific sectors. The demand for skills in data science is unprecedented in sectors where value, competitiveness and efficiency are driven by data. However, there is plenty of misleading hype around the terms 'big data' and 'data science'. This presentation gives a professional statistician's 'big tent' view on these terms, illustrates the connection between data science and statistics, and highlights some challenges and opportunities from a statistical perspective.
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
Big Data, Data-Driven Decision Making and Statistics Towards Data-Informed Po... - Prof. Dr. Diego Kuonen
Presentation given by Dr. Diego Kuonen, CStat PStat CSci, on October 20, 2015, at the Swiss Statistical Society's celebration of the `World Statistics Day 2015' in Olten, Switzerland.
Further information is available at https://worldstatisticsday.org/blog.html?c=CHE
The presentation is also available at http://www.statoo.com/BigDataDataScience/.
IT tools for statistics, visualization, open data - Carlo Vaccari
Seminar "Opening Financial Data in Turkey: transparency, accessibility and citizen involvement"
"IT tools for statistics, visualization, open data"
Carlo Vaccari
Ankara, April 19 2012
Presentation slides used during the meetup on Artificial Intelligence and Its Ecosystem organized by Developer Session. In the presentation, I highlighted why open data is one of the key parts of the AI ecosystem and described the state of Open Data in Nepal.
On December 9 & 10, Deloitte hosted over 20 business executives and thought leaders at the Internet of Things (IoT) Grand Challenge Workshop at the Tech Museum of Innovation in San Jose. The objective of the gathering was to work collectively on one of the largely unexplored areas of IoT: revenue-generating IoT use cases. The following report captures what was discussed during this extraordinary event, where an open, collaborative dialogue focused on advancing the field of IoT.
Explore the key findings here or learn more at www2.deloitte.com/us/IoT-challenge.
Putting the L in front: from Open Data to Linked Open Data - Martin Kaltenböck
Keynote presentation of Martin Kaltenböck (LOD2 project, Semantic Web Company) at the Government Linked Data Workshop in the course of the OGD Camp 2011 in Warsaw, Poland: Putting the L in front: from Open Data to Linked Open Data
Innovation med big data - chr. hansens erfaringer - Microsoft
In many places, Big Data is still the new and unknown, and has no top priority with IT, since "we don't have large volumes of data". But Big Data is much more than large volumes of data. At Chr. Hansen A/S, the Research and Development (Innovation) department has worked with the value of data and, as a result, established a cross-disciplinary BioInformatics programme based on Big Data technologies from Microsoft.
The Brussels Data Science community is supported by the European Data Innovation Hub.
Our mission is to educate, inspire and empower scholars and professionals to apply data sciences to address humanity's grand challenges.
What we do
mind the gap
We are the fastest growing community of data scientists in Europe.
We love doing Data4Good.
We promote the value of analytics and organise events, hands-on sessions and trainings to close the gap between academics and business.
Join us if you want to share, learn and have fun with analytical & technological innovation & positive social change.
Presentation given at the conference "open data for impact"
Erasmus+ project "Public Makers"
https://www.linkedin.com/posts/wide-luxembourg_opendata-publicmakers-activity-6818166878473596928-7ImU/
This paper describes the NHS National Innovation Centre's Linked Data initiative. A discussion on the OHIO (Open Health Innovation Ontology) is also provided.
Open Source & Open Data Session report from imaGIne 2014 Conference - GSDI Association
Session report from the imaGIne 2014 Conference held in Berlin, Germany, in October 2014. The session was chaired by Dr. Gabor Remetey-Fulopp of HUNAGI, which was a co-organiser of Session 8C1.
Open Data per il riuso della PSI: l'Europa spinge sull'economia del futuro - Matteo Brunati
Presentation given at the "Hack4Med and roadmap to Veneto Open Data" event to give some perspective on the European question regarding PSI (Public Sector Information), and also to describe some reuse contexts from a business angle - the side that most needs to be encouraged and communicated.
A Guide to Data Innovation for Development - From idea to proof-of-concept - UN Global Pulse
"A Guide to Data Innovation for Development - From idea to proof-of-concept" provides step-by-step guidance for development practitioners to leverage new sources of data. It is the result of a collaboration between UNDP and UN Global Pulse with support from UN Volunteers.
The publication builds on successful case trials of six UNDP offices and on the expertise of data innovators from UNDP and UN Global Pulse who managed the design and development of those projects.
The guide is structured into three sections: (I) Explore the Problem & System, (II) Assemble the Team and (III) Create the Workplan. Each section comprises a series of tools for completing the steps needed to initiate and design a data innovation project, to engage the right partners and to make sure that adequate privacy and protection mechanisms are applied.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don't know what they don't know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients' needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
Neuro-symbolic is not enough, we need neuro-*semantic* - Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as "predictable inference".
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
JMeter webinar - integration with InfluxDB and Grafana - RTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Essentials of Automations: Optimizing FME Workflows with Parameters - Safe Software
Are you looking to streamline your workflows and boost your projects' efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you're in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part "Essentials of Automation" series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here's what you'll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We'll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don't miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... - Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What's changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
Sharing Advisory Board newsletter #8
Sharing Advisory Board
Software Sharing Newsletter
Issue 8: April 2013
Editorial (Marton Vucsan, SAB Chairman)
Kodak, Sony, Philips, your local newspaper: it happened to all of them. The main product they built their existence on was no longer relevant. Films, walkmans, light bulbs, encyclopedias - the list is endless. Whole industries are wiped out to make room for new ones. Darwin would have enjoyed this. At the Big Data seminar organized by the UN in New York you could see the first signals that for us, too, the bell will ring in due course. Choices will have to be made...
There is a tremendous opportunity in using the new
information that the planet now produces as a by-product of all its processes. It opens the door to all
kinds of new statistical products and processes to
make them. Let us for this moment focus on the
processes. Looking at the way our researchers
work with the new data we see it is fundamentally
different from what we are used to. First, the amount of data they work with is far too large to edit; second, the tools they use are alien to those we are familiar with; third, the workflow is intermittent because of the time needed for processing at every step. Aside from setting up rule sets and creating
workflow type actions no human labor is involved in
the actual production. The creation of these
workflows is knowledge intensive, and once they
are created very little effort is needed to produce
the result. What we see is a shift from manual
labour in production to manual labour in design
which will supply the multiplier we need for sharing
to be effective. In the traditional artisanal production system there is no multiplier: more statistical output means more people. Sharing and collaborating outside the office seems not very meaningful there, because of the logistical overhead and the differences in execution and definition. Moreover, there is no multiplier present either; sharing work does not make other work unnecessary in that kind of setup.
In the big data area where the problems are more
uniform, the real work is in the design of the
process. The processes are more formally defined
and these processes represent the production
knowledge for the statistics they produce. It will be
very profitable to share these processes under the
"build one get ten" rule. The reason is that there is
a multiplier present in the form of a formalized
mostly automated process.
In this Issue:
- Open data - taming the tiger
- The OECD Open Data Project
- Developing software sharing in the European Statistical System
- Improving data collection by soft computing
- Tools for a Sprint
- Understanding "Plug and Play"
The knowledge is deployed in the design phase
of the process and sharing parts of the processes
means getting executable process parts in return
including the knowledge that created them. In
this setup sharing means indeed freeing up
resources.
The threat or opportunity of Big Data will help us
to do two things: It will help us shift the balance
between human labor and machine labor towards
machine labor and it will help us to become a
solution sharing industry that can do much more
with much less money.
In the end it might not only be the new products
that will be our opportunity but also the new
industrial processes that will underlie these new
products.
Are you Linked?
The LinkedIn group "Business Architecture in Statistics" aims to share knowledge and ideas
and promote activities, such as those undertaken
by the Sharing Advisory Board, that support
standards-based modernisation and greater
collaboration between statistical organisations.
Join the discussions at:
http://www.linkedin.com/groups?home=&gid=4173055
You can also find out more about SAB activities
and outputs via the MSIS wiki:
www1.unece.org/stat/platform/display/msis
[Special Feature: Open Data - The next two articles explore the implications of open data for official statistics. The first presents the view from a national statistical organisation (CSO Ireland). The second gives the perspective of an international organisation (OECD).]

Open data - taming the tiger
Eoin McCuirc (Central Statistics Office, Ireland)
The term "open data" means different things to different people, though the goals of making information freely available and easily accessible online are very clear. I'll start by looking at Tim Berners-Lee's classification of five levels of open data:
★ Make your data available on the Web under an open license
★★ Make it available as structured data (e.g. Excel sheet instead of image scan of a table)
★★★ Use a non-proprietary format (e.g. CSV file instead of an Excel sheet)
★★★★ Use Linked Data formats (URIs to identify things, RDF to represent data)
★★★★★ Link your data to other people's data to provide context
So, there are degrees of "openness" - from simply putting information up on the web to providing linked open data. Yes, both are a form of open data but, though similar in appearance, they are two completely different animals, as different say as a cat to a tiger. In this article I want to talk about the tiger: linked open data, the semantic web and how the CSO is beginning to meet this new challenge.
In managing the dissemination of statistics we are
guided by international standards. Principles 14
and 15 of the European Statistics Code of Practice
are particularly relevant:
Principle 14 Coherence and Comparability: European Statistics are consistent internally, over time and comparable between regions and countries; it is possible to combine and make joint use of related data from different sources.
Principle 15 Accessibility and Clarity: European Statistics are presented in a clear and understandable form, released in a suitable and convenient manner, available and accessible on an impartial basis with supporting metadata and guidance.
Clearly, the opportunities offered by open data will
help statistical offices to deliver outputs which
match these two principles.
The data deluge, or the accumulation of data about people, places and things, is changing the world in which statistical offices process and publish statistics - and is another important driver for open data. In general, it is getting more and more difficult to find the information you need - the problem of the needle in the haystack. Sooner rather than later we will need machines to trawl through all available data in order to find the proverbial needle. But for this to become possible, data needs to be structured in a particular way - a challenge to which the semantic web offers a solution.
The semantic web provides a way of making data
machine-readable, independent of the variety of
technical platforms and software packages in use
throughout the web. A key concept is that of linked
open data. In many ways, linked open data is
similar to open data: An organisation such as a
statistical office decides what information it wants
to publish on the web and makes the necessary
technical choices about hosting, security, domain
names, content management, maintenance, etc.
These choices apply equally to publishing linked
open data.
However, the key difference is one of language. In linked open data, semantic web objects are named to indicate all the attributes needed to make the data machine-readable without human intervention. For statisticians, this is an opportunity to use international classifications (e.g. NACE, ISCO, ISCED etc.) as de facto standards for linked open data. Indeed, if we don't take this opportunity, it's possible that other early adopters could set a different standard.
We are starting on a journey and, unfortunately,
there is no clear road map yet and few precedents
to give guidance. So, how do we acquire the
expertise needed to publish statistics as linked
open data?
In 2012 the CSO began a pilot project with the
Digital Enterprise Research Institute (DERI) at the
National University of Ireland, Galway (NUIG), to
publish some of the Census 2011 results as linked
open data. The project has given valuable
experience to the CSO dissemination team and the
following are some of the important lessons so far:
1. For data to be linked across the semantic
web, objects need to be named. Uniform Resource
Identifiers (URIs) are the code that identifies an
object. Official statistics use many standard
classifications to define their data and, as noted
above, this is very useful when creating URIs.
2. Once the objects have been named a
framework is needed to publish this data on the
web. Using the Resource Description Framework (RDF), which views the world in triplets - (Resource, Attribute, Value) - the information is published on the web. An example of an RDF statement is (Population of Ireland 2011, Statistic, 4588252); a short code sketch showing how such a statement can be built and queried follows the table below.
3. To publish data on the semantic web an
organisation needs to put it in a place and in a
format that a machine will expect. The CSO will
publish its Census 2011 open data on data.cso.ie
not on the CSO website www.cso.ie. In this
scenario machines should get RDF data and users
should get some readable representation of the
data e.g. HTML.
4. Ideally all the URIs an organisation produces related to a single real-world concept - e.g. Population of Ireland 2011 - should be linked together.
5. Ideally the URIs would be "cool", built to last 2000 years or more:
http://www.w3.org/TR/2008/WD-cooluris-20080321/
6. For the Census 2011 pilot it is proposed to
produce a SPARQL (SPARQL Protocol and RDF
Query Language) service to facilitate access to the
data.
The following table sets out the framework which is
planned for Census 2011 results as linked open
data:
Base URI: http://data.cso.ie/

Entity / URI pattern (relative to base) / RDF class:
- Classification: /classification/{id} - skos:ConceptScheme
- Concept in a classification: /classification/{id}#{code} - skos:Concept
- Dataset: /dataset/{id} - qb:DataSet
- Data structure definition: /dataset/{id}#structure - qb:DataStructureDefinition
- Observation: /dataset/{id}#{dim1};{dim2};{dim3} - qb:Observation
- Property (dimension, attribute): /property#{id} - qb:DimensionProperty, qb:AttributeProperty
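To make the framework concrete, here is a minimal sketch of how an observation such as (Population of Ireland 2011, Statistic, 4588252) might be expressed as RDF and queried. It uses the Python rdflib library as an assumption (the article does not say what tooling the CSO uses), and the dataset identifier, property name and dimension values are illustrative placeholders that merely follow the URI patterns above; the closing SPARQL query hints at the kind of service mentioned in point 6.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# Illustrative namespaces loosely following the URI patterns in the table above.
BASE = Namespace("http://data.cso.ie/")
QB = Namespace("http://purl.org/linked-data/cube#")  # W3C RDF Data Cube vocabulary

g = Graph()
g.bind("qb", QB)

# One observation: total population of Ireland, Census 2011 (hypothetical dataset id).
obs = BASE["dataset/census2011#ireland;2011;total"]
g.add((obs, RDF.type, QB.Observation))
g.add((obs, QB.dataSet, BASE["dataset/census2011"]))
g.add((obs, BASE["property#population"], Literal(4588252)))

# Machines get RDF (here serialised as Turtle); humans would get a readable view instead.
print(g.serialize(format="turtle"))

# The kind of query a public SPARQL service (point 6 above) could answer.
query = """
SELECT ?obs ?population WHERE {
    ?obs a <http://purl.org/linked-data/cube#Observation> ;
         <http://data.cso.ie/property#population> ?population .
}
"""
for row in g.query(query):
    print(row.obs, row.population)
```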
Later in 2013, we will publish the outputs from the project - i.e. Census 2011 results as linked open data - and, to mark the International Year of Statistics, we will have a competition for the best "mash-up" using those statistics. We hope this will not only be a proof of concept, but also a proof of the value of linked open data.
The OECD Open Data Project
The OECD is currently undertaking an Open
Data project with the aim of making its statistical
data content machine-readable, retrievable,
indexable and re-usable. The Open Data project will implement an Application Programming Interface (API) to provide machine-to-machine access to the OECD statistical data warehouse "OECD.Stat" via a number of formats, while addressing the challenges involved in standardising the statistical content from the 800+ datasets.
In addition an Open Innovation community will be
created to encourage the re-use of OECD data
via external innovation.
The Open Data project is aligned to the
Knowledge Information Management (KIM)
Ontology Management and Semantic
Infrastructure project to make data accessible via
linked data.
Background to the Open Data Project
Statistics are of strategic importance to the
OECD both as an input for internal analysis and
also as a product for dissemination to a wider
audience in their own right. Following a review of
the OECD Publishing Policy in 2011 a number of
recommendations were proposed to make OECD
statistics "open, accessible and free". The OECD
Council welcomed this proposal and as a result
the DELTA programme was initiated to
implement these aims.
DELTA Project - Open Data
Openness is one of the key values that guide the
OECD vision for a stronger, cleaner and fairer
economy. Making data open is an important part
of this and to this end a number of open
benchmarks in the project have been defined as
follows:
- Completeness - content should include data, metadata, sources and methods.
- Primacy - datasets should be primary, not aggregated, and include details on how the data was collected.
- Timeliness - data should be automatically available in trusted third-party repositories upon publication.
- Ease of access - data made available via a simple Application Programming Interface (API).
- Machine readability - data and metadata provided in a machine-readable standard plus documentation.
- Non-discrimination - no special permissions required to access data.
- Use of common standards - stored data can be accessed without a special software license.
- Licensing - Creative Commons CC-BY (licensees may copy, distribute, display and perform the work and make derivative works based on it only if they credit the author or licensor in the manner specified).
- Permanence - information made available remains online, with archiving over time together with a notification mechanism.
- Usage costs - free.
Open Data Project goals
Data today can be extracted only via downloads from OECD.Stat. The Open Data Web Services (ODWS) will make the data available to other web sites directly, for creating custom data visualizations, live combinations with other data sources, etc. The goals of the Open Data project are: to make OECD data machine-readable, retrievable, indexable and re-usable; to
increase the dissemination and impact of OECD
data via open data services for its statistical data;
and, to encourage re-use of OECD data by
external innovation communities.
The Open Data Project has 3 main deliverables: i) a full set of "Open-ready" data and metadata; ii) a set of Open Data Web Services; and iii) an interface for managing the OECD Open Innovation Community.
"Open-Ready" Data and metadata
For data to be considered "Open-ready", the existing data and metadata content of the OECD
corporate data warehouse OECD.Stat will be
required to meet certain criteria of structure and
content necessary for machine-to-machine access.
To achieve this, data owners will carry out a self-
assessment of all OECD.Stat data content to
gauge the state of open-readiness for each
dataset. This will involve analysing the metadata
content according to the criteria.
Open Data Web Services (ODWS)
In parallel to the data assessment exercise, the
Open Data Web Services will be developed. This
will involve building a set of Web Services to
provide machine-to-machine access to OECD.Stat
data via a number of formats. This will involve
defining the technical standards for data to be
machine-readable that meet the needs of both
expert and non-expert audiences. Application
Programming Interfaces (API) will be developed to
make the data and metadata in OECD.Stat
available to systems outside the organisation via a
number of formats.
These Web Services will be available to other
organisations currently sharing the .Stat Data
warehouse software via the OECD Statistical
Information System Collaboration Community
(SIS-CC).
Open Data formats
Data and metadata will be made available to
external users in as many output formats as
possible to maximise data access. The project will
start with formats including: SDMX/JSON, Restful
API, OData, XLS and CSV. Additional formats will
be added as needed over time. These formats
have been chosen for the reasons described
below.
a) Excel/CSV - Excel and CSV are already widely
used exchange standards so including them as
output formats was a fairly obvious decision.
b) SDMX/JSON - JavaScript Object Notation
(JSON) is a text-based open standard designed for
human-readable data interchange and has
become one of the most popular industry-used
open data formats on web sites today.
The Statistical Data and Metadata eXchange (SDMX) standard provides a standard model for
statistical data and metadata exchange between
national agencies and international agencies,
within national statistical systems and within
organisations. OECD is a member of the SDMX
Sponsor Group (together with the Bank of
International Settlements, European Central Bank,
Eurostat, International Monetary Fund, United
Nations Statistics Division and World Bank). SDMX
data extracts from OECD.Stat are already provided
via a web service; this will be adapted as an API
using the SDMX compact version.
c) Open Data (OData) - OData is an open protocol
for sharing data
Future formats could include Google Data (a REST-inspired technology), the Google Dataset Publishing Language (DSPL) or Google KML, a geospatial file format.
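As an illustration of the machine-to-machine access these formats are meant to enable, the sketch below shows how a client might fetch an SDMX-JSON message with Python's requests library. The endpoint URL, dataset code and dimension filter are hypothetical placeholders, not the documented OECD API; the real OECD.Stat web services define their own identifiers and URL layout.

```python
import requests

# Hypothetical endpoint and query; the real OECD.Stat services define their own
# dataset identifiers, dimension filters and URL layout.
BASE_URL = "https://example.org/sdmx-json/data"   # placeholder, not the real service
dataset = "EXAMPLE_DATASET"                       # placeholder dataset code
dimensions = "IRL.TOTAL.A"                        # placeholder dimension filter

response = requests.get(f"{BASE_URL}/{dataset}/{dimensions}/all", timeout=30)
response.raise_for_status()
message = response.json()

# In an SDMX-JSON message the observations are typically grouped under "dataSets",
# with the corresponding structure metadata alongside.
observations = message["dataSets"][0].get("observations", {})
print(f"retrieved {len(observations)} observations")
```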
Linked Data and the OECD KIM project
The OECD Knowledge and Information Management (KIM) project has been established to integrate information and centralise access to all
OECD content (corporate content management,
record management, authoring, etc.). KIM was
launched in parallel to the DELTA project and is
concerned with developing semantic enrichment
and centralized taxonomy linked data support.
A long-term goal of the project is to create linked
data sources with the Resource Description
Framework (RDF) using existing vocabularies to
map data to related subjects and generating a
collection of "triples" (consisting of a subject, a predicate and an object) known as a "triple-store". Each component of the triple has a Uniform Resource Identifier (URI), enabling data to be linked to related sources.

[Software inventory: over 60 statistical software tools available for sharing. Find new software, or post information about yours, at: www1.unece.org/stat/platform/display/msis/Software+Inventory]
Creating a triple-store from the OECD.Stat data
warehouse will be a huge task and work
investigating the possibilities has only recently
started (at the time of writing the tools have not yet been selected), but the long-term goal is to conform to Tim Berners-Lee's "5 star" level of open data.
The vision of the Semantic Web is to extend
principles of the Web from documents to data.
Data should be accessed using the general Web architecture, e.g. URIs; data should be
related to one another just as documents (or
portions of documents) are already. This also
means creation of a common framework that
allows data to be shared and reused across
application, enterprise, and community boundaries,
to be processed automatically by tools as well as
manually, including revealing possible new
relationships among data items.
The OECD Open Innovation Community
The Open Innovation Community will consist of an interface for managing Open Innovation Community (OIC) content and involves designing, building and maintaining this interface to provide the following:
- Information describing the open platform
- Registration services
- Examples of products developed using the open platform
- Open Services available with associated technical documentation
- OIC Blog
- FAQ
Developing software sharing in the
European Statistical System
Denis Grofils (Eurostat)
Software represents an important part of the assets of the European Statistical System (ESS). In statistical institutions, as in many modern businesses, the quality and availability of software is of primary importance, as it directly affects the way business processes are executed. While not all members of the ESS necessarily develop software, all of them certainly use it. Developing software is usually recognized as costly, at both the development and maintenance stages, and simply using software may be costly in different respects: licensing fees, consultancy, training, and so on.
Software may differ in nature and extent; for example, some types of software could be:
- Data collection systems
- Procedures developed in statistical computing languages for different purposes (sampling, imputation, weighting, aggregation, confidentiality protection, etc.)
- Tools for the management of statistical metadata
- Web portals for data dissemination
As the level of standardisation grows in the
statistical community through harmonization at
international level and through initiatives that
promote industrialisation of official statistical
production (see the work of the HLG of the
UNECE or the Joint strategy and the ESS.VIP
programme at ESS level), the sharing of software
at a wider level becomes easier.
The move toward service-oriented architecture (SOA) and the development of a so-called "plug and play" architecture for statistical production strongly reinforce the potential for sharing. Platform-independent services allow distributed architecture models that promote a high level of reuse of software components. Services can be developed independently or cooperatively and shared among partners. Functionality of existing software can be offered as a service at limited cost via proper wrapping. All of this makes the potential of software sharing higher than ever.
The possibility of sharing software among institutions of the ESS offers several advantages, notably:
- Increased efficiency and reduced costs, by avoiding multiple developments of virtually the same products by different organisations
- Increased harmonization and interoperability, through the use of standard software building blocks
- Improved data quality, through the use of widely accepted and validated software building blocks, and improved comparability of data coming from different countries
- An increased level of collaboration and resource sharing between members of the ESS
Several important achievements relating to OSS have been realised at the European level, notably:
- The European Union Public Licence (EUPL): the first European OSS licence. It was created on the initiative of the European Commission and is approved by the European Commission in 22 official languages of the European Union.
- Joinup: a collaborative platform created by the European Commission that offers a set of services to help e-Government professionals share their experience with interoperability solutions and supports them in finding, choosing, re-using, developing, and implementing open source software and semantic interoperability assets.
The ESS IT Directors Group (ITDG) mandated the
Statistical Information System Architecture and
Integration working group (SISAI) to launch a task
force dealing with the development of policy and
guidelines supporting ESS software sharing. The
work of this task force started during fall 2012.
The following aspects of software sharing are
tackled:
- Definition of software of interest: In this context the term "software" is to be understood in its broadest sense as any set of computer programs, these being defined as any set of instructions for computers. Objective criteria for defining the target of the recommendations are necessary. Software of interest is defined as "software used by members of the ESS to support directly activities of the GSBPM in order to realise the statistical programme of the ESS". It should be noted that this definition is independent of the technological characteristics of the software (web-based, command-line batch, macros, web-services, etc.).
- Software catalogue: The way a catalogue of ESS software should be maintained, and which information should be recorded, is defined. A distinction is made between unshared software (used by only one ESS member), for which a minimal set of information is collected, and shared software (used by several ESS members), for which an extensive set of information is collected.
• Sharing scenarios: several scenarios are identified and the applicability of the recommendations per scenario is defined (i.e. not all recommendations apply to all scenarios).
• Sharing software use: the federation of software users through the creation of user communities is organised. This concerns software published under any type of licence, including commercial software.
• Sharing software development: recommendations are made for each step of the development cycle. As an example, it is recommended to consider several types of constraints when designing software: architectural constraints (consistency with GSBPM and GSIM, links with PnP constraints), clear documentation of methodological aspects, data protection constraints specific to the ESS, support for multilingualism, and a legal roadmap (particularly the tracking of intellectual property rights when developing component-based applications).
• Software quality evaluation: a template for software quality assessment is provided.
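To make the distinction between the minimal and the extensive catalogue records more concrete, the sketch below shows one possible structure. All field names are assumptions chosen for illustration only, not the metadata set actually defined by the task force.

# Illustrative catalogue records; the field names are assumptions, not the
# actual metadata set defined by the task force.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class UnsharedSoftwareRecord:
    """Minimal information for software used by a single ESS member."""
    name: str
    owner_institution: str
    gsbpm_subprocesses: List[str]              # e.g. ["5.4 Edit and impute"]
    licence: Optional[str] = None

@dataclass
class SharedSoftwareRecord(UnsharedSoftwareRecord):
    """Extended information once software is used by several ESS members."""
    user_institutions: List[str] = field(default_factory=list)
    documentation_url: Optional[str] = None
    quality_assessment: Optional[str] = None   # reference to the quality template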
The recommendations elaborated were evaluated on real cases, in order to test the propositions against reality and to incorporate feedback from these experiences. Three illustrative cases were used: Blaise, Demetra+ and SDMX-RI.
The set of draft recommendations elaborated by
the task force will be submitted in the coming
weeks to the Statistical Information System
Architecture and Integration working group (SISAI)
and then to the ESS IT Directors Group (ITDG).
Improving data collection by soft
computing
Miroslav Hudec and Jana Juriová (Infostat)
The applicability of soft computing (fuzzy logic and neural networks) as a modern means to improve the collection and the quality of data for business and trade statistics is one of the topics of the Blue-ETS project (http://www.blue-ets.istat.it/).
The main findings which support this line of development are:
• Large, complex administrative and statistical databases contain valuable information which can be mined using powerful methodologies;
• Statisticians possess knowledge on how to deal with their tasks, but this knowledge cannot always be expressed by precise rules.
In order to estimate missing values, relations between similar respondents are relevant. Mining the Intrastat database with neural networks (NNs) shows this to be a rational option that could provide a solution: NNs find patterns and relations between similar respondents. In this way, items can be estimated as long as enough data is available from other respondents.
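As a purely illustrative sketch of this idea (not the Blue-ETS implementation), a small neural network can be trained on respondents with complete records and then used to estimate the missing item for a similar respondent. The data layout and variable names below are assumptions.

# Illustrative neural-network imputation; the data layout and variable names
# are assumptions, not the Blue-ETS / Intrastat implementation.
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical respondent data: turnover, employees, exports, reported trade value.
complete = np.array([[120.0, 15, 30.0, 42.0],
                     [300.0, 40, 80.0, 110.0],
                     [ 90.0, 10, 20.0, 31.0],
                     [250.0, 35, 70.0, 95.0]])
incomplete_features = np.array([[110.0, 12, 25.0]])   # trade value missing

X, y = complete[:, :-1], complete[:, -1]

# Train on respondents with complete records, then estimate the missing item
# from the patterns learned over similar respondents.
model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
model.fit(X, y)

print("Imputed trade value:", model.predict(incomplete_features)[0])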
Fuzzy rules expressed by linguistic terms and
quantifiers reveal levels of similarity between
imputed and surveyed values.
Similar techniques are also promising for dissemination. People prefer to use expressions of natural language when searching for useful data and information, for example: select regions where most municipalities have a low altitude above sea level. The result is a set of entities ranked according to their degree of match to the query condition.
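A minimal sketch of how such a flexible query could be evaluated is shown below. The membership function for "low altitude", the interpretation of the quantifier "most", and the sample data are all assumptions chosen purely for illustration.

# Illustrative fuzzy "flexible query": rank regions by the degree to which
# "most municipalities have a low altitude". Thresholds and data are assumptions.
def low_altitude(metres: float) -> float:
    """Membership of 'low altitude': 1 below 200 m, 0 above 600 m, linear in between."""
    if metres <= 200:
        return 1.0
    if metres >= 600:
        return 0.0
    return (600 - metres) / 400

def most(proportion: float) -> float:
    """Fuzzy quantifier 'most': 0 below 50 %, 1 above 90 %, linear in between."""
    return min(1.0, max(0.0, (proportion - 0.5) / 0.4))

# Hypothetical regions with the altitudes (in metres) of their municipalities.
regions = {
    "Region A": [150, 180, 220, 300],
    "Region B": [450, 520, 610, 700],
    "Region C": [100, 120, 90, 560],
}

# Degree of match: apply the quantifier to the mean membership over the municipalities.
ranking = {
    name: most(sum(low_altitude(a) for a in altitudes) / len(altitudes))
    for name, altitudes in regions.items()
}

for name, degree in sorted(ranking.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: degree of match {degree:.2f}")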
Modernising the first and the last stages of data collection could create a chain reaction of improvements in data quality. Better data dissemination (through flexible queries) could motivate respondents to provide their data in a more timely and accurate way, and reduce the frequency of missing values, implying more efficient imputation (fewer missing values and more powerful neural networks).
Relevant equations, models and experimental tools have been created in order to evaluate the pros and cons. The next step is the creation of fully functional tools and their adaptation to particular needs.
What does Big data mean
for official statistics?
A new paper prepared by leading international
experts has recently been released by the High-
Level Group for the Modernisation of Statistical
Production and Services:
http://www1.unece.org/stat/platform/pages/viewpage.action?pageId=77170614
Tools for a Sprint
Carlo Vaccari (a sprinter)
In Ottawa, from 8 to 13 April 2013, we had a Sprint for the Plug & Play Architecture Project. People from Australia, Canada, Eurostat, Italy, Mexico, the Netherlands, New Zealand, Sweden, UNECE and the United Kingdom met to start defining a "common statistical production architecture for the world's official statistical industry", as stated by the High-Level Group for the Modernisation of Statistical Production and Services (see http://www1.unece.org/stat/platform/display/hlgbas).
The objective of the sprint session was to ensure agreement on key principles regarding the architectural blueprint for building an interoperable statistical industry platform.
We will discuss the documents produced by this meeting over the next few weeks. Here I just want to show you which tools were used in the Sprint.
Paper, a lot of paper
We wrote a lot on sheets, flip charts and post-its of every color and shape. Paper was used to explain, show, collect, store, group and debate ideas.
White-boards
We used many white-boards, writing on them with markers of every type; often we used cameras or mobiles to take a picture of what was written, to be able to transfer the concepts to digital files (we would love smart boards like: www.youtube.com/watch?v=NZNTgglPbUA).
Mixed
And yes, we used paper and boards together, very useful when you
want to group concepts and keep track of what was done.
Wiki
We inserted documents, presentations, images, discussions, a glossary and so on into the UNECE wiki.
Mind Maps
We often used mind maps to capture brainstorming and discussions: one of the best ways to avoid losing ideas and to summarise what has been said in lengthy discussions.
Presentations
Presentation software was used not only to prepare slides for presenting, but also to draw and develop schemas. Using notes and colors, presentation software was then used as a kind of digital dashboard.
Lego bricks
Each participant received three Lego bricks of different colors from our wonderful facilitators. We had to raise them to indicate, respectively: "I want to speak", "Off topic" and "Too much detail". A very simple way to make participants follow the rules for an efficient exchange of views.
Lollies
Each participant brought sweets ("lollies" in Australian English) from their country, to share with partners. Biscuits, chocolates and sweets of all kinds were the fuel that provided energy to tired brains.
Understanding "Plug and Play"
Marton Vucsan (Statistics Netherlands)
I have high hopes for the Common Statistical Production Architecture (CSPA) Project, commonly known under the more profane title of Plug and Play. Although it sounds easy, it may prove to be hard, very hard. From what I hear, "plug and play" has many interpretations. Some point in the direction of the feared "mother of all systems" projects that never work. Understanding what is really meant by the current CSPA project is important, because CSPA is something completely different from the big, feared projects of the past.
CSPA is about reducing complexity and achieving operating-system independence. It is also about sharing and reducing our efforts while still getting what we need. To achieve this we have to realize that our means of production are composed of different levels of abstraction. There are the methods, the process descriptions and finally the applications (I am deliberately keeping it simple here). Normally, to arrive at a statistical output we describe a method, create a process and build an application. All three are normally monolithic in nature and custom made. Unshareable, unreusable, expensive, complex. The stuff called "legacy apps", the stuff we should stop making.
CSPA starts with the insight that splitting things up reduces complexity. The GSBPM does this: it splits up the statistical process into easy-to-understand sub-processes. Thinking and building in sub-systems reduces complexity and increases reliability. Many programmers struggle with this, trying to split up a given solution into meaningful parts and often failing. In hindsight, the reason for that failure is obvious: the reduction in complexity has to be done at a much higher level. The complexity is often in the methods and in the way the processes were thought up, independently of the IT implementation. If we really want to reduce complexity, that is our point of attack: the level where we specify our statistical recipe.
As a statistical community, we seem to agree that statistical outputs can be produced by processes composed of GSBPM sub-processes. With the right compromises we will be able to use these sub-processes, or components, across a broad range of statistics and agencies, like the engines on a plane or the motor management system in your car. Just as a car and a plane are, conceptually, collections of functional sub-systems, we need to understand that a statistical production system is a collection of functional sub-processes.
Once we are able to think of our processes as assemblies of components, we can reuse or exchange them. Of course it is not that simple, but there are powerful forces at work to make that happen. Look at what happened in other industries: when a component, say a motor management system, is available, most designs gravitate towards using that component because it is much cheaper than rolling your own.
A component can be manufactured separately from the system it will be used in. Rolls-Royce doesn't need planes to manufacture engines. The key is to do it at the right conceptual level. Others have done it (look at your phone); we can do it too!
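To make the "assembly of components" idea more concrete, here is a minimal sketch assuming a common interface behind which components can be swapped. It illustrates the principle only, not CSPA itself; the component names and the GSBPM labels in the comments are assumptions.

# Minimal sketch of composing a statistical production process from
# interchangeable components behind a common interface. The component names
# and GSBPM sub-process labels are illustrative assumptions, not CSPA.
from typing import Callable, List

# A "statistical service": any callable that takes a dataset and returns a dataset.
Dataset = List[dict]
Service = Callable[[Dataset], Dataset]

def edit_and_impute(data: Dataset) -> Dataset:
    """Hypothetical GSBPM 5.4-style component: fill missing values with a simple rule."""
    return [{**row, "value": row.get("value", 0.0)} for row in data]

def aggregate(data: Dataset) -> Dataset:
    """Hypothetical GSBPM 5.7-style component: calculate an aggregate from microdata."""
    return [{"total": sum(row["value"] for row in data)}]

def run_pipeline(data: Dataset, services: List[Service]) -> Dataset:
    """Compose a production process from components sharing the same interface."""
    for service in services:
        data = service(data)
    return data

if __name__ == "__main__":
    raw = [{"unit": "A", "value": 10.0}, {"unit": "B"}]   # unit B has a missing value
    print(run_pipeline(raw, [edit_and_impute, aggregate]))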
Many statistical organisations are modernising using Enterprise Architecture to underpin their vision and change strategy. This enables them to develop statistical services in a standard way. Enterprise architecture creates an environment that can change with, and support, business goals. It shows what the business needs are and where the organisation wants to be, and ensures that the IT strategy aligns with this. It helps to remove silos, improves collaboration and ensures that technology is aligned with business needs.
In parallel, the High Level Group for the
Modernization of Statistical Production and
Services (HLG) is developing the CSPA. This will
be a generic architecture for statistical production,
and will serve as an industry architecture for
official statistics. Adopting a common architecture
will make it easier for organisations to standardise
and combine the components of statistical
production, regardless of where the statistical
services are built.
The CSPA also provides a starting point for the concerted development of statistical infrastructure and shared investment across statistical organisations.
Version 0.1 of the CSPA documentation has
just been released for public comment at:
www1.unece.org/stat/platform/x/_ISwB
Your feedback is welcome!