Distributed trust architecture is becoming the new foundation for data sharing and analytics. It involves shifting trust from local and institutional models to distributed models using technologies like blockchain. This allows data and machine learning models to be distributed across organizations and jurisdictions with privacy, transparency, and integrity maintained through distributed ledgers and smart contracts. CSIRO's Data61 is doing extensive research in this area, including applying these principles to federated learning, software supply chains, consumer data rights frameworks, and other domains to enable innovative uses of data while maintaining user trust.
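As a rough illustration of the federated learning idea mentioned above, here is a minimal Python sketch of federated averaging: each party trains on its own data and only model weights cross organizational boundaries. The linear model, party data, and all function names are invented for illustration, not Data61's implementation.

```python
# Illustrative sketch of federated averaging: each organisation trains
# locally and shares only model weights, never raw data. The linear
# least-squares setup is a hypothetical simplification.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One organisation refines the global model on its private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w -= lr * grad
    return w

def federated_average(global_w, parties):
    """Aggregate local updates without ever seeing the parties' data."""
    updates = [local_update(global_w, X, y) for X, y in parties]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
parties = []
for _ in range(3):                          # three data custodians
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    parties.append((X, y))

w = np.zeros(2)
for _ in range(20):                         # communication rounds
    w = federated_average(w, parties)
print(w)                                    # approaches [2.0, -1.0]
```

A production system would add the ledger-backed integrity and privacy protections the abstract emphasizes, such as the secure aggregation sketched further below.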
Distributed Trust Architecture: The New Reality of ML-based Systems (Liming Zhu)
1. The document discusses trends in distributed trust architectures for machine learning systems, including more decentralized data sources, data sharing without direct access, and shifting trust from institutions to distributed models.
2. It outlines CSIRO's Data61 organization, Australia's largest data and digital innovation organization, and some of their work developing distributed trust architectures like federated learning and blockchain-based systems.
3. The document discusses technical challenges like entanglements in ML systems and the need for architectures that support distributed infrastructure, computing, and trust models.
Responsible AI & Cybersecurity: A tale of two technology risks (Liming Zhu)
With the broader adoption of digital technologies and AI, organisations face the emerging risks of AI (the unfamiliar) alongside the intensified risks of cybersecurity (the familiar). AI and cybersecurity are intertwined, but risk silos are often created when they are dealt with separately at the technology and governance levels. This talk will explore the interactions between responsible AI and cybersecurity risks via industry case studies. It will show how we can break down the risk silos and use emerging trust-enhancing technologies, architecture, and end-to-end software engineering/DevOps practices to connect the two worlds and uplift the risk management posture of both.
Emerging Technologies in Synthetic Representation and Digital Twin (Liming Zhu)
This document discusses emerging technologies in synthetic representation and digital twins, presented by Dr. Liming Zhu from CSIRO's Data61. It covers digital twins, synthetic representations, emerging technologies like federated learning and simulation, and examples of spatial digital twins in Australia. It emphasizes securely and privately connecting digital twins through techniques like federated analytics, sharing without access, and desensitized and synthetic data. Future focus areas discussed include trusted data sharing, federated data and models, cross-domain security, and synthetic representation of supply chains.
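"Sharing without access" can be made concrete with a toy secure-aggregation scheme: parties add pairwise masks that cancel in the sum, so an aggregator learns only the total and no individual value. The sketch below is a textbook simplification in Python, not Data61's actual protocol.

```python
# Minimal secure-aggregation sketch: pairwise masks cancel in the sum,
# so the aggregator sees the total but never any party's private value.
import random

def masked_values(values, modulus=2**32):
    n = len(values)
    masks = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = random.randrange(modulus)
            masks[i][j] = r          # party i adds r
            masks[j][i] = -r         # party j subtracts r, so the pair cancels
    return [(values[i] + sum(masks[i])) % modulus for i in range(n)]

values = [12, 7, 30]                 # private inputs of three parties
shares = masked_values(values)
print(shares)                        # individually meaningless numbers
print(sum(shares) % 2**32)           # 49 == 12 + 7 + 30
```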
The document discusses the Internet of Things (IoT) and some of the key challenges. It notes that IoT data is multi-modal, distributed, heterogeneous, noisy and incomplete. It raises issues around data management, actuation and feedback, service descriptions, real-time analysis, and privacy and security. The document outlines research challenges around transforming raw data to actionable information, machine learning for large datasets, making data accessible and discoverable, and energy efficient data collection and communication. It emphasizes that IoT data integration requires solutions across physical, cyber and social domains.
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research... (Carole Goble)
Invited talk, PHIL_OS, March 30-31 2023, Exeter
https://opensciencestudies.eu/whither-open-science. Includes hidden slides.
FAIR and Open Science need Digital Research Infrastructure, which is a federated system of systems and needs funding models that are fit for purpose.
Culture change is needed to pay for Open Science’s infrastructure, and funding support for data-driven research needs more reality and less rhetoric.
This document discusses the potential for developing a knowledge network by leveraging metadata from scientific endeavors. It begins by outlining some of the limitations of traditional metadata approaches. It then proposes that metadata could be structured as a graph using semantic triples to represent relationships between people, institutions, projects, and other elements. This liberalized metadata approach could help reduce complexity while providing a more comprehensive view of scientific activities and outputs. The document advocates for establishing common standards, developing tools to extract and aggregate metadata, and creating services and repositories to enable discovery, analysis, and visualization of the knowledge network. The goal is to facilitate research by providing integrated access to information on scientific data, publications, actors and their relationships.
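To make the triple-based proposal concrete, here is a small Python sketch of metadata held as subject-predicate-object triples with wildcard pattern matching. The entities and predicates are invented; a real deployment would use RDF stores and SPARQL, but the traversal idea is the same.

```python
# Metadata as a graph of subject-predicate-object triples, with a tiny
# pattern matcher. All entity and predicate names are invented.
triples = {
    ("alice", "memberOf", "institute_x"),
    ("alice", "authored", "dataset_42"),
    ("dataset_42", "producedBy", "project_7"),
    ("project_7", "fundedBy", "agency_y"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Who funded the project behind Alice's dataset?
for _, _, dataset in match("alice", "authored"):
    for _, _, project in match(dataset, "producedBy"):
        print(match(project, "fundedBy"))
# [('project_7', 'fundedBy', 'agency_y')]
```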
Emerging Technologies in Data Sharing and Analytics at Data61 (Liming Zhu)
This document provides an overview of Data61, the data and digital specialist arm of CSIRO, Australia's national science agency, and its work in emerging technologies related to data sharing and analytics. It discusses Data61's strategic goals and focus areas such as artificial intelligence, cybersecurity, digital agriculture, and quantum technologies. It also summarizes Data61's work establishing Australia's National AI Centre and its research on topics like blockchain, federated learning, and regulatory technologies.
Facilitating Scientific Collaborations by Delegating Identity Management (Von Welch)
The document summarizes the research of the XSIM Team on facilitating scientific collaborations through delegating identity management. It provides context on how scientific collaborations have evolved from localized to remote and large-scale. It identifies barriers to identity management like historical inertia, risk aversion, and compliance requirements. The document then presents the XSIM VO Identity Model and examples of incremental identity delegation approaches used at facilities like NERSC and XSEDE to reduce costs while maintaining security. It concludes that virtual organizations are essential to science and strategies exist to incrementally increase trust and delegation of identity functions.
Square Pegs In Round Holes: Rethinking Data Availability in the Age of Automa... (Denodo)
Watch full webinar here: https://bit.ly/43qJKwn
Data-led transformations have become more prevalent in recent years, across numerous industries. More and more senior leaders are looking for data to drive their business decisions and impact their bottom line. One key challenge facing such businesses is the ability to pivot to new technologies while maintaining investments in the legacy systems they have grown to rely on. In an age where automation, internet-scale search, and advanced analytics are driving many new advances, it is important to understand that this is not only a pivot in terms of technologies; it is a pivot in terms of how we think about and utilize data of different types. Traditional systems since the 1970s have been built around database concepts where data is physically pipelined, mapped together, statically modeled, and locked away in vaults. The types of vaults have evolved over time from basic databases, to data warehouses, to data lakes, to lakehouses, and so on.
The fundamental premise remains: data is placed into sealed containers, such that the critical approach is around storage, instead of being aimed at retrieval. Reversing this approach can, instead, lead to understanding data as transient, on-demand, and immediately available to end users within a certain context. This talk will discuss certain contemporary concepts that are expanding the notion of data storage devices and, instead, are moving to loosely connected data retrieval devices, or in some cases, data generation devices. We will examine this shift in approach and what it means for designing and deploying new types of technologies that can be more flexible and provide improved business value for clients in the fast-paced evolving world of Artificial Intelligence.
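One way to picture the retrieval-first shift the talk describes is a virtual view that joins heterogeneous sources at query time instead of copying them into a vault. The sketch below is illustrative Python with made-up sources and schemas, not Denodo's platform.

```python
# A "logical" view: data stays in its sources and is fetched and joined
# on demand; nothing is materialised. Source names and schemas invented.
def crm_source():                      # e.g. a legacy database
    yield from [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]

def orders_source():                   # e.g. a REST API or event stream
    yield from [{"customer_id": 1, "total": 250.0},
                {"customer_id": 1, "total": 80.0}]

def virtual_view():
    """Join the sources at query time for the caller's context."""
    orders = list(orders_source())
    for c in crm_source():
        spend = sum(o["total"] for o in orders if o["customer_id"] == c["id"])
        yield {"customer": c["name"], "total_spend": spend}

print(list(virtual_view()))
# [{'customer': 'Acme', 'total_spend': 330.0},
#  {'customer': 'Globex', 'total_spend': 0}]
```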
Dynamic Data Analytics for the Internet of Things: Challenges and Opportunities (Payam Barnaghi)
IoT data analytics faces unique challenges compared to traditional big data analytics. IoT data is multi-modal, heterogeneous, noisy, incomplete, time and location dependent, and dynamic. It requires near real-time analysis while ensuring privacy and security. Analyzing IoT data requires an ecosystem approach that can integrate data from multiple sources and platforms semantically. Discovery engines are needed to locate IoT data streams and resources that are often mobile and transient. Context-aware and opportunistic techniques are required to access and route IoT data. The goal is to extract insights and actionable knowledge from physical, cyber, and social data sources.
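As a concrete taste of the challenges listed in this abstract, the following pandas sketch (with invented readings) handles two of them for a single sensor stream: incompleteness via interpolation, and noise via median-based spike removal followed by windowed smoothing.

```python
# Taming a noisy, incomplete sensor stream before near real-time
# analysis. The readings and thresholds are invented for illustration.
import numpy as np
import pandas as pd

ts = pd.date_range("2024-01-01", periods=8, freq="min")
raw = pd.Series([21.0, 21.2, np.nan, 45.0, 21.4, np.nan, 21.5, 21.6],
                index=ts)                     # gaps and one spike

# Flag spikes against a local median computed on the raw stream.
local_median = raw.rolling(3, center=True, min_periods=1).median()
spikes = (raw - local_median).abs() > 5.0

# Remove spikes, then fill both the original and newly created gaps.
clean = raw.mask(spikes).interpolate()

# Time-based window smoothing for downstream analytics.
smooth = clean.rolling("3min").mean()
print(smooth)
```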
KIT-601-L-UNIT-1 (Revised) Introduction to Data Analytcs.pdf (Dr. Radhey Shyam)
The document provides an overview of data analytics and big data concepts. It discusses the characteristics of big data, including the four V's of volume, velocity, variety and veracity. It describes different types of data like structured, semi-structured and unstructured data. The document also introduces popular big data platforms like Hadoop, Spark and Cassandra. Finally, it outlines key reasons for the need of data analytics, such as enabling better decision making and improving organizational efficiency.
These slides were used at the first Aarhus Follower Group meet-up for the EU-funded project IoTCrawler. They include an introduction to the project as well as a more in-depth presentation of the difference between web search and Internet of Things (IoT) search, and of the development of the Internet of Things. Furthermore, some of the scenarios from the project are presented.
Hyper-Converged Infrastructure: Big Data and IoT opportunities and challenges... (Andrei Khurshudov)
The document discusses emerging technologies related to the Fourth Industrial Revolution including the Internet of Things (IoT), big data, artificial intelligence, and how they are fundamentally changing information technology. It notes that these technologies are creating massive amounts of data, especially unstructured data from machines. Realizing their full potential will require new approaches to data storage, processing, analytics and decision making delivered through solutions like cloud computing, hyper-converged infrastructure, and edge/fog computing. The integration of all these technologies promises to deliver improved productivity, living standards and actionable insights.
Adopting a Logical Data Architecture for Today's Data and Analytics Requirements (Denodo)
Watch full webinar here: https://bit.ly/3y4yMPU
It’s almost impossible to find an organization that does not have data and analytics among its top priorities for furthering its business objectives. At the same time, the data and analytics landscape is evolving faster than ever, making the data management ecosystem more complex than ever before. As data gets increasingly distributed across systems and locations, every forward-looking organization should adopt a logical architecture to be future-ready.
Watch On-Demand and Learn:
- Key priorities of data and analytics leaders for business transformation
- Why a monolithic and physical data architecture is not suitable for such transformation
- How a logical data architecture can help organizations in their business transformation
Many technical communities are vigorously pursuing research topics that contribute to the Internet of Things (IoT). Nowadays, as sensing, actuation, communication, and control become even more sophisticated and ubiquitous, there is a significant overlap in these communities, sometimes from slightly different perspectives. More cooperation between communities is encouraged. To provide a basis for discussing open research problems in IoT, a vision for how IoT could change the world in the distant future is first presented. Then, eight key research topics are enumerated and research problems within these topics are discussed.
Introduction to Data Analytics and data analytics life cycle (Dr. Radhey Shyam)
The document provides an overview of data analytics and big data concepts. It discusses the characteristics of big data, including the four V's of volume, velocity, variety and veracity. It also describes different types of data like structured, semi-structured and unstructured data. The document then introduces big data platforms and tools like Hadoop, Spark and Cassandra. Finally, it discusses the need for data analytics in business, including enabling better decision making and improving efficiency.
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S... (Edward Curry)
This document provides an overview of a book on enabling data ecosystems for intelligent systems. It discusses key concepts like digital twins, physical-cyber-social computing, and mass personalization. It also outlines the architecture of a real-time linked dataspace platform that supports pay-as-you-go data integration and sharing for applications and intelligent systems. The platform is designed to handle streaming data from sensors and integrate it with contextual data sources using approximate semantic matching techniques.
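The "approximate semantic matching" mentioned above can be illustrated with a deliberately simple token-overlap matcher that links a stream's attributes to a context schema. Attribute names are invented, and real dataspace platforms would use richer ontology- or embedding-based similarity.

```python
# Pay-as-you-go linking of stream attributes to a context schema via
# token overlap (Jaccard similarity). All attribute names are invented.
def tokens(name):
    return set(name.lower().replace("_", " ").split())

def jaccard(a, b):
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

stream_attrs = ["room_temperature", "co2_level", "occupancy_count"]
schema_attrs = ["temperature of room", "carbon dioxide level", "people count"]

for s in stream_attrs:
    best = max(schema_attrs, key=lambda c: jaccard(s, c))
    print(f"{s!r} -> {best!r} (score {jaccard(s, best):.2f})")
```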
Enhancing The Data Mining Capabilities in large scale IT Industry: A Comprehe... (IRJET Journal)
This document discusses integrating artificial intelligence (AI) algorithms and Elasticsearch to enhance data mining capabilities in large-scale IT industries. It begins with an abstract that overviews leveraging AI technologies and Elasticsearch's synergistic effects for data mining, analytics, and information retrieval. The introduction provides context on data mining, AI's role in enhancing data mining, and Elasticsearch's significance for data mining activities. It then discusses strategies for integrating AI and Elasticsearch, including data, algorithm, and scalability/performance integration approaches. Example applications are described like search/recommendations, anomaly detection/fraud prevention, and predictive analytics. Benefits, challenges, and considerations of the integrated approach are also highlighted. Finally, case studies are presented on using AI and Elasticsearch
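One integration pattern in this space, sketched below on invented data, is lexical retrieval followed by an AI re-ranking step. The embed() function is a toy stand-in for a trained embedding model, and in a real system the hits would come from an Elasticsearch query rather than a hard-coded list.

```python
# Lexical retrieval + AI re-rank sketch. embed() is a hypothetical
# hashed bag-of-words embedding, for illustration only.
import numpy as np

def embed(text):
    v = np.zeros(64)
    for tok in text.lower().split():
        v[hash(tok) % 64] += 1.0
    return v / (np.linalg.norm(v) or 1.0)   # guard against zero vectors

def rerank(query, hits):
    """Order retrieved hits by cosine similarity to the query."""
    q = embed(query)
    return sorted(hits, key=lambda h: -float(embed(h) @ q))

hits = ["reset your password", "password policy for admins",
        "how to reset a forgotten password"]
print(rerank("forgot password reset", hits))
```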
Building a Blockchain-based Reputation Infrastructure for Open Research. Ca... (Carmen Holotescu)
Presentation for ICCMAE 2022: The 2nd International Conference on Computational Methods and Applications in Engineering May 7-8, 2021
Authors:
Victor HOLOTESCU, PhD Student,
Andrei TERNAUCIUC, PhD
Radu VASIU, PhD
Politehnica University of Timișoara, Romania
Carmen HOLOTESCU, PhD
“Ioan Slavici” University of Timișoara, Romania
Cosmin CIORANU, PhD
UEFISCDI, Bucharest, Romania
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI (Big Data Week)
Charles Cai has more than two decades of experience and a track record of global transformational programme deliveries – from vision and evangelism to end-to-end execution – in global investment banks and energy trading companies, where he excels at designing and building innovative, large-scale Big Data systems in high-volume, low-latency trading, global Energy Trading & Risk Management, and advanced temporal and geospatial predictive analytics, as Chief Front Office Technical Architect and Head of Data Science. He’s also a frequent speaker at Google Campus, Big Data Innovation Summit, Cloud World Forum, Data Science London, QCon London and MoD CIO Symposium etc., promoting knowledge and best-practice sharing with audiences ranging from developers and data scientists to CXO-level senior executives from both IT and business backgrounds. He has in-depth knowledge of and experience in the Scala, Python, C# / F#, C++, Node.js, Java, R and Haskell programming languages across Mobile, Desktop, Hadoop/Spark, Cloud IoT/MCU and BlockChain etc., and holds TOGAF9, EMC-DS, AWS CNE4 etc. certifications.
The NIH Data Commons - BD2K All Hands Meeting 2015 (Vivien Bonazzi)
Presentation given at the BD2K All Hands meeting in Bethesda, MD, USA in November 2015
https://datascience.nih.gov/bd2k/events/NOV2015-AllHands
Video cast of this presentation:
http://videocast.nih.gov/summary.asp?Live=17480&bhcp=1
The talk starts at 2 hrs 40 min (it's about 55 mins long) - includes video!
Document describing the Commons : https://datascience.nih.gov/commons
EMBL Australian Bioinformatics Resource AHM - Data Commons (Vivien Bonazzi)
This document discusses the development of the NIH Data Commons, which aims to create a shared framework and infrastructure for biomedical data. It notes the increasing amounts of data being generated and the need for data sharing and interoperability. The Data Commons framework treats data, tools, and publications as digital objects that are findable, accessible, interoperable and reusable. Current pilots include deploying reference datasets in the cloud, indexing data and tools, and a credits system for cloud resources. Challenges discussed include metrics, costs, standards, incentives and sustainability. The framework's relevance for supporting open data in Australia is also addressed.
Data Science and AI in Biomedicine: The World has Changed (Philip Bourne)
This document discusses the changing landscape of data science and AI in biomedicine. Some key points:
- We are at a tipping point where data science is becoming a driver of biomedical research rather than just a tool. Biomedical researchers need to become data scientists.
- Data science is interdisciplinary and touches every field due to the rise of digital data. It requires openness, translation of findings, and consideration of responsibilities like algorithmic bias.
- Advances like AlphaFold2 show the power of large collaborative efforts combining data, computing resources, engineering, and domain expertise. This points to the need for public-private partnerships and new models of open data sharing.
- The definition of
The document discusses competency frameworks for roles in research data infrastructure, including researchers, statisticians, data scientists, librarians, data curators, and engineers. It outlines the scope of skills and knowledge required in science/research, curation/stewardship, and engineering/infrastructure. It also discusses considerations around research data infrastructure communities, open science, identity and identifiers, and interoperability. Key challenges identified include the need for multi-disciplinary skills and defining career pathways to attract talent. Solutions proposed include developing cloud and open source frameworks, education, and establishing trust to address human resource shortfalls.
Blockchain and Services – Exploring the Links (Ingo Weber)
In this keynote talk, given at the ASSRI Symposium 2018, I explore four different facets of the relationship between Blockchain and Services.
First, application-level service interfaces for interaction with Blockchain-based applications enable easy integration with existing infrastructure. Second, service composition can be achieved through smart contracts, enabling different approaches to orchestrations and choreographies. Third, Blockchain-aaS offerings cover infrastructure operation, but can go beyond that. And finally, microservice principles can be applied to smart contract design.
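The second facet, composition through smart contracts, can be pictured with a plain-Python simulation of a contract that enforces an orchestration's state machine and releases payment automatically. This is illustrative only, not code for any real blockchain platform.

```python
# Toy "smart contract" enforcing an orchestration's state machine:
# the contract, not a central party, decides when payment is released.
class ShipmentContract:
    TRANSITIONS = {"ordered": "shipped", "shipped": "delivered"}

    def __init__(self, price):
        self.state, self.price, self.ledger = "ordered", price, []

    def advance(self, actor, new_state):
        if self.TRANSITIONS.get(self.state) != new_state:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.ledger.append((actor, new_state))   # append-only audit trail
        if new_state == "delivered":
            self.ledger.append(("contract", f"pay {self.price} to seller"))

c = ShipmentContract(price=100)
c.advance("carrier", "shipped")
c.advance("buyer", "delivered")
print(c.ledger)
```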
AI Transformation: A Clash with Human Expertise (Liming Zhu)
Dr. Liming Zhu, Research Director at CSIRO's Data61, gave a presentation on AI transformation and its potential clash with human expertise. The presentation discussed frontier AI risks and the Australian approach to responsible AI. It highlighted how general AI capabilities could replace specific tools and enable low-cost experimentation over problem-driven planning. The presentation concluded by encouraging collaboration with CSIRO's Data61 on AI engineering best practices and governance, and on foundation model-based system design and evaluation.
Deciphering AI: Human Expertise in the Age of Evolving AI (Liming Zhu)
1) The document discusses how human expertise remains important in the age of evolving AI, especially as AI systems transition from narrow, rule-based approaches to more general and autonomous capabilities like deep learning and generative AI.
2) It provides examples of how human expertise can guide different AI approaches, from feature engineering for machine learning to providing feedback to help validate or invalidate systems.
3) The document also covers challenges around the business use of advanced AI, including how to ensure systems are explainable, accountable, and developed responsibly according to principles like fairness, privacy and reliability.
1) Dr. Liming Zhu from CSIRO's Data61 discusses using generative AI like ChatGPT to assist with scientific discovery by acting as smart interns or tools that can provide low-cost experimentation ideas and analyses.
2) This raises opportunities for changing the role of scientist expertise and facilitating new cross-discipline collaborations, but also risks around ensuring AI systems are developed and used responsibly and trustworthily.
3) CSIRO is working on best practices for responsible generative AI, including through their Responsible AI Pattern Catalogue, to help address issues of trust, transparency, and accountability as general AI capabilities are applied to science.
AI Unveiled: From Current State to Future Frontiers (Liming Zhu)
The document discusses AI and generative AI. It provides an overview of CSIRO's Data61, Australia's largest data and digital innovation organization. It discusses definitions of AI, different AI approaches, and the roles of human expertise. It covers recent advances in generative AI like GPT-3 and foundation models. It also discusses opportunities and risks of generative AI, and Australia's vision for responsible AI through ethics principles and the Responsible AI Network. The document advocates for system-level practices and governance to manage risks of generative AI while understanding and explaining models rather than just building them.
Software Architecture for Foundation Model-Based Systems (Liming Zhu)
With the successful implementation of Large Language Models (LLMs) in chatbots like ChatGPT, there is growing attention on foundation models, which are anticipated to serve as core components in the development of future AI systems. Yet, systematic exploration into the design of foundation model-based systems, particularly concerning risk management, trust, and trustworthiness, remains limited. In this talk, I discuss the challenges and initial approaches in architecting LLM-based systems, and how LLM-based systems in turn impact software engineering. I point to some initial directions, such as architecting as a process of understanding (rather than designing/building), setting and trading off guardrails (rather than quality attributes), and radical observability.
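A minimal sketch of the "guardrails rather than quality attributes" direction: wrap the model call so that outputs violating a policy are retried or escalated. The complete() function is a hypothetical stand-in for any LLM client, and the banned-term check is deliberately simplistic.

```python
# Guardrail wrapper around an LLM call. complete() is a placeholder for
# a real model client; the policy check is intentionally trivial.
BANNED = {"ssn", "credit card"}

def complete(prompt):
    return "Here is the summary you asked for."   # placeholder model output

def guarded_complete(prompt, max_retries=2):
    for _ in range(max_retries + 1):
        out = complete(prompt)
        if not any(term in out.lower() for term in BANNED):
            return out                            # passed the guardrail
    raise RuntimeError("output kept violating policy; escalate to a human")

print(guarded_complete("Summarise this document"))
```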
The document discusses challenges and directions for responsible AI. It outlines three gaps: 1) the need to align AI principles and standards with engineering practices; 2) the difficulty understanding inscrutable AI models; and 3) the misalignment between AI principles and system-level behaviors. It proposes closing these gaps through engineering practices, operationalizable frameworks, and connected design patterns. It also advocates understanding AI systems through testing and accountability measures. Finally, it discusses designing foundation model-based systems through capabilities rather than functions and ensuring tools are optimized for and trusted by humans and AI agents alike.
This document discusses generative AI and its potential transformations and use cases. It outlines how generative AI could enable more low-cost experimentation, blur division boundaries, and allow "talking to data" for innovation and operational excellence. The document also references responsible AI frameworks and a pattern catalogue for developing foundation model-based systems. Potential use cases discussed include automated reporting, digital twins, data integration, operation planning, communication, and innovation applications like surrogate models and cross-discipline synthesis.
Trends & Innovation in Cyber and Digital Tech (Liming Zhu)
The document summarizes key topics in AI and cybersecurity discussed by Liming Zhu, Research Director at CSIRO's Data61 and Conjoint Professor at UNSW. It highlights the need for trustworthy and responsible AI practices, as well as trusted collaborative intelligence and digital infrastructure. The role of AI is evolving as systems learn from data and examples to exhibit emerging capabilities, and oversight will need to scale with AI. Large language models will also become more sophisticated products and services, requiring new techniques for tuning and operations.
Responsible/Trustworthy AI in the Era of Foundation Models (Liming Zhu)
The document discusses responsible AI and challenges in developing AI systems. It summarizes CSIRO's work on closing gaps between principles and engineering practices, understanding increasingly complex AI systems, and designing systems that incorporate foundation models. Key points include the need to measure system-level impacts, develop engineering methods for explainability and oversight, and design tools that can evolve with AI capabilities rather than target specific functions.
ICSE23 Keynote: Software Engineering as the Linchpin of Responsible AI (Liming Zhu)
Liming Zhu presented on software engineering for responsible AI. Some key points:
1. There are gaps between principles/standards and engineering practices for building ethical AI systems, and between different teams working in silos.
2. Software engineering can play a connecting role by closing these gaps and operationalizing responsible practices at the system level through approaches like guardrails, observability, and understanding AI systems rather than just building them.
3. As large language models become more common, software engineering for responsible AI will need to adapt, such as through reference architectures for designing systems using foundation models responsibly.
The talk discussed directions for software engineering to help close gaps and connect silos in responsible AI development.
International Cooperation for Research on Privacy and Data Protection - Austr... (Liming Zhu)
The document discusses Australia's approach to international cooperation on research related to privacy, data protection, and artificial intelligence. It outlines CSIRO's Data61 organization, which conducts research in these areas. It notes trends like increased data sharing and large language models that raise privacy concerns. It also discusses balancing innovation with regulatory requirements. The document proposes connecting different risks, closing gaps between principles and algorithms, and forming an international research alliance to coordinate standards and enable cross-border data sharing with privacy protections.
RegTech for IR - Opportunities and Lessons (Liming Zhu)
This document discusses opportunities for using regulatory technology (RegTech) in industrial relations (IR) in Australia. It notes the large costs of regulatory compliance and difficulties in understanding and implementing new laws and policies. The CSIRO has developed several proofs of concept and spinouts applying techniques like machine learning to areas like analyzing awards and enterprise agreements against payroll data, optimizing work rosters, and filling out forms based on legislation. Lessons learned include the need for machine-readable rules and automation to handle complex regulations. Potential benefits include reduced compliance costs, faster product development, lower risk, and tools to evaluate potential regulatory impacts. Open questions remain around modeling legal language, generating explanations, ensuring ethics, and deploying capabilities.
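The "machine-readable rules" lesson can be made concrete with a toy award rule encoded as data and checked against payroll records. The rule, rates, and records below are invented, and real awards involve far more conditions.

```python
# A pay rule encoded as machine-readable data, checked against payroll.
# The award name, rates, and records are entirely hypothetical.
RULE = {"award": "hypothetical_retail_award",
        "min_hourly_rate": 24.10,
        "weekend_multiplier": 1.5}

payroll = [
    {"employee": "E1", "hours": 8, "paid": 200.00, "weekend": False},
    {"employee": "E2", "hours": 6, "paid": 150.00, "weekend": True},
]

def check(record, rule):
    rate = rule["min_hourly_rate"]
    if record["weekend"]:
        rate *= rule["weekend_multiplier"]
    owed = record["hours"] * rate
    return record["paid"] >= owed, owed

for r in payroll:
    ok, owed = check(r, RULE)
    print(r["employee"], "compliant" if ok else f"underpaid, owed {owed:.2f}")
```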
CSIRO's Data61, part of Australia's national science agency, is working to develop responsible and trustworthy AI according to human principles and values. Data61 has over 1000 people working on responsible technology issues like privacy-preserving federated learning and using blockchain for ESG certification. Data61 is also applying Australia's 8 National AI Ethics Principles in areas like human-centered design, fairness through requirements engineering, and accountability via transparent traceability. Moving forward, Data61 will continue collaborating with industry on responsible AI projects addressing social and technical challenges.
Cyber technologies for SME growth – Barriers and Solutions (Liming Zhu)
This document discusses barriers that small and medium enterprises (SMEs) face in adopting cyber technologies and potential solutions. It notes that SMEs have limited resources to pursue innovations like artificial intelligence (AI) and cybersecurity alone. It then presents several solutions developed by the Commonwealth Scientific and Industrial Research Organisation's (CSIRO) Data61 that aim to help SMEs overcome barriers through collaborative research and development models at scale, including open source ecosystems, data commons, living labs, technology platforms, shared infrastructure, and reference implementations of standards.
Challenges in Practicing High Frequency Releases in Cloud Environments (Liming Zhu)
Talk at RELENG 2014
Full paper: http://www.nicta.com.au/pub?doc=7925
The continuous delivery trend is dramatically shortening release cycles from months into hours. Applications with high frequency releases often rely heavily on automated deployment tools using cloud infrastructure APIs. We report some results from experiments on reliability issues of cloud infrastructure and trade-offs between using heavily-baked and lightly-baked images. Our experiments were based on Amazon Web Service (AWS) OpsWorks APIs and configuration management tool Chef. As a result of our experiments, we then propose error handling practices that can be included in tailor-made continuous deployment facilities.
More related info at our DevOps book http://www.ssrg.nicta.com.au/projects/devops_book/
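In the spirit of the error-handling practices the paper proposes (though not its actual code), here is a generic retry-with-exponential-backoff wrapper around a deployment step. deploy_instance() is a hypothetical stand-in for a cloud SDK call, such as an OpsWorks deployment request.

```python
# Retrying transient cloud-API failures with exponential backoff and
# jitter. deploy_instance() simulates a flaky cloud call; names invented.
import random
import time

def deploy_instance():
    if random.random() < 0.5:                  # simulated transient fault
        raise TimeoutError("cloud API timed out")
    return "instance-123"

def with_backoff(op, max_attempts=5, base_delay=1.0):
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except TimeoutError:
            if attempt == max_attempts:
                raise                          # give up after final attempt
            delay = base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5)
            time.sleep(delay)                  # jitter avoids thundering herd

print(with_backoff(deploy_instance))
```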
Dependable Operation - Performance Management and Capacity Planning Under Con... (Liming Zhu)
This document discusses approaches for managing systems undergoing continuous changes, such as those from continuous deployment in cloud environments. It proposes incorporating knowledge about sporadic operations and external events into system management. For sporadic operations, it describes Process-Oriented Dependability (POD) for error detection and diagnosis during operations like rolling upgrades. It also discusses using process context for alert management and availability analysis of sporadic operations. For external events, it discusses event-aware workload prediction. The goal is to better support operations personnel through performance engineering techniques that account for changes and uncertainty in cloud systems.
The document discusses ensuring dependable operations in cloud computing. It notes that 80% of outages are caused by human/process issues rather than technical faults. The author presents a process-oriented approach to modeling operations as sets of steps executed by agents requiring resources. Faults at one step may surface later, so their framework aims to verify step post
Modelling and Analysing Operation Processes for Dependability (Liming Zhu)
The 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN13) talk slides. June 27th, 2013. Full text here: http://www.nicta.com.au/pub?doc=7031
Takashi Kobayashi and Hironori Washizaki, "SWEBOK Guide and Future of SE Education," First International Symposium on the Future of Software Engineering (FUSE), June 3-6, 2024, Okinawa, Japan
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
E-commerce Application Development Company.pdf (Hornet Dynamics)
Your business can reach new heights with our assistance as we design solutions that are specifically appropriate for your goals and vision. Our eCommerce application solutions can digitally coordinate all retail operations processes to meet the demands of the marketplace while maintaining business continuity.
What is Augmented Reality Image Tracking (pavan998932)
Augmented Reality (AR) Image Tracking is a technology that enables AR applications to recognize and track images in the real world, overlaying digital content onto them. This enhances the user's interaction with their environment by providing additional information and interactive elements directly tied to physical images.
An Enterprise Resource Planning (ERP) system includes various modules that reduce any business's workload. Additionally, it organizes workflows, which drives enhanced productivity. Here is a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Odoo ERP software
Odoo ERP software, a leading open-source platform for Enterprise Resource Planning (ERP) and business management, has recently launched its latest version, Odoo 17 Community Edition. This update introduces a range of new features and enhancements designed to streamline business operations and support growth.
The Odoo Community serves as a cost-free edition within the Odoo suite of ERP systems. Tailored to accommodate the standard needs of business operations, it provides a robust platform suitable for organisations of different sizes and business sectors. Within the Odoo Community Edition, users can access a variety of essential features and services essential for managing day-to-day tasks efficiently.
This blog presents a detailed overview of the features available within the Odoo 17 Community edition, and the differences between Odoo 17 community and enterprise editions, aiming to equip you with the necessary information to make an informed decision about its suitability for your business.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit Paris (Neo4j)
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Discover the latest innovations from Neo4j, including the latest cloud integrations and product enhancements that make Neo4j an essential choice for developers building applications with connected data and generative AI.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
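As a rough analogue of the directory-watcher trigger mentioned above, the stdlib-only Python sketch below polls a folder and fires an action for newly arrived files. The path and action are invented, and FME itself provides this trigger natively.

```python
# Poll a folder and fire an action for each newly arrived file,
# mimicking a directory-watcher automation trigger. Names are invented.
import time
from pathlib import Path

def watch(folder, action, interval=2.0):
    seen = {p.name for p in Path(folder).iterdir()}
    while True:
        current = {p.name for p in Path(folder).iterdir()}
        for name in sorted(current - seen):
            action(name)                       # e.g. submit a workspace run
        seen = current
        time.sleep(interval)

# watch("/data/incoming", lambda f: print(f"new file: {f}"))  # runs forever
```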
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code (Aftab Hussain)
Understanding variable roles in code has been found to be helpful by students in learning programming -- could variable roles help deep neural models in performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne, Australia
GraphSummit Paris - The art of the possible with Graph Technology (Neo4j)
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Łukasz Chruściel
No one wants their application to drag like a car stuck in the slow lane! Yet it’s all too common to encounter bumpy, pothole-filled solutions that slow the speed of any application. Symfony apps are not an exception.
In this talk, I will take you for a spin around the performance racetrack. We’ll explore common pitfalls - those hidden potholes on your application that can cause unexpected slowdowns. Learn how to spot these performance bumps early, and more importantly, how to navigate around them to keep your application running at top speed.
We will focus in particular on tuning your engine at the application level, making the right adjustments to ensure that your system responds like a well-oiled, high-performance race car.
Do you want software for your business? Visit Deuglo.
Deuglo has top software developers in India. They are experts in software development and help design and create custom software solutions.
Deuglo follows a seven-step method for delivering services to its customers, called the Software Development Life Cycle (SDLC) process.
Requirement — collecting the requirements is the first phase of the SDLC process.
Feasibility Study — after completing the requirements phase, the team assesses whether the project is viable before moving to design.
Design — in this phase, they start designing the software.
Coding — when the design is complete, the developers start coding the software.
Testing — once coding is done, the testing team starts testing the software.
Installation — after testing is complete, the application is deployed to the live server and launched.
Maintenance — after delivery, customers start using the software and it receives ongoing support and updates.
E-commerce Development Services- Hornet DynamicsHornet Dynamics
For any business hoping to succeed in the digital age, having a strong online presence is crucial. We offer Ecommerce Development Services that are customized according to your business requirements and client preferences, enabling you to create a dynamic, safe, and user-friendly online store.
Hand Rolled Applicative User ValidationCode KataPhilip Schwarz
Could you use a simple piece of Scala validation code (granted, a very simplistic one too!) that you can rewrite, now and again, to refresh your basic understanding of Applicative operators <*>, <*, *>?
The goal is not to write perfect code showcasing validation, but rather to provide a small, rough-and-ready exercise to reinforce your muscle memory.
Despite its grandiose-sounding title, this deck consists of just three slides showing the Scala 3 code to be rewritten whenever the details of the operators begin to fade away.
The code is my rough-and-ready translation of a Haskell user-validation program found in the book Finding Success (and Failure) in Haskell: Fall in love with applicative functors.
Distributed Trust Architecture: The New Foundation of Everything
1. Australia’s National Science Agency
Distributed Trust Architecture:
The New Foundation of Everything
Dr Liming Zhu
Research Director, CSIRO’s Data61
Professor, University of New South Wales
Chair, Blockchain & Distributed Ledger Technology, Standards Australia
Expert on working groups:
• ISO/IEC JTC 1/WG 13 Trustworthiness
• ISO/IEC JTC 1/SC 42/WG 3 - Artificial intelligence – Trustworthiness
2. CSIRO's Data61: Australia's Largest Data & Digital Innovation R&D Organisation
• 1000+ talented people (including affiliates/students)
• Home of Australia's National AI Centre
• Data61 has generated 18+ spin-outs and 130+ patent groups
• 200+ government & corporate partners
• 300+ PhD students; 30+ university collaborators
• Facilities: Mixed-Reality Lab, Robotics Innovation Centre, AI4Cyber HPC Enclave
• Focus areas: Responsible Tech/AI; Privacy & RegTech; Engineering & Design of AI Systems; Resilient & Recovery Tech; Cybersecurity; Digital Twin; Spark (bushfire) toolkit
3. Blockchain Work at Data61
• Blockchain & DLT Standards
  – ISO/TC 307
• Government Advisory
• Industry Projects & Technology
• Reports, Books
• Research
  – Models and architecture for blockchain-based systems
  – Business process & blockchain
  – Trustworthy blockchain
4. Trust Shifts to Distributed Trust
Local → Institutional → Distributed
A culture of "distributed trust" in data, ML models, systems, individuals and organisations
5. Trust Architecture
Systems operating in the context of:
• Zero Trust Environment
• Trustless Machines/Protocols
• Distributed Trust/Blockchain
• Distributed Infrastructure
• Data, Compute/Code, Models
6. Bi-directional Distributed Trust in Data Sharing
Data sharing, Data-as-a-Service & Model-as-a-Service
§ More sources & types from the public or from partners
§ Decentralized, distributed, federated
§ Access and use of sensitive data from another organization/country
§ Privacy, but also commercial and other sensitivity
§ ML/analytics over encrypted data
§ "Sharing without access" (see the sketch below)
§ Open data/innovation (anonymized or desensitized data)
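To make "sharing without access" concrete, here is a minimal Python sketch of secure aggregation via additive secret sharing. It illustrates the general technique, not Data61's implementation, and all party names and values are hypothetical: each organisation splits its private value into random shares, so the aggregator only ever learns the joint sum.

```python
import random

MODULUS = 2**61 - 1  # arithmetic is done modulo a large prime

def make_shares(secret: int, n_parties: int) -> list[int]:
    """Split `secret` into n additive shares that sum to it mod MODULUS."""
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

# Three organisations each hold a private value (e.g., local fraud counts).
private_values = {"bank": 120, "telco": 45, "gov": 80}

# Each party splits its value; no single share (or any subset short of
# all of them) reveals the original value.
all_shares = [make_shares(v, 3) for v in private_values.values()]

# Each party sums the shares it received and publishes only that partial sum.
partial_sums = [sum(col) % MODULUS for col in zip(*all_shares)]

# Combining the partial sums cancels the masks, leaving only the total.
total = sum(partial_sums) % MODULUS
print(total)  # 245 -- the joint statistic, computed without sharing raw data
```

Real deployments add authenticated channels and dropout handling, but the trust property is the same: no party sees another party's raw data.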
7. Trust Architecture via Regulation/Ethics Overlay
Platforms, risk-based approaches, market architecture...
Legislation
• EU's GDPR: privacy, security and "specific" purpose of use
• Australia
  • Data Breach Notification Scheme
  • Consumer Data Right (CDR): Open Banking, Energy...
• AI regulations
Ethics Principles and Guidelines
• Can you trust a data/AI-powered service provided and run by others?
• OECD/GPAI, UN, standards bodies...
• Australia: AI Ethics Frameworks and Guidelines
8. Distributed Trust Architecture in AI Engineering/Systems
Circa 2014-15:
• Entanglements, correction cascades, undeclared customers
• Data (model, code, config...) dependencies
• Anti-patterns
• Debt: abstraction, reproducibility, process management, culture
2020-2021/Today:
• "Federated data collection, storage, model, and infrastructure"
• "Co-design and co-versioning"...
• Implications of foundation models
10. 42 Shades of Grey in (Distributed) Trust Architecture
11. Based on Selected Data61 Work
1. Y. Liu, Q. Lu, H.-Y. Paik, L. Zhu: Defining Blockchain Governance Principles: A Comprehensive Framework (2021). https://arxiv.org/abs/2110.13374
2. M. Qi, Z. Wang, F. Wu, R. Hanson, S. Chen, Y. Xiang, L. Zhu: Blockchain-Enabled Federated Learning Model for Privacy Preservation: System Design. ACISP 2021
3. S. Lo, Y. Liu, Q. Lu, C. Wang, X. Xu, H.-Y. Paik, L. Zhu: Blockchain-based Trustworthy Federated Learning Architecture (2021). https://arxiv.org/abs/2108.06912
4. S. Lo, Q. Lu, L. Zhu, H.-Y. Paik, X. Xu, C. Wang: Architectural Patterns for the Design of Federated Learning Systems (2021). https://arxiv.org/abs/2101.02373
5. W. Zhang, Q. Lu, et al.: Blockchain-Based Federated Learning for Device Failure Detection in Industrial IoT. IEEE Internet Things J. 8(7): 5926-5937 (2021)
6. S. K. Lo, Q. Lu, H.-Y. Paik, L. Zhu: FLRA: A Reference Architecture for Federated Learning Systems. ECSA 2021: 83-98
7. Q. Lu, X. Xu, H. M. N. Bandara, S. Chen, L. Zhu: Design Patterns for Blockchain-Based Payment Applications (2021). https://arxiv.org/abs/2102.09810
8. S. Y. Chia, X. Xu, H.-Y. Paik, L. Zhu: Analysing and Extending Privacy Patterns with Architectural Context. SAC 2021
9. Y. Shanmugarasa, H.-Y. Paik, S. Kanhere, L. Zhu: Towards Automated Data Sharing in Personal Data Stores. PerCom Workshops 2021: 328-331
10. L. Zhu, X. Xu, Q. Lu, et al.: AI and Ethics - Operationalising Responsible AI. Humanity Driven AI (2021). https://arxiv.org/abs/2105.08867
11. M. Dong, F. Yuan, L. Yao, X. Wang, X. Xu, L. Zhu: Trust in Recommender Systems: A Deep Learning Perspective (2020). https://arxiv.org/abs/2004.03774
12. Y. Gao, M. Kim, S. Abuadbba, et al.: End-to-End Evaluation of Federated Learning and Split Learning for Internet of Things. SRDS 2020: 91-100
13. D. Wu, S. Sakr, L. Zhu, et al.: HDM-MC in-Action: A Framework for Big Data Analytics across Multiple Clusters. ICDCS 2018
14. Y. Zhang, L. Zhu, X. Xu, S. Chen, A. B. Tran: Data Service API Design for Data Analytics. SCC 2018: 87-102
15. S. Lee, R. Jeffery, L. Zhu: A Contingency-Based Approach to Data Governance Design for Platform Ecosystem. PACIS 2018
16. S. Lee, R. Jeffery, L. Zhu: Data Governance Decisions for Platform Ecosystems. HICSS 2019
17. D. Wu, S. Sakr, L. Zhu, H. Wu: Towards Big Data Analytics across Multiple Clusters. CCGrid 2017: 218-227
18. L. Bass, R. Holz, P. Rimba, A. B. Tran, L. Zhu: Securing a Deployment Pipeline. RELENG 2015
12. Is Blockchain the Silver Bullet? Yes & No.
• Distributed trust architecture is all about trade-offs.
• Blockchain sometimes solves the problem, but mostly it complements and inspires.
• Limitations of cryptoeconomics and blockchain:
  • Persistent plutocracy, suppression of participant interests, discounting of externalities, a drift towards politics, state regulation, temporal modulation, hybridity...
https://arxiv.org/abs/2110.13374
13. Trust Architecture: Untrusted Analytics to Trusted Data
From MIT Enigma to Solid PODS to Data Airlock
Data61's "Data Airlock" architecture:
• Model to data; insights back
• Enables analytics over sensitive data
• Vetting of the insights sent back (see the sketch below)
• Case studies: major government agency; genomics
Blockchain complements via trusted provenance and value redistribution
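A toy Python sketch of the airlock idea, assuming a hypothetical vetting policy (a minimum cohort size); the actual Data Airlock uses far richer controls. The point is the flow: the model comes to the data, and only vetted insights leave.

```python
from typing import Callable

MIN_COHORT = 10  # illustrative vetting rule: suppress small-group results

def run_in_airlock(model: Callable[[list[dict]], dict], records: list[dict]) -> dict:
    """Run an untrusted model against sensitive data; vet insights before release."""
    insights = model(records)  # the model comes to the data; data never leaves
    vetted = {}
    for key, (value, cohort_size) in insights.items():
        if cohort_size >= MIN_COHORT:   # release only sufficiently aggregated results
            vetted[key] = value
        else:
            vetted[key] = "SUPPRESSED"  # block potentially re-identifying outputs
    return vetted

# Example untrusted analysis: average age per diagnosis group.
def mean_age_by_diagnosis(records):
    groups = {}
    for r in records:
        groups.setdefault(r["diagnosis"], []).append(r["age"])
    return {d: (sum(ages) / len(ages), len(ages)) for d, ages in groups.items()}

data = [{"diagnosis": "A", "age": 40 + i % 20} for i in range(30)] + \
       [{"diagnosis": "B", "age": 55}]  # a cohort of one -- must be suppressed
print(run_in_airlock(mean_age_by_diagnosis, data))
```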
14. Trust Architecture: Trusted Data to Untrusted Analytics
Data61's R4: Re-identification Risks Ready-Reckoner
§ Open release of data to the public
§ Provably private/desensitized data sharing and release for analytics collaboration
§ Quantified risks and mitigations (see the sketch below)
§ Case studies: worked with 30+ government agencies
Blockchain complements via trusted provenance and value redistribution
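In the spirit of quantified re-identification risk (though not reproducing the actual R4 method), a minimal sketch: the share of records that are unique on a set of quasi-identifiers is one simple risk signal. Field names are hypothetical.

```python
from collections import Counter

def uniqueness_risk(records: list[dict], quasi_ids: list[str]) -> float:
    """Fraction of records unique on the chosen quasi-identifiers (higher = riskier)."""
    keys = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    unique = sum(1 for r in records if keys[tuple(r[q] for q in quasi_ids)] == 1)
    return unique / len(records)

records = [
    {"postcode": "2000", "birth_year": 1980, "sex": "F"},
    {"postcode": "2000", "birth_year": 1980, "sex": "F"},
    {"postcode": "2850", "birth_year": 1975, "sex": "M"},  # unique combination
]
# One of three records is unique on (postcode, birth_year, sex).
print(uniqueness_risk(records, ["postcode", "birth_year", "sex"]))  # ~0.33
```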
15. Trust Architecture at Scale: Consumer-Driven Sharing
Enabling FinTechs, including blockchain-based ones
• Consumer Data Right (CDR): Australian legislation covering consumer data and the services around it
  • Consumers can authorise third parties to access their data (see the sketch below)
  • Currently designated sectors: banking, energy...
• Data61's (recent) role
  • Setting architecture/data API standards
  • Security profile standards
• Trust architecture trade-offs
  • Trusted gateway vs. peer-to-peer trust
  • Trust in nodes: processing-only vs. processing + use
https://consumerdatastandards.gov.au
ACCC Consumer Data Right in Energy consultation paper: data access models for energy data, 2019
Enabling a myriad of FinTech blockchain innovations
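The consumer-authorisation flow can be sketched as a gateway check: data is served to a third party only under an active, in-scope consent. This is a hypothetical illustration of the pattern, not the Consumer Data Standards API; all names, scopes and dates are made up.

```python
from datetime import datetime, timezone

# Hypothetical consent register: (consumer, accredited recipient) -> scopes, expiry
CONSENTS = {
    ("alice", "fintech-123"): {
        "scopes": {"accounts:read"},
        "expires": datetime(2026, 1, 1, tzinfo=timezone.utc),  # illustrative expiry
    },
}

def authorise(consumer: str, recipient: str, scope: str) -> bool:
    """Gateway check: serve data only under an active, in-scope consumer consent."""
    consent = CONSENTS.get((consumer, recipient))
    if consent is None:
        return False
    in_scope = scope in consent["scopes"]
    active = datetime.now(timezone.utc) < consent["expires"]
    return in_scope and active

print(authorise("alice", "fintech-123", "accounts:read"))   # True (while consent is active)
print(authorise("alice", "fintech-123", "payments:write"))  # False: never consented
```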
16. Distributed Trust at the Edge: PODS & Privacy Setting Recommendation
Adding trust usability to blockchain-based PODS
Y. Shanmugarasa, H.-Y. Paik, S. S. Kanhere, L. Zhu: Towards Automated Data Sharing in Personal Data Stores. PerCom Workshops 2021: 328-331
17. Trust Architecture Patterns: Privacy-by-Design
Responsible AI Strategy
GDPR & Australian Privacy Principles
Data61 work: S. Y. Chia, X. Xu, H.-Y. Paik, L. Zhu: Analysing and Extending Privacy Patterns with Architectural Context. SAC 2021
18. Trust Architecture across Clusters
Data systems across heterogeneous clusters (vertical/horizontal partition)
• Horizontal: features are similar, but the data differs (different phones, different patients)
• Vertical: same sample IDs, but different features (e.g., your data held by banks and telcos); see the sketch below
Architecture (trust) trade-offs: computation, communication, dependability, maintainability...
D. Wu, S. Sakr, L. Zhu, et al.: HDM-MC in-Action: A Framework for Big Data Analytics across Multiple Clusters. ICDCS 2018
D. Wu, S. Sakr, L. Zhu, H. Wu: Towards Big Data Analytics across Multiple Clusters. CCGrid 2017: 218-227
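A tiny Python sketch of the two partitioning styles, with toy data and hypothetical feature names: vertical partitions share row identity and join on a sample ID, while horizontal partitions share a schema and union rows.

```python
# Toy records keyed by a shared sample ID.
bank  = {1: {"income": 90_000}, 2: {"income": 60_000}}   # bank's features
telco = {1: {"calls": 300},     2: {"calls": 120}}       # telco's features

# Vertical partitioning: same IDs, different features -> join on ID.
vertical = {i: {**bank[i], **telco[i]} for i in bank.keys() & telco.keys()}
print(vertical[1])  # {'income': 90000, 'calls': 300}

# Horizontal partitioning: same features, different rows (e.g., two hospitals).
hospital_a = [{"age": 40, "dx": "A"}]
hospital_b = [{"age": 55, "dx": "B"}]
horizontal = hospital_a + hospital_b  # union of rows under one shared schema
print(len(horizontal))  # 2
```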
19. Trust Architecture: Federated Data/Model Sharing
Architecture overview (diagram): data custodians and data users interact through data access request management, with privacy and ethics controls throughout. Components include: dataset and model sets with version control; data/model discovery and registries (Data Registry, AI Model Registry); a knowledge base with linked data; Model-Data-Code-Config co-versioning (see the sketch below); process management and risk assessment; continuous monitoring of ethical usage; immutable data-model dependency tracking; secure APIs; synthetic data; analytics ethics; consent management; per-organisation data/analytics management; a federated data catalogue; provable privacy release; identity management (Macrokey); and blockchain.
https://data.gov.au powered by Data61 MAGDA
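One plausible reading of "Model-Data-Code-Config co-versioning" with "immutable dependency tracking" is a content-addressed manifest, sketched below with hypothetical artifact names: each release pins the hashes of exactly the data, model, code and config it was built from, and chaining manifest IDs (e.g., by anchoring them on a blockchain) makes the lineage tamper-evident.

```python
import hashlib, json

def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def make_manifest(artifacts: dict[str, bytes], parent: str | None) -> dict:
    """Pin every artifact's hash in one record; chain to the previous version."""
    manifest = {
        "artifacts": {name: digest(blob) for name, blob in artifacts.items()},
        "parent": parent,  # previous manifest id -> tamper-evident lineage
    }
    manifest["id"] = digest(json.dumps(manifest, sort_keys=True).encode())
    return manifest

v1 = make_manifest({"data": b"rows-v1", "model": b"weights-v1",
                    "code": b"train.py-v1", "config": b"lr=0.1"}, parent=None)
v2 = make_manifest({"data": b"rows-v2", "model": b"weights-v2",
                    "code": b"train.py-v1", "config": b"lr=0.1"}, parent=v1["id"])
# Any change to data, model, code or config changes the manifest id;
# anchoring the id on a ledger makes the dependency record immutable.
print(v2["parent"] == v1["id"])  # True
```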
20. Trustworthiness: Model/Data Integrity & Provenance
Responsible AI Strategy
• Blockchain improves trust in data integrity and model integrity
• Provenance is the key
Data61 work: X. Xu, C. Wang, J. Wang, et al.: "Improving Trustworthiness of AI-based Dynamic Digital-Physical Parity", 2021 (submitted)
21. Trust Architecture: Federated ML/Data Analytics
From limited access to full encryption during use
When there are cultural or legislative restrictions on data sharing, consider alternatives!
Federated model: "data co-ops"
• No centralised data repositories
• Edge AI and analytics (see the federated averaging sketch below)
Scientific approaches
• Zero-knowledge proofs, homomorphic encryption, secure multi-party computation
Other case studies at Data61
• Bank + telco for fraud analytics
• Two government departments for joint insights
Other supported scenarios
• Innovation in secure transactions
• Access to data by regulators
• Cross-border data flow
Data61 work: S. K. Lo, Q. Lu, L. Zhu, H.-Y. Paik, X. Xu, C. Wang: Architectural Patterns for the Design of Federated Learning Systems. Journal of Systems and Software (2021)
Data61 work: S. K. Lo, Q. Lu, H.-Y. Paik, L. Zhu: FLRA: A Reference Architecture for Federated Learning Systems. ECSA 2021
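A minimal federated-averaging sketch in plain Python, using a toy one-parameter linear model; it illustrates the "data co-op" pattern rather than any specific Data61 architecture. Parties train locally on private data and share only model parameters, which a coordinator averages.

```python
def local_step(weights, data, lr=0.1):
    """One local gradient step for y = w*x on this party's private data."""
    w = weights
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad

# Two parties' private datasets never leave their premises.
party_data = [
    [(1.0, 2.0), (2.0, 4.0)],   # party A: y = 2x exactly
    [(3.0, 6.1), (4.0, 7.9)],   # party B: noisy y = 2x
]

w_global = 0.0
for _ in range(20):
    # Each party computes an update locally and sends back only the new weight.
    local_weights = [local_step(w_global, data) for data in party_data]
    # Coordinator: federated averaging of the received updates.
    w_global = sum(local_weights) / len(local_weights)

print(round(w_global, 2))  # converges near 2.0 without pooling any raw data
```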
22. Federated Learning Architecture & Use Cases
Use cases
• Keyboard prediction
• Browser history recommendation
• Visual object detection
• Diagnosis and treatment prediction
• Drug discovery (across facilities, involving IP)
• Meta-analysis over distributed medical databases
• Augmented reality
Data61 case studies
• Named entity resolution
• Fraud/anomaly detection (bank + telco)
• Crop yield prediction via federated transfer learning
• IIoT fault detection
24. Blockchain-based Trustworthy Federated Learning Architecture
S. Lo, Y. Liu, Q. Lu, C. Wang, X. Xu, H.-Y. Paik, L. Zhu: Blockchain-based Trustworthy Federated Learning Architecture (2021)
25. Blockchain-based Federated Learning for Industrial IoT
W. Zhang, Q. Lu, et al.: Blockchain-Based Federated Learning for Device Failure Detection in Industrial IoT. IEEE Internet Things J. 8(7): 5926-5937 (2021)
27. Designing Trust with Blockchain (1/2)
Design decision tree (diagram), covering:
• Trust decentralisation (IV.A): Is there a trusted authority? Can it be decentralised, and how? If decentralisation is not needed, use a traditional database.
• Storage and computation (IV.B): on-chain vs. off-chain; what data structure?
• Blockchain configurations (IV.C): Is a new blockchain needed, and of what type? Are multiple blockchains needed? What block size, block frequency and consensus protocol?
• Other design decisions (IV.D): Where to deploy? What incentives? Is an anonymity mechanism needed?
Design process, quality analysis, design patterns and governance/risks:
• Design process, including suitability analysis
  – A taxonomy of blockchain-based systems for architecture design, X. Xu, I. Weber, M. Staples et al., ICSA 2017
  – The blockchain as a software connector, X. Xu, C. Pautasso, L. Zhu et al., WICSA 2016
• Quality analysis
  – Quantifying the cost of distrust: comparing blockchain and cloud services for business process execution, P. Rimba, A. B. Tran, I. Weber et al., Information Systems Frontiers, accepted August 2018 (previously SCAC 2017)
  – Comparing blockchain and cloud services for business process execution, P. Rimba, A. B. Tran, I. Weber et al., ICSA 2017
  – Predicting latency of blockchain-based systems using architectural modelling and simulation, R. Yasaweerasinghelage, M. Staples and I. Weber, ICSA 2017
• Design patterns: https://research.csiro.au/blockchainpatterns/
  – A pattern collection for blockchain-based applications, X. Xu, C. Pautasso, L. Zhu, Q. Lu, and I. Weber, EuroPLoP 2018
• Integration with other systems
  – EthDrive: a peer-to-peer data storage with provenance, X. L. Yu, X. Xu, B. Liu, CAiSE 2017
28. Designing Trust with Blockchain (2/2)
Cross-org focused; process/data/assets/artifact-based model-driven engineering:
• Business process execution
  – Untrusted business process monitoring and execution using blockchain, I. Weber, X. Xu, R. Riveret et al., BPM 2016
  – Optimized execution of business processes on blockchain, L. García-Bañuelos, A. Ponomarev, M. Dumas, I. Weber, BPM 2017
  – Caterpillar: a blockchain-based business process management system, O. López-Pintado, L. García-Bañuelos, M. Dumas, and I. Weber, BPM 2017 Demo
  – Runtime verification for business processes utilizing the Bitcoin blockchain, C. Prybila, S. Schulte, C. Hochreiner, and I. Weber, Future Generation Computer Systems (FGCS), accepted August 2017
• Data / asset modelling
  – Regerator: a registry generator for blockchain, A. B. Tran, X. Xu, I. Weber, CAiSE 2017 Demo
• Combined asset & process modelling
  – Lorikeet: a model-driven engineering tool for blockchain-based business process execution and asset management, A. B. Tran, Q. Lu, I. Weber, BPM 2018 Demo
29. Application: Tokenisation in Digital Finance
• Digitisation of assets (as tokens) and distributed trading
  – Unlocking finance around illiquid assets
• Interactions between primary registries (not necessarily blockchain-based) and other exchanges/markets
• Many open research issues in law, regulation, technology and finance
30. Research Programs
1) Dynamic Registers for Instant Exchange: creation, issuing, trading and settlement of digital assets in real time (see the token-registry sketch below)
2) Advanced Securitisation: digitisation of assets and use of CBDCs in transactions
3) Distributed Trading: real-time exchange
4) RegTech with Algorithmic Real-Time Enforcement: governance and compliance in the 'instant' asset-transfer environment; new approaches to regulation, supervision and operational certainty
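A toy sketch of the tokenisation idea: a register that maps asset tokens to owners and enforces the ownership rule a smart contract would encode. Real digital-asset platforms add settlement, CBDC payment legs and compliance hooks; the class, tokens and names here are all hypothetical.

```python
class TokenRegistry:
    """Minimal asset-token register: issue tokens, transfer ownership atomically."""

    def __init__(self):
        self.owners: dict[str, str] = {}  # token_id -> current owner

    def issue(self, token_id: str, owner: str) -> None:
        if token_id in self.owners:
            raise ValueError("token already issued")
        self.owners[token_id] = owner

    def transfer(self, token_id: str, seller: str, buyer: str) -> None:
        # Only the current owner may transfer -- the rule a smart contract
        # would encode and a ledger would make tamper-evident.
        if self.owners.get(token_id) != seller:
            raise PermissionError("seller does not own this token")
        self.owners[token_id] = buyer

registry = TokenRegistry()
registry.issue("warehouse-7", "alice")            # digitise an illiquid asset
registry.transfer("warehouse-7", "alice", "bob")  # real-time exchange
print(registry.owners["warehouse-7"])             # bob
```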
31. Australia's National Science Agency
Trust Architecture in Distributed Ecosystems: Lessons from Responsible AI/Platforms
32. Responsible AI & AI Ethics Principles
• Responsible AI: "the development of intelligent systems according to fundamental human principles and values." [1]
• Being legal is a minimum requirement for responsibility: the duty you owe to others.
• What are these "principles"? E.g. AI ethics principles, which help make sure that "you build the right things".
• How can you be sure, in a verifiable way? "Trustworthy AI" helps make sure that "you build in the right ways".
Australia's AI Ethics Principles:
1) Human, societal and environmental wellbeing
2) Human-centred values
3) Fairness
4) Privacy protection and security
5) Reliability and safety
6) Transparency and explainability
7) Contestability
8) Accountability
33. Responsible AI – What's unique?
"It never does just what I want, but only what I tell it."
• Value alignment problem
  • Given an optimisation algorithm, how do we make sure that optimising its objective function produces outcomes we actually want, in all respects? [1]
  • It is impossible (not simply hard) to accurately and completely specify all the goals, undesirable side-effects and constraints
  • Some requirements are latent
• Autonomy & agency
  • Systems solve problems autonomously, without explicit guidance from a human being
[2] Data61 work: L. Zhu, X. Xu, Q. Lu, G. Governatori, and J. Whittle: "AI and Ethics - Operationalising Responsible AI", Humanity Driven AI (2021). https://arxiv.org/abs/2105.08867
34. Responsible Data Usage for Distributed Ecosystems: Data in a Platform Ecosystem
Traditional life cycle of data (ISO/IEC 38505-1): Create, Store, Process, Archive, Delete, handled by a Data Manager (create, save, update, move, delete).
Platform data life cycle (diagram): Collect (survey/research/productize), Manage, Consume, Terminate, across the roles of Data Provider, Data Manager, Data Analyst and Data Consumer (upload, manage, analyse, generate, use/share, terminate).
Data classification dimensions:
• Access: open, closed or shared data
• Data source: machine-generated, human-sourced or process-mediated
• Process: raw data, derived data (for evolving: aggregating, combining data...; for reusing: discovery) or derived insights (for sharing)
• Existence: retained vs. withdrawn data
• Ownership/rights: proprietary vs. public data
• Privacy/sensitivity: PII vs. non-PII data
S. Lee, R. Jeffery, L. Zhu: "A Contingency-Based Approach to Data Governance Design for Platform Ecosystem", PACIS 2018
S. Lee, R. Jeffery, L. Zhu: "Data Governance Decisions for Platform Ecosystems", HICSS 2019
35. Responsible AI for Distributed Ecosystems
Industry + Organisation + Teams; Data + Model
Shneiderman, B.: Bridging the gap between ethics and practice: guidelines for reliable, safe, and trustworthy human-centered AI systems. ACM Trans. Interact. Intell. Syst. 10(4) (2020)
Data61 work: S. Lee, L. Zhu, R. Jeffery: "Data Governance Decisions for Platform Ecosystems", HICSS 2019: 1-10
36. Operationalising via Architecture and Process Patterns
Data61 work: L. Zhu, X. Xu, Q. Lu, G. Governatori, and J. Whittle: "AI and Ethics - Operationalising Responsible AI", Humanity Driven AI (2021). https://arxiv.org/abs/2105.08867
Data61 work: Q. Lu, L. Zhu, et al.: "Software Engineering for Responsible AI: An Empirical Study and Operationalised Mechanisms" (under review)
37. Applying the Lessons to Blockchain Governance
https://arxiv.org/abs/2110.13374
39. (Distributed) Trust in Metaverse/Digital Twin
"The Metaverse is a massively scaled and interoperable network of real-time rendered 3D virtual worlds which can be experienced synchronously and persistently by an effectively unlimited number of users with an individual sense of presence, and with continuity of data, such as identity, history, entitlements, objects, communications, and payments." – Matthew Ball
41. More boring name – Digital-Twin-stan + NPC?
• "A digital twin is a digital representation of a physical object. It includes the model of the physical object, data from the object, a unique one-to-one correspondence to the object and the ability to monitor the object."[1]
• "A digital twin is a virtual replica of a physical asset or a process, which is used for product design, monitoring, simulation, optimization, and maintenance. It comprises sensors and devices that collect real-time data from a physical asset."[2]
• "A digital twin is a digital model or replica of a physical asset, product, process or system that allows a digital footprint of key assets or products from design and development through the end of the product lifecycle."[3]
• "A digital twin is a virtual representation of real-world entities and processes, synchronized at a specified frequency and fidelity."[4] (see the sketch below)
[1] Gartner, https://www.gartner.com/smarterwithgartner/how-to-use-digital-twins-in-your-iot-strategy/, accessed 10th January 2021
[2] Technavio, 2020, Global Digital Twin Market 2020-2024
[3] Frost & Sullivan, TechVision Group, May 2019, Digital Twin: Application Landscape and Opportunity Assessment
[4] Digital Twin Consortium, https://blog.digitaltwinconsortium.org/2020/12/digital-twin-consortium-defines-digital-twin.html, accessed 6th January 2021
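Definition [4] is the most operational, and a toy sketch can make it concrete: a twin object that pulls state from its physical counterpart at a configured frequency and rounds it to a configured fidelity. The sensor, class and parameter names are all hypothetical.

```python
import random, time

class DigitalTwin:
    """Toy twin: mirrors a physical sensor at a set frequency and fidelity."""

    def __init__(self, read_sensor, period_s: float, fidelity_digits: int):
        self.read_sensor = read_sensor          # callable returning live state
        self.period_s = period_s                # the 'specified frequency'
        self.fidelity_digits = fidelity_digits  # the 'specified fidelity'
        self.state = None

    def sync_once(self):
        # Pull real-world state and round it to the agreed fidelity.
        self.state = round(self.read_sensor(), self.fidelity_digits)

    def run(self, cycles: int):
        for _ in range(cycles):
            self.sync_once()
            print("twin state:", self.state)  # monitoring/simulation would hook in here
            time.sleep(self.period_s)

# A stand-in for the physical asset (e.g., a pump's temperature sensor).
twin = DigitalTwin(lambda: 60 + random.random(), period_s=0.1, fidelity_digits=1)
twin.run(cycles=3)
```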
42. 'Urban Scale Digital Twins' Themes
Example figure depicting themed digital twins, such as water or energy systems
[7] George Percivall, OGC CTO, 12th January 2021, Overview of the Location Powers Urban Digital Twin Summit (keynote presentation)
43. 'National' Digital Twin – an Ecosystem
"The National Digital Twin will not be a single large model but an ecosystem of connected digital twins which can enable system optimisation and planning across sectors and organisations."[5]
Example figure depicting a digital twin at the precinct scale, and multiple twins forming an ecosystem
[5] Centre for Digital Built Britain, May 2020, The Approach to Delivering a National Digital Twin for the United Kingdom, Summary Report
44. Data61's Work – Platforms
Platform implementations informing perspectives include:
• National Map
• NSW Spatial Digital Twin
• QLD Spatial Digital Twin
Ecosystem enablers for urban-scale digital twins:
• Collaboration, governance
• Trusted data sharing
• Standards, formats
• Visualisation, accessibility
45. One Multiverse Future – Enabled by Blockchain
Lifecycle view (diagram) across the construction project, its supply chain and facilities management, underpinned by blockchain:
• DESIGN: "Rules as Code" to check designs against regulations (see the sketch below); AI and simulation for the early lifecycle
• BUILD: robotic SLAM for inspections; "Rules as Code" to check compliance; AI for risk-driven inspections
• SUPPLY CHAIN: blockchain for certification and provenance; smart contracts for project automation & security of payments
• OPERATE (facilities management): smart sensors; AI predictive maintenance; National Digital Twin Infrastructure
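"Rules as Code" can be illustrated with a small sketch: a regulation expressed as an executable predicate that candidate designs are checked against automatically, at design time and again at compliance time. The rule, thresholds and field names below are hypothetical, chosen only to show the pattern.

```python
from dataclasses import dataclass

@dataclass
class StairDesign:
    riser_height_mm: float
    going_mm: float

# A hypothetical building rule written as executable code rather than prose.
def rule_stair_geometry(d: StairDesign) -> list[str]:
    """Return a list of violations (an empty list means compliant)."""
    violations = []
    if not (115 <= d.riser_height_mm <= 190):
        violations.append("riser height outside 115-190 mm")
    if d.going_mm < 240:
        violations.append("going shallower than 240 mm")
    return violations

# Check a candidate design against the machine-readable rule.
print(rule_stair_geometry(StairDesign(riser_height_mm=200, going_mm=250)))
# ['riser height outside 115-190 mm']
```

The same predicate can run in a design tool, a CI pipeline, or an as-built compliance check, which is what makes the rule reusable across the DESIGN and BUILD stages above.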
46. Summary
• Trust architecture is becoming distributed and underpins everything.
• Blockchain solves, complements and inspires.
• Solutions exist at different levels:
  • Architecture styles: model-to-data, federated learning... enabled by blockchain
  • Design patterns/tactics, e.g. https://research.csiro.au/blockchainpatterns/
• At the "meta" level:
  – Responsible data and AI for distributed ecosystems
  – Metaverse and digital twins
Thank you!