This document proposes a decentralised infrastructure for fair and trusted IoT data trading. It suggests using a blockchain and smart contracts to let individual IoT data generators trade their data streams directly, removing the need for a central trusted authority. Each participant unilaterally reports "message count cubes" to provide transparency and accountability, and smart contracts enforce settlements based on these reports in a way that is consistent and fair without central trust assumptions. Several challenges are noted, including how to ensure fairness in the presence of malicious actors and how to improve scalability.
Mind My Value: A decentralised infrastructure for fair and trusted IoT data trading
1. 1
Mind My Value:
A decentralised infrastructure
for fair and trusted IoT data trading
Paolo Missier
Shaimaa Bajoudah
{firstname.lastname}@ncl.ac.uk
School of Computing, Newcastle University, UK
Angelo Capossele
Andrea Gaglione
Michele Nati
{firstname.lastname}@digitalcatapult.org.uk
Digital Catapult Centre
London, UK
IoT Conference
Linz, Austria
Oct. 24th, 2017
2. 2
Motivation: monetising our own IoT data streams
Focus on IoT data streams that carry information about people
• Wearables - Health care, Personal fitness, “quantified self”
• Smart home hubs / smart vehicles
Three questions:
• Business: Do these data streams have a value as digital assets?
• Technical: How do we unlock the value / enable trading of such assets?
• Legal: Who is entitled to trade them?
3. 3
Primary and secondary IoT data markets
Working assumptions:
1. VAS (Value Added Services) that aggregate granular IoT data streams exist
2. They have both technical capabilities and business incentives
3. There will exist both primary and secondary markets for IoT data
[Figure: primary and secondary IoT data markets. Data generators G1…GN (personal / home / vehicle devices) send IoT data flows through an edge / trusted zone and network infrastructure to a network server / broker, and on to subscriber apps / application servers (VAS) C1…CN. Native VAS trade with providers P1…PN on the primary market and feed secondary VAS on secondary markets. IoT data tracking records per-window counts Nik(W) and Nijk(W) for provider Pi, consumer Cj and topic Tk, exposed through an interface to a blockchain ecosystem.]
4. 4
Re-selling Scenario – Transport for London (TFL)
[Figure: Oyster card data collected at tube station gates flows to TFL on the primary market; TFL re-sells card + user data on secondary markets, e.g. to taxi companies and for targeted ads.]
TFL is a re-seller (do tube users get a cut of their extra profits?)
5. 5
Marketplace Scenario
IoT device owners may trade data streams directly:
“I am diabetic and I monitor my own fitness regime. Can I trade my data for a better health insurance deal?”
“I am a (really good) runner. How do I license my live data feed on the marketplace?”
[Figure: primary IoT data flows from wearables through the edge network to primary VAS (Garmin server, Strava); an IoT data marketplace links these to secondary markets such as health insurance and a follow-your-runner portal.]
6. 6
A new type of data marketplace?
Goal:
To develop a blueprint for the next generation of IoT data marketplaces and demonstrate its technical feasibility
(Ideal) Requirements:
1. Dynamic and flexible: enable unanticipated business relationships
• Quickly establish and fulfil new primary–VAS contracts
2. Guaranteed compliance and fairness
3. Granular: Should allow individuals to gain value from their data
4. Decentralized: Governance rules are defined, but
• No central trusted authority is appointed to enforce those rules
5. “Big Data” challenges: high Volume, high Velocity, high Variety
7. 7
Initial scenario: brokered IoT traffic
Observable IoT data streams
- Simply count messages: <from, to, topic>
An IoT pub/sub setting (e.g. MQTT)
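To make the setting concrete, here is a minimal sketch of a subscriber observing <from, to, topic> triples. It assumes the paho-mqtt 1.x client API, a placeholder broker address, and a hypothetical topic convention data/<producer_id>/<topic> that embeds the publisher id (MQTT itself does not carry a "from" field):

```python
# Minimal sketch: observing <from, to, topic> per message at a subscriber.
# Assumptions: paho-mqtt 1.x API; hypothetical "data/<producer>/<topic>"
# topic naming; "broker.example.org" is a placeholder address.
import paho.mqtt.client as mqtt

SUBSCRIBER_ID = "c1"  # the "to" side of each observation

def on_message(client, userdata, msg):
    # e.g. msg.topic == "data/p7/heart_rate" under the assumed convention
    _, producer, topic = msg.topic.split("/", 2)
    print(f"from={producer} to={SUBSCRIBER_ID} topic={topic}")

client = mqtt.Client(client_id=SUBSCRIBER_ID)
client.on_message = on_message
client.connect("broker.example.org", 1883)
client.subscribe("data/+/#")  # all producers, all topics
client.loop_forever()
```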
8. 8
Solution: broker-centric traffic metering
Message count cubes: contain only metadata
Example Topics:
- Heart rate
- Speed
- GPS trace
- Glucose level
- Gait
- Home water consumption
- Vehicle driving data
- …
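A message count cube is then just a table of per-window counts keyed by (producer, consumer, topic). A self-contained sketch of how a broker might accumulate one, with invented sample traffic (the tuple layout and window handling are our illustration, not prescribed by the paper):

```python
# Accumulate a message count cube N[(producer, consumer, topic)] for one
# metering window W from observed message metadata only (no payloads).
from collections import Counter

def build_cube(observed):
    """observed: iterable of (producer, consumer, topic) tuples seen
    by the broker during one window W."""
    cube = Counter()
    for producer, consumer, topic in observed:
        cube[(producer, consumer, topic)] += 1
    return cube

# Invented sample traffic for one window:
traffic = [("p1", "c1", "heart_rate"), ("p1", "c1", "heart_rate"),
           ("p1", "c2", "gps_trace"), ("p2", "c1", "glucose_level")]
cube = build_cube(traffic)
print(cube[("p1", "c1", "heart_rate")])  # -> 2
```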
9. 9
Settlement
At the end of each W:
Total fee owed by each $c_j$ to each $p_i$ is computed by aggregating counts in the cube:
$fee_{ij}(W) = \sum_k u_k \cdot N_{ijk}(W)$, where $u_k$ is the unit cost for topic-$k$ messages
$p_i$'s revenue after W: $rev_i(W) = \sum_j fee_{ij}(W)$
If we assume:
- Complete and correct traffic metering
- A trusted authority to compute revenues
Problem solved?
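To make the settlement rule concrete, here is a minimal Python sketch of the aggregation above, assuming a complete and correct broker-side cube; participant names and unit costs are illustrative:

```python
# Sketch of the settlement rule: fee_ij(W) = sum_k N_ijk(W) * u_k,
# rev_i(W) = sum_j fee_ij(W). unit_cost maps each topic t_k to its
# per-message price u_k (values made up for illustration).
def settle(cube_counts, unit_cost):
    """Return ({(consumer, producer): fee}, {producer: revenue}) for one window."""
    fees, revenue = {}, {}
    for (p_i, c_j, t_k), n in cube_counts.items():
        fee = n * unit_cost[t_k]
        fees[(c_j, p_i)] = fees.get((c_j, p_i), 0.0) + fee
        revenue[p_i] = revenue.get(p_i, 0.0) + fee
    return fees, revenue

fees, revenue = settle(
    {("p1", "c1", "heart_rate"): 120, ("p1", "c2", "gps_trace"): 40},
    unit_cost={"heart_rate": 0.0001, "gps_trace": 0.0005},
)
```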
10. 10
Marketplace with central trusted components
[Diagram: marketplace architecture with central trusted components. Setup: Registration / Identity Service, Vocabulary / Ontology, Data Discovery Service, Negotiation and Agreement Service, backed by Participants and Contracts databases. Exec: Centralised Traffic Metering observes the high-density (MQTT) IoT data flows between devices and VAS and emits low-density message count cubes. Analyse: Contract Compliance, Transaction Settlement (fees) and Dispute Arbitration.]
11. 11
Removing trust and de-centralising control
Can we remove the trust assumption and still fulfil these functions?
Trusted broker provides
• Accountability (accurate, complete counts)
• Dispute resolution
• Revenue distribution
12. 12
Approach
1) Accountability: Each participant is responsible for reporting their own counts of
messages sent / received
2) Transparency: Reports become part of blockchain transactions
Unilateral count cubes:
- provided by each participant
- partial: they reflect a single participant's point of view
Provider cube: $N_{ik}(W)$ = count of messages sent by $p_i$ about $t_k$
Subscriber cube: $N_{ijk}(W)$ = count of messages received by $c_j$ from $p_i$ about $t_k$
13. 13
Consistency and settlement with unilateral cubes
Cubes collected at the end of W:
Consistency constraint: for each $t_k$ and each subscriber $c_j$ to $t_k$, the number of messages sent by $p_i$ must equal the number of messages received by $c_j$ from $p_i$:
$$N_{ik}(W) = N_{ijk}(W)$$
But:
• producers have an incentive to over-report the data they send
• VAS have an incentive to under-report the data they receive
Thus for some combinations of $p_i$, $c_j$, $t_k$ we may expect $N_{ik}(W) > N_{ijk}(W)$.
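A small Python sketch of how this consistency check over unilateral cubes could look; the data layout mirrors the $N_{ik}(W)$ / $N_{ijk}(W)$ definitions above, and the values are illustrative:

```python
# Sketch of the consistency check. provider_cube holds N_ik(W) (messages
# p_i claims to have sent on t_k); subscriber_cubes holds N_ijk(W)
# (messages each c_j claims to have received from p_i on t_k).
def find_disagreements(provider_cube, subscriber_cubes):
    """Yield (p_i, c_j, t_k, sent, received) wherever the counts disagree."""
    for (p_i, c_j, t_k), received in subscriber_cubes.items():
        sent = provider_cube.get((p_i, t_k), 0)
        if sent != received:
            yield p_i, c_j, t_k, sent, received

# An over-reporting producer or an under-reporting consumer surfaces here:
provider_cube = {("p1", "heart_rate"): 130}           # p1 claims 130 sent
subscriber_cubes = {("p1", "c1", "heart_rate"): 120}  # c1 claims 120 received
disputes = list(find_disagreements(provider_cube, subscriber_cubes))
```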
14. 14
Blockchain + smart contracts
1. Associate identities with marketplace participants
2. Agree on contract specification
3. Settlement of contractual disputes given unilaterally generated traffic reports
“A blockchain is a globally shared, transactional database”
Smart contracts (Ethereum blockchain)
“Smart contract” is a term used to describe computer program code that is capable of facilitating, executing, and enforcing the negotiation or performance of an agreement (i.e. a contract) using blockchain technology.
Contract logic is triggered by data input: batches of (unilateral) cubes.
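As a hedged illustration of the interaction pattern (not the paper's actual contract interface), a web3.py sketch of submitting a unilateral cube report as an Ethereum transaction; the contract address, ABI file and submitCube() function are hypothetical placeholders:

```python
# Sketch only: submitting a unilateral cube as an Ethereum transaction.
# The contract address, ABI file, and submitCube() are placeholders,
# not the paper's actual contract interface.
import json

from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))   # local dev node
contract = w3.eth.contract(
    address="0x0000000000000000000000000000000000000000",  # placeholder
    abi=json.load(open("marketplace_abi.json")),            # placeholder ABI
)

# Flatten the cube into (producer, consumer, topic, count) rows; the
# contract's settlement logic would be triggered by batches of such reports.
rows = [("p1", "c1", "heart_rate", 120)]
tx_hash = contract.functions.submitCube(rows).transact({"from": w3.eth.accounts[0]})
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
```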
15. 15
Removing central trust: approach
From broker-controlled “message count cubes” to each participant unilaterally reporting the data they sent / received.
[Diagram: the centralised architecture (left) beside its decentralised counterpart (right). The Setup services (Registration / Identity, Vocabulary / Ontology, Data Discovery, Negotiation and Agreement) and Contract Compliance are retained, but centralised traffic metering, settlement and dispute arbitration are replaced by smart contracts, which store identities, contracts and unilateral traffic reports, and perform Settlement / Reputation Management over the unilateral cubes (provider and subscriber cubes).]
16. 16
Evaluation
Where should the cube data live?
- Off-chain: cubes remain natively located within the participants' trusted zones
  - retrieved via Oraclize (an Ethereum-specific mechanism), which guarantees the authenticity and integrity of the retrieved data
  - adds to the cost of smart contract execution: Oraclize requires a query fee of $0.01 to $0.04
- On-chain: transactions embed the cubes in the blockchain
  - no query cost, but adds to the transaction size
How much does it cost to run the settlement smart contract? We measure the execution cost of the cube settlement operations.
17. 17
Evaluation
Overhead = (cost of contract execution) × (settlement rate)
- the cost of contract execution grows with the cube size: #PROD × #CONS × #TOPICS (possibly compressed)
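A tiny Python sketch of this overhead model, with made-up cost constants, showing how the per-message overhead falls as the transfer rate rises:

```python
# Sketch of the overhead model above. The per-execution cost grows with
# cube size (#producers x #consumers x #topics); the per-message overhead
# falls as the number of messages per window rises. Constants are made up.
def per_message_overhead(cost_per_exec_eth, settlements_per_window, messages_per_window):
    total = cost_per_exec_eth * settlements_per_window
    return total / messages_per_window

# E.g. one settlement at a low transfer rate vs. twenty at a high rate:
low_rate = per_message_overhead(0.002, 1, 200)       # few messages -> high overhead
high_rate = per_message_overhead(0.002, 20, 20000)   # many messages amortise the cost
```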
[Chart: data price (ETH, log scale, ~1e-6 to 0.25) against the number of cube settlement operations (0–25), for data transfer rates from 1 to 10,000.]
Impact of overhead on unit cost for varying settlement rate and data transfer rate
18. 18
Evaluation
Cost per message for varying data volume, settlement rate, gas price
[Chart, left: cube settlement cost (ETH / USD, log scale) against data transfer rate (200–2000), for 1 or 5 settlements at low or average gas price; off-chain data retrieved using Oraclize, with additional cost overhead. Estimated data prices shown for reference.]
[Chart, right: same axes and settlement configurations, with on-chain data.]
19. 19
Open challenges
Fairness in the presence of malicious behaviour:
• a reputation model based on the history of disagreements over past transactions (one possible shape is sketched after this list)
What's in a trading agreement?
• from atomic data trading (single messages) to complex SLAs: think “follow-your-runner”
System challenges:
• Evolving Smart Contract technology: Ethereum vs Hyperledger
• Public vs permissioned blockchains
• Scalability
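The reputation model is left open in the talk; one possible shape, purely as an assumption for illustration, is a score that decays with each settlement in which a participant's cube disagreed with its counterparty's:

```python
# One possible (assumed, not specified in the talk) reputation update:
# the score drops after each settlement window with a disagreement and
# recovers slowly after clean windows.
def update_reputation(score, disagreed, penalty=0.1, reward=0.01):
    """Return an updated score in [0, 1] after one settlement window."""
    if disagreed:
        return max(0.0, score - penalty)
    return min(1.0, score + reward)

score = 1.0
for disagreed in [False, False, True, False]:   # history of disagreements
    score = update_reputation(score, disagreed)
```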
Value Added Services (VAS) shall emerge that have both the technical capability and the business motivation to continuously acquire and analyse data streams produced on a massive scale by millions of connected devices, with the aim to extract a variety of types of value-added knowledge from them.
The native ecosystem consists of devices owned by TFL (the ticket gates), which generate personal data (assumed granular and anonymised) that is available to TFL.
Secondary markets for this data may include for instance a taxi company that is interested in reacting to anomalous passenger traffic flows through the tube stations, by proactively positioning its fleet outside certain stations at certain times of the day.
Also, when some demographic information is known about the users, advertisers may be interested in accessing the data for accurate ad positioning.
Garmin (2) controls access to the data generated by its devices (1). Currently, data flows natively from device to its server where Garmin makes it accessible to third parties through a service interface (2->3).
An example of (3) is Strava, a social network for fitness that does not own any of the devices. It receives data from Garmin and offers VAS to Garmin users (for instance, advanced analytics). This is a secondary market for individuals’ data, which benefits (1) the individuals, who see added value from their device, (2) Garmin, who hopes to sell more devices, and (3) Strava, who may have a separate business model out of its analytics. In this scenario, Strava may be able to resell this data, in particular about individuals’ commuting habits, further down the chain to VAS (4) such as city planners.
A 2012 survey of data vendors [6], for example, includes 46 data suppliers; however, the definition of data marketplace used in that paper is generic (“a platform on which anybody can upload and maintain data sets, with license-regulated access to and use of the data”) and geared towards static data, like Microsoft's Azure Data Market.
Solidity runs on the EVM (Ethereum Virtual Machine).
Executing smart contracts on the Ethereum network incurs a fee (in Ether):
- each operation consumes a fixed amount of Gas
- miners' fees are proportional to the amount of Gas used
- every transaction specifies the Gas price it is willing to pay
- a higher price means a higher incentive for miners, and hence faster transaction execution
Fetching off-chain data through Oraclize is 3 to 11 times more expensive.
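A minimal sketch of this fee model (illustrative numbers only):

```python
# Sketch of the Ethereum fee model described above: each operation consumes
# a fixed amount of gas, and the fee is gas used x gas price (chosen per
# transaction). The gas figures are made up for illustration.
def tx_fee_eth(gas_used, gas_price_gwei):
    return gas_used * gas_price_gwei * 1e-9   # 1 gwei = 1e-9 ETH

base = tx_fee_eth(gas_used=150_000, gas_price_gwei=20)
# Fetching off-chain cubes via Oraclize was observed to be 3-11x more expensive:
with_oraclize = (base * 3, base * 11)
```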
We adapted the open-source Mosquitto MQTT broker to support message logging and cube generation into a Cassandra NoSQL database, with real producers using channels provided by the ThingSpeak platform (https://thingspeak.com). Using the TrackerDB, we are able to simulate the generation of unilateral cubes that are either complete and correct, or that reflect malicious behaviour, for evaluation purposes.
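A sketch of how such simulated cubes could be derived from the broker's ground-truth log, applying the over-/under-reporting incentives discussed earlier; the distortion rate and data layout are assumptions, not the TrackerDB implementation:

```python
# Sketch of the evaluation step: derive unilateral cubes from ground truth,
# either faithfully or with incentive-driven distortions.
def simulate_cubes(published, subscribers, malicious=False, distortion=0.1):
    """published: {(p_i, t_k): n} ground-truth counts from the broker log.
    subscribers: {t_k: [c_j, ...]} topic subscriptions.
    Returns (provider_cube, subscriber_cube)."""
    provider, subscriber = {}, {}
    for (p_i, t_k), n in published.items():
        # producers have an incentive to over-report what they sent
        provider[(p_i, t_k)] = int(n * (1 + distortion)) if malicious else n
        for c_j in subscribers.get(t_k, []):
            # consumers have an incentive to under-report what they received
            subscriber[(p_i, c_j, t_k)] = int(n * (1 - distortion)) if malicious else n
    return provider, subscriber
```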
Smart contracts interact with this off-chain service through an Ethereum-specific mechanism (Oraclize).
We fix an interval T and vary the data rate, i.e. the total number of messages passing during T, then choose how often to settle within T. On the left there is a single settlement (one contract invocation), but the per-message cost is high when the transfer rate is low (few messages) and falls as the rate, i.e. the number of messages, increases. On the right we settle frequently, paying for, say, 20 invocations; again the cost is very high with few messages (low rate) and falls with many (high rate).