Talk by Paolo Neirotti at the second session of the training course for trade-union leaders "Le parole dell'innovazione e il lavoro" ("The words of innovation and work"), developed jointly by ISMEL and the Turin offices of CGIL, CISL and UIL, and held between March and May 2019.
These slides introduce the second edition of ProvBench, an effort I am leading to collect a corpus of provenance data for benchmarking, aimed at the provenance (and broader scientific) community.
I gave this talk at the EDBT 2014 conference, which took place in Athens, Greece.
I show how data examples can be used to characterize the behavior of scientific modules. I present a new method that automatically generates such data examples, show that they help human users understand what a module does, and show that they can assist curators in repairing broken workflows (i.e., workflows in which one or more modules are no longer supplied by their providers).
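The idea of characterizing a module by data examples can be sketched in a few lines. This is an illustrative toy, not the algorithm from the talk: `classify` is a hypothetical module, and the strategy (keep one witness input per distinct output) is a simplification.

```python
# Illustrative sketch (not the talk's actual algorithm): derive
# input/output data examples that characterize a module's behavior.

def generate_data_examples(module, candidate_inputs):
    """Run a module over candidate inputs and keep one example pair
    per distinct output, giving a compact behavioral summary."""
    examples = {}
    for inp in candidate_inputs:
        out = module(inp)
        # Keep the first input observed for each distinct output.
        examples.setdefault(out, inp)
    return [(inp, out) for out, inp in examples.items()]

# Hypothetical module: classifies a sequence by its length.
def classify(seq):
    return "short" if len(seq) < 5 else "long"

pairs = generate_data_examples(classify, ["ACGT", "ACGTACGT", "AC"])
# pairs now holds one (input, output) example per observed behavior.
```

A curator inspecting `pairs` can grasp the module's task without reading its code, which is the intuition behind using data examples for workflow repair.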
Linking the prospective and retrospective provenance of scripts - Khalid Belhajjame
Scripting languages like Python, R, and MATLAB have seen significant use across a variety of scientific domains. To assist scientists in the analysis of script executions, a number of mechanisms, e.g., noWorkflow, have recently been proposed to capture the provenance of script executions. The provenance information recorded can be used, e.g., to trace the lineage of a particular result by identifying the data inputs and the processing steps that were used to produce it. By and large, the provenance information captured for scripts is fine-grained, in the sense that it captures data dependencies at the level of script statements, and does so for every variable within the script. While useful, the amount of recorded provenance information can be overwhelming for users and cumbersome to use. This suggests the need for abstraction mechanisms that focus attention on the specific parts of provenance relevant for analyses. Toward this goal, we advocate that the fine-grained provenance information recorded as the result of script execution can be abstracted using user-specified, workflow-like views. Specifically, we show how the provenance traces recorded by noWorkflow can be mapped to the workflow specifications generated by YesWorkflow from scripts based on user annotations. We examine the issues in constructing a successful mapping, provide an initial implementation of our solution, and present competency queries illustrating how a workflow view generated from the script can be used to explore the provenance recorded during script execution.
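The contrast between statement-level (retrospective) provenance and the lineage queries it supports can be sketched as follows. This is a hand-instrumented toy in the spirit of tools like noWorkflow, not their actual API; all names here are hypothetical.

```python
# Minimal sketch of fine-grained, statement-level provenance capture
# and a lineage (retrospective provenance) query over it.
# Hypothetical structure; real tools such as noWorkflow record this
# automatically and in far more detail.

provenance = []  # list of {"target", "sources", "op"} dependency records

def record(target, sources, op):
    """Record that variable `target` was derived from `sources` by `op`."""
    provenance.append({"target": target, "sources": sources, "op": op})

# A toy script, instrumented by hand:
raw = [3, 1, 2]
record("raw", [], "load")
clean = sorted(raw)
record("clean", ["raw"], "sorted")
result = sum(clean)
record("result", ["clean"], "sum")

def lineage(var):
    """Trace a variable back through all variables it depends on."""
    deps = set()
    for rec in reversed(provenance):
        if rec["target"] == var or rec["target"] in deps:
            deps.update(rec["sources"])
    return deps
```

Even this three-statement script yields three provenance records; a realistic script produces thousands, which is exactly why the abstract argues for workflow-like views over the raw trace.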
This is a keynote that I gave at the polyweb workshop on the state of the art of data-science reproducibility. In the first part, I review tools that have been developed over the last few years; in the second, I focus on proposals I have been involved in to facilitate workflow reproducibility and preservation.
A use case designed in the context of the DataONE provenance working group, illustrating how the provenance traces generated by different workflow engines can be queried via the D-PROV model.
I gave this talk at TAPP 2014 during Provenance Week in Cologne, on inferring fine-grained dependencies between data (ports) in scientific workflows. -- khalid
A talk given at the EDBT/ICDT 2010 conference. For more details, visit the project website at http://img.cs.manchester.ac.uk/dataspaces/dataspaces.html
Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotat... - Khalid Belhajjame
Scientific workflows have become the workhorse of Big Data analytics for scientists. As well as being repeatable and optimizable pipelines that bring together datasets and analysis tools, workflows make up an important part of the provenance of the data generated from their execution. By faithfully capturing all stages in the analysis, workflows play a critical part in building up the audit-trail (a.k.a. provenance) metadata for derived datasets and contribute to the veracity of results. Provenance is essential for reporting results, reporting the method followed, and adapting to changes in the datasets or tools. These functions, however, are hampered by the complexity of workflows and consequently the complexity of the data trails generated from their instrumented execution. In this paper we propose the generation of workflow description summaries in order to tackle workflow complexity. We elaborate reduction primitives for summarizing workflows, and show how these primitives, as building blocks, can be used in conjunction with semantic workflow annotations to encode different summarization strategies. We report on the effectiveness of the method through experimental evaluation using real-world workflows from the Taverna system.
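One reduction primitive of the kind the abstract describes can be sketched as collapsing consecutive steps that share a semantic annotation into a single composite step. This is an illustrative simplification over a linear pipeline; the paper's primitives operate on full Taverna workflow graphs, and all step and annotation names below are hypothetical.

```python
# Illustrative sketch of one summarization primitive: merge consecutive
# workflow steps that carry the same semantic annotation.
# (Hypothetical names; a linear pipeline stands in for a workflow graph.)

def collapse_by_annotation(steps, annotations):
    """Return (annotation, [steps]) groups, merging adjacent steps
    that share an annotation into one composite summary step."""
    summary = []
    for step in steps:
        label = annotations[step]
        if summary and summary[-1][0] == label:
            summary[-1][1].append(step)      # extend the current group
        else:
            summary.append([label, [step]])  # start a new group
    return [(label, group) for label, group in summary]

steps = ["fetch", "parse", "align", "score", "plot"]
annotations = {"fetch": "retrieval", "parse": "retrieval",
               "align": "analysis", "score": "analysis",
               "plot": "visualisation"}
summary = collapse_by_annotation(steps, annotations)
# Five steps collapse into three annotated summary steps.
```

The five-step pipeline summarizes to three composite steps, illustrating how semantic annotations drive the choice of which steps a primitive may merge.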
Communication, politics, economics, research: what changes when everything changes on the Internet. (Slide deck prepared for the EuroGiovani Master in non-conventional marketing and social media.)
Italia 2.0 - L'inarrestabile corsa verso un nuovo Paese ("The unstoppable race toward a new country") - Dino Amenduni
Politics, communication, public opinion: the unstoppable race toward a new country. (Slide deck presented at the Collegio di Milano, 30 May, course on culture and digital innovation.)
Far crescere la manifattura con le ICT ("Growing manufacturing with ICT"): success stories in the Northeast - apertura... - Fondazione CUOA
Presentation by Cecilia Rossignoli, Professor of Business Organization at the University of Verona and Scientific Lead of the Fondazione CUOA ICT Forum, at the ICT Forum conference "Far crescere la manifattura con le ICT: casi di successo a Nordest" - 14 March 2013, Fondazione CUOA.
A vision of the evolution of capitalism, with a model of man, and a hypothetical strategic marketing plan for a large corporation in the ICT sector.
There is much talk today about the new trend of the Internet of Things: sensors and smart objects, and new opportunities for holistic control of production machinery, homes, and household appliances. What is the current scale of the Internet of Everything phenomenon? What are the growth forecasts?
Until now, the Internet has served to connect people through PCs and mobile devices; in the future, the network will grow far larger and expand into unexplored domains, creating ever-tighter links between the physical and digital worlds. Smart home appliances, heating and air-conditioning systems, sensors that monitor environmental conditions, actuators that trigger actions remotely: many of these objects will have their own IP address, will transmit data, will be reachable at any time and from anywhere, and will talk to us. As the next figure shows, the components that make this scenario possible are already in place: sensors, actuators, wireless networks, batteries, advanced analytics, and established processes for localization, control, automation, and alerting.
Source: www.theinnovationgroup.it
Digital Transformation: Big Data, User Targeting and Ethics - Project Work Mast... - Free Your Talent
Digital Transformation: Big Data, User Targeting and Ethics - project work by the ISTUD Master in Marketing Management students Alex Caruso, Federica Ferrara, and Riccardo Pavesi.