A presentation on "Extraction and Visualization of Metadata Analytics for Multimedia Learning Object Repositories: The case of TERENA TF-media network OER portal" presented at the LACRO workshop of the LAK Conference, on April 9th, 2013
UNED Online Reputation Monitoring Team at RepLab 2013 (Damiano Spina)
This paper describes the UNED Online Reputation Monitoring Team's participation at RepLab 2013. Several approaches were tested. First, an instance-based learning approach that uses Heterogeneity-Based Ranking to combine seven different similarity measures was applied to all the subtasks. The filtering subtask was also tackled by automatically discovering filter keywords: those whose presence in a tweet reliably confirms (positive keywords) or discards (negative keywords) that the tweet refers to the company.
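The filter-keyword idea above can be sketched in a few lines. This is a toy illustration, not the paper's system: the keyword sets below are invented for a hypothetical company, whereas the paper discovers positive and negative keywords automatically from training data.

```python
# Toy sketch of the filter-keyword idea: a tweet is discarded if it contains a
# negative keyword, kept if it contains a positive keyword, left undecided
# otherwise. Keyword sets here are invented examples, not learned as in the paper.

def keyword_filter(tweet, positive, negative):
    words = set(tweet.lower().split())
    if words & {w.lower() for w in negative}:
        return "discard"
    if words & {w.lower() for w in positive}:
        return "keep"
    return "undecided"

# Hypothetical keywords for the company "Apple"
positive = {"iphone", "macbook", "ios"}
negative = {"pie", "orchard", "cider"}

print(keyword_filter("Loving my new iPhone camera", positive, negative))  # keep
print(keyword_filter("Grandma's apple pie recipe", positive, negative))   # discard
```

A real system would of course match keywords after tokenization and normalization rather than on raw whitespace splits, and would weight keywords by how reliably they confirm or discard the entity.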
Different approaches were submitted for the topic detection subtask: agglomerative clustering over wikified tweets, co-occurrence term clustering, and an LDA-based model that uses temporal information.
Finally, the polarity subtask was tackled by generating domain-specific semantic graphs in order to automatically expand the general-purpose lexicon SentiSense. We then used the domain-specific sub-lexicons to classify tweets according to their reputational polarity, following an emotional concept-based system for sentiment analysis.
We corroborated that using entity-level training data improves the filtering step. Additionally, the proposed approaches to topic detection obtained the highest scores in the official evaluation, showing that they are promising directions for the problem. In the reputational polarity task, our results suggest that a deeper analysis is needed to correctly identify the main differences between reputational polarity and traditional sentiment analysis tasks. A final remark is that the overall performance of a monitoring system in RepLab 2013 depends heavily on the performance of the initial filtering step.
Presentation given at the Text Mining for Scholarly Communications and Repositories
Joint Workshop, 28-29 Oct 2009 (http://www.nactem.ac.uk/tm-ukoln.php)
What's with the 1s and 0s? Making sense of binary data at scale with Tika and... (gagravarr)
If you have one or two files, you can take the time to manually work out what they are, what they contain, and how to get the useful bits out (probably....). However, this approach really doesn't scale, mechanical turks or no! Luckily, there are Apache projects out there which can help!
In this talk, we'll first look at how we can work out what a given blob of 1s and 0s actually is, be it textual or binary. We'll then see how to extract common metadata from it, along with text, embedded resources, images, and maybe even the kitchen sink! We'll see how to do all of this with Apache Tika, and how to dive down to the underlying libraries (including its Apache friends like POI and PDFBox) for specialist cases. Finally, we'll look a little at how to roll this all out in a Big Data or large-search case.
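As a toy illustration of the first step the talk describes (working out what a blob of 1s and 0s actually is), here is a minimal magic-byte sniffer. Apache Tika's real detection covers hundreds of formats and adds filename and charset heuristics; the format table below is just a hand-picked sample.

```python
# Minimal sketch of content-type detection by "magic bytes" - the kind of check
# Tika performs first on an unknown blob. Only a few formats are known here.

MAGIC = [
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),   # also docx/xlsx/jar containers
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF8", "image/gif"),
]

def sniff(blob: bytes) -> str:
    for prefix, mime in MAGIC:
        if blob.startswith(prefix):
            return mime
    # crude text-vs-binary fallback: try decoding a sample as UTF-8
    try:
        blob[:512].decode("utf-8")
        return "text/plain"
    except UnicodeDecodeError:
        return "application/octet-stream"

print(sniff(b"%PDF-1.7 ..."))    # application/pdf
print(sniff(b"hello, world\n"))  # text/plain
```

Note that ZIP-based formats (docx, xlsx, jar) all share the same `PK` prefix, which is exactly why real detectors like Tika must also peek inside the container, not just at the first bytes.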
From the Fast Feather Track at ApacheCon NA 2010 in Atlanta
This quick talk provides an overview of Apache Tika and looks at new features and supported file formats. It then shows how to create a new parser, and finishes with using Tika from your own application.
In this session, we will look first at the rich metadata that documents in your repository have, how to control the mapping of this onto your content model, and some of the interesting things this can deliver. We'll then move on to the content transformation and rendition services, and see how you can easily and powerfully generate a wide range of media from the content you already have.
Mapping, Merging, and Multilingual Taxonomies (Heather Hedden)
SLA 2012 conference presentation sponsored by the Taxonomy Division, given at SLA Chicago on July 16 and re-presented at the New England Chapter on October 13, 2012.
Presented by Access Innovations, Inc. president Marjorie M.K. Hlava at the 2013 Taxonomy Boot Camp, November 5, 2013. An update on taxonomy standards from the International Organization for Standardization, the American National Standards Institute, and the British Standards Institution.
This presentation is part of my work for the course 'Heterogeneous and Distributed Information Systems' at TU Berlin within the IT4BI (Information Technology for Business Intelligence) master programme.
Keynote: SemSci 2017: Enabling Open Semantic Science
1st International Workshop co-located with ISWC 2017, October 2017, Vienna, Austria,
https://semsci.github.io/semSci2017/
Abstract
We have all grown up with the research article and article collections (let’s call them libraries) as the prime means of scientific discourse. But research output is more than just the rhetorical narrative. The experimental methods, computational codes, data, algorithms, workflows, Standard Operating Procedures, samples and so on are the objects of research that enable reuse and reproduction of scientific experiments, and they too need to be examined and exchanged as research knowledge.
We can think of “Research Objects” as typed packages of all the components of an investigation. If we stop thinking of publishing papers and start thinking of releasing Research Objects (like software), then scholarly exchange is a new game: ROs and their content evolve; they are multi-authored and their authorship evolves; they are a mix of virtual and embedded, and so on.
But first, some baby steps before we get carried away with a new vision of scholarly communication. Many journals (e.g. eLife, F1000, Elsevier) are just figuring out how to package together the supplementary materials of a paper. Data catalogues are figuring out how to virtually package multiple datasets scattered across many repositories to keep the integrated experimental context.
Research Objects [1] (http://researchobject.org/) is a framework by which the many, nested and contributed components of research can be packaged together in a systematic way, and their context, provenance and relationships richly described. The brave new world of containerisation provides the containers and Linked Data provides the metadata framework for the container manifest construction and profiles. It’s not just theory, but also in practice with examples in Systems Biology modelling, Bioinformatics computational workflows, and Health Informatics data exchange. I’ll talk about why and how we got here, the framework and examples, and what we need to do.
[1] Sean Bechhofer, Iain Buchan, David De Roure, Paolo Missier, John Ainsworth, Jiten Bhagat, Philip Couch, Don Cruickshank, Mark Delderfield, Ian Dunlop, Matthew Gamble, Danius Michaelides, Stuart Owen, David Newman, Shoaib Sufi, Carole Goble, "Why linked data is not enough for scientists", Future Generation Computer Systems, 29(2), 2013, pp. 599-611, ISSN 0167-739X, https://doi.org/10.1016/j.future.2011.08.004
The CSO Classifier: Ontology-Driven Detection of Research Topics in Scholarly... (Angelo Salatino)
Classifying research papers according to their research topics is an important task to improve their retrievability, assist the creation of smart analytics, and support a variety of approaches for analysing and making sense of the research environment. In this paper, we present the CSO Classifier, a new unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive ontology of research areas in the field of Computer Science. The CSO Classifier takes as input the metadata associated with a research paper (title, abstract, keywords) and returns a selection of research concepts drawn from the ontology. The approach was evaluated on a gold standard of manually annotated articles, yielding a significant improvement over alternative methods.
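A naive sketch of the general idea (not the actual CSO Classifier, which is far richer and works over the full CSO graph): match n-grams drawn from a paper's metadata against ontology concept labels. The tiny "ontology" below is invented for illustration.

```python
# Naive illustration of ontology-driven topic detection: slide an n-gram window
# over the title/abstract text and keep every n-gram that is also an ontology
# concept label. The ontology here is a made-up five-concept sample.

ONTOLOGY = {"machine learning", "neural networks", "information retrieval",
            "semantic web", "ontology"}

def detect_concepts(text, ontology, max_n=3):
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    found = set()
    for n in range(1, max_n + 1):                 # unigrams up to trigrams
        for i in range(len(tokens) - n + 1):
            gram = " ".join(tokens[i:i + n])
            if gram in ontology:
                found.add(gram)
    return sorted(found)

abstract = "We apply machine learning and information retrieval to the Semantic Web."
print(detect_concepts(abstract, ONTOLOGY))
# ['information retrieval', 'machine learning', 'semantic web']
```

Exact string matching like this misses morphological variants and synonyms, which is one reason the real classifier also uses word-embedding similarity rather than label lookup alone.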
Gaining Advantage in e-Learning with Semantic Adaptive Technology (Ontotext)
In this presentation, we will introduce you to a solution that involves adaptive semantic technology for educational institutions and e-learning providers. You will learn how to integrate 3rd party resources, legacy assets, and other content sources to create the so-called knowledge graph of all structured and unstructured data.
A new approach to helping research communities and organizations improve their potential for sharing data. Includes a discussion of new tools being developed to achieve this goal.
Royal Society of Chemistry activities to develop a data repository for chemis... (Ken Karapetyan)
The Royal Society of Chemistry publishes many thousands of articles per year, the majority of them containing rich chemistry data that is, in general, limited in its value when isolated in the HTML or PDF form of the articles commonly consumed by readers. RSC also has an archive of over 300,000 articles containing rich chemistry data, especially in the form of chemicals, reactions, property data and analytical spectra. RSC is developing a platform integrating these various forms of chemistry data. The data will be aggregated both during the manuscript deposition process and as the result of text-mining and extraction of data from across the RSC archive. This presentation will report on the development of the platform, including our success in extracting compounds, reactions and spectral data from articles. We will also discuss our developing process for handling data at manuscript deposition and the integration and support of eLab Notebooks (ELNs) in terms of facilitating data deposition and sourcing data. Each of these processes is intended to ensure long-term access to research data with the intention of facilitating improved discovery.
This work presents a data architecture based on semantic web technologies that supports the inclusion of open materials in massive online courses. The framework provides transparent access to RDF data sources for Open Educational Resources stored in OpenCourseWare repositories.
Speaker(s): Nelson Piedra and Edmundo Tovar
Building an ecosystem of networked references (Hugo Manguinhas)
Over the past five years, the number of contextual entities in Europeana’s metadata has grown considerably. These entities are provided as references as part of the metadata delivered by Europeana or selected by Europeana’s automatic semantic enrichment. Pursuing their efforts towards the creation of a semantic network around cultural heritage objects, Europeana and its partner providers and aggregators are investigating ways to better exchange vocabulary data and manage co-references/alignments between vocabularies. In this presentation we will explore the potential of tools such as OpenSkos and Cultuurlink for supporting the building of networked references.
Presented at the 6th DBpedia Community Meeting in The Hague 2016, see http://wiki.dbpedia.org/meetings/TheHague2016
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to the UiPath Test Automation using UiPath Test Suite series, part 4. In this session, we will cover an overview of Test Manager along with the SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk encourages a more independent stance towards PHP frameworks, moving developers to a more flexible and future-proof approach to PHP development.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis at the DASA Connect conference on 30 May 2024. We discuss what testing is, then what agile testing is, and finally what Testing in DevOps is. We also held a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses.
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview, including the concepts of Customer Key and Double Key Encryption.
The new frontiers of AI in RPA with UiPath Autopilot™ (UiPathCommunity)
In this free online event, organized by the Italian UiPath Community, you can explore the new features of Autopilot, the tool that integrates Artificial Intelligence into the development and use of Automations.
📕 Together we will look at some examples of how Autopilot is used across different tools of the UiPath Suite:
Autopilot for Studio Web
Autopilot for Studio
Autopilot for Apps
Clipboard AI
GenAI applied to Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs (Alex Pruden)
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Climate Impact of Software Testing at Nordic Testing Days (Kari Kakkonen)
My slides at Nordic Testing Days, 6 June 2024.
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize our carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
TERENA OER portal, metadata extraction analysis, LAK, Leuven @9apr2013
1. Extraction and Visualization of Metadata Analytics for Multimedia Learning Object Repositories: The case of TERENA TF-media network
Kostas Vogias (1), Ilias Hatzakis (1), Nikos Manouselis (2), Peter Szegedi (3)
(1) Greek Research and Technology Network
(2) Agro-Know Technologies
(3) TERENA TF-media
Workshop on Learning Object Analytics for Collections, Repositories and Federations, 9 April, 2013
2. Metadata analysis is not something new
• Ochoa, Xavier, Klerkx, Joris, Vandeputte, Bram, and Duval, Erik.
– On the Use of Learning Object Metadata: The GLOBE Experience.
• Made the first fully quantitative study of a large number of learning repositories that belong to a large organization like GLOBE
• Neven, Filip and Duval, Erik.
– Reusable Learning Objects: a Survey of LOM-Based Repositories.
• Zschocke, Thomas, Beniest, Jan, Paisley, Courtney, Najjar, Jehad, and Duval, Erik.
– The LOM application profile for agricultural learning resources of the CGIAR
• Studied the use of LOM for the indexing of learning resources
• Manouselis, N., Salokhe, G., Keizer, J., and Rudgard, S.
– Towards a Harmonization of Metadata Application Profiles for Agricultural Learning Repositories.
• Analyzed the metadata schemas used by repositories that include agricultural learning resources.
3. Why extraction of metadata analytics is important when we are developing a learning portal
Which metadata schema is used by our content providers?
How are the providers using the different elements of the schema?
On which metadata schema should our learning portal rely?
Can we provide services based on metadata elements such as Subject, Type, Format, Keywords, Title and Descriptions?
Which languages can our portal support?
Portal design decisions
4. Study Objectives
• To perform a quantified study of the different metadata
schemas used by the TERENA TF-media network
• To propose a metadata schema on which the TERENA OER
portal will be based
• To verify if metadata analytics can constitute a tool that
can facilitate the development of a learning portal that is
based on metadata records aggregated by various content
providers
5. What Terena OER Project is
• A European level metadata aggregation portal for Open
Educational Resources (primarily audiovisual contents,
recorded lectures) collected and maintained by institutional
and national content repositories of the Research &
Education Community.
• Main objectives of the project
– Create a broker for national learning resource organizations.
– Bridge the gap between the national repositories and the
emerging global repositories (e.g., GLOBE) by establishing a
European level metadata repository (i.e. aggregation point)
for the national repositories acting in the R&E community. The
European level repository will be a metadata repository only,
the content remains in its original content repository.
6. Which Data Providers
• Successfully harvested.
Repository Name | Records Harvested | Metadata Schema
Switch Collection | 619 | oai_dc
DSpace at University Of Latvia | 1009 | oai_dc
OBAA Repository | 56 | oai_dc
RiuNet: Repositori Institucional de la Universitat Politècnica de València | 21902 | oai_dc
SCAM Repository | 7351 | oai_lom
Material Audiovisual ofrecido por el Campus do Mar | 978 | oai_dc
wikiwijs | 26054 | oai_dc
Małopolskie Towarzystwo Genealogiczne | 181 | oai_dc
(58,000+ metadata instances in total)
Select the ones that can be used for the first version of the TERENA OER portal
8. How
• Repository Based Analysis.
• Metrics
– Element Completeness: The percentage of records in which
an element has a value.
– Relative Entropy: Diversity of values in an element.
– Vocabulary value distribution: Format, Language and Type
– Language properties: attribute (e.g. lang=en) value usage frequency, e.g. the lang attribute on the free-text metadata elements Title, Description and Keyword
• The analysis was performed for a core set of metadata
elements that is present in the studied repositories
• Use of a standard set of mappings from DC to LOM
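The two record-level metrics above can be sketched in a few lines. This is an illustrative implementation, not the actual tool's code: the dictionary-based record structure and the choice to normalize entropy to [0, 1] are assumptions for the example.

```python
import math
from collections import Counter

def element_completeness(records, element):
    """Percentage of records in which `element` has at least one non-empty value."""
    filled = sum(1 for r in records if r.get(element))
    return 100.0 * filled / len(records)

def relative_entropy(records, element):
    """Shannon entropy of the element's values, normalized to [0, 1].
    0 means all records share one value (or none); 1 means maximal diversity."""
    values = [v for r in records for v in r.get(element, [])]
    if not values:
        return 0.0
    counts = Counter(values)
    if len(counts) == 1:
        return 0.0  # zero entropy: same value everywhere
    total = len(values)
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return h / math.log2(len(counts))

# Hypothetical metadata records: element name -> list of values
records = [
    {"title": ["A"], "type": ["video"]},
    {"title": ["B"], "type": ["lecture"]},
    {"title": ["C"], "type": ["video"]},
    {"type": ["video"]},
]
print(element_completeness(records, "title"))  # 75.0
print(relative_entropy(records, "type"))       # high: mixed video/lecture values
```

With this normalization, an element whose values are identical in every record (like the constant elements observed for Campus do Mar) scores 0, while a uniformly diverse element scores 1.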
9. What we have used
• ARIADNE Harvester
• A metadata analysis tool
– Implemented in Java.
– Metadata schema agnostic.
– Metadata Analysis schemes:
1. Repository based.
2. Federation based.
– Element based analysis:
• Completeness
• Relative Entropy
• Specific element vocabulary extraction and usage frequency
– Attribute based analysis.
• Lang value attribute frequency
– Input: XMLs
– Output: CSV, TXT
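The element-based, schema-agnostic analysis described above can be sketched as a small pass over harvested XML records that counts element usage by local name and emits CSV. The record markup and CSV columns here are hypothetical, for illustration only (the actual tool is a Java implementation):

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter

def element_usage(xml_records):
    """Count, per element local name, in how many records it appears.
    Schema-agnostic: namespaces are stripped, so oai_dc and oai_lom
    elements are both counted by their local names."""
    usage = Counter()
    for xml in xml_records:
        root = ET.fromstring(xml)
        names = {elem.tag.split('}')[-1] for elem in root.iter()}
        names.discard(root.tag.split('}')[-1])  # don't count the record wrapper
        usage.update(names)
    return usage

def usage_to_csv(usage, n_records):
    """One CSV row per element: name, record count, completeness percentage."""
    out = io.StringIO()
    w = csv.writer(out, lineterminator="\n")
    w.writerow(["element", "records", "completeness_pct"])
    for name, count in sorted(usage.items()):
        w.writerow([name, count, round(100.0 * count / n_records, 1)])
    return out.getvalue()

# Two toy records: 'title' appears in both, 'language' and 'type' in one each
recs = [
    "<record><title>A</title><language>en</language></record>",
    "<record><title>B</title><type>video</type></record>",
]
print(usage_to_csv(element_usage(recs), len(recs)))
```

Feeding every harvested record of a repository through such a pass yields exactly the kind of per-repository completeness table the following slides visualize.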
23. English can be the main language
supported on the TERENA OER Portal
24. Conclusions
• Metadata Analysis helps in:
– Defining metadata aggregation element set.
– Defining the type of services that can be provided by the TERENA OER
portal e.g. browse by type of LOs, elements that can be used for full
text search
– Providing recommendations back to the providers about usage of
metadata elements
– Validating the metadata records at harvesting time
• Next steps
– Extend to more repositories of TERENA TF-media network
– Combine with results of an online survey for content providers
– Develop the web based version of the tool and provide it as an open
source tool
This presentation is about extracting metadata analytics for Multimedia Learning Object Repositories. It was conducted by members of GRNET in collaboration with Nikos Manouselis from Agro-Know Technologies and Peter Szegedi from TERENA TF-media.
There are many previous studies that have analyzed the metadata used by content providers, so our work is based on concepts and methods that have been previously defined. The most relevant work to our study is the first fully quantified study of a large number of learning repositories that belong to GLOBE.
Let’s start by pointing out why a study on metadata analytics is important when we are developing a learning portal that is based on metadata aggregated from a number of providers. When we are developing such a learning portal, questions like these arise. Most of these questions concern portal design decisions and are thus very critical for the development of the portal.
The main objectives of our study are:
Let’s first say some words about the TERENA OER Project. The goal of this project is to set up a European level metadata aggregation portal for ….. It aims at creating a broker ….. and a bridge between ….
Here you can see the list of the data providers that were successfully harvested. The vast majority of the providers use Dublin Core as the format for exposing their metadata. The goal is to identify the providers that can be used for the first version of the TERENA OER portal.
We have 58K+ instances and we need to select the data providers that can be used for the first version of the TERENA OER portal
How did we conduct the study? We performed a repository-based analysis and studied:
To conduct this study we used the ARIADNE Harvester to aggregate the metadata records, and for the metadata analysis we developed a tool that will ….
We have this version now, but we are working on a version that will include a GUI allowing researchers to easily compute metadata analytics and visualize them.
A heat map table for the element usage. One can see that elements such as Title, Identifier, Language, Type and Creator are generally used by the studied repositories. The repositories with the richest metadata are those of Campus do Mar, SCAM, Małopolskie and RiuNet.
This chart presents the average element usage values for all the studied repositories. It is again evident that elements like Title, Identifier, Language, Type and Publisher are highly used by almost all repositories. These elements are candidates for mandatory/core elements of the aggregator metadata schema.
Zero entropy means that the values are null or that we have the same value in all records. The Format element is used by only one repository, and thus the entropy is zero for the rest. High entropy values indicate heterogeneity/diversity of information. The type of learning object has high entropy in many repositories, which means that we have various types of objects; further, this means that if the TERENA OER portal is to include only audiovisual material, some kind of filtering will be needed. For Campus do Mar the entropy of all the elements was found to be zero, because all the records have the same values for these elements.
The main conclusion from the language analysis of the free-text elements is that, if all the repositories are aggregated, English can be the main language supported on the TERENA OER Portal.