These are the slides from my presentation at BNAIC2012. The talk is about why we need to look at optimization techniques to deal with Linked Data and how this can be done.
Evolutionary and Swarm Computing for scaling up the Semantic Web
1. Evolutionary and Swarm Computing for scaling up the Semantic Web
Christophe Guéret (@cgueret), Stefan Schlobach, Kathrin Dentler, Martijn Schut, and Gusz Eiben
24th Benelux Conference on Artificial Intelligence
Maastricht University, October 25-26, 2012
2. What are we going to talk about?
Linked Data
Changing our point of view on soundness and completeness
Consider optimisation as an alternative to logical deduction
Two concrete examples of re-formulated problems
Short paper based on this publication
3. When solutions do not (quite) fit the problem ...
Copyright: sfllaw (Flickr, image 222795669)
4. Linked Data
Graph/facts based knowledge representation tool
Connect resources to properties / other resources
Web-based: resources have a URI
Try http://dbpedia.org/resource/Amsterdam
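A minimal sketch of the facts-based view described on this slide, assuming a plain Python set of (subject, property, value) triples. The two resource URIs follow the DBpedia naming scheme, but the `ex:` property names and the facts themselves are made up for illustration and are not DBpedia's actual vocabulary.

```python
# Resources are identified by URIs; a fact links a resource to a property
# value or to another resource.
AMSTERDAM = "http://dbpedia.org/resource/Amsterdam"
NETHERLANDS = "http://dbpedia.org/resource/Netherlands"

# Hypothetical facts: the "ex:" properties are illustrative placeholders.
triples = {
    (AMSTERDAM, "ex:country", NETHERLANDS),   # resource -> other resource
    (AMSTERDAM, "ex:label", '"Amsterdam"'),   # resource -> literal value
    (NETHERLANDS, "ex:capital", AMSTERDAM),
}


def describe(resource, graph):
    """Return the sorted (property, value) pairs attached to a resource."""
    return sorted((p, o) for s, p, o in graph if s == resource)


for prop, value in describe(AMSTERDAM, triples):
    print(prop, value)
```

Because every resource has a URI, the same lookup can in principle be done over the Web by dereferencing that URI instead of scanning a local set.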
5. Interacting with Linked Data
Common goals
Completeness: all the answers
Soundness: only exact answers
6. Motivation
In the context of Web data?
Issues with scale
Issues with lack of consistency
Issues with contextualised views over the World
Revise the goals
As many answers as possible (or needed)
Answers as accurate as possible (or needed)
7. From logic to optimisation
Optimise towards the revised goals
Need methods that cope with uncertainty, context, noise, scale, ...
9. The problem
Match a graph pattern to the data
Most common approach
Join partial results for each edge of the query
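The join-based approach can be sketched as follows. This is a simplified illustration rather than how any particular triple store works: the graph, the query, and the convention that variables start with `?` are all assumptions made for the example.

```python
def match_edge(pattern, graph, binding):
    """Extend one variable binding with every triple matching one query edge."""
    results = []
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must keep the value it was already bound to.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            results.append(candidate)
    return results


def match_pattern(patterns, graph):
    """Join the partial results edge by edge, keeping only consistent bindings."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match_edge(pattern, graph, b)]
    return bindings


# Hypothetical data and query: papers published in 2012 and their authors.
graph = [
    ("paper1", "author", "alice"),
    ("paper1", "year", "2012"),
    ("paper2", "author", "bob"),
    ("paper2", "year", "2011"),
]
query = [("?p", "author", "?a"), ("?p", "year", "2012")]
print(match_pattern(query, graph))
```

Every edge of the query must be satisfied for a binding to survive the join, which is what makes the answers both sound and complete, and also what makes the approach expensive on large or inconsistent data.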
10. Solving approaches
Logic-based
Find all the answers matching all of the query pattern
Optimisation
Find answers matching as much of the query as possible
Important implications of the optimisation
Only some of the answers will be found
Some of the answers found will be partially true
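One way to make "matching as much of the query as possible" concrete is to score a candidate answer by the fraction of query edges it satisfies. The sketch below assumes that scoring function; the measure actually used in the work presented may differ, and the data is hypothetical.

```python
def substitute(pattern, binding):
    """Replace the variables of a query edge with their bound values."""
    return tuple(binding.get(term, term) for term in pattern)


def fitness(binding, patterns, graph):
    """Fraction of query edges that hold in the graph under this binding."""
    satisfied = sum(1 for p in patterns if substitute(p, binding) in graph)
    return satisfied / len(patterns)


# Hypothetical data: the query asks for a 2011 paper, but paper1 is from 2012.
graph = {
    ("paper1", "author", "alice"),
    ("paper1", "year", "2012"),
}
query = [("?p", "author", "?a"), ("?p", "year", "2011")]
candidate = {"?p": "paper1", "?a": "alice"}
print(fitness(candidate, query, graph))
```

A logic-based engine would reject this candidate outright. Under the optimisation view it is kept as a partially true answer, which is the second implication listed on the slide.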
11. An optimisation approach: eRDF
Guess the answers to the query
Evolutionary algorithm
Evaluate validity of candidate solution
Optimise with a recombination + local search
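The four steps above can be sketched as a small evolutionary loop. This is not the eRDF implementation: the population size, the number of generations, the uniform recombination and the single-variable local search are all assumptions chosen to keep the example short.

```python
import random


def fitness(binding, patterns, graph):
    """Fraction of query edges that hold in the graph under this binding."""
    hits = sum(
        1 for p in patterns
        if tuple(binding.get(t, t) for t in p) in graph
    )
    return hits / len(patterns)


def evolve(patterns, graph, generations=30, size=8, seed=0):
    rng = random.Random(seed)
    variables = sorted({t for p in patterns for t in p if t.startswith("?")})
    values = sorted({t for triple in graph for t in triple})

    def score(binding):
        return fitness(binding, patterns, graph)

    # Guess: start from random assignments of graph terms to variables.
    population = [{v: rng.choice(values) for v in variables} for _ in range(size)]
    for _ in range(generations):
        # Evaluate: rank the candidate solutions and keep the better half.
        population.sort(key=score, reverse=True)
        parents = population[: size // 2]
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            # Recombination: each variable is inherited from one of two parents.
            child = {v: rng.choice((a[v], b[v])) for v in variables}
            # Local search: re-guess one variable, keep the change if not worse.
            trial = dict(child)
            trial[rng.choice(variables)] = rng.choice(values)
            children.append(trial if score(trial) >= score(child) else child)
        population = parents + children
    best = max(population, key=score)
    return best, score(best)


# Hypothetical data and query.
graph = {
    ("paper1", "author", "alice"),
    ("paper1", "year", "2012"),
    ("paper2", "author", "bob"),
    ("paper2", "year", "2011"),
}
query = [("?p", "author", "?a"), ("?p", "year", "2012")]
best, best_score = evolve(query, graph)
print(best, best_score)
```

The loop can be stopped at any generation and still return its best candidate so far, at the price that the candidate may satisfy only part of the query.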
12. Some results
Tested on queries with varied complexity
Works best with more complex queries
Find exact answers when there are some
13. Finding implicit facts in the data
Copyright: givingnot@rocketmail.com (Flickr, image 6990161491)
14. The problem
Deduce new facts from others
Most common approach
Centralise all the facts, batch process deductions
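A minimal sketch of the centralised, batch approach, assuming all facts sit in one in-memory set and the rules are simple "every X is a Y" statements. The rule table and the `type` property are illustrative placeholders.

```python
# Hypothetical rules: every Author is a Person, every Person is an Agent.
RULES = {"Author": "Person", "Person": "Agent"}


def closure(facts, rules):
    """Batch forward chaining: apply every rule until no new fact appears."""
    known = set(facts)
    while True:
        derived = {
            (s, "type", rules[o])
            for s, p, o in known
            if p == "type" and o in rules
        }
        if derived <= known:
            return known
        known |= derived


facts = {("alice", "type", "Author")}
for fact in sorted(closure(facts, RULES)):
    print(fact)
```

The loop only stops once the full closure has been computed, so every derivable fact is found, but the whole dataset has to be gathered in one place first.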
15. Solving approaches
Logic-based
Find all the facts that can be derived from the data
Optimisation
Find as many facts as possible while preserving consistency
Important implications of the optimisation
Only some of the facts will be found
Unstable content
16. An optimisation approach: Swarms
Swarm of micro-reasoners
Browse the graph, applying rules when possible
Deduced facts disappear after some time
Every author of a paper is a person
Every person is also an agent
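The swarm idea can be sketched as below. This is a strong simplification and not the authors' system: each ant carries one rule and lands on a random fact at every step instead of walking along the edges of the graph, and the time-to-live, the number of ants per rule and the `type` property are assumptions.

```python
import random

# The two example rules from the slide, written as (premise, conclusion).
RULES = [("Author", "Person"), ("Person", "Agent")]


def run_swarm(base, rules, ants_per_rule=3, steps=200, ttl=20, seed=0):
    rng = random.Random(seed)
    base = set(base)
    derived = {}  # deduced fact -> step at which it expires
    ants = [rule for rule in rules for _ in range(ants_per_rule)]
    for step in range(steps):
        # Deduced facts disappear after some time unless an ant re-derives them.
        derived = {f: t for f, t in derived.items() if t > step}
        visible = sorted(base | set(derived))
        for premise, conclusion in ants:
            s, p, o = rng.choice(visible)  # the ant lands on one fact
            if p == "type" and o == premise:
                # Applying the rule writes the fact or refreshes its lifetime.
                derived[(s, "type", conclusion)] = step + ttl
    return base, set(derived)


base_facts = {("alice", "type", "Author")}
kept, deduced = run_swarm(base_facts, RULES)
print(sorted(deduced))
```

The second rule can only fire once the first one has produced its conclusion, and that conclusion can expire before the second ant finds it, so the set of deduced facts at any moment depends on the random walk rather than being fixed.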
17. Some results
If they stay, most of the implicit facts are derived
Ants need to follow each other to deal with precedence of rules
Several ants per rule are needed
18. Take home message
Logic problems can be turned into optimisation problems
Trade off
Gained: scalability, speed, robustness
Lost: determinism, completeness, soundness
A lot of research still to be done!
(and done quickly, Linked Data is growing fast...)