Tabular data is difficult to analyze and search through. There is a clear need for new tools and interfaces that would allow even non-tech-savvy users to gain insights from open datasets without resorting to specialized data analysis tools or even having to fully understand the dataset structure. We explore the End-To-End Memory Networks architecture (Sukhbaatar et al., 2015) in application to answering natural language questions from tabular data. This architecture was originally designed for the question-answering tasks from short natural language texts (bAbI tasks) (Weston et al., 2015), which include testing elements of inductive and deductive reasoning, co-reference resolution and time manipulation.
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RMarco Wirthlin
What is probabilistic programming and Bayesian statistics? What are their strengths and limitations? In his talk, Marco located Bayesian networks in the current AI landscape, gently introduced Bayesian reasoning and computation and explained how to implement generative models in R.
Keynote for Theory and Practice of Digital Libraries 2017
The theory and practice of digital libraries provides a long history of thought around how to manage knowledge ranging from collection development, to cataloging and resource description. These tools were all designed to make knowledge findable and accessible to people. Even technical progress in information retrieval and question answering are all targeted to helping answer a human’s information need.
However, increasingly demand is for data. Data that is needed not for people’s consumption but to drive machines. As an example of this demand, there has been explosive growth in job openings for Data Engineers – professionals who prepare data for machine consumption. In this talk, I overview the information needs of machine intelligence and ask the question: Are our knowledge management techniques applicable for serving this new consumer?
Gentle Introduction: Bayesian Modelling and Probabilistic Programming in RMarco Wirthlin
What is probabilistic programming and Bayesian statistics? What are their strengths and limitations? In his talk, Marco located Bayesian networks in the current AI landscape, gently introduced Bayesian reasoning and computation and explained how to implement generative models in R.
Keynote for Theory and Practice of Digital Libraries 2017
The theory and practice of digital libraries provides a long history of thought around how to manage knowledge ranging from collection development, to cataloging and resource description. These tools were all designed to make knowledge findable and accessible to people. Even technical progress in information retrieval and question answering are all targeted to helping answer a human’s information need.
However, increasingly demand is for data. Data that is needed not for people’s consumption but to drive machines. As an example of this demand, there has been explosive growth in job openings for Data Engineers – professionals who prepare data for machine consumption. In this talk, I overview the information needs of machine intelligence and ask the question: Are our knowledge management techniques applicable for serving this new consumer?
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...Amélie Gyrard
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located with The Web Conference 2018 (WWW 2018)
New release to upload the latest slides: 2 May
Previous slides have been seen 45 times.
This is a presentation on natural language processing (NLP), machine learning (ML), and Big Data. I introduce neural networks, unsupervised machine learning, and a variety of natural language processing techniques, and I cover architectural best practices, ways to implement data science for practical applications, and how to deal with organizational road-blocks.
Collaborative writing and common core standards in the classroom slideshareVicki Davis
The ISTE 12 presentation covering each of the 10 Common Core writing standards and showing how each of the standards looks in classroom examples of project based technology-enabled learning. You can use standards and have an innovative curriculum. Panel discussion slides.
Invited talk at the 5th International Workshop on Search-Oriented Conversational AI (SCAI) @EMNLP2020. Here is the recording https://slideslive.com/38940054/response-generation-and-retrieval-for-multimodal-conversational-ai
Wrangling Big Data in a Small Tech EcosystemShalin Hai-Jew
This work summarizes an early endeavor to explore some big(gish) data from an LMS data portal from flat files. There are some patterns that are identifiable: file types that are most common on the LMS instance, the types of assignments given, time-based patterns for quizzes, types of third-party tool activations (based on LTI), typical computer device configurations for those touching the LMS, features of email conversations shared through the instance, and others. This work includes some creative applications of "survival analysis" (time-to-event analysis) and computational linguistic analysis to textual descriptions. The particular LMS instance is from Kansas State University's Canvas instance; Canvas is created by Instructure. Much is learnable from LMS data portal "flat files," but much is left unexploited, without going to a larger database.
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located wit...Amélie Gyrard
Knowledge Extraction for the Web of Things (KE4WoT) Challenge: Co-located with The Web Conference 2018 (WWW 2018)
New release to upload the latest slides: 2 May
Previous slides have been seen 45 times.
This is a presentation on natural language processing (NLP), machine learning (ML), and Big Data. I introduce neural networks, unsupervised machine learning, and a variety of natural language processing techniques, and I cover architectural best practices, ways to implement data science for practical applications, and how to deal with organizational road-blocks.
Collaborative writing and common core standards in the classroom slideshareVicki Davis
The ISTE 12 presentation covering each of the 10 Common Core writing standards and showing how each of the standards looks in classroom examples of project based technology-enabled learning. You can use standards and have an innovative curriculum. Panel discussion slides.
Invited talk at the 5th International Workshop on Search-Oriented Conversational AI (SCAI) @EMNLP2020. Here is the recording https://slideslive.com/38940054/response-generation-and-retrieval-for-multimodal-conversational-ai
Wrangling Big Data in a Small Tech EcosystemShalin Hai-Jew
This work summarizes an early endeavor to explore some big(gish) data from an LMS data portal from flat files. There are some patterns that are identifiable: file types that are most common on the LMS instance, the types of assignments given, time-based patterns for quizzes, types of third-party tool activations (based on LTI), typical computer device configurations for those touching the LMS, features of email conversations shared through the instance, and others. This work includes some creative applications of "survival analysis" (time-to-event analysis) and computational linguistic analysis to textual descriptions. The particular LMS instance is from Kansas State University's Canvas instance; Canvas is created by Instructure. Much is learnable from LMS data portal "flat files," but much is left unexploited, without going to a larger database.
Similar to Memory Networks for Question Answering on Tabular Data (20)
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
UiPath New York Community Day in-person eventDianaGray10
UiPath Community Day is a unique gathering designed to foster collaboration, learning, and networking with automation enthusiasts. Whether you're an automation developer, business analyst, IT professional, solution architect, CoE lead, practitioner or a student/educator excited about the prospects of artificial intelligence and automation technologies in the United States, then the UiPath Community Day is definitely the place you want to be.
Join UiPath leaders, experts from the industry, and the amazing community members and let's connect over expert sessions, demos and use cases around AI in automation as we highlight our technology with a special speaker on Document Understanding.
📌Agenda
3:00 PM Registrations
3:30 PM Welcome note and Introductions | Corina Gheonea (Senior Director of Global UiPath Community)
4:00 PM Introduction to Document Understanding
How to build and deploy Document Understanding process
Where would Document Understanding be used.
Demo
Q&A
4:45 PM Customer/Partner showcase
Accelirate
Intro to Accelirate and history with UiPath
Why are we excited about the new AI features of UiPath?
Customer highlight
a. Document Understanding – BJs Case Study
b. Document Understanding + generative AI
5.30 PM Networking
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
The infamous Mallox is the digital Robin Hoods of our time, except they steal from everyone and give to themselves. Since mid-2021, they've been playing hide and seek with unsecured Microsoft SQL servers, encrypting data, and then graciously offering to give it back for a modest Bitcoin donation.
Mallox decided to go shopping for new malware toys, adding the Remcos RAT, BatCloak, and a sprinkle of Metasploit to their collection. They're now playing a game of "Catch me if you can" with antivirus software, using their FUD obfuscator packers to turn their ransomware into the digital equivalent of a ninja.
-------
This document provides a analysis of the Target Company ransomware group, also known as Smallpox, which has been rapidly evolving since its first identification in June 2021.
The analysis delves into various aspects of the group's operations, including its distinctive practice of appending targeted organizations' names to encrypted files, the evolution of its encryption algorithms, and its tactics for establishing persistence and evading defenses.
The insights gained from this analysis are crucial for informing defense strategies and enhancing preparedness against such evolving cyber threats.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
5. bAbI Benchmark
❖ 20 QA tasks: train/test 1K samples
Factoid QA
Yes/no questions
Counting
Coreference
Time manipulation
Basic deduction/induction
Positional reasoning
Reasoning about size
Path finding …
❖ 6 dialog tasks
Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin and Tomas
Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:1502.05698.
6. 2. Factoid QA with two supporting facts
1 Mary got the milk.
2 John moved to the bedroom.
3 Sandra went back to the kitchen.
4 Mary travelled to the hallway.
5 Where is the milk? hallway 1 4
https://github.com/facebook/bAbI-tasks
7. 15. Basic deduction
1 Wolves are afraid of mice.
2 Sheep are afraid of mice.
3 Winona is a sheep.
4 Mice are afraid of cats.
5 Cats are afraid of wolves.
6 Jessica is a mouse.
10 What is winona afraid of? mouse 3 2
12 What is jessica afraid of? cat 6 4
https://github.com/facebook/bAbI-tasks
8. 19. Path finding
1 The garden is west of the bathroom.
2 The bedroom is north of the hallway.
3 The office is south of the hallway.
4 The bathroom is north of the bedroom.
5 The kitchen is east of the bedroom.
6 How do you go from the bathroom to the hallway? s,s 4 2
https://github.com/facebook/bAbI-tasks
9. Memory Network (MemNN)
❖ Deep neural network architecture proposed by Facebook AI Research group
Memory: indexed array of objects (e.g. vectors)
❖ Components:
I: (input) convert incoming data to the internal representation.
G: (generalisation) update memories given input.
O: (output) produce output given the memories.
R: (response) convert output representation into a response.
J. Weston, S. Chopra, A. Bordes. Memory Networks. ICLR 2015
https://blog.acolyer.org/2016/03/10/memory-networks/
11. End-to-end Memory Network (MemN2N)
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in
neural information processing systems (NIPS). 2015.
12. Variations
❖ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James
Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus,
Richard Socher: Ask Me Anything: Dynamic Memory
Networks for Natural Language Processing. ICML 2016.
❖ Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein
Karimi, Antoine Bordes, Jason Weston: Key-Value Memory
Networks for Directly Reading Documents. EMNLP 2016.
❖ Julien Perez, Fei Liu: Gated End-to-End Memory Networks.
EACL 2017.
http://yerevann.com/dmn-ui
13. Application: Open Data
❖ Open Data -> Open Government
❖ Increasing transparency
❖ Empowering citizens and local communities
19. Query Disambiguation
❖ fastText model trained on Wikipedia
❖ handles OOV words
immigration recognised as immigration_total 0.96
code recognized as lau2_code 0.86
Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. Enriching word vectors
with subword information. arXiv:1607.04606. 2016.
20. Evaluation
The template-based questions are modified by
‣ omitting words: one or more words are removed from
the original user query;
‣ changing the position of words in the query;
‣ querying a different column that did not appear in the
questions from the training data set;
‣ inadequate questions, for which data required to
answer this question are not present in the input table.
21. Results
The template-based questions are modified by
✓ omitting words: one or more words are removed from
the original user query;
✓ changing the position of words in the query;
- querying a different column that did not appear in the
questions from the training data set;
- inadequate questions, for which data required to
answer this question are not present in the input table.
23. Conclusions
❖ Fancy, but tricky!
❖ Know on what you train?
data sampling & variance to ensure generalisability
❖ Know what you trained?
interaction & visualisation of learned patterns
24. Future Work
❖ Scaling up experiments to real world tables (variance &
OOV words)
❖ New dataset for QA from open data tables
❖ Answering questions across tables
❖ Semantic integration of open data tables
❖ Joint training with other bAbI tasks for text
understanding
25. Future Work
https://m.me/OpenDataAssistant
Make data your friend!
Open Data Assistant: chatbot - dialogue interface
Sebastian Neumaier, Vadim Savenkov, and Svitlana Vakulenko. "Talking Open Data." ESWC (demo). 2017
26. References I
❖ Svitlana Vakulenko and Vadim Savenkov. TableQA: Question Answering on Tabular Data.
2017. https://arxiv.org/abs/1705.06504
❖ Jason Weston, Sumit Chopra, Antoine Bordes. Memory Networks. ICLR 2015 .
❖ Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks."
Advances in neural information processing systems (NIPS). 2015.
❖ Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani,
Victor Zhong, Romain Paulus, Richard Socher: Ask Me Anything: Dynamic Memory
Networks for Natural Language Processing. ICML 2016.
❖ Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, Jason
Weston: Key-Value Memory Networks for Directly Reading Documents. EMNLP 2016.
❖ Julien Perez, Fei Liu: Gated End-to-End Memory Networks. EACL 2017.
❖ Antoine Bordes, Nicolas Usunier, Sumit Chopra, Jason Weston: Large-scale Simple Question
Answering with Memory Networks. CoRR abs/1506.02075. 2015.
27. References II
Datasets
❖ Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin
and Tomas Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:
1502.05698.
❖ SimpleQuestions
❖ WebQuestions
❖ CNN QA
Videos
❖ CS224D Guest Lecture - Jason Weston - 2015 https://www.youtube.com/watch?v=6NHeIEaSie8&t=1435s
❖ Jason Weston. Memory Networks for Language Understanding, ICML Tutorial 2016
http://www.thespermwhale.com/jaseweston/icml2016/
http://techtalks.tv/talks/memory-networks-for-language-understanding/62356/
Blogs
❖ https://blog.init.ai/icml-2016-memory-networks-for-language-understanding-f2ed4c8819c4
❖ https://blog.acolyer.org/2016/03/10/memory-networks/
❖ https://yerevann.github.io/2016/02/05/implementing-dynamic-memory-networks/