The following presentation is on my Master's thesis work, "Mining Interesting Trivia for Entities from Wikipedia".
This presentation is the second part and continues another presentation of mine, which has the same title but with 'PART-I' at the end.
Mining Interesting Trivia for Entities from Wikipedia PART-II
1. Mining Interesting Trivia for Entities
from Wikipedia
Presented By: Abhay Prakash, En. No. - 10211002, IIT Roorkee
Supervised By: Dr. Dhaval Patel, Assistant Professor, IIT Roorkee
Dr. Manoj K. Chinnakotla, Applied Researcher, Microsoft India
2. Publication Accepted
[1] Abhay Prakash, Manoj K. Chinnakotla, Dhaval Patel, Puneet Garg: “Did you
know?- Mining Interesting Trivia for Entities from Wikipedia”. In 24th
International Joint Conference on Artificial Intelligence (IJCAI), 2015.
Conference Rating: A*
3. Introduction: Problem Statement
Definition: Trivia is any fact about an entity which is interesting due to any of the following characteristics: unusualness, uniqueness, unexpectedness or weirdness.
Generally appears in “Did you know?” articles
E.g. “To prepare for Joker’s role, Heath Ledger secluded himself in a hotel room for a month” [Batman Begins]
Unusual for an actor/human to seclude himself for a month
Problem Statement: For a given entity, mine the top-k interesting trivia from its Wikipedia page, where a trivia is considered interesting if, when it is shown to 𝑁 persons, more than 𝑁/2 of them find it interesting.
For evaluation on the unseen set, we chose 𝑁 = 5 (statistical significance discussed in mid evaluation)
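The majority-vote criterion in the problem statement can be sketched in a few lines. The function name `is_interesting` and the representation of votes as booleans are assumptions made for illustration; the slides only state the "more than 𝑁/2" rule.

```python
def is_interesting(votes):
    """Return True when a strict majority of judges found the trivia interesting.

    `votes` holds one boolean per judge, so N = len(votes).
    """
    if not votes:
        raise ValueError("at least one vote is required")
    # Strict majority: more than N/2 positive votes.
    return sum(votes) > len(votes) / 2


# With N = 5 judges, three positive votes pass the test and two do not.
print(is_interesting([True, True, True, False, False]))
print(is_interesting([True, True, False, False, False]))
```

With an even number of judges, a tie counts as not interesting under this rule, which is one reason an odd 𝑁 is convenient.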
4. Wikipedia Trivia Miner (WTM)
Based on ML approach to mine trivia from unstructured text
Trains a ranker using sample trivia of target domain
Experiment with Movie entities and Celebrity entities
Harness trained ranker to mine Trivia from entity’s Wikipedia page
Retrieves Top-k standalone interesting sentences from entity’s page
Why Wikipedia?
Reliable for factual correctness
Ample # of interesting trivia (56/100 in expt.)
5. System Architecture
Filtering & Grading
Filters out noisy samples
Gives a grade to each sample, as required by the ranker
Interestingness Ranker
Extracts features from the samples/candidates
Trains the ranker (SVMrank) / ranks the candidates
Candidate Selection
Identifies candidates from Wikipedia
[Figure: Wikipedia Trivia Miner (WTM) architecture. Components shown: Human Voted Trivia Source, Filtering & Grading, Train Dataset, Candidates’ Source, Candidate Selection, Interestingness Ranker (Feature Extraction, SVMrank), Knowledge Base, Top-K Interesting Trivia from Candidates]
6. Execution Phases
[Figure: execution phases. Train Phase components: Human Voted Trivia Source, Filtering & Grading, Train Dataset, Feature Extraction, SVMrank, Model. Retrieval Phase components: Candidates’ Source, Candidate Selection, Feature Extraction, SVMrank, Knowledge Base, Top-K Interesting Trivia from Candidates]
Train Phase
Crawls and prepares the train data
Featurizes the train data
Trains SVMrank to build a model
Retrieval Phase
Crawls the entity’s Wikipedia text
Identifies candidates for trivia
Featurizes the candidates
Ranks the candidates using the already built model
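Both phases hand featurized samples to SVMrank, which reads one sample per line in the SVM-light style format `<target> qid:<qid> <feature>:<value> ...`. A minimal sketch of producing such lines follows. The helper name `to_svmrank_line`, the numeric feature ids, and the choice of one `qid` per entity are assumptions for illustration, not details given in the slides.

```python
def to_svmrank_line(grade, qid, features):
    """Format one sample as '<grade> qid:<qid> <id>:<value> ...'.

    `features` maps positive integer feature ids to numeric values.
    """
    # Feature ids are written in increasing order; zero values are left out (sparse format).
    pairs = " ".join(
        f"{idx}:{value:g}" for idx, value in sorted(features.items()) if value != 0
    )
    return f"{grade} qid:{qid} {pairs}".rstrip()


# Hypothetical sample: grade 2, entity (query) 1, three features of which one is zero.
print(to_svmrank_line(2, 1, {3: 0.75, 1: 0.25, 7: 0}))
```

Grouping by `qid` matters because a ranking SVM only compares samples that share the same query id.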
7. Feature Engineering
Bucket | Feature | Significance | Sample features | Example Trivia
Unigram (U) Features | Each word’s TF-IDF | Identifies important words which make the trivia interesting | “stunt”, “award”, “improvise” | “Tom Cruise did all of his own stunt driving.”
Linguistic (L) Features | Superlative Words | Shows the extremeness (uniqueness) | “best”, “longest”, “first” | “The longest animated Disney film since Fantasia (1940).”
Linguistic (L) Features | Contradictory Words | Opposing ideas could spark intrigue and interest | “but”, “although”, “unlike” | “The studios wanted Matthew McConaughey for lead role, but James Cameron insisted on Leonardo DiCaprio.”
Linguistic (L) Features | Root Word (Main Verb) | Captures core activity being discussed in the sentence | root_gross | “Gravity grossed $274 Mn in North America”
Linguistic (L) Features | Subject Word (First Noun) | Captures core thing being discussed in the sentence | subj_actor | “The actors snorted crushed B vitamins for scenes involving cocaine”
Linguistic (L) Features | Readability | Complex and lengthy trivia are hardly interesting | FOG Index binned in 3 bins | ---
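The Readability feature in the table above relies on the Gunning FOG index, 0.4 × (words per sentence + 100 × share of complex words), where a complex word has three or more syllables. The sketch below is a rough approximation: the vowel-group syllable counter is a heuristic, and the bin thresholds `low` and `high` are hypothetical, since the slides only say that three bins were used.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups (a heuristic, not a dictionary lookup)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fog_index(text):
    """Gunning FOG index: 0.4 * (words/sentences + 100 * complex_words/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))


def fog_bin(score, low=8.0, high=12.0):
    """Map a FOG score to bin 1, 2 or 3; the thresholds are assumed, not from the slides."""
    if score < low:
        return 1
    if score < high:
        return 2
    return 3


score = fog_index("Tom Cruise did all of his own stunt driving.")
print(score, fog_bin(score))
```

Binning keeps the feature coarse, so small differences in the score between two similar sentences do not change the feature value.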
8. Feature Engineering (Contd…)
Bucket | Feature | Significance | Sample features | Example Trivia
Entity (E) Features | Generic NEs | Captures general about-ness | MONEY, ORGANIZATION, PERSON, DATE, TIME and LOCATION | “The guns in the film were supplied by Aldo Uberti Inc., a company in Italy.” Features: ORGANIZATION and LOCATION
Entity (E) Features | Related Entities | Captures specific about-ness (entities resolved using DBpedia) | entity_producer, entity_director | “According to Victoria Alonso, Rocket Raccoon and Groot were created through a mix of motion-capture and rotomation VFX.” Features: entity_producer, entity_character
Entity (E) Features | Entity Linking before (L) Parsing | Captures generalized story of sentence | subj_entity_producer | [The same trivia above] “According to entity_producer, …”; subj_Victoria becomes subj_entity_producer
Entity (E) Features | Focus Entities | Captures core entities being talked about | underroot_entity_producer | [The same trivia above] Features: underroot_entity_producer, underroot_entity_character
9. Feature Engineering: Example
Ex. “According to Victoria Alonso, Rocket Raccoon and Groot were created through a mix of
motion-capture and rotomation VFX.”
Features extracted: 18025 (U) + 5 (L) + 4686 (E) columns in total for all train data
Feature | Value
create | 0.25
mix | 0.75
motion | 0.96
capture | 0.4
rotomation | 0.85
VFX | 0.75
root_create | 1
supPOS | 0
subj_entity_producer | 1
FOG | 3
contradictory | 0
entity_producer | 1
entity_character | 1
underroot_entity_producer | 1
underroot_entity_character | 1
Rest of the features have value 0, e.g. entity_actor = 0, award = 0, subj_actor = 0, root_win = 0, …
10. Comparative Approaches
I. Random [Baseline I]:
- 10 sentences picked randomly from Wikipedia
II. CS + Random
- Candidates Selected (standalone, context-independent sentences)
- i.e., remove sentences like “it really reminds me of my childhood”
- 10 sentences picked randomly from candidates
III. CS + supPOS (Best) [Baseline II]:
- Candidates Selected
- Ranked by # of superlative words
- Best case: among sentences with the same # of superlative words, the interesting ones are deliberately ranked first
supPOS (Best Case)
Rank | # of sup. words | Class
1 | 2 | Interesting
2 | 2 | Boring
3 | 1 | Interesting
4 | 1 | Interesting
5 | 1 | Interesting
6 | 1 | Boring
7 | 1 | Boring
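The best-case ordering of the supPOS baseline can be reproduced with a two-part sort key: more superlative words first, and among ties, interesting sentences first. This is a sketch only; the tuple layout and the name `rank_suppos_best_case` are assumptions, and the candidate ids are placeholders.

```python
def rank_suppos_best_case(candidates):
    """Rank (sentence_id, n_superlatives, is_interesting) tuples for the best-case baseline.

    Primary key: more superlative words first.
    Tie-break: interesting sentences first (this uses the ground truth, hence "best case").
    """
    return sorted(candidates, key=lambda c: (-c[1], not c[2]))


# Placeholder candidates with the same counts and classes as the table above.
candidates = [
    ("a", 1, False), ("b", 2, False), ("c", 1, True), ("d", 2, True),
    ("e", 1, True), ("f", 1, False), ("g", 1, True),
]
for rank, (sid, n_sup, interesting) in enumerate(rank_suppos_best_case(candidates), 1):
    print(rank, n_sup, "Interesting" if interesting else "Boring")
```

Because the tie-break peeks at the labels, the resulting scores are an upper bound on what a superlative-count ranker could achieve.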
11. Variants of WTM
I. WTM (U)
- Candidates Selected
- ML Ranking of candidates using only Unigram Features
II. WTM (U+L+E)
- Candidates Selected
- ML Ranking of candidates using all features: Unigram (U) + Linguistic (L) + Entity (E)
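WTM (U) relies on unigram TF-IDF values alone. The slides do not state which TF-IDF variant was used, so the sketch below assumes raw term frequency multiplied by ln(N/df), with each sentence treated as one document; the function name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_sentences):
    """Return one {word: tf-idf} dict per sentence, with tf = raw count and idf = ln(N/df)."""
    n_docs = len(tokenized_sentences)
    # Document frequency: number of sentences containing each word.
    df = Counter()
    for tokens in tokenized_sentences:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized_sentences:
        tf = Counter(tokens)
        vectors.append({w: tf[w] * math.log(n_docs / df[w]) for w in tf})
    return vectors


# Toy corpus: "driving" occurs in every sentence, so its weight drops to zero.
for vec in tfidf_vectors([["stunt", "driving"], ["award", "driving"]]):
    print(vec)
```

Words that appear in every sentence get zero weight under this variant, which is why distinctive words such as “stunt” stand out.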
14. Results: P@10
Metric is Precision at 10 (P@10), which
means out of top 10 ranked candidates,
how many actually are interesting
CS+Random > Random
Shows significance of Candidate
Selection
WTM (U+L+E) >> WTM (U)
Shows significance of Engineered
Linguistic (L) and Entity (E) Features
[Chart: P@10 by approach]
Approach | P@10
Random | 0.25
CS+Random | 0.30
supPOS (Best Case) | 0.34
WTM (U) | 0.34
WTM (U+L+E) | 0.45
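The P@10 metric used on this slide is simple to compute once each ranked candidate has a ground-truth label. A minimal sketch follows; the function name `precision_at_k` is hypothetical, and dividing by k even when fewer than k candidates exist is an assumed convention.

```python
def precision_at_k(ranked_labels, k=10):
    """Fraction of the top-k ranked candidates that are labelled interesting.

    `ranked_labels` is a list of booleans ordered from best to worst rank.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Dividing by k (not by the number of candidates) penalises short result lists.
    return sum(ranked_labels[:k]) / k


# Hypothetical ranking in which 4 of the top 10 candidates are interesting.
labels = [True, False, True, True, False, False, True, False, False, False, True]
print(precision_at_k(labels, k=10))
```

A per-system score like the ones charted above would then be the mean of this value over all test entities.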
15. Results: Recall@K
supPOS is limited to one kind of trivia
WTM captures varied types
62% recall till rank 25
Performance Comparison
supPOS is better till rank 3
Soon after rank 3, WTM beats supPOS
[Chart: % Recall (y-axis, 0 to 70) vs. Rank (x-axis, 0 to 25) for supPOS (Best Case), WTM and Random]
16. Sensitivity to Training Size
Current results reported with 6163 train trivia
WTM precision increases with train size
Desirable property, as precision can be improved by taking more train data
17. WTM’s Domain Independence
Experiment on the Celebrity domain to justify the claim of domain independence.
Dataset:
Crawled trivia for the top 1000 movie celebrities from IMDb and ran a 5-fold test
Train dataset: 4459 Trivia (106 entities)
Test dataset: 500 Trivia (10 entities)
Feature group suspected of being domain dependent: Entity Features
Feature group | Sample features
Unigram (U) Features | All words
Linguistic (L) Features | subj_actor, root_reveal, subj_scene, but, best, FOG_index = 7.2
Entity (E) Features | entity_producer, entity_director, …
18. WTM’s Domain Independence (Contd…)
Entity Features are domain independent too
Entity Features are automatically generated using attribute:value pairs in DBpedia
When a ‘value’ matches in a sentence, the match is replaced by entity_‘attribute’
Unigram (U) and Linguistic (L) features are clearly domain independent
[Figures: DBpedia (attribute: value) pairs for Batman Begins; Sample Trivia (Batman Begins)]
20. WTM’s Domain Independence (Contd…)
Entity Feature Generation from DBpedia
Domain | DBpedia attribute:value | Feature generated
Movie Domain (ex. Batman Begins (2005)) | Director: Christopher Nolan | entity_director
Movie Domain (ex. Batman Begins (2005)) | Producer: Larry J. Franco | entity_producer
Celebrity Domain (ex. Angelina Jolie) | Partner: Brad Pitt | entity_partner
Celebrity Domain (ex. Angelina Jolie) | birthplace: California | entity_birthPlace
Example of Entity Features in Celebrity Domain
Feature | Entity | Trivia
entity_partner | Johnny Depp | Engaged to Amber Heard [January 17, 2014].**
entity_citizenship | Nicole Kidman | First Australian actress to win the Best Actress Academy Award.
** After Entity Linking, the sentence is parsed as “Engaged to entity_partner”
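The replacement of DBpedia values by entity_‘attribute’ tokens can be sketched as plain string substitution. The function name `link_entities`, the list-of-pairs input, and case-insensitive exact matching are all assumptions; the slides do not describe how values are matched in the sentence.

```python
import re


def link_entities(sentence, attribute_values):
    """Replace each DBpedia value found in the sentence with an entity_<attribute> token.

    `attribute_values` is a list of (attribute, value) pairs for the target entity.
    """
    linked = sentence
    # Longer values first, so a full name is replaced before any shorter overlapping value.
    for attribute, value in sorted(attribute_values, key=lambda av: -len(av[1])):
        token = f"entity_{attribute}"
        pattern = re.compile(re.escape(value), flags=re.IGNORECASE)
        linked = pattern.sub(lambda _match: token, linked)
    return linked


# Pair and sentence taken from the Johnny Depp example on this slide.
pairs = [("partner", "Amber Heard")]
print(link_entities("Engaged to Amber Heard [January 17, 2014].", pairs))
```

Since the attribute names come from DBpedia rather than from a hand-written list, the same routine applies to a movie page or a celebrity page without changes.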
21. Feature Contribution (Movie v/s Celeb.)
Movie Domain
Rank | Feature | Group
1 | subj_scene | Linguistic
2 | subj_entity_cast | Linguistic + Entity
3 | entity_produced_by | Entity
4 | underroot_unlinked_organization | Linguistic + Entity
6 | root_improvise | Linguistic
7 | entity_character | Entity
8 | MONEY | Entity (NER)
14 | stunt | Unigram
16 | superPOS | Linguistic
17 | subj_actor | Linguistic
Celebrity Domain
Rank | Feature | Group
1 | win | Unigram
3 | magazine | Unigram
4 | superPOS | Linguistic
5 | MONEY | Entity (NER)
6 | entity_alternativenames | Entity
7 | root_engage | Linguistic
14 | subj_earnings | Linguistic
15 | subj_entity_children | Linguistic + Entity
18 | entity_birthplace | Entity
19 | subj_unlinked_location | Linguistic + Entity
Top Features: Our advanced features are useful and intuitive for humans too
Entity Linking leads to better generalization (instead of entity_wolverine, the model gets entity_cast)
22. Results: P@10 (Celebrity Domain)
[Chart: P@10 by approach, Celebrity Domain]
Approach | P@10
Random | 0.39
supPOS (Best Case) | 0.54
WTM (U) | 0.58
WTM (U+L+E) | 0.71
Again WTM (U+L+E) >> WTM (U)
Significance of advanced (L) and (E)
features
Hence, Features and Approach are
Domain Independent
For entities of any domain, just replace
Train Data (Sample Trivia)
23. Dissertation Contribution
Identified and defined a novel research problem
rather than only providing a solution to an existing problem
Proposed a Domain Independent system “Wikipedia Trivia Miner (WTM)”
To mine the top-k trivia for any given entity, ranked by their interestingness
Engineered features that capture the ‘about-ness’ of a sentence
Generalizes which ones are interesting
Proposed a mechanism to prepare ground truth for the test set
Cost-effective but statistically significant
24. Future Works
New Features to increase Ranking Quality
Unusualness: Probability of occurrence of the sentence in the considered domain
Fact Popularity: Lesser known trivia could be more interesting to the majority of people
Trying Deep Learning
Could be helpful, as in the case of sarcasm detection
Generating Questions from mined trivia
To present Trivia in question form
Obtaining personalized Interesting Trivia
In this dissertation work, interestingness was decided by majority voting; ranking could instead be based on user demographics