An introductory presentation on the current state of personalization in (Web) search for Bibliotekarforbundet's series of 'gå-hjem-møder' (after-work seminars). Presented on May 17, 2016 at Aalborg University Copenhagen.
The document discusses the concept of personalized web search (PWS). It notes that generic web search engines cannot identify different user needs, so PWS was introduced to personalize search results. Various techniques for PWS are discussed, including user profiling using demographic and interest data, hyperlink analysis, and community-based or location-based approaches. Maintaining accurate user profiles that respect privacy is also addressed. Potential applications and limitations of PWS are mentioned.
Web crawlers, also known as robots or bots, are programs that systematically browse the internet and index websites for search engines. Crawlers follow links from seed URLs and download pages to extract new URLs to crawl. They use techniques like breadth-first crawling to efficiently discover as much of the web as possible. Crawlers must have policies to select pages, revisit sites, be polite to not overload websites, and coordinate distributed crawling. Their high-performance architecture is crucial for search engines to comprehensively index the large and constantly changing web.
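The crawling loop summarized above (seed URLs, a frontier, breadth-first expansion) can be sketched in a few lines of Python; the in-memory link graph and the `get_links` callback are stand-ins for real page downloads, not an actual crawler:

```python
from collections import deque

def crawl_bfs(seed_urls, get_links, max_pages=100):
    """Breadth-first crawl: visit seeds first, then pages one link away, and so on.
    `get_links(url)` stands in for downloading a page and extracting its links."""
    frontier = deque(seed_urls)      # the URL frontier; FIFO queue -> breadth-first order
    seen = set(seed_urls)            # selection/revisit/politeness policies would hook in here
    order = []
    while frontier and len(order) < max_pages:
        url = frontier.popleft()
        order.append(url)
        for link in get_links(url):  # newly discovered URLs
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

# Toy link graph instead of real HTTP fetches
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(crawl_bfs(["a"], lambda u: graph.get(u, [])))
```

Swapping the deque for a stack would turn this into depth-first crawling; real crawlers add per-host politeness delays and revisit scheduling on top of this skeleton.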
Case-based reasoning is a problem-solving process that uses specific examples of previously experienced problems, called cases, to solve new problems. There are four main processes in case-based reasoning: retrieve, reuse, revise, and retain. Retrieve involves finding past cases similar to the new problem. Reuse means applying solutions from similar past cases to the new problem. Revise re-tests solutions to see if they work or need adjusting for the new problem. Retain stores the new experience so future problems can retrieve and reuse it. The document provides an example of using case-based reasoning to solve a new problem with Android TV software where video will not play. It describes applying each step of the process to find and apply a past solution.
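The retrieve/reuse/revise/retain cycle can be sketched roughly as follows; the symptom sets, the crude similarity measure, and the troubleshooting cases are invented for illustration:

```python
def similarity(a, b):
    """Crude similarity: overlap of symptom keywords (Jaccard index; illustrative only)."""
    return len(a & b) / len(a | b)

def solve(problem, case_base, works=lambda s: True):
    # Retrieve: find the most similar past case
    best = max(case_base, key=lambda c: similarity(problem, c["symptoms"]))
    # Reuse: adopt its solution
    solution = best["solution"]
    # Revise: test the solution and adapt it if it fails (here we only flag failures)
    if not works(solution):
        solution = solution + " (adapted)"
    # Retain: store the new experience so it can be retrieved later
    case_base.append({"symptoms": problem, "solution": solution})
    return solution

cases = [
    {"symptoms": {"video", "no-playback"}, "solution": "update codec"},
    {"symptoms": {"audio", "stutter"}, "solution": "restart service"},
]
print(solve({"video", "black-screen", "no-playback"}, cases))
```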
This document discusses recommendation techniques. It begins by outlining researchers' current troubles with finding and connecting relevant information in a timely manner. It then introduces recommendation techniques as having the potential to greatly influence all aspects of life by addressing these problems. The document defines recommendation techniques as systems that predict items a user may be interested in based on their preferences and activities. It categorizes techniques based on the data sources used, such as user demographics, item attributes, user ratings, and knowledge about users and items. Different recommendation approaches are described, including non-personalized, content-based, collaborative filtering, and knowledge-based techniques. The document concludes by thanking the audience and inviting them to learn more in future classes.
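As a toy illustration of the non-personalized category mentioned above, items can simply be ranked by how many users interacted with them (all names and ratings below are made up):

```python
from collections import Counter

def most_popular(ratings, n=2):
    """Non-personalized recommendation: rank items by how many users rated them."""
    counts = Counter(item for user_ratings in ratings.values() for item in user_ratings)
    return [item for item, _ in counts.most_common(n)]

ratings = {
    "u1": {"book_a": 5, "book_b": 3},
    "u2": {"book_a": 4},
    "u3": {"book_a": 2, "book_c": 5},
}
print(most_popular(ratings))  # book_a is rated by every user, so it ranks first
```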
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief... (Adnan Masood)
This document discusses various approaches to measuring the interestingness of patterns discovered during data mining. It describes objective interestingness measures based only on the data, like conciseness, generality, reliability, peculiarity and diversity. Subjective measures take into account user knowledge and expectations, evaluating novelty and surprisingness. Semantic measures consider pattern semantics and explanations, focusing on utility and actionability. The document also discusses limitations of typical objective measures like support and confidence, and outlines subjective approaches involving user impressions at different levels of knowledge granularity.
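The support and confidence measures whose limitations the document discusses can be computed directly; the market-basket dataset below is a hypothetical example:

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also
    contain the consequent: support(A u B) / support(A)."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "butter"},
]
print(support({"bread", "milk"}, baskets))       # 2 of 4 baskets -> 0.5
print(confidence({"bread"}, {"milk"}, baskets))  # 2 of the 3 bread baskets
```

A known weakness, noted in the deck: confidence ignores how common the consequent is on its own, which is what motivates the subjective and semantic measures.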
Boston ML - Architecting Recommender Systems (James Kirk)
This document provides an overview of key concepts in recommender systems, including:
- The components of a recommender system including users, items, interactions, features, representations, predictions, loss functions, and learning.
- Design considerations for recommender systems such as choosing appropriate interaction values, features, representation functions, prediction functions, and loss functions.
- Examples of different types of recommender systems including collaborative filtering, content-based, hybrid, and real-world systems from Netflix, YouTube, and e-commerce.
- Tools for building recommender systems in Python like Implicit, Scikit-Learn, LightFM, TensorRec, and Annoy.
The document discusses the architecture
This document provides an overview of text mining and web mining. It defines data mining and describes the common data mining tasks of classification, clustering, association rule mining and sequential pattern mining. It then discusses text mining, defining it as the process of analyzing unstructured text data to extract meaningful information and structure. The document outlines the seven practice areas of text mining as search/information retrieval, document clustering, document classification, web mining, information extraction, natural language processing, and concept extraction. It provides brief descriptions of the problems addressed within each practice area.
This document discusses modelling and representing social network data ontologically. It covers representing social individuals and relationships ontologically, as well as aggregating and reasoning with social network data. It discusses ontology languages like RDF, OWL, and FOAF that can be used to represent social network data and individuals semantically. It also talks about state-of-the-art approaches for representing network structure and attribute data, and the need for representations that can integrate different data sources and maintain identity.
Tutorial on Sequence Aware Recommender Systems - ACM RecSys 2018 (Massimo Quadrana)
Slides of the Tutorial on Sequence Aware Recommenders held at ACM RecSys 2018 in Vancouver.
Link to the website: https://sites.google.com/view/seq-recsys-tutorial
Link to the hands-on: https://github.com/mquad/sars_tutorial
This document describes a web service that analyzes web crawl data to provide contextual information about locations. It extracts topics like weather, healthcare, crime, and employment that are relevant to a given location from common crawl data stored on Amazon S3. The system uses Apache Pig on a Hadoop cluster to analyze the data, builds an index of locations to associated words, and makes the results searchable through Elastic Search. It aims to provide useful information to people moving to new places, policy makers, journalists, and researchers.
This document provides an introduction to data mining. It discusses the evolution of data mining technology, defines what data mining is, and outlines common data mining tasks like classification, clustering, and association rule discovery. The document also examines the KDD process, different types of data that can be mined, and major issues in data mining like scalability, handling diverse data types, and integrating discovered knowledge.
The document discusses the emergence of the social web and the relationship between Web 2.0 and the Semantic Web. It describes how blogs, wikis, and social networks enabled new forms of user-generated content and social interaction online in the early 2000s. The document also explains how Semantic Web technologies could enhance Web 2.0 by enabling the standardized exchange and combination of user data and services.
Nmap is a network scanning tool that can perform port scanning, operating system detection, and version detection among other features. It works by sending TCP and UDP packets to a target machine and examining the response, comparing it to its database to determine open ports and operating system. There are different scanning techniques that can be used like TCP SYN scanning, UDP scanning, and OS detection. Nmap also includes a scripting engine that allows users to write scripts to automate networking tasks. The presentation concludes with demonstrating Nmap's features through some examples.
This document discusses web usage mining. It begins by defining web mining and its three categories: web content mining, web structure mining, and web usage mining. The main focus is on web usage mining, which involves discovering user navigation patterns and predicting user behavior. The key processes of web usage mining are preprocessing raw data, pattern discovery using algorithms, and pattern analysis. Pattern discovery techniques discussed include statistical analysis, clustering, classification, association rules, and sequential patterns. Potential applications are personalized recommendations, system improvements, and business intelligence. The document concludes by discussing future research directions such as usage mining on the semantic web and analyzing discovered patterns.
The document discusses the need for 3D search engines and describes a system for searching and retrieving 3D models from large online repositories. The system allows users to query the repository using text, 2D sketches, 3D sketches, or a combination. It indexes 3D models based on shape and text descriptions and returns the top matching results to the user in under a quarter of a second.
Tutorial: Context-awareness In Information Retrieval and Recommender Systems (YONG ZHENG)
The document provides an overview of a tutorial on context-awareness in information retrieval and recommender systems. It discusses topics such as information overload, solutions like information retrieval (e.g. search engines) and recommender systems (e.g. movie recommendations). It then covers context and context-awareness, giving examples like how recommendations may change based on location, time, user intent, etc. It also discusses incorporating context-awareness into information retrieval and recommender systems to improve recommendations.
The document provides an overview of recommender systems. It discusses the typical architecture of recommender systems and describes three main types: collaborative filtering systems, content-based systems, and knowledge-based systems. It also covers paradigms like collaborative filtering, content-based, knowledge-based, and hybrid recommender systems. The document then focuses on collaborative filtering techniques like user-based nearest neighbor collaborative filtering and item-based collaborative filtering. It also discusses latent factor models, matrix factorization approaches, and context-based recommender systems.
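User-based nearest-neighbour collaborative filtering, as covered in the deck, can be sketched as a similarity-weighted average of neighbours' ratings; the rating matrix below is invented, and cosine similarity is just one common choice of similarity function:

```python
import math

def cosine(u, v):
    """Cosine similarity: dot product over co-rated items, normalised by the
    full rating vectors (one common variant)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    return dot / (math.sqrt(sum(x * x for x in u.values())) *
                  math.sqrt(sum(x * x for x in v.values())))

def predict(target, item, ratings):
    """Predict `target`'s rating for `item` as a similarity-weighted average
    over all other users who rated that item."""
    num = den = 0.0
    for user, r in ratings.items():
        if user != target and item in r:
            s = cosine(ratings[target], r)
            num += s * r[item]
            den += abs(s)
    return num / den if den else None

ratings = {
    "alice": {"m1": 5, "m2": 3},
    "bob":   {"m1": 4, "m2": 3, "m3": 4},
    "carol": {"m1": 1, "m3": 2},
}
print(round(predict("alice", "m3", ratings), 2))
```

Item-based collaborative filtering flips the same idea around: similarities are computed between item columns rather than user rows.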
Personalizing Session-based Recommendations with Hierarchical Recurrent Neura... (Massimo Quadrana)
This document summarizes a research paper on personalizing session-based recommendations with hierarchical recurrent neural networks (HRNNs). The paper proposes using HRNNs to decouple user and session representations, with a user RNN that evolves the user's latent state across sessions and a session RNN that generates personalized recommendations for each session. Experiments on job posting and online video datasets show the HRNN approach outperforms baselines and other RNN methods, particularly for users with longer histories, by up to 28% in recall and 41% in MRR. The HRNN approach effectively transfers cross-session knowledge to improve session-based recommendations.
This document provides an overview of information retrieval models, including vector space models, TF-IDF, Doc2Vec, and latent semantic analysis. It begins with basic concepts in information retrieval like document indexing and relevance scoring. Then it discusses vector space models and how documents and queries are represented as vectors. TF-IDF weighting is explained as assigning higher weight to rare terms. Doc2Vec is introduced as an extension of word2vec to learn document embeddings. Latent semantic analysis uses singular value decomposition to project documents to a latent semantic space. Implementation details and examples are provided for several models.
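The TF-IDF weighting described above (higher weight for terms that are rare in the corpus) can be sketched as follows; the tiny corpus and the unsmoothed `log(N/df)` idf are illustrative choices, not the only formulation:

```python
import math

def tfidf(term, doc, corpus):
    """TF-IDF: term frequency in the document, scaled by inverse document frequency."""
    tf = doc.count(term) / len(doc)        # how often the term appears in this document
    df = sum(term in d for d in corpus)    # how many documents contain the term
    idf = math.log(len(corpus) / df)       # rarer term -> larger idf
    return tf * idf

corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "and", "the", "dog"],
]
print(tfidf("cat", corpus[0], corpus))  # "cat" appears in 2 of 3 docs -> positive weight
print(tfidf("the", corpus[0], corpus))  # appears everywhere -> idf = log(1) = 0
```

In the vector space model each document becomes a vector of such weights, and query-document relevance is scored by, e.g., cosine similarity between those vectors.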
Personalized Information Retrieval system using Computational Intelligence Te... (veningstonk)
The document presents research on developing a personalized information retrieval system using computational intelligence techniques. It discusses four proposed models: 1) a term association graph model for document re-ranking, 2) a topic model for document re-ranking, 3) a genetic intelligence model for document re-ranking, and 4) a swarm intelligence model for search query reformulation. The objectives are to improve retrieval effectiveness using term graphs and enhance personalized ranking using user topic modeling. Computational techniques like genetic algorithms and ant colony optimization will be used to re-rank documents and reformulate queries.
This document discusses various algorithms for multi-armed bandit problems, including k-armed bandits, action-value methods like epsilon-greedy, tracking non-stationary problems, optimistic initial values, upper confidence bound action selection, gradient bandit algorithms, contextual bandits, and Thompson sampling. The k-armed bandit problem involves choosing actions to maximize reward over time without knowing the expected reward of each action. The document outlines methods for balancing exploration of unknown actions with exploitation of the best known actions.
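The epsilon-greedy action-value method mentioned above can be sketched as follows; the Gaussian reward model, the arm means, and the parameter values are illustrative assumptions:

```python
import random

def epsilon_greedy(true_means, steps=5000, eps=0.1, seed=0):
    """Epsilon-greedy k-armed bandit: explore a random arm with probability eps,
    otherwise exploit the arm with the best estimated value."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    values = [0.0] * k                                 # incremental sample-average estimates
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(k)                       # explore
        else:
            a = max(range(k), key=values.__getitem__)  # exploit
        reward = rng.gauss(true_means[a], 1.0)         # noisy reward from the chosen arm
        counts[a] += 1
        values[a] += (reward - values[a]) / counts[a]  # running-mean update
    return values, counts

values, counts = epsilon_greedy([0.2, 0.5, 1.0])
print(max(range(3), key=values.__getitem__))  # the best arm should usually be identified
```

Setting `eps=0` gives pure exploitation (which can lock onto a bad arm), while larger `eps` spends more pulls exploring; the other methods in the deck are different answers to this same trade-off.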
The document discusses web crawlers, which are programs that download web pages to help search engines index websites. It explains that crawlers use strategies like breadth-first search and depth-first search to systematically crawl the web. The architecture of crawlers includes components like the URL frontier, DNS lookup, and parsing pages to extract links. Crawling policies determine which pages to download and when to revisit pages. Distributed crawling improves efficiency by using multiple coordinated crawlers.
The document proposes a privacy-preserving personalized web search framework called UPS. It aims to generalize user profiles for each query according to user-specified privacy requirements, while balancing personalization utility and privacy risk. Two algorithms, GreedyDP and GreedyIL, are developed to support runtime profile generalization. An online mechanism is also provided to decide whether personalizing a query would be beneficial without compromising privacy. Experiments show the effectiveness and efficiency of the UPS framework in achieving personalized search results while preserving user privacy.
The document proposes a privacy-preserving personalized web search framework called UPS. UPS can generalize user profiles for each query according to user-specified privacy requirements. It uses two metrics to evaluate privacy breach risk and query utility for hierarchical user profiles. The framework includes two algorithms for generalizing user profiles at runtime. It also provides an online mechanism for deciding whether to personalize queries to enhance search quality while avoiding unnecessary privacy exposures. Extensive experiments showed the proposed approach efficiently protects privacy during personalized web searches.
Research Interests: Their Dynamics, Structures and Applications in Personali... (Yi Zeng)
About how user interests (more specifically, the research interests of scientists) can be quantitatively analyzed and used in personalized Web search (invited talk at the Microsoft Research Asia NLC Group).
Supporting Privacy Protection in Personalized Web Search (Migrant Systems)
This document proposes a framework called UPS that aims to protect user privacy in personalized web search systems while maintaining personalization utility. The framework consists of an online profiler on the client side that generalizes user profiles for queries in real-time according to user-specified privacy requirements. Two metrics are defined to evaluate personalization utility and privacy risk for generalized profiles. Algorithms are developed to generalize profiles by optimizing these conflicting metrics. Experiments demonstrate the effectiveness and efficiency of the framework in balancing privacy protection and personalization.
Supporting privacy protection in personalized web search (Papitha Velumani)
This document proposes a personalized web search (PWS) framework called UPS that protects user privacy during search. UPS can generalize user profiles to different levels based on privacy requirements while balancing personalization utility and privacy risk. It presents two greedy algorithms, GreedyDP and GreedyIL, for runtime profile generalization and an online mechanism for deciding when to personalize queries. Experiments show UPS effectively protects privacy while maintaining personalization benefits and GreedyIL outperforms GreedyDP in efficiency.
How to bring innovation to your organization by streamlining the deployment process?
IaaS, PaaS, and Docker containers are all valid approaches that can be tailored to your needs. Each comes with its own advantages and drawbacks, and vendors and providers pit them against one another daily. Should we really impose a single standard on every team?
exoscale at the CloudStack User Group London - June 26th 2014 (Antoine COETSIER)
The document provides an overview of exoscale, a cloud computing company based in Switzerland. It summarizes that exoscale offers open cloud computing, including compute instances, object storage, and platform services to deploy applications easily. It also notes that exoscale's datacenters are located in Geneva and offer a tier 3+ infrastructure with ISO certifications for quality and security. Pricing is provided on an hourly basis for compute instances and monthly for storage.
Cloud Computing Security Frameworks - our view from exoscale - Antoine COETSIER
In this short 15-minute presentation, given at the EPFL engineering school in Lausanne in May 2014, we covered the concepts and recommendations to consider when choosing and benchmarking cloud providers with respect to security.
The Cloud Security Alliance framework is currently the best matrix for such an evaluation, as it covers the full service offered by a provider rather than a single aspect such as the datacenter or helpdesk.
Antoine Coetsier - CEO at Exoscale
Facebook has announced plans to use large solar-powered drones capable of staying airborne for months to beam internet access to people in Latin America, Africa, and Asia as early as 2017. The drones would fly at altitudes between 60,000 and 90,000 feet and could be the size of a Boeing 747 to deliver internet access from the sky without the need for ground-based infrastructure by using lasers, radio waves or other technologies to transmit internet signals. If successful, the drone tests could help provide internet access to more of the world's population.
This document outlines topics on error backpropagation training algorithms, Kohonen self-organizing maps, and Hopfield neural networks. It then lists several applications of artificial neural networks, including statistical pattern recognition, control of robotics and industrial processes, automatic synthesis of digital systems, adaptive telecommunications, image compression, radar classification, optimization problems, sentence understanding, and applying expertise to conceptual domains.
Neural networks are computing systems inspired by biological neural networks in the brain. They are composed of interconnected artificial neurons that process information using a connectionist approach. Neural networks can be used for applications like pattern recognition, classification, prediction, and filtering. They have the ability to learn from and recognize patterns in data, allowing them to perform complex tasks. Some examples of neural network applications discussed include face recognition, handwritten digit recognition, fingerprint recognition, medical diagnosis, and more.
Enhancing Information Retrieval by Personalization Techniques - veningstonk
This document outlines the research modules proposed for a PhD thesis focused on enhancing information retrieval through personalization techniques. The research will include four modules: 1) enhancing retrieval using term association graph representation, 2) integrating document and user topic models for personalization, 3) using genetic algorithms for document re-ranking, and 4) employing ant colony optimization for query reformulation. Module 1 will represent documents as a term graph and use the graph to re-rank documents based on term associations. The methodology for Module 1 includes preprocessing, frequent itemset mining to construct the term graph, and approaches for ranking documents based on semantic associations in the graph.
The document describes the Amazon Echo, a smart speaker controlled by voice commands. It has 7 microphones that use beamforming technology to hear commands from any direction. The Echo's artificial intelligence, Alexa, is always listening for a wake word and can provide information, music, news and more through voice interaction or a companion app. The Echo has advanced audio capabilities with 360 degree sound from its dual speakers. While convenient, it only supports English and requires internet access and electricity to function.
This document discusses the history and future of quantum computing. It explains how quantum computers work using principles of quantum mechanics like superposition and entanglement. Quantum computers can perform multiple computations simultaneously by exploiting the ability of qubits to exist in superposition. Current research involves building larger quantum registers with more qubits and performing calculations with 2 qubits. The future of quantum computing may enable solving certain problems much faster than classical computers, with desktop quantum computers potentially arriving within 10 years.
Autonomous Vehicles: Technologies, Economics, and Opportunities - Jeffrey Funk
National University of Singapore students presented on autonomous vehicles, including their evolution, enabling technologies like sensors and connectivity, infrastructure needs, and entrepreneurial opportunities. Key points discussed include autonomous vehicles producing large amounts of data, 5G enabling low latency required for applications, dedicated lanes and platooning potentially increasing road capacity, and autonomous vehicles reducing fuel costs, traffic, and accidents while creating new business models.
The document describes a smart note taker product that allows users to take notes by writing in the air. The notes are sensed and stored digitally. Key features include allowing blind users to write freely, and enabling instructors to write notes during presentations that are broadcast to students. It works using sensors to detect 3D writing motions, which are processed, stored, and can be viewed on a display or sent to other devices. An applet program and database are used to recognize words written in the air and print them. The smart note taker offers advantages over digital pens like ease of use and time savings.
Sensors and Data Management for Autonomous Vehicles report 2015 - Yole Developpement
Multiple sensing technologies will ensure many market opportunities for Tier 1 players, Tier 2 players, and newcomers alike
Sensor technologies are a driving force in making fully autonomous vehicles a reality. Automakers are racing to develop safe self-driving cars, but this race is a distance run more than a sprint, where multiple automation stages will imply multiple sensors. Ultrasonic sensors, radars, and multiple-camera systems are already embedded in high-end vehicles -- and within 10 years, they could also include long-range cameras, LIDAR, microbolometers, and accurate dead reckoning. These devices will work concurrently, with each technology backing up the others to provide redundancy and address safety concerns. Even though sensors are only part of the puzzle, their market opportunities are promising.
Speech recognition, also known as automatic speech recognition or computer speech recognition, allows computers to understand the human voice. It has various applications such as dictation, system control/navigation, and commercial/industrial uses. The process involves converting analog audio of speech into digital format, then using acoustic and language models to analyze the speech and output text. There are two main types: speaker-dependent, which requires training a model for each user, and speaker-independent, which can recognize any voice without training. Accuracy is improving over time as technology advances.
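The combination of acoustic and language models described above can be illustrated with a toy noisy-channel decoder. The candidate transcripts and all probabilities below are invented for the example; real recognizers score word sequences frame by frame rather than whole transcripts at once:

```python
import math

# Candidate transcripts for one ambiguous utterance, with invented scores.
# acoustic: P(audio | transcript) -- how well the audio matches the words
# language: P(transcript)         -- how plausible the word sequence is
candidates = {
    "recognize speech":   {"acoustic": 0.30, "language": 0.010},
    "wreck a nice beach": {"acoustic": 0.32, "language": 0.0001},
}

def decode(candidates):
    """Pick the transcript maximizing P(audio|words) * P(words),
    computed in log space for numerical stability."""
    return max(candidates,
               key=lambda t: math.log(candidates[t]["acoustic"])
                           + math.log(candidates[t]["language"]))

print(decode(candidates))  # the language model breaks the acoustic tie
```

Even though "wreck a nice beach" matches the audio slightly better, the far more probable word sequence wins: this trade-off between the two models is the heart of the approach.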
Quantum computing - A Compilation of Concepts - Gokul Alex
Excerpts of the Talk Delivered at the 'Bio-Inspired Computing' Workshop conducted by Department of Computational Biology and Bioinformatics, University of Kerala.
Search & Recommendation: Birds of a Feather? - Toine Bogers
In just a little over half a century, the field of information retrieval has experienced spectacular growth and success, with IR applications such as search engines becoming a billion-dollar industry in the past decades. Recommender systems have seen an even more meteoric rise to success with wide-scale application by companies like Amazon, Facebook, and Netflix. But are search and recommendation really two different fields of research that address different problems with different sets of algorithms in papers published at distinct conferences?
In my talk, I want to argue that search and recommendation are more similar than they have been treated in the past decade. By looking more closely at the tasks and problems that search and recommendation try to solve, at the algorithms used to solve these problems and at the way their performance is evaluated, I want to show that there is no clear black and white division between the two. Instead, search and recommendation are part of a much more fluid continuum of methods and techniques for information access.
(Keynote at "Mind The Gap '14" workshop at the iConference 2014 in Berlin, Germany)
Slides from Enterprise Search & Analytics Meetup @ Cisco Systems - http://www.meetup.com/Enterprise-Search-and-Analytics-Meetup/events/220742081/
Relevancy and Search Quality Analysis - By Mark David and Avi Rappoport
The Manifold Path to Search Quality
To achieve accurate search results, we must come to an understanding of the three pillars involved.
1. Understand your data
2. Understand your customers’ intent
3. Understand your search engine
The first path passes through Data Analysis and Text Processing.
The second passes through Query Processing, Log Analysis, and Result Presentation.
Everything learned from those explorations feeds into the final path of Relevancy Ranking.
Search quality is focused on end users finding what they want -- technical relevance is sometimes irrelevant! Working with the short head (very frequent queries) offers the best return on investment for improving the search experience: tuning the results, for example, to emphasize recent documents or de-emphasize archived ones, detecting near-duplicates, exposing diverse results for ambiguous queries, using synonyms, and guiding search via best bets and auto-suggest. Long-tail analysis can reveal user intent by detecting patterns, discovering related terms, and identifying the most fruitful results through aggregated behavior. All of this feeds back into regression testing, which provides reliable metrics to evaluate the changes.
By merging these insights, you can improve the quality of the search overall, in a scalable and maintainable fashion.
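The short-head analysis described above starts with simple query-log frequency counts. A minimal sketch, with an invented toy log standing in for real search engine log files:

```python
from collections import Counter

# A toy query log; in practice these lines come from search engine logs.
query_log = [
    "annual report", "annual report", "vacation policy", "annual report",
    "expense form", "vacation policy", "org chart 2015 pdf draft",
]

counts = Counter(query_log)
total = sum(counts.values())

# The "short head": the most frequent queries, which cover a large share
# of all searches and are the cheapest place to tune relevance.
for query, n in counts.most_common(3):
    print(f"{query!r}: {n} searches ({n / total:.0%} of traffic)")
```

The rare, specific queries left over (like the org chart request above) form the long tail, which is better analyzed in aggregate for patterns and related terms.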
The document summarizes the challenges in running a commercial web search engine. It discusses search engine spam, the difficulty of evaluation given the dynamic nature of the web, and provides an overview of Google's history and approach to addressing these challenges. Specifically, it notes that search engine spam is a big problem due to money that can be made from clicks, but that most search engines still provide useful results. It also explains that traditional information retrieval-based evaluation is not well-suited for web search given the massive scale and dynamic nature of the web.
This document discusses key factors to consider when evaluating a search engine, including:
1) Understanding the type of search engine (e.g. free text, directory, meta search) and its search functionality/operators.
2) Benchmarking a search engine by running sample searches and comparing results to preferred engines.
3) Analyzing how search results are ranked and algorithms are evaluated/updated.
4) Noting difficulties in evaluating search results due to ambiguity in search intents.
Personalized Search - Building a prototype to infer the user's interest - Tom Burgmans
In the world of search, understanding the intent of the user is often seen as the holy grail. When a user performs multiple search and click actions while in conversation with the search engine, this behavior reveals a piece of her/his interests. A search engine that is aware of the user’s interests can add a personal layer to its responses, and this could add a new dimension of accuracy and value to a search implementation. But what technology does it take to build it? What data is needed? How well does it really work? This presentation describes the journey to find a practical implementation of a recommendation engine. It answers all the questions above and more. We’ll guide you through the lessons learned while creating an engine that generates potentially interesting items for the user based on collaborative filtering and anomaly detection. We’ll demonstrate a prototype where even a minimal set of user actions can lead to a personalized search experience.
The document discusses semantic search capabilities at Yahoo. It describes how Yahoo has developed techniques to extract structured data and metadata from webpages to power enhanced search results. This includes information extraction, data fusion, and curating knowledge in a graph. Yahoo uses this knowledge to better understand search queries and present relevant entities and attributes in results. Semantic search remains an active area of research.
Information Discovery and Search Strategies for Evidence-Based Research - David Nzoputa Ofili
This event was on May 2, 2017 at Wesley University, Ondo State, Nigeria. I trained the university's staff (academic and non-academic) on "Information Discovery and Search Strategies for Evidence-Based Research" in an information/digital literacy session.
Keyword research tools for Search Engine Optimisation (SEO) - Duncan MacGruer
Presentation given to the University of Edinburgh web publishers community in January 2018 on the use of Keyword research tools for Search Engine Optimisation (SEO).
Introduction to Enterprise Search. A two-hour class to introduce Enterprise Search. It covers:
The problems enterprise search can solve
History of (web) search
How do we search and find?
Current state of Enterprise Search + stats
Technical concept
Information quality
Feedback cycle
Five dimensions of Findability
This document discusses recommender systems and approaches used at Netflix. It covers collaborative filtering using user-user and item-item methods, content-based recommendations using item attributes, and hybrid approaches. It provides examples of how Netflix uses collaborative filtering to generate personalized genre rows and social recommendations. Netflix combines many data sources and machine learning models to power its highly personalized recommendation engine.
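The item-item collaborative filtering mentioned above can be sketched as cosine similarity over item rating vectors. The titles and ratings below are invented for illustration, not Netflix data:

```python
import math

# Toy user-item rating matrix (users -> {item: rating}); invented data.
ratings = {
    "alice": {"House of Cards": 5, "Narcos": 4, "The Crown": 1},
    "bob":   {"House of Cards": 4, "Narcos": 5},
    "carol": {"The Crown": 5, "Narcos": 2},
}

def item_vector(item):
    """Rating vector for an item across all users (0 where unrated)."""
    return [ratings[u].get(item, 0) for u in sorted(ratings)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Item-item similarity: items rated similarly by the same users score high,
# so a fan of one can be recommended the other.
sim = cosine(item_vector("House of Cards"), item_vector("Narcos"))
print(f"similarity(House of Cards, Narcos) = {sim:.2f}")
```

User-user filtering works the same way with the matrix transposed (comparing users' rating vectors instead of items'); production systems add many more signals on top of this core idea.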
Personalized Search at Sandia National Labs - Lucidworks
Clay Pryor, R&D S&E, Computer Science & Ryan Cooper, Sandia National Labs. Presentation from ACTIVATE 2019, the Search and AI Conference hosted by Lucidworks. http://www.activate-conf.com
This document discusses various algorithms for ranking webpages, including early link-based algorithms like InDegree and HITS, as well as more advanced algorithms like PageRank. It notes that early algorithms ranked pages based solely on link analysis or relevance, but modern algorithms like PageRank take a more holistic approach, treating links as endorsements and ranking pages based on both links and relevance to provide more universally relevant results. The document also covers challenges like topic drift, spamming techniques, and difficulties with non-textual content.
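The link-as-endorsement idea behind PageRank can be sketched as a short power iteration. The three-page graph and damping factor below are standard illustration choices, not Google's production algorithm:

```python
# Minimal PageRank via power iteration on a tiny hand-made link graph.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

def pagerank(links, d=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}      # start with a uniform ranking
    for _ in range(iterations):
        new = {p: (1 - d) / n for p in pages}
        for p, outlinks in links.items():
            share = rank[p] / len(outlinks)
            for q in outlinks:              # each link passes on an equal share
                new[q] += d * share
        rank = new
    return rank

ranks = pagerank(links)
print({p: round(r, 3) for p, r in sorted(ranks.items())})
```

Page C ends up ranked highest because both A and B endorse it; this pure link score is then combined with textual relevance, as the summary notes.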
1) Standard tests for measuring search engine effectiveness have limitations and do not reflect actual user behavior.
2) A new integrated test framework was proposed that considers different query types, graded relevance, and user satisfaction measures from evaluating all elements on search engine results pages.
3) Preliminary work has begun on a large-scale user study utilizing query logs and crowdsourcing to address issues with prior effectiveness tests.
Lyft developed Amundsen, an internal metadata and data discovery platform, to help their data scientists and engineers find data more efficiently. Amundsen provides search-based and lineage-based discovery of Lyft's data resources. It uses a graph database and Elasticsearch to index metadata from various sources. While initially built using a pull model with crawlers, Amundsen is moving toward a push model where systems publish metadata to a message queue. The tool has increased data team productivity by over 30% and will soon be open sourced for other organizations to use.
The power of the modern Web, which is frequently called the Social Web or Web 2.0, is often traced to the power of users as contributors of various kinds of content through wikis, blogs, and resource-sharing sites. However, community power impacts not only the production of Web content, but also access to all kinds of Web content. A number of research groups worldwide explore what we call social information access techniques that help users get to the right information using “collective wisdom” distilled from the actions of those who worked with this information earlier.
Social information access can be formally defined as a stream of research that explores methods for organizing users' past interaction with an information system (known as explicit and implicit feedback), in order to provide better access to information to the future users of the system. It covers a range of rather different systems and technologies from social navigation to collaborative filtering. An important feature of all social information access systems is self-organization. Social information access systems are able to work with little or no involvement of human indexers, organizers, or other kinds of experts. They are truly powered by a community of users. Due to this feature, social information access technologies are frequently considered as an alternative to the traditional (content-oriented) technologies. The goal of this tutorial is to provide an overview of the emerging social information access research stream and to provide some practical guidelines for building social information access systems.
Search plays an important role in online social networks as it provides an essential mechanism for discovering members and content on the network. Related search recommendation is one of several mechanisms used for improving members’ search experience in finding relevant results to their queries. This paper describes the design, implementation, and deployment of Metaphor, the related search recommendation system on LinkedIn, a professional social networking site with over 175 million members worldwide. Metaphor builds on a number of signals and filters that capture several dimensions of relatedness across member search activity.
The system, which has been in live operation for over a year, has gone through multiple iterations and evaluation cycles. This paper makes three contributions. First, we provide a discussion of a large-scale related search recommendation system. Second, we describe a mechanism for effectively combining several signals in building a unified dataset for related search recommendations. Third, we introduce a query length model for capturing bias in recommendation click behavior. We also discuss some of the practical concerns in deploying related search recommendations.
"If I like BLANK, what else will I like?": Analyzing a Human Recommendation C... - Toine Bogers
While there have been several studies on how users experience algorithmic recommendations and their explanations, we know relatively little about human recommendations and which item aspects humans highlight when describing their own recommendation needs. A better understanding of human recommendation behavior could help us design better recommender systems that are more attuned to their users. In this paper, we take a step towards such understanding by analyzing a Reddit community dedicated to requesting and providing recommendations: /r/ifyoulikeblank. After a general analysis of the community, we provide a more detailed analysis of the prevalent music requests and the example items used to ask for these recommendations. Finally, we compare these human recommendations to algorithmic recommendations to better characterize their differences. We conclude by discussing the implications of our work for recommender systems design.
Hands-free but not Eyes-free: A Usability Evaluation of Siri while Driving - Toine Bogers
Distractions while driving are a major cause of traffic accidents, and chief among these is the use of mobile phones. Driver distractions typically fall into four categories (visual, cognitive, bio-mechanical, and auditory), and different technological solutions have been proposed to address these. Intelligent Personal Assistants (IPAs), such as Siri, are a recent example of such a technological solution that offers the potential for hands-free phone interaction through a voice-controlled interface. IPAs could potentially reduce visual and bio-mechanical distractions if they are usable enough to not increase a driver's cognitive load. We present the results of a controlled experiment with the aim of understanding how the use of Siri while driving compares to manual interaction in terms of usability and distractions. We also tested these two interaction types in the lab in order to understand how the main driving task influences Siri's (perceived) usability. Our study shows that Siri is not ready for every-day use in the car: interacting with Siri while driving is likely to be unsafe for most participants, especially less experienced drivers. Participants were distracted by Siri due to its over-reliance on visual feedback as well as frequent time-outs by Siri when waiting for a response from a driver occupied with the road environment. Speech recognition quality in a noisy car as well as problematic multi-lingual speech recognition in general are other issues that resulted in low usability and more cognitive distractions. While interacting with Siri may be hands-free, it does not provide an eyes-free and distraction-free experience yet.
(Planned paper presentation @ CHIIR 2020, Vancouver, Canada)
Link to paper: https://dl.acm.org/doi/abs/10.1145/3343413.3377962
Link to YouTube video of recorded presentation: https://youtu.be/5uR_z2R_Y6Y
“Looking for an Amazing Game I Can Relax and Sink Hours into...”: A Study of ... - Toine Bogers
This study analyzed video game requests posted on Reddit to identify what makes finding relevant games difficult. By coding over 500 game requests, the study identified 5 main relevance aspects (content, metadata, experience, interactivity, context) and 2 information need aspects that capture what users look for in games. Game requests were found to mention an average of 4.6 relevance aspects and can reflect multiple information needs. The study aims to help improve systems for discovering games by understanding the complex factors involved in video game search.
A Study of Usage and Usability of Intelligent Personal Assistants in Denmark - Toine Bogers
Intelligent personal assistants (IPA), such as Siri, Google Assistant, Alexa, and Cortana, are rapidly becoming a popular way of interacting with our smart devices. As a result, there has been a wealth of research on all aspects of IPAs in recent years, such as studies of usage of and user satisfaction with IPAs. However, the overwhelming majority of these studies have focused on English as the interaction language. In this paper, we investigate the usage and perceived usability of IPAs in Denmark. We conduct a questionnaire with 357 Danish-speaking respondents that sheds light on how IPAs are used in Denmark. We find they are only used regularly by 19.9% of respondents and that most people do not find IPAs to be reliable. We also conduct a usability study of Siri and find that Siri suffers from several issues when used in Danish: poor voice recognition, unnatural dialogue responses, and an inability to support mixed-language speech recognition. Our findings shed light on both the current state of usage and adoption of IPAs in Denmark as well as the usability of its most popular IPA in a foreign-language setting.
(Paper presentation @ iConference 2019, College Park, MD)
“What was this movie about this chick?”: A Comparative Study of Relevance Asp... - Toine Bogers
In recent decades, information retrieval research has slowly expanded its focus to address the wealth of complex search requests present in our work and leisure environments. A better understanding of such complex needs could aid in the design of more effective, domain-specific search engines. In this paper we take a first step towards such domain-specific understanding. We present an analysis of a random sample of 1000+ complex book and movie search requests posted in the LibraryThing and Internet Movie Database forums. A coding scheme was developed that captures the 29 different relevance aspects expressed in these requests. We find that while the identified relevance aspects are remarkably similar for complex book and movie requests, their relative occurrence does vary considerably from domain to domain.
(Paper presentation @ iConference 2018, Sheffield, UK)
"I just scroll through my stuff until I find it or give up": A Contextual Inq... - Toine Bogers
While ownership and usage of handheld devices such as smartphones and tablets continues to grow at a rapid pace, we do not have a complete picture of how people manage personal information on these devices. The few existing studies have typically used interview or survey methods to focus on personal information management (PIM) practices on smartphones. We present the results of an exploratory contextual inquiry study of PIM practices aimed at providing a structured, naturalistic overview of PIM on both smartphones and tablets. We find that people use multiple complementary strategies to acquire different types of information on their devices, and that people rely strongly on automatic chronological ordering instead of organization by subject, although this pays off most for smaller information collections. Deletion of information is strongly influenced by usefulness and personal attachment. Finally, we find that people strongly prefer browsing over search when retrieving information from their devices.
(Paper presentation @ CHIIR 2018, New Brunswick, NJ)
This lecture provides students with an introduction to natural language processing, with a specific focus on the basics of two applications: vector semantics and text classification.
(Lecture at the QUARTZ PhD Winter School (http://www.quartz-itn.eu/training/winter-school/) in Padua, Italy on February 12, 2018)
Defining and Supporting Narrative-driven Recommendation - Toine Bogers
The document discusses narrative-driven recommendation, where users describe their recommendation needs in a natural language narrative along with examples of past preferences. It finds that a significant percentage of recommendation requests online take this form. An analysis of book recommendation narratives found they commonly include aspects like content, engagement, familiarity and metadata. Future work is needed to better understand complex needs, extract relevant signals from narratives, and develop algorithms that can satisfy such needs.
An In-depth Analysis of Tags and Controlled Metadata for Book Search - Toine Bogers
Book search for information needs that go beyond standard bibliographic data is far from a solved problem. Such complex information needs often cover a combination of different aspects, such as specific genres or plot elements, engagement or novelty. By design, subject information in controlled vocabularies is not always adequate in covering such complex needs, and social tags have been proposed as an alternative. In this paper we present a large-scale empirical comparison and in-depth analysis of the value of controlled vocabularies and tags for book retrieval using a test collection of over 2 million book records and over 330 real-world book information needs. We find that while tags and controlled vocabulary terms provide complementary performance, tags perform better overall. However, this is not due to a popularity effect; instead, tags are better at matching the language of regular users. Finally, we perform a detailed failure analysis and show, using tags and controlled vocabulary terms, that some request types are inherently more difficult to solve than others.
(Paper presentation @ iConference 2017, Wuhan, China)
A Longitudinal Analysis of Search Engine Index Size - Toine Bogers
This document summarizes a study that estimated the index sizes of Google and Bing over a 9-year period from 2006-2015. The researchers developed a novel method to extrapolate index sizes based on hit counts for specific terms compared to a training corpus. They found Google's index peaked at 49.4 billion pages in 2011 while Bing peaked at 23 billion pages in 2014. However, index size estimates varied significantly over time, which was attributed to frequent changes in the search engines' indexing and ranking infrastructures. The study demonstrates the instability of using hit counts to estimate index sizes or for one-off webometric analyses.
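The extrapolation idea behind such estimates can be sketched roughly as follows. This is a generic word-frequency scaling with invented numbers, not the authors' exact method: if a term occurs on a known fraction of pages in a representative training corpus, the engine's reported hit count for that term can be scaled up to an estimate of the total number of indexed pages.

```python
# Invented training corpus statistics for the sketch.
corpus_size = 1_000_000                                # pages in the corpus
corpus_doc_freq = {"the": 920_000, "crawler": 1_800}   # pages containing term

def estimate_index_size(term, hit_count):
    """Scale a search engine's hit count by the term's document
    frequency in the training corpus to estimate total index size."""
    fraction = corpus_doc_freq[term] / corpus_size
    return hit_count / fraction

# Invented hit count reported by a search engine for "crawler":
print(f"estimated index size: {estimate_index_size('crawler', 90_000_000):,.0f} pages")
```

In practice, estimates from many terms are averaged, and, as the study found, they remain sensitive to changes in how engines count and report hits.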
Tagging vs. Controlled Vocabulary: Which is More Helpful for Book Search? - Toine Bogers
The popularity of social tagging has sparked a great deal of debate on whether tags could replace or improve upon professional metadata as descriptors of books and other information objects. In this paper we present a large-scale empirical comparison of the contributions of individual information elements like core bibliographic data, controlled vocabulary terms, reviews, and tags to retrieval performance. Our comparison is done using a test collection of over 2 million book records with information elements from Amazon, the British Library, the Library of Congress, and LibraryThing. We find that tags and controlled vocabulary terms do not actually outperform each other consistently, but seem to provide complementary contributions: some information needs are best addressed using controlled vocabulary terms whereas others are best addressed using tags.
(Paper presentation @ iConference 2015, Newport Beach)
Measuring System Performance in Cultural Heritage Systems - Toine Bogers
This talk presents a high-level overview of the different components of cultural heritage information systems—search, browsing, recommendation, and enrichment—and their evaluation, and the common challenges.
(Invited talk at the "Evaluating Cultural Heritage Information Systems" workshop at the iConference 2015 in Newport Beach, CA)
How 'Social' are Social News Sites? Exploring the Motivations for Using Reddi... - Toine Bogers
The document summarizes a study that explored the motivations for using the social news site Reddit.com. The researchers developed a framework of 26 motivational factors organized into 4 top-level categories: personal, social, informational, and website characteristics. They conducted a survey of Reddit users to validate the framework. The results showed the top motivating factors were related to entertainment, curiosity, and passing time. Comments from users indicated they use Reddit for fun, procrastination, and the positive feelings it provides. The framework was largely validated by the empirical survey results.
Micro-Serendipity: Meaningful Coincidences in Everyday Life Shared on Twitter - Toine Bogers
In this paper we present work on micro-serendipity: investigating everyday contexts, conditions, and attributes of serendipity as shared on Twitter. In contrast to related work, we deliberately omit a preset definition of serendipity to allow for the inclusion of micro-occurrences of what people themselves consider as meaningful coincidences in everyday life. We find that different people have different thresholds for what they consider serendipitous, revealing a serendipity continuum. We propose a distinction between background serendipity (or ‘traditional’ serendipity) and foreground serendipity (or ‘synchronicity’, unexpectedly finding something meaningful related to foreground interests). Our study confirms the presence of three key serendipity elements of unexpectedness, insight, and value, and suggests a fourth element, preoccupation (foreground problem/interest), which covers synchronicity. Finally, we find that a combination of features based on word usage, POS categories, and hashtag usage shows promise in automatically identifying tweets about serendipitous occurrences.
Benchmarking Domain-specific Expert Search using Workshop Program Committees - Toine Bogers
The document summarizes the creation of three new domain-specific test collections for evaluating expert search systems in the domains of information retrieval, semantic web, and computational linguistics. The collections were created using workshop program committees and publications from relevant conferences and journals to represent experts, documents, and topics. The collections were then benchmarked using state-of-the-art expert search approaches, finding that term extraction methods outperformed language modeling on these domain-centered collections. Future work is discussed to expand the collections and incorporate additional evidence like citations.
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...Advanced-Concepts-Team
Presentation in the Science Coffee of the Advanced Concepts Team of the European Space Agency on the 07.06.2024.
Speaker: Diego Blas (IFAE/ICREA)
Title: Gravitational wave detection with orbital motion of Moon and artificial
Abstract:
In this talk I will describe some recent ideas to find gravitational waves from supermassive black holes or of primordial origin by studying their secular effect on the orbital motion of the Moon or satellites that are laser ranged.
CLASS 12th CHEMISTRY SOLID STATE ppt (Animated)eitps1506
Description:
Dive into the fascinating realm of solid-state physics with our meticulously crafted online PowerPoint presentation. This immersive educational resource offers a comprehensive exploration of the fundamental concepts, theories, and applications within the realm of solid-state physics.
From crystalline structures to semiconductor devices, this presentation delves into the intricate principles governing the behavior of solids, providing clear explanations and illustrative examples to enhance understanding. Whether you're a student delving into the subject for the first time or a seasoned researcher seeking to deepen your knowledge, our presentation offers valuable insights and in-depth analyses to cater to various levels of expertise.
Key topics covered include:
Crystal Structures: Unravel the mysteries of crystalline arrangements and their significance in determining material properties.
Band Theory: Explore the electronic band structure of solids and understand how it influences their conductive properties.
Semiconductor Physics: Delve into the behavior of semiconductors, including doping, carrier transport, and device applications.
Magnetic Properties: Investigate the magnetic behavior of solids, including ferromagnetism, antiferromagnetism, and ferrimagnetism.
Optical Properties: Examine the interaction of light with solids, including absorption, reflection, and transmission phenomena.
With visually engaging slides, informative content, and interactive elements, our online PowerPoint presentation serves as a valuable resource for students, educators, and enthusiasts alike, facilitating a deeper understanding of the captivating world of solid-state physics. Explore the intricacies of solid-state materials and unlock the secrets behind their remarkable properties with our comprehensive presentation.
PPT on Sustainable Land Management presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' workshop on April 22, 2024.
Signatures of wave erosion in Titan’s coastsSérgio Sacani
The shorelines of Titan’s hydrocarbon seas trace flooded erosional landforms such as river valleys; however, it isunclear whether coastal erosion has subsequently altered these shorelines. Spacecraft observations and theo-retical models suggest that wind may cause waves to form on Titan’s seas, potentially driving coastal erosion,but the observational evidence of waves is indirect, and the processes affecting shoreline evolution on Titanremain unknown. No widely accepted framework exists for using shoreline morphology to quantitatively dis-cern coastal erosion mechanisms, even on Earth, where the dominant mechanisms are known. We combinelandscape evolution models with measurements of shoreline shape on Earth to characterize how differentcoastal erosion mechanisms affect shoreline morphology. Applying this framework to Titan, we find that theshorelines of Titan’s seas are most consistent with flooded landscapes that subsequently have been eroded bywaves, rather than a uniform erosional process or no coastal erosion, particularly if wave growth saturates atfetch lengths of tens of kilometers.
SDSS1335+0728: The awakening of a ∼ 106M⊙ black hole⋆Sérgio Sacani
Context. The early-type galaxy SDSS J133519.91+072807.4 (hereafter SDSS1335+0728), which had exhibited no prior optical variations during the preceding two decades, began showing significant nuclear variability in the Zwicky Transient Facility (ZTF) alert stream from December 2019 (as ZTF19acnskyy). This variability behaviour, coupled with the host-galaxy properties, suggests that SDSS1335+0728 hosts a ∼ 106M⊙ black hole (BH) that is currently in the process of ‘turning on’. Aims. We present a multi-wavelength photometric analysis and spectroscopic follow-up performed with the aim of better understanding the origin of the nuclear variations detected in SDSS1335+0728. Methods. We used archival photometry (from WISE, 2MASS, SDSS, GALEX, eROSITA) and spectroscopic data (from SDSS and LAMOST) to study the state of SDSS1335+0728 prior to December 2019, and new observations from Swift, SOAR/Goodman, VLT/X-shooter, and Keck/LRIS taken after its turn-on to characterise its current state. We analysed the variability of SDSS1335+0728 in the X-ray/UV/optical/mid-infrared range, modelled its spectral energy distribution prior to and after December 2019, and studied the evolution of its UV/optical spectra. Results. From our multi-wavelength photometric analysis, we find that: (a) since 2021, the UV flux (from Swift/UVOT observations) is four times brighter than the flux reported by GALEX in 2004; (b) since June 2022, the mid-infrared flux has risen more than two times, and the W1−W2 WISE colour has become redder; and (c) since February 2024, the source has begun showing X-ray emission. From our spectroscopic follow-up, we see that (i) the narrow emission line ratios are now consistent with a more energetic ionising continuum; (ii) broad emission lines are not detected; and (iii) the [OIII] line increased its flux ∼ 3.6 years after the first ZTF alert, which implies a relatively compact narrow-line-emitting region. Conclusions. 
We conclude that the variations observed in SDSS1335+0728 could be either explained by a ∼ 106M⊙ AGN that is just turning on or by an exotic tidal disruption event (TDE). If the former is true, SDSS1335+0728 is one of the strongest cases of an AGNobserved in the process of activating. If the latter were found to be the case, it would correspond to the longest and faintest TDE ever observed (or another class of still unknown nuclear transient). Future observations of SDSS1335+0728 are crucial to further understand its behaviour. Key words. galaxies: active– accretion, accretion discs– galaxies: individual: SDSS J133519.91+072807.4
TOPIC OF DISCUSSION: CENTRIFUGATION SLIDESHARE.pptxshubhijain836
Centrifugation is a powerful technique used in laboratories to separate components of a heterogeneous mixture based on their density. This process utilizes centrifugal force to rapidly spin samples, causing denser particles to migrate outward more quickly than lighter ones. As a result, distinct layers form within the sample tube, allowing for easy isolation and purification of target substances.
Anti-Universe And Emergent Gravity and the Dark UniverseSérgio Sacani
Recent theoretical progress indicates that spacetime and gravity emerge together from the entanglement structure of an underlying microscopic theory. These ideas are best understood in Anti-de Sitter space, where they rely on the area law for entanglement entropy. The extension to de Sitter space requires taking into account the entropy and temperature associated with the cosmological horizon. Using insights from string theory, black hole physics and quantum information theory we argue that the positive dark energy leads to a thermal volume law contribution to the entropy that overtakes the area law precisely at the cosmological horizon. Due to the competition between area and volume law entanglement the microscopic de Sitter states do not thermalise at sub-Hubble scales: they exhibit memory effects in the form of an entropy displacement caused by matter. The emergent laws of gravity contain an additional ‘dark’ gravitational force describing the ‘elastic’ response due to the entropy displacement. We derive an estimate of the strength of this extra force in terms of the baryonic mass, Newton’s constant and the Hubble acceleration scale a0 = cH0, and provide evidence for the fact that this additional ‘dark gravity force’ explains the observed phenomena in galaxies and clusters currently attributed to dark matter.
2. Outline
• Past
- What is the basic foundation of search engines?
• Present
- How do search engines personalize the results?
• Future
- What direction are we moving in?
4. Search is everywhere!
• Some statistics
- 82.6% of internet users use search engines
- 93% of online experiences begin with a search engine
- Google receives ~3.3 billion searches per day
- Since 2015 half of all searches come from mobile
- Size of Google’s index exceeds 100 million GB
- 80% of users prefer personalized search
6. Content
• 2nd generation Web search
- Early 1990s
- Examples: Lycos, AltaVista, AllTheWeb, ...
• Ranking signals
- Term frequency (TF)
‣ Term more frequent in document → more important for that document
- Inverse document frequency (IDF)
‣ Term rare across the collection → more discriminative for the documents that do contain it
- TF·IDF
‣ Combined term score of both TF and IDF
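The TF·IDF combination above can be sketched in a few lines of Python; the toy collection of tokenized documents is hypothetical:

```python
import math

def tf_idf(term, doc, docs):
    """TF·IDF: term frequent in the document (TF) and rare across
    the collection (IDF) -> high combined score."""
    tf = doc.count(term)
    df = sum(1 for d in docs if term in d)         # documents containing the term
    idf = math.log(len(docs) / df) if df else 0.0  # rarer term -> higher IDF
    return tf * idf

# Hypothetical toy collection of tokenized documents
docs = [
    ["hotels", "new", "york"],
    ["sightseeing", "new", "york", "new", "york"],
    ["oxford", "english", "dictionary"],
]

# "sightseeing" appears in only one document, so it outscores the more
# widespread "new" despite its lower term frequency
print(tf_idf("sightseeing", docs[1], docs))  # 1 * ln(3/1) ≈ 1.10
print(tf_idf("new", docs[1], docs))          # 2 * ln(3/2) ≈ 0.81
```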
8. Content-based ranking
[Figure: vector representation. Each query/document is represented as a vector over all unique words in the index; each cell holds the frequency of that term in the query/document. Example rows:
Z: 0 0 1 0 0 0 0 0 0 0 1
Y: 6 0 0 0 0 9 0 3 7 0 0
X: 8 0 4 0 0 0 2 0 0 0 3
   0 4 0 5 0 0 0 0 0 0 0]
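Given such term-frequency vectors, a content-based ranker can score each document by the cosine similarity between its vector and the query vector. A minimal sketch using the document vectors X, Y, and Z from the figure (the query vector Q is a hypothetical addition):

```python
import math

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

# Term-frequency vectors over all unique words in the index
X = [8, 0, 4, 0, 0, 0, 2, 0, 0, 0, 3]
Y = [6, 0, 0, 0, 0, 9, 0, 3, 7, 0, 0]
Z = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
Q = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # hypothetical query: terms 1 and 3

# Rank documents by similarity to the query; X matches the query terms best
vectors = {"X": X, "Y": Y, "Z": Z}
ranking = sorted(vectors, key=lambda name: cosine(Q, vectors[name]), reverse=True)
print(ranking)  # ['X', 'Z', 'Y']
```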
10. Links
• 3rd generation Web search
- Take the link structure of the Web into account
- Second half of 1990s
- Examples: Google (PageRank), Ask! (HITS)
• Ranking signals
- Website popularity
‣ More incoming links → higher popularity
‣ More incoming links from popular pages → higher popularity
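The "more incoming links from popular pages → higher popularity" principle is what PageRank formalizes. A minimal power-iteration sketch over a hypothetical three-page web:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank sketch: a page is popular if popular
    pages link to it. `links` maps each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outgoing in links.items():
            if outgoing:                       # distribute rank over out-links
                share = rank[p] / len(outgoing)
                for q in outgoing:
                    new[q] += damping * share
            else:                              # dangling page: spread evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Hypothetical toy web: every page links to C, so C attracts the most rank
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
print(max(ranks, key=ranks.get))  # C
```

Personalized PageRank follows the same recipe but biases the `(1 - damping)` teleport mass toward pages the individual user prefers.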
13. Personalization
• Definition
- Providing search results tailored to the individual user
• History
- 1998: Yahoo! MyWeb
- 2004: Google introduces personalized search
- 2007: iGoogle
14. Personalization
• Pros & cons
+ Saves time by reducing the number of results to inspect
+ Better decision making by filtering out inferior information
– Filter bubble (as much a personal decision as an algorithmic restriction)
– Users as products (using search history for advertising)
16. Personal
• Information about the user him/herself
• Ranking signals
- Language
‣ Language preferences can be used to filter out results
- Demographics
‣ Known (e.g., from a Google+ profile) or predicted → can be used for re-ranking results
‣ Results selected by other users from similar cohorts can be ranked higher
[Diagram: combined score = original relevance score + % of times the result was selected by demographically similar users]
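This combination of the original relevance score with the selection behaviour of demographically similar users can be sketched as a weighted sum; the weight and the example numbers are hypothetical:

```python
def combined_score(relevance, cohort_rate, weight=0.3):
    """Blend the original relevance score with the fraction of
    demographically similar users who selected this result; the
    weight is a hypothetical tuning parameter."""
    return (1 - weight) * relevance + weight * cohort_rate

# Hypothetical results: (original relevance, cohort selection rate)
results = {"page_a": (0.80, 0.05), "page_b": (0.75, 0.60)}

# page_b overtakes page_a once cohort behaviour is taken into account
reranked = sorted(results, key=lambda p: combined_score(*results[p]), reverse=True)
print(reranked)  # ['page_b', 'page_a']
```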
17. Social
• Information about a user’s social network
• Ranking signals
- Social network connections
‣ Results selected by friends for similar searches could be given more weight
‣ Web pages shared by friends could be given more weight
[Diagram: combined score = original relevance score + % of times the result was selected by friends + whether it was shared by friends]
18. Activity: Query logs
• Information about the queries submitted by the user and
other users in the past
• Ranking signals
- Query suggestion
‣ Other users entered queries A and B in the same session → B might be a good suggestion for a user entering query A
19. Activity: Query suggestion
Session 1 (john)
1. hotels New York
2. hotels Manhattan
3. affordable hotels Manhattan
4. sightseeing New York
5. One World Trade Center
Session 2 (mary)
1. oed
2. oxford english dictionary
Session 3 (jane)
1. youtube drumpf john oliver
Session 4 (bob)
1. oed
2. oxford english dictionary
Session 5 (alice)
1. sights New York
2. sightseeing New York
3. Brooklyn Bridge
4. One World Trade Center
Suggestions derived from these sessions:
oed → oxford english dictionary
sightseeing New York → One World Trade Center
sightseeing New York → Brooklyn Bridge
Ranking principle: queries are similar if they have been issued in the same session.
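This session-based ranking principle can be sketched as a simple co-occurrence count over the query sessions above:

```python
from collections import Counter

def suggestions(sessions, query, top=2):
    """Queries are similar if they were issued in the same session:
    count how often other queries co-occur with `query` and return
    the most frequent co-occurring queries."""
    co = Counter()
    for session in sessions:
        if query in session:
            co.update(q for q in session if q != query)
    return [q for q, _ in co.most_common(top)]

# The query sessions from the slide
sessions = [
    ["hotels New York", "hotels Manhattan", "affordable hotels Manhattan",
     "sightseeing New York", "One World Trade Center"],
    ["oed", "oxford english dictionary"],
    ["youtube drumpf john oliver"],
    ["oed", "oxford english dictionary"],
    ["sights New York", "sightseeing New York", "Brooklyn Bridge",
     "One World Trade Center"],
]

print(suggestions(sessions, "oed"))                   # ['oxford english dictionary']
print(suggestions(sessions, "sightseeing New York"))  # 'One World Trade Center' first
```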
20. Activity: Query logs
• Information about the queries submitted by the user and
other users in the past
• Applications
- Query suggestion
‣ Other users entered queries A and B in the same session → B might be a good suggestion for a user entering query A
- Spelling correction
‣ Immediately after query X, other users entered query Y → Y might be the correct version of query X
21. Activity: Browse logs
• Information about the results clicked on by the user and
other users in the past
• Ranking signals
- Similar results in the same session
- Similar results in the same user browsing history
Session 1: sightseeing New York
1. http://www.nycgo.com
2. http://www.lonelyplanet.com/new-york
3. http://www.citypass.com/new-york
4. https://oneworldobservatory.com/
5. http://www.esbnyc.com/
Session 2: sightseeing New York
1. http://www.lonelyplanet.com/new-york
Results clicked in the similar session:
https://oneworldobservatory.com/
http://www.esbnyc.com/
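A browse-log sketch of the same idea: for a repeated query, results clicked in earlier, similar sessions can be surfaced to the new user (the helper function and its name are hypothetical):

```python
def related_results(browse_logs, query, already_clicked):
    """For a repeated query, surface results that earlier users clicked
    for the same query but that this user has not clicked yet."""
    related = []
    for logged_query, clicks in browse_logs:
        if logged_query == query:
            for url in clicks:
                if url not in already_clicked and url not in related:
                    related.append(url)
    return related

# Browse log from Session 1: (query, results clicked)
browse_logs = [
    ("sightseeing New York", [
        "http://www.nycgo.com",
        "http://www.lonelyplanet.com/new-york",
        "http://www.citypass.com/new-york",
        "https://oneworldobservatory.com/",
        "http://www.esbnyc.com/",
    ]),
]

# Session 2: same query, but this user has only clicked Lonely Planet so far
related = related_results(browse_logs, "sightseeing New York",
                          {"http://www.lonelyplanet.com/new-york"})
print(related)
```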
22. Context
• Information about the context in which the search is performed
• Ranking signals
- Location
‣ Used to prioritize locally relevant results
‣ Essential for mobile search
- Device
‣ Has the page been optimized for the user’s current device?
- Date & time
‣ Seasonal influences, home vs. work, ...
- ...
23. Learning to rank
• Learning the optimal combination of all ranking signals
- Goal: to do this continuously and automatically using machine learning
‣ Predict for each query-result pair whether the result is relevant for that user’s
query at this specific time
• Machine learning is the science of teaching a computer how
to perform a task without explicitly programming it
- Detect common patterns in the data
‣ Our data → different ranking signals related to query and document
- Associate those patterns with specific outcomes
‣ Our outcomes → overall relevance score
- The more examples for the computer, the better!
24. Learning to rank
[Figure: example 1, ranking signal vector]
Document signals: similarity with query vector (sample value: 0.904), recency, readability score, language, spam score
Query signals: type of information need, entities (company, person), trending topic?
Personal signals: preferred language?, selected by demographically similar users
Links signals: PageRank, Personalized PageRank, TrustRank
25. Learning to rank
[Figure: example 1, ranking signal vector with relevance label ✓, continuing the Document, Query, Personal, and Links signals]
Social signals: selected by friends, shared by friends
Activity signals: selected by similar users, selected for related queries
Context signals: optimized for current device?, related to current location, related to current date/time
26. Learning to rank
Example | Ranking signal vector | Relevance
1 | ... | ✓
2 | ... | ✗
3 | ... | ✗
4 | ... | ✗
5 | ... | ✓
6 | ... | ✗
3.3 billion examples per day!
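Learning the optimal combination of signals from such labelled examples can be sketched as pointwise learning to rank: a logistic regression that predicts relevance per (query, result) pair. All signal vectors and labels below are hypothetical:

```python
import math

def train_pointwise_ltr(examples, labels, epochs=500, lr=0.5):
    """Learn one weight per ranking signal with plain logistic
    regression, trained by stochastic gradient descent."""
    weights = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            z = sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))       # predicted relevance
            weights = [w + lr * (y - p) * xi for w, xi in zip(weights, x)]
    return weights

def score(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

# Hypothetical signal vectors: [content similarity, PageRank, selected
# by similar users], with relevance labels (1 = relevant, 0 = not)
X = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.0], [0.8, 0.6, 0.9],
     [0.1, 0.3, 0.1], [0.7, 0.9, 0.8], [0.3, 0.2, 0.2]]
y = [1, 0, 1, 0, 1, 0]

weights = train_pointwise_ltr(X, y)
# A result with strong signals should outscore one with weak signals
print(score(weights, [0.9, 0.7, 0.8]) > score(weights, [0.2, 0.2, 0.1]))  # True
```

At web scale the same idea runs over billions of examples per day with far richer models, but the loop (signals in, relevance out) is the same.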
27. Personalization in academic search
• What ranking signals are available in academic search?
Content
‣ Publications, teaching materials, supervised theses, homepages, grants, ...
Links
‣ Citation networks, ...
Personal
‣ LinkedIn endorsements, expertise areas, ...
Social
‣ LinkedIn, Academia.edu, ResearchGate, Mendeley, CiteULike, ...
28. Personalization in academic search
Activity
‣ Teaching, supervision, organization, service to the profession, ...
Context
‣ Research vs. teaching, active project, previously read, ...
30. Task-awareness
• Search is rarely a goal in itself → often associated with the
completion of a larger task
- Tasks are complex, involving a nontrivial sequence of steps
- Tasks are knowledge-intensive, requiring access to and manipulation of
large quantities of information
- Example: Planning a family vacation
• Awareness of the background task is essential to take
personalization to the next level
- Detecting & supporting multiple search strategies
- Supporting filtering, sorting, and aggregating of results