This document discusses balancing exploration and exploitation when learning ranking functions online from user interactions. It presents a method that uses dueling bandit gradient descent with k-greedy exploration to compare document lists. Experiments with simulated clicks show that balancing exploration and exploitation improves online performance across all click models and datasets tested, with the best results at an exploration rate of two exploratory documents per result list. Future work includes validating the simulation assumptions, evaluating on click logs, and developing new algorithms that balance exploration and exploitation for online learning to rank.
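The exploration scheme in the summary above can be sketched as follows: mix an exploitative ranking with k documents drawn from an exploratory ranking. The function name and the simple slot-injection policy are illustrative assumptions, not the paper's exact interleaving procedure.

```python
import random

def k_greedy_list(exploit_ranking, explore_ranking, k, list_len, seed=None):
    """Build a result list that is mostly exploitative but contains up to
    k exploratory documents (a sketch of the k-greedy idea, not the
    paper's exact procedure)."""
    rng = random.Random(seed)
    # choose k slots in the list to fill from the exploratory ranking
    slots = set(rng.sample(range(list_len), k))
    result, used = [], set()
    exploit_iter = iter(exploit_ranking)
    explore_iter = iter(explore_ranking)
    for pos in range(list_len):
        source = explore_iter if pos in slots else exploit_iter
        for doc in source:
            if doc not in used:      # skip documents already placed
                used.add(doc)
                result.append(doc)
                break
    return result
```

With k=2 and a list length of four, two positions come from the exploratory ranker and the rest from the exploitative one, matching the best-performing setting reported above.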
Characterising the Emergent Semantics in Twitter Lists, Oscar Corcho
This document summarizes research analyzing the emergent semantics of lists and list names on Twitter. The researchers investigated whether related keywords can be identified from list names according to how they are used by different user roles (curators, subscribers, members). They used a dataset of over 297,000 lists to extract keywords from list names and model their relationships based on these user roles. Their experiments analyzed the semantics of related keyword pairs using techniques like WordNet searches and found that relationships identified based on members had the highest percentage of direct semantic relations like synonyms.
DynaLearn: Problem-based learning supported by semantic techniques, Oscar Corcho
This document describes a system that supports problem-based learning through semantic techniques. The system grounds learner models in semantic repositories to enable semantic-based feedback. It analyzes learner models and reference models to identify discrepancies in terminology, taxonomy, and qualitative reasoning structures. Suggestions are generated and filtered based on agreement across multiple reference models. The system aims to bridge gaps between learner and expert terminology and provide automated feedback to support the learning process.
An energy audit involves verifying, monitoring, and analyzing an organization's energy usage to identify opportunities to improve energy efficiency. There are two main types of energy audits: preliminary audits provide a quick overview of energy consumption and potential savings areas, while detailed audits provide a comprehensive evaluation of all major energy systems to accurately estimate savings. The methodology for a detailed audit involves creating an energy balance by inventorying energy usage and operating conditions, then identifying and calculating potential savings from projects.
Pump and cooling tower energy performance, maulik610
This document provides an overview of pumps and cooling towers used in industrial applications. It discusses the main components, types, and operating characteristics of pumps, including centrifugal pumps which account for 75% of installed pumps. The document also examines how to assess pump performance by calculating parameters like pump shaft power and hydraulic power. For cooling towers, it outlines the components and types, and explains how to evaluate cooling tower performance using metrics such as range, approach, effectiveness, cooling capacity, and evaporation loss. The document concludes by identifying opportunities to improve the energy efficiency of pumps and cooling towers through equipment selection and optimization.
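The performance parameters mentioned above can be computed from standard formulas: hydraulic power from flow, head, and fluid density; pump efficiency as the ratio of hydraulic to shaft power; and cooling tower range, approach, and effectiveness from water and wet-bulb temperatures. A minimal sketch, assuming water at 1000 kg/m³ and metric units:

```python
def hydraulic_power_kw(flow_m3_s, head_m, density=1000.0, g=9.81):
    """Hydraulic power delivered to the fluid, in kW."""
    return density * g * flow_m3_s * head_m / 1000.0

def pump_efficiency(hydraulic_kw, shaft_kw):
    """Pump efficiency = hydraulic power / measured shaft power."""
    return hydraulic_kw / shaft_kw

def cooling_tower_metrics(t_hot_in, t_cold_out, t_wet_bulb):
    """Range, approach and effectiveness of a cooling tower (deg C).

    range         = hot water in - cold water out
    approach      = cold water out - ambient wet-bulb temperature
    effectiveness = range / (range + approach)
    """
    rng = t_hot_in - t_cold_out
    approach = t_cold_out - t_wet_bulb
    effectiveness = rng / (rng + approach)
    return rng, approach, effectiveness
```

For example, 0.05 m³/s pumped against a 40 m head gives 19.62 kW of hydraulic power; if the shaft draws 28 kW, the pump runs at roughly 70% efficiency.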
This document contains 25 quotes from Steve Jobs on a variety of topics. Some of the key themes that emerge are Jobs' focus on excellence and innovation, his belief that quality should take priority over quantity, and his vision that technology could be used to change people's lives. He also expressed confidence in Apple's future leadership and his ongoing connection to the company even if he wasn't present at all times.
RCOMM 2011 - Sentiment Classification with RapidMiner, bohanairl
This document summarizes a presentation on sentiment classification using supervised machine learning approaches and RapidMiner. It discusses how sentiment analysis can be used for search, recommendations, market research and ad placement. A case study is described that uses RapidMiner to classify movie reviews from IMDB as positive or negative based on word vectors. Additional features like part-of-speech tags, sentiment lexicons, and document statistics are shown to improve accuracy from 85% to 86%.
The document discusses a lightning talk presentation on social learning analytics. It includes an agenda with topics like visualizing social ties in SocialLearn by topic and type, visualizing social learning in the SocialLearn environment, and prototyping learning power modeling in SocialLearn. There are also brief biographies of several presenters.
This document provides an overview of link mining and collective classification algorithms. It discusses how link mining can be used for tasks like node labeling, link prediction, entity resolution, and group detection on graph-structured data. It presents relational classifiers and collective classification as two common link mining algorithms. Relational classifiers extend traditional classifiers by incorporating relational features between linked nodes, while collective classification iteratively propagates predictions between linked nodes. The document provides examples of how these algorithms have been applied to problems like predicting ad click-through rates and friendships. It also discusses entity resolution and how relational clustering algorithms can leverage links between entities to improve resolution.
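The collective classification idea described above can be sketched as an iterative loop in which each unlabeled node repeatedly adopts the majority label of its neighbours. This is a minimal label-propagation sketch on a toy adjacency-list graph, not a full relational classifier:

```python
from collections import Counter

def iterative_classification(graph, labels, iterations=10):
    """Toy collective classification: unlabeled nodes repeatedly take
    the majority label of their neighbours; seed labels stay fixed."""
    current = dict(labels)               # known (seed) labels
    for _ in range(iterations):
        updated = dict(current)
        for node, neighbours in graph.items():
            if node in labels:           # never overwrite seed labels
                continue
            votes = Counter(current[n] for n in neighbours if n in current)
            if votes:
                updated[node] = votes.most_common(1)[0][0]
        current = updated                # synchronous update per round
    return current
```

Predictions propagate outward from the seeds: a node whose neighbours are all unlabeled in round one can still be labeled in a later round once those neighbours have received labels.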
This document summarizes an approach to generate semantic user profiles from informal communication exchanges like emails, chats and meeting records. It extracts keywords, named entities and concepts from the communications to build user profiles and measure similarity between users. The profiles are used for information retrieval, recommender systems and visualizing interaction networks. An experiment on a university mailing list showed profiles based on concepts best correlated with human judgements of user similarity. Future work could involve long-term trials in organizations and linking profiles to external linked data.
SemEval - Aspect Based Sentiment Analysis, Aditya Joshi
SemEval is an ongoing series of evaluations of computational semantic analysis systems that evolved from the Senseval word sense disambiguation evaluations. SemEval 2014 included several tasks, including aspect-based sentiment analysis (Task 4), which had four subtasks: (1) aspect term extraction, (2) aspect term polarity classification, (3) aspect category detection, and (4) aspect category polarity classification. The top-performing system for this task used a semi-Markov tagger for aspect term extraction and SVMs trained on lexical, syntactic, and semantic features for the other subtasks.
This document provides an overview of recommender systems. It discusses several key points:
1. Recommender systems use collaborative filtering, content-based filtering, or knowledge-based techniques to predict items users may like based on their preferences.
2. Collaborative filtering finds users with similar tastes and recommends items liked by similar users. It can be memory-based or model-based.
3. Content-based filtering recommends additional similar items to those a user has liked based on item characteristics.
4. The document also discusses challenges like data sparsity and cold start problems faced by recommender systems.
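The memory-based collaborative filtering described in point 2 can be sketched in a few lines: compute similarity between users' rating vectors, then predict an unseen rating as a similarity-weighted average over users who rated the item. Cosine similarity on sparse rating dicts is one common choice (an illustrative sketch, not the only formulation):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(u) & set(v)
    num = sum(u[i] * v[i] for i in common)
    den = (math.sqrt(sum(x * x for x in u.values()))
           * math.sqrt(sum(x * x for x in v.values())))
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Predict user's rating for item as a similarity-weighted average
    over the other users who have rated that item."""
    sims = [(cosine(ratings[user], ratings[o]), ratings[o][item])
            for o in ratings if o != user and item in ratings[o]]
    total = sum(s for s, _ in sims)
    return sum(s * r for s, r in sims) / total if total else None
```

The data sparsity and cold start problems from point 4 show up directly here: `predict` returns `None` when no similar user has rated the item.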
A brief description of the Opinion-Based Entity Ranking paper published in the Information Retrieval Journal, Volume 15, Number 2, 2012.
Slides by Kavita Ganesan.
Extracting Semantic User Networks from Informal Communication Exchanges, Suvodeep Mazumdar
This document summarizes an approach to generate semantic user profiles from informal communication exchanges like emails, meetings, and chats. It extracts keywords, named entities, and concepts from communications to represent user profiles. Similarities between user profiles are then calculated to infer relationships. An experiment on email data found profiles based on concepts best correlated with human judgments of user similarity, outperforming profiles from keywords and entities alone. Future work involves applying the approach to organizations and connecting profiles to linked open data.
The document discusses how the Common Core State Standards (CCSS) emphasize higher-level thinking skills and preparation for new online skills needed in the 21st century. It notes that the CCSS blend the new literacies of online research and comprehension into standards at every grade level. However, no state currently measures students' ability to perform online reading comprehension skills like evaluating online information. The document suggests that failing to address these new literacies may disadvantage students without access to technology outside of school.
Intelligent Tutoring Systems: The DynaLearn Approach, Wouter Beek
The document describes the DynaLearn approach to developing intelligent tutoring systems. It focuses on using conceptual modeling to help students construct knowledge about systems. Students build qualitative models and receive feedback to improve their understanding. The approach includes several interactive learning spaces to provide guidance, diagnosis of errors, and engagement through virtual characters. The goal is to develop an environment that supports open-ended conceptual modeling to address declines in science education.
Lak12 - Leeds - Deriving Group Profiles from Social Media, lydia-lau
1. The presentation discusses deriving group profiles from social media data to help design simulated learning environments.
2. An experimental study combined semantics and machine learning to profile groups based on their digital traces from a job interview domain.
3. Preliminary results found the group profiles could help training professionals identify learning needs, and domain concepts could augment learner models. However, improving profile quality and demographic data accuracy requires further work.
This document summarizes a project to identify fake reviews in the Yelp dataset for New York City restaurants. It describes the dataset, containing over 350,000 reviews labeled as true or fake. Preprocessing steps included merging datasets, handling missing values, and text processing to remove stopwords and stem words. Behavioral and text features were extracted, including sentiment scores, review length, and the number of capitalized words. Classification methods such as logistic regression, naive Bayes, and KNN were applied and their results presented. References to related work on detecting fake reviews on Yelp were also provided.
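The text-feature extraction step mentioned above can be sketched as a small function that maps a review to a feature dict. The specific features here (length, all-caps words, exclamation marks, average word length) are a hypothetical subset in the spirit of the project, not its exact feature set:

```python
def review_features(text):
    """Simple text features for fake-review classification
    (an illustrative feature set, not the project's exact one)."""
    words = text.split()
    return {
        "length": len(words),                     # review length in words
        "capital_words": sum(1 for w in words
                             if w.isupper() and len(w) > 1),
        "exclamations": text.count("!"),
        "avg_word_len": (sum(len(w) for w in words) / len(words)
                         if words else 0.0),
    }
```

Feature dicts like this can be vectorized and fed to any of the classifiers listed above (logistic regression, naive Bayes, KNN).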
This document summarizes tag-based recommenders and social tagging systems. It discusses:
1) Social tagging systems allow users to collaboratively tag and categorize content. Popular social tagging sites include Delicious, Flickr, YouTube, etc. Tagging systems have features like tag sharing and selection.
2) Tag recommenders aim to encourage tagging and reuse of common tags. Recommender techniques discussed include most popular, collaborative filtering, tensor factorization, and graph-based methods.
3) The document presents the speaker's work on tag-based collaborative filtering which improves neighbor selection by considering tag semantic similarity between users. Their IUI 2008 paper shows their tag-based approach improves recommendation performance over traditional collaborative filtering.
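The neighbour-selection step in point 3 can be sketched by scoring users on the overlap of their tag vocabularies and keeping the top k. Plain Jaccard overlap stands in here for the paper's semantic similarity measure (an assumption for illustration):

```python
def tag_jaccard(tags_u, tags_v):
    """Jaccard similarity between two users' tag sets."""
    u, v = set(tags_u), set(tags_v)
    return len(u & v) / len(u | v) if u | v else 0.0

def neighbours_by_tags(user_tags, user, k=2):
    """Pick the k users whose tag vocabulary is most similar to the
    target user's (sketch of tag-based neighbour selection)."""
    scores = [(tag_jaccard(user_tags[user], tags), other)
              for other, tags in user_tags.items() if other != user]
    return [other for _, other in sorted(scores, reverse=True)[:k]]
```

The resulting neighbourhood then replaces the rating-overlap neighbourhood in an otherwise standard collaborative filtering pipeline.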
The document provides 8 lessons learned from deploying a content discovery solution at Orange in France.
Lesson 1 is to have a dedicated group of real users for testing. Lesson 2 is that avoiding bad recommendations is more important than getting perfect ones. Lesson 3 is that using multiple recommendation engines helps overcome filter bubbles. Lesson 4 is that collaborative filtering is biased towards popularity while users prefer novelty. Lesson 5 is that changing factors like language can impact recommendations. Lesson 6 addresses dealing with cold starts for users, content, and systems. Lesson 7 is that laziness often wins over more complex solutions. Lesson 8 emphasizes that privacy matters in how profiles and recommendations are handled.
Multiple Methods and Techniques in Analyzing Computer-Supported Collaborative..., CITE
5 March 2010 (Friday) | 09:00 - 12:30 | http://citers2010.cite.hku.hk/abstract/69 | Dr. Kwok Ping CHAN, Associate Professor, Department of Computer Science, HKU
CBL - Creating an iOS App in the Classroom, Douglas Kiang
Here are the key steps I took to resolve the issue:
1. I reviewed the error message closely to understand what was causing the problem
2. I searched online to see if others had similar issues and what solutions worked for them
3. I checked that I had the latest version of the software in case it was a known bug
4. I simplified the code to isolate the issue - commenting out sections until it worked
5. Once I identified the problematic line, I debugged it step-by-step to find the error
6. I used print statements to check variable values and trace program flow
7. I got help from classmates by sharing my code and asking them to review it
8. As a last
A lot of people talk about Data Mining, Machine Learning and Big Data. It clearly must be important, right?
A lot of people are also trying to sell you snake oil - sometimes half-arsed and overpriced products or solutions promising a world of insight into your customers or users if you hand over your data to them. Instead, understanding your own data and what you could do with it should be the first thing you look at.
In this talk, we'll introduce some basic terminology around data and text mining as well as machine learning, and we'll look at what you can do on your own to understand more about your data and discover patterns in it.
Increasing Social Media ROI Using Gladwell's Tipping Point Framework, Colleen Carrington
Inspired by some of the brightest thought-leaders in social media, this deck explores how to increase social media ROI using Gladwell's tipping point framework: the right people, a sticky idea, the right context. It is designed for online viewing without having to be presented in person. Enjoy!
A self training framework for exploratory discourse detection final, Zhongyu Wei
The document describes a self-training framework for detecting exploratory discourse in online conversations. It involves initially training a classifier on a small set of annotated data, then using the classifier to annotate additional unlabeled data and adding it to the training set. This allows the classifier to be retrained and improved without requiring manual annotation of large amounts of data. The framework is evaluated on chat data from an Open University conference, and a feature-based self-training approach is shown to improve performance over supervised classifiers and other baselines. Applications for visualizing discourse and participation are also discussed.
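The self-training loop described above can be sketched generically: train on the labeled set, pseudo-label the unlabeled items the model is confident about, add them to the training set, and repeat. The function signature and the confidence threshold are illustrative assumptions, not the paper's exact setup:

```python
def self_train(train_fn, labeled, unlabeled, rounds=3, threshold=0.8):
    """Generic self-training loop.

    train_fn(labeled) must return a predictor x -> (label, confidence).
    Confidently predicted items are pseudo-labeled and moved into the
    training set; the rest stay in the pool for the next round.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        model = train_fn(labeled)
        keep = []
        for x in pool:
            label, conf = model(x)
            if conf >= threshold:
                labeled.append((x, label))   # pseudo-label and add
            else:
                keep.append(x)
        pool = keep
        if not pool:                         # nothing left to label
            break
    return labeled, pool
```

Any supervised classifier can be plugged in as `train_fn`; the paper's feature-based variant corresponds to retraining with additional features over the growing pseudo-labeled set.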
Metaphors as design points for collaboration 2012, KM Chicago
The document discusses optimization cycles for improving collaboration and search practices, noting that multiple factors should be maintained in constant ratios to achieve predictable outcomes in key metrics, and experiments allow measuring these figures of merit to identify tradeoffs and potential improvements. It provides examples of how aspects of metadata and measuring effective precision can facilitate collaboration.
How to teach robots to test web interfaces. Artem Eroshenko, Ilya Katsev, Ya...yaevents
Artem Eroshenko, Yandex
Graduated from the Faculty of Mathematics and Mechanics of Saint Petersburg State University; currently a third-year PhD student in Control Theory. Since 2008 he has worked on test automation for search results and search-related services at Yandex. Since 2011 he has coordinated the test tools development group.
Ilya Katsev, Yandex
Graduated from the Faculty of Mathematics and Mechanics of Saint Petersburg State University and defended a PhD dissertation in game theory at VU University Amsterdam (Netherlands). At Yandex he works on test automation (simulating user actions and analyzing the results).
Talk topic
How to teach robots to test web interfaces.
Abstract
The talk is about a tool that checks web interfaces for errors on its own. Its key quality is the ability to automatically discover related elements on a page and build models that can then be tested automatically. We will not only propose ideas for using and extending this system, but also demonstrate a prototype.
Similar to Adapting Rankers Online, Maarten de Rijke (20)
How to teach robots to test web interfaces. Artem Eroshenko, Ilya Katsev, Ya...yaevents
Artem Eroshenko, Yandex
Graduated from the Faculty of Mathematics and Mechanics at Saint Petersburg State University; currently a third-year postgraduate student in Control Theory. Since 2008 he has worked at Yandex on test automation for search results and search-related services. Since 2011 he has coordinated the test-tooling development group.
Ilya Katsev, Yandex
Graduated from the Faculty of Mathematics and Mechanics at Saint Petersburg State University and defended a PhD thesis in game theory at VU University Amsterdam (Netherlands). At Yandex he works on test automation (simulating user actions and analyzing the results).
Presentation topic:
How to teach robots to test web interfaces.
Key points:
The talk presents a tool that checks web interfaces for errors on its own. Its key property is the ability to automatically discover related elements on a page and build models that can then be tested automatically. We will not only propose ideas for using and extending this system, but also demonstrate its prototype.
Building compound blocks in the bemhtml template engine. Sergey Berezhnoy, Ya...yaevents
Sergey Berezhnoy, Yandex
A web developer at Yandex since 2005. Over that time he has contributed to a number of services, including Blog Search, Ya.ru, Yandex.Mail, Search, Images, and Video. Beyond public-facing projects, he actively develops internal tools covering the full site-building cycle. Above all he loves his wife and programming.
Presentation topic:
Building compound blocks in the bemhtml template engine.
Key points:
bemhtml is a domain-specific template engine for writing block templates according to the BEM methodology. Compilation produces fast plain-JavaScript templates that can run on both the server and the client. The technology is used in the bem-bl block library and on several Yandex services. This master class demonstrates one of bemhtml's advantages: building compound blocks. You will learn the idea and syntax of the template engine, get ready-made recipes for common tasks, and see an analysis of bemhtml's capabilities.
i-bem.js: JavaScript in BEM terms. Elena Glukhova, Varvara Stepanova, Yandexyaevents
Elena Glukhova, Yandex
Markup and web interface developer. At Yandex since 2008.
Varvara Stepanova, Yandex
Graduated from Petrozavodsk State University. An interface developer at Yandex since 2008, she has worked on Yandex.Answers and Yandex.Fotki. For the past year and a half, Elena Glukhova and Varvara Stepanova have worked together on an internal interface framework that helps build Yandex services consistently. Lately they have also been developing a similar open-source interface framework.
Presentation topic:
i-bem.js: JavaScript in BEM terms.
Key points:
When building sites with the BEM methodology, we use a single domain vocabulary across all technologies: CSS, templates, and JavaScript. To make this possible, the bem-bl block library implements the core of a client-side JS framework that lets you work with a page in BEM terms, one abstraction level above the DOM. This master class shows the key points of using this approach to write client-side JS. We build a compound block that reuses the JS functionality of the small blocks it contains. The result: everything works, with no copy-paste.
A house from ready-made bricks. A block library, tuning, and tools. Elena Glukho...yaevents
Elena Glukhova, Yandex
Markup and web interface developer. At Yandex since 2008.
Varvara Stepanova, Yandex
Graduated from Petrozavodsk State University. An interface developer at Yandex since 2008, she has worked on Yandex.Answers and Yandex.Fotki. For the past year and a half, Elena Glukhova and Varvara Stepanova have worked together on an internal interface framework that helps build Yandex services consistently. Lately they have also been developing a similar open-source interface framework.
Presentation topic:
A house from ready-made bricks. A block library, tuning, and tools.
Key points:
All websites resemble each other a little. After years of web development, you accumulate practices and standard solutions to common tasks. The result of our accumulation is the open-source block library bem-bl, which we develop on GitHub. The library follows the BEM methodology and lets you build web pages from blocks that already ship with template, CSS, and JS implementations. The master class demonstrates how to use ready-made blocks from the library and how to adapt them to your site's needs. The bem-tools console utilities are used for working with the library's files.
Models in professional software engineering and testing. Alexander Petren...yaevents
Alexander Petrenko, ISP RAS
Professor, Doctor of Physical and Mathematical Sciences, head of the Software Engineering Technologies department at the Institute for System Programming of the Russian Academy of Sciences (ISP RAS), and professor at the CMC faculty of Moscow State University. His main work covers requirements formalization and test generation from formalized requirements and formal models (model-based testing, MBT). Applications include testing operating systems and distributed systems, compiler testing, microprocessor design verification, and formalizing standards for operating-system APIs and telecommunication protocols. Co-chair of the organizing committees of the International MBT workshop (http://www.mbrworkshop.org/), the Spring Young Researchers Colloquium on Software Engineering, SYRCoSE (http://syrocose.ispras.ru), and the city seminar on program development and analysis technologies TRAP/SDAT (http://sdat.ispras.ru/).
Presentation topic:
Models in professional software engineering and testing.
Key points:
Model-Based Software Engineering (MBSE) extends the model-driven approach to software development. Unlike, say, MDA (Model Driven Architecture), MBSE pays substantial attention not only to design and code development proper, but also to other lifecycle phases: requirements analysis, verification and validation, and requirements management throughout the lifecycle. Model-Based Testing (MBT) appeared well before MBSE and MDA, but its place in software development only fully emerged with the rise of MBSE, so MBT and MBSE should be considered together. The talk covers the MBSE-MDA-MBT concepts, the main sources and kinds of models used in these approaches, model-based test generation methods, and well-known tools for
Administering small services, or One for all and 100 against one. Roman ...yaevents
Roman Andriadi, Yandex
Has worked in Yandex's operations department since 2005. Since 2010 he has led the administration group for communication, content, and internal services.
Presentation topic:
Administering small services, or One for all and 100 against one.
Key points:
Administration of communication services began in 2004 with a dozen servers hosting a dozen services. Over time the number of services grew, as did the tasks around them, and the dozen servers became a fleet of hundreds of machines split into many heterogeneous clusters. The talk describes how administration practices evolved as the cluster grew, what tools were used, how we wrote our own management tool, and how it has learned to help us over the years.
Stories about website development. Sergey Berezhnoy, Yandexyaevents
Sergey Berezhnoy, Yandex
A web developer at Yandex since 2005. Over that time he has contributed to a number of services, including Blog Search, Ya.ru, Yandex.Mail, Search, Images, and Video. Beyond public-facing projects, he actively develops internal tools covering the full site-building cycle. Above all he loves his wife and programming.
Presentation topic:
Stories about website development.
Key points:
We will talk about the website-development tasks that arose at Yandex at different times and how we solved them. The talk is meant as a dialogue with developers facing similar problems. The result should be a small collection of technology stories to reflect on.
Developing Android applications in C++. Yuri Bereza, Shturmannyaevents
Yuri Bereza, Shturmann
Graduated from the Instrument Engineering faculty of the Moscow State Academy of Instrument Engineering and Informatics. In 2004 he joined the mobile development department at Maccentre. He has developed for a great many mobile platforms: Windows Mobile, Symbian, Android, embedded Linux, and iOS. He currently leads a group at Content Master, where he works on the Shturmann car-navigation product.
Presentation topic:
Developing Android applications in C++.
Key points:
The Android platform grows more popular every year. Although Java is the primary language for Android application development, programmers often have to use C or C++ to write cross-platform applications or to use third-party libraries. Unfortunately, C++ development for Android is rather sparsely documented, and finding the right information can take a lot of time. The talk answers the main questions across the whole development cycle: how to write C++ code that runs on Android, how to debug it and locate errors when applications crash, whether code can be profiled, and where to look for further information.
Cross-platform development for mobile devices. Dmitry Zhestilevsky...yaevents
Dmitry Zhestilevsky, Yandex
Graduated from the Faculty of Experimental and Theoretical Physics at the Moscow Engineering Physics Institute in 2011. Since 2006 he has developed applications (games, business applications) for mobile devices on J2ME, BREW, Windows Mobile, Android, and iOS. At Yandex since 2010, he works on the architecture of mobile mapping services. Interests: cross-platform mobile development and 3D visualization.
Presentation topic:
Cross-platform development for mobile devices.
Key points:
Development for embedded devices is heavily fragmented by the abundance of operating systems (Android, iOS, WM, WP7, Symbian, Bada). Developing for each platform independently leads to proportional growth in both the number of people involved and the size of the maintained codebase. Introducing shared code that runs on all platforms through a Platform Abstraction Layer with a unified interface can cut these costs, while platform-specific pieces such as the UI can still be used to give the application a native look and feel. The talk walks through introducing shared components into Yandex's mobile applications using Street Panoramas as an example, along with the difficulties we ran into during development and how we solved them.
The most sophisticated techniques used by bootkits and polymorphic viruses. Vyacheslav Z...yaevents
Vyacheslav Zakorzhevsky, Kaspersky Lab
Joined Kaspersky Lab in mid-2007 as a virus analyst. In late 2008 he became a senior virus analyst in the heuristic detection group. His interests include research on polymorphic viruses and heavily mutating malware. He also follows current trends in obfuscation, anti-emulation, and other techniques used by malicious software.
Presentation topic:
The most sophisticated techniques used by bootkits and polymorphic viruses.
Key points:
There is a common belief that modern malware is fairly simple and written by amateurs. This talk aims to dispel that myth. The presentation describes three malware samples that use non-trivial, sophisticated techniques. In particular, we examine how modern bootkits, which keep gaining momentum, operate. Two further examples illustrate the inventiveness of virus writers trying to make life as hard as possible for researchers and antivirus companies: in one case they used their own virtual machine together with an EPO infection technique; in the other, they mapped null virtual addresses to store their data.
Vulnerability scanning, Yandex-flavored. Taras Ivashchenko, Yandexyaevents
Taras Ivashchenko, Yandex
An information security administrator at Yandex. Information security specialist, free-software advocate, author of Termite and xCobra, and a contributor to the W3AF project.
Presentation topic:
Vulnerability scanning, Yandex-flavored.
Key points:
The talk describes how Yandex introduced vulnerability scanning of its services as one of the security controls within the SDLC (Secure Development Life Cycle). It covers scanning for vulnerabilities during service testing as well as scanning services already in production. We review the problems we encountered and explain why we chose open-source software (the w3af vulnerability scanner), adapted to our needs, as the core mechanism.
Hadoop scalability at Facebook. Dmitriy Molkov, Facebookyaevents
Dmitriy Molkov, Facebook
Bachelor of Applied Mathematics from Taras Shevchenko National University of Kyiv (2007). Master of Computer Science from Stony Brook University (2009). Hadoop HDFS committer since 2011. Member of the Hadoop team at Facebook since 2009.
Presentation topic:
Hadoop scalability at Facebook.
Key points:
Hadoop and Hive are an excellent toolkit for storing and analyzing petabytes of data at Facebook. Working at that scale, Facebook's Hadoop team confronts Hadoop scalability and efficiency problems daily. The talk covers some of the optimizations across Facebook's Hadoop infrastructure that make a high-quality service possible: for example, optimizing storage costs in multi-petabyte HDFS clusters, increasing system throughput, and reducing downtime through High Availability work on HDFS.
Controlling the beasts: tools for managing and monitoring distributed syst...yaevents
Alexander Kozlov, Cloudera Inc.
Alexander Kozlov is a senior architect at Cloudera Inc., working with large companies, many of them in the Fortune 500, on systems for analyzing massive amounts of data. He completed graduate studies at the Physics faculty of Moscow State University and then earned a Ph.D. at Stanford. After finishing his studies and before Cloudera, he worked on statistical data analysis and related computing technologies at SGI, Hewlett-Packard, and the startup Turn.
Presentation topic:
Controlling the beasts: Cloudera's tools for managing and monitoring distributed systems.
Key points:
Maintaining distributed systems made up of thousands of computers is a hard task. Cloudera, a company specializing in distributed technologies, has developed a toolset for centrally managing distributed Hadoop/HBase clusters. Hadoop and HBase are Apache Software Foundation projects, and their adoption for analyzing semi-structured data is accelerating worldwide. This talk covers SCM, a system for configuring, tuning, and managing Hadoop/HBase, and Activity Monitor, a system for monitoring a range of OS and Hadoop/HBase metrics, as well as how Cloudera's approach differs from existing monitoring solutions (Tivoli, xCat, Ganglia, Nagios, etc.).
Unit testing and Google Mock. Vlad Losev, Googleyaevents
Vladimir Losev, Google
Graduated from the Faculty of Mathematics and Mechanics at Saint Petersburg State University in 1995. He has worked at Motorola, Fair Isaac, and Yahoo. Since 2008 he has worked at Google, in a group focused on engineering productivity.
Presentation topic:
Unit testing and Google Mock.
Key points:
In unit tests, each element of a program is tested separately, in isolation from the others. Such tests run very fast, so they can be run at any time, which helps catch defects at the earliest stages of development. However, testing an object in isolation requires imitating the behavior of the objects it depends on, which in C++ is rather tedious. Google Mock, a library developed at Google for creating and using mock objects, greatly simplifies this process and speeds up test writing. The talk covers the library's principles and features, examples of its use, and its internal design.
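The mock-object idea the talk describes can be illustrated with Python's standard unittest.mock, which automates the same pattern Google Mock provides for C++. The PaymentGateway collaborator and checkout function below are hypothetical, invented only for the sketch.

```python
# Isolating the code under test by mocking its collaborator.
from unittest import mock

class PaymentGateway:
    """Collaborator we want to isolate away in the unit test (hypothetical)."""
    def charge(self, amount):
        raise NotImplementedError("talks to a real payment service")

def checkout(gateway, amount):
    # Code under test: delegates the real work to the collaborator.
    return "ok" if gateway.charge(amount) else "declined"

# Replace the collaborator with a mock and program its behavior,
# much like ON_CALL/EXPECT_CALL in Google Mock.
gateway = mock.Mock(spec=PaymentGateway)
gateway.charge.return_value = True

result = checkout(gateway, 42)

assert result == "ok"
gateway.charge.assert_called_once_with(42)  # verify the interaction
```

The point in both libraries is the same: the test exercises checkout alone, while the mock records and verifies how it interacted with its dependency.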
C++11 (formerly known as C++0x) is the new C++ language standard. Dave Abraha...yaevents
Dave Abrahams, BoostPro Computing
He is a founding member of Boost.org and an active participant in the ISO C++ standards committee. His broad range of experience in the computer industry includes shrink-wrap software development, embedded systems design and natural language processing. He has authored eight Boost libraries and has made contributions to numerous others. Dave made his mark on C++ standardization by developing a conceptual framework for understanding exception-safety and applying it to the C++ standard library. He created the first exception-safe standard library implementation and, with Greg Colvin, drafted the proposals that eventually became the standard library’s exception safety guarantees.
Presentation topic:
C++11 (formerly known as C++0x) is the new C++ language standard. Dave Abrahams, BoostPro Computing.
Key points:
The ISO C++ standardization committee has just unanimously approved its final draft international standard, and it's chock full of new features. Though a few of the features have been available for years, some are brand new, and nobody really knows what it's like to program in this new C++ language. As with C++03, Boost.org is expected to take a leading role in exploiting C++11. In this talk, I'll give an overview of the most important new developments.
Why an ordinary programmer should know languages almost nobody writes in. Ale...yaevents
Alexey Voinov, Yandex
Graduated from Bauman Moscow State Technical University in 1998. He has devoted part of his life to free software. Known for his love of languages, both algorithmic and human, natural and constructed. At Yandex since 2009, working on Yandex.Mail.
Presentation topic:
Why an ordinary programmer should know languages almost nobody writes in.
Key points:
There is a class of programming languages that most programmers consider strange at best: Haskell, the *ML family, Lisp, Q. These "strange" languages do not take root in industrial software development because they do not let you write standard "industrial" code. They are, however, very good for inventing techniques that improve industrial code, and many such techniques later become industry standards. Knowing "strange" languages is especially useful when external constraints rule out making industrial code radically better, but it can still be improved in small steps.
In search of mathematics. Mikhail Denisenko, Nigmayaevents
Mikhail Denisenko, Nigma
Graduated from the Faculty of Computational Mathematics and Cybernetics at Moscow State University. He is finishing a dissertation on mathematical aspects of information security. He did research on video-sequence processing and computer security at Intel. Since 2009 he has been a senior developer of the mathematics service at Nigma.ru, and since 2011 a system architect of the ITim.vn search engine.
Presentation topic:
In search of mathematics.
Key points:
Nigma-Mathematics is a service that lets users solve various mathematical problems (simplifying expressions, solving equations and systems of equations, and so on) by typing them into the search box as plain text. The system recognizes more than a thousand physical and mathematical constants and units of measurement, so users can operate on quantities (including solving equations) and get answers in the requested units. Beyond equations, the system handles all the tasks typical of search-engine calculators and currency converters. The talk describes the overall architecture of the service and the basic and new algorithms of the symbolic computation system (solving equations and inequalities, accounting for domains of definition, analyzing functions, etc.). It also covers speeding up the service, distributing the load, detecting whether a query is mathematical, currency conversion, and metric quantities.
Using classifiers to compute similarities between face images. Prof. Lior Wol...yaevents
Prof. Lior Wolf, Tel-Aviv University
He is a faculty member at the School of Computer Science at Tel-Aviv University. Previously, he was a post-doctoral associate in Prof. Poggio's lab at MIT. He graduated from the Hebrew University, Jerusalem, where he worked under the supervision of Prof. Shashua. He was awarded the 2008 Sackler Career Development Chair, the Colton Excellence Fellowship for new faculty (2006-2008), the Max Shlumiuk award for 2004, and the Rothchild fellowship for 2004. His joint work with Prof. Shashua in ECCV 2000 received the best paper award, and their work in ICCV 2001 received the Marr prize honorable mention. He was also awarded the best paper award at the post ICCV workshop on eHeritage 2009. In addition, Lior has held several development, consulting and advisory positions in computer vision companies including face.com and superfish, and is a co-founder of FDNA.
Presentation topic:
Using classifiers to compute similarities between images of faces.
Key points:
The One-Shot-Similarity (OSS) is a framework for classifier-based similarity functions. It is based on the use of background samples and was shown to excel in tasks ranging from face recognition to document analysis. In this talk we will present the framework as well as the following results: (1) when using a version of LDA as the underlying classifier, this score is a Conditionally Positive Definite kernel and may be used within kernel-methods (e.g., SVM), (2) OSS can be efficiently computed, and (3) a metric learning technique that is geared toward improved OSS performance.
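A toy version of the symmetric OSS computation might look like this. Note the hedges: a simple centroid-based linear classifier stands in for the LDA model the talk describes, and the vectors and background set are invented for illustration.

```python
# One-Shot-Similarity sketch: train a classifier separating x from a set of
# background samples, score y with it, then swap roles and average.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean(vectors):
    n = len(vectors)
    return [sum(xs) / n for xs in zip(*vectors)]

def half_oss(x, y, background):
    """Linear classifier of x vs. the background set; returns the signed
    score of y relative to the decision midpoint."""
    mu = mean(background)
    w = [a - b for a, b in zip(x, mu)]          # separating direction
    mid = [(a + b) / 2 for a, b in zip(x, mu)]  # midpoint as threshold
    return dot(w, [a - b for a, b in zip(y, mid)])

def one_shot_similarity(x, y, background):
    # Symmetrize: score y against a model of x, and x against a model of y.
    return 0.5 * (half_oss(x, y, background) + half_oss(y, x, background))

background = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]  # unrelated negatives
same = one_shot_similarity([1.0, 1.0], [0.9, 1.1], background)
diff = one_shot_similarity([1.0, 1.0], [-1.0, -0.9], background)
assert same > diff  # images of the same face should score higher
```

Swapping in actual LDA (with a shared within-class covariance estimated from the background set) recovers the variant the talk analyzes as a Conditionally Positive Definite kernel.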
What is an RPA CoE? Session 1 – CoE VisionDianaGray10
In the first session, we will review the organization's vision and how this has an impact on the COE Structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Must Know Postgres Extension for DBA and Developer during MigrationMydbops
Mydbops Opensource Database Meetup 16
Topic: Must-Know PostgreSQL Extensions for Developers and DBAs During Migration
Speaker: Deepak Mahto, Founder of DataCloudGaze Consulting
Date & Time: 8th June | 10 AM - 1 PM IST
Venue: Bangalore International Centre, Bangalore
Abstract: Discover how PostgreSQL extensions can be your secret weapon! This talk explores how key extensions enhance database capabilities and streamline the migration process for users moving from other relational databases like Oracle.
Key Takeaways:
* Learn about crucial extensions like oracle_fdw, pgtt, and pg_audit that ease migration complexities.
* Gain valuable strategies for implementing these extensions in PostgreSQL to achieve license freedom.
* Discover how these key extensions can empower both developers and DBAs during the migration process.
* Don't miss this chance to gain practical knowledge from an industry expert and stay updated on the latest open-source database trends.
Mydbops Managed Services specializes in taking the pain out of database management while optimizing performance. Since 2015, we have been providing top-notch support and assistance for the top three open-source databases: MySQL, MongoDB, and PostgreSQL.
Our team offers a wide range of services, including assistance, support, consulting, 24/7 operations, and expertise in all relevant technologies. We help organizations improve their database's performance, scalability, efficiency, and availability.
Contact us: info@mydbops.com
Visit: https://www.mydbops.com/
Follow us on LinkedIn: https://in.linkedin.com/company/mydbops
For more details and updates, please follow up the below links.
Meetup Page : https://www.meetup.com/mydbops-databa...
Twitter: https://twitter.com/mydbopsofficial
Blogs: https://www.mydbops.com/blog/
Facebook(Meta): https://www.facebook.com/mydbops/
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor...GlobalLogic Ukraine
In this talk we answer why application performance needs improving and which approaches are most effective. We also discuss what a cache is, what kinds of caches exist and, most importantly, how to find a performance bottleneck.
Video and event details: https://bit.ly/45tILxj
What is an RPA CoE? Session 2 – CoE RolesDianaGray10
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
"Scaling RAG Applications to serve millions of users", Kevin GoedeckeFwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
In our second session, we shall learn all about the main features and fundamentals of UiPath Studio that enable us to use the building blocks for any automation project.
📕 Detailed agenda:
Variables and Datatypes
Workflow Layouts
Arguments
Control Flows and Loops
Conditional Statements
💻 Extra training through UiPath Academy:
Variables, Constants, and Arguments in Studio
Control Flow in Studio
"What does it really mean for your system to be available, or how to define w...Fwdays
We will talk about system monitoring from a few different angles. We will start by covering the basics, then discuss SLOs, how to define them, and why understanding the business well is crucial for success in this exercise.
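As a concrete illustration of the kind of arithmetic an SLO definition implies, here is a small worked example. The target, window, and request counts are invented numbers, not figures from the talk.

```python
# Defining an availability SLO and computing its error budget.
slo_target = 0.999             # "three nines" availability objective
period_minutes = 30 * 24 * 60  # a 30-day rolling window

# Error budget: how much unavailability the SLO tolerates per window.
error_budget_minutes = (1 - slo_target) * period_minutes  # 43.2 minutes

# Measured availability from request counts over the window.
total_requests = 1_000_000
failed_requests = 700
availability = 1 - failed_requests / total_requests  # 0.9993

slo_met = availability >= slo_target
assert round(error_budget_minutes, 1) == 43.2
assert slo_met  # 99.93% clears the 99.9% objective
```

Whether "failed" means HTTP 5xx, latency over a threshold, or something business-specific is exactly the definitional question the talk is about, and it has to be settled before numbers like these mean anything.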
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F...AlexanderRichford
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation Functions to Prevent Interaction with Malicious QR Codes.
Aim of the Study: The goal of this research was to develop a robust hybrid approach for identifying malicious and insecure URLs derived from QR codes, ensuring safe interactions.
This is achieved through:
Machine Learning Model: Predicts the likelihood of a URL being malicious.
Security Validation Functions: Ensures the derived URL has a valid certificate and proper URL format.
This innovative blend of technology aims to enhance cybersecurity measures and protect users from potential threats hidden within QR codes 🖥 🔒
This study was my first introduction to using ML which has shown me the immense potential of ML in creating more secure digital environments!
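The hybrid decision described above might be sketched like this. Everything here is an illustrative assumption rather than the study's implementation: the function names, the lexical features, and the threshold are invented, and the certificate check (which needs a network call) is omitted.

```python
# Hybrid check on a URL decoded from a QR code: security-validation
# functions plus a (stub) ML maliciousness score.
from urllib.parse import urlparse

def valid_url_format(url):
    """Security validation: require an https scheme and a hostname."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.hostname)

def ml_malicious_score(url):
    """Stand-in for the trained model: score a few lexical features
    commonly associated with phishing URLs."""
    score = 0.0
    if len(url) > 75:
        score += 0.4          # unusually long URL
    if url.count("-") > 3:
        score += 0.3          # many hyphens in host/path
    if "@" in url:
        score += 0.3          # userinfo trick hides the real host
    return min(score, 1.0)

def qr_url_is_safe(url, threshold=0.5):
    # Hybrid decision: both the validators and the model must pass.
    return valid_url_format(url) and ml_malicious_score(url) < threshold

assert qr_url_is_safe("https://example.com/login")
assert not qr_url_is_safe("http://example.com")  # fails the format check
```

The structure, not the feature list, is the point: the validation functions give hard guarantees the learned model cannot, and the model catches malicious URLs that are syntactically unremarkable.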
Introducing BoxLang : A new JVM language for productivity and modularity!Ortus Solutions, Corp
Just like life, our code must adapt to the ever changing world we live in. From one day coding for the web, to the next for our tablets or APIs or for running serverless applications. Multi-runtime development is the future of coding, the future is to be dynamic. Let us introduce you to BoxLang.
Dynamic. Modular. Productive.
BoxLang redefines development with its dynamic nature, empowering developers to craft expressive and functional code effortlessly. Its modular architecture prioritizes flexibility, allowing for seamless integration into existing ecosystems.
Interoperability at its Core
With 100% interoperability with Java, BoxLang seamlessly bridges the gap between traditional and modern development paradigms, unlocking new possibilities for innovation and collaboration.
Multi-Runtime
From the tiny 2m operating system binary to running on our pure Java web server, CommandBox, Jakarta EE, AWS Lambda, Microsoft Functions, Web Assembly, Android and more. BoxLang has been designed to enhance and adapt according to its runtime.
2. Joint work with Katja Hofmann and Shimon Whiteson
Adapting Rankers Online 2
3. Growing complexity of search engines
Current methods for optimization mostly work offline
4. Online learning to rank
No distinction between training and operating
Search engine observes users’ natural interactions with the search interface, infers information from them, and improves its ranking function automatically
Expensive data collection not required; the collected data matches target users and target setting
5. Users’ natural interactions with the search interface
Behavior category (refers to the purpose of the observed behavior) by minimum scope (refers to the smallest possible scope of the item being acted upon):

Behavior category | Segment                           | Object                                   | Class
Examine           | View, Listen, Scroll, Find, Query | Select                                   | Browse
Retain            | Print                             | Bookmark, Save, Delete, Purchase, Email  | Subscribe
Reference         | Copy-and-paste, Quote             | Forward, Reply, Link, Cite               |
Annotate          | Mark up                           | Rate, Publish                            | Organize
Create            | Type, Edit                        | Author                                   |

Oard and Kim, 2001; Kelly and Teevan, 2004
6. Users’ interactions
Relevance feedback
History goes back close to forty years
Typically used for query expansion, user profiling
Explicit feedback
Users explicitly give feedback
Keywords, selecting or marking documents,
answering questions
Natural explicit feedback can be difficult to obtain
“Unnatural” explicit feedback through TREC assessors and crowdsourcing
7. Users’ interactions (2)
Implicit feedback for learning, query expansion and
user profiling
Observe users’ natural interactions with system
Reading time, saving, printing, bookmarking,
selecting, clicking, …
Thought to be less accurate than explicit measures
Available in very large quantities at no cost
8. Learning to rank online
Using online learning to rank approaches, retrieval systems can learn directly from implicit feedback, while they are running
Algorithms need to explore new solutions to obtain feedback for effective learning, and to exploit what has been learned to produce results acceptable to users
Interleaved comparison methods can use implicit feedback to detect small differences between rankers and can be used to learn ranking functions online
9. Agenda
Balancing exploration and exploitation
Inferring preferences from clicks
10. Recent work
Balancing Exploitation and Exploration
K. Hofmann et al. (2011), Balancing exploration and exploitation. In: ECIR ’11.
11. Challenges
Generalize over queries and documents
Learn from implicit feedback that is …
noisy
relative
rank-biased
Keep users happy while learning
12. Learning document pair-wise preferences
Insight: infer preferences from clicks
[Figure: example result list for the query “Vienna”, with clicked results indicating preferences]
Joachims, T. (2002). Optimizing search engines using clickthrough data. In KDD '02, pages 133-142.
13. Learning document pair-wise preferences
Input: feature vectors constructed from document pairs, (x(q, d_i), x(q, d_j)) ∈ R^n × R^n
Output: y ∈ {−1, +1}, correct / incorrect order
Learning method: supervised learning, e.g., SVM
Joachims, T. (2002). Optimizing search engines using clickthrough data. In KDD '02, pages 133-142.
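The pairwise setup above can be sketched in a few lines. Joachims (2002) trains a ranking SVM on these pairs; as a self-contained stand-in, the sketch below trains a simple perceptron on difference vectors, with toy feature data (all names and values are hypothetical).

```python
# Pairwise learning sketch: each training example is the difference of two
# query-document feature vectors, labeled +1 if the first document should be
# ranked above the second. Joachims (2002) uses a ranking SVM; a perceptron
# stands in here to keep the example dependency-free.

def pairwise_examples(prefs, features):
    """prefs: (query, preferred_doc, other_doc) triples inferred from clicks.
    features: maps (query, doc) to a feature vector x(q, d)."""
    examples = []
    for q, di, dj in prefs:
        diff = [a - b for a, b in zip(features[(q, di)], features[(q, dj)])]
        examples.append((diff, +1))                # d_i should outrank d_j
        examples.append(([-v for v in diff], -1))  # the reversed pair
    return examples

def train_perceptron(examples, epochs=20):
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:  # misclassified
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

# Toy data: two features per query-document pair (e.g. a BM25-like score
# and one other signal).
features = {("q", "d1"): [0.9, 0.1], ("q", "d2"): [0.2, 0.3]}
w = train_perceptron(pairwise_examples([("q", "d1", "d2")], features))

def score(q, d):
    return sum(wi * xi for wi, xi in zip(w, features[(q, d)]))

assert score("q", "d1") > score("q", "d2")  # learned order matches the click
```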
14. Challenges
Generalize over queries and documents
Learn from implicit feedback that is …
noisy
relative
rank-biased
Keep users happy while learning
15. Dueling bandit gradient descent
Learns a ranking function, consisting of a weight vector for a linear weighted combination of feature vectors, from feedback about the relative quality of rankings
Outcome: weights w for ranking documents by S(d) = w · x(q, d)
Approach:
Maintain a current “best” ranking function candidate w
On each incoming query:
  Generate a new candidate ranking function
  Compare to the current “best”
  If the candidate is better, update the “best” ranking function
[Figure: current best w and a candidate shown as points in weight space (x1, x2)]
Yue, Y. and Joachims, T. (2009). Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML '09.
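The DBGD loop above can be sketched as follows. The comparison oracle here is synthetic — a candidate “wins” when it is closer to a hidden target vector — whereas in the real algorithm that outcome comes from an interleaved click comparison; the step sizes delta and alpha are illustrative tuning parameters, not values from the talk.

```python
import random

# Sketch of Dueling Bandit Gradient Descent (Yue & Joachims, 2009).
# delta controls how far the candidate steps away from the current best,
# alpha controls how far the best moves when the candidate wins the duel.

def dbgd(compare, dim, queries=1000, delta=1.0, alpha=0.1, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim                                  # current "best" ranker
    for _ in range(queries):
        u = [rng.gauss(0, 1) for _ in range(dim)]    # random unit direction
        norm = sum(x * x for x in u) ** 0.5
        u = [x / norm for x in u]
        candidate = [wi + delta * ui for wi, ui in zip(w, u)]
        if compare(candidate, w):                    # candidate wins the duel
            w = [wi + alpha * ui for wi, ui in zip(w, u)]
    return w

# Synthetic oracle: the hidden ideal weight vector is `target`; a candidate
# wins when it is closer to it. Real systems infer this outcome from clicks
# on an interleaved result list.
target = [1.0, -0.5]
dist = lambda w: sum((a - b) ** 2 for a, b in zip(w, target))
w = dbgd(lambda cand, best: dist(cand) < dist(best), dim=2)
assert dist(w) < dist([0.0, 0.0])  # learning moved w toward the target
```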
16. Challenges
Generalize over queries and documents
Learn from implicit feedback that is …
noisy
relative
rank-biased
Keep users happy while learning
17. Exploration and exploitation
Exploration: need to learn effectively from rank-biased feedback
Exploitation: need to present high-quality results while learning
Previous approaches are either purely exploratory or purely exploitative
18. Questions
Can we improve online performance by balancing
exploration and exploitation?
How much exploration is needed for effective
learning?
19. Problem formulation
Reinforcement learning
No explicit labels
Learn from feedback from the environment in response to actions (document lists)
Contextual bandit problem
[Diagram: the retrieval system tries something (presents documents) to the environment (the user) and gets feedback (clicks)]
20. Our method
Learning based on Dueling Bandit Gradient Descent
Relative evaluations of quality of two document
lists
Infers such comparisons from implicit feedback
Balance exploration and exploitation with k-greedy
comparison of document lists
21. k-greedy exploration
To compare document lists, interleave them
An exploration rate k influences the relative number of documents contributed by each list
[Figure: an interleaved comparison at exploration rate k = 0.5, won by the blue list]

22. k-greedy exploration
[Figure: interleaved result lists at exploration rates k = 0.5 and k = 0.2; lower k means fewer exploratory documents]
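k-greedy interleaving can be sketched as below, under the assumption that both lists rank the same document set: at each slot a biased coin (the exploration rate k) decides which list contributes its next not-yet-placed document. Document IDs and the seed are hypothetical.

```python
import random

# Sketch of k-greedy interleaving: with probability k the next document
# comes from the exploratory list, otherwise from the exploitative list.
# k = 0.5 yields a balanced mix; k = 0.2 yields roughly two exploratory
# documents in a 10-item result list.
# Assumes both lists are permutations of the same document set.

def k_greedy_interleave(exploit, explore, k, rng):
    result, origin = [], []
    i = j = 0                          # consumption pointers per list
    while len(result) < len(exploit):
        pick_explore = rng.random() < k
        source = explore if pick_explore else exploit
        idx = j if pick_explore else i
        while source[idx] in result:   # skip documents already placed
            idx += 1
        result.append(source[idx])
        origin.append("explore" if pick_explore else "exploit")
        if pick_explore:
            j = idx + 1
        else:
            i = idx + 1
    return result, origin

rng = random.Random(42)
exploit = ["d1", "d2", "d3", "d4"]
explore = ["d3", "d4", "d1", "d2"]
result, origin = k_greedy_interleave(exploit, explore, k=0.5, rng=rng)
assert sorted(result) == sorted(exploit)  # every document placed exactly once
```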
23. Evaluation
Simulated interactions
We need to
observe clicks on arbitrary result lists
measure online performance
Simulate clicks and measure online performance
Probabilistic click model: assume dependent click
model and define click and stop probabilities based
on standard learning to rank data sets
Measure cumulative reward of the rankings
displayed to the user
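The simulated interactions can be sketched with a dependent click model: scan the result list top-down, click with probability P(c|R) or P(c|NR) depending on relevance, and after a click stop with probability P(s|R) or P(s|NR). The probability table below is the “navigational” instantiation from the following slides; the ranking and relevance sets are toy data.

```python
import random

# Dependent click model sketch: the simulated user scans top-down, clicks
# depending on the document's relevance, and may stop after each click.

def simulate_clicks(ranking, relevant, p_click, p_stop, rng):
    clicks = []
    for rank, doc in enumerate(ranking):
        rel = doc in relevant
        if rng.random() < p_click[rel]:    # P(c|R) vs. P(c|NR)
            clicks.append(rank)
            if rng.random() < p_stop[rel]: # P(s|R) vs. P(s|NR)
                break
    return clicks

rng = random.Random(7)
# "Navigational" instantiation: P(c|R)=0.95, P(c|NR)=0.05, P(s|R)=0.9, P(s|NR)=0.2
p_click = {True: 0.95, False: 0.05}
p_stop = {True: 0.9, False: 0.2}
clicks = simulate_clicks(["d1", "d2", "d3"], {"d2"}, p_click, p_stop, rng)
```

With the “perfect” instantiation (P(c|R)=1.0 and all other probabilities 0), the simulated user deterministically clicks every relevant document and nothing else.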
24. Experiments
Vary exploration rate k
Three click models
“perfect”
“navigational”
“informational”
Evaluate on nine data sets (LETOR 3.0 and 4.0)
25. “Perfect” click model
Click model: P(c|R) = 1.0, P(c|NR) = 0.0, P(s|R) = 0.0, P(s|NR) = 0.0
[Figure: final performance over time (0–1000 queries) for data set NP2003 under the perfect click model]
Provides an upper bound
26. “Perfect” online performance

           k = 0.5  k = 0.4  k = 0.3  k = 0.2  k = 0.1
HP2003      119.91   125.71   129.99   130.55   128.50
HP2004      109.21   111.57   118.54   119.86   116.46
NP2003      108.74   113.61   117.44   120.46   119.06
NP2004      112.33   119.34   124.47   126.20   123.70
TD2003       82.00    84.24    88.20    89.36    86.20
TD2004       85.67    90.23    91.71    91.00    88.98
OHSUMED     128.12   130.40   131.16   133.37   131.93
MQ2007       96.02    97.48    98.54   100.28    98.32
MQ2008       90.97    92.99    94.03    95.59    95.14

Best performance with only two exploratory documents for top-10 result lists (k = 0.2)
Darker shades indicate higher performance; dark borders indicate significant improvements over the k = 0.5 baseline
27. “Navigational” click model
Click model: P(c|R) = 0.95, P(c|NR) = 0.05, P(s|R) = 0.9, P(s|NR) = 0.2
Simulates realistic but reliable interaction
[Figure: final performance over time for data set NP2003 under the navigational click model]
28. “Navigational” online performance

           k = 0.5  k = 0.4  k = 0.3  k = 0.2  k = 0.1
HP2003      102.58   109.78   118.84   116.38   117.52
HP2004       89.61    97.08    99.03   103.36   105.69
NP2003       90.32   100.94   105.03   108.15   110.12
NP2004       99.14   104.34   110.16   112.05   116.00
TD2003       70.93    75.20    77.64    77.54    75.70
TD2004       78.83    80.17    82.40    83.54    80.98
OHSUMED     125.35   126.92   127.37   127.94   127.21
MQ2007       95.50    94.99    95.70    96.02    94.94
MQ2008       89.39    90.55    91.24    92.36    92.25

Best performance with little exploration and lots of exploitation
Darker shades indicate higher performance; dark borders indicate significant improvements over the k = 0.5 baseline
29. “Informational” click model
Click model: P(c|R) = 0.9, P(c|NR) = 0.4, P(s|R) = 0.5, P(s|NR) = 0.1
Simulates very noisy interaction
[Figure: final performance over time for data set NP2003 under the informational click model, for k = 0.5, k = 0.2, and k = 0.1]
30. “Informational” online performance

           k = 0.5  k = 0.4  k = 0.3  k = 0.2  k = 0.1
HP2003       59.53    63.91    61.43    70.11    71.19
HP2004       41.12    52.88    55.88    58.40    63.23
NP2003       48.54    53.63    53.64    57.60    69.90
NP2004       55.16    60.59    64.17    69.96    63.38
TD2003       51.58    52.78    52.95    57.30    59.75
TD2004       55.76    58.49    61.43    62.88    63.37
OHSUMED     121.39   123.26   124.01   125.40   126.76
MQ2007       91.57    92.00    91.66    90.79    90.19
MQ2008       86.06    87.26    85.83    87.62    86.29

Highest improvements with low exploration rates: interaction between noise and data set
Darker shades indicate higher performance; dark borders indicate significant improvements over the k = 0.5 baseline
31. Summary
What?
Developed the first method for balancing exploration and exploitation in online learning to rank
Devised an experimental framework for simulating user interactions and measuring online performance
And so?
Balancing exploration and exploitation improves online performance for all click models and all data sets
Best results are achieved with 2 exploratory documents per result list
32. What’s next here?
Validate simulation assumptions
Evaluate using click logs
Develop new algorithms for online learning to rank for IR that can balance exploration and exploitation
33. Ongoing work
Inferring Preferences from Clicks
34. Interleaved ranker comparison methods
Use implicit feedback (“clicks”), not to infer absolute
judgments, but to compare two rankers by observing
clicks on an interleaved result list
Interleave two ranked lists (“outputs of two rankers”)
Use click data to detect even very small differences
between rankers
Examine three existing methods for interleaving,
identify issues with them and propose a new one
35. Three methods (1)
Balanced interleave method
Interleaved list is generated for each query based
on the two rankers
User’s clicks on interleaved list are attributed to
each ranker based on how they ranked the clicked
docs
Ranker that obtains more clicks is deemed
superior
Joachims, Evaluating retrieval performance using clickthrough data. In: Text Mining, 2003
36. 1) Interleaving, 2) Comparison (balanced interleave example)
List l1: d1, d2, d3, d4. List l2: d2, d3, d4, d1.
Two possible interleaved lists l: (d1, d2, d3, d4) and (d2, d1, d3, d4).
First comparison: clicks observed on d2 and d4; k = min(4, 3) = 3; click counts c1 = 1, c2 = 2.
Second comparison: clicks observed on d1 and d4; k = min(4, 4) = 4; click counts c1 = 2, c2 = 2.
l2 wins the first comparison, and the lists tie for the second. In expectation l2 wins.
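The balanced interleave comparison step can be sketched as follows, under the interpretation that k is the shorter of the two prefixes needed to cover every clicked document (this reproduces the k = min(4, 3) = 3 computation on the slide); the document IDs follow the slide's example.

```python
# Sketch of the balanced interleave comparison (Joachims, 2003). Ranks are
# 1-based; k is the shorter of the two prefixes needed to cover all clicked
# documents, and each ranker is credited with the clicks in its top-k.

def balanced_comparison(l1, l2, clicked):
    rank1 = {d: r for r, d in enumerate(l1, start=1)}
    rank2 = {d: r for r, d in enumerate(l2, start=1)}
    k = min(max(rank1[d] for d in clicked), max(rank2[d] for d in clicked))
    c1 = sum(1 for d in clicked if rank1[d] <= k)
    c2 = sum(1 for d in clicked if rank2[d] <= k)
    return c1, c2  # the ranker credited with more clicks is deemed superior

# Slide example: clicks on d2 and d4 give k = min(4, 3) = 3 and a win for l2;
# clicks on d1 and d4 give k = min(4, 4) = 4 and a tie.
l1 = ["d1", "d2", "d3", "d4"]
l2 = ["d2", "d3", "d4", "d1"]
assert balanced_comparison(l1, l2, {"d2", "d4"}) == (1, 2)
assert balanced_comparison(l1, l2, {"d1", "d4"}) == (2, 2)
```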
37. Three methods (2)
Team draft method
Create an interleaved list following the model of
“team captains” selecting their team from a set of
players
For each pair of documents to be placed in the
interleaved list, a coin flip determines which list
gets to select a document first
Record which list contributed which document
Radlinski et al., How does click-through data reflect retrieval quality? 2008
38. 1) Interleaving, 2) Comparison (team draft example)
List l1: d1, d2, d3, d4. List l2: d2, d3, d4, d1.
Four possible interleaved lists l, with different assignments a:
a) d1 (1), d2 (2), d3 (1), d4 (2)
b) d2 (2), d1 (1), d3 (1), d4 (2)
c) d2 (2), d1 (1), d3 (2), d4 (1)
d) d1 (1), d2 (2), d3 (2), d4 (1)
With a click on d3: for interleaved lists a) and b), l1 wins the comparison; l2 wins in the other two cases.
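Team-draft interleaving can be sketched as below, assuming both lists rank the same document set; the per-slot assignment is what the comparison step counts clicks against. Document IDs and the seed are hypothetical.

```python
import random

# Sketch of team-draft interleaving (Radlinski et al., 2008): for each pair
# of slots a coin flip decides which ranker picks first; each ranker then
# contributes its highest not-yet-placed document, and the assignment records
# which "team" each slot belongs to.
# Assumes both lists rank the same document set.

def team_draft(l1, l2, rng):
    total = len(set(l1) | set(l2))
    interleaved, team = [], []
    while len(interleaved) < total:
        first = rng.randint(1, 2)            # coin flip: who drafts first
        for t in (first, 3 - first):
            source = l1 if t == 1 else l2
            doc = next(d for d in source if d not in interleaved)
            interleaved.append(doc)
            team.append(t)
            if len(interleaved) == total:
                break
    return interleaved, team

def team_draft_comparison(team, clicked_ranks):
    """clicked_ranks: 0-based positions of clicks in the interleaved list."""
    wins1 = sum(1 for r in clicked_ranks if team[r] == 1)
    wins2 = sum(1 for r in clicked_ranks if team[r] == 2)
    return wins1, wins2

rng = random.Random(3)
interleaved, team = team_draft(["d1", "d2", "d3", "d4"],
                               ["d2", "d3", "d4", "d1"], rng)
assert sorted(interleaved) == ["d1", "d2", "d3", "d4"]
```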
39. Three methods (3)
Document-constraint method
Result lists are interleaved and clicks observed as
for the balanced interleaved method
Infer constraints on pairs of individual documents
based on clicks and ranks
For each pair of a clicked document and a higher-ranked non-clicked document, a constraint is inferred that requires the former to be ranked higher than the latter
The original list that violates fewer constraints is deemed superior
He et al., Evaluation of methods for relative comparison of retrieval systems based on clickthroughs, 2009
40. 1) Interleaving, 2) Comparison (document constraint example)
List l1: d1, d2, d3, d4. List l2: d2, d3, d4, d1.
Two possible interleaved lists l: (d1, d2, d3, d4) and (d2, d1, d3, d4).
First comparison: clicks on d2 and d3 give inferred constraints d2 ≻ d1 (violated by l1) and d3 ≻ d1 (violated by l1); l2 violates neither.
Second comparison: clicks on d1 and d3 give constraints d1 ≻ d2 (violated by l2) and d3 ≻ d2 (violated by both).
l2 wins the first comparison, and loses the second. In expectation l2 wins.
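The document-constraint comparison can be sketched as follows: constraints are inferred from each clicked document and the non-clicked documents shown above it, and each original list is scored by how many constraints it violates. The toy data reproduces the first comparison above.

```python
# Sketch of the document-constraint comparison (He et al., 2009): each click
# implies the clicked document should outrank every non-clicked document
# shown above it; the original list violating fewer constraints wins.

def constraint_comparison(l1, l2, interleaved, clicked):
    constraints = []
    for rank, doc in enumerate(interleaved):
        if doc in clicked:
            for above in interleaved[:rank]:
                if above not in clicked:
                    constraints.append((doc, above))  # doc should beat above
    def violations(lst):
        rank = {d: r for r, d in enumerate(lst)}
        return sum(1 for better, worse in constraints
                   if rank[better] > rank[worse])
    return violations(l1), violations(l2)

# Slide example: interleaved list (d1, d2, d3, d4) with clicks on d2 and d3
# yields constraints d2 > d1 and d3 > d1; l1 violates both, l2 violates none,
# so l2 wins this comparison.
v1, v2 = constraint_comparison(["d1", "d2", "d3", "d4"],
                               ["d2", "d3", "d4", "d1"],
                               ["d1", "d2", "d3", "d4"], {"d2", "d3"})
assert (v1, v2) == (2, 0)
```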
41. Assessing comparison methods
Bias
Don’t prefer either ranker when clicks are random
Sensitivity
The ability of a comparison method to detect
differences in the quality of rankings
Balanced interleave and document constraint are
biased
Team draft may suffer from insensitivity
42. A new proposal
Briefly
Based on team draft
Instead of interleaving deterministically, model the
interleaving process as random sampling from
softmax functions that define probability
distributions over documents
Derive an estimator that is unbiased and sensitive
to small ranking changes
Marginalize over all possible assignments to make
estimates more reliable
43. 1) Probabilistic interleaving, 2) Probabilistic comparison
Each ranker li is converted into a softmax distribution si over documents, e.g., P(d at rank 1) = 0.85, P(rank 2) = 0.10, P(rank 3) = 0.03, P(rank 4) = 0.02
For each rank of the interleaved list l, draw one of {s1, s2} and sample a document d; all permutations of the documents in D are possible
For an incoming query, the system generates an interleaved list and observes clicks
The comparison marginalizes over all possible assignments a: every assignment is generated, and its probability P(a | li, qi) and outcome o(ci, a) are computed
This is expensive, but only needs to be done down to the rank of the lowest observed click
In the slide’s example, s2 (based on l2) wins the observed comparison, while s1 and s2 tie in expectation
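The interleaving step of the proposed method can be sketched as follows. The softmax here is a rank-based 1/r^tau form (tau = 3 approximately reproduces the 0.85 / 0.10 / 0.03 / 0.02 probabilities shown on the slide); the marginalization over assignments in the comparison step is omitted to keep the sketch short, and all document IDs are hypothetical.

```python
import random

# Sketch of probabilistic interleaving: each ranker becomes a softmax
# distribution over its documents, P(d at rank r) proportional to 1 / r^tau.
# At each position of the interleaved list, one ranker is drawn at random
# and a document is sampled from its renormalized softmax.

def softmax(ranking, tau=3):
    weights = {d: 1.0 / (r ** tau) for r, d in enumerate(ranking, start=1)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

def sample(dist, placed, rng):
    candidates = {d: p for d, p in dist.items() if d not in placed}
    total = sum(candidates.values())     # renormalize over unplaced docs
    u, acc = rng.random() * total, 0.0
    for d, p in candidates.items():
        acc += p
        if u <= acc:
            return d
    return d  # guard against floating-point round-off

def probabilistic_interleave(l1, l2, rng):
    s1, s2 = softmax(l1), softmax(l2)
    interleaved, assignment = [], []
    while len(interleaved) < len(l1):
        which = rng.randint(1, 2)        # which ranker contributes this slot
        doc = sample(s1 if which == 1 else s2, interleaved, rng)
        interleaved.append(doc)
        assignment.append(which)
    return interleaved, assignment

rng = random.Random(1)
interleaved, assignment = probabilistic_interleave(
    ["d1", "d2", "d3", "d4"], ["d2", "d3", "d4", "d1"], rng)
assert sorted(interleaved) == ["d1", "d2", "d3", "d4"]
```

Because every document has nonzero probability at every rank, any permutation of D can appear, which is what makes the marginalized estimator both unbiased and sensitive to small ranking changes.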
44. Question
Do analytical differences between the methods
translate into performance differences?
45. Evaluation
Set-up
Simulation based on dependent click model
Perfect and realistic instantiations
Not binary, but with relevance levels
MSLR-WEB30k Microsoft learning to rank data set
136 doc features (i.e., rankers)
Three experiments
Exhaustive comparison of all distinct ranker pairs
9,180 distinct pairs
Selection of small subsets for detailed analysis
Add noise
46. Results (1)
Experiment 1
Accuracy
Percentage of pairs of rankers for which a comparison
method identified the better ranker after 1000 queries
Method Accuracy
balanced interleave 0.881
team draft 0.898
document constraint 0.857
new 0.914
47. Results (2): overview
“Problematic” pairs
For most pairs of rankers, all methods correctly identified the better one; three methods achieved perfect accuracy on these within 1000 queries
For each method, the incorrectly judged pair with the highest difference in NDCG was examined in detail
50. Summary
What?
Methods for evaluating rankers using implicit
feedback
Analysis of interleaved comparison methods in
terms of bias and sensitivity
And so?
Introduced a new probabilistic interleaved
comparison method, unbiased and sensitive
Experimental analysis: more accurate, with
substantially fewer observed queries, more robust
51. What’s next here?
Evaluate in a real-life setting in the future
With more reliable and faster convergence, our
approach can pave the way for online learning to
rank methods that require many comparisons
53. Online learning to rank
Emphasis on implicit feedback collected during
normal operation of the search engine
Balancing exploration and exploitation
Probabilistic method for inferring preferences from
clicks
54. Information retrieval observatory
Academic experiments on online learning and
implicit feedback used simulators
Need to validate the simulators
What’s really needed
Move away from artificial explicit feedback to
natural implicit feedback
Shared experimental environment for observing
users in the wild as they interact with systems
57. Bias
(Backup slide: repeats the balanced interleave example from slide 36; l2 wins in expectation, illustrating the method’s bias.)
58. Sensitivity
(Backup slide: repeats the team draft example from slide 38; a single click on d3 makes the comparison outcome depend on the random assignment, illustrating potential insensitivity.)