This document discusses tensor decompositions for learning latent variable models. It presents five applications of latent variable modeling: topic modeling, understanding human communities, recommender systems, feature learning, and computational biology. It describes the statistical framework of discovering hidden structure through latent variable models and graphical modeling. A key challenge is learning latent variable models efficiently, since maximum likelihood estimation is NP-hard. The document proposes tensor decompositions as a way to learn latent variable models from data efficiently and consistently, with provable guarantees. It outlines experimental results on social networks showing that the tensor approach outperforms variational methods in both accuracy and speed.
Animashree Anandkumar, Electrical Engineering and CS Dept, UC Irvine, at MLconf
Anima Anandkumar has been a faculty member in the EECS Dept. at UC Irvine since August 2010. Her research interests are in large-scale machine learning and high-dimensional statistics. She received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a visiting faculty member at Microsoft Research New England in 2012 and a postdoctoral researcher in the Stochastic Systems Group at MIT from 2009 to 2010. She is the recipient of the Microsoft Faculty Fellowship, the ARO Young Investigator Award, the NSF CAREER Award, and the IBM Fran Allen PhD Fellowship.
2. Application 1: Topic Modeling
Document modeling
Observed: words in document corpus.
Hidden: topics.
Goal: carry out document summarization.
3. Application 2: Understanding Human Communities
Social Networks
Observed: network of social ties, e.g. friendships, co-authorships
Hidden: groups/communities of actors.
4. Application 3: Recommender Systems
Recommender System
Observed: ratings of users for various products, e.g. Yelp reviews.
Goal: Predict new recommendations.
Modeling: Find groups/communities of users and products.
5. Application 4: Feature Learning
Feature Engineering
Learn good features/representations for classification tasks, e.g.
image and speech recognition.
Sparse representations, low dimensional hidden structures.
6. Application 5: Computational Biology
Observed: gene expression levels
Goal: discover gene groups
Hidden variables: regulators controlling gene groups
“Unsupervised Learning of Transcriptional Regulatory Networks via Latent Tree Graphical Model” by A. Gitter, F. Huang, R. Valluvan, E. Fraenkel, and A. Anandkumar, submitted to BMC Bioinformatics, Jan. 2014.
9. Statistical Framework
In all applications: discover hidden structure in data: unsupervised learning.
Latent Variable Models
Concise statistical description through graphical modeling.
Conditional independence relationships or hierarchy of variables.
(Figure: graphical models relating observed variables x1, . . . , x5 to a hidden variable h, and to a hierarchy of hidden variables h1, h2, h3.)
14. Computational Framework
Challenge: Efficient Learning of Latent Variable Models
Maximum likelihood is NP-hard.
Practice: EM and Variational Bayes have no consistency guarantees.
Efficient computational and sample complexities?
Fast methods such as matrix factorization are not statistical. We cannot learn the latent variable model through such methods.
Tensor-based Estimation
Estimate moment tensors from data: higher-order relationships.
Compute a decomposition of the moment tensor.
Iterative updates, e.g. tensor power iterations, alternating minimization.
Non-convex: convergence to local optima. No guarantees.
Innovation: Guaranteed convergence to the correct model.
In this talk: tensor decompositions and applications.
17. Probabilistic Topic Models
Bag of words: order of words does not matter.
Graphical model representation:
l words in a document x1, . . . , xl.
h: proportions of topics in a document.
Word xi generated from topic yi.
A(i, j) := P[xm = i | ym = j]: topic-word matrix.
(Figure: topic mixture h generates topics y1, . . . , y5, which generate words x1, . . . , x5 through A.)
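To make the generative process concrete, here is a minimal Python sketch (not from the slides) of the single-topic variant of this model; the topic prior lam, topic-word matrix A, and document length l below are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

k, d, l = 3, 8, 5                        # topics, vocabulary size, words per document
lam = np.array([0.5, 0.3, 0.2])          # lam[r] = P[h = r], assumed topic prior
A = rng.dirichlet(np.ones(d), size=k).T  # d x k topic-word matrix, A[i, j] = P[x = i | y = j]

def sample_document():
    """Single-topic model: draw one topic h, then l words i.i.d. from column a_h of A."""
    h = rng.choice(k, p=lam)
    words = rng.choice(d, size=l, p=A[:, h])
    return h, words

h, words = sample_document()
print("topic:", h, "words:", words)
# With words one-hot encoded, the linear model E[x_i | h] = Ah holds:
# conditioned on topic r, E[x_i] is exactly the r-th column a_r of A.
```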
18. Geometric Picture for Topic Models
A document is summarized by its topic proportions vector (h); the single-topic model is the special case where h puts all its weight on one topic.
Word generation (x1, x2, . . .): words are generated from h through the topic-word matrix A.
Linear Model: E[xi | h] = Ah.
Multiview model: h is fixed and multiple words (xi) are generated.
(Figure: topic simplex, with words x1, x2, x3 generated via A.)
24. Moment Tensors
Consider the single topic model.
E[xi | h] = Ah, with λi := [E[h]]i.
Learn the topic-word matrix A and the vector λ = P[h].
M2: co-occurrence of two words in a document:
M2 := E[x1 x2⊤] = E[E[x1 x2⊤ | h]] = A E[hh⊤] A⊤ = ∑_{r=1}^{k} λr ar ar⊤
Tensor M3: co-occurrence of three words:
M3 := E[x1 ⊗ x2 ⊗ x3] = ∑_{r} λr ar ⊗ ar ⊗ ar
Matrix and tensor forms, with ar := the rth column of A:
M2 = ∑_{r=1}^{k} λr ar ⊗ ar.  M3 = ∑_{r=1}^{k} λr ar ⊗ ar ⊗ ar
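As a rough illustration of these moment forms (a sketch under the single-topic model above, not code from the talk), M2 and M3 can be estimated by averaging outer products of one-hot encoded word triples and compared with the population expressions:

```python
import numpy as np

rng = np.random.default_rng(1)
k, d = 3, 8
lam = np.array([0.5, 0.3, 0.2])          # assumed topic prior
A = rng.dirichlet(np.ones(d), size=k).T  # d x k topic-word matrix

def one_hot(i, d):
    e = np.zeros(d); e[i] = 1.0
    return e

# Empirical moments from n documents, using the first three words of each.
n = 100_000
M2_hat = np.zeros((d, d))
M3_hat = np.zeros((d, d, d))
for _ in range(n):
    h = rng.choice(k, p=lam)                                   # shared topic for the document
    x1, x2, x3 = (one_hot(w, d) for w in rng.choice(d, size=3, p=A[:, h]))
    M2_hat += np.outer(x1, x2) / n
    M3_hat += np.einsum('i,j,k->ijk', x1, x2, x3) / n

# Population moments: M2 = sum_r lam_r a_r a_r^T, M3 = sum_r lam_r a_r (x) a_r (x) a_r.
M2 = np.einsum('r,ir,jr->ij', lam, A, A)
M3 = np.einsum('r,ir,jr,kr->ijk', lam, A, A, A)
print(np.abs(M2_hat - M2).max(), np.abs(M3_hat - M3).max())    # both shrink as n grows
```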
25. Tensor Decomposition Problem
M2 = ∑_{r=1}^{k} λr ar ⊗ ar.  M3 = ∑_{r=1}^{k} λr ar ⊗ ar ⊗ ar
(Figure: Tensor M3 = λ1 a1 ⊗ a1 ⊗ a1 + λ2 a2 ⊗ a2 ⊗ a2 + . . . .)
u ⊗ v ⊗ w is a rank-1 tensor whose (i, j, k)th entry is ui vj wk.
k topics, d words in the vocabulary.
M3: O(d × d × d) tensor of rank k.
Learning topic models through tensor decomposition.
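A small numpy sketch (illustrative values, not from the slides) of the rank-1 tensor definition and of assembling a d × d × d tensor of this rank-k form:

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 8, 3

# u (x) v (x) w is the rank-1 tensor with (i, j, l)-th entry u_i * v_j * w_l.
u, v, w = rng.standard_normal((3, d))
rank1 = np.einsum('i,j,l->ijl', u, v, w)
assert np.isclose(rank1[1, 4, 2], u[1] * v[4] * w[2])

# M3 = sum_r lam_r a_r (x) a_r (x) a_r : a d x d x d tensor of rank k.
lam = rng.random(k)
A = rng.standard_normal((d, k))
M3 = np.einsum('r,ir,jr,lr->ijl', lam, A, A, A)
print(M3.shape)   # (8, 8, 8)
```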
30. Detecting Communities in Networks
Stochastic Block Model: non-overlapping communities.
Mixed Membership Model: overlapping communities.
Unifying Assumption
Edges conditionally independent given community memberships.
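As a minimal sketch of this unifying assumption (illustrative parameters, not the talk's exact model), the following draws node memberships first and then samples every edge as an independent Bernoulli variable given those memberships, in the style of a mixed membership stochastic block model:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 100, 4
alpha = np.full(k, 0.1)             # assumed Dirichlet concentration for memberships
P = 0.02 + 0.3 * np.eye(k)          # assumed community-pair connectivity matrix

pi = rng.dirichlet(alpha, size=n)   # pi[u]: membership vector of node u (one-hot recovers the block model)
edge_prob = pi @ P @ pi.T           # P[edge u-v | memberships] = pi_u^T P pi_v

# Given the memberships, every edge is an independent Bernoulli draw.
upper = np.triu(rng.random((n, n)) < edge_prob, 1)
adjacency = (upper | upper.T).astype(int)
print(adjacency.sum() // 2, "edges")
```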
32. Tensor Forms in Other Models
Independent Component Analysis: independent sources, unknown mixing. Blind source separation of speech, images, video.
(Figure: sources h1, h2, . . . , hk mixed through A into observations x1, x2, . . . , xd.)
Gaussian Mixtures. Hidden Markov Models / Latent Trees.
(Figure: latent tree with hidden variables h1, h2, h3 and observed variables x1, . . . , x5.)
Reduction to similar moment forms.
34. Tensor Decomposition Problem
M3 = ∑_{r=1}^{k} λr ar ⊗ ar ⊗ ar
(Figure: Tensor M3 = λ1 a1 ⊗ a1 ⊗ a1 + λ2 a2 ⊗ a2 ⊗ a2 + . . . .)
u ⊗ v ⊗ w is a rank-1 tensor whose (i, j, k)th entry is ui vj wk.
k topics, d words in the vocabulary.
M3: O(d × d × d) tensor of rank k.
d: vocabulary size for topic models, or n: size of the network for community models.
35. Dimensionality Reduction for Tensor Decomposition
M3 = ∑_{r=1}^{k} λr ar ⊗ ar ⊗ ar
Dimensionality Reduction (Whitening)
Convert M3 of size O(d × d × d) to a tensor T of size k × k × k.
Carry out the decomposition of T. (Figure: Tensor M3 → Tensor T.)
Dimensionality reduction through multi-linear transforms, computed from data, e.g. pairwise moments.
T = ∑_i ρi ri⊗3 is a symmetric orthogonal tensor: {ri} are orthonormal.
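A rough numpy sketch of this whitening step, assuming exact population moments M2 and M3 are available (in practice they are estimated from data):

```python
import numpy as np

rng = np.random.default_rng(4)
d, k = 20, 4
lam = rng.random(k) + 0.5             # positive weights (assumed)
A = rng.standard_normal((d, k))

M2 = np.einsum('r,ir,jr->ij', lam, A, A)
M3 = np.einsum('r,ir,jr,lr->ijl', lam, A, A, A)

# Whitening matrix W (d x k) with W^T M2 W = I_k, built from the top-k eigenpairs of M2.
vals, vecs = np.linalg.eigh(M2)
vals, vecs = vals[-k:], vecs[:, -k:]
W = vecs / np.sqrt(vals)

# Multi-linear transform: T(i,j,l) = sum_{abc} M3(a,b,c) W(a,i) W(b,j) W(c,l); T is k x k x k.
T = np.einsum('abc,ai,bj,cl->ijl', M3, W, W, W)
print(T.shape)   # (4, 4, 4)
# T = sum_i rho_i r_i (x) r_i (x) r_i with orthonormal {r_i} (here r_i is proportional to W^T a_i).
```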
38. Orthogonal/Eigen Decomposition
Orthogonal symmetric tensor: T = ∑_{j∈[k]} ρj rj⊗3
T(I, r1, r1) = ∑_{j∈[k]} ρj ⟨r1, rj⟩² rj = ρ1 r1
Obtaining eigenvectors through power iterations:
u → T(I, u, u) / ‖T(I, u, u)‖
Basic Algorithm
Random initialization, run power iterations and deflate.
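A compact sketch of this basic algorithm on a k × k × k orthogonal symmetric tensor; the random restarts and robustness checks used in practice are omitted:

```python
import numpy as np

def tensor_apply(T, u):
    """T(I, u, u): contract the last two modes of T with u."""
    return np.einsum('ijl,j,l->i', T, u, u)

def power_method(T, n_iter=100, seed=0):
    """Recover (rho_j, r_j) from T = sum_j rho_j r_j^(x)3 by power iterations plus deflation."""
    rng = np.random.default_rng(seed)
    k = T.shape[0]
    eigvals, eigvecs = [], []
    for _ in range(k):
        u = rng.standard_normal(k)
        u /= np.linalg.norm(u)
        for _ in range(n_iter):
            u = tensor_apply(T, u)
            u /= np.linalg.norm(u)
        rho = tensor_apply(T, u) @ u                          # rho = T(u, u, u)
        eigvals.append(rho)
        eigvecs.append(u)
        T = T - rho * np.einsum('i,j,l->ijl', u, u, u)        # deflate the recovered component
    return np.array(eigvals), np.array(eigvecs).T

# Example on an exactly orthogonal tensor built from orthonormal columns R.
rho_true = np.array([3.0, 2.0, 1.0])
R, _ = np.linalg.qr(np.random.default_rng(5).standard_normal((3, 3)))
T = np.einsum('r,ir,jr,lr->ijl', rho_true, R, R, R)
rho_hat, R_hat = power_method(T)
print(np.sort(rho_hat)[::-1])   # approximately [3., 2., 1.]
```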
39. Practical Considerations
k communities, n nodes, k ≪ n.
Steps
k-SVD of an n × n matrix: randomized techniques.
Online k × k × k tensor decomposition: no tensor explicitly formed.
Parallelization: inherently parallelizable, GPU deployment.
Sparse implementation: real-world networks are sparse.
Validation metric: p-value test based “soft-pairing”.
Parallel time complexity: O(nsk/c + k³), where s is the maximum degree in the graph and c is the number of cores.
Huang, Niranjan, Hakeem, and Anandkumar, “Fast Detection of Overlapping Communities via Online Tensor Methods,” preprint, Sept. 2013.
40. Scaling Of The Stochastic Iterations
Stochastic update for the ith eigenvector estimate:
v_i^{t+1} ← v_i^t − 3θβ^t ∑_{j=1}^{k} ⟨v_j^t, v_i^t⟩² v_j^t + β^t ⟨v_i^t, y_A^t⟩ ⟨v_i^t, y_B^t⟩ y_C^t + . . .
Parallelize across eigenvectors.
STGD is iterative: the device code reuses buffers for updates.
(Figure: CPU/GPU data flow of v_i^t and y_A^t, y_B^t, y_C^t under the CULA standard interface vs. the device interface.)
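A rough sketch of one such stochastic update applied to all eigenvector estimates at once; the trailing “+ . . .” symmetrization terms and the paper's actual implementation details are omitted, and theta and beta_t are assumed step-size constants:

```python
import numpy as np

def stgd_step(V, yA, yB, yC, theta, beta_t):
    """One stochastic tensor gradient descent (STGD) update.

    V: k x k matrix whose columns v_i are the current eigenvector estimates.
    yA, yB, yC: whitened sample vectors for the current data point.
    Implements v_i <- v_i - 3*theta*beta_t * sum_j <v_j, v_i>^2 v_j
                         + beta_t * <v_i, yA> <v_i, yB> yC   (remaining symmetric terms omitted).
    """
    G = V.T @ V                                       # G[j, i] = <v_j, v_i>
    contraction = V @ (G ** 2)                        # column i: sum_j <v_j, v_i>^2 v_j
    signal = np.outer(yC, (V.T @ yA) * (V.T @ yB))    # column i: <v_i, yA> <v_i, yB> yC
    return V - 3 * theta * beta_t * contraction + beta_t * signal

# Toy usage with assumed shapes and constants.
rng = np.random.default_rng(6)
k = 5
V = rng.standard_normal((k, k))
yA, yB, yC = rng.standard_normal((3, k))
V = stgd_step(V, yA, yB, yC, theta=1.0, beta_t=1e-3)
print(V.shape)
```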
41. Scaling Of The Stochastic Iterations
(Figure: running time (secs) vs. number of communities k, comparing the MATLAB Tensor Toolbox, CULA Standard Interface, CULA Device Interface, and Eigen Sparse implementations.)
43. Experimental Results
Datasets: Facebook friendship network (users, n ~ 20,000); Yelp user/business review network (n ~ 40,000); DBLP co-authorship network (authors, n ~ 1 million).
Error (E) and Recovery ratio (R):
Dataset            k̂     Method       Running Time   E       R
Facebook (k=360)   500   ours         468            0.0175  100%
Facebook (k=360)   500   variational  86,808         0.0308  100%
Yelp (k=159)       100   ours         287            0.046   86%
Yelp (k=159)       100   variational  N.A.
DBLP (k=6000)      100   ours         5407           0.105   95%
45. Experimental Results on Yelp
Lowest error business categories & largest weight businesses:
Rank  Category        Business                    Stars  Review Counts
1     Latin American  Salvadoreno Restaurant      4.0    36
2     Gluten Free     P.F. Chang's China Bistro   3.5    55
3     Hobby Shops     Make Meaning                4.5    14
4     Mass Media      KJZZ 91.5FM                 4.0    13
5     Yoga            Sutra Midtown               4.5    31
Bridgeness: distance from the vector [1/k̂, . . . , 1/k̂]⊤.
Top-5 bridging nodes (businesses):
Business              Categories
Four Peaks Brewing    Restaurants, Bars, American, Nightlife, Food, Pubs, Tempe
Pizzeria Bianco       Restaurants, Pizza, Phoenix
FEZ                   Restaurants, Bars, American, Nightlife, Mediterranean, Lounges, Phoenix
Matt's Big Breakfast  Restaurants, Phoenix, Breakfast & Brunch
Cornish Pasty Co      Restaurants, Bars, Nightlife, Pubs, Tempe
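A small sketch of the bridgeness score defined above, assuming pi_hat is a node's estimated membership vector over the k̂ communities:

```python
import numpy as np

def bridgeness(pi_hat):
    """Distance of a node's estimated membership vector from the uniform vector [1/k, ..., 1/k]."""
    k = len(pi_hat)
    return np.linalg.norm(pi_hat - np.full(k, 1.0 / k))

# Toy example: a node spread evenly across communities sits at the uniform vector.
print(bridgeness(np.array([0.25, 0.25, 0.25, 0.25])))   # 0.0
print(bridgeness(np.array([0.97, 0.01, 0.01, 0.01])))   # larger distance
```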
47. Conclusion
Guaranteed Learning of Latent Variable Models
Guaranteed to recover the correct model.
Efficient sample and computational complexities.
Better performance compared to EM, Variational Bayes, etc.
Mixed membership communities, topic models, ICA, Gaussian mixtures...
Current and Future Goals
Guaranteed online learning in high dimensions.
Large-scale cloud-based implementation of tensor approaches.
Code available on the website and on GitHub.