Tensor Decomposition: A Mathematical Tool for Data Analysis:
Tensors are multiway arrays, and tensor decompositions are powerful tools for data analysis. In this talk, we demonstrate the wide-ranging utility of the canonical polyadic (CP) tensor decomposition with examples in neuroscience and chemical detection. The CP model is extremely useful for interpretation, as we show with an example in neuroscience. However, it can be difficult to fit to real data for a variety of reasons. We present a novel randomized method for fitting the CP decomposition to dense data that is more scalable and robust than the standard techniques. We further consider the modeling assumptions for fitting tensor decompositions to data and explain alternative strategies for different statistical scenarios, resulting in a _generalized_ CP tensor decomposition.
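As a concrete illustration of the CP model (not the randomized method the talk presents), here is a minimal rank-R alternating-least-squares fit of a 3-way tensor in numpy; the sizes and rank are illustrative:

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product of two matrices with the same column count."""
    r = U.shape[1]
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, r)

def cp_als(X, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP decomposition of a 3-way tensor by alternating least squares."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Mode-n unfoldings (row-major, matching the Khatri-Rao ordering below)
    X0 = X.reshape(I, J * K)
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)
    for _ in range(n_iter):
        # Each factor solve uses the identity (KR^T KR) = Hadamard of Gram matrices
        A = X0 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X1 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X2 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Recover a known rank-2 tensor from its entries
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=2)
err = np.linalg.norm(np.einsum('ir,jr,kr->ijk', A, B, C) - X) / np.linalg.norm(X)
```

On exact low-rank data like this, ALS typically drives the relative error close to machine precision; on real noisy data, the difficulties the talk mentions (local minima, conditioning, scale) appear.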
Bio: Tamara G. Kolda is a member of the Data Science and Cyber Analytics Department at Sandia National Laboratories in Livermore, CA. Her research is generally in the area of computational science and data analysis, with specialties in multilinear algebra and tensor decompositions, graph models and algorithms, data mining, optimization, nonlinear solvers, parallel computing and the design of scientific software. She has received a Presidential Early Career Award for Scientists and Engineers (PECASE), been named a Distinguished Scientist of the Association for Computing Machinery (ACM) and a Fellow of the Society for Industrial and Applied Mathematics (SIAM). She was the winner of an R&D100 award and three best paper prizes at international conferences. She is currently a member of the SIAM Board of Trustees and serves as associate editor for both the SIAM J. Scientific Computing and the SIAM J. Matrix Analysis and Applications.
Anima Anandkumar, Principal Scientist, Amazon Web Services; Endowed Professor, Caltech, at MLconf
Large-scale Machine Learning: Deep, Distributed and Multi-Dimensional:
Modern machine learning involves deep neural network architectures, which yield state-of-the-art performance in multiple domains such as computer vision, natural language processing and speech recognition. As data and models scale, it becomes necessary to have multiple processing units for both training and inference. Apache MXNet is an open-source framework developed for distributed deep learning. I will describe the underlying lightweight hierarchical parameter server architecture that results in high efficiency in distributed settings.
Pushing the current boundaries of deep learning requires using multiple dimensions and modalities. These can be encoded into tensors, which are natural extensions of matrices. We present new deep learning architectures that preserve the multi-dimensional information in data end-to-end. We show that tensor contractions and regression layers are an effective replacement for fully connected layers in deep learning architectures. They result in significant space savings with negligible performance degradation. These functionalities are available in the Tensorly package with MXNet backend interface for large-scale efficient learning.
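A minimal numpy sketch of replacing a fully connected layer with a Tucker-style tensor contraction layer, as described above; the shapes, ranks, and 256-unit comparison are illustrative assumptions, not figures from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
batch, H, W, C = 32, 8, 8, 64                 # activations from a conv stack
X = rng.standard_normal((batch, H, W, C))

# A fully connected layer to 256 units flattens X and needs one huge matrix:
fc_params = H * W * C * 256                   # 1,048,576 parameters

# A tensor contraction layer keeps the modes separate, contracting each
# with its own small factor matrix (ranks chosen so 4*4*16 = 256 outputs):
r1, r2, r3 = 4, 4, 16
W1 = rng.standard_normal((H, r1))
W2 = rng.standard_normal((W, r2))
W3 = rng.standard_normal((C, r3))
Y = np.einsum('bhwc,hi,wj,ck->bijk', X, W1, W2, W3)   # multi-dimensional output
tcl_params = H * r1 + W * r2 + C * r3                  # 1,088 parameters
```

Flattening `Y` yields the same 256 features per example, but with roughly a thousand parameters instead of a million, which is the space saving the abstract refers to; the TensorLy package provides trainable layers of this kind.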
Bio: Anima Anandkumar is a principal scientist at Amazon Web Services and a Bren professor at Caltech CMS department. Her research interests are in the areas of large-scale machine learning, non-convex optimization and high-dimensional statistics. In particular, she has been spearheading the development and analysis of tensor algorithms. She is the recipient of several awards such as the Alfred. P. Sloan Fellowship, Microsoft Faculty Fellowship, Google research award, ARO and AFOSR Young Investigator Awards, NSF Career Award, Early Career Excellence in Research Award at UCI, Best Thesis Award from the ACM Sigmetrics society, IBM Fran Allen PhD fellowship, and several best paper awards. She has been featured in a number of forums such as the yourstory, Quora ML session, O’Reilly media, and so on. She received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, an assistant professor at U.C. Irvine between 2010 and 2016, and a visiting researcher at Microsoft Research New England in 2012 and 2014.
Rushin Shah, Engineering Manager, Facebook at MLconf SF 2017
Natural Language Understanding @ Facebook Scale:
At Facebook, text understanding is the key to surfacing content that’s relevant and personalized, plus enabling new experiences like social recommendations and Marketplace suggestions. In this talk, I will introduce you to DeepText, Facebook’s platform for text understanding, and discuss the various models it supports.
Bio: Rushin Shah is an engineering leader at Facebook, currently working on natural language understanding and dialog. Previously, he was at Siri at Apple for 5 years, where he built and headed the natural language understanding group, which included teams dedicated to modeling, engineering and data science. He also worked at the query understanding group at Yahoo. He has worked on a broad range of problems in the NLP area including parsing, information extraction, dialog and question answering. He holds degrees in language technologies and computer science from Carnegie Mellon and IIT Kharagpur.
Daniel Shank, Data Scientist, Talla at MLconf SF 2017
Getting Value Out of Chat Data:
Chat-based interfaces are increasingly common, whether as customers interacting with companies or as employees communicating with each other within an organization. Given the large number of chat logs being captured, along with recent advances in natural language processing, there is a desire to leverage this data for both insight generation and machine learning applications. Unfortunately, chat data is user-generated data, meaning it is often noisy and difficult to normalize. It also consists mostly of short, heavily context-dependent texts, which makes it difficult to apply methods such as topic modeling and information extraction.
Despite these challenges, it is still possible to extract useful information from these data sources. In this talk, I will be providing an overview of techniques and practices for working with chat-based user interaction data with a focus on machine-augmented data annotation and unsupervised learning methods.
Bio: Daniel Shank is a Senior Data Scientist at Talla, a company developing a platform for intelligent information discovery and delivery. His focus is on developing machine learning techniques to handle various business automation tasks, such as scheduling, polls, expert identification, as well as doing work on NLP. Before joining Talla as the company’s first employee in 2015, Daniel worked with TechStars Boston and did consulting work for ThriveHive, a small business focused marketing company in Boston. He studied economics at the University of Chicago.
Michael Alcorn, Sr. Software Engineer, Red Hat Inc. at MLconf SF 2017
Representation Learning @ Red Hat:
For many companies, the vast majority of their data is unstructured and unlabeled; however, the data often contains information that could be useful in a variety of scenarios. Representation learning is the process of extracting meaningful features from unlabeled data so that it can be used in other tasks. In this talk, you’ll hear about how Red Hat is using deep learning to discover meaningful entity representations in a number of different settings, including: (1) identifying duplicate documents on the Customer Portal, (2) finding contextually similar URLs with word2vec, and (3) clustering behaviorally similar customers with doc2vec. To close, we will walk through an example demonstrating how representation learning can be applied to Major League Baseball players.
Bio: Michael first developed his data crunching chops as an undergraduate at Auburn University (War Eagle!) where he used a number of different statistical techniques to investigate various aspects of salamander biology (work that led to several publications). He then went on to earn a M.S. in evolutionary biology from The University of Chicago (where he wrote a thesis on frog ecomorphology) before changing directions and earning a second M.S. in computer science (with a focus on intelligent systems) from The University of Texas at Dallas. As a Machine Learning Engineer – Information Retrieval at Red Hat, Michael is constantly looking for ways to use the latest and greatest machine learning technology to improve search.
Dr. June Andrews, Principal Data Scientist, Wise.io (from GE Digital) at MLconf
Counter Intuitive Machine Learning for the Industrial Internet of Things:
The Industrial Internet of Things (IIoT) is the infrastructure and data flow built around the world’s most valuable things like airplane engines, medical scanners, nuclear power plants, and oil pipelines. These machines and systems require far greater uptime, security, governance, and regulation than the IoT landscape based around consumer activity. In the IIoT the cost of being wrong can be the catastrophic loss of life on a massive scale. Nevertheless, given the growing scale through the digitalization of industrial assets, there is clearly a growing role for machine learning to help augment and automate human decision making. It is against this backdrop that traditional machine learning techniques must be adapted and need based innovations created. We see industrial machine learning as distinct from consumer machine learning and in this talk we will cover the counterintuitive changes of featurization, metrics for model performance, and human-in-the-loop design changes for using machine learning in an industrial environment.
Bio: June Andrews is a Principal Data Scientist at Wise.io (from GE Digital), working on a machine learning and data science platform for the Industrial Internet of Things, which includes aviation, trains, and power plants. Previously, she worked at Pinterest, spearheading the Data Trustworthiness and Signals Program to create a healthy data ecosystem for machine learning. She has also led efforts at LinkedIn on growth, engagement, and social network analysis to increase economic opportunity for professionals. June holds degrees in applied mathematics, computer science, and electrical engineering from UC Berkeley and Cornell.
Talha Obaid, Email Security, Symantec at MLconf ATL 2017
A Machine Learning Approach for Detecting Malware:
This project improves the way we detect script-based malware using machine learning. Malware has become one of the most active channels for delivering threats like banking Trojans and ransomware, and the talk presents a new and effective way to detect it. We started by acquiring both malicious and clean samples. We then performed feature identification, building on an existing knowledge base of malware, followed by automated feature extraction. Once a feature set was obtained, we teased out features that are categorical, interdependent, or composite. We applied a variety of machine learning models, producing both binary and categorical outcomes. We cross-validated our results and re-tuned our feature set and our model until we obtained satisfying results with the fewest false positives. We concluded that not all extracted features are significant; in fact, some features are detrimental to model performance. Once such features are factored out, the result is not only better matches but also a significant gain in performance.
Matineh Shaker, Artificial Intelligence Scientist, Bonsai at MLconf SF 2017
Deep Reinforcement Learning with Shallow Trees:
In this talk, I present Concept Network Reinforcement Learning (CNRL), developed at Bonsai. It is an industrially applicable approach to solving complex tasks using reinforcement learning, which facilitates problem decomposition, allows component reuse, and simplifies reward functions. Inspired by Sutton’s options framework, we introduce the notion of “Concept Networks” which are tree-like structures in which leaves are “sub-concepts” (sub-tasks), representing policies on a subset of state space. The parent (non-leaf) nodes are “Selectors”, containing policies on which sub-concept to choose from the child nodes, at each time during an episode. There will be a high-level overview on the reinforcement learning fundamentals at the beginning of the talk.
Bio: Matineh Shaker is an Artificial Intelligence Scientist at Bonsai in Berkeley, CA, where she builds machine learning, reinforcement learning, and deep learning tools and algorithms for general purpose intelligent systems. She was previously a Machine Learning Researcher at Geometric Intelligence, Data Science Fellow at Insight Data Science, Predoctoral Fellow at Harvard Medical School. She received her PhD from Northeastern University with a dissertation in geometry-inspired manifold learning.
Jonas Schneider, Head of Engineering for Robotics, OpenAI at MLconf
Machine Learning Systems at Scale:
OpenAI is a non-profit research company, discovering and enacting the path to safe artificial general intelligence. As part of our work, we regularly push the limits of scalability in cutting-edge ML algorithms. We’ve found that in many cases, designing the systems we build around the core algorithms is as important as designing the algorithms themselves. This means that many systems engineering areas, such as distributed computing, networking, and orchestration, are crucial for machine learning to succeed on large problems requiring thousands of computers. As a result, at OpenAI engineers and researchers work closely together to build these large systems as opposed to a strict researcher/engineer split. In this talk, we will go over some of the lessons we’ve learned, and how they come together in the design and internals of our system for learning-based robotics research.
Bio: Jonas leads technology development for OpenAI’s robotics group, developing methods to apply machine learning and AI to robots. He also helped build the infrastructure to scale OpenAI’s distributed ML systems to thousands of machines.
Xavier Amatriain, Cofounder & CTO, Curai at MLconf SF 2017
ML to Cure the World:
The practice of medicine involves diagnosis, treatment, and prevention of diseases. Recent technological breakthroughs have made little dent in the centuries-old system of practicing medicine: complex diagnostic decisions still depend mostly on “educated” work-ups by doctors, and rely on somewhat outdated tools and incomplete data. All of this often leads to imperfect, biased, and, at times, incorrect diagnosis and treatment.
With a growing research community as well as tech companies working on AI advances in medicine, the hope for a healthcare renaissance is definitely not lost. The emphasis of this talk will be on ML-driven medicine. We will discuss recent AI advancements for aiding medical decisions, including language understanding, medical knowledge base construction and diagnosis systems. We will discuss the importance of personalized medicine that takes into account not only the user, but also the context and other metadata. We will also highlight challenges in designing ML-based medical systems that are accurate, but at the same time engaging and trustworthy for the user.
Bio: Xavier Amatriain is currently co-founder and CTO of Curai, a stealth startup trying to radically improve healthcare for patients by using AI. Prior to this, he was VP of Engineering at Quora, and Research/Engineering Director at Netflix, where he led the team building the famous Netflix recommendation algorithms. Before moving into leadership positions in industry, Xavier was a research scientist at Telefonica Research and a research director at UCSB. With over 50 publications (and 3k+ citations) across different fields, Xavier is best known for his work on machine learning in general and recommender systems in particular. He has lectured at universities in both the US and Spain and is frequently invited to speak at conferences and companies.
LN Renganarayana, Architect, ML Platform and Services, and Madhura Dudhgaonkar, Workday, at MLconf
Lessons Learnt from building ML Products for enterprise SaaS:
Having spent the last 4+ years productizing ML-powered enterprise products, we have learnt a lot! Join us to hear the stories of our stumbles (ahem, learnings) in applying machine learning to solve business problems for Fortune 500 companies. Our hands-on experience has shaped our product strategy, ML platform design and organization’s operational principles, and the investments we made based on our learnings have helped us drastically improve our time to market for ML products. Come on by to hear the technical and organizational challenges (and some solutions) in building ML products for enterprise SaaS. Hopefully our learnings will be useful in your journey.
Bio: LN leads the architecture and design of Workday’s ML Platform and Services. He is all about building large scale distributed systems and data platforms. Currently his days (and some nights) are spent on solving the challenges in building ML products for Enterprise SaaS. LN’s career spans across HP, IBM Research, Symantec and now Workday. At Symantec, he was the architect and lead of a streaming platform that ingested and processed 2+ billions of events per day. As a Research Staff Member at IBM T.J. Watson Research Center, LN built optimizations for automatic parallelization, techniques for approximate computing, deployment automation for OpenStack, and analytics for large scale cloud services.
LN holds a Ph.D. in Computer Science from Colorado State University and has published more than 40 technical publications / patents. His work has received awards from ACM, IBM, and HP.
Bio: Madhura Dudhgaonkar is responsible for leading Workday’s search, data science and machine learning teams based in San Francisco. Her teams have spent ~4 years building machine learning products used by Fortune 500 companies. Her experience ranges from being a hands-on engineer to leading large engineering organizations. Madhura’s career spans Sun Microsystems, Adobe and now Workday. During her career, she has built a variety of products, from developing the Java language to building a version 1.0 consumer product to building enterprise SaaS products.
She holds a bachelor’s degree in electronics and telecommunications and a master’s degree in math and computer science.
Madhura is originally from a small town in India and came to the United States to pursue her passion in technology. She currently calls San Francisco home, and despite nine years here, can’t get enough of its hilly charm, the diversity of people, culture, and experiences.
Ash Munshi, Pepperdata
Classifying Multi-Variate Time Series at Scale:
Characterizing and understanding the runtime behavior of large scale Big Data production systems is extremely important. Typical systems consist of hundreds to thousands of machines in a cluster with hundreds of terabytes of storage, costing millions of dollars, solving problems that are business critical. By instrumenting each running process and measuring its resource utilization, including CPU, memory, I/O, network, etc., as time series, it is possible to understand and characterize the workload on these massive clusters. Each time series consists of tens to tens of thousands of data points that must be ingested and then classified. At Pepperdata, our instrumentation of the clusters collects over three hundred metrics from each task every five seconds, resulting in millions of data points per hour. At this scale the data are equivalent to the biggest IoT data sets in the world. Our objective is to classify the collection of time series into a set of classes that represent different workload types. Phrased differently, our problem is essentially that of classifying multivariate time series.
In this talk, we propose a unique, off-the-shelf approach to classifying time series that achieves near best-in-class accuracy for univariate series and generalizes to multivariate time series. Our technique maps each time series to a Gramian Angular Difference Field (GADF), interprets it as an image, uses Google’s pre-trained Inception v3 CNN to map the GADF images into a 2048-dimensional vector space, and then uses a small MLP with two hidden layers of fifty nodes each and a softmax output to achieve the final classification. Our work is not domain specific, a fact proven by our achieving accuracies competitive with published results on the univariate UCR data set as well as the multivariate UCI data set.
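The GADF step in the pipeline above is simple to state: rescale the series to [-1, 1], take phi_i = arccos(x_i), and set GADF[i, j] = sin(phi_i - phi_j). A small numpy sketch (the subsequent Inception v3 and MLP stages are omitted):

```python
import numpy as np

def gadf(series):
    """Gramian Angular Difference Field of a 1-D series."""
    x = np.asarray(series, dtype=float)
    # Rescale to [-1, 1] so arccos is defined
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1.0, 1.0))        # polar-coordinate angle
    # GADF[i, j] = sin(phi_i - phi_j): an antisymmetric "image" of the series
    return np.sin(phi[:, None] - phi[None, :])

series = np.sin(np.linspace(0, 6, 50))
img = gadf(series)            # a (50, 50) image ready for a CNN feature extractor
```

For a multivariate series, one such image per variable can be stacked as channels, which is one way the approach generalizes beyond the univariate case.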
Bio: Before joining Pepperdata, Ash was executive chairman for Marianas Labs, a deep learning startup sold in December 2015. Prior to that he was CEO for Graphite Systems, a big data storage startup that was sold to EMC DSSD in August 2015. Munshi also served as CTO of Yahoo, as a CEO of both public and private companies, and is on the board of several technology startups.
Doug Eck, Research Scientist, Google Magenta, at MLconf SF 2017
The Role of AI and Machine Learning in Creativity:
I’ll discuss Magenta, a Google Brain project investigating music and art generation using deep learning and reinforcement learning. I’ll describe the goals of Magenta and how it fits into the general trend of AI moving into our daily lives. One crucial question is: where do AI and machine learning fit in the creative process? I’ll argue that it’s about augmenting and extending the artist rather than just creating artifacts (songs, paintings, etc.) with machines. I’ll talk about two recent projects. In the first, we explore the use of recurrent neural networks to extend musical phrases in different ways. In the second, we look at teaching a neural network to draw with strokes. This will be a high-level overview talk with no need for knowledge of AI or machine learning.
Bio: Doug leads Magenta, a Google Brain project working to generate music, video, image and text using deep learning and reinforcement learning. A main goal of Magenta is to better understand how AI can enable artists and musicians to express themselves in innovative new ways. Before Magenta, Doug led the Google Play Music search and recommendation team. From 2003 to 2010 Doug was faculty at the University of Montreal’s MILA Machine Learning lab, where he worked on expressive music performance and automatic tagging of music audio.
Dr. Steve Liu, Chief Scientist, Tinder at MLconf SF 2017
Personalized User Recommendations at Tinder: The TinVec Approach:
With 26 million matches per day and more than 20 billion matches made to date, Tinder is the world’s most popular app for meeting new people. Our users swipe for a variety of purposes, like dating to find love, expanding social networks and meeting locals when traveling.
Recommendation is an important behind-the-scenes service at Tinder, and a good recommendation system needs to be personalized to meet an individual user’s preferences. In this talk, we will discuss a new personalized recommendation approach being developed at Tinder, called TinVec. TinVec embeds users’ preferences into vectors, leveraging the large number of swipes by Tinder users. We will discuss the design, implementation, and evaluation of TinVec as well as its application to personalized recommendations.
Bio: Dr. Steve Liu is chief scientist at Tinder. In his role, he leads research innovation and applies novel technologies to new product developments.
He is currently a professor and William Dawson Scholar at McGill University School of Computer Science. He has also served as a visiting research scientist at HP Labs. Dr. Liu has published more than 280 research papers in peer-reviewed international journals and conference proceedings, and has authored and co-authored several books. Over the course of his career, his research has focused on big data, machine learning/AI, computing systems and networking, the Internet of Things, and more. His research has been referenced in articles published by The New York Times, IDG/Computer World, The Register, Business Insider, Huffington Post, CBC, NewScientist, MIT Technology Review, McGill Daily and others. He is a recipient of the Outstanding Young Canadian Computer Science Researcher Prize from the Canadian Association of Computer Science and of the Tomlinson Scientist Award from McGill University.
He is serving or has served on the editorial boards of ACM Transactions on Cyber-Physical Systems (TCPS), IEEE/ACM Transactions on Networking (ToN), IEEE Transactions on Parallel and Distributed Systems (TPDS), IEEE Transactions on Vehicular Technology (TVT), and IEEE Communications Surveys and Tutorials (COMST). He has also served on the organizing committees of more than 38 major international conferences and workshops.
Dr. Liu received his Ph.D. in Computer Science with multiple honors from the University of Illinois at Urbana-Champaign. He received his Master’s degree in Automation and BSc degree in Mathematics from Tsinghua University.
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Codifying Data Science Intuition: Using Decision Theory to Automate Time Series Model Selection:
While models generated from cross-sectional data can use cross-validation for model selection, most time series models cannot be cross-validated due to the temporal structure of the data used to create them. It is possible to employ a rolling cross-validation technique; however, this process is computationally expensive and provides no indication of the models’ long-term forecast accuracy.
The purpose of this talk is to elaborate how decision theory can be used to automate time series model selection in order to streamline the manual process of validation and testing. By creating consecutive, temporally independent holdout sets, performance metrics for each model’s prediction on each holdout set are fed into a decision function to select an unbiased model. The decision function helps minimize the poorest performance of each model across all holdout sets in order to counteract the possibility of choosing a model that overfits or underfits the holdout sets. Not only does this process improve forecast accuracy, but it also reduces computation time by only requiring the creation of a fixed number of proposed forecasting models.
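One plausible reading of “minimize the poorest performance of each model across all holdout sets” is a minimax decision function. The sketch below uses hypothetical model names and error values, purely to show the mechanics:

```python
def select_model(forecast_errors):
    """Pick the model whose worst error across all holdout sets is smallest (minimax)."""
    worst = {name: max(errs) for name, errs in forecast_errors.items()}
    return min(worst, key=worst.get)

# One error per consecutive, temporally independent holdout set (illustrative values)
errors = {
    "arima": [0.12, 0.40, 0.15],   # strong on some holdouts, poor on one
    "ets":   [0.20, 0.22, 0.21],   # consistently middling
}
best = select_model(errors)        # "ets": its worst case (0.22) beats arima's (0.40)
```

A model that overfits one regime tends to blow up on at least one holdout set, so the minimax rule screens it out even when its average error looks competitive.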
Ashrith Barthur, Security Scientist, H2O.ai at MLconf 2017
Machine Learning Based Attack Vector Modeling for CyberSecurity:
Connections have behavioural patterns that are unique to protocols, loads, window sizes, bandwidth, and, above all, the type of traffic. A CDN enterprise behaves completely differently from a cloud service company, and both behave differently from a corporation. This also means that attack vectors and attack landscapes differ across all these environments. In this talk we discuss modeling different kinds of attacks and building a model that is able to identify these different kinds of attacks using ML.
Our method identifies different profiles based on many variables that specifically yet robustly characterize attacks of different kinds. The variables are specific to the business, network profile, and traffic, and range from high-level aggregates down to the packet level. This way the models pick up on constant variations in traffic, and we build machine learning models to identify these attacks. With the power of H2O, these analyses are not limited to a research exercise that concludes with an “Oh, so that’s what it was” moment. We actually deploy code alongside existing IDS and IPS, or deploy highly optimized, independent programs that handle throughputs of 1.2 million decisions per second, making this one of the fastest implementations of ML to identify, defend, and protect critical infrastructure that is potentially under threat.
Jessica Rudd, PhD Student, Analytics and Data Science, Kennesaw State University at MLconf
Application of Support Vector Machine Modeling and Graph Theory Metrics for Disease Classification:
Disease classification is a crucial element of biomedical research. Recent studies have demonstrated that machine learning techniques, such as Support Vector Machine (SVM) modeling, produce similar or improved predictive capabilities in comparison to the traditional method of Logistic Regression. In addition, it has been found that social network metrics can provide useful predictive information for disease modeling. In this study, we combine simulated social network metrics with SVM to predict diabetes in a sample of data from the Behavioral Risk Factor Surveillance System. In this dataset, Logistic Regression outperformed SVM with ROC index of 81.8 and 81.7 for models with and without graph metrics, respectively. SVM with a polynomial kernel had ROC index of 72.9 and 75.6 for models with and without graph metrics, respectively. Although this did not perform as well as Logistic Regression, the results are consistent with previous studies utilizing SVM to classify diabetes.
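For reference, the ROC index quoted above (on a 0-100 scale) is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A small numpy version of the metric:

```python
import numpy as np

def roc_index(y_true, scores):
    """ROC index (AUC x 100): chance a random positive outscores a random negative."""
    pos = scores[y_true == 1][:, None]
    neg = scores[y_true == 0][None, :]
    # Count ties as half a win, as is standard for AUC
    auc = (pos > neg).mean() + 0.5 * (pos == neg).mean()
    return 100 * auc

y = np.array([1, 1, 1, 0, 0, 0])
s = np.array([0.9, 0.8, 0.4, 0.5, 0.2, 0.1])  # one positive ranked below a negative
ri = roc_index(y, s)                           # 8 of 9 positive-negative pairs ordered correctly
```

An ROC index of 81.8, as reported for the logistic regression model, means roughly 82% of positive-negative pairs are ranked correctly by the model's score.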
Venkatesh Ramanathan, Data Scientist, PayPal at MLconf ATL 2017
Large Scale Graph Processing & Machine Learning Algorithms for Payment Fraud Prevention:
PayPal is at the forefront of applying large-scale graph processing and machine learning algorithms to keep fraudsters at bay. In this talk, I’ll present how advanced graph processing and machine learning algorithms such as Deep Learning and Gradient Boosting are applied at PayPal for fraud prevention. I’ll elaborate on specific challenges in applying large-scale graph processing and machine learning techniques to payment fraud prevention, and I’ll explain how we employ sophisticated machine learning tools, both open source and developed in-house.
I will also present results from experiments conducted on a very large graph data set containing millions of edges and vertices.
Alexandra Johnson, Software Engineer, SigOpt at MLconf ATL 2017
Best Practices for Hyperparameter Optimization:
All machine learning and artificial intelligence pipelines – from reinforcement agents to deep neural nets – have tunable hyperparameters. Optimizing these hyperparameters provides tremendous performance gains, but only if the optimization is done correctly. This presentation will discuss topics including selecting performance criteria, why you should always use cross validation, and choosing between state of the art optimization methods.
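Cross validation, one of the practices recommended above, comes down to splitting the data into folds and holding each fold out in turn. A minimal k-fold index generator (illustrative, not SigOpt's implementation):

```python
def k_fold_indices(n, k):
    # Yield (train, validation) index lists for k roughly equal,
    # non-overlapping folds of range(n).
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size
```

Each hyperparameter configuration is then scored by its average validation performance across the k folds, rather than by a single lucky split.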
https://github.com/telecombcn-dl/dlmm-2017-dcu
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
Massively Parallel K-Nearest Neighbor Computation on Distributed Architectures (Intel® Software)
This session discusses the implementation and performance of the k-nearest neighbor (KNN) computation on a distributed architecture using the Intel® Xeon Phi™ processor.
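The core pattern in distributed KNN is that each node computes the k nearest neighbors within its local data partition, and a driver merges the partial top-k lists into a global top-k. A single-process sketch of that pattern (illustrative only, not the Intel implementation):

```python
import heapq
import math

def distributed_knn(query, shards, k):
    # Each shard (standing in for one node's data partition) finds its
    # local k nearest points; the driver merges the partial results.
    partials = []
    for shard in shards:
        local = heapq.nsmallest(k, ((math.dist(query, p), p) for p in shard))
        partials.extend(local)
    # Global answer: k smallest distances among all local candidates.
    return [p for _, p in heapq.nsmallest(k, partials)]
```

Because each shard returns at most k candidates, the merge step touches only `k * num_shards` items regardless of total data size, which is what makes the pattern scale.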
ArrayUDF: User-Defined Scientific Data Analysis on Arrays (Goon83)
User-Defined Functions (UDFs) allow application programmers to specify analysis operations on data, while leaving data management and other non-trivial tasks to the system. This general approach is at the heart of modern Big Data systems such as MapReduce/Spark and SciDB. However, a wide variety of common scientific data operations, such as computing the moving average of a time series or the vorticity of a fluid flow, are hard to express and slow to execute with these Big Data systems. In this talk, we will introduce a new Big Data system named ArrayUDF (https://bitbucket.org/arrayudf/arrayudf) for scientific data sets, especially multi-dimensional arrays. ArrayUDF allows flexible expression of UDFs for scientific data analysis by exploiting their common characteristic: structural locality. ArrayUDF executes UDFs directly on arrays stored in files, such as HDF5, without any data loading overhead. ArrayUDF's design and implementation considerations for parallel data processing on large-scale HPC systems will also be introduced. Performance tests on Edison at NERSC show that ArrayUDF is around 2000X faster than Spark at processing large scientific datasets.
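Structural locality means a UDF is written in terms of a cell and its neighbors, and the system handles iteration and boundaries. A 1-D sketch of that execution model, with the moving average mentioned above as the UDF (illustrative only, not the ArrayUDF API):

```python
def apply_udf_1d(arr, udf, halo):
    # ArrayUDF-style execution model (sketch): the UDF sees each cell plus
    # its structural neighbourhood, here a window of +/- `halo` cells,
    # truncated at the array boundary.
    out = []
    for i in range(len(arr)):
        lo, hi = max(0, i - halo), min(len(arr), i + halo + 1)
        out.append(udf(arr[lo:hi]))
    return out

def moving_average(window):
    return sum(window) / len(window)
```

The user writes only `moving_average`; swapping in a different window function (a max filter, a finite-difference stencil) changes the analysis without touching the traversal code.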
Java Thread and Process Performance for Parallel Machine Learning on Multicore HPC Clusters (Saliya Ekanayake)
The growing use of Big Data frameworks on large machines highlights the importance of performance issues and the value of High Performance Computing (HPC) technology. This paper looks carefully at three major frameworks, Spark, Flink, and Message Passing Interface (MPI), both in scaling across nodes and internally over the many cores inside modern nodes. We focus on the special challenges of the Java Virtual Machine (JVM), using an Intel Haswell HPC cluster with 24 cores per node. Two parallel machine learning algorithms, k-means clustering and multidimensional scaling (MDS), are used in our performance studies. We identify three major issues affecting performance by large factors, thread models, affinity patterns, and communication mechanisms, and show how to optimize them so that Java can match the performance of traditional HPC languages like C. Further, we suggest approaches that preserve the user interface and elegant dataflow approach of Flink and Spark but modify the runtime so that these Big Data frameworks can achieve excellent performance and realize the goals of HPC-Big Data convergence.
Diefficiency Metrics: Measuring the Continuous Efficiency of Query Processing Approaches (Maribel Acosta Deibe)
During empirical evaluations of query processing techniques, metrics like execution time, time for the first answer, and throughput are usually reported. Albeit informative, these metrics are unable to quantify and evaluate the efficiency of a query engine over a certain time period – or diefficiency – thus hampering the distinction of cutting-edge engines able to exhibit high performance gradually. We tackle this issue and devise two experimental metrics named dief@t and dief@k, which allow for measuring the diefficiency during an elapsed time period t or while k answers are produced, respectively. The dief@t and dief@k measurement methods rely on computing the area under the curve of answer traces, thus capturing the answer concentration over a time interval. We report experimental results of evaluating the behavior of a generic SPARQL query engine using both metrics. Observed results suggest that dief@t and dief@k are able to measure the performance of SPARQL query engines based on both the amount of answers produced by an engine and the time required to generate these answers.
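The area-under-the-curve idea behind dief@t can be computed directly from an answer trace, the list of timestamps at which successive answers arrived. A minimal sketch (my own reading of the metric, not the authors' reference code):

```python
def dief_at_t(answer_times, t):
    # Area under the step curve of answers-produced vs. time, up to time t.
    # Higher values mean the engine delivered more answers earlier.
    area, prev, count = 0.0, 0.0, 0
    for ts in sorted(answer_times):
        if ts > t:
            break
        area += count * (ts - prev)  # `count` answers held since `prev`
        prev, count = ts, count + 1
    return area + count * (t - prev)  # final segment up to time t
```

Two engines producing the same three answers score differently if one delivers them earlier, which is exactly the "continuous efficiency" the plain execution-time metric misses.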
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization (Fabian Pedregosa)
Short presentation of the paper "Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization"
https://arxiv.org/abs/1707.06468
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Similar presentations to Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Laboratories at MLconf SF 2017:
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments for AI Technologies
Social and Equity Impact Assessments have broad applications, but they can be a useful tool for exploring and mitigating machine learning fairness issues. They can be applied to product-specific questions to generate insights and learnings about users, as well as about the broader societal impacts of deploying new and emerging technologies.
In this presentation, my goal is to advocate for and highlight the need for community and external stakeholder engagement in order to develop a new knowledge base and understanding of the human and social consequences of algorithmic decision making, and to introduce principles, methods, and processes for these types of impact assessments.
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Like the visual cortex, the regions of the brain involved in understanding language represent information hierarchically. But whereas the visual cortex organizes things into a spatial hierarchy, the language regions encode information into a hierarchy of timescale. This organization is key to our uniquely human ability to integrate semantic information across narratives. More and more, deep learning-based approaches to natural language understanding embrace models that incorporate contextual information at varying timescales. This has not only led to state-of-the-art performance on many difficult natural language tasks, but also to breakthroughs in our understanding of brain activity.
In this talk, we will discuss the important connection between language understanding and context at different timescales. We will explore how different deep learning architectures capture timescales in language and how closely their encodings mimic the brain. Along the way, we will uncover some surprising discoveries about what depth does and doesn’t buy you in deep recurrent neural networks. And we’ll describe a new, more flexible way to think about these architectures and ease design space exploration. Finally, we’ll discuss some of the exciting applications made possible by these breakthroughs.
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Recycling Stream
With China’s recent refusal of most foreign recyclables, North American waste haulers are scrambling to figure out how to make on-shore recycling cost-effective in order to continue providing recycling services. Recyclables that were once being shipped to China for manual sorting are now primarily being redirected to landfills or incinerators. Without a solution, a nearly $5 billion annual recycling market could come to a halt.
Purity in the recycling stream is key to this effort as contaminants in the stream can increase the cost of operations, damage equipment and reduce the ability to create pure commodities suitable for creating recycled goods. This market disruption as a result of China’s new regulations, however, provides us the chance to re-examine and improve our current disposal & collection habits with modern monitoring & artificial intelligence technology.
Using images from our in-dumpster cameras, Compology has developed an ML-based process that helps identify, measure and alert for contaminants in recycling containers before they are picked-up, helping keep the recycling stream clean.
Our convolutional neural network flags potential instances of contamination inside a dumpster, enabling garbage haulers to know which containers have the wrong type of material inside. This allows them to provide targeted, timely education, and when appropriate, assess fines, to improve recycling compliance at the businesses and residences they serve, helping keep recycling services financially viable.
In this presentation, we will walk through our ML-based contamination measurement and scoring process by showing how Waste Management, a national waste hauler, has experienced a 57% contamination reduction in nearly 2,000 containers over six months. This progress represents significant strides toward financially viable recycling services.
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Quantum computers promise a significant step up in computational power over conventional computers, but they also suffer a number of counterintuitive limitations, both in their computational model and in leading lab implementations. In this talk, we review how quantum computers compete with conventional computers and how conventional computers try to hold their ground. Then we outline what stands in the way of successful quantum ML applications.
Josh Wills - Data Labeling as Religious Experience
One of the most common places to deploy a production machine learning system is as a replacement for a legacy rules-based system that is having a hard time keeping up with new edge cases and requirements. I'll walk through the process and tooling we used to design, train, and deploy a model to replace a set of static rules we had for handling invite spam at Slack, talk about what we learned, and discuss some problems to solve in order to make these migrations easier for everyone.
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gait kinematics
The emergence of the upright human bipedal gait can be traced back 4 to 2.8 million years, to the now-extinct hominin Australopithecus afarensis. Fine-grained analysis of gait using the modern MEMS sensors found on all smartphones not only reveals a lot about a person’s orthopedic and neuromuscular health status, but also carries enough idiosyncratic clues to be harnessed as a passive biometric. While the machine learning community has made many siloed attempts to model bipedal gait sensor data, these were done with small datasets, often collected in restricted academic environs. In this talk, we will introduce the ImageNet moment for human gait analysis by presenting 'Project GaitNet', the largest planet-scale motion-sensor-based human bipedal gait dataset ever curated. We’ll also present the associated state-of-the-art results in classifying humans using novel deep neural architectures, and the related success stories we have enjoyed in transfer learning to disparate domains of human kinematics analysis.
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disease from Speech and Language
Alzheimer's disease affects millions of people worldwide, and it is important to predict the disease as early and as accurately as possible. In this talk, I will discuss the development of novel ML models that help classify healthy people versus those who develop Alzheimer's, using short samples of human speech. As input to the model, features of different modalities are extracted from speech audio samples and transcriptions: (1) syntactic measures, such as production rules extracted from syntactic parse trees; (2) lexical measures, such as features of lexical richness and complexity and lexical norms; and (3) acoustic measures, such as standard Mel-frequency cepstral coefficients. I will present an ML model that detects cognitive impairment by reaching agreement among modalities. The resulting model achieves state-of-the-art performance in both supervised and semi-supervised settings, using manual transcripts of human speech. Additionally, I will discuss potential limitations of any fully automated speech-based Alzheimer's disease detection model, focusing mostly on the impact of imperfect automatic speech recognition (ASR) on classification performance. To illustrate this, I will present experiments with controlled amounts of artificially generated ASR errors and explain how deletion errors hurt Alzheimer's detection performance the most, due to their impact on the features of syntactic and lexical complexity.
Meghana Ravikumar - Optimized Image Classification on the Cheap
In this talk, we anchor on building an image classifier trained on the Stanford Cars dataset to evaluate two approaches to transfer learning, fine tuning and feature extraction, and the impact of hyperparameter optimization on these techniques. Once we identify the most performant transfer learning technique for Stanford Cars, we will double the size of the dataset through image augmentation to boost the classifier’s performance. We will use Bayesian optimization to learn the hyperparameters associated with image transformations, using the downstream image classifier’s performance as the guide. In conjunction with model performance, we will also focus on the features of these augmented images and the downstream implications for our image classifier.
To both maximize model performance on a budget and explore the impact of optimization on these methods, we apply a particularly efficient implementation of Bayesian optimization to each of these architectures in this comparison. Our goal is to draw on a rigorous set of experimental results that can help us answer the question: how can resource-constrained teams make trade-offs between efficiency and effectiveness using pre-trained models?
Noam Finkelstein - The Importance of Modeling Data Collection
Data sets used in machine learning are often collected in a systematically biased way - certain data points are more likely to be collected than others. We call this "observation bias". For example, in health care, we are more likely to see lab tests when the patient is feeling unwell than otherwise. Failing to account for observation bias can, of course, result in poor predictions on new data. By contrast, properly accounting for this bias allows us to make better use of the data we do have.
In this presentation, we discuss practical and theoretical approaches to dealing with observation bias. When the nature of the bias is known, there are simple adjustments we can make to nonparametric function estimation techniques, such as Gaussian Process models. We also discuss the scenario where the data collection model is unknown. In this case, there are steps we can take to estimate it from observed data. Finally, we demonstrate that having a small subset of data points that are known to be collected at random - that is, in an unbiased way - can vastly improve our ability to account for observation bias in the rest of the data set.
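When the collection model is known, the "simple adjustments" mentioned above can be as basic as inverse-probability weighting: each observed point is weighted by the reciprocal of its chance of being observed. A toy sketch with synthetic data (the observation probabilities here are invented for illustration):

```python
import random

random.seed(0)
# True population: integers 0-9, uniform; the true mean is 4.5.
population = [random.randrange(10) for _ in range(200000)]

# Biased collection: value v is observed with probability (v + 1) / 10,
# so large values are over-represented (like lab tests for unwell patients).
observed = [v for v in population if random.random() < (v + 1) / 10]

naive_mean = sum(observed) / len(observed)  # biased upward, near 6.0

# Correct for the bias: weight each point by 1 / P(observe | v).
weights = [10 / (v + 1) for v in observed]
corrected_mean = sum(v * w for v, w in zip(observed, weights)) / sum(weights)
```

The naive mean overshoots the true value of 4.5, while the weighted estimate recovers it, which is the essence of accounting for observation bias when the collection model is known.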
My hope is that attendees of this presentation will be aware of the perils of observation bias in their own work, and be equipped with tools to address it.
The Uncanny Valley of ML
Every so often, the conundrum of the Uncanny Valley re-emerges as advanced technologies evolve from clearly experimental products to refined accepted technologies. We have seen its effects in robotics, computer graphics, and page load times. The debate of how to handle the new technology detracts from its benefits. When machine learning is added to human decision systems a similar effect can be measured in increased response time and decreased accuracy. These systems include radiology, judicial assignments, bus schedules, housing prices, power grids and a growing variety of applications. Unfortunately, the Uncanny Valley of ML can be hard to detect in these systems and can lead to degraded system performance when ML is introduced, at great expense. Here, we'll introduce key design principles for introducing ML into human decision systems to navigate around the Uncanny Valley and avoid its pitfalls.
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Recognizing and distinguishing specific semantic relations from other types of semantic relations is an essential part of language understanding systems. Identifying expressions with similar and contrasting meanings is valuable for NLP systems that go beyond recognizing semantic relatedness and need to identify specific semantic relations. In this talk, I will first present novel techniques for creating the labelled datasets required for training deep learning models to classify semantic relations between phrases. I will then present neural network architectures that integrate morphological features into combined path-based and distributional relation detection algorithms, and demonstrate that this model outperforms state-of-the-art models in distinguishing semantic relations and is capable of efficiently handling multi-word expressions.
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global Deep Learned Recommender System Model
At Netflix, our main goal is to maximize our members’ enjoyment of the selected show by minimizing the amount of time it takes for them to find it. We try to achieve this goal by personalizing almost all the aspects of our product -- from what shows to recommend, to how to present these shows and construct their home-pages to what images to select per show, among many other things. Everything is recommendations for us and as an applied Machine Learning group, we spend our time building models for personalization that will eventually increase the joy and satisfaction of our members. In this talk we will primarily focus our attention on a) making a global deep learned recommender model that is regional tastes and popularity aware and b) adapting this model to changing taste preferences as well as dynamic catalog availability.
We will first go through some standard recommender system models that use Matrix Factorization and Topic Models, and then compare and contrast them with more powerful, higher-capacity deep learning based models such as sequence models that use recurrent neural networks. We will show what it entails to build a global model that is aware of regional taste preferences and catalog availability, and how models built on the simple Maximum Likelihood principle fail to do this. We will then describe one solution we have employed to enable the global deep learned models to focus their attention on capturing regional taste preferences and a changing catalog. In the latter half of the talk, we will discuss how we do incremental learning of deep learned recommender system models. Why do we need to do that? Everything changes with time. Users’ tastes change with time. What’s available on Netflix and what’s popular also change over time. Therefore, updating or improving recommendation systems over time is necessary to bring more joy to users. In addition to how we apply incremental learning, we will discuss some of the challenges we face involving large-scale data preparation, infrastructure setup for incremental model training, and pipeline scheduling. Incremental training enables us to serve fresher models trained on fresher and larger amounts of data. This helps our recommender system adapt quickly to catalog and taste changes, and improve overall performance.
Vito Ostuni - The Voice: New Challenges in a Zero UI World
The adoption of voice-enabled devices has seen an explosive growth in the last few years and music consumption is among the most popular use cases. Music personalization and recommendation plays a major role at Pandora in providing a daily delightful listening experience for millions of users. In turn, providing the same perfectly tailored listening experience through these novel voice interfaces brings new interesting challenges and exciting opportunities. In this talk we will describe how we apply personalization and recommendation techniques in three common voice scenarios which can be defined in terms of request types: known-item, thematic, and broad open-ended. We will describe how we use deep learning slot filling techniques and query classification to interpret the user intent and identify the main concepts in the query.
We will also present the differences and challenges regarding evaluation of voice powered recommendation systems. Since pure voice interfaces do not contain visual UI elements, relevance labels need to be inferred through implicit actions such as play time, query reformulations or other types of session level information. Another difference is that while the typical recommendation task corresponds to recommending a ranked list of items, a voice play request translates into a single item play action. Thus, some considerations about closed feedback loops need to be made. In summary, improving the quality of voice interactions in music services is a relatively new challenge and many exciting opportunities for breakthroughs still remain. There are many new aspects of recommendation system interfaces to address to bring a delightful and effortless experience for voice users. We will share a few open challenges to solve for the future.
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers, without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies often need help adapting and embracing new ideas to keep up with the competition. Fostering a culture of innovation takes real work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also held a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Epistemic Interaction - tuning interfaces to provide information for AI support (Alan Dix)
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
JMeter webinar - integration with InfluxDB and Grafana (RTTS)
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
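For reference, the usual out-of-the-box route is JMeter's bundled InfluxDB Backend Listener added to the test plan; a typical parameter set looks like the sketch below. The names match recent JMeter releases but should be verified against your JMeter version and InfluxDB endpoint:

```properties
# Backend Listener implementation:
#   org.apache.jmeter.visualizers.backend.influxdb.InfluxdbBackendListenerClient
influxdbMetricsSender=org.apache.jmeter.visualizers.backend.influxdb.HttpMetricsSender
influxdbUrl=http://localhost:8086/write?db=jmeter
application=practice_web_app
measurement=jmeter
summaryOnly=false
samplersRegex=.*
percentiles=90;95;99
testTitle=Load test monitored in Grafana
```

Grafana then reads the `jmeter` measurement from InfluxDB as a data source to build the dashboards demonstrated in the webinar.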
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deployment Firewall and DBOM (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Smart TV Buyer Insights Survey 2024 by 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, the aspects they look for in a new TV, and their TV buying preferences.
Tamara G. Kolda, Distinguished Member of Technical Staff, Sandia National Laboratories at MLconf SF 2017
1. Illustration by Chris Brigman
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
MLconf: The Machine Learning Conference
San Francisco, CA, Nov 10, 2017
2. A Tensor Is a d-Way Array
11/10/2017 Kolda @ MLconf 2
Vector (d = 1), Matrix (d = 2), 3rd-Order Tensor (d = 3), 4th-Order Tensor (d = 4), 5th-Order Tensor (d = 5)
3. From Matrices to Tensors: Two Points of View
Viewpoint 1: Sum of outer products, useful for interpretation. Matrix examples: singular value decomposition (SVD), eigendecomposition (EVD), nonnegative matrix factorization (NMF), sparse SVD, CUR, etc. The tensor analogue is the CP model: a sum of d-way outer products, useful for interpretation (also known as CANDECOMP, PARAFAC, or canonical polyadic).
Viewpoint 2: High-variance subspaces, useful for compression. The tensor analogue is the Tucker model: project onto high-variance subspaces to reduce dimensionality (HO-SVD, best rank-(𝑅1, 𝑅2, …, 𝑅d) decomposition). Other models for compression include hierarchical Tucker and tensor train.
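To make the sum-of-outer-products view concrete, here is a minimal NumPy sketch (my own illustration, not code from the talk) that reconstructs a 3-way tensor from rank-R CP factor matrices:

```python
import numpy as np

def cp_to_tensor(A, B, C):
    """Reconstruct a 3-way tensor from CP factor matrices:
    M[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3  # tensor dimensions and CP rank
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))

M = cp_to_tensor(A, B, C)

# The same tensor built explicitly as a sum of R rank-one (outer-product) terms
M_terms = sum(np.multiply.outer(np.outer(A[:, r], B[:, r]), C[:, r])
              for r in range(R))
assert np.allclose(M, M_terms)
```

The `einsum` form and the explicit sum of rank-one terms agree elementwise; the latter is the "sum of d-way outer products" viewpoint stated above.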
11. Motivating Example: Neuron Activity in Learning
Thanks to the Schnitzer Group @ Stanford: Mark Schnitzer, Fori Wang, Tony Kim
(Figure: a mouse navigates a "maze" while neural activity is imaged with a miniature microscope by Inscopix. The data form a tensor of 300 neurons × 120 time bins × 600 trials (over 5 days); each trial contributes one column of a neuron × time matrix.)
Williams et al., bioRxiv, 2017, DOI:10.1101/211128
12. Trials Vary Start Position and Strategies
• 600 Trials over 5 Days
• Start West or East
• Conditions Swap Twice
❖ Always Turn South
❖ Always Turn Right
❖ Always Turn South
(Maze schematic: compass directions W/E/N/S and a wall; note the different patterns on the curtains.)
Williams et al., bioRxiv, 2017, DOI:10.1101/211128
13. CP for Simultaneous Analysis of Neurons, Time, and Trial
Prior tensor work in neuroscience for fMRI and EEG: Andersen and Rayens (2004), Mørup et al. (2004), Acar et al. (2007), De Vos et al. (2007), and more
Williams et al., bioRxiv, 2017, DOI:10.1101/211128
18. CP-ALS: Fitting the CP Model via Alternating Least Squares
Harshman, 1970; Carroll & Chang, 1970
▪ Rank is NP-hard: Determining the rank R is NP-hard, and the best low-rank approximation may not even exist (Håstad 1990; de Silva & Lim 2006; Hillar & Lim 2009)
▪ Not nested: The best rank-(R−1) factorization may not be part of the best rank-R factorization (Kolda 2001)
▪ Nonconvex: But each subproblem is a convex linear least squares problem
▪ Not orthogonal: Factor matrices are not orthogonal and may even have linearly dependent columns
▪ Essentially unique: Under modest conditions, CP is unique up to permutation and scaling (Kruskal 1977)
Repeat until convergence:
Step 1: Fix B and C; solve a linear least squares problem for A.
Step 2: Fix A and C; solve for B.
Step 3: Fix A and B; solve for C.
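The alternating scheme can be sketched as follows, assuming a dense 3-way tensor; `unfold`, `khatri_rao`, and `cp_als` are illustrative names of my own, not the talk's implementation:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization: rows indexed by `mode`, remaining modes flattened."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(U, V):
    """Column-wise Khatri-Rao product, matching the C-order unfolding above."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def cp_als(T, R, n_iter=200, seed=0):
    """Fit a rank-R CP model to a dense 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    A = [rng.standard_normal((T.shape[n], R)) for n in range(3)]
    for _ in range(n_iter):
        for n in range(3):  # Steps 1-3: update one factor with the others fixed
            U, V = [A[m] for m in range(3) if m != n]
            KR = khatri_rao(U, V)
            # Solve T_(n) ~= A[n] @ KR.T in the least squares sense
            A[n] = np.linalg.lstsq(KR, unfold(T, n).T, rcond=None)[0].T
    return A

# Recover a planted rank-3 model
rng = np.random.default_rng(1)
true = [rng.standard_normal((s, 3)) for s in (4, 5, 6)]
T = np.einsum('ir,jr,kr->ijk', *true)
A, B, C = cp_als(T, R=3)
M = np.einsum('ir,jr,kr->ijk', A, B, C)
rel_err = np.linalg.norm(T - M) / np.linalg.norm(T)
```

Each inner update is exactly the convex linear least squares subproblem mentioned in the bullets above; only the overall problem is nonconvex.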
24. Randomizing the Convergence Check
Estimate convergence of function values using a small random subset of elements in the function evaluation (use Chernoff-Hoeffding bounds on the accuracy).
16000 samples < 1% of full data
Battaglino, Ballard, & Kolda 2017
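A toy version of the idea (my own sketch, not the paper's code): sample elements uniformly, evaluate the squared error only on the sample, and rescale to estimate the full objective:

```python
import numpy as np

def sampled_sq_error(X, M, n_samples=16000, seed=0):
    """Unbiased estimate of ||X - M||_F^2 from a uniform random sample of
    elements (with replacement), rescaled to the full tensor size."""
    rng = np.random.default_rng(seed)
    idx = tuple(rng.integers(0, s, size=n_samples) for s in X.shape)
    sample_mean = np.mean((X[idx] - M[idx]) ** 2)
    return X.size * sample_mean

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 40, 50))
M = X + 0.1 * rng.standard_normal(X.shape)  # model with a small residual

est = sampled_sq_error(X, M)
true = np.sum((X - M) ** 2)
```

Since the estimate is a mean of bounded-variance samples, Chernoff-Hoeffding-style bounds control how far it can stray from the true value, which is what makes it safe to use in a convergence check.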
25. Speed Advantage: Analysis of Hazardous Gas Data
Data from Vergara et al. 2013; see also Vervliet and De Lathauwer (2016)
(Factor plots: this mode scaled by component size; color-coded by gas type.)
900 experiments (with three different gas types) × 72 sensors × 25,900 time steps (13 GB)
Battaglino, Ballard, & Kolda 2017
28. Generalizing the Goodness-of-Fit Criteria
Anderson-Bergman, Duersch, Hong, Kolda 2017
Similar ideas have been proposed in the matrix world, e.g., Collins, Dasgupta, & Schapire 2002
29. “Standard” CP
Typically: consider the data to be low rank plus "white noise"; equivalently, Gaussian with mean 𝑚𝑖𝑗𝑘.
Minimizing the negative log likelihood of the Gaussian probability density function (PDF) results in the "standard" sum-of-squares objective; the link is the identity, with 𝑚𝑖𝑗𝑘 given directly by the low-rank CP model.
Anderson-Bergman, Duersch, Hong, Kolda 2017
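In symbols, a standard reconstruction of the derivation sketched here (consistent with the generalized CP framing):

```latex
% Gaussian PDF with mean m (variance \sigma^2 fixed):
p(x \mid m) = \frac{1}{\sqrt{2\pi\sigma^2}}
              \exp\!\left(-\frac{(x-m)^2}{2\sigma^2}\right)
% Negative log likelihood, dropping constants:
-\log p(x \mid m) \;\propto\; (x - m)^2
% Summed over all entries, the "standard" CP objective:
\min \sum_{i,j,k} \bigl(x_{ijk} - m_{ijk}\bigr)^2,
\qquad
m_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr}
```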
30. “Boolean CP”: Odds Link
Consider the data to be Bernoulli distributed with probability 𝑝𝑖𝑗𝑘.
Probability mass function (PMF): 𝑝^𝑥 (1 − 𝑝)^(1−𝑥)
Convert from probability to odds: 𝑝𝑖𝑗𝑘 = 𝑚𝑖𝑗𝑘 / (1 + 𝑚𝑖𝑗𝑘) ⇔ 𝑚𝑖𝑗𝑘 = 𝑝𝑖𝑗𝑘 / (1 − 𝑝𝑖𝑗𝑘)
Equivalent to minimizing the negative log likelihood.
Anderson-Bergman, Duersch, Hong, Kolda 2017
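Carrying the odds link through the Bernoulli negative log likelihood recovers the elementwise odds loss f(x, m) = log(1 + m) − x log m:

```latex
-\log\!\bigl(p^{x}(1-p)^{1-x}\bigr)
  = -x\log p - (1-x)\log(1-p)
% Substitute the odds link p = m/(1+m):
\log p = \log m - \log(1+m), \qquad \log(1-p) = -\log(1+m)
% which gives
f(x, m) = \log(1+m) - x\log m
```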
32. A Sparse Dataset
▪ UC Irvine Chat Network
▪ 4-way binary tensor
▪ Sender (205)
▪ Receiver (210)
▪ Hour of Day (24)
▪ Day (194)
▪ 14,953 nonzeros (very sparse)
▪ Goodness-of-fit (odds link): 𝑓(𝑥, 𝑚) = log(𝑚 + 1) − 𝑥 log 𝑚
▪ Use GCP to compute a rank-12 decomposition
Opsahl, T., Panzarasa, P., 2009. Clustering in weighted networks. Social Networks 31 (2), 155-163, doi: 10.1016/j.socnet.2009.02.002
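The odds-link loss can be evaluated elementwise in a few lines; this sketch (my own, not the GCP implementation) checks it against the direct Bernoulli negative log likelihood:

```python
import numpy as np

def bernoulli_odds_loss(x, m):
    """GCP elementwise loss for binary data with the odds link:
    f(x, m) = log(m + 1) - x * log(m), where m > 0 is the model odds."""
    return np.log(m + 1) - x * np.log(m)

# Check against the direct negative log likelihood with p = m / (1 + m)
rng = np.random.default_rng(3)
m = rng.uniform(0.1, 5.0, size=1000)             # model odds
x = rng.integers(0, 2, size=1000).astype(float)  # binary data
p = m / (1 + m)
nll = -(x * np.log(p) + (1 - x) * np.log(1 - p))
assert np.allclose(bernoulli_odds_loss(x, m), nll)
```

The agreement shows that fitting with this loss is exactly maximum likelihood under the Bernoulli model with the odds link.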
33. Binary Chat Data using Boolean CP
Anderson-Bergman, Duersch, Hong, Kolda 2017
34. Tensors & Data Analysis
▪ CP tensor decomposition is effective for unsupervised data analysis
▪ Latent factor analysis
▪ Dimension reduction
▪ CP can be generalized to alternative fit functions
▪ Boolean data, count data, etc.
▪ Randomized techniques are opening new doorways to larger datasets and more robust solutions
▪ Matrix sketching
▪ Stochastic gradient descent
▪ Other ongoing & future work
▪ Parallel CP and GCP implementations
(https://gitlab.com/tensors/genten)
▪ Parallel Tucker for compression
(https://gitlab.com/tensors/TuckerMPI)
▪ Randomized ST-HOSVD (Tucker)
▪ Functional tensor factorization as surrogate for expensive functions
▪ Extensions to many more applications (binary data, signals, etc.)
Acknowledgements
▪ Cliff Anderson-Bergman (Sandia)
▪ Grey Ballard (Wake Forest)
▪ Casey Battaglino (Georgia Tech)
▪ Jed Duersch (Sandia)
▪ David Hong (U. Michigan)
▪ Alex Williams (Stanford)
Kolda and Bader, Tensor Decompositions and Applications, SIAM Review '09
Tensor Toolbox for MATLAB: www.tensortoolbox.org (Bader, Kolda, Acar, Dunlavy, and others)
Contact: Tammy Kolda, www.kolda.net, tgkolda@sandia.gov