Representation Learning @ Red Hat:
For many companies, the vast majority of their data is unstructured and unlabeled; however, that data often contains information that could be useful in a variety of scenarios. Representation learning is the process of extracting meaningful features from unlabeled data so that those features can be reused in downstream tasks. In this talk, you’ll hear how Red Hat is using deep learning to discover meaningful entity representations in a number of different settings, including: (1) identifying duplicate documents on the Customer Portal, (2) finding contextually similar URLs with word2vec, and (3) clustering behaviorally similar customers with doc2vec. To close, we will walk through an example demonstrating how representation learning can be applied to Major League Baseball players.
Bio: Michael first developed his data-crunching chops as an undergraduate at Auburn University (War Eagle!), where he used a number of different statistical techniques to investigate various aspects of salamander biology (work that led to several publications). He then went on to earn an M.S. in evolutionary biology from The University of Chicago (where he wrote a thesis on frog ecomorphology) before changing directions and earning a second M.S. in computer science (with a focus on intelligent systems) from The University of Texas at Dallas. As a Machine Learning Engineer – Information Retrieval at Red Hat, Michael is constantly looking for ways to use the latest and greatest machine learning technology to improve search.