Having previously worked at both Millennial Media and AOL, Michael Conway brought his expertise to Bidtellect, where he was tasked with transforming the business into a self-service, SaaS-based content distribution platform, enabling the company to grow 10-fold.
Next DSS MIA Event - https://datascience.salon/miami/
During the 30-minute presentation, Michael will provide background information about Bidtellect and how data is an integral component of the business managing their premium native inventory across their supply ecosystem with over 5 billion native auctions per day. As Bidtellect embraces big data, Michael will share the challenges and successes he and his team have experienced along the way. In addition, Steve Sarsfield, Vertica Senior Product Marketing Manager, will be available to discuss how specific technologies (SQL, Python, R and embedded algorithms) can be combined in an MPP environment to achieve big data analytics success.
Data Science Salon: Building smart AI: How Deep Learning Can Get You Into Dee... Formulatedby
Presented by Michael Housman, Chief Data Scientist at RapportBoost.AI
Next DSS NYC Event 👉 https://datascience.salon/newyork/
Next DSS LA Event 👉 https://datascience.salon/la/
Recent advances in deep learning have fueled tremendous excitement about the potential for artificial intelligence to solve countless problems. But there are some perils and pitfalls endemic to these new techniques, particularly because they ignore two essential components of the scientific method: (1) understanding the how; and (2) explaining the why. Dr. Michael Housman offers two specific examples from his own career as a data scientist to show how a naive application of deep learning algorithms can lead data scientists to the wrong conclusion, and offers some guidance for avoiding these mistakes.
Data Science Salon: Adopting Machine Learning to Drive Revenue and Market Share Formulatedby
The race is on to gain strategic and proprietary insights into changes in customer preferences before your competitors. This workshop will cover how and why machine learning is the tool for marketers to drive revenue and increase market share. The adoption of machine learning does not happen overnight. We will discuss the Five Es of machine learning maturity – Educating, Exploring, Engaging, Executing and Expanding. Hear real-world examples of using machine learning to accelerate revenue, identify new customers and introduce new products based on machine learning capabilities.
Next DSS MIA Event - https://datascience.salon/miami/
Data Science Salon: Digital Transformation: The Data Science Catalyst Formulatedby
Carnival is the world's largest travel leisure company, with a combined fleet of over 100 vessels across 10 cruise line brands and growing. We analyze social channels (Facebook, Twitter, Instagram), web analytics and booking data to predict customer behavior and develop marketing strategies. This session will discuss the challenges of mining all of this data and some of the machine learning techniques we use to segment our customers (e.g. clustering) and predict the value of a customer (e.g. regression).
Next DSS MIA Event - https://datascience.salon/miami/
Presented by MANCHON (KEVIN) U, Senior Director, Head of Marketing Analytics & Data Science at Carnival Cruise Line, and MARC FRIDSON, former Principal Data Scientist at Carnival Cruise Line.
Data Science Salon: Quit Wasting Time – Case Studies in Production Machine Le... Formulatedby
Presented by Yashas Vaidya, Sr Data Scientist at Dataiku
Next DSS MIA Event - https://datascience.salon/miami/
The steps to taking a machine learning model to production. Modern architectures and technologies for building production machine learning. An overview of the talent and processes for creating and maintaining production machine learning.
Ironside's VP of Strategy & Innovation, Greg Bonnette, delivered a presentation on "How to Build a Winning Strategy for Data & Analytics" to provide a framework for data-driven decision making.
The Data Driven Enterprise - Roadmap to Big Data & Analytics Success BigInsights
The Data Driven Enterprise - Roadmap to Big Data & Analytics Success
Presentation used at the series of breakfast seminars around Australia hosted by Lenovo/Intel/SAP/EY
Data Science Salon: Building a Data Science Culture Formulatedby
Catalina is a Data Scientist with a specialty in building out scalable data solutions for startups.
Next DSS MIA Event - https://datascience.salon/miami/
There's huge hype around the power of data science across industries. However, not all companies have been able to successfully build out their data science capabilities, and some are just starting to think about doing so. Just as each business is unique, each data science endeavor is unique. In this talk, we explore both the non-negotiables in building a data science culture and how to tailor your data science initiatives to match your business needs at different stages of your journey towards reaping the benefits of a data science culture.
Your smarter data analytics strategy - Social Media Strategies Summit (SMSS) ... Clark Boyd
The volume and velocity of available data bring with them a huge number of new opportunities for marketers. However, without the analytics know-how to make use of this data, these opportunities are often missed. Moreover, the variety of data sources and analytics platforms only adds to the complexity.
This presentation covers:
- How to define and communicate an analytics framework
- How to set up analytics dashboards for a range of stakeholders
- The people and skills you need for an optimal analytics team
- Practical tips for improving your campaign measurement
"Planning Your Analytics Implementation" by Bachtiar Rifai (Kofera Technology)Tech in Asia ID
Bachtiar is a tech startup and science enthusiast with more than 7 years of experience in digital marketing, ecommerce, analytics and product development. He has spent his career as a marketing leader at top ecommerce companies such as Lazada and Blanja.com. Bachtiar is currently developing a startup called Kofera, a technology company that provides a Software as a Service (SaaS) marketing automation platform powered by Artificial Intelligence (AI) and machine learning. Established in 2016, Kofera helps companies build and optimize PPC campaigns using machine learning algorithms to maximize business ROI. Kofera has helped many clients from various industries. Recently, Kofera received pre-series A funding led by MDI Ventures and followed by Indosterling, DNC and Gunung Sewu.
***
This slide deck was shared at Tech in Asia Product Development Conference 2017 (PDC'17) on 9-10 August 2017.
Get more insightful updates from TIA by subscribing at techin.asia/updateselalu
Expert data analytics proves highly transformative when applied in the context of corporate business strategies.
This webinar covers various approaches and strategies that will give you a detailed insight into planning and executing your Data Analytics projects.
Introduction to Machine Learning with Azure & Databricks CCG
Join CCG and Microsoft for a hands-on demonstration of Azure’s machine learning capabilities. During the workshop, we will:
- Hold a Machine Learning 101 session to explain what machine learning is and how it fits in the analytics landscape
- Demonstrate Azure Databricks’ capabilities for building custom machine learning models
- Take a tour of Azure Machine Learning’s capabilities for MLOps, Automated Machine Learning, and code-free Machine Learning
By the end of the workshop, you’ll have the tools you need to begin your own journey to AI.
Tips for Improving Google Shopping Campaign results Elizabeth Clark
Presenter slides from the first Shopping Gurus group meeting at Deloitte in Manchester, attended by retailers, digital agencies and ad tech companies. Presentations covered the Google myths to be wary of, a rapid-growth retailer case study, the agency role in managing feeds, and Manchester's opportunity to play a key role in setting shopping best practice.
An interview with Jack Levis, Director of Process Management at UPS.
"Our challenge hasn’t been around identifying analytics talent as much as it has been in determining the best way to train the hundreds of business people who are using these tools."
New Relic Telemetry for Digital Marketing Heiko Specht
Use existing data to build a strong real-time digital marketing analytics engine, including alerting, custom dashboards and custom data enrichment. Here is how it works.
Data Integration and Marketing Attribution ROIVENUE™
Microsoft and ROIVENUE™ have teamed up to provide a glimpse into the benefits of integrating all your marketing data. Learn about the latest advancements in data management powered by Azure and how ROIVENUE™ helps marketers identify where to best allocate their digital spend with our Marketing Attribution models and Budget Optimizer™.
Shape.io offers a cross-channel PPC budget management software suite. Monitor your PPC spend and performance across 8 ad platforms, automate time-consuming, low-value tasks, and eliminate budget pacing issues and overspends for good. Shape also now offers an Advertising Data Infrastructure product. Shape's ADI combines a two-way API that normalizes data across all major PPC ad networks, a warehouse to house all your data, and public connectors to your favorite software tools.
Understanding the Ins and Outs of Programmatic for PPC Hanapin Marketing
In this presentation, Founder and CEO of MightyHive Pete Kim will join Hanapin Account Manager Bryan Gaynor to tell you all about the importance of implementing programmatic advertising into your PPC strategy. From day-to-day operations to explaining its importance to your CMO, we’ll give you expert insights on the world of programmatic.
Retargeting is a tactic of online advertising that can help get your brand in front of bounced traffic once they leave your website. If you’re investing in any amount of SEM, SEO, social media or other online initiatives that direct users to your website, then you should be ensuring you’re converting as many prospects as possible.
Changing customer behaviors have dramatically changed the way brands need to measure success. Starting with an analytics framework is essential for measuring business ROI.
Google Publisher Tags (GPT): What You Can Do with Them, Why You Should Use Th... Infinitive
This guide to Google Publisher Tags explains how to get the most out of GPT's capabilities for your ad business. With GPT, you can optimize your product catalog, sell your audience across platforms and support viewability and vCPM. Learn how to strengthen ad sales and operations to enhance revenue and performance.
Ready to implement Google Publisher Tags? Learn how Infinitive can help you get it right at http://www.infinitive.com/gpt
Helping brands to foster deeper customer relationships mParticle
A brief introduction to the mParticle Customer Data Platform. In 5 mins learn how mParticle's API-powered consumer data platform is used by customer-centric organizations to fuel amazing Customer Experiences and improve Customer Lifetime Value.
Alex Kesaris
akesaris@mparticle.com
+447400999957
Similar to Data Science Salon: Enabling self-service predictive analytics at Bidtellect (20)
Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil... Formulatedby
Pilosa, as a technology, changes the dialog around large data sets, both static and in motion. Historically data lakes like Hadoop have been used to store massive amounts of data. However, it is estimated that only 20% of that data is practically analyzable because complex analytical operations on an ad-hoc basis become computationally painful and slow.
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
Enter a distributed binary index: Pilosa. While this can be used to unlock and join massive datasets and streams, it can also be thought of as an accelerator for training Machine Learning models and most importantly running your algorithms in large scale production environments. In this workshop Hypergiant will discuss how Pilosa interacts with several ML ideas including the Winnow algorithm, association schemes, and recommendation engines.
Data Science Salon: Are you sure you're an ethical technologist?: Build your ... Formulatedby
This 60-minute workshop reveals our hidden attribution biases and equips attendees with an ethical imagination to build better technologies, companies, and communities.
As an Ethics for Data Science instructor at NYU, I was struck by the disjunction between students’ excellent ability to find ethical gaps in other people’s projects and the blind spots they exhibited when critiquing their own work.
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
In this interactive workshop, we will identify how to reduce our own good intention biases.
Data Science Salon: In your own words: computing customer similarity from tex... Formulatedby
In order to attract customers, companies aim to provide an experience that is personalized to their audience. Typically this involves assigning customers to certain “segments” based on their characteristics and creating a personalized experience for each segment. Often the most widely available data sources for customers tend to be unstructured text (e.g. social media profiles, websites). Since reading each document is not a scalable approach, practitioners use a set of techniques for extracting information from text loosely referred to as Natural Language Processing (NLP).
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
In this workshop, we will walk through an NLP use-case computing similarity from business website text in Python. We will explore the use of the spaCy and scikit-learn libraries to preprocess and tokenize page text. We will then walk through methods for turning tokenized text into page-level numeric vectors (i.e. TF-IDF weighting and word vectors). Using these vectors, we will make use of several matrix factorization techniques to distill the vectors into a smaller set of features. These steps will familiarize participants with some common steps in NLP pipelines.
After each of the above steps, we will explore how the information extracted can be used for customer segmentation based on a similarity measure. Specifically, we will focus on how each stage affects the similarity between pages. Through this exploration, participants will gain an understanding of how different processing may be appropriate for different NLP use-cases. For example, fitting topic models to document vectors is most useful when there is likely to be a distinct set of topics among the document set.
The workshop will conclude with a discussion about how these techniques are currently used in production at ThriveHive. This will provide participants with an example of how they might be able to make use of what we explored in their own work. Any additional time will be devoted to discussing advanced techniques in NLP such as text autoencoders for computing context-sensitive similarity between documents.
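The TF-IDF weighting and page-to-page similarity steps named in this abstract can be sketched roughly as follows. This is a minimal standard-library illustration, not the workshop's own code: the workshop uses spaCy and scikit-learn, while the regex tokenizer, the smoothed IDF formula, the function names and the three example page texts here are all assumptions made for the sake of a self-contained example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs; a crude stand-in for spaCy tokenization.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse {term: weight} vector per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency (an assumed variant), so terms that
    # appear in every document still keep a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine_similarity(u, v):
    # Cosine of the angle between two sparse vectors stored as dicts.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical business website snippets, invented for illustration.
pages = [
    "Family dental clinic offering teeth cleaning and whitening",
    "Dental office for teeth whitening and family checkups",
    "Plumbing repair and drain cleaning service",
]
vectors = tfidf_vectors(pages)
print("dental vs dental:  ", cosine_similarity(vectors[0], vectors[1]))
print("dental vs plumbing:", cosine_similarity(vectors[0], vectors[2]))
```

The two dental pages share more weighted terms than the dental and plumbing pages, so their similarity score comes out higher; a segmentation step could then group pages whose pairwise similarity exceeds some chosen threshold.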
Data Science Salon: Interpretable Predictive Models in the Healthcare Domain Formulatedby
Predictive models are often used to identify individuals who will likely have escalating health severity in the future and to deliver appropriate interventions accordingly. However, for clinicians and care managers, these predictive models often act as a black box at an individual level. The reason is that predictive models typically use combinations of complicated algorithms, which makes it hard to explain the reason behind a model's score at an individual level. This talk will focus on model- and feature-agnostic methodologies and techniques that help uncover the drivers behind a prediction at a personal level in a healthcare setting.
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
Data Science Salon: Applications of Embeddings and Deep Learning at Groupon Formulatedby
Bojan Babick, Senior Software Engineer at Groupon, talks about how the Groupon technical team went on a journey from rule-based systems, to classical machine learning models with hand-designed features, to representation learning and deep learning.
See the related post here: https://roundtable.datascience.salon/applications-of-embeddings-and-deep-learning-at-groupon
Sign up for DSSInsider to see the full video: https://insider.datascience.salon/
Next DSS SEA Event - https://datascience.salon/seattle/
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi... Formulatedby
Presented by Hila Lamm, Chief Strategy Officer at Firefly.ai
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
With all the hype around auto machine learning for computer vision, businesses with structured data are left wondering: Is AutoML relevant for enterprise data? Can it alleviate the bottleneck that data science teams are experiencing?
Our team was experimenting with different types of enterprise challenges -- from optimizing pricing to credit card fraud detection to retail banking customer behavior -- and was able to automatically build models that produced top-ranking Kaggle results within a few hours. In this session, through customer use cases and under the hood insights, you will learn about the capabilities of AutoML as applied on Firefly. Oh, and we’ll also talk about how we attained a Kaggle 1st place score in just half an hour.
Presented by SK Reddy, Chief Product Officer AI at Hexagon
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
Detecting indoor human activity is used for purposes such as security, patient care and baby monitoring. As an alternative to having another human being provide the service (e.g. a security guard, a nurse or the baby's mother), many solutions have been suggested that use image-processing neural networks to detect events such as a patient's fall, a baby walking or a door opening. Many of these models have achieved high prediction accuracy, but neural networks that use video cameras bring up privacy concerns.
Custom-made sensors solve the problem but are expensive. Researchers have proposed deep learning (DL) models that use wifi signals to detect human activity. This is relatively recent research.
I would like to discuss how to design a DL model that detects human activity using wifi signals available from off-the-shelf wifi routers. I will also discuss the architecture of such models, share the implementation problems and evaluate solutions that may address these problems.
Data Science Salon: Building a Data Driven Product Mindset Formulatedby
Presented by Viswanath Puttagunta, Chief Technology Officer of Divergence.AI
Next DSS MIA Event - https://datascience.salon/miami/
Next DSS AUS Event - https://datascience.salon/austin/
OK, your analysts and ETL developers pushed the limits of Tableau and traditional data warehouses. Maybe recently you've even leveraged some cloud services. You have a good idea of the key metrics that are critical to your organization. But you definitely know there's a lot more data lurking within your organization that could be monetized. You look around and are overwhelmed with choices. You want a standard set of tools, but the tools are evolving at a dizzying pace, giving you a classic case of analysis paralysis. Even when you pick a tool, getting licenses and on-boarding seems to be taking forever!
Come and learn how to build a "Data Driven Product Mindset" within your organization. We will discuss how to build a cross-functional team that has the right basics in AI/ML/Software/DevOps/Admin and can evolve as fast as the evolving tool-set, while providing actionable insights every step of the way. We will delve into the benefits of using managed and serverless services in the cloud to make your team nimbler than ever. That team you build will be the most formidable tool your organization will have.
Data Science Salon: Data visualization and Analysis in the Florida Panthers H... Formulatedby
We will give an overview of how data visualization and data analysis are used within the Florida Panthers organization, around the National Hockey League, and in the sports industry in general, in a variety of different contexts. We discuss how analytics can be used to assist an NHL team’s front office, coaching staff, and scouting department. We also discuss the kinds of data we encounter on the business side of the organization in departments like sales and marketing, as well as the kinds of questions the league offices try to answer with the help of data.
Next DSS MIA Event - https://datascience.salon/miami/
Data Science Salon: Machine Learning for Personalized Cancer Vaccines Formulatedby
Presented by Alex Rubinsteyn
Next DSS MIA Event - https://datascience.salon/miami/
A short introduction to cancer immunotherapy followed by several machine learning problems which arise from designing personalized cancer vaccines.
Data Science Salon: MCL Clustering of Sparse Graphs Formulatedby
The increasing need for clustering in several scientific domains has inevitably driven the creation of innovative algorithms, each designed to perform more efficiently in certain applications. More specifically, in many applications the data entities involved can be portrayed effectively by a graph as a collection of nodes and edges. One of the most established algorithms for graph clustering problems is the Markov Cluster Algorithm (MCL).
Next DSS MIA Event - https://datascience.salon/miami/
When dealing with large and complex datasets, the underlying graphs can easily reach proportions that independent computing systems are inadequate to deal with. Additionally, the graphs encountered are typically sparse: the number of edges is far smaller than might be possible in a fully-connected graph. Consequently, there is a concrete need for algorithms that are designed to handle sparse graph clustering utilizing distributed computing resources.
Our motivation was the development of a distributed architecture, able to accommodate large and sparse graphs, to implement the MCL and R-MCL algorithms. The Apache Spark framework was chosen due to its ability to utilize distributed resources and its proven track record. Although Spark is a framework capable of handling massive datasets, it currently does not provide rich support for computation with sparse matrices and sparse graphs. Hence, methods have been implemented to enable the exploitation of sparse adjacency matrices in distributed sparse matrix multiplication, a critical component of MCL. The proposed solution can handle arbitrarily large inputs, provide almost linear speed-up with the addition of computational resources, and output results directly comparable to the non-distributed reference MCL implementation.
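For readers unfamiliar with MCL, the core loop that this abstract refers to alternates expansion (matrix multiplication, the step the talk distributes) with inflation (element-wise powering followed by column normalization). The sketch below is a small, dense, single-machine illustration only, not the speakers' Spark implementation and not R-MCL; the function names, the default inflation value of 2.0, the self-loop convention and the six-node toy graph are assumptions chosen for the example.

```python
def normalize_columns(matrix):
    # Rescale every column to sum to 1 (column-stochastic matrix).
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def expand(matrix):
    # Expansion: square the matrix, i.e. take two-step random walks.
    # This dense product is the part a sparse, distributed version would replace.
    n = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def inflate(matrix, power):
    # Inflation: raise entries to a power, then renormalize columns,
    # which strengthens strong flows and weakens weak ones.
    return normalize_columns([[x ** power for x in row] for row in matrix])


def markov_cluster(adjacency, inflation=2.0, max_iterations=100, tolerance=1e-9):
    n = len(adjacency)
    # Add self-loops (a common MCL convention) and make the matrix column-stochastic.
    matrix = normalize_columns(
        [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iterations):
        updated = inflate(expand(matrix), inflation)
        change = max(abs(updated[i][j] - matrix[i][j])
                     for i in range(n) for j in range(n))
        matrix = updated
        if change < tolerance:
            break
    # Each row that still carries flow is an attractor; the columns it
    # attracts form one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if matrix[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adjacency[a][b] = adjacency[b][a] = 1

print(markov_cluster(adjacency))
```

On this toy graph the flow across the single bridging edge dies out under repeated inflation, so the two triangles end up as separate clusters. A sparse, distributed version would replace the dense `expand` step with sparse matrix multiplication, which is the component the abstract identifies as critical.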
Data Science Salon: Applying Machine Learning to Modernize Business Processes Formulatedby
Next DSS MIA Event - https://datascience.salon/miami/
For most data scientists, building models is hard work, but deploying them into production and impacting business processes can be even harder. In fact, research shows that only about 10% of data science models get deployed into production, and those that do can take between 6 and 9 months to be deployed. This session will highlight the challenges that data scientists and organizations alike face when trying to deploy machine learning models and how to overcome these challenges. It will examine several use cases where models built in R and Python have been able to deliver impactful results across several industries.
Data Science Salon: Deep Learning as a Product @ Scribd Formulatedby
Presented by Kevin Perko, Head of Data Science at Scribd
Next DSS NYC Event 👉 https://datascience.salon/newyork/
Next DSS LA Event 👉 https://datascience.salon/la/
Kevin will cover his experience using deep learning, going from scratch to deploying models in production to improve the product experience. He goes in-depth in terms of how we started deep learning from scratch, including navigating the maze of frameworks and hyper-parameters to optimize. Kevin will discuss pitfalls of using other people's algorithms and make a call for more rigor in publishing data science blog posts. Kevin closes with how his failure turned into an open source contribution and the work in moving from dev to production.
Data Science Salon: A Journey of Deploying a Data Science Engine to ProductionFormulatedby
Presented by Mostafa Madjipour, Senior Data Scientist at Time Inc.
Next DSS NYC Event 👉 https://datascience.salon/newyork/
Next DSS LA Event 👉 https://datascience.salon/la/
Reducing the gap between R&D and production is still a challenge for data science/machine learning engineering groups in many companies. Typically, data scientists develop the data-driven models in a research-oriented programming environment (such as R and Python). Next, the data/machine learning engineers rewrite the code (typically in another programming language) in a way that is easy to integrate with production services.
This process has some disadvantages: 1) It is time consuming; 2) slows the impact of data science team on business; 3) code rewriting is prone to errors.
A possible solution to overcome the aforementioned disadvantages would be to implement a deployment strategy that easily embeds/transforms the model created by data scientists. Packages such as jPMML, MLeap, PFA, and PMML among others are developed for this purpose.
In this talk we review some of the mentioned packages, motivated by a project at Time Inc. The project involves development of a near real-time recommender system, which includes a predictor engine, paired with a set of business rules.
Data Science Salon: Culture, Data Engineering and Hamburger Stands: Thoughts ...Formulatedby
Presented by Becky Tucker, Data Scientist at Netflix
Next DSS NYC Event 👉 https://datascience.salon/newyork/
Next DSS LA Event 👉 https://datascience.salon/la/
Becky Tucker is speaking about how Netflix culture uniquely interacts with data science, the importance of data engineering to our data science teams, how their teams are structured to do data science "at scale," and what "data science at scale" looks like for her.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. For more details, visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Data Science Salon: Enabling self-service predictive analytics at Bidtellect
1.
2. Founded in 2011, a Delray Beach, Florida company that helps other businesses develop and tell
their optimal story, share it with as many people as possible and maximize sales through enabling
technologies in Content Distribution and Native Advertising.
Founded By Industry Veterans from AdTech and Media
Built from the ground up for Content Distribution and Native Advertising
Think of us as the Facebook of the Open Web !!!
Bidtellect Background
3. Tasked with transforming Bidtellect from a Managed service
business with limited focus on data to a data-focused self-service
SaaS-based content distribution platform - enabling the company
to grow 10-fold
The “Mike” Goal
4.
5. Advertisers need to quantify, in real time, which ads are converting and which are not—and they must
constantly revise ads, delivery, and placement among surrounding content on the fly to improve results.
Big Data analytics and insights to empower people and machines to instantaneously make optimal
advertising decisions, helping brands achieve tangible results in their businesses.
Continual behavioral analysis of user behavior with learnings supporting the creation of behavioral
audience segmentation and retargeting pools for scale.
Self-service forecasting to support optimal campaign trafficking and near-real time adjustments to
campaign configuration optimizing performance and KPI objectives.
Understanding the Need
6. Challenge(s)
Technology
● Complexity
● Fragmentation
● Scale
● Fraud
● Brand Safety
● Transparency
● Viewability
Creative
● Personalization
● Creative Optimization
● Content Creation
● Resources
● Catering Content
Distribution
● Targeting
● Campaign Measurement
● Content Measurement
● Insights
● Optimization
Content May Be King, But Context is Queen &
She Holds the Purse Strings
7. ● Increased number of supply partners
● Tremendous growth in daily data throughput
● Poor Data Strategy
● Poor Data Infrastructure
Challenge(s)
5BN Transactions per Day | 1MM Decisions per Second | 1BN Possible Permutations
8BN Transactions per Day | 1.5MM Decisions per Second | 1.5BN Possible Permutations
8.
9. ● Put Data at the Center
○ Flipped roadmap on its head to focus on data, reach and optimization
● Fix the infrastructure
○ Replaced existing analytics and reporting infrastructure to mitigate current issues and support
projected growth
● Optimization
○ Change optimization approach to allow for multiple KPI
○ SaaS enablement
● Audience Targeting
○ Creation of internal Data Management Platform
■ Audience creation, management
■ Cross-device delivery
● Engagement Score 2.0
● Forecasting
○ SaaS enablement
10.
11.
12. Infrastructure
Situation
● nDSP Performance Issues within production, even for simple non-analytics tasks
○ Saving campaign changes
○ Logging in
○ Pulling reports
● Reporting data in multiple data stores
● Perception of instability
● Operational productivity hindered
13. Infrastructure
Root Cause
● Reporting data was in same database as nDSP production data (SQL & Greenplum)
● Generation, querying, and updating of these large data sets:
○ Blocking non-reporting/analytics queries
○ Ripple effect impacting the ability for platform to bid at optimal levels
■ Scheduled processes required by the platform timed out when they needed database access
● Performance and scalability of Greenplum was not sufficient to support our growth
● Performance would only degrade as we add users to the system.
14. Infrastructure
Solution
Replaced existing analytics and reporting infrastructure
● Conducted PoC and runoff against multiple vendors
● Selected Vertica
○ Estimated 2 months upon hardware installation to migrate all current analytics jobs and reporting to
Vertica.
■ Took less than a month
○ Met our query SLA times
■ Significantly reduced query times compared to SQL or Greenplum: < 4ms on average vs 30s+
■ Provided ability to tune to achieve query result times under multiple query loads/multi
tenancy
○ Reduced the complexity of the data architecture
○ Enabled us to deliver hourly rather than daily granularity
15. Vertica
Vertica essentials supporting our mission
Focus | Real-Time Requirements
Storage and Data Management
● Column store enables more efficient sorting, compression and organization of data
● Ability to apply different compression and encoding algorithms across the data set
Queries
● No indexes; data is self-indexed by sorting and encoding
Speed & Performance
● 50-1,000 times faster than legacy Greenplum instance
● Ability to process queries in parallel over multiple processors
● Linear scalability and high availability
Built-in analytics
● Supports functions including pattern matching, approximate count distinct and approximate count distinct synopsis
● Efficient design requires less coding
16.
17. Bidtellect Optimization
● Our optimization technology, Intellibid™ leverages big data, predictive models and machine
learning to make the smartest buying decisions in real-time for Native Advertising and Content
Distribution campaigns.
● Our campaigns are getting smarter even as we speak!
● Ability to optimize against a single or multiple campaign KPIs
18. Optimization Breadth
Leverages Big Data and machine learning to make smarter buying decisions in real-time for all Native
Advertising campaigns.
Algorithms make over 1MM decisions per second across the following parameters:
● Creative
● Creative Rendering
● Device
● Placement
● Recency
● Dynamic Bid Price
● Goal Type Target
● Ad Format Type
● User Frequency
● User Characteristics
19. Optimization
Objectives
Allows Advertisers to optimize their Native Advertising and Content Distribution campaigns against a
variety of objectives, including:
● CTR
● Conversions
● View Rate
● Bounce Rate
● Play Rate (video)
● Page Views per Visit
● Average Time on Site
● Engagement Score ** (Proprietary measurement and algorithm tool for post-click engagement
metrics.)
20. SaaS Enablement
Developed and enabled agency self-service SaaS solution through Bidtellect nDSP allowing users to
fully control adjustments to optimization goals supporting their KPIs
Objectives
Measurement Goal
21. 30-days from impressions,
15-days from matched-conversions,
1-day from rollups/cache/optimization/…/672
Simple Approach -
(15 Day Easy CVR)
Fitted Functions in R
22. Simple Approach -
(15 Day Easy CVR)
Fitted Functions in R
15 Day attribution window for $30
click-eCPA cutoff
Click-eCPA = revenue/conversions
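The click-eCPA rule on this slide can be sketched as follows. The 15-day window, the $30 cutoff and the formula come from the slide; the record layout (user id plus timestamp), the function names and the reading that a value at or below the cutoff "passes" are assumptions made for illustration.

```python
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=15)   # from the slide
CLICK_ECPA_CUTOFF = 30.0                  # dollars, from the slide

def attributed_conversions(clicks, conversions, window=ATTRIBUTION_WINDOW):
    """Count conversions that follow a click by the same user within `window`.

    `clicks` and `conversions` are lists of (user_id, timestamp) pairs
    (a hypothetical layout, not Bidtellect's schema).
    """
    clicks_by_user = {}
    for user, ts in clicks:
        clicks_by_user.setdefault(user, []).append(ts)
    count = 0
    for user, ts in conversions:
        if any(timedelta(0) <= ts - c <= window
               for c in clicks_by_user.get(user, [])):
            count += 1
    return count

def click_ecpa(revenue, conversions):
    """Click-eCPA = revenue / conversions; None when nothing converted."""
    return revenue / conversions if conversions else None

def passes_cutoff(revenue, conversions, cutoff=CLICK_ECPA_CUTOFF):
    """Assumed interpretation: a click-eCPA at or below the cutoff passes."""
    ecpa = click_ecpa(revenue, conversions)
    return ecpa is not None and ecpa <= cutoff
```

A placement with $60 of revenue and two attributed conversions sits exactly at the $30 cutoff; one with no conversions has no defined click-eCPA and is treated here as not passing.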
23. Audience Targeting
Distributes content to a specific target audience through a sophisticated audience targeting algorithm and
behavioral measurement.
Target across devices and products against multiple parameters: Contextual, Behavioral, Demographic
and Location
01 | PRODUCT TYPE
02 | DEVICE TYPE
03 | LOCATION/ ZIP CODE
04 | OPERATING SYSTEM
05 | AGE & GENDER
06 | DEMOGRAPHICS
07 | LANGUAGE & CURRENCY
08 | DAY PARTING
09 | FREQUENCY CAPPING
10 | BEHAVIORAL / RETARGETING
11 | FIRST / THIRD PARTY
12 | CATEGORY AND KEYWORD CONTEXTUAL
13 | KPI SLIDER
14 | BROWSER
15 | SUPPLY SOURCE
16 | SUPPLY TIER
17 | IAS VERIFIED FRAUD PROTECTION
18 | CONTEXTUALLY BRAND SAFETY
19 | CARRIER TARGETING
24.
25. Engagement Score is the ultimate measure of content marketing success, designed to capture and
analyze post-click consumer activity.
● Sessions
● Pageviews (content consumed)
● Bounce Rate
● Time on Site
Measured from the behavior of users once they land on the advertiser’s landing page.
The advanced formula is a linear combination of three logistic functions.
Behavioral Analytics
(Descriptive/Predictive)
26. Page Views Per Visit
We wish to construct a function so that when page
views per visit is 1, the score is low and when it reaches
2 the score is much higher. This behavior is
encapsulated in Equation
The contribution of page views per visit to
engagement score. It reaches a maximum as page
views per visit reaches > 3.
Behavioral Analytics
(Descriptive/Predictive)
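The equation referenced on this slide did not survive in the transcript. Since the deck describes the score as built from logistic functions, one plausible shape for the page-views contribution is a logistic curve. The sketch below is only that: the function name, the midpoint of 1.5 and the steepness of 4 are assumptions chosen to match the described behavior (low at 1, much higher at 2, saturating above 3), not Bidtellect's actual parameters.

```python
import math

def page_view_contribution(page_views_per_visit, midpoint=1.5, steepness=4.0):
    """Logistic curve in (0, 1): low near 1 page view, near its maximum above 3.

    `midpoint` and `steepness` are illustrative values, not taken from the deck.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (page_views_per_visit - midpoint)))
```

With these illustrative parameters the contribution is roughly 0.12 at one page view per visit and roughly 0.88 at two, and it is effectively flat beyond three.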
27. The engagement score is a number ranging from 0 to 10 where 0 corresponds to the least customer
engagement and 10 represents the highest level of engagement. It is derived from four variables
measured from the behavior of users once they land on the advertiser’s site.
Advertisers can optimize their campaigns toward Engagement Score as a whole or the individual
post-click metrics.
Behavioral Analytics
(Descriptive/Predictive)
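To make the 0-to-10 scale concrete, here is a hedged sketch of a score built as a weighted sum of logistic functions, as the deck describes. The deck does not give the weights, the curve parameters or which post-click metrics feed which function, so every number and metric name below is a hypothetical placeholder.

```python
import math

def logistic(x, midpoint, steepness):
    """Standard logistic curve with values in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

# metric name -> (weight, midpoint, steepness). All values are assumptions.
# Weights sum to 1 so the combined score stays between 0 and 10.
PARAMS = {
    "page_views_per_visit": (0.4, 1.5, 4.0),
    "time_on_site_seconds": (0.4, 30.0, 0.1),
    "bounce_rate": (0.2, 0.5, -10.0),  # negative steepness: lower bounce scores higher
}

def engagement_score(metrics):
    """Linear combination of logistic functions, scaled to the 0-10 range."""
    total = sum(weight * logistic(metrics[name], midpoint, steepness)
                for name, (weight, midpoint, steepness) in PARAMS.items())
    return 10.0 * total
```

Because each logistic term lies between 0 and 1 and the weights sum to 1, the result cannot leave the 0-to-10 range, which matches the scale the slide describes.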
28.
29. We are in the process of implementing self-service forecasting to support optimal campaign trafficking
and near-real time adjustments to campaign configuration, optimizing performance and KPI objectives.
A forecasting tool where a user can enter all or most of the targeting, allowability, and capability options
and the forecasting tool produces a bid price vs (impressions, clicks, spend, viewable impressions,
plays, completes) graph.
It is critical to understand and plan a campaign for success before it starts. We often have campaigns
that begin and then we realize we don't have the right price/inventory to support the success of the
campaign.
The Forecasting tool allows our users to understand our landscape and how making changes to a
campaign affects scale, and will provide them with multiple alternative solutions for driving their
campaigns.
Forecasting Analytics
(Prescriptive)
30. Create graphs using fit functions to cut down on real data noise
● Derive each metric from the win rate combined with the rate that metric requires,
applied against total available auctions
a. impressions (just win rate)
b. clicks (CTR and win rate)
c. viewable imps (viewability rate and win rate)
d. spend (bid price * (impressions, clicks, plays or completes,
depending on the selected bid type) and win rate)
e. plays (play rate and win rate)
f. completes (completion rate and win rate)
● Filter feasible targeting, allowability and others
● Campaign product type
Forecasting Analytics
(Prescriptive)
Bid Price | Variable
CPM | Impressions, Viewable Impressions, Clicks, Spend $, Plays (video), Completes (video)
CPC | Clicks, Spend $
vCPM | Viewable Imps, Spend $
CPP | Plays (video), Spend $
CPCV | Completes, Spend $
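The derivations listed on this slide can be sketched as a small function: each metric is the total available auctions multiplied by the win rate and by the rate that metric needs. The function and parameter names are hypothetical, all rates are assumed to be expressed per won impression (following the slide's wording), and pricing CPM and vCPM per thousand is the usual industry convention rather than something the deck states.

```python
# Billable metric and pricing unit per bid type (per-thousand for CPM/vCPM is an assumption).
BILLING = {
    "CPM": ("impressions", 1000.0),
    "CPC": ("clicks", 1.0),
    "vCPM": ("viewable_impressions", 1000.0),
    "CPP": ("plays", 1.0),
    "CPCV": ("completes", 1.0),
}

def forecast(auctions, win_rate, bid_type, bid_price, ctr=0.0,
             viewability_rate=0.0, play_rate=0.0, completion_rate=0.0):
    """Forecast campaign metrics for one bid price."""
    impressions = auctions * win_rate
    result = {
        "impressions": impressions,
        "clicks": impressions * ctr,
        "viewable_impressions": impressions * viewability_rate,
        "plays": impressions * play_rate,
        "completes": impressions * completion_rate,
    }
    metric, per = BILLING[bid_type]
    result["spend"] = bid_price * result[metric] / per
    return result

def forecast_curve(bid_prices, win_rate_at, auctions, bid_type, **rates):
    """Points for a bid price vs. metrics graph.

    `win_rate_at` is a fitted function mapping bid price to win rate; how it
    is fitted is outside this sketch.
    """
    return [(price, forecast(auctions, win_rate_at(price), bid_type, price, **rates))
            for price in bid_prices]
```

Passing a smooth fitted `win_rate_at` instead of raw observed win rates is what keeps the resulting graph free of day-to-day noise.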
31.
32. Nothing new to anyone in this room
Data is an integral component of Bidtellect’s business, managing our premium native inventory across
our supply ecosystem with over 8 billion native auctions per day entering back into our systems
The ability to analyze and act on data is increasingly important to all businesses.
● Pace of change requires quick reaction to changing demands
● Increasingly more complex decisions required, but with faster action
● Greater and greater amounts of data
Data is The Key
33. ● Start by ensuring that you have the infrastructure in place to support your objectives for scale and
performance
● Start analyzing the data
○ Condense large amounts of data into smaller pieces of information.
■ Use descriptive analytics ("the simplest class of analytics") to summarize what happened
○ Use a variety of statistical, modeling, data mining, and machine learning techniques to study recent and
historical data
■ Allowing staff to make predictions about the future
■ Forecast what “might” happen (probabilistic)
■ Should provide a sentiment score (positive or negative, between -1 and +1)
● The emerging technology of prescriptive analytics goes beyond descriptive and predictive models
by recommending one or more courses of action -- and showing the likely outcome of each
decision.
How to Get There?
34. ● Provide greater insight and offer alternatives based on data learnings
○ Like predictive analytics, prescriptive analytics provides the ability to prescribe a possible
action
○ A prescriptive model is able to predict the possible consequences of different choices of action
○ Should also recommend the best course of action for any pre-specified outcome
Where to Start?