Metail lets users see clothes on their own body shape online, requiring only minimal measurements. With your avatar you can create outfits, and coupled with our size advice this gives you confidence in the size and fit.
I'm part of the team within Metail that built a pipeline to collect, enrich, and serve data to the company and our clients, a pipeline which has also been used to validate Metail's product. This talk was given at the AWS Loft in London on 21st April 2016, where I gave an overview of the end-to-end pipeline and then went into detail on how we use AWS EMR to batch-process the collected data, which is then served internally with Redshift.
Metail uses Snowplow to collect customer data and Cascalog to process that data into normalized batch views for analysis. Cascalog transforms the raw Snowplow event stream into structured tables for things like customer body shape, orders, items ordered, returns, and browsers. This makes the data more manageable and makes complex analysis and aggregation easier. For example, Cascalog is used to calculate key performance indicators by grouping customer data and summing metrics from the batch views. The output is then analyzed further in R. Looker will also allow business analysts to access and explore the batch views and raw Snowplow event data.
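The batch-view idea can be illustrated with a minimal sketch in Python (a hypothetical example, not Metail's actual Cascalog code): group raw events by customer into a normalized per-customer table, then compute a KPI by summing metrics from that view.

```python
from collections import defaultdict

# Hypothetical raw event stream; real Snowplow events carry far more fields.
events = [
    {"customer": "c1", "type": "order", "value": 40.0},
    {"customer": "c1", "type": "return", "value": 15.0},
    {"customer": "c2", "type": "order", "value": 25.0},
    {"customer": "c2", "type": "order", "value": 35.0},
]

def batch_view(events):
    """Normalize the event stream into one row per customer."""
    view = defaultdict(lambda: {"orders": 0.0, "returns": 0.0})
    for e in events:
        if e["type"] == "order":
            view[e["customer"]]["orders"] += e["value"]
        elif e["type"] == "return":
            view[e["customer"]]["returns"] += e["value"]
    return dict(view)

def kpi_net_revenue(view):
    """Example KPI: order value minus return value across all customers."""
    return sum(row["orders"] - row["returns"] for row in view.values())

view = batch_view(events)
print(kpi_net_revenue(view))  # 85.0
```

The point of the intermediate view is that many KPIs can be derived from the same small, structured table instead of re-scanning the raw event stream each time.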
Metail provides an online fitting room and body scanning technology that allows customers to visualize how clothes will fit their body. This helps increase sales metrics for retailers like conversion rates, average order value, and reducing returns. Metail's portfolio includes products like MeModel, an online fitting room, and data services that provide insights into customer body shapes and sizes. The technology and personalized shopping experience benefit both customers and retailers.
Use of Analytics to recover from a COVID-19 hit economy - Amit Parija
The document discusses several topics related to business analytics and optimization. It recommends (1) looking at analytics strategies to re-evaluate business strategies and gain insights, (2) reducing CAPEX and increasing OPEX to improve cash flow, and (3) adopting ready-to-use frameworks for use cases like predictive maintenance and customer analytics.
5 Essential Practices of the Data Driven Organization - Vivastream
The document discusses five essential practices of data-driven organizations: 1) defining key performance indicators, 2) deploying analytics tools expertly across channels, 3) analyzing results and making recommendations, 4) creating changes based on data, and 5) measuring results continuously. It emphasizes the importance of standardization, governance, accuracy, and having a repeatable process for using data to optimize digital properties and drive business goals.
This presentation covers how data science concepts connect to building effective machine learning solutions: how to build end-to-end solutions in Azure ML, and how to build, model, and evaluate algorithms in Azure ML.
The document discusses using Apigee Insights to enable personalized experiences through predictive analytics. Insights uses a GRASP technology to analyze sequential customer behavior patterns at scale from big data sources. This allows building predictive models to anticipate customer needs and adapt interactions in real-time across channels. The platform provides segmentation, predictions, and an interaction layer to deliver the right offer to the right customer at the right time.
Designing Outcomes For Usability Nycupa Hurst Final - WIKOLO
MarkoHurst.com :: My topic of discussion at the Feb 17 2009 NYC UPA.
Even as the pace of society, business, and the Internet continue to increase, many budgets and time lines continue to decrease. To compound this issue, there is a serious disconnect between business goals, user goals, and what visitors actually do on your site. UX practitioners need a simple and efficient way to reconcile these diverse needs while taking action on their data. Join us to learn about a new method for incorporating quantitative data such as web analytics and business intelligence into your qualitative user experience deliverables: personas, wireframes, and more. This presentation will include discussions of online business models, feedback loops for ensuring cross-discipline collaboration, and ongoing revisions.
ModCloth uses Tableau to enable stakeholders across the company to access and analyze data independently. By training stakeholders in Tableau, the data team is able to focus on more complex analyses while stakeholders can answer questions with same-day data. Some challenges in training include different skill levels and goals amongst stakeholders. ModCloth addresses this through tailored trainings and office hours. Since implementing stakeholder training, the data team spends less time on routine tasks and more on modeling and products while stakeholders complete over 200 additional requests per quarter in Tableau.
Great Data Delivery: A model-based approach - Zach Taylor
Great data strategies focus on delivery. The presentation will discuss:
- The importance of how data is delivered to driving user adoption of data-driven behavior
- Strategies for creating data driven organizations
- A model-based approach to supporting self-service analytics and ending "data breadlines"
- User experience design for data teams creating a data product for their organizations
This document provides an introduction and overview of a summer school course on business analytics and data science. It begins by introducing the instructor and their qualifications. It then outlines the course schedule and topics to be covered, including introductions to data science, analytics, modeling, Google Analytics, and more. Expectations and support resources are also mentioned. Key concepts from various topics are then defined at a high level, such as the data-information-knowledge hierarchy, data mining, CRISP-DM, machine learning techniques like decision trees and association analysis, and types of models like regression and clustering.
Roger S. Barga discusses his experience in data science and predictive analytics projects across multiple industries. He provides examples of predictive models built for customer segmentation, predictive maintenance, customer targeting, and network intrusion prevention. Barga also outlines a sample predictive analytics project for a real estate client to predict whether they can charge above or below market rates. The presentation emphasizes best practices for building predictive models such as starting small, leveraging third-party tools, and focusing on proxy metrics that drive business outcomes.
This document presents a framework for analyzing consumer clothing preferences using large online shopping datasets. It proposes mining attractive and profitable clothing features from product descriptions and user transactions. The methodology includes pruning noisy images, learning clothing features, and identifying popular and unprofitable features. Experimental results on Taobao and fashion show data reveal classic/attractive styles, popular outfits over time, and unique clothing trends. The framework provides insights into consumer preferences in a fine-grained way.
This document provides an overview of AlgoAnalytics, an analytics consultancy company that uses advanced machine learning techniques. The summary is as follows:
(1) AlgoAnalytics provides predictive analytics solutions for retail, healthcare, financial services, and other industries using techniques like deep learning, natural language processing, and computer vision on structured, text, image and sound data.
(2) The CEO and founder, Aniruddha Pant, has over 20 years of experience applying mathematical techniques to business problems. Some of AlgoAnalytics' work includes recommender systems, demand prediction, image analysis, and customer churn prevention for online retail.
(3) Examples of AlgoAnalytics' predictive models shown include an
Deep.bi - Real-time, Deep Data Analytics Platform For Ecommerce - Deep.BI
This document provides an overview of the deep.bi analytics platform for ecommerce companies. It describes how deep.bi collects detailed ("deep") data on products, customers, and customer behaviors. This deep data is analyzed to provide real-time insights. Deep.bi helps ecommerce teams improve performance in areas like merchandising, marketing, customer service, and site experience. It does this by tracking custom metrics and providing customizable dashboards. Deep.bi can be used as a standalone tool or integrated with other systems through its API.
Even if you have terabytes of business data, it might not be easy to apply AI-based analytics to it. The bottleneck is often Machine Learning (ML) expertise and scalable infrastructure.
We'll first look at how you can access vast amounts of data from the data warehouse directly in a Google Sheet. Then, you'll see how it's possible to train custom ML models with that data, without ever leaving the spreadsheet.
Speaker:
Karl Weinmeister
Google
Cloud AI Advocacy Manager
Creating a Single View: Overview and Analysis - MongoDB
Brian Goodman discusses creating a single view of customers using MongoDB. He explains that a single view allows for fast, rich queries of all customer data in one place. Goodman outlines how to model customer data, including profiles, actions, products purchased, and sentiment analysis. He emphasizes starting simply and iterating as new questions and data sources emerge. A single customer view in MongoDB provides flexibility to ask questions and gain insights that can improve customer experience and business outcomes.
Introduction to Machine Learning - An overview and first step for candidate d... - Lucas Jellema
Our technology has gotten smart and fast enough to make predictions and come up with recommendations in near real time. Machine Learning is the art of deriving models from our Big Data collections – harvesting historic patterns and trends – and applying those models to new data in order to rapidly and adequately respond to that data. This presentation will explain and demonstrate in simple, straightforward terms and using easy to understand practical examples what Machine Learning really is and how it can be useful in our world of applications, integrations and databases. Hadoop and Spark, real time and streaming analytics, Watson and Cloud Datalab, Jupyter Notebooks and Citizen Data Scientists will all make their appearance, as will SQL.
This document discusses key considerations for managing AI products. It begins with an overview of intelligent systems and the OODA loop model of decision making. It then covers the different areas of AI including machine learning, deep learning, and supervised vs unsupervised learning. The rest of the document provides guidance on strategic areas for AI product management such as corporate and data strategy, analyzing use cases, building minimal viable products, and influencing other teams to deliver AI capabilities. It emphasizes the importance of data acquisition, network effects, and focusing on practical applications that create business value.
This document provides an overview of predictive analytics, including its evolution, definition, process, tools and techniques. It discusses how predictive analytics is being used across various industries to optimize outcomes, increase revenue and reduce costs. Specific use cases are outlined, such as using IoT sensor data and predictive models to improve risk calculations for auto insurance, optimize energy usage in buildings, enhance customer recommendations, and optimize policy interventions. Business cases focus on how companies in various sectors leverage customer data and predictive analytics to increase digital marketing effectiveness, revenues, and customer loyalty. Overall, the document examines current and emerging applications of predictive analytics across different domains.
The document discusses how customer experience, big data analytics, and surveys can be used together to gain insights about customers. It provides an overview of big data and data science, how to integrate different customer data sources, and how to use analytics and surveys to measure customer loyalty, experience, and benchmark performance versus competitors. The document also provides examples of key questions to ask in a customer relationship diagnostic survey to understand retention, advocacy, purchasing behavior, experience, and relative performance.
A Practical-ish Introduction to Data Science - Mark West
In this talk I will share insights and knowledge that I have gained from building up a Data Science department from scratch. This talk will be split into three sections:
1. I'll begin by defining what Data Science is, how it is related to Machine Learning and share some tips for introducing Data Science to your organisation.
2. Next up we'll run through some commonly used Machine Learning algorithms used by Data Scientists, along with examples of use cases where these algorithms can be applied.
3. The final third of the talk will be a demonstration of how you can quickly get started with Data Science and Machine Learning using Python and the Open Source scikit-learn Library.
The document summarizes a data science project on bank marketing data using various tools in IBM Watson Studio. The project followed a standard methodology of data exploration, feature engineering, model selection, training and evaluation. Random forest, XGBoost, LightGBM and deep learning models were tested. LightGBM performed best with a 95.1% ROC AUC score from AutoAI hyperparameter tuning. The best model was deployed to IBM Watson Machine Learning for production use. Overall, the project demonstrated the effectiveness of the Watson Studio platform and tools in developing performant models from structured data.
The document provides an overview of growth marketing. It begins with the basics of growth marketing, which is data-driven marketing based on rapid experimentation focused on the AAARRR funnel, with a blending of marketing, product, and engineering. It then outlines the G.R.O.W.S process for growth marketing, which stands for gather, rank, outline, work, and study. The next sections discuss how growth marketing incorporates artificial intelligence and focuses on mobile growth. It concludes with sections on social media and content marketing for growth.
This document provides an overview of data mining, including what it is, the data mining/KDD process, why it is used, and examples of applications. Data mining involves analyzing large datasets to discover hidden patterns and relationships. It is used in business to better understand customers, predict trends, and make decisions. Examples where data mining is applied include fraud detection, credit scoring, customer profiling, and optimizing marketing campaigns. The document also outlines common data mining techniques and how to implement the process to extract useful knowledge from data.
This document provides an overview of becoming a data scientist. It defines a data scientist and lists common job titles. It discusses the functions of a data scientist like devising business strategies, descriptive/predictive analytics, and data mining. Examples are provided of customer churn analysis and market basket analysis. The skills, aptitudes, and educational paths to become a data scientist are also outlined.
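The market basket analysis mentioned above can be sketched in a few lines of Python (an illustrative example with made-up baskets, not taken from the document): count how often each unordered pair of items appears together in the same transaction, which is the raw co-occurrence count behind association rules.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is the items in one basket.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk"},
]

def pair_counts(baskets):
    """Count co-occurrences of every unordered item pair across baskets."""
    counts = Counter()
    for basket in baskets:
        # sorted() makes each pair a canonical (a, b) tuple with a < b
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(baskets)
# ("bread", "butter") occurs together in 2 of 4 baskets, i.e. support 0.5
print(counts[("bread", "butter")])  # 2
```

Dividing a pair's count by the number of baskets gives its support; dividing by the count of one item gives the confidence of the corresponding association rule.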
Neil Perlin is an internationally recognized content consultant who helps clients create effective content across various mediums. The document discusses several predictions for the future of technical communication, including increased use of mobile-friendly responsive design, topic-based authoring, structured authoring using standardized styles, and analytics to track content usage. It also covers trends toward open web standards, cloud-based tools, and smaller chunks of reusable content.
Data Mining and Business Analytics by Seyed Ziae Mousavi Mojab - zmojab
Data mining is a process of discovering patterns in large data sets involving artificial intelligence, machine learning, statistics, and database systems. It can be used to extract valuable knowledge from data sets and predict unknown data by adjusting models. In business, data mining techniques like customer segmentation, behavior prediction, and direct marketing response prediction can be used to increase profits by better understanding customers and targeting the most profitable ones. A typical data mining process includes business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataKiwi Creative
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeWalaa Eldin Moustafa
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Round table discussion of vector databases, unstructured data, ai, big data, real-time, robots and Milvus.
A lively discussion with NJ Gen AI Meetup Lead, Prasad and Procure.FYI's Co-Found
University of New South Wales degree offer diploma Transcript
A Year of Data Science at Metail
1. A Year Of Data Science at Metail
Matt McDonnell - Data Scientist
2. Business Context
Startup: “A group of people operating in an environment of uncertainty
striving for a repeatable and scalable business model”
3. A scalable startup needs a Customer Factory
Figure adapted from ‘Scaling Lean’ by Ash Maurya https://leanstack.com/scaling-lean-book/
4. A look behind the curtain – what’s the data?
See Metail in action:
http://metail.myshopify.com?utm_source=DataInsightsNov2016
(Scary UTM code is there so I don’t have to spend the next week
digging into ‘Who are these mysterious visitors?’)
Live Demo Starts Here!
Sheepish explanation of why it’s not working starts here
5. The road to Data Science
• Understand the data
• Learn the tools
• Build the analytics for business intelligence
• More sophisticated data analysis for deeper understanding
• Apply machine learning techniques
• Develop models for prediction and decision making
6. My experience prior to Metail
Careers
• Physics Postdoc
Oxford, Griffith
• Technical Consultant
MathWorks
• Quant Developer
Fidelity Worldwide Investment
• Quant Analyst
Fidelity Worldwide Investment
Tools used:
(plus some Java, C#, Excel and VBA when I had to)
Understanding the data and tools
7. My experience since joining Metail
Lots of event stream data
Many AWS components
Outputs:
- Business Intelligence
- Bespoke Analysis
- Productionised Science
8. Tools to learn
Tools we used a year ago
• R for analysis and science
  • dplyr, tidyr, ggplot
• Looker for some of the analysis
Tools we use now
• Python
  • pandas, SQLAlchemy, boto3, seaborn
• Still some R
  • dplyr, tidyr, ggplot
• Looker for most of day-to-day analysis
• Swagger
• AWS stack
9. Data Analytics
Business intelligence
• How well is the customer factory working? (KPIs)
• What about if we do this? (A/B Tests)
• How’s our retention? (Cohort analysis)
• How efficiently are we digitising garments? (Process monitoring)
• How are we growing?
To answer this we need …
LOTS AND LOTS OF SQL! (yay.)
Most of it embedded in Looker LookML (basically YAML) (yay - again.)
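Most of that KPI SQL boils down to simple grouped aggregations, which are easy to sketch in pandas. A toy illustration with made-up events and column names (not Metail's actual schema): daily conversion rate as distinct ordering users over distinct visiting users.

```python
import pandas as pd

# Hypothetical event-stream sample; real Snowplow events carry many more fields.
events = pd.DataFrame({
    "date":    ["2016-01-01"] * 4 + ["2016-01-02"] * 3,
    "user_id": ["a", "b", "c", "a", "d", "e", "d"],
    "event":   ["visit", "visit", "visit", "order",
                "visit", "visit", "order"],
})

# Daily conversion rate: distinct ordering users / distinct visiting users.
visitors = events[events["event"] == "visit"].groupby("date")["user_id"].nunique()
orderers = events[events["event"] == "order"].groupby("date")["user_id"].nunique()
kpi = (orderers / visitors).rename("conversion_rate").fillna(0)
print(kpi)
```

In Looker the same logic sits in LookML measures; the pandas version is handy for ad-hoc checks against the raw batch views.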
13. Spread of digitised garments
• Look at positions of all digitised garments for a given category.
• ‘page’ is in units of #scrolls (based on browser height on the user’s device).
• Digitised garments on /women-dress and /women-tops-tees are more spread out than garments on /women-jeans.
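The ‘page’ unit is just the scroll offset divided by the browser height. A minimal sketch (hypothetical helper, not Metail's actual code):

```python
def page_units(scroll_top_px, viewport_height_px):
    """Convert a pixel scroll offset into 'page' units (#scrolls):
    how many browser-heights down the category page a garment sits."""
    return scroll_top_px / viewport_height_px

# A garment 3200px down the page, seen in an 800px-tall browser window:
print(page_units(3200, 800))  # 4.0
```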
14. Views by garment position
• Aggregate visitors who see garment ‘X’ in a given category on a given date.
• Scale these visitor counts by the maximum #visitors for a garment on that date in that category.
• In the /women-dress category:
  • Digitised garments are spread between 0 and 120 page scrolls, with median ~40.
  • Long “tail” of digitised garments that receive far fewer visits.
  • The average digitised garment typically gets ~20% of the visits of the most popular garment in that category (on a given day).
Date        url_path      sku     Users  Page  scaled_count
2016-01-01  /women-dress  101742  699    5.0   0.743617
2016-01-01  /women-dress  101743  700    4.0   0.744681
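The scaled_count column is a group-wise transform: each garment's visitor count divided by the day's per-category maximum. The sketch below uses the two rows from the table plus an assumed third garment with 940 users as the category maximum (an assumption, but it is the value that reproduces the scaled counts shown):

```python
import pandas as pd

# Two rows from the table above, plus an assumed per-day category maximum.
views = pd.DataFrame({
    "Date":     ["2016-01-01"] * 3,
    "url_path": ["/women-dress"] * 3,
    "sku":      [101742, 101743, 101750],
    "Users":    [699, 700, 940],
    "Page":     [5.0, 4.0, 1.0],
})

# Scale each garment's visitor count by the maximum #visitors for any
# garment in the same category on the same date.
views["scaled_count"] = (
    views["Users"]
    / views.groupby(["Date", "url_path"])["Users"].transform("max")
)
print(views[["sku", "Users", "scaled_count"]])
```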
15. Views by category
• Look at positions of all digitised garments for a given category.
• ‘page’ is in units of #scrolls (based on browser height on the user’s device).
• Digitised garments on /women-dress and /women-tops-tees are more spread out than digitised garments on /women-jeans. Could also be that there are more digitised garments in /women-tops-tees.
• There are some “hotspots” of digitised garment positions, e.g. ~page 100 for /women-tops-tees. Unfortunately, they are quite far down the category page, and visitor counts are typically around 10-20% of the values for the most popular garments (closest to the top of the category page).
[Charts of garment position distributions: /women-tops-tees, /women-jeans, /women-dress]
16. Views as time series
• Digitised garments on /women-dress over time
• The “hotspot” moves further down the page: most discernibly in the last 2 weeks.
28. Future plans: more MODELLING!
Some possibilities:
• Use engagement clustering to create labels for supervised learning
• Engagement prediction using trained machine learning
• Apply Probabilistic Graphical Modelling techniques
• (I quite like Daphne Koller’s Coursera course and book
https://www.coursera.org/learn/probabilistic-graphical-models/home/welcome )
• More Bayesian reasoning
• … (any suggestions?)
Time permitting, SAMIAM (http://reasoning.cs.ucla.edu/samiam/) demo goes here
29. Bayesian inference – what are the variables?
(Disclaimer: this is me playing around with SAMIAM for 15 minutes and not an actual model)
30. Bayesian inference – how are things related?
(Disclaimer: this is me playing around with SAMIAM for 15 minutes and not an actual model)
31. Bayesian inference – what can we infer?
(Disclaimer: this is me playing around with SAMIAM for 15 minutes and not an actual model)
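For intuition, the kind of query a tool like SAMIAM answers can be worked by hand with Bayes' rule. In the same spirit as the disclaimer above, this is a toy with entirely made-up numbers and variable names, not an actual Metail model:

```python
# Toy discrete Bayesian inference: P(engaged | visitor tried on a garment).
# All probabilities below are invented for illustration.
p_engaged = 0.3                 # prior: visitor is "engaged"
p_tryon_given_engaged = 0.8     # likelihood of a try-on if engaged
p_tryon_given_not = 0.1         # likelihood of a try-on if not engaged

# Marginal probability of observing a try-on.
p_tryon = (p_tryon_given_engaged * p_engaged
           + p_tryon_given_not * (1 - p_engaged))

# Bayes' rule: posterior probability of engagement given a try-on.
p_engaged_given_tryon = p_tryon_given_engaged * p_engaged / p_tryon
print(round(p_engaged_given_tryon, 3))  # 0.774
```

Observing a try-on raises the engagement belief from 0.3 to ~0.77; a real graphical model just chains many such updates across its network.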