This document summarizes a presentation about data science at OLX. It discusses OLX's moderation and recommender systems. For moderation, it describes OLX's machine learning models that automatically moderate listings for issues like duplicates, spam, and illegal/NSFW content. Moderators review flagged content. For recommendations, it discusses collaborative filtering and item embeddings to suggest relevant listings to users. It also outlines OLX's team structure, goal setting process, and expectations for data scientists, which include a focus on modeling, evaluation and some production work.
40. Automatic moderation system
[Diagram: a listing passes through the ML-based automatic moderation system (duplicate detection, forbidden items, other ML models) and is accepted, rejected, or sent to the moderation queue.]
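The three outcomes on this slide (accept, reject, moderation queue) could be produced by a simple routing rule over model scores. The slides do not say how the decision is actually made, so the following is only a minimal sketch: the score dictionary, the `route` function, and both thresholds are hypothetical.

```python
# Hypothetical thresholds; the slides do not state any values.
ACCEPT_BELOW = 0.2   # every model is confident the listing is fine
REJECT_ABOVE = 0.9   # at least one model is confident it violates a rule


def route(scores: dict[str, float]) -> str:
    """Route a listing given per-model violation scores in [0, 1].

    Keys are model names (e.g. "duplicate", "forbidden"); the highest
    score decides the outcome.
    """
    worst = max(scores.values(), default=0.0)
    if worst >= REJECT_ABOVE:
        return "reject"
    if worst < ACCEPT_BELOW:
        return "accept"
    # Uncertain cases go to human moderators.
    return "moderation_queue"


if __name__ == "__main__":
    print(route({"duplicate": 0.05, "forbidden": 0.10}))
    print(route({"duplicate": 0.95, "forbidden": 0.10}))
    print(route({"duplicate": 0.05, "forbidden": 0.50}))
```

Only the uncertain middle band reaches the moderation queue, which keeps human review for the cases where the models disagree or lack confidence.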
41. Duplicate detection system
[Diagram: the automatic moderation system (ML) and the duplicate detection system, together with the moderation panel (MP), moderators, s3, ES, and hashes. Outcomes: accept, reject, or moderation queue.]
42. [Same diagram] Step: Index listings & images
43. [Same diagram] Step: Detect duplicates
44. [Same diagram] Step: Moderate duplicates
45. [Same diagram] Step: Collect feedback
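The steps named on slides 42 to 45 (index listings and images, detect duplicates, moderate duplicates, collect feedback) can be sketched with exact-match hashing. This is an illustration under stated assumptions, not OLX's implementation: the slides mention only "Hashes", so the use of SHA-256, the text normalization, and the in-memory `DuplicateIndex` class (standing in for whatever storage the real system uses) are all hypothetical.

```python
import hashlib
from collections import defaultdict


def content_hash(title: str, description: str, image: bytes) -> str:
    """Hash normalized text plus raw image bytes (catches exact copies only)."""
    text = " ".join(f"{title} {description}".lower().split())
    h = hashlib.sha256()
    h.update(text.encode("utf-8"))
    h.update(b"\x00")  # separator between text and image bytes
    h.update(image)
    return h.hexdigest()


class DuplicateIndex:
    """Hypothetical in-memory index mapping content hashes to listing ids."""

    def __init__(self) -> None:
        self._by_hash: dict[str, list[str]] = defaultdict(list)
        self.feedback: list[tuple[str, str, bool]] = []

    def index(self, listing_id: str, title: str, description: str,
              image: bytes) -> list[str]:
        """Index a listing and return ids of earlier listings with the same hash."""
        key = content_hash(title, description, image)
        earlier = list(self._by_hash[key])
        self._by_hash[key].append(listing_id)
        return earlier

    def record_feedback(self, listing_id: str, duplicate_of: str,
                        confirmed: bool) -> None:
        """Store a moderator's verdict on a suspected duplicate pair."""
        self.feedback.append((listing_id, duplicate_of, confirmed))


if __name__ == "__main__":
    idx = DuplicateIndex()
    idx.index("a1", "Red bike", "Almost new", b"img-bytes")
    suspects = idx.index("a2", "red  BIKE", "almost new", b"img-bytes")
    print(suspects)  # earlier listings with identical normalized content
    idx.record_feedback("a2", "a1", confirmed=True)
```

A cryptographic hash only matches byte-identical images and identical normalized text; catching re-encoded or lightly edited copies would need a similarity-preserving hash, which this sketch does not attempt.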
57. Plan
● What is OLX
● Data Science at OLX
● Moderation system
● Recommender system
● Way of working
● Expectations from data scientists
58. A project like this is very complex
We need a team (or multiple teams) to make it work: it’s a joint effort of many people working together
59. Roles in teams
● Product Manager (PM)
● Engineering Manager (EM)
● Software Engineers
○ Backend Engineers (BE)
○ Data Engineers (DE)
○ ML Engineers (MLE)
○ Site Reliability Engineers (SRE)
○ Frontend Engineers (FE)
○ Mobile Engineers
● Product Analysts (PA)
● Data Scientists (DS)
60. Matrix structure
[Diagram: Teams A, B and C cut across the functional lines. PMs sit under the Head of Product, PAs under the Head of Analytics, DSs under the Manager Data Tech, and EMs with their engineers (BE, DE, FE, SRE) under the Head of Engineering.]
61. Feature teams
● A cross-functional team with experts in different areas
● All work together on one feature/product
● All have the same goal!
● Anyone can work on anything, as long as it helps achieve the goal
[Diagram: one feature team made up of PM, EM, PA, DS, DE, BE and SRE]
62. Goal setting
● OKRs, set quarterly
● Great alignment tool: other teams know what you’re doing
● Whatever the team is doing should be in line with its OKRs
Example:
● O
○ Catch more fraudsters
● KRs
○ Precision of model A improves from 30% to 60% while staying at the same recall level
○ Model B is tested in 5 key markets
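To make the first key result concrete, the sketch below works through what "precision from 30% to 60% at the same recall" means in terms of counts. The numbers are hypothetical and chosen only for easy arithmetic (100 real fraudsters, recall fixed at 60%); they do not come from the presentation.

```python
def precision(tp: int, fp: int) -> float:
    """Share of flagged users who really are fraudsters."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of real fraudsters that the model flags."""
    return tp / (tp + fn)


if __name__ == "__main__":
    # Hypothetical counts: 100 real fraudsters, 60 of them caught in both cases.
    tp, fn = 60, 40

    # Before: 200 users flagged in total, so 140 are false positives.
    before = precision(tp, fp=140)
    # After: 100 users flagged in total, so 40 are false positives.
    after = precision(tp, fp=40)

    print(f"recall:    {recall(tp, fn):.0%}")
    print(f"precision: {before:.0%} -> {after:.0%}")
```

With recall held constant, doubling precision in this example means cutting false positives from 140 to 40, so far fewer legitimate users are flagged for the same number of fraudsters caught.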
63. Plan
● What is OLX
● Data Science at OLX
● Moderation system
● Recommender system
● Way of working
● Expectations from data scientists
65. Data scientist
[Diagram: stages Product Definition, Data Collection, Data Processing, Modeling, Evaluation, Production, Customers, with "DATA SCIENTIST" marked over part of the pipeline]
Focus on modelling and evaluation, a bit on production
66. Data scientist
[Diagram: stages Product Definition, Data Collection, Data Processing, Modeling, Evaluation, Production, Customers, with "DATA SCIENTIST" marked over part of the pipeline]