Experimentation abounds, but how do we test our tests? I’ll share some ways we at Etsy proved our experimentation methods broken, and the approach we took to fixing them. I’ll discuss multiple ways of running A/A tests (as opposed to A/B tests), and a statistical method called bootstrapping, which we used to remedy our experiment analysis.
21. take the 95% confidence interval
[Figure: the boot 1–4 results are pooled into a distribution of t-statistics; its 2.5th and 97.5th percentiles bound the 95% confidence interval]
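A minimal sketch of that step, in Python with illustrative names and simulated values rather than anything from the talk: it takes the 2.5th and 97.5th percentiles of an array of bootstrapped t-statistics.

```python
import numpy as np

def percentile_ci(t_stats, alpha=0.05):
    """95% interval: the 2.5th and 97.5th percentiles of the
    bootstrapped t-statistic distribution."""
    return tuple(np.percentile(t_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)]))

# Stand-in for the pooled "boot 1..4 results"; real code would collect
# one t-statistic per bootstrap replicate.
rng = np.random.default_rng(0)
t_stats = rng.standard_t(df=50, size=10_000)
print(percentile_ci(t_stats))
```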
22. BOOTSTRAP BY USER
Bootstrap by user instead of by "visit" (one user can have many "visits").

user id   visit id   $
A         45         1
A         23         1
A         85         0
B         37         0
C         12         1
C         72         0

Grouping the visits under their users (A, B, C) and resampling whole users is closer to i.i.d. (independent & identically distributed).
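Here is a small sketch of that user-level resampling, assuming the slide's table sits in a pandas DataFrame; the column names and the `bootstrap_by_user` helper are illustrative, not Etsy's actual code.

```python
import numpy as np
import pandas as pd

# Visits nested within users, in the shape of the slide's table.
visits = pd.DataFrame({
    "user_id":  ["A", "A", "A", "B", "C", "C"],
    "visit_id": [45, 23, 85, 37, 12, 72],
    "dollars":  [1, 1, 0, 0, 1, 0],
})

def bootstrap_by_user(df, rng):
    """Resample USERS with replacement and keep every visit belonging to
    each sampled user, so the resampled unit is closer to i.i.d."""
    users = df["user_id"].unique()
    sampled = rng.choice(users, size=len(users), replace=True)
    return pd.concat([df[df["user_id"] == u] for u in sampled], ignore_index=True)

rng = np.random.default_rng(0)
boot = bootstrap_by_user(visits, rng)
```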
25. fetch a bag: resample withOUT replacement
[Figure: from the users in our original experiment (A, B, C, D, E, F), bag 1 is drawn without replacement, e.g. A, E, F, B]
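A sketch of fetching a bag, assuming we already hold the list of user ids; `fetch_bag` and the bag size are illustrative choices, not the talk's code.

```python
import numpy as np

users = np.array(["A", "B", "C", "D", "E", "F"])  # users in our original experiment

def fetch_bag(users, bag_size, rng):
    """Draw one bag: a subset of users sampled withOUT replacement."""
    return rng.choice(users, size=bag_size, replace=False)

rng = np.random.default_rng(1)
bag_1 = fetch_bag(users, bag_size=4, rng=rng)  # e.g. something like ['A' 'E' 'F' 'B']
```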
26. monte carlo subsamples: resample WITH replacement (from our bag) *TO SIZE OF ORIGINAL DATA SET*
[Figure: from the original experiment's users (A, B, C, D, E, F), bag 1 = F, A, E, B; boot 1 resamples the bag with replacement back up to six users, e.g. E, B, F, A, F, B]
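Continuing the sketch, the boot resamples with replacement from the bag back up to the original sample size; again the names and example bag are illustrative.

```python
import numpy as np

users = np.array(["A", "B", "C", "D", "E", "F"])   # original experiment
bag_1 = np.array(["F", "A", "E", "B"])             # a bag drawn without replacement

def monte_carlo_boot(bag, original_size, rng):
    """Resample WITH replacement from the bag, back up to the size of the
    original data set, so each boot has the full sample size but only
    contains users from its bag."""
    return rng.choice(bag, size=original_size, replace=True)

rng = np.random.default_rng(2)
boot_1 = monte_carlo_boot(bag_1, original_size=len(users), rng=rng)
```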
29. confidence interval from bag 1
[Figure: from bag 1 (F, A, E, B), several Monte Carlo boots (boot 1–4) are drawn with replacement; the t-statistics across those boots form a distribution whose 2.5th and 97.5th percentiles give the confidence interval from bag 1]
30. average the confidence intervals
[Figure: bag 1–4 each produce their own boot results and t-statistic distribution; the per-bag 2.5th and 97.5th percentiles are averaged into avg. 2.5th and avg. 97.5th bounds over the averaged t-statistics]
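A sketch of this last step, averaging the per-bag intervals endpoint-wise; the intervals below are made-up numbers just to show the shape of the computation.

```python
import numpy as np

def average_cis(cis):
    """Average the per-bag confidence intervals endpoint-wise: the mean of the
    2.5th-percentile bounds and the mean of the 97.5th-percentile bounds."""
    cis = np.asarray(cis)          # shape (n_bags, 2): one (lower, upper) per bag
    return cis[:, 0].mean(), cis[:, 1].mean()

# Made-up (lower, upper) intervals standing in for the bag 1..4 results.
bag_cis = [(-2.1, 1.9), (-1.8, 2.2), (-2.0, 2.0), (-1.9, 2.1)]
print(average_cis(bag_cis))        # (-1.95, 2.05)
```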
31. WINS
Fixes i.i.d. & distribution.
Suitable for distributed systems.
Faster, less memory.
40. Resources
Kleiner, A., Talwalkar, A., Sarkar, P., and Jordan, M. I. A scalable bootstrap for massive data. arXiv preprint arXiv:1112.5016v2, 2012. URL http://arxiv.org/abs/1112.5016v2
Bakshy, E., and Eckles, D. Uncertainty in Online Experiments with Dependent Data: An Evaluation of Bootstrap Methods. arXiv preprint arXiv:1304.7406v3, 2013. URL https://arxiv.org/pdf/1304.7406v3.pdf
Idea for bootstrapping at Etsy: @hpster (Hilary Parker)