Renaissance Technologies Presentation - Insight Data Science
Kuhan Wang
October 21st, 2015
1 / 20
Introduction
Insight Data Science: developed
a machine learning pipeline in a
consulting project.
PhD Particle Physics, McGill
University, researcher on the Large
Hadron Collider.
Led the search for microscopic
black holes and exotic gravity
states in the ATLAS Collaboration.
2 / 20
Consulting Scenario
Company X wishes to maximize user engagement through
optimal placement of advertisements on content URLs.
Ad Type: Tourism
Keyword: Cuba
Keyword: Package Tour
Keyword: Airplane
Ad Type X
Keyword 1, Keyword 2, Keyword 3, ..., Keyword N
Example: Tourism ads not ideal on investment content URL.
3 / 20
A Pipeline to Analyze Textual Features
Developed and implemented a pipeline to analyze the
importance of textual features on content URLs relative to
engagement.
Begin → Scrape URL → Process Text → Model Features → Extract Keywords → Update Keywords → Collect Data, Reiterate
4 / 20
User Engagement Data
[Figure: Summary of Engagement Data. Axis labels: Occurrences, Counts. Categories: Page Loaded, Ad Viewed, Ad Clicked.]
5 / 20
Modeling
Attempted linear regression.
Classify engagement as yes/no.
[Figure: Logistic Classification Model. Axes: Word Count (0 to 10), Probability [%] (0 to 1). Classes: Ad Clicked, Ad Not Clicked.]
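The yes/no classification on this slide can be illustrated with a minimal sketch, assuming a single word-count feature and plain batch gradient descent. The toy data, the learning rate, and the names fit_logistic and predict are hypothetical and are not taken from the delivered pipeline.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(click | x) = sigmoid(w * x + b) by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of the log loss w.r.t. the score
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Hypothetical toy data: keyword count on a page, and whether the ad was clicked.
word_counts = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
clicked = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

w, b = fit_logistic(word_counts, clicked)

def predict(x):
    """Predicted click probability for a page with word count x."""
    return sigmoid(w * x + b)
```

A positive fitted weight w means the click probability rises with the word count, which is the shape of curve the slide's plot describes.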
6 / 20
Validation
Randomly split data into training/test sets.
- Distribution of validation scores (shown for 50/50 split).
[Figure: Distribution of Precision vs Recall for 50.0% Test/Train, Ad Type 1. Axes: Precision (0.55 to 0.85), Recall (0.4 to 0.85). Color scale: Number of MC Toys (0 to 80). Marker: ⟨Precision, Recall⟩.]
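The repeated random splitting behind such a distribution can be sketched as follows. This is an illustration under stated assumptions: the data are synthetic, fit_threshold is a hypothetical stand-in for the real classifier, and n_toys is named after the plot's "Number of MC Toys" label.

```python
import random

def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, using 0.0 when a ratio is undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def fit_threshold(train):
    """Stand-in model (an assumption): midpoint between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def repeated_split_scores(data, test_fraction=0.5, n_toys=200, seed=0):
    """Shuffle, split, fit on train, and score on test, once per toy."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_toys):
        shuffled = data[:]
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        test, train = shuffled[:n_test], shuffled[n_test:]
        threshold = fit_threshold(train)
        y_true = [y for _, y in test]
        y_pred = [1 if x > threshold else 0 for x, _ in test]
        scores.append(precision_recall(y_true, y_pred))
    return scores

# Synthetic (feature, label) pairs: clicked pages have a higher feature value on average.
data_rng = random.Random(1)
data = [(data_rng.gauss(6, 2), 1) for _ in range(100)]
data += [(data_rng.gauss(3, 2), 0) for _ in range(100)]

scores = repeated_split_scores(data)
mean_precision = sum(p for p, _ in scores) / len(scores)
mean_recall = sum(r for _, r in scores) / len(scores)
```

Each entry of scores is one (precision, recall) point; plotting them as a 2D histogram would give a distribution of the kind shown on the slide.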
7 / 20
Deliverables
Extracted keywords:
Rank  Ad Type 1  Ad Type 2       Ad Type 3    Ad Type 4
1     debt       coordinator     mortgage     gold
2     gift       administrative  home         0
3     profit     minimum         procurement  stock
4     check      minimum wage    loan         fund
5     balance    reports         trustee      event
The Python pipeline was delivered to the company for
implementation.
Project details: http://kuhanw.zohosites.com/.
8 / 20
9 / 20
Dissertation Project
Particle colliders recreate conditions in the early universe.
Searched for signatures of microscopic gravity at the Large Hadron
Collider.
10 / 20
The Large Hadron Collider
27 km ring, most powerful particle accelerator built to date.
- 13 TeV collisions.
ATLAS: a giant particle detector.
Produced black holes leave debris due to evaporation inside detector.
[Figure 2.1: Schematic layout of the LHC (Beam 1: clockwise, Beam 2: anticlockwise). Source: 2008 JINST 3 S08001.]
11 / 20
Data Processing
Developed complete analysis
pipeline in C++.
- Processed ∼10 TB of LHC data
using distributed computing
methods.
Data flow: Raw Data From Detector → Processed Data with Objects → Analysis Data Structure → Histogram Data for Final Fitting
Size labels in the diagram: ~TB, ~100 GB, ~GB, ~TB, ~MB
12 / 20
Technical Analysis
[Figure: Black Hole Signals and Background Prediction, plotted against ~Energy of event.]
Quantify compatibility with likelihood model.
$$L(n_s \mid \mu, b, \theta) = P(n_s \mid s, \mu, b, \theta) \times \prod_i N_{\mathrm{syst}}(\theta^{0}, \theta, \sigma_\theta)_i \tag{1}$$
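A likelihood of this form can be evaluated numerically as in the sketch below. The slide does not define its terms, so two assumptions are made here: P is taken to be a Poisson probability with mean µ·s + b, and each N_syst term is taken to be a Gaussian constraint on a nuisance parameter θ with nominal value theta0 and width sigma_theta. All function names are hypothetical.

```python
import math

def log_poisson(n, lam):
    """Log of the Poisson probability of observing n events given mean lam."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)

def log_gauss(x, mean, sigma):
    """Log of a Gaussian density, used here as a constraint term (an assumption)."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(n_obs, s, mu, b, theta, theta0, sigma_theta):
    """Log of: Poisson(n_obs | mu * s + b) times the product of constraint terms."""
    total = log_poisson(n_obs, mu * s + b)
    for t, t0, sig in zip(theta, theta0, sigma_theta):
        total += log_gauss(t, t0, sig)
    return total

# Hypothetical numbers: 12 observed events, 5 expected signal, 10 expected background.
ll_background_only = log_likelihood(12, 5.0, 0.0, 10.0, [0.0], [0.0], [1.0])
ll_signal_plus_bkg = log_likelihood(12, 5.0, 1.0, 10.0, [0.0], [0.0], [1.0])
```

Working in log space keeps the product of many small terms numerically stable; comparing the two values shows which hypothesis the toy observation favors.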
13 / 20
Results
Placed leading constraints on models of microscopic gravity physics.
[Figure: 95% CL exclusion contours. Labels: models of n extra dimensions, Planck mass of theory, black hole mass, model type.]
Public results: JHEP 07 (2015) 032, arXiv:1503.08988 [hep-ex]
14 / 20
ATLAS Detector
15 / 20
Thank you for your time.
16 / 20
Large Extra Spatial Dimensions
The size and number of extra spatial dimensions suppress the
observed gravitational strength.
Observed gravity is weaker than intrinsic gravity within the
bulk.
17 / 20
Backup
[Figure: Feature Frequency/Documents, Ad Type 1. Axes: Feature Frequency/Documents (0 to 1), Relative Number of Documents [%] (10^-4 to 1).]
18 / 20
FeatureRank
Kuhan Wang
Insight Data Science
October 2, 2015
Abstract
FeatureRank is a software tool for extracting correlations between text
ngram features and user engagement, thereby optimizing the placement
of financial widgets on URL articles.
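As an illustration of what "text ngram features" can mean in practice, here is a minimal sketch that counts unigrams and bigrams in a piece of text. The tokenization rule and the function names are assumptions, not FeatureRank's actual implementation.

```python
import re
from collections import Counter

def ngrams(text, n):
    """Return the list of n-grams (as space-joined strings) in lowercase text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_counts(text, max_n=2):
    """Count all n-grams of length 1 up to max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(text, n))
    return counts

# Hypothetical example sentence.
counts = ngram_counts("Minimum wage rises; the minimum wage debate continues.")
```

Counts like these, computed per URL, could serve as the feature columns that a classifier relates to engagement.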
1 Directory Structure
• /
  processing.py: Pre-processing to parse relevant information from engagement csv files.
  crawl.py: A simple web crawler that pulls the title and <p> tag text from URLs.
  FeatureRank.py: Driver file to execute main functions.
  feature_extraction_model.py: The core program that contains the machine learning algorithms.
  post_processing.py: Post processing to produce evaluation metrics and ngram rankings.
  web_text_data_set_1_2.json
19 / 20
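The kind of extraction attributed to crawl.py above (the title plus <p> tag text) can be sketched with the Python standard library alone. This is not the author's code: the class and function names are hypothetical, and fetching the page over the network is left out so the example runs on an inline HTML string.

```python
from html.parser import HTMLParser

class TitleAndParagraphParser(HTMLParser):
    """Collect the <title> text and the text found inside <p> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False
        self._buffer = []

    def _flush(self):
        # Join the buffered pieces and normalize whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p":
            if self._in_p:
                self._flush()  # an unclosed <p> ends where the next one begins
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self._buffer.append(data)

def extract_text(html):
    """Return (title, list of paragraph strings) for an HTML document."""
    parser = TitleAndParagraphParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.paragraphs

# Hypothetical page content.
sample = ("<html><head><title>Rates rise</title></head><body>"
          "<p>Mortgage <b>rates</b> rose.</p><div>menu</div><p>Gold fell.</p>"
          "</body></html>")
title, paragraphs = extract_text(sample)
```

Restricting the crawl to the title and <p> text skips navigation menus and other page chrome, which would otherwise add noise to the ngram features.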
[Figure: Distribution of Precision vs Recall for 0.33% Test/Train, Ad Type 1. Axes: Precision (0.55 to 0.85), Recall (0.4 to 0.85). Color scale: Number of MC Toys (0 to 25). Marker: ⟨Precision, Recall⟩.]
20 / 20

