Improvement for StackOverflow.com
Chentao Zhang
Insight Data Engineering SV
Motivations
java
hadoop
java
Input Data
Question:
{
  "post_id": 67172,
  "post_date": "6-10-2015-00-01-02",
  "type": 0,
  "parent_id": 0,
  "title": "Java Exception",
  "body": "....",
  "tags": "java;algorithm",
  "user_id": 782,
  …
}
Answer:
{
  "post_id": 67172,
  "post_date": "6-10-2015-00-01-23",
  "type": 1,
  "parent_id": 67172,
  "title": "",
  "body": "You should....",
  "tags": "",
  "user_id": 1982,
  …
}
Data Modeling and Queries
Elasticsearch
1. Index
• A collection of documents that have somewhat similar characteristics
• Corresponds to a ‘database’ in a relational database
2. Type
• A logical category/partition of an index whose semantics are completely up to you
• Corresponds to a ‘table’ in a relational database
3. Document
• A basic unit of information that can be indexed
• Corresponds to a ‘row’ in a relational database
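The index/type/document hierarchy can be pictured with a small in-memory sketch. This is only an analogy and does not use the Elasticsearch client API: the nested dictionary and the `get_document` helper are hypothetical, and the index and type names are borrowed from the stackover/questions example used in this deck.

```python
# In-memory analogy only: index -> type -> document id -> document.
# Not the Elasticsearch API; get_document is a hypothetical helper.
store = {
    "stackover": {                                        # index    (~ database)
        "questions": {                                    # type     (~ table)
            231: {"tags": "Java", "answer_time": 3010},   # document (~ row)
            290: {"tags": "spark", "answer_time": 7381},
        }
    }
}


def get_document(index, doc_type, doc_id):
    """Look up one document by index, type and id; return None if absent."""
    return store.get(index, {}).get(doc_type, {}).get(doc_id)


print(get_document("stackover", "questions", 231))
```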
stackover/questions (index: stackover, type: questions, each row: one document):

question_id  tags   answer_time (sec)  posted_at            Random_sq
231          Java   3010               2016_01_02_21_20_01  11_10
290          spark  7381               2016_01_02_22_09_01  11_28
341          Java   5611               2016_01_10_01_02_05  11_31
• Probability that a question labeled with a specific tag (such as ‘java’) is answered within 10 minutes
= (number of questions tagged ‘java’ and answered within 10 minutes)
/ (total number of questions tagged ‘java’)
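The ratio above can be sketched in a few lines of Python. This is a minimal illustration rather than the query used in the project: the rows copy the sample stackover/questions table, the field names and the `prob_answered_within` helper are assumptions, and each row is assumed to carry a single tag.

```python
# Rows mirror the sample stackover/questions table; field names are assumptions.
QUESTIONS = [
    {"question_id": 231, "tags": "Java", "answer_time": 3010},
    {"question_id": 290, "tags": "spark", "answer_time": 7381},
    {"question_id": 341, "tags": "Java", "answer_time": 5611},
]


def prob_answered_within(questions, tag, limit_sec=600):
    """Fraction of questions carrying `tag` whose answer_time is <= limit_sec."""
    tagged = [q for q in questions if q["tags"].lower() == tag.lower()]
    if not tagged:
        return None  # no question has this tag, so the ratio is undefined
    fast = sum(1 for q in tagged if q["answer_time"] <= limit_sec)
    return fast / len(tagged)


print(prob_answered_within(QUESTIONS, "java"))        # 10-minute window
print(prob_answered_within(QUESTIONS, "java", 3600))  # 1-hour window
```

Neither sample ‘java’ question was answered within 600 seconds, so the 10-minute ratio is 0 for this toy table; widening the window to an hour captures one of the two.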
• Stratified sampling, with strata defined by:
  - tags
  - posted_at (month)
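One way to picture stratified sampling over these two keys is the sketch below. It is an assumption-laden illustration, not the project's implementation: the rows are synthetic, the `stratified_sample` helper is hypothetical, and it draws a fixed share from each stratum with `random.Random.sample`, whereas the pipeline described in the deck generates a random number based on the sampling fraction.

```python
import random
from collections import defaultdict


def stratified_sample(rows, fraction, seed=0):
    """Sample about `fraction` of the rows from every (tag, month) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        # posted_at looks like 2016_01_02_21_20_01; the first 7 characters are year_month.
        key = (row["tags"].lower(), row["posted_at"][:7])
        strata[key].append(row)

    sample = []
    for key in sorted(strata):
        members = strata[key]
        k = max(1, round(len(members) * fraction))  # keep at least one row per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Synthetic rows for illustration only (not data from the deck).
rows = [
    {"question_id": i, "tags": tag, "posted_at": f"2016_{month}_02_21_20_01"}
    for i, (tag, month) in enumerate(
        [("Java", "01")] * 4 + [("Java", "02")] * 2 + [("spark", "01")] * 4
    )
]
picked = stratified_sample(rows, fraction=0.5)
print(len(picked))
```

Sampling inside each stratum keeps rare tag and month combinations represented, which a single uniform draw over all questions would not guarantee.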
stackovergraph/userstags:

userid  tags
231     ["Java", "hadoop"]
290     ["hadoop", "Spark"]
341     ["Java", "sql", "hadoop"]
stackovergraph/userstags, with the tag graph built from it:

userid  tags
231     ["Java", "JVM"]
290     ["JVM", "Spark"]
341     ["Java", "sql", "JVM"]

Tag    num
Java   2
JVM    3
Spark  1
sql    1

Edge         weight
Java - JVM   2
JVM - Spark  1
Java - sql   1
JVM - sql    1
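The graph construction can be sketched as follows. This is a minimal reading of the slides, not the author's code: it assumes tags are compared case-insensitively (so "java" and "Java" count as one tag), and the `build_tag_graph` helper is hypothetical.

```python
from collections import Counter
from itertools import combinations

# User -> tags, as in the sample userstags table.
USER_TAGS = {
    231: ["Java", "JVM"],
    290: ["JVM", "Spark"],
    341: ["Java", "sql", "JVM"],
}


def build_tag_graph(user_tags):
    """Return (tag_counts, edge_weights) for an undirected tag co-occurrence graph."""
    tag_counts = Counter()
    edge_weights = Counter()
    for tags in user_tags.values():
        # Assumption: tags are case-insensitive; each user counts once per tag.
        unique = sorted({t.lower() for t in tags})
        tag_counts.update(unique)
        # Every unordered pair of tags answered by the same user adds 1 to that edge.
        edge_weights.update(combinations(unique, 2))
    return tag_counts, edge_weights


tag_counts, edge_weights = build_tag_graph(USER_TAGS)
print(dict(tag_counts))
print(dict(edge_weights))
```

Each node count is the number of users who answered questions with that tag, and each edge weight is the number of users who answered both tags.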
Recommend tags for users:
Similarity of "A" to "B"
= proportion of the people who can answer "A" questions who can also answer "B" questions
= weight of edge AB / number of people who have answered "A" questions
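The similarity ratio can be tried out on the sample users. In this sketch the `similarity` function follows the formula above, while the `recommend` helper and its ranking rule (score each unanswered tag by its best similarity from any tag the user has answered) are assumptions added for illustration; tags are again treated as case-insensitive.

```python
USER_TAGS = {
    231: ["Java", "JVM"],
    290: ["JVM", "Spark"],
    341: ["Java", "sql", "JVM"],
}


def similarity(user_tags, a, b):
    """Weight of edge a-b divided by the number of users who answered tag a."""
    a, b = a.lower(), b.lower()
    answered_a = 0
    answered_both = 0
    for tags in user_tags.values():
        seen = {t.lower() for t in tags}
        if a in seen:
            answered_a += 1
            if b in seen:
                answered_both += 1
    return answered_both / answered_a if answered_a else 0.0


def recommend(user_tags, user_id):
    """Rank tags the user has not answered (hypothetical rule: best similarity wins)."""
    mine = {t.lower() for t in user_tags[user_id]}
    all_tags = {t.lower() for tags in user_tags.values() for t in tags}
    scores = {
        cand: max(similarity(user_tags, a, cand) for a in mine)
        for cand in all_tags - mine
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


print(similarity(USER_TAGS, "Java", "JVM"))
print(similarity(USER_TAGS, "JVM", "Java"))
print(recommend(USER_TAGS, 231))
```

The measure is not symmetric: both Java answerers also answered JVM, but only two of the three JVM answerers answered Java.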
Data Pipeline
Historical data (60 GB)
Streaming data
1. Compute how long each question takes to get an answer
2. Generate a random number based on the sampling fraction
3. Compute which types of questions each user has answered (constructing the graph)
1. Sample the data
2. Compute the probability
3. Search for neighbors
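The first processing step, computing the time to answer, might look like the sketch below when applied to the question and answer records shown under Input Data. It is an illustration only: the `answer_time_seconds` helper is hypothetical, and the month-day-year reading of `post_date` is an assumption, since the slides do not state the field order.

```python
from datetime import datetime

# Assumed layout of post_date: month-day-year-hour-minute-second.
DATE_FORMAT = "%m-%d-%Y-%H-%M-%S"


def answer_time_seconds(question, posts):
    """Seconds from the question's post_date to its earliest answer, or None."""
    asked = datetime.strptime(question["post_date"], DATE_FORMAT)
    answer_times = [
        datetime.strptime(p["post_date"], DATE_FORMAT)
        for p in posts
        # type 1 marks an answer; parent_id points back to the question.
        if p["type"] == 1 and p["parent_id"] == question["post_id"]
    ]
    if not answer_times:
        return None  # unanswered question
    return int((min(answer_times) - asked).total_seconds())


# Records copied from the Input Data slide (only the fields used here).
question = {"post_id": 67172, "post_date": "6-10-2015-00-01-02", "type": 0, "parent_id": 0}
answer = {"post_id": 67172, "post_date": "6-10-2015-00-01-23", "type": 1, "parent_id": 67172}

print(answer_time_seconds(question, [answer]))
```

The result corresponds to the answer_time (sec) field stored with each question document.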
About Me
• Chentao (Sam) Zhang
• MS in Electrical & Computer Engineering from the University of Delaware
• Passionate about learning and trying new things
