SlideShare a Scribd company logo
1 of 16
Modeling the Evolution of Product Entities
Priya Radhakrishnan1, Manish Gupta1,2, Vasudeva Varma1
1Search and Information Extraction Lab, IIIT-Hyderabad, India
2Microsoft, Hyderabad, India
ACM
SIGIR 2014
July 6-11, 2014
The 37th Annual International ACM SIGIR Conference
July 9, 2014
priya.r@research.iiit.ac.in, gmanish@microsoft.com, vv@iiit.ac.in
Motivation
• Predict the previous version (predecessor) of a product entity
• Link various versions of a product in a temporal order
– Windows 7.0 > Windows 8.0
• Helps study evolution of product entities
– Ubuntu 4.10 > Ubuntu 5.04 > Ubuntu 5.10 > Ubuntu 6.06
– Install CD > USB > LiveCD
Finding the predecessor of a product entity is the first step
towards modeling the Evolution of Product Entities
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
2
A Typical Product Listing on Amazon
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
3
• Deceptively easy task
• No common naming convention for versions or products
– Windows (3.0 > 95 > 98 > 2000 > XP > 7.0 > 8.0)
– Ubuntu (Warty > Hoary > Breezy > Dapper > Edgy )
• Product mentions occur in unstructured natural language form
– “JVC S-VHS Camcorder”, “Super VHS Camcorder”, “S VHS Camcorder”
Challenges
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
4
Proposed Approach
• Two stage approach
– Stage 1: Given a set of product entity names, we parse
the product names to identify the brand, product and
version indicating words from the product name
– Stage 2: Cluster products based on brand and
product. Within a cluster, rank the version(s) in
temporal order
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
5
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
A Two Stage Approach
Stage 1
Leica D-Lux 6 digital camera
Input Output
D-Lux 6 digital camera
Stage 2
Leica D-Lux 6 digital camera
Leica D-Lux 4 digital camera
Digital camera Leica D-Lux 5
Leica
D-Lux
4 5 6
Canon A630
Canon A640
Canon
A630 A640
6
Stage 1 – Parsing the Product Title (1)
• Categorizing words in the product title as brand, product, version and other
• Conditional Random Fields (CRF) based approach
• CRF tagger is trained on manually labeled (word, label) pairs using 3
features
– Product description
– Context patterns surrounding the labels
– Linguistic patterns frequently associated with the labels
• Label words in product titles from the test set
• Given any query product version, we identify its (brand, product)
• Group together product entities that have the same brand and product as
the given query
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
7
Stage 1 – Parsing the Product Title (2)
• Feature Set
– Product description
• Description, weight, review, model, category, URL
– Context patterns
• Position of the word title, alphabet, numeral, parenthesis, previous word
– Linguistic patterns
• Words of the product title have POS tag (NNP, MD, VB, JJ, NN, CD, NNS, IN, RB, DT, VBP, VBD, CC, VBN, JJS, VBZ,
LS, VBG, FW, PRP$, PRP, SYM)
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
8
Stage 2 – Predicting Predecessor Version
Members of a set of product entities with the same brand name and product
name are candidates for being Predecessor version of query entity’s version
• Classification based approach
• Binary features on Ordering
– Lexical : Does the candidate lexically precede the given query product version?
– Review-Date : Is the candidate older than the given query product version based on
review date?
– Mentions : Was the candidate mentioned in the query product versions description or
reviews?
• Nokia Lumia 1020 precedes Nokia Lumia 1320 by Lexical and Review-date
features
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
9
Dataset
• Crawled 462K product description pages from www.amazon.com
• Labeled 500 from “camera and photo” category
• 40 out of the 500 product titles had predecessor version
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
10
Results – Stage 1
Parsing the product title
CRF Accuracy for the Product Title Parsing Task
Label P R F
Brand name 0.98 0.65 0.77
Product name 0.89 0.58 0.69
Version 0.69 0.48 0.55
Product name/Version 0.88 0.55 0.67
Others 0.84 0.98 0.91
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
11
Results – Stage 2
Predicting the Predecessor Version
Classifier Accuracy for the Positive Class for Product Predecessor Version
Prediction
Feature TP FP P R F
Lexical + Review- Date 0.632 0.049 0.533 0.632 0.578
All Features 0.579 0.049 0.512 0.579 0.543
Review-Date 0.579 0.061 0.458 0.579 0.512
Review-Date + Mentions 0.553 0.047 0.512 0.553 0.532
Lexical + Mentions 0.5 0.049 0.475 0.5 0.487
Lexical 0.5 0.056 0.442 0.5 0.469
Mentions 0.45 0.049 0.462 0.45 0.456
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
12
Related Work
• Extracting attribute values for product entities
1. Mauge, Karin et al. Structuring E-Commerce Inventory. ACL ’12
2. Putthividhya, Duangmanee et al. Bootstrapped Named Entity Recognition
for Product Attribute Extraction. EMNLP ’11
• Identifying related entities
3. N. Bach and S. Badaskar. A Survey on Relation Extraction LTI CMU 2007
4. L. Fang, A. D. Sarma, C. Yu, and P. Bohannon. REX: Explaining
Relationships Between Entity Pairs. PVLDB, 5(3):241252, Nov 2011
None of the above works focus on extracting the version information from the
product listing titles, which we have accomplished in the proposed work
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
13
Conclusion
• A two-stage approach to find the predecessor of a product entity
• First stage achieved a precision of ~88%
• Second stage achieved a precision of ~53%
• Applications
– Product search engine ranking
– Comparing product versions
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
14
Future Directions
We plan to enhance this work as follows
• For building product version trees
• Studying how features of product entities evolve
Code
• https://github.com/priyaradhakrishnan0/EntityRanking
Golden dataset
• IIIT-H Internal Link
– http://10.2.4.116/backup/sieldata/datasets/EntityRanking/data.zip
• External Link
– http://tinyurl.com/lclapy8
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
15
Thank you for your time and attention
Modeling the Evolution of Product Entities
(priya.r@research.iiit.ac.in)
16

More Related Content

Similar to Radhakrishnan14 sigir

Product design and development ch2
Product design and development ch2Product design and development ch2
Product design and development ch2Kavindra Singh
 
Modern development paradigms
Modern development paradigmsModern development paradigms
Modern development paradigmsIvano Malavolta
 
Knowledge based enterprisinjg strategy for lean product develoment
Knowledge based enterprisinjg strategy for lean product develomentKnowledge based enterprisinjg strategy for lean product develoment
Knowledge based enterprisinjg strategy for lean product develomentKnowledge Solution
 
[2016/2017] Modern development paradigms
[2016/2017] Modern development paradigms [2016/2017] Modern development paradigms
[2016/2017] Modern development paradigms Ivano Malavolta
 
Fundamentals of software development
Fundamentals of software developmentFundamentals of software development
Fundamentals of software developmentPratik Devmurari
 
Manual Testing Notes
Manual Testing NotesManual Testing Notes
Manual Testing Notesguest208aa1
 
Production ML Systems and Computer Vision with Google Cloud
Production ML Systems and Computer Vision with Google CloudProduction ML Systems and Computer Vision with Google Cloud
Production ML Systems and Computer Vision with Google Cloudgdgsurrey
 
SPM Lecture 1 - Introduction
SPM Lecture 1 - IntroductionSPM Lecture 1 - Introduction
SPM Lecture 1 - IntroductionGarm Lucassen
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
ACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docx
ACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docxACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docx
ACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docxSALU18
 
product aspect ranking and applications
product aspect ranking and applicationsproduct aspect ranking and applications
product aspect ranking and applicationsswathi78
 
Solo Requisitos 2008 - 07 Upc
Solo Requisitos 2008 - 07 UpcSolo Requisitos 2008 - 07 Upc
Solo Requisitos 2008 - 07 UpcPepe
 
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State UniversityLSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State Universitydhabalia
 
Ch 9 traceability and verification
Ch 9 traceability and verificationCh 9 traceability and verification
Ch 9 traceability and verificationKittitouch Suteeca
 
Software engineering
Software engineeringSoftware engineering
Software engineeringnidhi5388
 
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...Lucidworks
 
Off-Label Data Mesh: A Prescription for Healthier Data
Off-Label Data Mesh: A Prescription for Healthier DataOff-Label Data Mesh: A Prescription for Healthier Data
Off-Label Data Mesh: A Prescription for Healthier DataHostedbyConfluent
 

Similar to Radhakrishnan14 sigir (20)

Product design and development ch2
Product design and development ch2Product design and development ch2
Product design and development ch2
 
Modern development paradigms
Modern development paradigmsModern development paradigms
Modern development paradigms
 
Knowledge based enterprisinjg strategy for lean product develoment
Knowledge based enterprisinjg strategy for lean product develomentKnowledge based enterprisinjg strategy for lean product develoment
Knowledge based enterprisinjg strategy for lean product develoment
 
[2016/2017] Modern development paradigms
[2016/2017] Modern development paradigms [2016/2017] Modern development paradigms
[2016/2017] Modern development paradigms
 
Fundamentals of software development
Fundamentals of software developmentFundamentals of software development
Fundamentals of software development
 
Manual Testing Notes
Manual Testing NotesManual Testing Notes
Manual Testing Notes
 
Projects worked at Walmart
Projects worked at WalmartProjects worked at Walmart
Projects worked at Walmart
 
Production ML Systems and Computer Vision with Google Cloud
Production ML Systems and Computer Vision with Google CloudProduction ML Systems and Computer Vision with Google Cloud
Production ML Systems and Computer Vision with Google Cloud
 
Data Science curriculum
Data Science curriculumData Science curriculum
Data Science curriculum
 
SPM Lecture 1 - Introduction
SPM Lecture 1 - IntroductionSPM Lecture 1 - Introduction
SPM Lecture 1 - Introduction
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
ACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docx
ACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docxACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docx
ACCOUNTING INFORMATION SYSTEMSAccess and Data Analytics Test.docx
 
product aspect ranking and applications
product aspect ranking and applicationsproduct aspect ranking and applications
product aspect ranking and applications
 
Solo Requisitos 2008 - 07 Upc
Solo Requisitos 2008 - 07 UpcSolo Requisitos 2008 - 07 Upc
Solo Requisitos 2008 - 07 Upc
 
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State UniversityLSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
 
Ch 9 traceability and verification
Ch 9 traceability and verificationCh 9 traceability and verification
Ch 9 traceability and verification
 
Software engineering
Software engineeringSoftware engineering
Software engineering
 
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
 
Off-Label Data Mesh: A Prescription for Healthier Data
Off-Label Data Mesh: A Prescription for Healthier DataOff-Label Data Mesh: A Prescription for Healthier Data
Off-Label Data Mesh: A Prescription for Healthier Data
 
AI_in_Aero_UAV.pdf
AI_in_Aero_UAV.pdfAI_in_Aero_UAV.pdf
AI_in_Aero_UAV.pdf
 

Recently uploaded

SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...
SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...
SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...Varun Mithran
 
Beyond Inbound: Unlocking the Secrets of API Egress Traffic Management
Beyond Inbound: Unlocking the Secrets of API Egress Traffic ManagementBeyond Inbound: Unlocking the Secrets of API Egress Traffic Management
Beyond Inbound: Unlocking the Secrets of API Egress Traffic Managementseank14
 
一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书A
 
AI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model GeneratorAI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model Generator3DailyAI1
 
如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证
如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证
如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证hfkmxufye
 
一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样
一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样
一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样AS
 
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样AS
 
一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样
一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样
一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样AS
 
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformonhackersuli
 
一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样
一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样
一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样Fi
 
原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样
原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样
原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样asdafd
 
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.Tortogel
 
一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理
一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理
一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理Fir
 
一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理
一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理
一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理Fir
 
Free scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirtsFree scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirtsrahman018755
 
Discovering OfficialUSA.com Your Go-To Resource.pdf
Discovering OfficialUSA.com Your Go-To Resource.pdfDiscovering OfficialUSA.com Your Go-To Resource.pdf
Discovering OfficialUSA.com Your Go-To Resource.pdfSadaf Khan
 
Dan Quinn Commanders Feather Dad Hat Hoodie
Dan Quinn Commanders Feather Dad Hat HoodieDan Quinn Commanders Feather Dad Hat Hoodie
Dan Quinn Commanders Feather Dad Hat Hoodierahman018755
 
Reggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirtsrahman018755
 
The Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdfThe Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdfe-Market Hub
 

Recently uploaded (20)

SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...
SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...
SOC Analyst Guide For Beginners SOC analysts work as members of a managed sec...
 
Beyond Inbound: Unlocking the Secrets of API Egress Traffic Management
Beyond Inbound: Unlocking the Secrets of API Egress Traffic ManagementBeyond Inbound: Unlocking the Secrets of API Egress Traffic Management
Beyond Inbound: Unlocking the Secrets of API Egress Traffic Management
 
一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书一比一定制波士顿学院毕业证学位证书
一比一定制波士顿学院毕业证学位证书
 
AI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model GeneratorAI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model Generator
 
如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证
如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证
如何办理(UCLA毕业证)加州大学洛杉矶分校毕业证成绩单本科硕士学位证留信学历认证
 
GOOGLE Io 2024 At takes center stage.pdf
GOOGLE Io 2024 At takes center stage.pdfGOOGLE Io 2024 At takes center stage.pdf
GOOGLE Io 2024 At takes center stage.pdf
 
一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样
一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样
一比一原版(Design毕业证书)新加坡科技设计大学毕业证原件一模一样
 
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
一比一原版(毕业证书)新加坡南洋理工学院毕业证原件一模一样
 
一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样
一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样
一比一原版(Polytechnic毕业证书)新加坡理工学院毕业证原件一模一样
 
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon
 
一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样
一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样
一比一原版(UWE毕业证书)西英格兰大学毕业证原件一模一样
 
原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样
原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样
原版定制(Management毕业证书)新加坡管理大学毕业证原件一模一样
 
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
 
一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理
一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理
一比一原版(PSU毕业证书)美国宾州州立大学毕业证如何办理
 
一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理
一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理
一比一原版(TRU毕业证书)温哥华社区学院毕业证如何办理
 
Free scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirtsFree scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirts
 
Discovering OfficialUSA.com Your Go-To Resource.pdf
Discovering OfficialUSA.com Your Go-To Resource.pdfDiscovering OfficialUSA.com Your Go-To Resource.pdf
Discovering OfficialUSA.com Your Go-To Resource.pdf
 
Dan Quinn Commanders Feather Dad Hat Hoodie
Dan Quinn Commanders Feather Dad Hat HoodieDan Quinn Commanders Feather Dad Hat Hoodie
Dan Quinn Commanders Feather Dad Hat Hoodie
 
Reggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirts
 
The Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdfThe Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdf
 

Radhakrishnan14 sigir

  • 1. Modeling the Evolution of Product Entities Priya Radhakrishnan1, Manish Gupta1,2, Vasudeva Varma1 1Search and Information Extraction Lab, IIIT-Hyderabad, India 2Microsoft, Hyderabad, India ACM SIGIR 2014 July 6-11, 2014 The 37th Annual International ACM SIGIR Conference July 9, 2014 priya.r@research.iiit.ac.in, gmanish@microsoft.com, vv@iiit.ac.in
  • 2. Motivation • Predict the previous version (predecessor) of a product entity • Link various versions of a product in a temporal order – Windows 7.0 > Windows 8.0 • Helps study evolution of product entities – Ubuntu 4.10 > Ubuntu 5.04 > Ubuntu 5.10 > Ubuntu 6.06 – Install CD > USB > LiveCD Finding the predecessor of a product entity is the first step towards modeling the Evolution of Product Entities Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 2
  • 3. A Typical Product Listing on Amazon Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 3
  • 4. • Deceptively easy task • No common naming convention for versions or products – Windows (3.0 > 95 > 98 > 2000 > XP > 7.0 > 8.0) – Ubuntu (Warty > Hoary > Breezy > Dapper > Edgy ) • Product mentions occur in unstructured natural language form – “JVC S-VHS Camcorder”, “Super VHS Camcorder”, “S VHS Camcorder” Challenges Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 4
  • 5. Proposed Approach • Two stage approach – Stage 1: Given a set of product entity names, we parse the product names to identify the brand, product and version indicating words from the product name – Stage 2: Cluster products based on brand and product. Within a cluster, rank the version(s) in temporal order Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 5
  • 6. Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) A Two Stage Approach Stage 1 Leica D-Lux 6 digital camera Input Output D-Lux 6 digital camera Stage 2 Leica D-Lux 6 digital camera Leica D-Lux 4 digital camera Digital camera Leica D-Lux 5 Leica D-Lux 4 5 6 Canon A630 Canon A640 Canon A630 A640 6
  • 7. Stage 1 – Parsing the Product Title (1) • Categorizing words in the product title as brand, product, version and other • Conditional Random Fields (CRF) based approach • CRF tagger is trained on manually labeled (word, label) pairs using 3 features – Product description – Context patterns surrounding the labels – Linguistic patterns frequently associated with the labels • Label words in product titles from the test set • Given any query product version, we identify its (brand, product) • Group together product entities that have the same brand and product as the given query Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 7
  • 8. Stage 1 – Parsing the Product Title (2) • Feature Set – Product description • Description, weight, review, model, category, URL – Context patterns • Position of the word title, alphabet, numeral, parenthesis, previous word – Linguistic patterns • Words of the product title have POS tag (NNP, MD, VB, JJ, NN, CD, NNS, IN, RB, DT, VBP, VBD, CC, VBN, JJS, VBZ, LS, VBG, FW, PRP$, PRP, SYM) Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 8
  • 9. Stage 2 – Predicting Predecessor Version Members of a set of product entities with the same brand name and product name are candidates for being Predecessor version of query entity’s version • Classification based approach • Binary features on Ordering – Lexical : Does the candidate lexically precede the given query product version? – Review-Date : Is the candidate older than the given query product version based on review date? – Mentions : Was the candidate mentioned in the query product versions description or reviews? • Nokia Lumia 1020 precedes Nokia Lumia 1320 by Lexical and Review-date features Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 9
  • 10. Dataset • Crawled 462K product description pages from www.amazon.com • Labeled 500 from “camera and photo” category • 40 out of the 500 product titles had predecessor version Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 10
  • 11. Results – Stage 1 Parsing the product title CRF Accuracy for the Product Title Parsing Task Label P R F Brand name 0.98 0.65 0.77 Product name 0.89 0.58 0.69 Version 0.69 0.48 0.55 Product name/Version 0.88 0.55 0.67 Others 0.84 0.98 0.91 Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 11
  • 12. Results – Stage 2 Predicting the Predecessor Version Classifier Accuracy for the Positive Class for Product Predecessor Version Prediction Feature TP FP P R F Lexical + Review- Date 0.632 0.049 0.533 0.632 0.578 All Features 0.579 0.049 0.512 0.579 0.543 Review-Date 0.579 0.061 0.458 0.579 0.512 Review-Date + Mentions 0.553 0.047 0.512 0.553 0.532 Lexical + Mentions 0.5 0.049 0.475 0.5 0.487 Lexical 0.5 0.056 0.442 0.5 0.469 Mentions 0.45 0.049 0.462 0.45 0.456 Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 12
  • 13. Related Work • Extracting attribute values for product entities 1. Mauge, Karin et al. Structuring E-Commerce Inventory. ACL ’12 2. Putthividhya, Duangmanee et al. Bootstrapped Named Entity Recognition for Product Attribute Extraction. EMNLP ’11 • Identifying related entities 3. N. Bach and S. Badaskar. A Survey on Relation Extraction LTI CMU 2007 4. L. Fang, A. D. Sarma, C. Yu, and P. Bohannon. REX: Explaining Relationships Between Entity Pairs. PVLDB, 5(3):241252, Nov 2011 None of the above works focus on extracting the version information from the product listing titles, which we have accomplished in the proposed work Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 13
  • 14. Conclusion • A two-stage approach to find the predecessor of a product entity • First stage achieved a precision of ~88% • Second stage achieved a precision of ~53% • Applications – Product search engine ranking – Comparing product versions Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 14
  • 15. Future Directions We plan to enhance this work as follows • For building product version trees • Studying how features of product entities evolve Code • https://github.com/priyaradhakrishnan0/EntityRanking Golden dataset • IIIT-H Internal Link – http://10.2.4.116/backup/sieldata/datasets/EntityRanking/data.zip • External Link – http://tinyurl.com/lclapy8 Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 15
  • 16. Thank you for your time and attention Modeling the Evolution of Product Entities (priya.r@research.iiit.ac.in) 16