Products Classification  Joyce Chan
  1. Products Classification, Joyce Chan
  2. Preliminary Knowledge
<ul>
  <li>the words labeling, tagging, classification, and categorization are used interchangeably</li>
  <li>the words taxonomy and hierarchy are used interchangeably</li>
  <li>facet: one of the paths of the hierarchy</li>
  <li>static taxonomy: products are mapped to the hierarchy manually (editorially)</li>
  <li>dynamic taxonomy: the system generates the product-to-hierarchy mapping automatically, with no help from people</li>
  <li>document: a term commonly used when talking about search
    <ul>
      <li>here it refers to a product plus all the metadata associated with that product</li>
      <li>e.g. for the document Beatrice homo milk, the metadata or attributes can be:
        <ul>
          <li>its title is Beatrice homo milk</li>
          <li>it is a type of recipe ingredient</li>
          <li>its description says it is tasty, rich and creamy</li>
          <li>it is made by the Beatrice company</li>
          <li>its price is $3.60 per bag, etc.</li>
          <li>it is on sale at the Oakville Loblaws location</li>
          <li>users highly recommend buying this milk</li>
          <li>its image name is b-milk.jpg</li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
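The "document" idea above can be sketched as a small record type. This is only an illustration: the class name `ProductDocument` and its field names are assumptions chosen for the example, not an existing schema; the values come from the Beatrice homo milk example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProductDocument:
    """A product plus its associated metadata (field names are hypothetical)."""
    title: str
    kind: str
    description: str
    manufacturer: str
    price: float
    unit: str
    on_sale_at: List[str] = field(default_factory=list)
    image_name: str = ""


# The example document from the slide, expressed as one record.
milk = ProductDocument(
    title="Beatrice homo milk",
    kind="recipe ingredient",
    description="tasty, rich and creamy",
    manufacturer="Beatrice",
    price=3.60,
    unit="bag",
    on_sale_at=["Loblaws Oakville"],
    image_name="b-milk.jpg",
)
```

Every classification strategy on the following slides can then be seen as a different way of attaching labels or paths to records like this one.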
  3. Classification: Different strategies
<ul>
  <li>No classification
    <ul>
      <li>products are not tagged to anything</li>
    </ul>
  </li>
  <li>Single-level classification
    <ul>
      <li>single-level static taxonomy</li>
    </ul>
  </li>
  <li>Multi-level classification
    <ul>
      <li>one-dimensional static taxonomy applying tree breakdown</li>
      <li>hybrid of one-dimensional and single-level static taxonomy w/ Jeremy's tree breakdown</li>
      <li>multi-dimensional static taxonomy applying Jeremy's tree breakdown</li>
      <li>dynamic taxonomy w/ supervised extraction of facets from annotated text documents</li>
      <li>dynamic taxonomy w/ unsupervised extraction of facets</li>
      <li>static taxonomy w/ data providers' labels and unsupervised extraction of facets</li>
    </ul>
  </li>
</ul>
  4. No tagging / No Classification
<ul>
  <li>all products are retrieved directly through SQL or search engine queries</li>
  <li>we assume users can find relevant information quickly with no further assistance</li>
  <li>Pros
    <ul>
      <li>simple to implement; this is already done, since we have a product database</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>with a large product database, the results are confusing to users</li>
      <li>e.g. a user searches for milk and many types of results are returned; they may have to flip through a few pages before finding what they need</li>
    </ul>
  </li>
</ul>
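A minimal sketch of the problem described above, assuming a plain substring search over product titles. The product list is made-up sample data and `search` is a hypothetical helper, not our actual query code.

```python
# Made-up sample catalogue; a real product database would be far larger.
products = [
    "Beatrice homo milk",
    "chocolate milk",
    "milk chocolate bar",
    "coconut milk",
    "almond milk",
    "orange juice",
]


def search(query, catalogue):
    """Return every product whose title contains the query (case-insensitive)."""
    q = query.lower()
    return [p for p in catalogue if q in p.lower()]


# A single keyword matches very different kinds of product,
# and nothing groups or narrows the result list for the user.
results = search("milk", products)
```

Here the query matches drinks, a plant-based substitute and a chocolate bar alike, which is the kind of mixed result list a taxonomy is meant to help users narrow down.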
  5. The case for having a products taxonomy
<ul>
  <li>Pros
    <ul>
      <li>helps people find and explore what they are looking for on the website and the concierge device if they cannot quickly find it through direct searching</li>
      <li>users have become used to e-commerce interfaces with a product taxonomy</li>
    </ul>
  </li>
</ul>
  6. The case for static/editorially classified taxonomy
<ul>
  <li>Pros
    <ul>
      <li>maps closely to product shelving, much like the Dewey classification system in a library</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>maintaining the classification structure takes a lot of manual labor, since we have thousands and thousands of products and hope to expand our product database in the future</li>
    </ul>
  </li>
</ul>
  7. Single Level Static Taxonomy - only labeling / tagging
<ul>
  <li>each product has one label</li>
  <li>e.g. Beatrice brand homo milk &lt;= 'dairy'</li>
  <li>Pros
    <ul>
      <li>the labels are already provided by Gladson, so it is very straightforward to implement</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>not very descriptive, so not very useful to users (customers, managers, inventory staff, or us)</li>
    </ul>
  </li>
</ul>
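As a rough illustration of single-level labeling, each product maps to exactly one label and browsing is just grouping by that label. The sample products other than Beatrice homo milk, and the helper `group_by_label`, are assumptions made up for this sketch.

```python
from collections import defaultdict

# One label per product (sample data; only the first entry is from the slide).
labels = {
    "Beatrice homo milk": "dairy",
    "cheddar cheese": "dairy",
    "sourdough loaf": "bakery",
}


def group_by_label(product_labels):
    """Invert the product -> label mapping into label -> list of products."""
    groups = defaultdict(list)
    for product, label in product_labels.items():
        groups[label].append(product)
    return dict(groups)


shelves = group_by_label(labels)
```

The single label cannot distinguish milk from cheese, which is why this scheme is not very descriptive.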
  8. One dimensional (one path) static Taxonomy with fixed levels of classifications
<ul>
  <li>here there is a single path from the root (department) down to the product</li>
  <li>for instance, Beatrice homo milk is classified as dairy, milk, homo, upc=1234567890</li>
  <li>Pros
    <ul>
      <li>easy to implement</li>
      <li>everything is classified under a standard number of levels of concepts</li>
      <li>improves searching quite a bit</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>does not allow a product to be classified in multiple 'classes'</li>
      <li>labour intensive to editorially maintain the product classifications</li>
    </ul>
  </li>
</ul>
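The one-path, fixed-level scheme above can be sketched with nested dictionaries. The level names (department, category, subcategory) and both helper functions are assumptions for illustration; the slide only gives the example path dairy, milk, homo, upc=1234567890.

```python
def add_product(tree, department, category, subcategory, upc):
    """Place a product at the end of a fixed three-level path."""
    (tree.setdefault(department, {})
         .setdefault(category, {})
         .setdefault(subcategory, [])
         .append(upc))


def find_path(tree, upc):
    """Return the first (department, category, subcategory) path holding upc, or None."""
    for department, categories in tree.items():
        for category, subcategories in categories.items():
            for subcategory, upcs in subcategories.items():
                if upc in upcs:
                    return (department, category, subcategory)
    return None


taxonomy = {}
add_product(taxonomy, "dairy", "milk", "homo", "1234567890")
path = find_path(taxonomy, "1234567890")
```

Because `find_path` is written to return one path per product, the sketch also reflects the main limitation of this scheme: a product cannot usefully sit in several classes at once.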
  9. The case for dynamic taxonomy
<ul>
  <li>Pros
    <ul>
      <li>cheap, because the computer places each product on the taxonomy by itself every time we add a new product to the database</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>we're probably going to be applying a fairly complex taxonomy scheme, such as Amazon's</li>
      <li>some possible implementation challenges, such as the correct use of machine learning libraries</li>
    </ul>
  </li>
</ul>
  10. Dynamic taxonomy predefined w/ a fixed product db and supervised facet extraction from collections of annotated text items
<ul>
  <li>we would have a predefined taxonomy, with some data already mapped under it</li>
  <li>when a new item appears that has not been mapped to the base taxonomy, machine learning algorithms are used to put it in the correct place</li>
  <li>Pros
    <ul>
      <li>completely automated classification</li>
      <li>the predefined taxonomy can be modeled on Amazon's, because it is the most feature-complete multi-level grocery taxonomy that I found (Tesco's is another good one, but it is not available right now)</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>new types of facets cannot be discovered, because we are using the predefined taxonomy</li>
    </ul>
  </li>
</ul>
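A minimal sketch of placing a new item into a predefined taxonomy from items that are already mapped. It uses simple token-overlap scoring as a stand-in for a real machine learning classifier; the training titles (other than Beatrice homo milk), the paths, and the functions `train` and `classify` are all assumptions for illustration.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(mapped_items):
    """Build per-path token counts from (title, taxonomy_path) pairs."""
    model = {}
    for title, path in mapped_items:
        model.setdefault(path, Counter()).update(tokenize(title))
    return model


def classify(model, title):
    """Pick the path whose known titles share the largest fraction of tokens."""
    tokens = tokenize(title)

    def score(path):
        counts = model[path]
        return sum(counts[t] for t in tokens) / sum(counts.values())

    return max(model, key=score)


# Items already mapped under the predefined taxonomy (sample data).
mapped = [
    ("Beatrice homo milk", ("dairy", "milk")),
    ("skim milk 2L", ("dairy", "milk")),
    ("old cheddar cheese", ("dairy", "cheese")),
    ("whole wheat bread", ("bakery", "bread")),
]
model = train(mapped)
new_item_path = classify(model, "organic homo milk")
```

Note that `classify` can only ever return a path that already exists in the model, which mirrors the con listed above: no new facets are discovered.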
  11. Dynamic Taxonomy and unsupervised facet extraction for collections of text documents
<ul>
  <li>there are no prior facets to begin with; the algorithm builds the taxonomy all by itself</li>
  <li>usually used on things like unclassified articles, etc.</li>
  <li>Algorithm
    <ul>
      <li>for each item in the products collection, identify which terms are important</li>
      <li>for each important term, query one or more external resources and get the contextual terms that appear in the results; add the retrieved terms to the original document as part of its metadata, so that it becomes a context-aware document</li>
      <li>analyze the frequency of the terms in both the original collection and the expanded collection to identify the candidate facet terms</li>
    </ul>
  </li>
  <li>Pros
    <ul>
      <li>new facet keywords can be created and automatically inserted into the taxonomy with no human intervention</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>each step in the above algorithm needs an ML algorithm</li>
      <li>hard (for our company) to evaluate recall and precision given our small and non-standardized set of data</li>
    </ul>
  </li>
</ul>
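The three steps of the algorithm above can be sketched as follows. This is a heavily simplified illustration: stop-word filtering stands in for "identify important terms", a hard-coded dictionary (`EXTERNAL_CONTEXT`) stands in for the external resource, and a document-frequency gain threshold stands in for the frequency analysis. All names and sample data are assumptions, and none of the steps uses a real ML algorithm here.

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and"}

# Hypothetical stand-in for querying an external resource for contextual terms.
EXTERNAL_CONTEXT = {
    "milk": ["dairy", "beverage"],
    "cheddar": ["dairy", "cheese"],
    "juice": ["beverage", "fruit"],
}


def important_terms(doc):
    """Step 1 (simplified): every non-stop-word token counts as important."""
    return [t for t in doc.lower().split() if t not in STOPWORDS]


def expand(doc):
    """Step 2: append contextual terms so the document becomes context-aware."""
    terms = important_terms(doc)
    context = [c for t in terms for c in EXTERNAL_CONTEXT.get(t, [])]
    return terms + context


def document_frequency(docs):
    """Count in how many documents each term appears."""
    df = Counter()
    for terms in docs:
        df.update(set(terms))
    return df


def candidate_facets(collection, min_gain=2):
    """Step 3: keep terms that became much more frequent after expansion."""
    df_original = document_frequency([important_terms(d) for d in collection])
    df_expanded = document_frequency([expand(d) for d in collection])
    return sorted(t for t in df_expanded
                  if df_expanded[t] - df_original[t] >= min_gain)


collection = ["Beatrice homo milk", "old cheddar", "orange juice", "chocolate milk"]
facets = candidate_facets(collection)
```

In this toy collection the terms "dairy" and "beverage" never appear in the original titles but become frequent after expansion, so they surface as candidate facets without any predefined taxonomy.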
  12. Hybrid: i) Dynamic taxonomy w/ a fixed hierarchy and supervised facet extraction + ii) social tagging (aka folksonomy)
<ul>
  <li>unsupervised learning is not suitable for our dataset, so I propose a hybrid scheme to enable taxonomy creation</li>
  <li>we can use our dynamic taxonomy scheme and also allow users to create new facet keywords, but perhaps only a moderator can add a new keyword to the taxonomy</li>
  <li>the rest of the tags float freely outside of the taxonomy</li>
  <li>e.g. http://www.amazon.com/gp/product/tags-on-product/B001EO5XTO/ref=tag_dpp_cust_edpp_sa
    <ul>
      <li>Amazon has allowed its customers to create their own product tags that are helpful for their own purposes</li>
    </ul>
  </li>
  <li>possibly even merge our tags with Facebook
    <ul>
      <li>http://techcrunch.com/2010/07/27/amazon-now-taps-into-facebook-for-social-product-recommendations/</li>
    </ul>
  </li>
  <li>Pros: possibly more useful to shoppers, since their own tags help them remember their own items</li>
  <li>Cons: we would have to get comfortable with having a plethora of tags that are not necessarily related to each other</li>
</ul>
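One way the moderated social tagging above might be modeled is sketched below. The class `TagStore`, its method names and the sample tags are assumptions for illustration only, not an existing design.

```python
class TagStore:
    """User tags float freely until a moderator promotes them into the taxonomy."""

    def __init__(self, taxonomy_facets):
        self.taxonomy_facets = set(taxonomy_facets)
        self.user_tags = {}  # product -> set of tags added by users

    def add_user_tag(self, product, tag):
        """Any user may attach a tag to a product."""
        self.user_tags.setdefault(product, set()).add(tag.lower())

    def promote(self, tag, is_moderator):
        """Only a moderator may turn a tag into a taxonomy facet keyword."""
        if not is_moderator:
            raise PermissionError("only a moderator can add a tag to the taxonomy")
        self.taxonomy_facets.add(tag.lower())

    def floating_tags(self, product):
        """Tags on a product that are still outside the taxonomy."""
        return self.user_tags.get(product, set()) - self.taxonomy_facets


store = TagStore({"dairy"})
store.add_user_tag("Beatrice homo milk", "lactose-free")
store.add_user_tag("Beatrice homo milk", "kids lunch")
store.promote("lactose-free", is_moderator=True)
remaining = store.floating_tags("Beatrice homo milk")
```

Keeping promoted facets and free-floating tags in separate sets makes the trade-off explicit: the taxonomy stays curated while the unrelated user tags are tolerated alongside it.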
  13. Hybrid: 4 level (Jeremy's) taxonomy creation, w/ Gladson or GS1 labels and unsupervised facet extraction
<ul>
  <li>Pros
    <ul>
      <li>sounds the closest to what we're trying to accomplish</li>
      <li>possible extensions with social tagging as well</li>
      <li>works fairly well w/ shelving</li>
    </ul>
  </li>
  <li>Cons
    <ul>
      <li>not as richly descriptive, because the taxonomy has fewer levels</li>
      <li>since the taxonomy is confined to a certain number of levels, I don't really know how to implement this right now (I can research)</li>
    </ul>
  </li>
</ul>
