Vector spaces for information extraction - Random Projection Example

A very short talk given at UCD latent semantic workshop

© All Rights Reserved

Vector spaces for information extraction - Random Projection Example Presentation Transcript

  • 1. Vector Spaces for Information Extraction
    Behrang Q. Zadeh, behrangatoffice@gmail.com
    Knowledge Discovery Unit @ Insight Centre @ National University of Ireland, Galway
    Insight Workshop on Latent Space Methods, Dublin, UCD, 2014
  • 2. Vector Spaces in Information Extraction
    [Figure labels: entities to be extracted or compared; contexts that are used for comparison]
    Vector spaces in IE are:
      • a representation framework for the Distributional Hypothesis*;
      • sparse;
      • large (on the order of millions by millions);
      • changing dynamically.
    * not exclusively
  • 3. Vector Spaces in Information Extraction
    [Figure labels: entities to be extracted or compared; contexts that are used for comparison]
      • In classic methods, the dimension of the VSM grows as the data grows.
      • Dimension reduction techniques based on matrix factorization may not be applicable:
          • iterative methods still have a complexity of O(n²).
  • 4. Vector Spaces in Information Extraction
      • Random projection is one solution:
          • estimate a VSM using a random projection matrix that is made of a set of randomly created vectors;
          • i.e. based on the Johnson-Lindenstrauss lemma;
          • verified by the results reported in (Hecht-Nielsen, 1994).
    * The above figure is copyrighted by Alex Clemmer (http://nullspace.io/)
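
As a rough illustration of the random projection idea on slide 4, the sketch below multiplies a few sparse toy vectors by a matrix of randomly created vectors and checks how well pairwise distances survive. The Gaussian entries, the toy sizes (`d`, `k`, `n`) and all function names are assumptions made for this example; the slides do not say how the random vectors were generated. Pure standard-library Python is used only to keep the sketch self-contained; a numerical library would be the natural choice at realistic sizes.

```python
import math
import random

def make_projection_matrix(d, k, seed=0):
    """Return a d x k matrix with i.i.d. N(0, 1/k) entries (assumed distribution)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)  # keeps squared lengths unchanged in expectation
    return [[rng.gauss(0.0, scale) for _ in range(k)] for _ in range(d)]

def project(x, R):
    """Project a length-d vector x down to k dimensions: y = x R."""
    k = len(R[0])
    y = [0.0] * k
    for xi, row in zip(x, R):
        if xi != 0.0:  # skip zeros, since context vectors are sparse
            for j in range(k):
                y[j] += xi * row[j]
    return y

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Toy sizes, chosen only so the example runs quickly.
d, k, n = 1000, 300, 8
data_rng = random.Random(1)
# Sparse toy "entity" vectors: roughly 5% of the d contexts are nonzero.
X = [[data_rng.random() if data_rng.random() < 0.05 else 0.0 for _ in range(d)]
     for _ in range(n)]

R = make_projection_matrix(d, k)
Y = [project(x, R) for x in X]

# Ratio of projected to original distance for every pair of vectors.
ratios = [distance(Y[i], Y[j]) / distance(X[i], X[j])
          for i in range(n) for j in range(i + 1, n)]
print(min(ratios), max(ratios))
```

The printed ratios should stay close to 1, which is the behaviour the Johnson-Lindenstrauss lemma describes: distances are approximately preserved even though the dimension drops from `d` to `k`.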
  • 5. Vector Spaces in Information Extraction
      • Random projection: application example
          • Extraction of technology terms (term classification)
          • Data size: only 10,000 publications
          • Contexts: words and their positions in the neighbourhood of terms
          • Original dimension: approximately 5 million
          • Reducing the dimension to 2000 using random projection
    Behrang's research revolves around classification and finding the optimal contexts in random vector spaces for the extraction of technology terms and their relations. If you are interested, please email him at behrangatoffice@gmail.com
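
The setup on slide 5 (positional word contexts, reduced to 2000 dimensions) could be sketched as follows. This is not the author's implementation: it swaps in a random-indexing style of random projection, where each context gets a sparse random vector of +1/-1 entries that is added to the vector of every term it occurs with. Only the dimension 2000 and the "word plus position" contexts come from the slide; `NONZERO`, `WINDOW`, the checksum-based seeding, the single-token terms and all function names are assumptions for illustration.

```python
import math
import random
import zlib
from collections import defaultdict

DIM = 2000    # reduced dimension quoted on the slide
NONZERO = 8   # assumed number of +1/-1 entries per context vector
WINDOW = 2    # assumed neighbourhood size on each side of a term

def index_vector(context):
    """Sparse random vector for a context, stored as {position: +1 or -1}.

    Seeding from a checksum of the context means the same context always
    gets the same vector, so no context-by-dimension matrix is stored.
    """
    rng = random.Random(zlib.crc32(repr(context).encode("utf-8")))
    positions = rng.sample(range(DIM), NONZERO)
    return {p: rng.choice((-1, 1)) for p in positions}

def positional_contexts(tokens, i):
    """Yield (neighbouring word, relative offset) pairs around tokens[i]."""
    for off in range(-WINDOW, WINDOW + 1):
        j = i + off
        if off != 0 and 0 <= j < len(tokens):
            yield (tokens[j], off)

def build_term_vectors(sentences, terms):
    """Accumulate the index vectors of each term's contexts into its vector."""
    vectors = defaultdict(lambda: defaultdict(int))
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in terms:
                for ctx in positional_contexts(tokens, i):
                    for p, sign in index_vector(ctx).items():
                        vectors[tok][p] += sign
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(val * v.get(p, 0) for p, val in u.items())
    nu = math.sqrt(sum(val * val for val in u.values()))
    nv = math.sqrt(sum(val * val for val in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Two made-up sentences in which the candidate terms share most contexts.
sentences = [
    "we train a neural network on the corpus".split(),
    "we train a decision tree on the corpus".split(),
]
vecs = build_term_vectors(sentences, terms={"network", "tree"})
print(cosine(vecs["network"], vecs["tree"]))
```

Because term vectors are updated one occurrence at a time, new publications can be added without recomputing anything, and the vector length stays at `DIM` no matter how many distinct contexts appear. The resulting vectors could then be fed to a classifier for the term classification task the slide mentions.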