The Geometry of Learning

The Geometry of Learning November 17th, 2009, Utrecht, The Netherlands Fridolin WildKMi, The Open University

(createdwith http://www.wordle.net)

Outline Context & Framing Theories Latent Semantic Analysis (LSA) Social Network Analysis (SNA) Meaningful Interaction Analysis (MIA) Conclusion & Outlook

Information (96dpi) Information could be the quality of a certain signal. Information could be a logical abstractor, the release mechanism. Information & Knowledge Knowledge could be the delta at the receiver (a paper, a human, a library).

Learning is change Learning is about competence development Competence becomes visible in performance Professional competence is mainly about (re-)constructing and processing information and knowledge from cues Professional competence development is much about learning concepts from language Professional performance is much about demonstrating conceptual knowledge with language Language! What is learning about?

Tying shoelaces Douglas Adams’ ‘meaning of liff’: Epping: The futile movements of forefingers and eyebrows used when failing to attract the attention of waiters and barmen. Shoeburyness: The vague uncomfortable feeling you get when sitting on a seat which is still warm from somebody else's bottom I have been convincingly Sapir-Whorfed by this book. Non-textual concepts things we can’t (easily) learn from language

Word Choice Educated adult understands ~100,000 word forms An average sentence contains 20 tokens. Thus 100,00020 possible combinationsof words in a sentence  maximum of log2 100,00020= 332 bits in word choice alone. 20! = 2.4 x 1018 possible orders of 20 words = maximum of 61 bits from order of the words. 332/(61+ 332) = 84%word choice (Landauer, 2007)

Latent Semantic Analysis “Humans learn word meanings and how to combine them into passage meaning through experience with ~paragraph unitized verbal environments.” “They don’t remember all the separate words of a passage; they remember its overall gist or meaning.” “LSA learns by ‘reading’ ~paragraph unitized texts that represent the environment.” “It doesn’t remember all the separate words of a text it; it remembers its overall gist or meaning.” (Landauer, 2007)

Latent Semantics latent-semantic space In other words:Assumption: language utterances have a semantic structure Problem: structure is obscured by word usage(noise, synonymy, polysemy, …) Solution: map doc-term matrix using conceptual indices derived statistically (truncated SVD) and make similarity comparisons using angles

Input (e.g., documents) term = feature vocabulary = ordered set of features Only the red terms appear in more than one document, so strip the rest. TEXTMATRIX { M } = Deerwester, Dumais, Furnas, Landauer, and Harshman (1990): Indexing by Latent Semantic Analysis, In: Journal of the American Society for Information Science, 41(6):391-407

Singular Value Decomposition =

Truncated SVD latent-semantic space … we will get a different matrix (different values, but still of the same format as M).

Reconstructed, Reduced Matrix m4: Graphminors: A survey

Similarity in a Latent-Semantic Space Query Y dimension Target 1 Angle 1 Angle 2 Target 2 X dimension (Landauer, 2007)

doc2doc - similarities Unreduced = pure vector space model - Based on M = TSD’ - Pearson Correlation over document vectors reduced - based on M2 = TS2D’ - Pearson Correlation over document vectors

Typical, simple workflow tm = textmatrix(‘dir/‘) tm = lw_logtf(tm) * gw_idf(tm) space = lsa(tm, dims=dimcalc_share()) tm3 = fold_in(tm, space) as.textmatrix(tm)

Processing Pipeline (with Options) 4 x 12 x 7 x 2 x 3 = 2016 Combinations

b) SVD is computationally expensive From seconds (lower hundreds of documents, optimised linear algebra libraries, truncated SVD) To minutes (hundreds to thousands of documents) To hours (tens and hundreds of thousands) a) SVD factor stability SVD calculates factorsover a given text base; different texts – different factors Problem: avoid unwanted factor changes Solution: folding-in of instead of recalculating Projecting by Folding-In

2 1 vT Folding-In in Detail (cf. Berry et al., 1995) Mk (2) convert „Dk“-format vector to „Mk“-format Tk Sk Dk (1) convert Original Vector to „Dk“-format

The Value of Singular Values Pearson(jahr, wien) Pearson(eu, österreich)

Summary Writing: Working Principle (Landauer, 2007)

Summary Writing Gold Standard 1 Gold Standard 2 Y dimension Gold Standard 3 Essay 1 Essay 2 X dimension

‘Dumb’ Summary Writing (Code) library( "lsa“ )# load package # load training texts trm = textmatrix( "trainingtexts/“ ) trm = lw_bintf( trm ) * gw_idf( trm )# weighting space = lsa( trm )# create an LSA space # fold-in summaries to be tested (including gold standard text) tem = textmatrix( "testessays/", vocabulary=rownames(trm) ) tem_red = fold_in( tem, space ) # score a summary by comparing with # gold standard text (very simple method!) cor( tem_red[,"goldstandard.txt"], tem_red[,"E1.txt"] ) => 0.7

Evaluating Effectiveness Compare Machine Scores with Human Scores Human-to-Human Correlation Usually around .6 Increased by familiarity between assessors, tighter assessment schemes, … Scores vary even stronger with decreasing subject familiarity (.8 at high familiarity, worst test -.07) ,[object Object]

Training Collection: 3 ‘golden essays’, plus 302 documents from a marketing glossary, average length: 56.1 words,[object Object]

Social Network Analysis Existing for a long time (term coined 1954) Basic idea: Actors and Relationships between them (e.g. Interactions) Actors can be people (groups, media, tags, …) Actors and Ties form a Graph (edges and nodes) Within that graph, certain structures can be investigated Betweenness, Degree of Centrality, Density, Cohesion Structural Patterns can be identified (e.g. the Troll)

Incidence Matrix msg_id = incident, authorsappear in incidents

DeriveAdjacency Matrix = t(im) %*% im

Measuring Techniques (Sample) Closenesshow close to all others Degree Centralitynumber of (in/out) connections to others Betweennesshow often intermediary Componentse.g. kmeans cluster (k=3)

Co-Authorship Network WI (2005)

Paper Collaboration Prolearn e.g. co-authorships of ~30 deliverables of three work packages (ProLearn NoE) Roles: reviewer (red), editor (green), contributor Size: Prestige() But: type of interaction? Content of interaction? => not possible!

TEL Project Cooperation (2004-2007)

iCamp Collaboration (Y1) Shades of yellow: WP leadership Red: coordinator

Meaningful Interaction Analysis (MIA) Fusion: Combining LSA with SNA Terms and Documents (or anything else represented with column vectors or row vectors) are mapped into same space by LSA Semantic proximity can be measured between them: how close is a term to a document? (S)NA allows to analyse these resulting graph structures By e.g. cluster or component analysis By e.g. identifying central descriptors for these

The mathemagics behind Meaning Interaction Analysis

Knowledge Proxy: LSA Part Tk= left-hand sided matrix = ‚term loadings‘ on the singular value Dk= right-hand sided matrix = ‚document loadings‘ on the singular value Multiply them into same space VT = TkSk VD = DkTSk Cosine Distance Matrix over ... = a graph Extension: add author vectors VAthrough cluster centroids or vector addition of their publication vectors latent-semantic space Ofcourse:useexistingspaceandfold inthewholesetsofvectors

Knowledge Proxy: SNA Part:Filter the Network Every vectorhas a cosinedistancetoeveryother (maybe negative)! So: filter forthedesiredsimilaritystrength

ConSpectmonitoring conceptual development

Spot unwanted fragmentation e.g. two authors work on the same topic, but with different collaborator groups and with different literature Intervention Instrument: automatically recommend to hold a flashmeeting Bringing together what belongs together Wild, Ochoa, Heinze, Crespo, Quick (2009, to appear)

The Geometry of Learning

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to The Geometry of Learning

Similar to The Geometry of Learning (20)

More from fridolin.wild

More from fridolin.wild (20)

Recently uploaded

Recently uploaded (20)

The Geometry of Learning