Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Interactive Visual Da...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Outline
1 Introduce a...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
What is LVS?
Language...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Background
LVS is bui...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Background
http://lit...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Workspace
Browser
Chr...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Server
Since LVS is h...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Data Preparation
Impo...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Terminology Review
a....
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Terminology Review
a....
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Workshop Files
http:/...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Movie Data
Budget
Dir...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Language Variation Su...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Language Variation Su...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Upload File
Upload mo...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Uploaded Dataset
The ...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Summary
Summary provi...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Data Structure
1 Tota...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Cross Tabulation
Cros...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Cross Tabulation Plot...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Cross Tabulation Plot...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Adjusting Browser - L...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Adjusting Browser - L...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Language Variation Su...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Visual Analytics
Visu...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
One Variable Plot
23 ...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
One Variable Plot
23 ...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Customizing Plot
24 /...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Customizing Plot
24 /...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Saving Plot
25 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Cluster Plot
Classific...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Cluster Plot
27 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Cluster Plot
Group 1 ...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Inferential Statistic...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Language Variation Su...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
How to Create a Regre...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Modeling
32 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Regression Types
Mode...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Regression
34 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Model Output
35 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Model Output
35 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Interpretation: Budge...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Interpretation: Budge...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Conditional Tree
Cond...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Conditional Tree
38 /...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Conditional Tree
38 /...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Conditional Tree
1 Ge...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Random Forest
1 Varia...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Random Forest
Let’s a...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Random Forest
42 / 44
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Random Forest
Genre i...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
Let’s Have a Short Br...
Introduction
Data
Preparation
LVS
Working with
Data
Visual
Analytics
Inferential
Analysis
References
References I
[1] Baay...
Upcoming SlideShare
Loading in …5
×

Workshop on Quantitative Analytics Using Interactive On-line Tool

170 views

Published on

Introduction to a user-friendly on-line interactive statistical tool built using Shiny and R (Part I)

Published in: Data & Analytics
  • Be the first to like this

Workshop on Quantitative Analytics Using Interactive On-line Tool

  1. 1. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Interactive Visual Data Analysis Part One Language Variation Suite Olga Scrivner Indiana University Workshop in Methods 1 / 44
  2. 2. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Outline 1 Introduce a web application for quantitative analysis: LVS 2 Develop practical skills 3 Understand and interpret advanced statistical models PositionSentence p < 0.001 1 ind, pre post Heaviness p = 0.003 2 ≤ 1 > 1 Period p < 0.001 3 ≤ 1 > 1 Node 4 (n = 81) VOOV 0 0.2 0.4 0.6 0.8 1 Node 5 (n = 119) VOOV 0 0.2 0.4 0.6 0.8 1 Node 6 (n = 181) VOOV 0 0.2 0.4 0.6 0.8 1 Period p < 0.001 7 ≤ 2 > 2 Node 8 (n = 221) VOOV 0 0.2 0.4 0.6 0.8 1 Focus p < 0.001 9 cf nf Node 10 (n = 66) VOOV 0 0.2 0.4 0.6 0.8 1 Main_Verb_Structure p < 0.001 11 ACIOther, Restructuring Node 12 (n = 43) VOOV 0 0.2 0.4 0.6 0.8 1 Node 13 (n = 265) VOOV 0 0.2 0.4 0.6 0.8 1 2 / 44
  3. 3. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References What is LVS? Language Variation Suite It is a Shiny web application originally designed for data analysis in sociolinguistic research. It can be used for: Processing spreadsheet data Reporting in tables and graphs Analyzing means, regression, conditional trees ... (and much more) 3 / 44
  4. 4. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Background LVS is built in R using Shiny package: 1 R - a free programming language for statistical computing and graphics 2 Shiny App - a web application framework for R Computational power of R + Web interactivity 4 / 44
  5. 5. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Background http://littleactuary.github.io/blog/Web-application-framework-with-Shiny/ 5 / 44
  6. 6. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Workspace Browser Chrome, Firefox, Safari - recommendable Explorer may cause instability issues Accessibility PC, Mac, Linux Data files will be uploaded from any location on your computer Smart Phone Data files must be on a cloud platform connected to your phone account (e.g. dropbox) 6 / 44
  7. 7. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Server Since LVS is hosted on a server, Shiny idle time-out settings may stop application when it is left inactive (it will grey out). Solution: Click reload and re-upload your csv file 7 / 44
  8. 8. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Data Preparation Important things to consider before data entry: File format: Comma separated value (CSV) - faster processing Excel format will slow processing Column names should not contain spaces Permitted: non-accented characters, numbers, underscore, hyphen, and period One column must contain your dependent variable The rest of the columns contain independent variables 8 / 44
  9. 9. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Terminology Review a. Categorical - non-numerical data with two values yes - no; male - female b. Continuous - numerical data duration, age, year c. Multinomial - non-numerical data with three or more values regions, nationalities d. Ordinal - scale: currently not supported 9 / 44
  10. 10. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Terminology Review a. Categorical - non-numerical data with two values yes - no; male - female b. Continuous - numerical data duration, age, year c. Multinomial - non-numerical data with three or more values regions, nationalities d. Ordinal - scale: currently not supported 9 / 44
  11. 11. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Workshop Files http://ssrc.indiana.edu/seminars/wim.shtml 1 movie metadata.csv Simplified set from https://www.kaggle.com/deepmatrix/ imdb-5000-movie-dataset 2 LVS web site: https://languagevariationsuite.com 10 / 44
  12. 12. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Movie Data Budget Director Actor 1 Director facebook likes Actor 1 facebook likes Genre Year 11 / 44
  13. 13. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Language Variation Suite - Structure 1 Data Upload file, data summary, adjust data, cross tabulation 2 Visual Analysis Plotting, cluster classification 3 Inferential Statistics Modeling, regression, conditional trees, random forest 12 / 44
  14. 14. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Language Variation Suite - Structure 1 Data Upload file, data summary, adjust data, cross tabulation 2 Visual Analysis Plotting, cluster classification 3 Inferential Statistics Modeling, regression, conditional trees, random forest 12 / 44
  15. 15. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Upload File Upload movie metadata.csv 13 / 44
  16. 16. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Uploaded Dataset The data content is imported as a table and allows for sorting columns. 14 / 44
  17. 17. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Summary Summary provides a quantitative summary for each variable, e.g. frequency count, mean, median. 15 / 44
  18. 18. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Data Structure 1 Total number of observations (rows) 2 Number of variables (columns) 3 Variable types Factor - categorical values Num - numeric values (0.95, 1.05) Int - integer values (1, 2, 3) 16 / 44
  19. 19. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Cross Tabulation Cross-tabulation examines the relationship between variables. 17 / 44
  20. 20. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Cross Tabulation Plot 18 / 44
  21. 21. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Cross Tabulation Plot 18 / 44
  22. 22. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Adjusting Browser - Layout Shiny pages are fluid and reactive. 19 / 44
  23. 23. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Adjusting Browser - Layout 20 / 44
  24. 24. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Language Variation Suite - Structure 1 Data Upload file, data summary, adjust data, cross tabulation 2 Visual Analysis Plotting, cluster classification 3 Inferential statistics Modeling, regression, varbrul analysis, conditional trees, random forest 21 / 44
  25. 25. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Visual Analytics Visual Analytics: “The science of analytical reasoning facilitated by visual interactive interfaces” (Thomas et al. 2005) 22 / 44
  26. 26. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References One Variable Plot 23 / 44
  27. 27. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References One Variable Plot 23 / 44
  28. 28. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Customizing Plot 24 / 44
  29. 29. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Customizing Plot 24 / 44
  30. 30. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Saving Plot 25 / 44
  31. 31. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Cluster Plot Classification of data into sub-groups is based on pairwise similarities Groups are clustered in the form of a tree-like dendrogram 26 / 44
  32. 32. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Cluster Plot 27 / 44
  33. 33. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Cluster Plot Group 1 Animation, Biography and Group 2 Action, Drama, Comedy 28 / 44
  34. 34. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Inferential Statistics 29 / 44
  35. 35. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Language Variation Suite - Structure 1 Data Upload file, data summary, adjust data, cross tabulation 2 Visual Analysis Plotting, cluster classification 3 Inferential statistics Modeling, regression, varbrul analysis, conditional trees, random forest 30 / 44
  36. 36. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References How to Create a Regression Model Step 1 Modeling - create a model with dependent and independent variables Step 2 Regression - specify the type of regression (fixed, mixed) and type of dependent variable (binary, continuous, multinomial) Step 3 Stepwise Regression - compare models (Log-likelihood, AIC, BIC) Step 4 Conditional Trees - apply non-parametric tests to the model 31 / 44
  37. 37. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Modeling 32 / 44
  38. 38. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Regression Types Model a.) Fixed effect b.) Mixed effect - individual speaker/token variation (within group) Type of Dependent Variable a.) Binary/categorical (only two values) b.) Continuous (numeric) c.) Multinomial - categorical with more than two values 33 / 44
  39. 39. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Regression 34 / 44
  40. 40. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Model Output 35 / 44
  41. 41. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Model Output 35 / 44
  42. 42. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Interpretation: Budget and Genre Genre Action is the reference value Positive coefficient - positive effect Negative coefficient - negative effect http://www.free-online-calculator-use.com/scientific-notation-converter.html 36 / 44
  43. 43. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Interpretation: Budget and Genre Genre Action is the reference value Positive coefficient - positive effect Negative coefficient - negative effect http://www.free-online-calculator-use.com/scientific-notation-converter.html 36 / 44 exponential notation: 1.46e-7 0.000000146
  44. 44. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Conditional Tree Conditional tree: a simple non-parametric regression analysis, commonly used in social and psychological studies Linear regression: all information is combined linearly Conditional tree regression: visual splitting to capture interaction between variables Recursive splitting (tree branches) 37 / 44
  45. 45. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Conditional Tree 38 / 44
  46. 46. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Conditional Tree 38 / 44
  47. 47. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Conditional Tree 1 Genre is the significant factor for budget 2 Budget distribution is split in two groups: Action and Animation Biography, Comedy and Drama 3 Budget is significantly higher for Animation and Action 39 / 44
  48. 48. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Random Forest 1 Variable importance for predictors 2 Robust technique with small n large p data 3 All predictors considered jointly (allows for inclusion of correlated factors) 40 / 44
  49. 49. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Random Forest Let’s add more factors! Return to Modeling Add independent factors: director facebook likes, actor 1 facebook likes, title year 41 / 44
  50. 50. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Random Forest 42 / 44
  51. 51. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Random Forest Genre is the most important predictor for this model. Close to zero or red-dotted line (cut off values) - irrelevant for this model42 / 44
  52. 52. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References Let’s Have a Short Break 43 / 44
  53. 53. Introduction Data Preparation LVS Working with Data Visual Analytics Inferential Analysis References References I [1] Baayen, Harald. 2008. Analyzing linguistic data: A practical introduction to statistics. Cambridge: Cambridge University Press [2] Gries, Stefan Th. 2015. Quantitative designs and statistical techniques. In Douglas Biber Randi Reppen (eds.), The Cambridge Handbook of English Corpus Linguistics. Cambridge: Cambridge University Press [3] Schnapp, Jeffrey, and Peter Presner. 2009. Digital Humanities Manifesto 2.0. [4] http://gifsanimados.espaciolatino.com/x bob esponja 8.gif [5] https://daniellestolt.files.wordpress.com/2013/01/are-you-ready1.gif [6] http://www.martijnwieling.nl/R/sheets.pdf 44 / 44

×