ASISTS in Context
Data Analysis is like cooking
• Steps
– Picking your dish
– Finding the right ingredients
– Cleaning your ingredients
– Preparation
– Using the right tools
– Finishing the dish
– Final touches
– Presenting the final product
2
Picking your dish/
Picking your question to analyze
• Is it an interesting question? Who is asking the question?
• Do you have the right data to answer these questions? (Tip:
it’s not just ASISTS)
• If you don’t have the data, do you know where to get it?
• Do you have the right tools to do the analyses?
• Who is your audience?
• When do you need to be done?
3
Sample questions
• Should my program focus on ABE or ESL?
• What can my program do to improve retention?
• How can you improve program performance?
4
Finding the ingredients (or identifying your
data sources)
• Questions to ask:
– Does the data have the right information (fields)?
– Do you know what each of the values in the relevant fields stands for?
– Is the time frame relevant to answering the question?
– Is it relevant to the geographical area for which you are doing the analysis?
– How reliable is the data?
– Is it one data set or more than one?
– If multiple data sets, can you relate them?
– What are the privacy, legal and security concerns?
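One way to check whether two data sets can be related is to look for a shared field and see how many records actually match on it. The sketch below is only an illustration: the field names (student_id, zip, unemployment_rate) and all values are made up, not taken from ASISTS or the Census.

```python
# Hypothetical records; field names and values are invented for illustration.
students = [
    {"student_id": "S1", "zip": "10001", "program": "ESL"},
    {"student_id": "S2", "zip": "10002", "program": "ABE"},
    {"student_id": "S3", "zip": "99999", "program": "ESL"},  # no match in area data
]
area_stats = [
    {"zip": "10001", "unemployment_rate": 7.5},
    {"zip": "10002", "unemployment_rate": 9.0},
]


def relate(left, right, key):
    """Attach the matching right-hand row to each left-hand row via a shared key."""
    lookup = {row[key]: row for row in right}
    matched, unmatched = [], []
    for row in left:
        other = lookup.get(row[key])
        if other is None:
            unmatched.append(row)  # keep these visible instead of silently dropping them
        else:
            matched.append({**row, **other})
    return matched, unmatched


matched, unmatched = relate(students, area_stats, "zip")
print(len(matched), "matched;", len(unmatched), "unmatched")
```

A large unmatched list is a hint that the two sources code the shared field differently, or that they cover different areas or time frames.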
5
Accuracy/appropriateness of the data
• For each data element ask the following questions:
– Who collects this data?
– Why is this data being collected?
– Is there a reason for systematic bias in this data?
– Does this field contain a lot of missing data?
– Does this field contain a large number of outlier values?
– Does the data make sense?
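The missing-data and outlier questions above can be answered with a quick per-field check. This is a minimal sketch with invented records; the field names (zip, hours) and the plausible range of 0 to 500 hours are assumptions that an analyst would replace with their own.

```python
# Hypothetical records; values are invented for illustration.
records = [
    {"zip": "10001", "hours": 40},
    {"zip": "", "hours": 35},
    {"zip": "10003", "hours": 1200},  # suspiciously large
    {"zip": None, "hours": 50},
    {"zip": "10005", "hours": -5},    # impossible value
]


def missing_share(rows, field):
    """Fraction of rows where the field is blank or absent."""
    missing = sum(1 for r in rows if r.get(field) in (None, ""))
    return missing / len(rows)


def out_of_range(rows, field, low, high):
    """Rows whose value falls outside a plausible range chosen by the analyst."""
    return [
        r for r in rows
        if r.get(field) is not None and not (low <= r[field] <= high)
    ]


zip_missing = missing_share(records, "zip")
suspect_hours = out_of_range(records, "hours", 0, 500)
print(f"zip missing: {zip_missing:.0%}; suspect hours rows: {len(suspect_hours)}")
```

Flagged rows are candidates for follow-up, not automatic deletions; whether a value is an error is a judgment call for someone who knows how the data was collected.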
6
Some data sources to consider
• ASISTS (https://www.asists.com)
• Census (www.census.gov)
• Immigration data (http://www.dhs.gov/office-immigration-statistics)
• NAAL (http://nces.ed.gov/naal/)
• Other government open data projects
– Data.gov (http://www.data.gov/)
– NYC open data portal (https://nycopendata.socrata.com/)
– NYS open data portal (https://data.ny.gov/)
– Data from other government entities (example: School districts)
7
To look for in ASISTS
• Existing reports (with and without disaggregation)
• Downloads of existing reports
• Data downloads
• Reviewing data screens
8
Census data
• The Decennial Census
• The American Community Survey (ACS)
• The Current Population Survey (CPS)
• Survey of Income and Program Participation (SIPP)
• Statistics about governments
• Economic census
• The American Fact Finder (AFF)
9
The American Fact Finder
10
Cleaning your data/ingredients
• Watch out for
– Outliers and invalid values
– Record counts that do not make sense
• Simple methods for cleaning your data
– Sorting in spreadsheets
– Frequency counts
• Validate against other sources
11
A data cleaning exercise
• Cleaning up employment status data
– Download ASISTS student data
– Calculate the percentage of students with each employment status
– Compare to employment statistics for your area
– Talk to the manager and intake staff responsible for collecting data
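The percentage step of this exercise can be sketched as a frequency count. The status values and the area benchmark below are hypothetical, not real ASISTS data or a real statistic, and the category labels are assumptions.

```python
from collections import Counter

# Hypothetical employment status values as they might appear in a download.
statuses = [
    "Employed", "Unemployed", "Employed", "Not in labor force",
    "Unemployed", "Unemployed", "employed", "Employed",
]


def percentages(values):
    """Percentage of records in each category, after normalizing case and spacing."""
    counts = Counter(v.strip().lower() for v in values)
    total = sum(counts.values())
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


program = percentages(statuses)
area_unemployed_pct = 8.0  # hypothetical benchmark, not a real statistic
gap = program["unemployed"] - area_unemployed_pct
print(program, "gap vs. area:", gap)
```

A large gap does not prove the program data is wrong, but it is a reason to ask intake staff how the field is filled in. Note also that an official unemployment rate is a share of the labor force, not of all people, so the two percentages should be put on the same footing before comparing.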
12
The tools of the trade
• Microsoft Excel
• Access
• For advanced statistics, R, SPSS, SAS
• Census and other data-rich web sites
• Google Maps
• ArcGIS for mapping
• Other Google tools (Fusion, Ngage)
13
Preparing your data
• Goal: get your data into the right format for analysis
• Recode
• Sort
• Group
• Delete unnecessary data
• Remove duplicates
• Delete blank rows
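The preparation steps above can be sketched in a few lines. Everything here is hypothetical: the rows, the recode mapping, and the assumption that each student should appear only once.

```python
from itertools import groupby

# Hypothetical raw rows, as they might come out of a spreadsheet export.
raw = [
    {"student_id": "S2", "program": "esl ", "hours": "30"},
    {"student_id": "S1", "program": "ABE", "hours": "45"},
    {"student_id": "", "program": "", "hours": ""},          # blank row
    {"student_id": "S2", "program": "esl ", "hours": "30"},  # duplicate
    {"student_id": "S3", "program": "E.S.L.", "hours": "12"},
]

# Hypothetical recode mapping from messy labels to one consistent code.
RECODE = {"esl": "ESL", "e.s.l.": "ESL", "abe": "ABE"}


def prepare(rows):
    """Drop blank rows and duplicates, recode program labels, then sort."""
    cleaned, seen = [], set()
    for row in rows:
        if not any(value.strip() for value in row.values()):
            continue  # delete blank rows
        if row["student_id"] in seen:
            continue  # remove duplicates, keeping the first occurrence
        seen.add(row["student_id"])
        label = row["program"].strip()
        cleaned.append({
            "student_id": row["student_id"],
            "program": RECODE.get(label.lower(), label),  # recode
            "hours": int(row["hours"]),
        })
    return sorted(cleaned, key=lambda r: (r["program"], r["student_id"]))  # sort


prepared = prepare(raw)

# Group: total hours per program (groupby needs rows already sorted by the key).
totals = {
    program: sum(r["hours"] for r in group)
    for program, group in groupby(prepared, key=lambda r: r["program"])
}
print(totals)
```

The same steps can be done by hand in Excel or Access; writing them down as a script mainly helps when the same download has to be prepared again later.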
14
Finishing your dish/analyzing the data
• What do you want to say about the data?
• How do you want to say it?
• What analyses are most appropriate to answer your
questions? (Tip: you don’t have to be a statistician to do
good data analysis)
• How do you want to present your data?
15
Presenting your data
• Presentation is everything!
• Talking about your results the right way is as important as
using the right tables and charts
• Tools
– Excel
– PowerPoint
– Google Fusion tables
• Don’t overgeneralize!
16
Data Analysis as iterative
• Asking the Question
• Refining the Question
• Locating the data
• Cleaning data
• Analysis
• Presentation
17
18
The adult ed data blog
• www.adultedgps.blogspot.com
– Regular posts of data analyses and policy updates
– Policy and data related tweets
– Searchable
– Downloadable presentations
19
Contact Info:
Venu Thelakkat
venut@lacnyc.org
Adultedgps.blogspot.com
20


Editor's Notes

  1. The real title is ‘beating a metaphor to death’
  2. Do a quick check to see who likes cooking. Follow this exercise through to ask things like: How do you decide what to cook? How do you decide what to cook with (what you have, or what you want to cook)? You have to clean your ingredients. You have to prepare the ingredients.
  3. Is it the interest of the person analyzing the data? Or is it a boss’s interest? Or a funder’s? Talk through how to scan for data. Depending on resources and time, you can collect data either formally or informally. Sometimes you have to be realistic about getting an analysis done in the available time. People who ask for data sometimes have unrealistic expectations. Sometimes you just don’t have the data to answer certain questions. Example: how many GED® teachers are in New York State?
  4. Ask for other examples
  5. The right fields means not only looking at the labels but also at how the information is coded. For example, if you want to look at Afro-Caribbean men, it does not help if the data is coded only as African American. Do you have access to a data dictionary? How fresh is the data? Use the Great Cities example of using NALS data (more relevant, but not very recent and not available at the local level). Also, Census one-year survey data is not available for small localities.
  6. Is the person collecting the data biased towards making the data look a certain way? Use the example of employment status in NRS ASISTS data. Contrast employment statistics with actual program data. For missing data, use the example of zip codes. For outliers, use hours data.
  7. Qualifiers for using ASISTS data
  8. The decennial census is conducted once every 10 years. The ACS is conducted every year; the data is about one to two years old but covers a lot of the information in which we are usually interested. The CPS handles economic statistics and is used for calculating the unemployment rate. The SIPP is a longitudinal survey, the only major longitudinal program that the Census operates regularly. Government statistics cover government structure, processes and spending. The economic census covers the private sector. Go through the American Fact Finder.
  9. You use the American Fact Finder to run customized queries, but you don’t always need to
  10. Use the employment status example.
  11. Talk about the advantages of Excel and Access; compare and contrast. R is free; SPSS and SAS are not.