ICPSR Data Services at Kenyon College - David Thomas


Presentation on the ICPSR data archive at Kenyon College, March 2014

Published in: Education, Technology
  • As of September 2012, over 68,700 datasets (over 585,000 files) were available for download. To give a sense of download volume: in FY 2012, over 1,172,304 datasets were downloaded or accessed (4,765,641). Also in FY 2012, about 35,345 MyData accounts (19,600 members) were active, meaning they downloaded or accessed something.
  • ICPSR supports students, faculty, researchers, and policymakers.
  • As you have seen, ICPSR doesn’t just deliver data. We surround that data with tools and services that support its use and interpretation.
  • What’s in the collection? Resources using data in the ICPSR holdings as the primary data source; resources using ICPSR data in a comparison with the primary dataset investigated; and resources "about" an ICPSR dataset or study series.
  • Know of reports, articles, publications connected to our data? Contact us!
  • We have several different use cases for our citation search:
      - Users who are looking for facts/tables, rather than raw data
      - Users who want to see how a particular dataset has been utilized
      - Researchers and funding agencies who want to gauge the impact of a particular study by seeing how much it has been cited
    The citation search is a bit limited in that it only searches the citation, not the full text. Due to recent court rulings on the Google copyright case that favor indexing copyrighted content and providing snippets to users on the search results page, we are investigating the possibility of expanding this service to include full text.
    One of the strengths of this utility is that we provide links to full text whenever possible. We retain DOIs when we can locate them, and we provide dynamic links to WorldCat and Google Scholar if we don’t know the specific location of the article/report, making it that much easier for users to get to the full text.
  • Our study search indexes the metadata record that ICPSR creates, as well as the full text of the documentation files, including all the variable markup and question and answer text. In the study search, the metadata record is heavily weighted, especially the study title and the subject terms. In practice, we encounter three typical search behaviors from our users that dictate which search options best meet the user’s needs.
  • To better understand user search behaviors, we're going to make a change to the Find & Analyze Data site to get a feel for the use of these three styles. We’re adding a short survey that changes search behavior based upon the user’s selection and displays relevant search tips. In addition, those choices will be recorded in Google Analytics so we can see what percentage of our users favors each style. It is our hope that in six months or so we’ll have the data to know where to focus our efforts to improve the search. [Briefly demonstrate how the survey choices cause different search tips to appear. Explain that the second option changes the sort to “variable relevance.”]
  • Solr has natural language search capabilities. For some users, finding the right dataset is as simple as typing in the research question. Unfortunately, bad search engines are so common that users seldom provide more than one or two query terms, on the assumption that more query terms would narrow search results to nothing. The challenge for us is educating our users and making sure our metadata is sufficient to the task. To provide an example, let’s do a natural language search on the NACJD website using a simple research question, such as “Does juvenile drug use lead to delinquency?” The results are pretty good, and tend to be better than concise keyword searches, such as “juvenile ‘drug use’ delinquency,” which returns studies that cover only two of the three factors or focus on prediction of delinquency instead of delinquency itself. If I searched for “juvenile drug use” as a phrase, I’d actually get only one result. That’s one of the big limitations of phrase searching. [Go to the front page of NACJD and search on the question “Does juvenile drug use lead to delinquency?”]
  • One relatively recent addition to our website is the variable relevance sort. When you do a search and then sort by variable relevance, we weight the variables more heavily and display matching variables on the search results page, along with instructions that the user should separate different concepts with commas. This makes it relatively easy for a user to find a dataset that has a specific combination of variables. The variable display also includes checkboxes beside each variable and we provide a “compare variable” utility that allows users to view selected variables side by side. [Do search on age gender race then sort by Variable Relevance. Use the tool to add commas. Briefly demo the compare variables function.]
  • If a user is searching for a particular investigator, the search is relatively easy, unless the person’s name is very common (e.g., Smith). Our site offers a “Browse by Author” page to make the task somewhat easier, and to allow users to see variations on the name if they run into trouble. Site visitors can use quotation marks for phrase searching, but this can be problematic if the user is mistaken about the word order or gets a word in the title wrong, for example “study” versus “survey.” In general, searching by title is very effective and we don’t need to provide the user with additional guidance.
  • The new search function is pretty simple and straightforward. When searching for studies that have specific variables, separate your concepts with commas and choose “Variable Relevance” as your sort option on the search results screen. If you have comments or suggestions, please feel free to email web-support@icpsr.umich.edu.
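
The comma-separated concept search described in these notes can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not ICPSR's or Solr's actual ranking: the Study class, the function names, and the sample studies are all hypothetical, and "relevance" is reduced to counting how many comma-separated concepts have at least one matching variable.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical stand-in for a study record (not ICPSR's real schema)."""
    title: str
    variables: list[str]  # variable text: question wording, labels, categories


def split_concepts(query: str) -> list[str]:
    """Split a query on commas; each non-empty piece is one variable/concept."""
    return [part.strip().lower() for part in query.split(",") if part.strip()]


def matching_variables(study: Study, concept: str) -> list[str]:
    """Return the study's variables that contain every word of the concept."""
    words = concept.split()
    return [
        v for v in study.variables
        if all(w in v.lower().split() for w in words)
    ]


def variable_relevance(studies: list[Study], query: str) -> list[tuple[Study, int]]:
    """Rank studies by how many distinct concepts they cover, highest first.

    Assumption: coverage count is used as the score; a real engine would
    use weighted text relevance instead.
    """
    concepts = split_concepts(query)
    scored = []
    for study in studies:
        covered = sum(1 for c in concepts if matching_variables(study, c))
        if covered:
            scored.append((study, covered))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, for illustration only.
STUDIES = [
    Study("Hypothetical Voter Study", ["Respondent gender", "Party affiliation"]),
    Study("Hypothetical Youth Survey",
          ["Respondent gender", "Respondent race", "Ever treated for drug abuse"]),
    Study("Hypothetical Housing Study", ["Number of rooms"]),
]

for study, covered in variable_relevance(STUDIES, "drug abuse, gender, race"):
    print(covered, study.title)
```

The point of the commas is visible in split_concepts: "drug abuse, gender, race" becomes three concepts, so a study is rewarded for covering each one rather than for matching any five words somewhere in its documentation.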

    1. ICPSR Data Services @ Kenyon College
       Frederique Laubepin, Instructional Resources
       David Thomas, Resource Center for Minority Data
    2. Agenda
       • Intro of the day and agenda
       • ICPSR in the Kenyon Classroom
         – Professor Corker
         – Professor Kohlman
       • ICPSR Background
       • Searching ICPSR Holdings
       • ICPSR and data in the classroom
       • Questions?
    3. What is ICPSR? - Then and Now -
       • One of the world’s oldest and largest social science data archives, est. 1962
       • Data distributed on punch cards, then reel-to-reel tape, now:
         – Data available on demand
         – Over 8,200 studies with over 68,700 data sets
       • Membership organization among 21 universities, now:
         – Currently about 735 members world-wide
         – Federal funding of public collections
    4. What We Do – It’s About Data!
       • Seek research data and pertinent documents from researchers (PIs, research agencies, government)
       • Process and preserve the data and documents
       • Disseminate data
       • Provide education, training, & instructional resources
    5. Why People Use ICPSR
       • Write articles, papers, or theses using real research data
       • Conduct secondary research to support findings of current research or to generate new findings
       • Use as intro material in grant proposals
       • Preserve/disseminate primary research data
         – Fulfill data management plan (grant) requirements
       • Study or teach quantitative methods
    6. “Shopping” for Data: The MyData Account
       • MyData account – operates as authentication and like a shopping cart!
       • Authenticate once every six months on campus and you can carry it with you
    7. Supporting the Data
       • Free user support
       • The HELP Page offers:
         – User support (at ICPSR) email and phone contact information
         – Data User Help Center: Short Tutorials & Webinars available 24/7
         – Local Support: Who to contact at your local institution
         – Glossary of Terms
         – Social Networks: Where you can find us on YouTube, Facebook, Twitter, Slideshare, and more
    8. The Bibliography of Data-related Literature
       It’s really a searchable database . . .
       containing over 62,500 citations of known published and unpublished works resulting from analyses of data archived at ICPSR . . .
       that can generate study bibliographies associating each study with the literature about it . . .
       Included in the integrated search on the ICPSR Web site
    9. Demonstrating the Impact of Research
    10. Assessing the data in the collection
       • Searching for and Downloading data
       • Simple crosstab
       • Codebook
       • Full Descriptives
       • Online Analysis Functionality
    11. Study Search Behaviors
       In practice, we encounter three typical search behaviors from our users:
       • A user has a research question in mind.
       • A user is looking for a dataset that contains specific variables.
       • A user is looking for a specific dataset and has the study title or investigator name.
    12. Natural Language Searching
       • Does juvenile drug use lead to delinquency?
       • juvenile “drug use” delinquency
       • “juvenile drug use” delinquency
    13. Searching by Variable/Concept
    14. Searching for a Specific Dataset
    15. Search Conclusion
       • Sorting by “Variable Relevance”
         – ranks the variable text (questions, labels, categories) highly so that top results contain the variable concepts
         – displays matching variables on the search results screen
         – allows you to check variables and compare them side-by-side
         – provides direct links to the full variable description
    16. • Separating search terms with commas treats each as a distinct variable/concept.
         – drug abuse, gender, race
         – newspaper in home, voting, party affiliation
    17. Find Data
    18. Simple Crosstab
    19. Compare Variables
    20. Crosstab Creator
    21. Crosstab Assignment Builder
    22. Online Analysis
    23. For More Info:
       • Explore the website - www.icpsr.umich.edu
       • Sign up for our email announcements - www.icpsr.umich.edu/icpsrweb/membership/lists/index.jsp
       • “Like” ICPSR on Facebook/follow ICPSR on Twitter
       • Attend or view our webinars (open to the public!)
       • Find our presentations on www.slideshare.net – user: icpsr
       • Contact user support – netmail@icpsr.umich.edu
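
As a footnote to the Natural Language Searching slide, the difference between the keyword query and the stricter phrase query can be shown with a toy matcher in Python. This is a sketch under assumptions, not how Solr or the ICPSR site evaluates queries: it requires every keyword and every quoted phrase to be present, does no ranking or stemming, and does not model the full-question search at all. The function names and the two sample titles are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def parse_query(query: str) -> tuple[list[list[str]], list[str]]:
    """Split a query into quoted phrases and loose keywords."""
    # Treat curly quotes (as typed on the slide) like straight quotes.
    query = query.replace("“", '"').replace("”", '"')
    phrases = [tokenize(p) for p in re.findall(r'"([^"]+)"', query)]
    keywords = tokenize(re.sub(r'"[^"]+"', " ", query))
    return phrases, keywords


def contains_phrase(tokens: list[str], phrase: list[str]) -> bool:
    """True if the phrase tokens appear contiguously and in order."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def matches(text: str, query: str) -> bool:
    """True if the text holds every keyword and every quoted phrase.

    Assumption: all terms are required; a real engine may treat terms
    as optional and rank the results instead.
    """
    tokens = tokenize(text)
    phrases, keywords = parse_query(query)
    return (all(k in tokens for k in keywords)
            and all(contains_phrase(tokens, p) for p in phrases))


# Hypothetical titles, for illustration only.
TITLES = [
    "Drug use among juvenile offenders and later delinquency",
    "Juvenile drug use and delinquency in urban schools",
]

for query in ['juvenile "drug use" delinquency', '"juvenile drug use" delinquency']:
    hits = [t for t in TITLES if matches(t, query)]
    print(query, "->", len(hits), "match(es)")
```

Lengthening the quoted phrase from two words to three drops the first title, because its words no longer appear in that exact order. That is the narrowing effect of phrase searching that the speaker notes warn about.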