IQSS Presentation to Program in Health Policy



  1. 1. Research Technology Consulting
      Simo Goshev, Alex Storer, Steve Worthington, Ista Zahn
      support@help.hmdc.harvard.edu
  2. 2. Consulting Goals
      • Data analysis support and programming services
      • Research project planning and guidance in selecting appropriate technology for research projects
      • Facilitating appropriate organization, storage and sharing of data
      • Training on the use of both established software packages and emerging tools
  3. 3. Scope Free! Support the entire social science community Consults measured in hours rather than weeks or months Currently doing outreach to departments, student groups and centers Drop-ins on Fridays at 1pm in the training lab, Appointments, Help Tickets and casual chats in K306
  4. 4. Who We Are
  5. 5. Simo Goshev
      • BA – Sofia, Bulgaria, Applied Econometrics
      • MS – McMaster University, Statistics
      • PhD – McMaster University, Economics
      Analysis: Econometrics, Applied Microeconometrics, Panel Data, Applied statistics
      Tools: Mainly Stata, some R
  6. 6. Help with econometrics
      • What model is most suitable for my data on hospital IT innovation?
      • I am looking at HIV in children. Can you help me design an overlapping generations model?
      • Why are the confidence intervals of my spline of health care spending so wide/narrow?
      • Could the interaction between an exogenous and endogenous variable be exogenous?
      • I am looking for a way to compare survival between two cancer management programs. Can you help me?
  7. 7. Help with computation/estimation
      • I am trying to estimate a model but for some reason the routine fails. Could you have a look at my script?
      • I am working with a large dataset and my machine is giving up on me. Do you have any suggestions?
      • Which routine is best for…?
  8. 8. Replication study in health economics
      • Graduate Student
      • Make sense of a study and Stata code
      [Figure: two panels with vertical axes from .2 to 1 and horizontal axes from 65 to 80]
  9. 9. Predictors of hospital IT adoption
      • Graduate Student, School of Public Health
      • Understand what factors facilitate/hinder adoption of IT in US hospitals
      Data:
      • Sample of hospitals clustered within states
      • Count of ITs adopted by a hospital in 3 consecutive years
      Modeling strategy:
      • Three-level mixed effects model
  10. 10. Alex Storer
      • BS, BA – UC Berkeley, Electrical Engineering & Computer Science, Cognitive Science
      • PhD – Boston University, Cognitive & Neural Systems
      Analysis: Machine Learning, Signal Processing, Surface Based Techniques, Simulation, Optimization
      Tools: Matlab, R, Python; Emacs, LaTeX, Linux
  11. 11. Text Analysis
      • Large corpus
      • Topic Models
      • Prevalence of certain terms
      • Sentiment
  12. 12. Text Analysis
      • Twitter: #obamacare
      • Positive/Negative Opinions?
  13. 13. Text Analysis
      • Congress Speeches
      • Distinct Content Groupings
  14. 14. Text Analysis
      • NY Times Archive
      • Term: "Medicare"
  15. 15. Text Analysis: Topic Models, Prevalence of certain terms, Sentiment
      • What models are appropriate to perform our analysis?
      • What software is appropriate?
  16. 16. Text Analysis: Large corpus
      • Where do we obtain this corpus?
      • How do we pre-process it so we can analyze it?
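
The pre-processing question on the text analysis slides can be illustrated with a minimal sketch. It assumes the corpus is already available as a list of plain-text strings; the helper names `tokenize` and `term_prevalence` and the sample sentences are hypothetical and do not come from the presentation. It only covers lowercasing, tokenizing and measuring how prevalent a term is, not topic models or sentiment.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_prevalence(documents, term):
    """Return the fraction of documents that contain the term at least once."""
    if not documents:
        return 0.0
    term = term.lower()
    hits = sum(1 for doc in documents if term in tokenize(doc))
    return hits / len(documents)


if __name__ == "__main__":
    # Made-up documents for illustration only.
    sample = [
        "Medicare spending rose again.",
        "The Senate debated Medicare and Medicaid.",
        "Rainfall was heavy in June.",
    ]
    print(term_prevalence(sample, "Medicare"))
```

Matching on whole tokens rather than substrings keeps "Medicare" from also matching longer words that merely contain it.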
  17. 17. Federal Procurement Database
  18. 18. Federal Procurement Database
      • Only first 500 hits, only a few columns
      • All of the data, but…
  19. 19. Federal Procurement Database
      Python!
      • Download atom feeds
      • Parse XML tree structure
      • Search for union of entries
      • Output as CSV
      For 20 GB of data, there is no way to download by hand…
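
The pipeline on the slide above (parse the Atom XML, take the union of entries, write CSV) might look roughly like the sketch below, using only the Python standard library. It assumes the feeds have already been downloaded as strings and that the fields of interest are the top-level Atom elements `id`, `title` and `updated`; the real procurement feeds probably nest their records more deeply, and the function names here are hypothetical rather than the code used in the consult.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard Atom namespace, needed to address <entry> and its children.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}
FIELDS = ("id", "title", "updated")  # assumed columns, for illustration


def entries_to_rows(atom_xml, fields=FIELDS):
    """Parse one Atom document and return one dict per <entry>."""
    root = ET.fromstring(atom_xml)
    rows = []
    for entry in root.findall("atom:entry", ATOM_NS):
        rows.append({
            # Missing child elements become empty strings.
            field: entry.findtext(f"atom:{field}", default="", namespaces=ATOM_NS)
            for field in fields
        })
    return rows


def union_of_entries(feeds, fields=FIELDS):
    """Merge entries from several feeds, keeping the first copy of each id."""
    seen = {}
    for atom_xml in feeds:
        for row in entries_to_rows(atom_xml, fields):
            seen.setdefault(row["id"], row)
    return list(seen.values())


def rows_to_csv(rows, fields=FIELDS):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up feed for illustration only.
    feed = (
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        "<entry><id>1</id><title>A</title><updated>2012-01-01</updated></entry>"
        "</feed>"
    )
    print(rows_to_csv(union_of_entries([feed])))
```

Keying the union on the entry `id` means overlapping search results can be downloaded independently and merged afterwards without duplicating rows.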
  20. 20. Steve Worthington
      • BA / MS – Durham, UK, Anthropology & Archeology
      • PhD – NYU, Biological Anthropology
      Analysis: Linear models (OLS, GLS, PLS, etc.), Resampling (permutation, bootstrap), Ordination (PCA, LDA, CVA, etc.)
      Tools: Mainly R, some SAS, SPSS
  21. 21. Cleaning / reshaping data
      • Department of Economics
      • Daily Lat/Long data on rainfall in India (1951 – 2007)
      • 171 files, 3 types (2 ascii text, 1 binary)
      • One file for each year (containing 365 daily matrices)
      • Parse messy data into a long-format Stata data frame
      [Figure: June 21st 2007]
  22. 22. Cleaning / reshaping data
      • No common delimiter (spaces and tabs)
      • Use regexp to parse each datum
      • Use template to place each datum into correct row/column
      [Figure: Template]
  23. 23. Cleaning / reshaping data
      • Long format data frame in Stata
      • Rainfall for each day and lat/long
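
The cleaning steps on the three slides above can be sketched in Python as follows. This is an illustration under assumptions, not the code from the consult: the "template" is taken to be a fixed grid shape, the numbers are assumed to be plain signed decimals (no scientific notation), and the function names and sample values are hypothetical.

```python
import re

# Matches signed integers and simple decimals such as 12, -3 or 4.75.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_line(line):
    """Pull every number out of a line, ignoring the mix of spaces and tabs."""
    return [float(token) for token in NUMBER.findall(line)]


def fill_template(values, n_rows, n_cols):
    """Arrange a flat list of values into an n_rows x n_cols grid, row by row."""
    if len(values) != n_rows * n_cols:
        raise ValueError(f"expected {n_rows * n_cols} values, got {len(values)}")
    return [values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def grid_to_long(date, lats, lons, grid):
    """Emit one record per (date, latitude, longitude) cell of a daily grid."""
    return [
        {"date": date, "lat": lat, "lon": lon, "rainfall": grid[i][j]}
        for i, lat in enumerate(lats)
        for j, lon in enumerate(lons)
    ]


if __name__ == "__main__":
    # Made-up values for illustration only.
    raw_lines = ["0.0\t 1.2", "3.4   5.6"]
    values = [v for line in raw_lines for v in parse_line(line)]
    grid = fill_template(values, n_rows=2, n_cols=2)
    for record in grid_to_long("2007-06-21", [10.0, 11.0], [70.0, 71.0], grid):
        print(record)
```

Extracting numbers with a pattern, rather than splitting on a delimiter, sidesteps the inconsistent spacing; the length check in `fill_template` flags any day whose matrix was parsed incompletely before it reaches the long-format table.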
  24. 24. Rainfall / CEO movie
  25. 25. Rainfall / CEO movie
  26. 26. Geospatial Analysis in R
      • Spatial prediction: interpolation of data points
      • Spatial autocorrelation analysis
      • Drug resistant TB, Moldova
  27. 27. Ista Zahn
      • BS – University of Oregon, Psychology
      • PhD (ABD) – University of Rochester, Social Psychology
      Analysis: Regression, Mixed Models, Scale Development
      Tools: R, Stata, SAS, SPSS; Emacs, LaTeX, Linux
  28. 28. Workshops (schedule at
  29. 29. IQSS Services
      The Institute for Quantitative Social Science at Harvard University
  30. 30. Contact Us!
      • support@help.hmdc.harvard.edu, Room K306
      • Friday afternoons, K018