Don’t Know/Refused:
A Researcher’s Perspective on
HMIS Data Collection Challenges
By Susan Walker
Rhode Island Coalition for the Homeless
Bowman Systems Collaborate 2016
What do I Know About Homelessness and Data?
 Occupy Providence 2011-2012, back when I used to call
constituents “friends”.
 Antiques Dealer since 1998, alongside homeless and
marginalized people.
 Studied Medical Billing & Coding at a for-profit online
university. Learned about records and predatory lending.
 Currently enrolled at Brown School of Public Health,
learning and doing data collection and analysis.
 Recently hired by my dream employer.
 I’m new! If you pay close attention, you’ll probably catch
my mistakes!
Data
Collection
Concepts
 Hard to Reach Population
 Question Ordering
 Coding
 Skip Patterns
 Prompting with Answer Choices
 Probing
 Look backs
 Data Collection Points
 Non-Verbal Communication
 Interviewer Bias
 Client disposition/Setting
 Sample Size
 Bias
 ERROR!
Personality
Data Quality is influenced by the
personalities of…
 The HMIS end user
 The client
 The provider’s office culture
 The funding managers
 Government officials
 The HMIS administrators
 Variable itself
Form Versus
Questionnaire
 The HUD Intake form has issues.
 You ask a client for their name.
 You don’t ask a client for their name data quality.
 “Has the client been continuously homeless (i.e., on the street, in an emergency
shelter, or safe haven) for at least one year?” … Still on the HUD Template.
 Our Entry in HMIS is in a very different order.
 A form gives the impression that the data is somehow uniform, and fits easily
into the boxes.
 A questionnaire, with instructions, communicates that gathering data requires
conversation, and entering data requires judgment.
Entry Date
 The Entry Date and data entry date are
often different: a source of error.
 There can be overlapping ES entries on
the same date.
 Backdating is great, but can lead to
absurdities.
 I emphasize the importance of the
relationship between date and data:
 By entering a date, they are
confirming that all the data in the
record is true.
 If they haven’t taken care to verify
it, there is a problem.
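
One way to surface overlapping ES entries for review is sketched below. It is an illustration only: the `(entry_id, entry_date, exit_date)` tuple layout is hypothetical and does not reflect any HMIS export format, and treating a same-day exit and re-entry as "not an overlap" is an assumption that a CoC may want to change.

```python
from datetime import date
from itertools import combinations

def overlapping_entries(entries):
    """Return pairs of entry IDs whose stays overlap for one client.

    entries: list of (entry_id, entry_date, exit_date) tuples (hypothetical
    layout). exit_date is None while the entry is still open.
    """
    pairs = []
    for a, b in combinations(entries, 2):
        a_end = a[2] or date.max  # open entries run "forever"
        b_end = b[2] or date.max
        # Strict comparison: exiting one shelter and entering another on the
        # same day is NOT counted as an overlap (an assumption of this sketch).
        if a[1] < b_end and b[1] < a_end:
            pairs.append((a[0], b[0]))
    return pairs

client_entries = [
    ("A", date(2016, 1, 1), date(2016, 1, 10)),
    ("B", date(2016, 1, 5), None),               # still open
    ("C", date(2016, 1, 10), date(2016, 1, 12)),
]
flagged = overlapping_entries(client_entries)
```

A list like `flagged` is only a starting point for a conversation with the case manager. It cannot say which of the two entries is wrong.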
Name
 Street Names… Duplicate Clients
 Data Quality Descriptors?
 We have found out that we sometimes
have 2 clients to one ID because they
have the same name.
 We train Case Managers to enter a name
data quality descriptor, but not to ask
about it.
 It is unclear which HMIS Data Elements
need to be asked of the client and which
do not.
ALIAS
Social Security
Number
 There is a disincentive for
sharing this data.
 HUD’s rationale:
unduplicated client counts,
and SSN necessary for
applications.
 Data Quality Descriptor errors: If the Case Manager enters “doesn’t know” or
“refused”, then the blank SSN shouldn’t flag.
 Some shelters were using 4 digits as a matter of procedure.
 In the event of a security breach, our clients would be vulnerable.
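
The flagging rule described above can be sketched as follows. This is a minimal illustration, not ServicePoint's actual logic: the function name and the descriptor labels are hypothetical, and the real pick-list wording comes from the HMIS Data Standards.

```python
# Hypothetical labels for the descriptors that explain a blank SSN.
EXPLAINED_BLANK = {"doesn't know", "refused"}

def blank_ssn_should_flag(ssn, descriptor):
    """Flag a blank SSN only when no descriptor explains why it is blank."""
    is_blank = not (ssn or "").strip()
    # Normalize case, spacing, and curly apostrophes before comparing.
    label = (descriptor or "").strip().lower().replace("\u2019", "'")
    return is_blank and label not in EXPLAINED_BLANK

refused = blank_ssn_should_flag("", "Refused")          # explained blank
unexplained = blank_ssn_should_flag("", None)           # blank, no reason
complete = blank_ssn_should_flag("123-45-6789", None)   # not blank
```

Partial numbers, such as the 4-digit entries mentioned above, are deliberately left out of this sketch; they would need their own rule.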
Date of Birth
 Date of Birth is a variable with personality!
 It’s a number that does not express a quantity.
 We use it to calculate age.
 If it’s not entered properly, it’s hard to calculate age.
 This is about the only trouble free variable.
 Our case managers find “Date of Birth Type” confusing, which is about the only
thing I would consider changing in ServicePoint.
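
For readers who want to see why a badly entered Date of Birth makes age hard to calculate, here is a small sketch. The function name is hypothetical, and rejecting a birth date that falls after the reference date is an assumption about what "not entered properly" might look like.

```python
from datetime import date

def age_on(dob, on):
    """Age in whole years on the date `on`, given a date of birth."""
    if dob > on:
        # A birth date in the future relative to `on` is a data entry error.
        raise ValueError("date of birth is after the reference date")
    years = on.year - dob.year
    # Subtract one year if the birthday has not yet occurred in `on`'s year.
    if (on.month, on.day) < (dob.month, dob.day):
        years -= 1
    return years

day_before_birthday = age_on(date(1980, 6, 15), date(2016, 6, 14))
on_birthday = age_on(date(1980, 6, 15), date(2016, 6, 15))
```

Calculating age at the Entry Date rather than at today's date gives the client's age at project entry, which may differ from the age a report shows.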
Veteran Status
• Self report might be in conflict with actual Veteran Status.
• Often missing, but I don’t understand the skip pattern.
• Why do you think people skip this question?
• In Rhode Island, Veteran Status as defined in the data standards would exclude
people from programs.
• We train to accept self report. The VA verifies later on.
Primary Race
 Frustration with these categories is widespread.
 Positive: They allow for meaningful comparison with all other race
data.
 Negative: Our clients do not like categorizing themselves this way.
 The HMIS Race Data Standards are based on the OMB’s Standards.
 Unless the Census changes, these categories are unlikely to change.
Secondary Race
 “Allow clients to identify as many
racial categories as apply (up to 5).”
 The 2000 Census was the first to
offer more than one race category,
with a view to understanding
multi-racial populations.
 The secondary race option is a good faith effort to assess racial diversity.
 Stay tuned for revised OMB standards which will likely impact HUD data
standards.
Ethnicity
 “In the 2010 census, about 20 percent of
Hispanics left the race question completely
blank.”
 Ethnicity is about culture, customs and
language.
 Different races can have the same ethnicity.
People of one race can be many different
ethnicities.
 We discovered in training that people
answer “Latino” for race, and trained to
enter the ethnicity and probe for primary
and secondary race.
Gender
 A simple, yet awkward question.
 Prefacing the whole interview with “I need to
ask every question, even though some may seem
obvious” can help.
 The data standards changed recently.
 It’s highly likely that case managers nationwide
answer this question without asking clients.
Disabling Condition
 We can’t decide where to put this question.
 Personality: Often in conflict with Disability HUD Verifications.
 Case managers report hearing “I don’t know”.
 Case managers need training on how to probe by specific conditions.
 How do you ask, and preserve the dignity of the client?
 Disability is a huge term, but the HUD criteria are specific.
 There’s a conflict if your program prohibits asking about disability before
housing placement.
 Confusion surrounding the “self report” standard, and the documentation
required for housing eligibility.
Residence Prior to Project Entry
 It’s not clear from HMIS that this needs to be
updated every entry.
 It often does not get updated every entry.
 Our Case Managers want a “getting evicted” option.
 We need an error report that shows that this was
not updated at entry.
 HMIS Data Standards 5.1 devotes 10 pages to this
element.
 New Option “Interim Housing”
 It can be very challenging to code this information
properly even if the client is properly interviewed.
Length of Time
 We tried to put this into plain English.
 This field is only as good as the
previous field.
 It would be nice if error reports would
flag when these fields are neglected.
 There is no way to match the length of time to the previous place to see that the
length of time actually corresponds to the previous place.
Head of Household
 Frequently skipped.
 What household?
 Even our plain English questions do not make sense in most settings.
 Long instructions in the Data Standards.
 Skipping seems to be associated with the awkwardness of the question, and the
amount of judgment needed to answer the question.
 In our COC many case managers enter the client with the disability as HoH.
 After that, we select the woman before the man, as men commonly exit PH-FAM.
 Personality: How case managers answer this question reflects uncomfortable
truths about our society.
Entering from Street or Emergency Shelter
 Our data completeness is good with
this field.
 No way of knowing if it is updated at
each entry.
 Length of Time Homeless is an
extremely important metric.
 We don’t have Safe Havens in RI.
 “Look Back” data is challenging.
How Many Times?
 Our data is complete for this field.
 Hard to know if it’s accurate.
 In training we discovered that more than
one case manager would count 4 years as 4
times.
 We trained them to count the breaks.
 We had a case manager act like a client, and he did a fantastic job
of not answering any question straight.
 We can train to collect and enter data, but training in judgment
is very difficult.
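
"Count the breaks" can be expressed as a short sketch. Everything here is illustrative: the stay list is made up, the function name is hypothetical, and the length of a gap that counts as a break is left as a parameter, because the applicable threshold should come from the current data standards rather than from this example.

```python
from datetime import date

def count_episodes(stays, min_break_days):
    """Count separate homeless episodes from a list of (start, end) stays.

    Stays separated by fewer than `min_break_days` full days are treated as
    one continuous episode. `end` is the last day of the stay, inclusive.
    """
    if not stays:
        return 0
    ordered = sorted(stays)
    episodes = 1
    current_end = ordered[0][1]
    for start, end in ordered[1:]:
        gap = (start - current_end).days - 1  # full days between the stays
        if gap >= min_break_days:
            episodes += 1  # a real break starts a new episode
        current_end = max(current_end, end)
    return episodes

# Four back-to-back years on the street: one time, not four.
four_years = [
    (date(2012, 1, 1), date(2012, 12, 31)),
    (date(2013, 1, 1), date(2013, 12, 31)),
    (date(2014, 1, 1), date(2014, 12, 31)),
    (date(2015, 1, 1), date(2015, 12, 31)),
]
times_continuous = count_episodes(four_years, min_break_days=7)

# Two stays with months housed in between: two times.
with_break = [
    (date(2014, 1, 1), date(2014, 3, 31)),
    (date(2015, 1, 1), date(2015, 2, 28)),
]
times_with_break = count_episodes(with_break, min_break_days=7)
```

The code only works when the stays are already known. Getting those dates out of an interview is the judgment part that training struggles with.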
Total Number of Months
1 month, ah ah ah…
2 months, ah ah ah…
1 day = 1 month???
Income
 “Projects collecting data through client
interviews should ask clients whether they
receive income from each of the sources
listed rather than asking them to state the
sources of income they receive.”
 We struggle to get PH providers to update
this information annually.
 Our case managers are not entering
“Under the Table” income.
 We are encouraging copious use of
Updates so that the data is already there at
Annual Assessment.
 Clients are not sure what the “right”
answer is.
Non-Cash Benefits
 TANF: A variable with an alias.
 Some benefits have amounts and some
don’t.
 Plagued by duplicates.
 I used this screenshot in a newsletter to
show what perfect end dates look like.
 We are training that it is ok to use files
and case notes to make these updates in
the Annual Assessment.
Health Insurance
Domestic Violence
 The accuracy of this data element is entirely
dependent on the situation.
 What do you do when a client answered YES at
an earlier entry, and answers NO at a current
entry interview?
 People enter NO, then enter “Client doesn’t
know”
 The purpose of this element is to determine if
the client needs protection: a unique personality
trait.
 The data standards have a specific interview
protocol, unique to this element.
 Also unique data entry protocols.
Is the Client Permanently Housed?
• Personality: Answered in error
all over our HMIS.
• Too much data rather than not
enough.
• A cut and dried answer that is
easy to determine.
• It seems like it should be more
important than it is.
• A misunderstood variable. It does
not serve the purpose people
think it does.
Error(s)
• Error and analysis are two sides of
the same concept.
• Individual errors are addressed.
• I don’t know if Confidence Intervals
and Standard Deviations are
accounted for in ART Reports.
• Our data is used for eligibility,
funding, reporting and
understanding, but the limitations
of the data are rarely discussed.
• A calculation for average Length of
Time Homeless needs a Confidence
Interval.
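
As a rough illustration of what reporting an average with a Confidence Interval could look like, here is a sketch using a normal approximation. The month values are invented, the function name is hypothetical, and `z=1.96` (an approximate 95% interval) assumes a reasonably large sample; small or heavily skewed samples would call for a different method.

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate spread")
    mean = statistics.mean(values)
    # Standard error: sample standard deviation divided by sqrt(n).
    std_err = statistics.stdev(values) / math.sqrt(n)
    return mean, mean - z * std_err, mean + z * std_err

# Made-up months homeless for eight clients.
months = [2, 4, 4, 4, 5, 5, 7, 9]
avg, low, high = mean_with_ci(months)
```

Reporting `avg` together with `low` and `high` makes the uncertainty visible. It does not correct for interview or data entry error, which is a separate problem.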
Validity and Reliability
Motivation &
Data Quality
 Winning buy-in from end users is probably the biggest challenge.
 Federal and local funding depend on data.
 Eligibility depends on data.
 Providing the right services depends on data.
 Entering high quality data makes you a “good citizen of the HMIS”.
 HMIS gives voice to people who have no voice.
 HMIS tells their story, and people really do listen.
References
Bunzli, L. (2016). The Definition of Chronic Homelessness: Applications and Implications for Policy and Practice Part II:
Mixed Methods Study.
Cohn, D. (2015). Census Considers New Approach to Asking About Race - By Not Using the Term at All. Pew Research
Center. Retrieved from
http://www.pewresearch.org/fact-tank/2015/06/18/census-considers-new-approach-to-asking-about-race-by-not-using-the-term-at-all/
HUD. (2014). HMIS Data Collection Template for Project ENTRY – CoC Program.
HUD. (2016). 2014 HMIS Data Standards Data Manual. HUD.
Matt Warfield, B. D. (2016). How has the ACA Medicaid Expansion Affected Providers Serving the Homeless Population:
Analysis of Coverage, Revenues, and Costs. Retrieved from
http://kff.org/medicaid/issue-brief/how-has-the-aca-medicaid-expansion-affected-providers-serving-the-homeless-population-analysis-of-coverage-revenues-and-costs/
Prewitt, K. (2013, August 21). Fix the Census’ Archaic Racial Categories. The New York Times. Retrieved from
http://www.nytimes.com/2013/08/22/opinion/fix-the-census-archaic-racial-categories.html?_r=0
Image Credits
Adams, S. Dilbert.com. Retrieved from http://dilbert.com/terms/.
Hargreaves, R. Mr. Men. Fabbri.
Sesame Street.
Questions?

Editor's Notes

  • #2 Thank you all for joining me. Today I’m going to talk about data collection and quality concepts, using our redesigned HUD Entry Questionnaire as an example, and share insights gained from working with case managers.
  • #3 This picture was taken the night we decided to break down the Occupy Providence encampment. The city agreed to open a day center for homeless people if we left the park. I didn’t camp, but I spent a lot of time at the park. I often cooked and ate with fellow occupiers. “Constituents” have been guests in my house and sat at my table. Antiques is a funny business. It transcends class. My colleagues in the antiques business range from extremely wealthy people looking to have a losing business for the sake of reducing their tax burden, to people who literally live out of their van and hop from flea market to flea market. It’s a fringe culture where people who don’t function well in standard jobs and mainstream society have found a place. I got excited about data collection and analysis when I enrolled in an online university for Medical Billing and Coding. The University was a scam, and while I learned a lot, it did not get me a job and out of the antiques business. I wanted my resume to look good, so I went to Brown for Public Health. I’ll share my $60,000 education with you for free. I learned all about data collection with a view to publishing papers that other statisticians consider valid. It’s a very high data quality standard.
  • #4 A good interviewer does not necessarily need to be familiar with all these concepts, but needs training and support that incorporates these concepts. Collecting data on homeless people is notoriously hard. It’s no surprise that street outreach has the biggest data quality challenges in our CoC. Transient people are going to have highly variable data. In public health they train you to ask for the most sensitive information towards the end of the questionnaire. This is how it’s done with the HUD Entry Template, but there is no training to prepare people. Try framing the questionnaire: “I need to ask you some questions so we know how to serve you best. Just answer honestly, and not according to what you think the ‘right answer’ is. Some of these questions may seem very personal.” In our CoC, the template questions appear in a different order than you enter them in HMIS. This affects data quality. Our data standards cover coding to a certain degree, but not uniformly. I refer to the data standards daily, but I doubt our case managers do. Coding is judgment. I will talk extensively about coding in this presentation. Skip patterns: Is there a reason for missing data, or is it random? Our trainings reveal how our case managers struggle with even simpler concepts. The chronic homeless questions are hard to ask. They are often skipped. It’s not a random omission. While the Yes or No questions are supposed to be asked independently, I have no idea how to get the right information without reading the HUD Verification lists of choices. The living situation questions require probing. There is no way to get the data without having a conversation and a skillful “look back”. While data collection points are laid out in the data standards manual, they’re not explicit as you are using HMIS. The face you make when asking a question can influence the answer you get.
Interviewer bias: “You don’t have HIV, do you?” In our CoC, some shelters frustrate the clients so much that by the time they see a case manager, they’re probably not in the mood to talk. The saving grace of HMIS data is the size of it. Having a large “sample” improves validity. I’m going to be talking about kinds of bias, for example instruction bias. When the instructions lack clarity, your data skews according to the individuals collecting the data. All of these concepts help pinpoint the source of error; however, error itself is often discussed but rarely quantified. I struggle with the way we report on HMIS data without discussing standard deviation or confidence intervals. I make copious use of cartoons when I send out my “error-grams”. I try to make people smile. I’m not sure if it’s working.
  • #5 In order to obtain uniform data, it is important to understand that non-uniform personalities are at play. Even the data types have personalities. There’s a rumor in our CoC that a client needs to be receiving SSDI in order to select YES for disability. That’s a personality at work.
  • #6 Name and name data quality are two completely different kinds of information. One is a fact about the client. The other is a judgment by the person entering data. Two totally different processes. In my first month on the job, my colleague Emalee told me she was rewriting the Entry form to make it easier to use and fit on fewer pages (4 v. 11). I didn’t exactly grab it out of her hands, but I got pretty excited. By this time, I had written and launched 2 full scale surveys, one that went through IRB review, collaborated with my boss, a Brown Faculty Researcher, on 4 surveys, and analyzed the results of 2 surveys. While surveys are different from databases, the principles of questionnaire design are applicable. Understanding how data quality and completeness can interfere with analysis also helps. The data standards show example conversations, but I think specific, in-person training on probing, when to list answer choices, and how to frame a “look back” are important skills to teach. Theoretically, having an interviewer gather the information rather than having a client fill out the form should improve data quality, IF the interviewers are all asking the questions uniformly, and entering the data uniformly. Now that the data standards have changed again, we need to rewrite this. I consider it a blessing because I really wanted a period to experiment with the questionnaire and see how our case managers react to it. Trainings on using this questionnaire have revealed a lot of ways that we can continue to improve it.
  • #7 Believe it or not, there are data quality concerns with this field.
  • #8 First Strategy: Put all questions that should be asked using words into words. Second Strategy: Write out instructions for filling in data elements that are not asked of clients. You can’t assume that case managers are not asking “What is your name data quality”. The “personality” of a data element comes out in the kinds of errors it leads to. Our case managers were not sure if you could correct the spelling of the client’s name without creating a duplicate client. Any time a data field is programmed to behave differently, that’s a personality trait of that variable.
  • #9 “Skip pattern”: a concept that everyone is probably familiar with, even if you haven’t called it that. In the 0213, I noticed nearly every client in one ES had #error. I called them up, and sure enough, the missing data was not random omission, but intentional omission. It was their policy to collect the last 4 digits. I looked up the data standard, emailed the funding manager and the agency’s lead HMIS user, and corrected the problem. Our revised questionnaire includes instructions for entering SSN data quality, just like the name data quality. I think there’s a skip pattern with data quality descriptors. People just don’t understand them so they skip them. My colleague Don pointed out something interesting: I had thought that our clients, having little or no income, would not be targets for identity theft. Not the case: they are ideal targets for identity theft due to lack of credit history and no addresses. They might never find out if they were victims of identity theft. Also, when I send out error reports, I add sticky notes to the PDFs. The case managers find this more helpful than just sending an error report. I explain the errors in plain English, let them know when they are not responsible for the errors on their report, and offer positive reinforcement when I see good data quality.
  • #11 Our case managers struggle with the interplay between accepting self-report, the onus of making sure HMIS data reflects the truth, and knowing that things will have to get verified later. It seems our group skews toward not wanting to enter anything that is not documented. Another personality issue… There’s no way of getting this information once the client is gone. Does anyone complete missing data by verifying with VA?
  • #12 Everyone’s least favorite question. This is a headline from a 2013 NY Times Opinion Article by Kenneth Prewitt, former director of the Census Bureau: “Remarkably, a discredited relic of 18th-century science, the ‘five races of mankind,’ lives on in the 21st century.” We could ask the question a different way, but it would make the data difficult to analyze, and would not compare well with any historical data. This question has lots of personality. The 2020 Census might have different questions. They started asking alternative race questions for a subset of the US population in 2010: the Race and Hispanic Origin Alternative Questionnaire Experiment (AQE). People from the Mideast and North Africa are “white”? AHAR is not accepting missing race data. What do you do? Bossy variable.
  • #13 This is how we saved space on our revised entry questionnaire. The 2000 Census was the first to offer respondents more than one race category. End users don’t like the secondary race requirement. What do we do if people select 5 race categories?
  • #14 I accept the ethnicity rationale, but where are ALL the ethnicities? This is one of 7 data elements that the data standards recommend verifying every time. Oddly, when I started my job, only about 50 of 3,682 active clients had missing ethnicity data. For such a sticky subject, it seems like there should be more of a problem. “Additionally, many civil rights advocates have urged the Census Bureau and OMB to create a new, separate ethnicity for Americans of Middle Eastern and North African (MENA) descent, who currently are defined as “White” in the OMB Standards. The AQE did not test a new MENA category, but the AQE focus groups revealed widespread agreement that the classification of people of Middle Eastern and North African origin as White was inappropriate. The bureau now plans to test a MENA category during field tests for the 2020 census scheduled for 2015.”
  • #15 I don’t interview clients, but in my work as a research assistant, I had to ask potential research participants this question. It was awkward.
  • #16 We’re finding there is a lot of confusion around the purpose of HMIS data. I think it might distort the data when case managers are always thinking about eligibility. HMIS supports eligibility, but its purpose is not to determine eligibility. Or is it? Regarding this question, I often give my “HMIS is a difficult system to use because it allows you to make mistakes” speech. I explain that the system has to be flexible enough to be applicable across service settings and client types, and to roll with the data standards changes. The need for flexibility trumps the need for lots of validation. In my first pass at data quality when I was hired in March, I noticed this is often skipped. As I learned to run the 0260s, I saw that it is often wrong. I am guessing that people don’t understand the purpose of this when the HUD verification covers the same information. It’s hard to get buy-in from case managers with a problem like this. One case manager thought that the client had to be receiving SSI or SSDI to select yes, instead of “noted as a potential yes”. Our case managers have asked for a list of qualifying disabilities. How to ask about disability? Google failed me. Do you have trouble?
  • #17 I wish the data elements that need verification and updates at every entry could be flagged. This is often skipped at the most important data collection point, the first ES stay. There are 10 pages of guidelines in v5.1 of the data standards on Living Situation, including decision-making trees, interview strategies, and coding guidelines. There are new data standards that we haven’t implemented yet, and I’m looking forward to them. We can look forward to the “interim housing” option. “Status Documented: This question does not require documentation of the responses. It does not replace documentation requirements of chronic homelessness for projects that require such documentation.” This is not a data standard. This field is entirely unlike the rest of the HMIS fields. Its purpose is entirely utilitarian. It’s understood that it is often wrong. This is the only question in HMIS where the answers themselves have sub-categories, which is interesting from an analysis perspective.
  • #18 I wish there was a way of connecting the LOT and the previous place, or a data quality report that flagged when they were not updated at the same time. This snip is from our CoC’s Data Quality Plan, and it’s this statement that troubles me. I can often see from the HMIS that the previous place is another shelter or whatever, but that these fields haven’t been updated since 2011…
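The kind of data quality check wished for in #18 could look something like the rough Python sketch below. It is only an illustration: the record layout and the field names (`client_id`, `previous_place_updated`, `lot_updated`) are hypothetical and do not come from any actual HMIS export, and the sample records are made up.

```python
from datetime import date

def flag_out_of_sync(records):
    """Return the client IDs whose previous place and LOT fields
    were last updated on different dates."""
    return [
        r["client_id"]
        for r in records
        if r["previous_place_updated"] != r["lot_updated"]
    ]

# Made-up sample records, for illustration only.
sample_records = [
    {"client_id": "A", "previous_place_updated": date(2016, 3, 1), "lot_updated": date(2016, 3, 1)},
    {"client_id": "B", "previous_place_updated": date(2016, 3, 1), "lot_updated": date(2011, 6, 15)},
    {"client_id": "C", "previous_place_updated": date(2011, 6, 15), "lot_updated": date(2011, 6, 15)},
]

flagged = flag_out_of_sync(sample_records)
print(flagged)
```

Client C shows the limit of this check: both fields agree with each other, but neither has been touched since 2011. Catching that case would need a second comparison against the current entry date.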
  • #19 The skip pattern for this variable is that it doesn’t make sense. What household? This information is something that is ascertained through questions about whether or not the client has a family. Using the word “household” in any questioning strategy is likely to confuse the client. This is more of a judgment and coding situation. More personality traits: Apparently it has been programmed to behave differently, so we are seeing new anomalies in HMIS. And when to exit a client from a household is a gray area. Last 2 bullet points.
  • #20 I went back in and added a bunch of YES answers and dates on the second day they were in ES.
  • #21 I found an error in the new data standards: “A break is considered 6 or less consecutive nights not residing in a place not meant for human habitation, in shelter or in a Safe Haven. The look back time would not be broken by a stay less than 7 consecutive nights; or..” In all fairness, we should have said “separated by at least 7 NIGHTS” on our questionnaire. I’ve made a number of very public errors myself since I started this job. I try to get some mileage out of them by saying “I specialize in errors!” To err is human.
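The “separated by at least 7 nights” reading from #21 is easy to get off by one, so here is a small Python sketch of how it could be counted. It assumes that reading of the rule (a break is 7 or more consecutive nights away from the street, ES, or Safe Haven), and it assumes the input is simply a list of dates for nights spent homeless. The function name and the sample dates are made up for illustration.

```python
from datetime import date

BREAK_NIGHTS = 7  # assumption: a break is at least 7 consecutive nights away

def count_occasions(nights):
    """Count separate occasions of homelessness in a list of night dates."""
    nights = sorted(set(nights))
    if not nights:
        return 0
    occasions = 1
    for prev, cur in zip(nights, nights[1:]):
        # Nights strictly between two recorded nights were spent elsewhere.
        nights_away = (cur - prev).days - 1
        if nights_away >= BREAK_NIGHTS:
            occasions += 1
    return occasions

# Made-up example: 6 nights away (Jan 3 to Jan 8) is not a break,
# 7 nights away (Jan 10 to Jan 16) is a break.
example_nights = [date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 9), date(2016, 1, 17)]
print(count_occasions(example_nights))  # prints 2
```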
  • #22 Actually, I couldn’t find the 1 day = 1 month rule in the 5.1 data standards. The look back protocol for this is missing as well. Implications: Not only are we working with a cumbersome definition of chronic homelessness, but there is insufficient guidance on how to gather this complicated information. Of course, I haven’t seen all the webinars… Also, the Count is a great example of how ethnicity is different from race. Ethnicity: Eastern European. Race: Puppet.
  • #23 The data standards recommend prompting with answer choices, and I think most case managers do this, either instinctively or because they were trained to do it. Our PH providers are not updating this in HMIS regularly. Frequent updates, every time they know a change has taken place, can ease the Annual Assessment burden. Response bias: clients give the answer they think the interviewer wants to hear.
  • #24 Commonly in conflict with the YES or NO question. Also, we don’t have all these benefits in RI. It’s hard to encourage case managers to read the list when they know that half the list doesn’t apply. They are pressed for time. The Section 8 question is just weird. I have never seen it marked YES in our CoC. What’s behind that? Section 8 clients are housed and not entered into HMIS? Or the fact that a case manager succeeds in getting them a voucher is recorded elsewhere in HMIS? The ratio of clients to vouchers is so huge that it’s unlikely we would detect this unicorn?
  • #25 People don’t know what type of insurance they’re receiving. In RI, all State Health Insurance is Medicaid. The HUD Verification is commonly in conflict with the YES or NO question. For this information to be valid, we need better definitions. (I think Eric’s numbers are suffering from definition bias.) Health Insurance purchased through the State Exchange is Private Pay. I spent the fall of 2015 reading every single Kaiser Family Foundation paper on Medicaid Expansion. Cool chart showing that homeless people in non-expansion states are disproportionately affected. These data are extremely important. All eyes on Medicaid.
  • #26 Read from slides. Data Standard: “Projects should be especially sensitive to the collection of domestic violence information from clients and should implement appropriate interview protocols to protect client privacy and safety such as: asking this question in a private location and not in the presence of a romantic partner; delaying all entry of data about clients identified with a recent history of domestic violence; or choosing not to disclose data about clients with a history of domestic violence to other homeless projects.”
  • #27 I often wonder who the intended audience is for the data standards. “This manual is designed for CoCs, HMIS Lead Agencies, HMIS System Administrators, and HMIS Users to help them understand the data elements that are required in an HMIS to meet participation and reporting requirements established by HUD and the federal partners.” I would argue that those are very different audiences, and they need the information presented in different ways. I do a lot of “translating” the data standards, and promulgating. Excuse me, do you have a minute to talk about the data standards?
  • #28 Errors are my friend. They’re my job security. But error as a statistical concept is also a friend. These are confidence intervals. Explain the whiskers. Data is great, but you always need to understand the limitations of the data. … LOT Homeless. The confidence interval will tell you how much the 10 year outlier is skewing the average. You can have perfect, error-free data entry, and still come out with a number that might not mean what you think it means (about the mean). Important decisions are based on these data.
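To make the point in #28 about the 10 year outlier concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the LOT values (in months) are made up, not real HMIS data, and the interval uses a plain normal approximation (z = 1.96), which is rough for a sample this small; a t-based interval would be wider.

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Return the mean and an approximate 95% confidence interval for it."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    return m, (m - z * se, m + z * se)

# Made-up LOT Homeless values in months; the last one is a 10 year stay.
lot_months = [2, 3, 1, 4, 2, 3, 120]
without_outlier = lot_months[:-1]

mean_all, ci_all = mean_with_ci(lot_months)
mean_trim, ci_trim = mean_with_ci(without_outlier)

print("with outlier:   ", mean_all, ci_all, "median", statistics.median(lot_months))
print("without outlier:", mean_trim, ci_trim)
```

With the outlier included, the mean lands far above the median and the whiskers get very wide. That is the warning sign that the average alone is not describing a typical client.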
  • #29 Reliability is the degree to which an assessment tool produces stable and consistent results. Validity refers to how well a test measures what it is purported to measure. This 2 x 2 table was created by a friend of mine, a graduate of my program. She worked with HMIS data for her thesis, and compared the actual stories of shelter clients versus HMIS Chronic determinations. While this is a small sample size, it could be indicative of both a validity and a reliability issue. With so much depending on these data, I think it’s crucial to use all the tools available to improve data quality.