Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing



A deep dive into resume and LinkedIn sourcing and matching solutions claiming to use artificial intelligence, semantic search, and NLP, including how they work, their pros, cons, and limitations, and examples of what sourcers and recruiters can do that even the most advanced automated search and match algorithms can't do. Topics covered include human capital data information retrieval and analysis (HCDIR & A), Boolean and extended Boolean, semantic search, dynamic inference, dark matter resumes and social network profiles, and what I believe to be the ideal resume search and matching solution.

Published in: Technology, Business

  • Lost and Found by metrognome0 via Flickr/creative commons
  • Haystack image
  • Concept = meaningful combination of words
  • Decrease content and speak to?
  • Hierarchical, parent-child, one-way: ontologies apply a larger variety of relation types/categorization. Practice and science of classification. Healthcare - hospital
  • Source: Wikipedia. Ludwig Wittgenstein’s theories about how words are defined by context
  • Query clouds
  • Scientific discipline
  • A priori: independent of experience – book learning vs. OJT. A posteriori: dependent on experience or empirical evidence
  • Some people = good enough for your organization? Find the best, and ALL of the best. Means the same candidates are also missed
  • Statistical NLP - Automatically analyze and define relationships between words and concepts. Relevant: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user
  • http://www.flickr.com/photos/stickergiant/4793776078/sizes/o/in/photostream/
  • Infer = to derive as a conclusion from facts or premises
  • More accurately, created intelligence
  • B.S., summa cum laude, from Harvard University. Ph.D., University of California, Berkeley
  • Those two problems are at the present time largely unsolved. Now, I think, however, that within a few decades, we should be able to create robots as smart as mice, maybe dogs and cats.
  • Sourcing requires creativity, interpretive analysis, judgment, and common sense – a natural understanding based on experience
  • Dynamic – continuous and productive activity or change. Static – showing little/no change
  • Dynamic Inference = Infer = to derive as a conclusion from facts or premises
  • Dynamic – continuous and productive activity or change
  • Aware: having or showing realization, perception, or knowledge. What if? How else? Curious…
  • The system gives you what it thinks you wanted, and you are able to tell the system what you wanted. A Knowledge Engineering system that integrates human knowledge into computer systems in order to solve complex problems normally requiring a high level of human expertise
  • An expert system is computer software that attempts to mimic the reasoning of a human specialist
  • Kaizen and the Toyota Way
  • Financial data/BI - Tons of software – financial ERP, business intelligence apps – they STILL require people to analyze and interpret. Applications can’t truly analyze/interpret

    1. 1. Talent Sourcing and Matching: Artificial Intelligence & Black Box Semantic Search vs. Human Cognition & Sourcing. Glen Cathey, www.linkedin.com/in/glencathey, www.booleanblackbelt.com
    2. 2. What’s the big deal anyway? Some people believe resume, LinkedIn, and Internet sourcing is so easy that sourcing is either dying or dead, or can be performed for $6/hour.
    3. 3. Resume and LinkedIn sourcing appears simple and easy on the surface; however, it is deceptively difficult and complex.
    4. 4. Anyone can find candidates because all searches "work" as long as they are syntactically correct. That doesn’t mean the searches are finding all of the best candidates! People make assumptions when creating searches. Every time an assumption is made, there is room for error, and you unknowingly miss and/or eliminate results!
    5. 5. No single search can return all potentially qualified people. Every search both includes some qualified people and excludes some qualified people. Some of the best people have resumes or social profiles that may not appear to be obvious or strong matches to your needs.
    6. 6. People cannot effectively be reduced to and represented by a text-based document. Job seekers are NOT professional resume or LinkedIn profile writers. Most people still believe that shorter, more concise resumes and social profiles are better. This means they are removing data/info from their resumes, which can then no longer be searched for!
    7. 7. No one mentions every skill or responsibility they’ve had, nor describes every environment they’ve ever worked in. There are many ways of expressing the same skills and experience. Employers often don’t use the same job titles for the same job functions.
    8. 8. People don’t create their resumes and LinkedIn profiles thinking about how you will search for them. Sometimes people don’t even use correct terminology. Anyone easy for you to find is easy for other recruiters to find = no competitive advantage!
    9. 9. In addition to the people you do find, there are Dark Matter results: people that exist to be retrieved, but can’t be found through standard, direct, or obvious methods. I estimate Dark Matter to be at least 50% of each source searched.
    10. 10. Finding some people is easy…
    11. 11. Finding all of the best people IS NOT!
    12. 12. “When every business has free and ubiquitous data, the ability to understand it and extract value from it becomes the complimentary scarce factor. It leads to intelligence, and the intelligent business is the successful business, regardless of its size. Data is the sword of the 21st century, those who wield it well, the Samurai.” - Jonathan Rosenberg, SVP, Product Management @ Google
    13. 13. Stop wasting time trying to create difficult and complex Boolean search strings. Let "intelligent search and match applications" do the work for you. A single query will give you the results you need - no more re-querying, no more waste of time!
    14. 14. Understand titles, skills, and concepts. Automatically analyze and define relationships between words and concepts. Intuit and infer experience by context.
    15. 15. Perform pattern recognition. Employ semantic search. Perform fuzzy matching.
    16. 16. How do they really work?
    17. 17. Intuit experience by context = resume parsing. Parsing breaks down and extracts resume information: most recent title and employer; skills and experience; years of experience (overall, in each position, with specific skills, in management, etc.); education.
    18. 18. Parsing enables structured, fielded search. Search by: most recent title, recent experience, years of experience, etc.
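To make the idea of parsing plus fielded search concrete, here is a minimal Python sketch. It assumes a parser has already produced structured records; the `ParsedResume` fields and the `fielded_search` helper are hypothetical names invented for illustration, not the schema or API of any real matching product.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedResume:
    # Hypothetical fields a resume parser might extract (illustrative only).
    name: str
    recent_title: str
    years_experience: float
    skills: set = field(default_factory=set)


def fielded_search(resumes, title_contains=None, min_years=0, required_skills=()):
    """Return the resumes whose structured fields satisfy every criterion given."""
    wanted = {s.lower() for s in required_skills}
    hits = []
    for resume in resumes:
        if title_contains and title_contains.lower() not in resume.recent_title.lower():
            continue  # most recent title does not match
        if resume.years_experience < min_years:
            continue  # not enough overall experience
        if not wanted <= {s.lower() for s in resume.skills}:
            continue  # at least one required skill was not parsed out
        hits.append(resume)
    return hits


resumes = [
    ParsedResume("A", "Senior Software Engineer", 8, {"Java", "SQL"}),
    ParsedResume("B", "Tax Manager", 12, {"Tax", "Excel"}),
    ParsedResume("C", "Software Engineer", 2, {"Java"}),
]
matches = fielded_search(
    resumes, title_contains="software engineer", min_years=5, required_skills=["java"]
)
print([r.name for r in matches])
```

Note that candidate C is dropped purely on the parsed years-of-experience value, which illustrates how a fielded search is only as good as the data the parser managed to extract.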
    19. 19. Well-developed ontologies and taxonomies. Hierarchical.
    20. 20. Synonymous terms: Programmer, Software Engineer, Developer; Tax Manager, Manager of Tax; CSR, Customer Service Representative; Ruby on Rails, RoR, Rails, Ruby; Oracle Financials, Oracle Applications, e-Business Suite, etc.
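A taxonomy of synonymous terms can be applied as simple query expansion. The sketch below is an assumption about how such expansion might work in principle, using a few of the synonym groups from the slide; `SYNONYM_GROUPS`, `expand`, and `build_query` are hypothetical names, and real products use far larger, curated taxonomies.

```python
# A tiny hand-built taxonomy using synonym groups from the slide (illustrative only).
SYNONYM_GROUPS = [
    ["Programmer", "Software Engineer", "Developer"],
    ["Tax Manager", "Manager of Tax"],
    ["CSR", "Customer Service Representative"],
    ["Ruby on Rails", "RoR", "Rails", "Ruby"],
]


def quote(term):
    """Wrap multi-word terms in quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand(term):
    """Replace a term with an OR clause of its synonym group, if it has one."""
    for group in SYNONYM_GROUPS:
        if term.lower() in (t.lower() for t in group):
            return "(" + " OR ".join(quote(t) for t in group) + ")"
    return quote(term)  # unknown terms pass through unchanged


def build_query(terms):
    """AND together the expanded form of every search term."""
    return " AND ".join(expand(t) for t in terms)


query = build_query(["developer", "RoR"])
print(query)
```

A term missing from the taxonomy is searched literally, which is one way a static, pre-built taxonomy can quietly limit results.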
    21. 21. Some applications use complex statistical methods in an attempt to "understand" language and the relationships between words. Example: Google Distance.
    22. 22.  Keywords with the same or similar meanings in a natural language sense tend to be "close" in units of Google distance, while words with dissimilar meanings tend to be farther apart
    23. 23.  A measure of semantic interrelatedness derived from the number of hits returned by the Google search engine for a given set of keywords
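The hit-count measure described above is usually published as the normalized Google distance. The Python sketch below implements that commonly cited formula; the hit counts and the total page count are made-up numbers for illustration, not real search engine results, and the function name is an assumption.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Distance between two terms based on search hit counts.

    fx, fy: hits for each term alone; fxy: hits for both terms together;
    n: assumed total number of indexed pages.
    """
    if min(fx, fy, fxy) <= 0:
        return math.inf  # terms that never co-occur are treated as unrelated
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


TOTAL_PAGES = 10_000_000_000  # hypothetical index size

# Hypothetical counts: two terms that often appear together...
related = normalized_google_distance(2_000_000, 1_500_000, 900_000, TOTAL_PAGES)
# ...versus two terms that rarely appear on the same page.
unrelated = normalized_google_distance(2_000_000, 1_800_000, 4_000, TOTAL_PAGES)

print(round(related, 3), round(unrelated, 3))
```

Terms that co-occur on most of the pages where either appears score near zero, while terms that rarely share a page score much higher. The measure captures co-occurrence statistics only, which is why "close" terms are not guaranteed to be relevant to a particular search.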
    24. 24. Non-interactive and unsupervised machine learning technique seeking to automatically analyze and define relationships between words and concepts. Clustering is a common technique for statistical data analysis.
    25. 25. The design and development of algorithms that allow computers to evolve behaviors based on empirical data. A major focus is to automatically learn to recognize complex patterns and make intelligent decisions and classifications based on data.
    26. 26. Aims to classify data (patterns) in resumes based either on a priori knowledge or on statistical information extracted from the patterns. A priori: independent of experience. Example of pattern recognition: spam filters.
    27. 27. Finds approximate matches to a pattern in a string. Useful for word and phrase variations and misspellings.
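One common way to find approximate matches is edit distance (Levenshtein distance), which counts the single-character insertions, deletions, and substitutions needed to turn one word into another. The sketch below is a generic illustration of that technique, not the method of any specific matching product; `fuzzy_find` and the sample text are invented for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete from a
                current[j - 1] + 1,                    # insert into a
                previous[j - 1] + (char_a != char_b),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_find(term, text, max_edits=1):
    """Return the words in text within max_edits edits of term."""
    term = term.lower()
    return [w for w in text.lower().split() if edit_distance(term, w) <= max_edits]


text = "Senior devloper with project managment and developer relations experience"
found = fuzzy_find("developer", text)
print(found)  # ['devloper', 'developer']
```

Raising `max_edits` catches more misspellings but also pulls in unrelated words, so the tolerance is a trade-off between recall and noise.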
    28. 28. Reduce time to find relevant matches. Can lessen or eliminate the need for recruiters to have deep and specialized knowledge within an industry or skill set. Reduce and even eliminate time spent on research.
    29. 29. Go beyond literal, identical lexical matching. Level the playing field. Can make an inexperienced person look like a sourcing wizard. Good for teams with low search/sourcing capability.
    30. 30. Work well for positions where titles effectively identify matches and where there is a low volume and variety of keywords. Good for a high volume of unchanging hiring needs.
    31. 31. Removes thought from the talent identification and decision-making process. Danger of eliminating the need for recruiters to understand what they’re searching for. Information technology, healthcare, and other sectors/verticals can pose serious challenges to matching apps.
    32. 32. Apps find some people, bury or eliminate others. Is finding some people good enough for your organization? Shouldn’t your goal be to find ALL of the BEST people?
    33. 33. Matching apps level the playing field. People from different companies using the same solution will both find and miss the same people. Competitors using the same search and match solution will have no competitive advantage over each other!
    34. 34. The belief that one search finds all of the best candidates is intrinsically flawed and simply not based in reality. Top talent isn’t represented by what a search engine "thinks" has the best resume or profile. AI and semantic search apps favor keyword-rich resumes and profiles.
    35. 35. Keyword-poor resumes and profiles may in fact represent better talent than keyword-rich resumes and profiles. It’s not just a matter of keyword frequency or even keyword presence! AI-powered search & match applications can only return results that explicitly mention required keywords and their variants.
    36. 36. Many people have skills and experience that are simply not mentioned anywhere in their resumes! These people are the Dark Matter of databases, ATSs, and social networks: they exist but cannot be found via direct search/match methods – AI or otherwise!
    37. 37. Pre-built taxonomies are static, limited in their completeness, and must be continually updated in order to stay relevant and effective. Taxonomies are only as good as who created them. Applications can only match on what’s present and cannot “think outside of the box.”
    38. 38. Semantic clustering and NLP applications can retrieve related search terms, but that does not mean they are relevant for your need!
    39. 39. Match primarily on titles and skill terms. True match is at the level of role, responsibilities, environment, etc. Some applications rank results favoring recent employment duration. Is someone who has been in their current company for 5 years really “better” than someone who has been with their current company for 2 years?
    40. 40. Apps don’t "know" what you’re looking for or what’s the best match for your company. Apps are not and cannot be "aware" of people that were excluded from their search results. Applications are not truly intelligent – they do not actually "know" or "understand" the meaning of titles and terms.
    41. 41. The ability to learn or understand or to deal with new or trying situations. The ability to apply knowledge to manipulate one’s environment or to think abstractly. REASON; the power of comprehending and inferring. Source: Merriam-Webster.com
    42. 42. The capability of a machine to imitate intelligent human behavior. Artificial = humanly contrived. Source: Merriam-Webster.com
    43. 43. Dr. Michio Kaku: theoretical physicist and futurist specializing in string field theory; Harvard grad (summa cum laude); Berkeley Ph.D.; currently working on completing Einstein’s dream of a unified field theory. What are his thoughts on AI?
    44. 44.  “…pattern recognition and common sense are the two most difficult, unsolved problems in artificial intelligence theory. Pattern recognition means the ability to see, hear, and to understand what you are seeing and understand what you are hearing. Common sense means your ability to make sense out of the world, which even children can perform.” - Dr. Michio Kaku
    45. 45. Dr. Michio Kaku believes the job market of the future will be “dominated by jobs involving common sense (e.g. leadership, judgment, entertainment, art, analysis, creativity) and pattern recognition (e.g. vision and non-repetitive jobs). Jobs like brokers, tellers, agents, low level accountants and jobs involving inventory and repetition will be eliminated.”
    46. 46. That’s good news for sourcers and recruiters who perform sourcing! Sourcing requires judgment, creativity, analysis, common sense, and pattern recognition (instantly making sense of human capital data). Sourcers of the future will be human capital data analysts who are experts in HCDIR & A – Human Capital Data Information Retrieval and Analysis.
    47. 47. Matching apps do not have the dynamic ability to learn, understand, and instantly relate new concepts through direct experience and observation. They depend on taxonomies, statistical models, or semantic clustering to “understand” relationships and concepts.
    48. 48.  The human mind naturally organizes its knowledge of the world, instantly relating new terms and concepts and judging their relevance
    49. 49. Example: A sourcer who is completely unfamiliar with “infection control” can instantly recognize non-highlighted but related and relevant terms and incorporate them into new and improved searches. Sample resume entry: Carolinas HealthCare System, Charlotte, NC; Infection Preventionist, 1997-present. Responsible for all aspects of infection prevention and control for an 800+ bed hospital. Uses science-based research to perform infection prevention. Conducts all aspects of surveillance, data analysis, and presents data to interdisciplinary teams, including the Infection Control Committee.
    50. 50. Human sourcers can learn from research and search results, dynamically and adaptively identifying related and relevant search terms and incorporating them into successive searches to continuously refine and improve searches for more relevant results.
    51. 51. For example, if a recruiter were sourcing for a position that required a skill they were unfamiliar with (e.g., “Cockburn Use Case Methodology”), they could quickly perform research to learn more about it. In the next slide, you will see a screen capture of such research.
    52. 52. From this quick research, the recruiter would be able to determine that most people would not explicitly mention “Cockburn Use Case Methodology,” let alone “Cockburn” (which the research revealed is pronounced “Co-burn”) – thus they would not include the term in their searches.
    53. 53.  Instead, it would be a better idea to search for candidates that mention experience with Agile methodology and simply call and ask them if they have experience with using Cockburn’s use case methodology (which many likely would)
    54. 54. Applications using Natural Language Processing do not truly understand human language. They use complex statistical methods to resolve the many difficulties associated with making sense of human language. NLP experts admit that to computers, even simple sentences can be highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses.
    55. 55. Humans effortlessly and automatically process and understand language, regardless of sentence length or complexity, ambiguity, incorrect grammar, etc. We can udnretsnad any msseed up stnecene as lnog as the lsat and frsit lteetrs of wdros are in the crrcoet plaecs.
    56. 56. Human sourcers and recruiters can deduce potential experience, even in the absence of information (not explicitly mentioned in the resume/profile). Applications can only work with what’s actually mentioned in a resume – if it’s not explicitly mentioned, they can’t match on it.
    57. 57. Applications are not aware that many of the best people have average resumes. Applications are not aware of the people their algorithms bury in results or eliminate entirely. Human sourcers can become aware of and specifically target this Dark Matter.
    58. 58. How can you target resumes and LinkedIn profiles that exist but that your searches can’t and don’t retrieve?
    59. 59.  Well developed taxonomies, semantically generated query clouds and matching algorithms can help greatly with automatically searching for and matching on synonymous terms, related words, word variants, misspellings, etc.
    60. 60. Think + Perform Research. For each keyword, phrase, or title you are thinking of using in your search, realize: 1. Not everyone will explicitly mention what you think they would or should mention in their resume/profile. 2. There are many different and often unexpected ways of expressing the same skills and experience.
    61. 61. Global Experience: What search terms might you use if you are looking for people with global experience? How many can you think of off the top of your head?
    62. 62. In a few minutes of exploratory research, a sourcer can come up with a volume of related and relevant terms: Global, international, foreign, multinational, worldwide; Europe, European, EU, EMEA, Asia, Asia-Pac, Pacific Rim, South America, Latin America, Americas, CALA (Caribbean and Latin America), Middle East; Canada, Japan, China, Russia, India, UK, United Kingdom, etc.; Countries, Offshore, Overseas.
    63. 63. How can you target people whom your searches do retrieve, but whose results are buried (ranked poorly, or "too many" results to be reviewed) so that you never find them?
    64. 64. Search and matching software powered by artificial intelligence / black box semantic search doesn’t have a solution to this challenge. One of the major claims AI/semantic search applications make is that their solutions can find the "right people" in one search.
    65. 65. However, a single search strategy is intrinsically flawed and limited: no single search can find all qualified candidates, and each search both includes and excludes qualified people. I am not aware of any search & match software that allows for successive searching via mutually exclusive filtering.
    66. 66. Run Multiple Searches. Start with maximum qualifications. Use the NOT operator to systematically filter through mutually exclusive result sets. End with minimum qualifications.
    67. 67. Required: A, B, C. Explicitly desired: D, E. Implicitly desired: F.
    68. 68. 1. A and B and C and D and E and F; 2. A and B and C and D and E and NOT F; 3. A and B and C and D and NOT E and F; 4. A and B and C and NOT D and E and F; 5. A and B and C and NOT D and NOT E and F; 6. A and B and C and D and NOT E and NOT F; 7. A and B and C and NOT D and E and NOT F; 8. A and B and C and NOT D and NOT E and NOT F
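The eight searches above follow a mechanical pattern: keep every required term, then try every present/absent combination of the desired terms. The Python sketch below generates such a set; `exclusive_queries` is a hypothetical helper, and its ordering within each tier (same number of desired terms present) is an arbitrary choice that differs slightly from the slide's numbering.

```python
from itertools import product


def exclusive_queries(required, desired):
    """Build mutually exclusive Boolean queries, most-qualified first.

    Every query keeps all required terms; each desired term is either
    required (present) or excluded with NOT (absent).
    """
    combos = product([True, False], repeat=len(desired))
    # Sort so that combinations with more desired terms present come first.
    ordered = sorted(combos, key=lambda combo: -sum(combo))
    queries = []
    for combo in ordered:
        parts = list(required)
        parts += [t if present else f"NOT {t}" for t, present in zip(desired, combo)]
        queries.append(" AND ".join(parts))
    return queries


queries = exclusive_queries(["A", "B", "C"], ["D", "E", "F"])
for number, q in enumerate(queries, start=1):
    print(number, q)
```

Because each desired term is either demanded or negated in every query, any given resume that contains A, B, and C can match exactly one of the eight, so no result is reviewed twice. The number of queries doubles with each desired term, so this is practical only for a handful of them.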
    69. 69. Search #1, Search #8
    70. 70. Probability-Based and Exhaustive! This approach allows for: 1. The specific targeting of people who theoretically have the highest probability of being a match based on information present. 2. The specific targeting of people who may be the best match, but may have keyword/information poor resumes or profiles, who do not explicitly mention what you think the "right" person would or should mention. 3. The ability to systematically filter through all available results via manageable and mutually exclusive result sets – never seeing the same person twice!
    71. 71. A mix of “man and machine,” integrating human knowledge and expertise into computer systems. Essentially, the best of both worlds: Autopilot (an artificially intelligent semantic matching engine) and Manual Override (the ability to take complete control over searches and search results).
    72. 72. An artificial intelligence semantic matching engine coupled with taxonomies built by human SMEs that are continually modified and improved specifically for the organization. No COTS solution is customized for any specific employer, industry, or discipline, nor 100% complete.
    73. 73. Resume and LinkedIn profile parsing. Structured, contextual search: most recent title and experience, overall years of experience, education, etc. White Box relevance weighting: configurable by users – no black box! Searchable tagging for level 5 semantic search.
    74. 74. Standard and extended Boolean in full text and field-based search: AND, OR, NOT, configurable proximity, weighting. Configurable proximity enables level 3 semantic search. Variable term weighting allows users to control which search terms are more important, and thus gives them control over true relevance.
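Configurable proximity and user-controlled term weighting can be illustrated in a few lines of plain Python. This is a toy sketch of the two ideas only, not how any search engine implements them; `tokenize`, `within`, `weighted_score`, the sample text, and the weights are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within(words, first, second, distance):
    """True if the two terms occur within `distance` token positions of each other."""
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= distance for i in first_positions for j in second_positions)


def weighted_score(words, weights):
    """Sum the user-assigned weights of every search term present in the text."""
    present = set(words)
    return sum(weight for term, weight in weights.items() if term in present)


text = "Responsible for infection prevention and control for a large hospital"
words = tokenize(text)

# Proximity: "infection" and "control" are three tokens apart here.
near = within(words, "infection", "control", 3)

# White-box weighting: the user, not the engine, decides what matters most.
score = weighted_score(words, {"infection": 3.0, "surveillance": 2.0, "hospital": 1.0})
print(near, score)
```

The proximity check finds "infection ... control" even though the exact phrase "infection control" never appears, and because the weights are visible and editable, the user can see exactly why one result outranks another.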
    75. 75. Lucene is a free and open source text search engine that supports configurable proximity and term weighting, and can be integrated into some existing ATSs/databases. Some Applicant Tracking Systems already have databases powered by text search engines that allow for extended Boolean.
    76. 76. “Society has reached the point where one can push a button and immediately be deluged with…information. This is all very convenient, of course, but if one is not careful there is a danger of losing the ability to think.” - Eiji Toyoda
    77. 77. Data and information require analysis to support decision making. Just as very expensive Business Intelligence and Financial Analytics software hasn’t replaced the need for people to make sense of the data, there is no software solution for HR and recruiting that replaces the need for people to analyze and interpret human capital data to make appropriate decisions.
    78. 78. Matching apps move/retrieve information, but only PEOPLE can analyze and interpret for relevance and make intelligent decisions. Relevant: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user [1]. Only the user (sourcer/recruiter) can judge relevance! [1] Source: Merriam-Webster.com
    79. 79. Sourcers and recruiters need technology that can enable their productivity. Intelligent search and match apps are not a replacement for creative, curious, investigative people. Do not seek to automate that which you do not understand and cannot accomplish manually!
    80. 80. “Computers move information, people do the work” - Jeffrey Liker