Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing



A deep dive into resume and LinkedIn sourcing and matching solutions claiming to use artificial intelligence, semantic search, and NLP, including how they work, their pros, cons, and limitations, and examples of what sourcers and recruiters can do that even the most advanced automated search and match algorithms can't do. Topics covered include human capital data information retrieval and analysis (HCDIR & A), Boolean and extended Boolean, semantic search, dynamic inference, dark matter resumes and social network profiles, and what I believe to be the ideal resume search and matching solution.






Upload Details

Uploaded as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

  • Lost and Found by metrognome0 via Flickr/creative commons
  • Haystack image
  • Concept = meaningful combination of words
  • Decrease content and speak to?
  • Hierarchical, parent-child, one-way; ontologies apply a larger variety of relation types/categorization. Practice and science of classification. Healthcare - hospital
  • Source: Wikipedia. Ludwig Wittgenstein’s theories about how words are defined by context
  • Query clouds
  • Scientific discipline
  • A priori: independent of experience – book learning vs. OJT. A posteriori: dependent on experience or empirical evidence
  • Some people = good enough for your organization? Find the best, and ALL of the best. Means the same candidates are also missed
  • Statistical NLP - Automatically analyze and define relationships between words and concepts. Relevant: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user
  • http://www.flickr.com/photos/stickergiant/4793776078/sizes/o/in/photostream/
  • Infer = to derive as a conclusion from facts or premises
  • More accurately, created intelligence
  • B.S., summa cum laude, from Harvard University; Ph.D., University of California, Berkeley
  • Those two problems are at the present time largely unsolved. Now, I think, however, that within a few decades, we should be able to create robots as smart as mice, maybe dogs and cats.
  • Sourcing requires creativity, interpretive analysis, judgment, and common sense – a natural understanding based on experience
  • Dynamic – continuous and productive activity or change. Static – showing little/no change
  • Dynamic Inference = Infer = to derive as a conclusion from facts or premises
  • Dynamic – continuous and productive activity or change
  • Aware: having or showing realization, perception, or knowledge. What if? How else? Curious…
  • The system gives you what it thinks you wanted, and you are able to tell the system what you wanted. A Knowledge Engineering system that integrates human knowledge into computer systems in order to solve complex problems normally requiring a high level of human expertise
  • An expert system is computer software that attempts to mimic the reasoning of a human specialist
  • Kaizen and the Toyota Way
  • Financial data/BI - Tons of software – financial ERP, business intelligence apps – they STILL require people to analyze and interpret. Applications can’t truly analyze/interpret

Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human Cognition and Sourcing Presentation Transcript

  • 1. Talent Sourcing and Matching: Artificial Intelligence & Black Box Semantic Search vs. Human Cognition & Sourcing. Glen Cathey, www.linkedin.com/in/glencathey, www.booleanblackbelt.com
  • 2. What’s the big deal anyway? Some people believe resume, LinkedIn and Internet sourcing is so easy that sourcing is either dying or dead, or can be performed for $6/hour.
  • 3. Resume and LinkedIn sourcing appears simple and easy on the surface; however, it is deceptively difficult and complex.
  • 4. Anyone can find candidates because all searches "work" as long as they are syntactically correct. That doesn’t mean the searches are finding all of the best candidates! People make assumptions when creating searches. Every time an assumption is made, there is room for error and you unknowingly miss and/or eliminate results!
  • 5. No single search can return all potentially qualified people. Every search both includes some qualified people and excludes some qualified people. Some of the best people have resumes or social profiles that may not appear to be obvious or strong matches to your needs.
  • 6. People cannot effectively be reduced to and represented by a text-based document. Job seekers are NOT professional resume or LinkedIn profile writers. Most people still believe shorter, more concise resumes and social profiles are better. This means they are removing data/info from their resumes, which then can no longer be searched for!
  • 7. No one mentions every skill or responsibility they’ve had, nor describes every environment they’ve ever worked in. There are many ways of expressing the same skills and experience. Employers often don’t use the same job titles for the same job functions.
  • 8. People don’t create their resumes and LinkedIn profiles thinking about how you will search for them. Sometimes people don’t even use correct terminology. Anyone easy for you to find is easy for other recruiters to find = no competitive advantage!
  • 9. In addition to the people you do find, there are Dark Matter results: people who exist to be retrieved, but can’t be found through standard, direct or obvious methods. I estimate Dark Matter to be at least 50% of each source searched.
  • 10. Finding some people is easy…
  • 11. Finding all of the best people IS NOT!
  • 12. “When every business has free and ubiquitous data, the ability to understand it and extract value from it becomes the complimentary scarce factor. It leads to intelligence, and the intelligent business is the successful business, regardless of its size. Data is the sword of the 21st century, those who wield it well, the Samurai.” - Jonathan Rosenberg, SVP, Product Management @ Google
  • 13. Stop wasting time trying to create difficult and complex Boolean search strings. Let "intelligent search and match applications" do the work for you. A single query will give you the results you need - no more re-querying, no more waste of time!
  • 14. Understand titles, skills, and concepts. Automatically analyze and define relationships between words and concepts. Intuit and infer experience by context.
  • 15. Perform pattern recognition. Employ semantic search. Perform fuzzy matching.
  • 16. How do they really work?
  • 17. Intuit experience by context = resume parsing. Parsing breaks down and extracts resume information: most recent title and employer; skills and experience; years of experience (overall, in each position, with specific skills, in management, etc.); education.
  • 18. Parsing enables structured, fielded search. Search by: most recent title, recent experience, years of experience, etc.
  • 19. Well developed ontologies and taxonomies; hierarchical.
  • 20. Synonymous terms: Programmer, Software Engineer, Developer; Tax Manager, Manager of Tax; CSR, Customer Service Representative; Ruby on Rails, RoR, Rails, Ruby; Oracle Financials, Oracle Applications, e-Business Suite, etc.
  • 21. Some applications use complex statistical methods in an attempt to "understand" language and the relationships between words. Example: Google Distance.
  • 22.  Keywords with the same or similar meanings in a natural language sense tend to be "close" in units of Google distance, while words with dissimilar meanings tend to be farther apart
  • 23.  A measure of semantic interrelatedness derived from the number of hits returned by the Google search engine for a given set of keywords
  • 24. Non-interactive and unsupervised machine learning technique seeking to automatically analyze and define relationships between words and concepts. Clustering is a common technique for statistical data analysis.
  • 25. The design and development of algorithms that allow computers to evolve behaviors based on empirical data. A major focus is to automatically learn to recognize complex patterns and make intelligent decisions and classifications based on data.
  • 26. Aims to classify data (patterns) in resumes based either on a priori knowledge or on statistical information extracted from the patterns. A priori: independent of experience. Example of pattern recognition: spam filters.
  • 27. Finds approximate matches to a pattern in a string. Useful for word and phrase variations and misspellings.
  • 28. Reduce time to find relevant matches. Can lessen or eliminate the need for recruiters to have deep and specialized knowledge within an industry or skill set. Reduce and even eliminate time spent on research.
  • 29. Go beyond literal, identical lexical matching. Levels the playing field. Can make an inexperienced person look like a sourcing wizard. Good for teams with low search/sourcing capability.
  • 30. Work well for positions where titles effectively identify matches and where there is a low volume and variety of keywords. Good for a high volume of unchanging hiring needs.
  • 31. Removes thought from the talent identification and decision making process. Danger of eliminating the need for recruiters to understand what they’re searching for. Information technology, healthcare, and other sectors/verticals can pose serious challenges to matching apps.
  • 32. Apps find some people, bury or eliminate others. Is finding some people good enough for your organization? Shouldn’t your goal be to find ALL of the BEST people?
  • 33. Matching apps level the playing field. People from different companies using the same solution will both find and miss the same people. Competitors using the same search and match solution will have no competitive advantage over each other!
  • 34. Belief that one search finds all of the best candidates is intrinsically flawed and simply not based in reality. Top talent isn’t represented by what a search engine "thinks" has the best resume or profile. AI and semantic search apps favor keyword rich resumes and profiles.
  • 35. Keyword poor resumes and profiles may in fact represent better talent than keyword rich resumes and profiles. It’s not just a matter of keyword frequency or even keyword presence! AI powered search & match applications can only return results that explicitly mention required keywords and their variants.
  • 36. Many people have skills and experience that are simply not mentioned anywhere in their resumes! These people are the Dark Matter of databases, ATS’s, and social networks, and they exist but cannot be found via direct search/match methods – AI or otherwise!
  • 37. Pre-built taxonomies are static, limited in their completeness and must be continually updated in order to stay relevant and effective. Taxonomies are only as good as who created them. Applications can only match on what’s present and cannot “think outside of the box.”
  • 38.  Semantic clustering and NLP applications can retrieve related search terms, but that does not mean they are relevant for your need!
  • 39. Match primarily on titles and skill terms. True match is at the level of role, responsibilities, environment, etc. Some applications rank results favoring recent employment duration. Is someone who has been in their current company for 5 years really “better” than someone who has been with their current company for 2 years?
  • 40. Apps don’t "know" what you’re looking for or what’s the best match for your company. Apps are not and cannot be "aware" of people that were excluded from their search results. Applications are not truly intelligent – they do not actually "know" or "understand" the meaning of titles and terms.
  • 41. The ability to learn or understand or to deal with new or trying situations. The ability to apply knowledge to manipulate one’s environment or to think abstractly. REASON; the power of comprehending and inferring. Source: Merriam-Webster.com
  • 42. The capability of a machine to imitate intelligent human behavior. Artificial = humanly contrived. Source: Merriam-Webster.com
  • 43. Dr. Michio Kaku: theoretical physicist and futurist specializing in string field theory; Harvard grad (summa cum laude); Berkeley Ph.D.; currently working on completing Einstein’s dream of a unified field theory. What are his thoughts on AI?
  • 44.  “…pattern recognition and common sense are the two most difficult, unsolved problems in artificial intelligence theory. Pattern recognition means the ability to see, hear, and to understand what you are seeing and understand what you are hearing. Common sense means your ability to make sense out of the world, which even children can perform.” - Dr. Michio Kaku
  • 45. Dr. Michio Kaku believes the job market of the future will be “dominated by jobs involving common sense (e.g. leadership, judgment, entertainment, art, analysis, creativity) and pattern recognition (e.g. vision and non-repetitive jobs). Jobs like brokers, tellers, agents, low level accountants and jobs involving inventory and repetition will be eliminated.”
  • 46. That’s good news for sourcers and recruiters who perform sourcing! Sourcing requires judgment, creativity, analysis, common sense and pattern recognition (instantly making sense of human capital data). Sourcers of the future will be human capital data analysts who are experts in HCDIR & A – Human Capital Data Information Retrieval and Analysis.
  • 47. Matching apps do not have the dynamic ability to learn, understand and instantly relate new concepts through direct experience and observation. They depend on taxonomies, statistical models, or semantic clustering to “understand” relationships and concepts.
  • 48.  The human mind naturally organizes its knowledge of the world, instantly relating new terms and concepts and judging their relevance
  • 49. Example: A sourcer who is completely unfamiliar with “infection control” can instantly recognize non-highlighted but related and relevant terms and incorporate them into new and improved searches. Carolinas HealthCare System, Charlotte, NC; Infection Preventionist, 1997-present. Responsible for all aspects of infection prevention and control for an 800+ bed hospital. Uses science-based research to perform infection prevention. Conducts all aspects of surveillance, data analysis, and presents data to interdisciplinary teams, including the Infection Control Committee.
  • 50. Human sourcers can learn from research and search results, dynamically and adaptively identifying related and relevant search terms and incorporating them into successive searches to continuously refine and improve searches for more relevant results.
  • 51. For example, if a recruiter were sourcing for a position that required a skill they were unfamiliar with (e.g., “Cockburn Use Case Methodology”), they could quickly perform research to learn more about it. In the next slide, you will see a screen capture of such research.
  • 52. From this quick research, the recruiter would be able to determine that most people would not explicitly mention “Cockburn Use Case Methodology,” let alone “Cockburn” (which the research revealed is pronounced “Co-burn”) – thus they would not include the term in their searches.
  • 53.  Instead, it would be a better idea to search for candidates that mention experience with Agile methodology and simply call and ask them if they have experience with using Cockburn’s use case methodology (which many likely would)
  • 54. Applications using Natural Language Processing do not truly understand human language. They use complex statistical methods to resolve the many difficulties associated with making sense of human language. NLP experts admit that to computers, even simple sentences can be highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses.
  • 55. Humans effortlessly and automatically process and understand language, regardless of sentence length or complexity, ambiguity, incorrect grammar, etc. We can udnretsnad any msseed up stnecene as lnog as the lsat and frsit lteetrs of wdros are in the crrcoet plaecs
  • 56. Human sourcers and recruiters can deduce potential experience, even in the absence of information (not explicitly mentioned in the resume/profile). Applications can only work with what’s actually mentioned in a resume – if it’s not explicitly mentioned, they can’t match on it.
  • 57. Applications are not aware that many of the best people have average resumes. Applications are not aware of the people their algorithms bury in results or eliminate entirely. Human sourcers can become aware of and specifically target this Dark Matter.
  • 58. How can you target resumes and LinkedIn profiles that exist but that your searches can’t and don’t retrieve?
  • 59.  Well developed taxonomies, semantically generated query clouds and matching algorithms can help greatly with automatically searching for and matching on synonymous terms, related words, word variants, misspellings, etc.
  • 60. Think + Perform Research. For every keyword, phrase or title you are thinking of using in your search, realize:
    1. Not everyone will explicitly mention what you think they would or should mention in their resume/profile
    2. There are many different and often unexpected ways of expressing the same skills and experience
  • 61. Global Experience. What search terms might you use if you are looking for people with global experience? How many can you think of off the top of your head?
  • 62. In a few minutes of exploratory research, a sourcer can come up with a volume of related and relevant terms: Global, international, foreign, multinational, worldwide; Europe, European, EU, EMEA, Asia, Asia-Pac, Pacific Rim, South America, Latin America, Americas, CALA (Caribbean and Latin America), Middle East; Canada, Japan, China, Russia, India, UK, United Kingdom, etc.; Countries, Offshore, Overseas
  • 63. How can you target people whom your searches retrieve but who are buried in the results (ranked poorly, or "too many" results to be reviewed), so you don’t find them?
  • 64. Search and matching software powered by artificial intelligence / black box semantic search doesn’t have a solution to this challenge. One of the major claims AI/semantic search applications make is that their solutions can find the "right people" in one search.
  • 65. However, a single search strategy is intrinsically flawed and limited: no single search can find all qualified candidates, and each search both includes qualified people and excludes qualified people. I am not aware of any search & match software that allows for successive searching via mutually exclusive filtering.
  • 66. Run Multiple Searches. Start with maximum qualifications. Use the NOT operator to systematically filter through mutually exclusive result sets. End with minimum qualifications.
  • 67. Required: A, B, C. Explicitly desired: D, E. Implicitly desired: F.
  • 68.
    1. A and B and C and D and E and F
    2. A and B and C and D and E and NOT F
    3. A and B and C and D and NOT E and F
    4. A and B and C and NOT D and E and F
    5. A and B and C and NOT D and NOT E and F
    6. A and B and C and D and NOT E and NOT F
    7. A and B and C and NOT D and E and NOT F
    8. A and B and C and NOT D and NOT E and NOT F
  • 69. Search #1; Search #8
  • 70. Probability-Based and Exhaustive! This approach allows for:
    1. The specific targeting of people who theoretically have the highest probability of being a match based on information present
    2. The specific targeting of people who may be the best match, but may have keyword/information poor resumes or profiles, who do not explicitly mention what you think the "right" person would or should mention
    3. The ability to systematically filter through all available results via manageable and mutually exclusive result sets – never seeing the same person twice!
  • 71. A mix of “man and machine,” integrating human knowledge and expertise into computer systems. Essentially, the best of both worlds. Autopilot: an artificially intelligent semantic matching engine. Manual Override: the ability to take complete control over searches and search results.
  • 72. An artificial intelligence semantic matching engine coupled with taxonomies built by human SMEs that are continually modified and improved specifically for the organization. No COTS solution is customized for any specific employer, industry or discipline, nor is any 100% complete.
  • 73. Resume and LinkedIn profile parsing. Structured, contextual search: most recent title and experience, overall years of experience, education, etc. White Box relevance weighting: configurable by users, no black box! Searchable tagging for level 5 semantic search.
  • 74. Standard and extended Boolean in full text and field-based search: AND, OR, NOT, configurable proximity, weighting. Configurable proximity enables level 3 semantic search. Variable term weighting allows users to control which search terms are more important and thus gives them control over true relevance.
  • 75. Lucene is a free and open source text search engine that supports configurable proximity and term weighting, and can be integrated into some existing ATSs/databases. Some Applicant Tracking Systems already have databases powered by text search engines that allow for extended Boolean.
  • 76. “Society has reached the point where one can push a button and immediately be deluged with…information. This is all very convenient, of course, but if one is not careful there is a danger of losing the ability to think.” - Eiji Toyoda
  • 77. Data and information require analysis to support decision making. Just as very expensive Business Intelligence and Financial Analytics software hasn’t replaced the need for people to make sense of the data, there is no software solution for HR and recruiting that replaces the need for people to analyze and interpret human capital data to make appropriate decisions.
  • 78. Matching apps move/retrieve information, but only PEOPLE can analyze and interpret for relevance and make intelligent decisions. Relevant: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user [1]. Only the user (sourcer/recruiter) can judge relevance! [1] Source: Merriam-Webster.com
  • 79. Sourcers and recruiters need technology that can enable their productivity. Intelligent search and match apps are not a replacement for creative, curious, investigative people. Do not seek to automate that which you do not understand and cannot accomplish manually!
  • 80. “Computers move information, people do the work” - Jeffrey Liker
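
Slides 21 to 23 describe Google Distance only in words. As a rough sketch, the commonly cited formulation (the Normalized Google Distance of Cilibrasi and Vitányi) can be computed from hit counts as shown here. The hit counts and index size are made-up numbers for illustration, not real search results, and `ngd` is a hypothetical helper name.

```python
import math

def ngd(hits_x, hits_y, hits_xy, index_size):
    """Normalized Google Distance from raw hit counts (all counts must be > 0).

    hits_x, hits_y: pages containing each term on its own
    hits_xy:        pages containing both terms
    index_size:     assumed total number of indexed pages
    """
    log_x, log_y, log_xy = math.log(hits_x), math.log(hits_y), math.log(hits_xy)
    return (max(log_x, log_y) - log_xy) / (math.log(index_size) - min(log_x, log_y))

# Hypothetical counts, for illustration only.
INDEX_SIZE = 10_000_000_000
related = ngd(100_000_000, 200_000_000, 50_000_000, INDEX_SIZE)   # e.g. programmer / developer
unrelated = ngd(100_000_000, 150_000_000, 100_000, INDEX_SIZE)    # e.g. programmer / nurse

print(f"related pair:   {related:.2f}")
print(f"unrelated pair: {unrelated:.2f}")
```

Terms that co-occur often get a smaller distance than terms that rarely appear together, which is the statistical sense in which such applications "understand" that two words are related.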
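
Slide 27 describes fuzzy matching as finding approximate matches that tolerate variations and misspellings. A minimal sketch of that idea follows, using Python's standard `difflib` module as a stand-in; commercial matching applications may use different algorithms. The skill vocabulary, the sample sentence and the `fuzzy_skill_hits` helper are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of skill terms a recruiter is searching for.
KNOWN_SKILLS = ["python", "javascript", "oracle", "sql"]

def fuzzy_skill_hits(resume_text, vocabulary=KNOWN_SKILLS, cutoff=0.8):
    """Map each word in the text to the closest known skill, if any is close enough."""
    hits = {}
    for token in resume_text.lower().split():
        word = token.strip(".,;:()")
        close = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        if close:
            hits[word] = close[0]
    return hits

hits = fuzzy_skill_hits("Built reporting tools in Pyhton and Javascrpt on Oracle.")
for found, skill in hits.items():
    print(f"{found} -> {skill}")
```

The cutoff is the trade-off to tune: a lower value catches more misspellings but also starts matching unrelated words, which is the relevance problem slide 38 warns about.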
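
Slides 66 to 68 lay out successive searching by hand: keep the required terms fixed and walk through every include/exclude combination of the desired terms, from maximum to minimum qualifications. The sketch below generates those queries and checks that they are mutually exclusive. The function names and the toy profiles are hypothetical, and the order within a tier (for example, the three searches that drop exactly one desired term) may differ from the order on slide 68.

```python
from itertools import product

def successive_queries(required, desired):
    """Build mutually exclusive Boolean queries, most qualifications first."""
    # Every include/exclude combination of the desired terms,
    # sorted so that combinations keeping more terms come first.
    combos = sorted(product([True, False], repeat=len(desired)),
                    key=lambda flags: -sum(flags))
    queries = []
    for flags in combos:
        clauses = list(required)
        clauses += [term if keep else f"NOT {term}"
                    for term, keep in zip(desired, flags)]
        queries.append(" AND ".join(clauses))
    return queries

def matches(query, profile_terms):
    """Toy evaluator: check one generated query against a set of profile terms."""
    for clause in query.split(" AND "):
        if clause.startswith("NOT "):
            if clause[4:] in profile_terms:
                return False
        elif clause not in profile_terms:
            return False
    return True

queries = successive_queries(["A", "B", "C"], ["D", "E", "F"])
for number, query in enumerate(queries, start=1):
    print(number, query)
```

Because each desired term is either required or negated in every query, any profile that has A, B and C lands in exactly one of the eight result sets, which is what makes it possible to review everything without seeing the same person twice.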
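
Slides 74 and 75 mention extended Boolean with configurable proximity and term weighting, and name Lucene as one engine that supports both. As a sketch of what such a query might look like, the helpers below assemble a query string using the `"phrase"~N` proximity and `term^N` boost notation from Lucene's classic query parser syntax, with an OR group built from the kind of synonym list shown on slide 20. The helper names and sample terms are hypothetical, and the result is only a string; whether a particular ATS accepts this syntax is an assumption to verify.

```python
def any_of(terms):
    """OR together synonymous terms, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def near(words, distance):
    """Proximity clause: the words must occur within `distance` positions of each other."""
    return f'"{words}"~{distance}'

def weighted(term, boost):
    """Boost clause: give one term more influence on relevance ranking."""
    return f"{term}^{boost}"

query = " AND ".join([
    any_of(["programmer", "software engineer", "developer"]),
    near("ruby rails", 3),
    weighted("agile", 2),
])
print(query)
```

Keeping the weights and proximity values visible in the query is the "white box" idea from slide 73: the user, not the engine, decides which terms matter most.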