The document discusses text analytics and the approaches used for tasks such as text classification, clustering, topic modeling, semantic analysis, sentiment analysis, and text summarization. It notes that data production in 2020 was projected to be 44 times greater than in 2009, and that unstructured data represents 70-90% of captured data. It provides high-level overviews of the different text analytics techniques.
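As a minimal illustration of one of the techniques listed above, sentiment analysis can be sketched as a bag-of-words scorer. This is a toy sketch only; the word lists are invented examples, and real systems use trained models rather than fixed lexicons.

```python
# Toy bag-of-words sentiment scorer (illustrative only; production systems
# use trained models). These word lists are hypothetical examples.
POSITIVE = {"great", "excellent", "good", "love", "useful"}
NEGATIVE = {"bad", "poor", "broken", "hate", "useless"}

def sentiment_score(text: str) -> int:
    """Return the count of positive tokens minus negative tokens."""
    tokens = text.lower().split()
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

print(sentiment_score("great product but poor documentation"))  # prints 0
```

A score above zero suggests positive sentiment, below zero negative; the mixed example above cancels out to zero.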
Building AI Applications using Knowledge Graphs - Andre Freitas
Goals of this Tutorial:
Provide a broad view of the multiple perspectives underlying knowledge graphs.
Show knowledge graphs as a foundation for building AI systems.
Method:
Focus on the contemporary and emerging perspectives.
Sampling exemplar approaches and infrastructures on each of these emerging perspectives (not an exhaustive survey).
The PoolParty Semantic Classifier is a component of the Semantic Suite, which makes use of machine learning in combination with Knowledge Graphs.
We discuss the potential of fusing machine learning, neural networks, and knowledge graphs, based on use cases and this concrete technology offering.
We introduce the term 'Semantic AI', which refers to the combined use of various AI methods.
Pistoia Alliance Webinar Demystifying AI: Centre of Excellence for AI Webina... - Pistoia Alliance
Pistoia Alliance launched its Centre of Excellence for Artificial Intelligence (AI) in Life Sciences where we hope to bring together best practice, adoption strategy and hackathons covering a range of challenges.
Over the coming months we will be hosting a series of topics and speakers giving their perspectives on the role of Artificial & Augmented Intelligence in Life Sciences and Healthcare.
The topics will cover some of the current challenges, user stories, and value in using AI in life sciences. If you want to get involved in this series as a speaker, or to suggest topics, please get in touch.
Webinar 1 will focus on the following:
A Brief History
Big Data/ML/DL/AI - fundamentals and concepts
The importance of data fidelity
Some best practices
Co-Creation methods for interactive computer systems design are now widely accepted as part of the methodological repertoire in any software development process. As the community becomes more aware that software is driven by complex, artificially intelligent algorithms, the question arises what "Co-Creation of Algorithms", in the sense of end users explicitly shaping the parameters of algorithms, could mean, and how it would work. Algorithms are not tangible like tool features, and their effects are harder to explain or understand, especially in early design phases without a software prototype. Therefore, we propose a Simulation-based Co-Creation method that allows TEL researchers to collaboratively design algorithms with end users by creating user stories and personas, modelling assumptions, and discussing simulated effects. The method extends the build & evaluate loop of co-design iterations, even when the learning technology for the algorithm is not ready. Our proposal is a methodological idea for discussion in the EC-TEL community, yet to be applied in practice.
[Mini-Workshop] Content Architecture: Where Humans and Machines Agree - Andrea L. Ames
Andrea's Information Development World mini-workshop
http://informationdevelopmentworld.com/speakers/andrea-ames/
Handout: https://www.slideshare.net/aames/handout-for-miniworkshop-content-architecture-where-humans-and-machines-agree
If there’s one thing about content on which humans and machines can agree, it’s consistency — particularly architectural consistency. Often the format, markup language, or content management approach that you use is far less relevant than the output of the content—the deliverables, themselves—in the success of content for both humans and machines. This is somewhat controversial, as much of the discussion of “structured content” dives directly to the underlying format—even though the architecture and design of the resulting experience and content within that experience should be driving those more technical decisions.
Arguably, the most critical aspect of structured content—“the architecture”—drives the success of the content for people and machines. The pitfalls of leaping directly into a technology discussion—about XML, content management systems, etc.—vs. spending the right time and focus on design can often lead to significantly less successful content, rework, and additional cost.
Attend this mini-workshop with Andrea Ames to better understand content modeling at the deliverable and experience level—not at the individual article or topic level. You’ll learn about an approach for accomplishing great content architecture (one that can save time, reduce costs, and help you use your limited resources wisely). And, you’ll discover the steps you’ll need to follow in order to successfully create—and validate—your own content modeling approach.
Bridging the Gap Between Data Science & Engineer: Building High-Performance T... - ryanorban
Data scientists, data engineers, and data businesspeople are critical to leveraging data in any organization. A common complaint from data science managers is that data scientists invest time prototyping algorithms, and throw them over a proverbial fence to engineers to implement, only to find the algorithms must be rebuilt from scratch to scale. This is a symptom of a broader ailment -- that data teams are often designed as functional silos without proper communication and planning.
This talk outlines a framework to build and organize a data team that produces better results, minimizes wasted effort among team members, and ships great data products.
[Keynote] Human vs Machine: Conflict or Collaboration? - Andrea L. Ames
Andrea's Information Development World 2017 keynote
Unless you have been vacationing on Mars for the past couple of years, you know that AI, machine learning, and cognitive computing are the hottest things in digital experience since HTML 1.0. And as a savvy content professional, you know that 80-90% of the digital experience is content. Content is the conversation we have with our prospects and our customers. Content carries the client relationship into the digital realm.
So how does content fare in this new, smarter digital space? What impact does machine-based experience have on the content that we create and the content experiences we want our customers to have? Must we learn an entirely new way of doing things? Or is the Machine Age just forcing us to adopt content-creation approaches that we should have been using all along? Is the development of human-readable content in conflict with the processes and designs we must follow to create good machine-processable content? Or is the content more similar than not?
In this opening keynote address, content experience strategist, Andrea Ames, will discuss the importance of making our content both human-readable and machine-processable. You’ll discover how doing so can help you ensure you are providing the best content experiences possible.
Finding the 'Seams': Making User Stories Smaller - TechWell
When we adopt agile practices and a lean mindset, we make great promises to ourselves but we often encounter difficulties in creating user stories that are of high quality and utility. Mitch Goldstein describes why user stories and their value are the currency of agile and lean software development. Mitch illustrates why making smaller and more nimble stories significantly increases the likelihood of a story's completion and success. What do we look for in user stories that tell us they need to be split? Are there certain words or phrases that identify stories as good candidates for splitting? Mitch shares valuable tips for more effective stories, as well as how to enhance your expertise and reduce angst in estimation and prioritization. Small stories lay the groundwork for great agile successes.
How will Publishers Benefit from Artificial Intelligence? Karger case: Human ... - Neil Blair Christensen
Presentation made at the Society for Scholarly Publishing Annual Meeting 2018, themed around how publishers can benefit from artificial intelligence, natural language processing, and machine learning. Case example of Karger Publishers using Classify to automate human and machine curation of content packages for sales and marketing.
Videos:
How to create a class (package) https://www.youtube.com/watch?v=z5hhYpWiA88
How to evaluate a class (package) https://www.youtube.com/watch?v=Vq9z6l8GAdg
How to export/integrate a class (package) https://www.youtube.com/watch?v=mYRcXUcRCL4
ARMA Calgary Spring Seminar: The Nuts and Bolts of Metadata Tagging and Taxon... - Concept Searching, Inc
Michael Paye, Concept Searching's Chief Technology Officer, will be speaking at the ARMA Calgary Spring Seminar on Tuesday, April 25, 2017, on:
The Nuts and Bolts of Metadata Tagging and Taxonomies Made Easy
Taxonomies are often thought of as hard to use and needing specialized applications or IT skills. Not so with Concept Searching’s unique technologies. Join Michael Paye, Concept Searching’s Chief Technology Officer, to see how taxonomies, auto-classification, and multi-term metadata generation unburden the IT team, eliminate end user tagging, and empower business users.
This session focuses on records management challenges in Office 365, Microsoft Exchange, and file shares, demonstrating:
• Automated multi-term metadata generation
• Unique taxonomy tools and interactive features, such as clue suggestion, instant feedback, and assigning weights to terms
• Flexible and simple reporting across the three repositories
• Automated records identification
• Tagging of content directly in Office 365 and on file shares
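The idea of automated multi-term metadata generation in the list above can be sketched very naively as frequency-based tag suggestion. This is a hedged illustration only: the commercial product described uses far more sophisticated classification, and the stopword list and example document here are invented.

```python
# Naive auto-tagging sketch: suggest the most frequent non-stopword terms
# as candidate metadata tags. Illustrative only; not the vendor's method.
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on"}

def suggest_tags(text: str, k: int = 3) -> list[str]:
    """Return the k most frequent non-stopword terms as candidate tags."""
    words = [w.strip(".,").lower() for w in text.split()]
    counts = Counter(w for w in words if w and w not in STOPWORDS)
    return [term for term, _ in counts.most_common(k)]

doc = "Records management in Office 365 and records retention in file shares"
print(suggest_tags(doc))  # "records" ranks first (appears twice)
```

A real pipeline would add stemming, phrase detection for multi-word terms, and mapping of candidates onto a controlled taxonomy before tagging content.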
GPSTEC201_Building an Artificial Intelligence Practice for Consulting Partners - Amazon Web Services
Companies around the world are looking at using artificial intelligence and machine learning to launch new innovative products and services and to drive efficiencies via automation in their businesses. Come to this session to understand why you should consider building an AI/ML practice in your consulting company. Learn the importance of having strong data engineering skills, including data annotation, and get some tips on building a data science team that can deliver customer projects.
A Topic Model of Analytics Job Adverts (The Operational Research Society 55th... - Michael Mortenson
This presentation covers recent research into definitions of analytics, based on analysis of related job adverts. The results support a new categorisation of analytics methodologies, and the presentation discusses the implications for the operational research community.
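A common preprocessing step before fitting a topic model (such as LDA) to a corpus like job adverts is TF-IDF term weighting, which down-weights terms common across the corpus. The tiny corpus below is invented for illustration and is not data from the presentation.

```python
# Hedged sketch of TF-IDF weighting over a toy corpus of job-advert snippets.
# The adverts are invented examples, not data from the study described.
import math
from collections import Counter

adverts = [
    "data scientist machine learning python",
    "data analyst sql reporting",
    "operations research analyst optimisation",
]

def tfidf(term: str, doc: str, corpus: list[str]) -> float:
    """Term frequency in doc times log inverse document frequency."""
    tokens = doc.split()
    tf = Counter(tokens)[term] / len(tokens)
    df = sum(term in d.split() for d in corpus)
    return tf * math.log(len(corpus) / df) if df else 0.0

# "python" occurs in one advert only, so it outscores the corpus-wide "data"
print(tfidf("python", adverts[0], adverts))
print(tfidf("data", adverts[0], adverts))
```

Topic models are then typically fit on such weighted (or raw count) document-term matrices to surface clusters of co-occurring terms.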
Creative Semantic SEO or Why Your UX Content Strategy Needs Keyword Research - Marie Eve Gosemick
Do you speak human as well as Google and other search engines do? Language is our primary interface. It's the basis of your UX content strategy.
What if semantic SEO, keyword research and topic mapping could support backend development and marketing technology project management?
Senior Content Designer/Strategist Marie Eve explores how structured data can be surprisingly creative with potential applications and examples in banking and retail.
A Topic Model of Analytics Job Adverts (Operational Research Society Annual C... - Michael Mortenson
This presentation covers recent research into definitions of analytics, based on analysis of related job adverts. The results support a new categorisation of analytics methodologies, and the presentation discusses the implications for the operational research community.
AI Powered Conversational Interfaces for Personalized Learning & Chatbots - Amazon Web Services
We have all seen the power of AI and ML used to transform industries of every kind. But what does this all mean for humans? Can advancements in AI boost human intelligence and make information more easily available? We believe so! We will hear from Cerego, creator of a personalized learning platform that helps millions of learners - in classrooms, at work, or even on the battlefield - improve their retention and understanding for any content they need, when they need it. They are leveraging services like Amazon Alexa to enable new voice-driven learning experiences that take the power of the cloud, AI, and now voice to the next level of boosting human performance. We will then explore an open source QnABot (chatbot) solution powered by Amazon Lex and Amazon Alexa for Q&A, Virtual Tours, Trivia quizzes, and more. The White House Historical Association (WHHA) will discuss and demo their work to implement a QnABot-powered virtual tour of the White House from the perspective of the roles of the US president.
Content Management, Metadata and Semantic Web - Amit Sheth
Keynote given at NetObjectDays conference, Erfurt, September 11, 2001.
One of the earliest keynotes discussing commercial semantic web technologies and semantic web applications (including semantic search, semantic targeting, and semantic content management). Prof. Sheth started a Semantic Web company, Taalee, Inc., in 1999 (product: the MediaAnywhere A/V search engine), which merged to become Voquette in 2001 (product: SCORE), Semagix in 2004 (product: Semagix Freedom), and then Fortent in 2006 (products included Know Your Customers). Additional details can be found in U.S. Patent #6311194, 30 Oct. 2001 (filed 2000).
Note: the commercial system used "WorldModel" because, at the time, business customers were not yet warm to "Ontology"; the concept and intent are the same. More recent information at http://knoesis.org
Sara Mae O’Brien-Scott and Tatiana Baquero Cakici, Senior Consultants at Enterprise Knowledge (EK), presented “AI Fast Track to Search-Focused AI Solutions” at the Information Architecture Conference (IAC24), held on April 11, 2024 in Seattle, WA.
In their presentation, O’Brien-Scott and Cakici focused on what Enterprise AI is, why it is important, and what it takes to empower organizations to get started on a search-based AI journey and stay on track. The presentation explored the complexities of enterprise search challenges and how IA principles can be leveraged to provide AI solutions through the use of a semantic layer. O’Brien-Scott and Cakici showcased a case study where a taxonomy, an ontology, and a knowledge graph were used to structure content at a healthcare workforce solutions organization, providing personalized content recommendations and increasing content findability.
In this session, participants gained insights about the following:
Most common types of AI categories and use cases;
Recommended steps to design and implement taxonomies and ontologies, ensuring they evolve effectively and support the organization’s search objectives;
Taxonomy and ontology design considerations and best practices;
Real-world AI applications that illustrated the value of taxonomies, ontologies, and knowledge graphs; and
Tools, roles, and skills to design and implement AI-powered search solutions.
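The case study above, where a taxonomy, ontology, and knowledge graph drive content recommendations, can be sketched as a set of subject-predicate-object triples. The entities and relations below are invented for illustration; a production system would use an RDF store and a formal ontology rather than Python tuples.

```python
# Hedged sketch: a knowledge graph as subject-predicate-object triples, used
# to recommend related content. All identifiers here are invented examples.
triples = [
    ("article:101", "aboutTopic", "topic:nursing"),
    ("article:102", "aboutTopic", "topic:nursing"),
    ("article:103", "aboutTopic", "topic:billing"),
    ("topic:nursing", "broader", "topic:healthcare-staffing"),
]

def related_content(article: str) -> set[str]:
    """Find other articles sharing a topic with the given article."""
    topics = {o for s, p, o in triples if s == article and p == "aboutTopic"}
    return {s for s, p, o in triples
            if p == "aboutTopic" and o in topics and s != article}

print(related_content("article:101"))  # articles sharing topic:nursing
```

Traversing the "broader" taxonomy relation as well would let the recommender generalize from a specific topic to its parent category, which is one way a semantic layer increases findability.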
Step-by-step information about how associations can create an effective content strategy. Presentation given by Hilary Marsh and Rana Salzmann at the Association Forum Annual Meeting, June 2013
COVID-19 heightened chronic challenges within the global healthcare industry. It became a catalyst amid fierce competition and tight regulations for health providers and payers to focus on digital health, cybersecurity, patient data transparency, and a variety of customer-centric and operational enhancements. As a result, we found the 2022 trendline pointing to improvements in access and quality of care.
Healthcare challenges such as optimizing the cost of care while simultaneously enabling personalized interventions and consumer-friendly shoppable services are long-standing, but historically the industry has been slow to react.
Read our Top Trends 2022 report to examine the lingering ramifications of the pandemic, responses from medical and insurance organizations, and the worldwide impact of ever-changing regulatory standards and mandates.
A combination of factors, including the pandemic, catastrophic weather events, evolving policyholder expectations, and insurers’ drive for operational efficiency and future relevance, is sparking P&C industry changes.
In a post-COVID, new-normal environment, the most strategic insurers are building resilient, crisis-proof enterprises poised to take advantage of emerging and future business opportunities. They are leveraging advanced data analytics and novel technologies to assure agility and achieve positive revenue and customer satisfaction outcomes. Competitive advantage will hinge on accelerated digitalization and faster go-to-market. Therefore, win-win partnerships and embedded services with InsurTechs and other ecosystem players are critical.
Read Capgemini’s Top P&C Insurance Trends 2022 for a glimpse at the tactical and strategic initiatives carriers are undertaking to boost customer-centricity, product agility, intelligent processes, and an open ecosystem to ensure profitable growth and future-readiness.
This analysis provides an overview of the top trends in the commercial banking sector as they shift to technology high gear to boost client efficiency and battle a volatile, uncertain, competitive, and evolving landscape.
First, it was retail banking. Now, advanced technology is shifting to – and disrupting − the commercial banking space. Many commercial banks, known for paperwork, red tape, and branch dependency, were unprepared to support clients during their post-COVID-19 ramp-up. But now, the digital pivot to new mindsets, partnerships, and processes is in overdrive.
As commercial banks grapple with competition from FinTechs, BigTechs, and alternative lenders, their inability to fulfill SME demands and pandemic after-shocks necessitates transformative process changes and a move to experiential, sustainable, and inclusive banking models. We expect banks to strive to meet the demands of corporate clients and SMEs by digitally transforming critical workflows and improving client experience. Additionally, incremental process improvements in the middle and back office that leverage intelligent automation will keep the competition at bay, because engaged clients are loyal.
Adopting newer methods to mine data and moving to as-a-Service models will prepare commercial banks to respond flexibly to newcomers and find ways to co-exist through effective collaboration. The time has come for commercial banks to put transformation on the fast track, as lending losses in wallet and market share could spill over to other functions!
How incumbents react and respond to 2022 trends could determine their relevancy and resiliency in the years ahead.
The COVID-19 pandemic forced the payments industry to undergo a facelift, sparked by novel approaches from new-age players, fostered by industry consolidation, and driven by customers’ demand for end-to-end experiences. Crossing the threshold, the industry is entering a new era – Payments 4.X – in which payments are embedded, invisible, and an enabling function for frictionless customer experience. As customers make a permanent shift to next-gen payment methods, digital IDs are critical for a seamless payment experience. The B2B payments segment is witnessing rapid digitization. BigTechs, PayTechs, and industry newcomers are ready to jump in with newfangled solutions to help underserved small and medium-sized businesses (SMBs).
As incumbents struggle with profits, new-age firms are forging ahead to take the lead in the Payments 4.X era by riding the success of non-card products and services. The new era demands collaboration and platformification; firms can unleash full market potential only by embracing API-based business models and open ecosystems. Data prowess and enhanced payment-processing capabilities are essential to thrive. The clock is ticking for banks and traditional payments firms because competitive advantage is not guaranteed forever. As industry players seek economies of scale, consolidations loom, and non-banks explore new territories to threaten incumbents’ market share. While all these 2022 trends are at play, central bank digital currency (CBDC) is emerging globally and might open a new chapter in the current payments landscape.
As we slowly move out of the pandemic, financial services firms have learned the criticality of virtual engagement to business resilience. Wealth management firms will need capabilities to cater to new-age clients and deliver new-age services. This report aims to understand and analyze the top trends in the Wealth Management industry this year and beyond.
A year ago, our Top Trends in Wealth Management report emphasized how the pandemic sparked disruption and digital transformation and changing investor attitudes around Environmental, Social, and Corporate Governance (ESG) products. As we begin 2022, many of those trends continue to hold as COVID-19’s wide-reaching effects continue to influence the wealth management industry.
As wealth management (WM) firms supercharge their digital transformation journeys, investments in cybersecurity and human-centered design are becoming critical to building superior digital client experience (CX). Another holdover trend − sustainable investing – is gaining mainstream attention and generating increasingly sophisticated client demands. Data and analytics capabilities will become ever more essential for ESG scoring and personalized customer engagement. As large financial services firms refocus on their wealth management business while new digital players make industry strides, competition is becoming historically intense. Not surprisingly, client experience is the new battleground.
This analysis provides an overview of the top trends in the retail banking sector, driven by competition, digital transformation, and innovation, as retail banks explore novel ways to create and retain value in an evolving landscape.
COVID-19 caught banks off guard and shook legacy mindsets to the core. With 20/20 (2020) hindsight, firms are more aware, digitally resilient, and financially stable as they head into 2022. The trials of the past 18 months forced firms to shore up existing business and consider new models and revenue streams.
Customer-centricity remains at the top of most FS agendas and is a 2022 focal point. Banks will focus on achieving operational excellence as diligently as delivering superior CX. In 2022 and beyond, it will be paramount for FIs to explore and invest in new technologies to remain relevant and resilient.
Banking 4.X will arrive in full force in 2022 with platform-supported firms monetizing diverse ecosystem capabilities and aggressively harvesting data to create experiential customer journeys through intelligent and personalized engagements. The new era will compel future-focused banks to finally abandon legacy infrastructure and collaborate with third-party specialists to solidify their best-fit, long-term roles. Increasingly, open platforms will make banks invisible as banking becomes embedded into customer lifestyles. At the same time, banks will shed asset-heavy models and shift to the cloud for greater agility, speed to market, and faster innovation. The shift will act as a precursor to adopting new technologies on the horizon – 5G and Decentralized Finance.
The recent past was filled with extraordinary lessons for financial institutions. Now is the time to act on those learnings and move forward profitably.
While COVID-19 has sparked the demand for life insurance, it has also exposed the operating model vulnerabilities in distribution, servicing, and customer retention. In a post-COVID, new-normal environment, insurers need to enhance their capabilities around advanced data management and focus on seamless and secure data sharing to provide superior CX and hyper-personalized offerings. Accelerated digitalization and faster go-to-market are vital to remaining competitive, and win-win partnerships with ecosystems are critical in the journey.
Read our Top Life Insurance Trends 2022 to explore the tactical and strategic initiatives carriers undertake to acquire competencies around customer centricity, product agility, intelligent processes, and an open ecosystem to ensure profitable growth and future readiness.
Property & Casualty Insurance Top Trends 2021 (Capgemini)
The Property & Casualty insurance landscape is evolving quickly with the changing risk landscape, entry of new players, and changing customer expectations. The ripple effects of COVID-19 on the P&C insurance industry and natural disasters such as forest fires have adversely impacted insurance firm books.
In this scenario, to ensure growth and future-readiness, the most strategic insurers strive to be ‘Inventive Insurers’ – assuming a customer-centric approach, deploying intelligent processes, practicing business resilience and go-to-market agility, and embracing an open ecosystem.
Read our Property & Casualty Insurance Top Trends 2021 report to explore the strategies insurers are adopting to remain competitive amidst the evolving business landscape and how they can explore new ways to enhance their profitability.
A combination of factors such as demographic changes, evolving consumer preferences, and desire to become operationally efficient were already spurring changes in the life insurance industry. Enter 2020 – the COVID-19 pandemic is having a significant impact on the industry.
At the peak of disruption, the focus was on ensuring business continuity, but new initiatives are cropping up to tackle the challenges as the industry is adapting to the new normal.
Furthermore, COVID-19 has acted as a catalyst, pushing life insurers to prioritize their efforts on improving customer centricity, developing go-to-market agility, making processes intelligent, building business resilience, and embracing the open ecosystem.
Read our Life Insurance Top Trends 2021 report to explore the strategies insurers are adopting to manage the changing market dynamics.
The uncertainty of 2020 is setting the global tone for the immediate future in the financial services industry. So it is no surprise banks are laser-focused on business resilience, emphasizing both financial and operational risks. The need to adapt quickly to new normal conditions through virtual customer engagement is clear.
Customer centricity continues to drive commercial banks’ solution designs. And, the pandemic compelled products that deliver immediate client value ‒ quick digital onboarding, seamless lending, and support for small and medium-sized enterprises (SMEs). The onus is now on banks to go to market more quickly, which requires the implementation of intelligent processes and integrating corporates’ enterprise resource planning (ERP) systems with banking workflows.
To achieve go-to-market agility, banks across the globe are investing in and collaborating with FinTechs. Many of these partnerships are focused on boosting digital lending and providing seamless support to anxious small-business clients in need of assurance.
With newfound impetus for FinTech collaboration, commercial banks have picked up their step on the path toward OpenX. COVID-19 made it evident that survival during turbulence is manageable through collaboration with ecosystem players.
Read our Top Trends in Commercial Banking 2021 report to explore the strategies banks are adopting to transform their businesses from a product-led, siloed model to an experiential and agile one.
When we published the Top Trends in Wealth Management 2020, little did we foresee the pandemic that would sweep through the world and disrupt life as we knew it. Yet, when we reviewed last year’s trends, we found that many still hold and some have taken on even greater relevance. One such trend is sustainable investing, which had begun to gain prominence as investors became more aware of ESG considerations, and firms rolled out more sustainable investing offerings. Another trend that has accelerated in the post-COVID world is the importance of investing in omnichannel capabilities and technologies such as artificial intelligence (AI) to enhance personalization and advisor effectiveness. The pandemic has driven wealth management firms to accelerate their digital transformation journey, with some immediate focus areas being interactive client communications and digital advisor tools.
There is no denying that time is of the essence. Yes, budgets are tight, but the Open X ecosystem offers wealth management firms opportunities to reimagine their operating models and deliver excellent customer experience cost-effectively.
Top trends in Payments: 2020 highlighted the payments industry’s flux driven by new trends in technology adoption, innovative solutions, and changing consumer behavior. The pandemic has tested the digital mastery of players, who are already grappling with transition. Non-cash transactions are on a robust growth path, accelerated by increased adoption during COVID-19. Regulators are working to instill trust and address non-cash payments risk amid unparalleled growth as players collaborate to quell uncertainty. Regional initiatives, such as the P27 (Nordics real-time payments system) and the EPI (European Payments Initiative), are gaining traction in response to country-level fragmentation and competition.
Investment in emerging technologies is looked upon as an elixir to mitigate fraud, data-driven offerings are being considered for providing value-added propositions, and distributed ledger technology is in focus for digital currency solutions, efficiency enhancement, and cost gains. New players, such as retailers/merchants, are integrating payments into their value chains while technology giants are upscaling their financial services game by weaving offerings around payments as a center stage. Constrained by budgets, firms consider business models such as Platform-as-a-Service (PaaS) to provide cost-effective and superior customer experience.
A combination of factors, including demographic changes, evolving consumer preferences, and regulatory and compliance mandates, were already spurring change in the health insurance industry. Enter 2020 and the COVID-19 pandemic, which is having sweeping implications for the industry.
At the peak of disruption, the focus was on ensuring business continuity, but new initiatives are cropping up to tackle the challenges as the industry adapts to the new normal.
Furthermore, some changes are here to stay, and it will be prudent for the industry players to be resilient to the market shifts by being agile, improving member centricity, making processes intelligent, and embracing the open ecosystem.
Read our Health Insurance Top Trends 2021 report to explore the strategies insurers are adopting to manage the external pressures.
The banking industry’s resilience is being tested as banks navigate a remarkable 2020 filled with uncertainties. The impact of COVID-19 has set the tone for future operating models. Retail banks have shifted focus towards integrated risk management with a more holistic view of operational risks. Adapting to the new normal, banks have prioritized cost transformation while engaging customers virtually. Incumbents sought to be more responsible amid fast-changing environmental conditions, and ESG remained a critical focus.
To provide more experiential services, banks are leveraging techniques such as segment-of-one to hyper-personalize offerings while aiming to humanize digital channels for increased engagement. Banks are also revamping middle and back offices, going beyond the front end leveraging intelligent processes. Open X is enabling banks to play on their strengths and use the expertise of ecosystem players. Going forward, banks are poised to become an enhanced one-stop shop by providing consumers value-adding FS and non-FS experiences.
To acquire customers in a cost-effective manner, retail banks are tapping value-based propositions ‒ such as POS financing and mortgage refinancing. Further, Banking-as-a-Service provides incumbents a way to offer their high-value capabilities to other players. In preparation for the future, banks will be looking to improve their go-to-market agility by leveraging the benefits of cloud. This analysis outlines the top 10 trends in retail banking for 2021.
Explore how Capgemini’s Connected autonomous planning fine-tunes Consumer Products Company’s operations for manufacturing, transport, procurement, and virtually every other aspect of the supply-value network in a touchless, autonomous way.
Financial services is undergoing a paradigm shift that is forcing incumbent retail banks to rethink growth strategies as they struggle to remain relevant. Growing competition from BigTechs, FinTech firms, and challenger banks has added to the complexity created by increasingly stringent regulatory and compliance requirements. Customers now expect a seamless customer journey and personalized offerings because they have become accustomed to top-notch individualized service from GAFA giants Google, Apple, Facebook, and Amazon. The changing ecosystem offers established banks new, unexplored opportunities and encourages a transition beyond traditional products to meet the exacting requirements of today’s customers. Bank collaboration with FinTech and RegTech partners is becoming commonplace. Incumbents are exploring point-of-sale financing and unsecured consumer lending, while they also boost their digital channel competencies to reach a broader customer base. Banks are beginning to accept open APIs and are working with third-party specialists to create an open shared marketplace. Technological advancements such as AI are fueling efforts to evolve customer onboarding and touchpoint processes. Increasingly, banks are turning to design thinking methodology to understand the customer journey, extract deep insights, and develop a more refined user experience across the customer lifecycle.
Our analysis of the top retail banking trends for 2020 offers a glimpse into the fast-changing banking ecosystem and explores the tools and solutions being used to face new-age challenges.
Aspects of the life insurance industry have remained constant for years – and so have premiums. Traditional savings products have taken a huge hit in terms of attractiveness because low interest-rates prevail. Meanwhile, the risk landscape is shifting, and insurers need to align better with the emerging business environment, manage changing customer preferences, and improve operational efficiencies. Within today’s scenario, industry players are undertaking tactical and strategic shifts in attempts to manage unpredictable market dynamics. Insurers must develop alternative products to breathe new life into policies and leverage emerging technologies (artificial intelligence (AI), analytics, and blockchain) to improve efficiency, agility, flexibility, and customer-centricity.
Read Top Trends in Life Insurance: 2020 for a look at the innovative steps future-focused insurers are considering to meet industry challenges and opportunities.
The health insurance industry is evolving and undergoing significant changes. As the risk landscape shifts, insurers are working to improve operational efficiencies, meet evolving customer preferences, and align better with the changing business environment. Accordingly, payers must adapt and align business models and offerings. An incisive tactical approach is required to accommodate members’ needs and related emerging risks — medical, health, and environmental. Advanced technologies such as artificial intelligence, analytics, automation, and connected devices are enabling insurers to manage these changes proactively, partner with members, and help to prevent risks, all the while continuing to fulfill payer responsibilities.
Read Top Trends in Health Insurance: 2020 to learn which strategies insurers are adopting to navigate and align with today’s challenges.
Similar to other financial services domains, payments is evolving into an open ecosystem. The EU’s Payment Services Directive (PSD2) pioneered open banking by encouraging banks and established payments players to securely open the systems to foster competition, innovation, and more customer choices. In tandem with non-cash transaction growth, regulations are driving banks and payments firms to expand their array of payment methods and channels. Governments are encouraging financial inclusion by also promoting the adoption of non-cash payments. Increasingly, merchants and corporates seek to offer alternative payment systems because of widespread popularity among consumers. Alternative payments also enable merchants to provide real-time and cross-border payments to boost business efficiency.
Banks, payment firms, card firms, BigTechs, FinTechs, and other players are continuously developing new technology to cash in on market changes. However, data breaches and fraud continue to hinder innovation as firms devote countless resources each year to address security issues. Many governments are also designing new regulations to reduce ecosystem threats. All these measures are expected to make the current ecosystem much more secure and simple for players as well as customers.
Top Trends in Payments: 2020 explores and analyzes payments ecosystem initiatives and solutions for this year and beyond
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
JMeter webinar - integration with InfluxDB and Grafana (RTTS)
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
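Under the hood, JMeter's Backend Listener ships each sample to InfluxDB using InfluxDB's plain-text line protocol: a measurement name, comma-separated tags, space-separated fields, and a nanosecond timestamp. The measurement, tag, and field names below illustrate the shape of such a point only; they are not verbatim JMeter output:

```
jmeter,application=myapp,transaction=login count=42,avg=187.5,max=412 1717000000000000000
```

Grafana then queries these points from InfluxDB to render the real-time dashboards shown in the demonstration.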
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, machine learning over just any symbolic structure is not sufficient to really harvest the gains of NeSy. These gains will only materialize when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
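The talk's notion of semantics as "predictable inference" can be made concrete with a toy knowledge graph: once a relation's meaning is fixed by a rule, new links follow deterministically. The entities, relations, and rule below are invented purely for illustration:

```python
# Toy knowledge graph as a set of (subject, relation, object) triples.
triples = {
    ("paris", "capital_of", "france"),
    ("france", "part_of", "europe"),
    ("berlin", "capital_of", "germany"),
    ("germany", "part_of", "europe"),
}

def infer(kg):
    """Apply one fixed-semantics rule:
    capital_of(x, y) and part_of(y, z)  =>  located_in(x, z)."""
    derived = set()
    for x, r1, y in kg:
        for y2, r2, z in kg:
            if r1 == "capital_of" and r2 == "part_of" and y == y2:
                derived.add((x, "located_in", z))
    return derived

# Because the rule has a fixed meaning, the predicted links are
# predictable: both capitals are derived to be located in Europe.
print(sorted(infer(triples)))
```

A learned link predictor would score candidate triples statistically instead, but the point of the abstract is the same: without an agreed semantics for the relations, there is no fixed standard against which an inference counts as correct.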
Generating a custom Ruby SDK for your web service or Rails API using Smithy (g2nightmarescribd)
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 (Tobias Schneck)
As AI technology pushes into IT, I have been wondering, as an “infrastructure container Kubernetes guy”, how this fancy AI technology gets managed from an infrastructure operations view. Is it possible to apply our lovely cloud-native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and provide a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply AI to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already gotten working for real.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Connector Corner: Automate dynamic content and events by pushing a button (DianaGray10)
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Shailesh Patel, Head of AI for the HMRC Account. I lead the AI Centre of Excellence, and my team provides Enterprise AI solutions and support to our internal delivery groups as well as HMRC.
I’ve been working with Capgemini for 15 years and on the HMRC Account for 11 of those years.
Why Text Analytics? The processing of unstructured data is key for most organisations.
My team is creating PoCs for Text Analytics-based solutions, e.g. email caches, free-format text, etc.
This text data holds a lot of valuable information, and there are techniques available that allow us to mine that insight and better understand the value.
This presentation gives an appreciation of some text analytics approaches.
Do a poll and get a view of how many know what AI, Machine Learning, Deep Learning, Reinforcement Learning, and Transfer Learning are.
Do people know the difference between linear regression and logistic regression?
Do people know what a neural network is and the concept of neurons and activation functions?
Do people understand Bayesian statistical modelling for classification approaches?
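For the poll questions above, the difference between linear and logistic regression, and the fact that a logistic unit is essentially a single artificial neuron with a sigmoid activation function, can be shown in a few lines of plain Python. This is a minimal sketch on made-up data, not a production implementation:

```python
import math

def sigmoid(z):
    # The activation function: squashes any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Made-up one-feature dataset: x values with binary labels y.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (3.0, 1), (3.5, 1), (4.0, 1)]

# A single "neuron": weighted input plus bias, passed through sigmoid.
# Linear regression would output w*x + b directly (an unbounded number);
# logistic regression squashes it into a probability of class 1.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):                  # plain stochastic gradient descent
    for x, y in data:
        p = sigmoid(w * x + b)         # predicted probability of class 1
        w -= lr * (p - y) * x          # log-loss gradient w.r.t. w
        b -= lr * (p - y)              # log-loss gradient w.r.t. b

print(sigmoid(w * 1.0 + b))  # below 0.5: predicts class 0
print(sigmoid(w * 3.8 + b))  # above 0.5: predicts class 1
```

A neural network simply stacks many such neurons in layers, with the same gradient-based training applied through all of them.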
The term AI was coined by John McCarthy in 1955, when he was a mathematics professor at Dartmouth; he later moved to Stanford University.
An umbrella term for learning-based algorithms that can be made to improve with better data. Change the data, not the AI code: in the old days we changed code; now we change data.
Data production will be 44 times greater in 2020 than it was in 2009
We generate more data in 2 days than was generated up to 2002
Big Data is the driver behind this revolution. And now that we have a confluence of Big Data, Compute and AI Algorithms we can actually gain insights from the data
Unstructured Data Represents 70% to 90% of the Data Captured
Most of the data we capture is unstructured: image, video, audio and free-format text. And most of those other formats end up as textual descriptions, such as captions, speech-to-text transcripts, or plain text.
Pattern Based Reusable Approaches Now Available
We have lots of text analytics techniques we can use and reuse to analyse that text data, and we use them to classify, cluster and network that information.
The approaches and techniques I describe today are reusable, and they all yield value in terms of analytics or processing of text data. Based on statistical modelling, we can now use techniques to classify, cluster and network data.
Which leads to…
Accelerated Approaches for Actionable Insights
This type of data can be difficult and time consuming to process manually but now with these approaches we can accelerate the activity.
Large volumes, unstructured and it’s like looking for a needle in a haystack.
Information can be hidden in plain sight in forms that make it difficult to process, e.g. scanned PDF files, handwritten documents, or plain prose describing something.
We can use these techniques with automation to help accelerate our path to insight.
Resulting in improved customer experience, increased sales, detection of fraud or non-compliance, and so on.
All these techniques are based on mathematical modelling i.e. statistics, linear algebra, calculus, and so on.
So today…
********
Corpus and Corpora.
This includes text, audio, image and video data. This session will concentrate on the unique insights gained from text data, including the pre-processing of the other formats which results in text.
Utilising techniques around Natural Language Processing, Semantic Analysis, Sentiment Analytics and other advanced modelling approaches can result in valuable competitive intelligence and insight.
A bit of fun!
Statistics, Calculus, Linear Algebra, etc
What we are going to do today is go through the derivation of some of these equations from first principles. Only joking!
What this shows is that all Text Analytics processes are based on statistical modelling: they are statistical problem solvers. Also remember that you can prove or disprove almost anything with statistics, so you need experts to validate your models.
********
Going clockwise
LDA (Latent Dirichlet Allocation, pronounced ‘Diri-clay’)
LSA (Latent Semantic Analysis)
Linear Regressions
ADAM Optimizer for Neural Networks (instead of Stochastic Gradient Descent)
Principal Component Analysis
Logistic Regression
Stochastic Gradient Descent
SVD (Singular Value Decomposition) NOT on picture – like PCA, it is used for Dimensionality Reduction.
Relevance: With the deluge of information, we need automated mechanisms to process it and ensure that we only get information that is relevant to our needs. This could be search results, recommender systems, etc.
Feedback: As feedback is captured from your customers, how will you analyse it and gain insights from that data? How will you ensure you are supplying the best service to your customers?
AI Assistance: As AI based services become more and more prevalent, we need better systems that communicate with us using natural language interfaces providing NLU and NLG. These need a more sophisticated ability to analyse text and ensure the service consumes the semantic meaning of our requests. Talk about Google Duplex demo Google I/O 2018 based on RNNs. Use of Linguistic Modifiers. One of the key research insights was to constrain Duplex to closed domains, which are narrow enough to explore extensively. Duplex can only carry out natural conversations after being deeply trained in such domains. It cannot carry out general conversations. The system also sounds more natural thanks to the incorporation of speech disfluencies (e.g. “hmm”s and “uh”s).
Insights: Given the volume of data captured, there are more insights to be gained from analysing it. Most forms of data capture will result in some form of text, i.e. pictures with caption generation, speech to text, optical character recognition and literal text data. This can be consumed and analysed to provide predictions, analysis and ????
Experience: With an understanding of this text data we can now improve the customer’s experience with our services. The use of chatbots allows us to provide an expedient, efficient, no-wait service. The analysis of buying habits ensures we can better predict what customers will buy and provide them with recommendations on what else they may like. The list is endless.
These are a very small number of the areas that benefit from text analysis.
********
Feedback Processing, Email Processing,
AI Assistants like Alexa, Google Assistant, Cortana and Siri, to convert Speech to Text
Search Engines to Make Search Results More Relevant
Filtering for reduction in Spam or increase topics of interest
Organise the Data into Topics of Interest i.e. News, Sport, etc
Text Summaries
ChatBot Processing
A corpus is a body of text for analysis. It can consist of many documents, which may be as short as single sentences.
Text Wrangling / Text Pre-processing / Feature Transformation – Convert Unstructured Text into a Multi-Dimensional Structured Representation
Text Extraction – Get the body text out of HTML, binary PDF/Word, XML, etc.
Text Normalization – Store the text in a consistent form. As simple as removing syntax modifiers, i.e. non-alphanumeric characters, through to more complex domain-specific normalisation using context.
TF/IDF – Term Frequencies / Inverse-Document Frequency – Logarithmic Based Frequency Analysis
Text Vectorization – Frequency Vectors (The simplest vector encoding model is to simply fill in the vector with the frequency of each word as it appears in the document), One Hot Encoding, Bag of Words, Bag of Sequences
Tokenisation – Requires Substantial Domain Knowledge. Identify words, places, names, etc i.e. turn characters into something meaningful i.e. identify words
Stop Word Removal – Remove ‘the’, ’a’, ’for’, etc which are noise in the data.
Stemming / Lemmatization – Reduce words to a common base form, e.g. Games == Game, Frequencies == Frequency; Sinking, Sank, Sink == Sink (strictly, mapping ‘Sank’ to ‘Sink’ needs lemmatization, which uses vocabulary and context rather than suffix-stripping).
Dimensionality Reduction – Reduce the features used. Could be driven by frequency, i.e. keep only the most frequent words, or even fewer but more important domain-specific words.
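As a toy illustration of these pre-processing steps, the sketch below chains tokenisation, stop-word removal, a crude plural stemmer and a hand-rolled TF/IDF weighting in plain Python. The stop-word list, corpus and the naive `endswith("s")` stemmer are invented for the example; a real pipeline would use a proper stemmer and a library vectoriser.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "for", "of", "and", "to"}

def tokenize(text):
    # Normalisation: lowercase and keep only alphanumeric runs
    return re.findall(r"[a-z0-9]+", text.lower())

def preprocess(text):
    # Tokenise, drop stop words, then apply a crude plural stemmer
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    return [t[:-1] if t.endswith("s") else t for t in tokens]

def tf_idf(corpus):
    # Term Frequency / Inverse Document Frequency: weight each term by how
    # often it occurs in a document, discounted by how many documents use it
    docs = [preprocess(d) for d in corpus]
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))  # document frequency per term
    return [
        {t: (tf / len(d)) * math.log(n / df[t]) for t, tf in Counter(d).items()}
        for d in docs
    ]

corpus = ["The cat chased the mouse",
          "The mouse ate the cheese",
          "Cats and mice"]
vectors = tf_idf(corpus)
```

Terms that appear in every document get a weight of log(1) = 0, which is exactly the dimensionality-reduction effect mentioned above: ubiquitous words carry no discriminating signal.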
********
Feature Engineering methods use Statistical Modelling to create a high dimensional model of the document
Bag of Words – Most ML applications work with the bag-of-words representation in which words are treated as dimensions or features with values corresponding to word frequencies.
The bag-of-words model is a simplifying representation used in natural language processing and information retrieval (IR). Also known as the vector space model. In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity.
Usually stored in a sparse storage model
Set of Sequences – Natural Language Processing for context driven processing. Data driven approach to representing text capturing the sequential properties of text. The bag-of-words representation will not reveal the fact that a person's name is always followed by the verb "likes" in this text. As an alternative, the n-gram model can be used to store this spatial information within the text
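A minimal sketch of the two representations side by side (the toy sentences are chosen purely to show the contrast): the bag of words discards order, while bigrams retain it.

```python
from collections import Counter

def bag_of_words(text):
    # Multiset of words: grammar and word order discarded, multiplicity kept
    return Counter(text.lower().split())

def ngrams(text, n=2):
    # Sequences of n adjacent tokens preserve local word order
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

a = "the cat chased the mouse"
b = "the mouse chased the cat"

assert bag_of_words(a) == bag_of_words(b)  # identical bags: order is lost
assert ngrams(a) != ngrams(b)              # bigram sequences still differ
```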
********
Techniques - Topic Modelling (PLSA, LDA), Named Entity Recognition, Pattern-Based Identified Entities, Quantitative Text Analysis, NLP/NLU/NLG, Text Summarization, Chatbots, Speech Recognition, RegEx Processing
Text Classification/Regression – Based on supervised learning. We have a labelled dataset for training, which gives us known priors that may be used for visualisation, searching or as input into another model. Logistic regression allows us to classify datasets linearly once the original documents have been transformed into a linear (vector) space. E.g. looking for sports-related articles.
Topic Modelling: Discover topics in the data, i.e. find related terms and correlate them based on the proximity of words. E.g. Sports, Gardening, Entertainment, Football, etc., with overlap of keywords, i.e. sports and football.
Text Clustering: Using unsupervised learning to find groupings in the data. E.g. find explicit groups, i.e. sports, gardening, entertainment, etc.
Semantic Analysis: What do the words actually mean? You need to understand context. Jeopardy with IBM Watson.
Sentiment Analysis: Positive/Negative or Neutral E.g. product reviews; were people happy or disappointed by products they purchased.
Text Summarisation: Too Long; Didn’t Read
Text Correlation – Graph modelling of text data to infer relationships. Identify and Create relations between textual entities i.e. people and organisations.
And there are more:
Correlation using Graph Analysis
Quantitative Text Analysis using Quantitative Approaches
And various visualisation approaches such as word clouds, etc.
I’m only going to talk about some today.
********
Simplest Text Analytics capability where we know what we are looking for in order to carry out a prediction.
Two rules to implement: firstly, what do we want to measure, i.e. spam, political affiliation, a specific topic, etc.; secondly, observation by analysis of the text, i.e. the classification. This results in automatic classification.
How do we classify or predict based on known criteria, i.e. a labelled dataset creating a trained model? We wish to extract knowledge about something based on a known prior. Spam filtering is a great example, where we look for known keywords to classify emails as spam, e.g. ‘PPI’ or ‘Hey!’ in the title of an email, your name in an email title, etc.
This is the bread and butter of supervised learning given the training dataset availability to create a trained model for use in generalisation.
Example Spam or Not Spam email.
Recommender Engines - Rather than manually creating recommendations we can analyse product descriptions and reviews to find recommendations.
********
Technique: Naïve Bayes classifier models are simple linear probabilistic mathematical models for classification. Using word frequency approaches allows for fast classification of documents. Naïve Bayes is an online model that can be updated in real-time.
Recurrent Neural Networks for time-series datasets are also available for non-linear modelling approaches to classification.
Approaches such as Support Vector Machines have also become fashionable and can provide a more accurate classifier than the standard probabilistic approaches.
For example, a recommendation system may have classifiers that identify a product’s target age (e.g., a youth versus an adult bicycle), gender (women’s versus men’s clothing), or category (e.g., electronics versus movies) by classifying the product’s description or other attributes. Product reviews may then be classified to detect quality or to determine similar products.
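The Naive Bayes technique above can be sketched by hand in a few lines. This is a hedged illustration of the spam example, not a production classifier: the four training emails are invented, and a real system would train on thousands of labelled messages, usually via a library implementation.

```python
import math
from collections import Counter

def train_nb(labeled_docs):
    # labeled_docs: list of (tokens, label) pairs.
    # Returns class counts, per-class word counts, and the vocabulary.
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify_nb(tokens, class_counts, word_counts, vocab):
    # Pick the class maximising log P(class) + sum of log P(word | class),
    # with Laplace (add-one) smoothing so unseen words don't zero things out
    total_docs = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

train = [
    ("win money now".split(), "spam"),
    ("claim your free prize".split(), "spam"),
    ("meeting agenda attached".split(), "ham"),
    ("lunch tomorrow with the team".split(), "ham"),
]
model = train_nb(train)
print(classify_nb("free money prize".split(), *model))  # → spam
```

Because the model is just counts, it can be updated incrementally, which is the “online, real-time” property mentioned above.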
This technique can be used to support Segmentation, Categorisation, etc. It is an Unsupervised Learning approach to grouping data, where you don’t know what you are looking for but use a mechanism to group. It can also result in a dimensionality reduction, allowing us to work on a smaller dataset.
There are a number of different measures that can be used to determine document similarity. Fundamentally, each relies on our ability to imagine documents as points in space, where the relative closeness of any two documents is a measure of their similarity. E.g. Sports terms like football, cricket, motor racing, etc.
You can use String Matching, Distance Measures, Relational Matching, and others like fuzzy matching, Boolean equality, domain specificity, etc
We are going to talk a little about distance measures based on text vectorisation.
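Cosine similarity, one of the distance measures just named, can be sketched on raw term-frequency vectors. The sentences are toy examples; a real system would typically apply the same formula to TF/IDF vectors rather than raw counts.

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    # Represent each document as a term-frequency vector, then measure the
    # cosine of the angle between the vectors (1.0 = same direction/content)
    a, b = Counter(doc_a.lower().split()), Counter(doc_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

print(cosine_similarity("football cricket racing",
                        "football cricket tennis"))  # ≈ 0.667
```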
Clustering Techniques:
Partitive Clustering – Grouping based on distance measurements. K-means is an example: a popular method for unsupervised learning tasks, the k-means clustering algorithm starts with an arbitrarily chosen number of clusters, k, and partitions the vectorized instances into clusters according to their proximity to the centroids, which are computed to minimize the within-cluster sum of squares. SVD (Singular Value Decomposition) is a related technique that could be used, as can PCA (Principal Component Analysis).
Hierarchical Clustering – involves creating clusters that have a predetermined ordering from top to bottom. Either start with single instances and iteratively aggregate them by similarity (agglomerative), or start with all instances together and divide until you reach single instances (divisive). Decision trees offer a technique to map and group words or documents.
********
Technique: Similarity Based Algorithms :
Distance Measure – Using a distance metric on feature vectors i.e. documents- closer together are similar. These distances are measured using many mathematical models e.g. Jaccard, TF/IDF, Cosine Similarity
Once you have a distance measure, a clustering mechanism can be implemented, i.e. Partitive Clustering or Hierarchical Clustering.
Techniques for Clustering include Deterministic and Probabilistic Matrix Factorization Methods, Probabilistic Mixture Models of Documents, Similarity Based Algorithms, Graph Partitioning and Ensemble Methods. We will touch on one: Similarity Based Algorithms.
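A minimal, deterministic k-means sketch on 2-D points standing in for vectorised documents. The starting centroids are fixed so the example is reproducible; real implementations (e.g. scikit-learn) use smarter k-means++ initialisation and convergence checks.

```python
def kmeans(points, centroids, iterations=10):
    # Plain k-means: assign each point to its nearest centroid, then
    # recompute each centroid as the mean of its assigned cluster
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            for cluster in clusters if cluster
        ]
    return clusters

# Two obvious groups of 2-D "document vectors"
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
clusters = kmeans(points, centroids=[(0.0, 0.0), (1.0, 1.0)])
```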
Use Case: Based on Unsupervised Learning, it has no prior knowledge of topics and can be used as an exploratory approach to analysis. It could be applied at document level or term level, and can be used as a mechanism for dimensionality reduction.
Topic modelling is when you have lots of documents and you want to group them together by potential subjects, not just word frequencies.
Discover topics or categories of interest. i.e. I have News stories find topics such as Human Interest, Business News, Gardening and Sports. It doesn’t know those topics but can cluster to create those topics.
Again based on Unsupervised Learning to automatically cluster and categorise the words into topics.
Topics are clusters of similar words, but a word can fall into more than one topic, whereas hard clustering allows a given word in only one cluster.
Picture shows document vs words and frequency is mapped by intensity of square.
We could just search the documents for the frequency of words and manually catalogue the words and documents based on the discoveries, but this is hard work and time consuming.
Topic Modelling allows us to create this group without having to specify the grouping topics using unsupervised learning approaches.
Move documents/words together based on their semantic similarity, i.e. if words appear close together in lots of documents then they may be related.
<Explain cluster and unsupervised learning vs supervised learning?>
********
Peter Gustav Lejeune Dirichlet (pronounced ‘dee-ree-clay’), 1805–59, German mathematician noted for his work on number theory and analysis
By using LDA/LSA we can apply statistical modelling to start grouping the documents and words into meaningful topics, which we can subsequently use for clustering documents based on the topics discovered by this unsupervised approach. Also used for word-similarity mapping.
The colour gradient identifies the frequency of the words in the document: lighter = less frequent, darker = more frequent.
Using Euclidean distance (Pythagoras’ theorem) or cosine distance to measure the distance and correlation between words.
Or K-means clustering…
Techniques include Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), and Non-Negative Matrix Factorization (NNMF).
Every column corresponds to a document, every row to a word. A cell stores the frequency of a word in a document, dark cells indicate high word frequencies. Topic models group both documents, which use similar words, as well as words which occur in a similar set of documents. The resulting patterns are called "Topics“
You’ll notice that we have multiple documents containing words and we want to cluster documents with related words.
So the animation starts to move together documents that contain similar words, and words that appear in similar documents.
This results in a uniform diagonal cluster showing groups of documents on related topics: environment, immigration, space, and so on.
We could now use that grouping of words to inform a supervised model of the topics we are looking for in a wider dataset, and begin to identify groups of documents belonging to those topics.
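As a sketch of what topic modelling looks like in code, here is a 2-topic LDA fitted to a toy corpus. This assumes scikit-learn is available and is an illustration only, not the presenter's actual tooling; the corpus is invented so the sports/gardening split is obvious.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "football cricket match score goal",
    "goal score football league cricket",
    "garden roses soil planting flowers",
    "soil flowers garden planting roses",
]

# Bag-of-words document-term matrix (rows = documents, columns = words)
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)

# Fit a 2-topic model: each topic is a distribution over words,
# and each document is modelled as a mixture of topics
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # shape: (n_documents, n_topics)
```

Unlike hard clustering, `doc_topics` gives each document a proportion of every topic, which is the soft-assignment property described above.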
Animation attribution:
https://en.wikipedia.org/wiki/File:Topic_model_scheme.webm#filelinks
Author: Christoph Carl Kling
Aka Sequential Language Modelling
How do we consider context in a corpus so that we can, for example, understand the meaning of a sentence?
There is lots of text out there where the context of the words defines the meaning of that data.
This is usually used to find relationships between documents, or to infer or predict based on sequences, e.g. predict the next word, infer the emotion in a sentence, or generate responses based on the data (NLG).
Example where semantics add value (both would be the same in the bag of word/ word frequency solution). But both are clearly different and opposing:
The cat chased the mouse
The mouse chased the cat
Bag of Words would imply that both these sentences are the same. So we need to introduce grammatical context to better interpret the sentences.
Feature Engineering methods use Statistical Modelling to create a high dimensional model of the document.
Grammar Based Approach - In linguistics, grammar is the set of structural rules governing the composition of clauses, phrases, and words in any given natural language.
Grammar-Based Feature Extraction – Allows us to extract grammatical features from the sentence i.e. Noun, Verb, Preposition, Adjective, etc
Syntax Parsing – Deconstruct the sentences into a parse tree so that we can better check the grammatical correctness of the sentence
Extract Key Phrases – The key terms or phrases provide insights into topics of potential interest
Extract Entities – Create a bag of entities i.e. person, organisation, address, etc
Works well if the sentences are grammatically correct to begin with, but fails if we cannot recognise the grammar of the sentence, i.e. its nouns, verbs, prepositions, etc. By recognising the verbs, nouns, etc. we may be able to infer the contextual meaning of a sentence.
n-gram Feature Extraction – A more generalised way of identifying sequences of tokens and language independent. The bag-of-words representation will not reveal the fact that a person's name is always followed by the verb "likes" in this text. As an alternative, the n-gram model can be used to store this spatial information within the text.
Word Embeddings allow for inferences. They embed words into a vector space and then measure the closeness of words to convey and predict sequence, i.e. one word following another. Cosine distance is a technique that could be used to measure that closeness. Word2Vec offers pre-trained models for embeddings. Can we use the relationships between words to infer relationships between documents/sentences and hence derive meaning?
********
One very famous example of how word embeddings can represent such relationship is that you can do a vector computation like this:
“king is to queen as man is to woman”
king − man + woman ≈ queen
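The analogy can be illustrated with tiny hand-made vectors. The 3-d values below are invented purely so the arithmetic works out; real embeddings are learned from data and have hundreds of dimensions.

```python
import math

# Toy, hand-made 3-d "embeddings" (hypothetical values for illustration only)
vectors = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 1.0, 1.0],
    "man":   [1.0, 0.0, 0.0],
    "woman": [1.0, 0.0, 1.0],
}

def analogy(a, b, c):
    # Compute a - b + c, then return the nearest word by cosine similarity
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]

    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(x * x for x in v))
        return dot / (nu * nv)

    return max(vectors, key=lambda w: cos(target, vectors[w]))

print(analogy("king", "man", "woman"))  # → queen
```

With a real pre-trained model you would do the same thing via a library call (e.g. gensim's `most_similar`), which also excludes the query words from the candidates.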
n-grams with a window of 4 for “After, there were several follow-up questions. The New York Times asked when the bill would be signed,”:
('After', ',', 'there', 'were')
(',', 'there', 'were', 'several')
('there', 'were', 'several', 'follow')
('were', 'several', 'follow', 'up')
('several', 'follow', 'up', 'questions')
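Windows like these can be generated with a short helper. Note this naive whitespace tokeniser keeps "follow-up" as one token and drops punctuation, so its output differs slightly from the hand-tokenised windows on the slide:

```python
def ngram_windows(text, n=4):
    # Slide a window of n tokens across the text (naive whitespace tokeniser)
    tokens = text.replace(",", "").rstrip(".").split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "After, there were several follow-up questions."
print(ngram_windows(sentence))
```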
A lot of the techniques described in this pack are based on word frequency, i.e. bag of words for multi-document processing, but in this scenario we need to extract key phrases to better understand the sentences contained in the documents.
Here we are trying to derive contextual meaning so that we can respond appropriately.
A Statistical Model assigns a probability to a sequence of words.
Technique: Language Specific Methods use Grammar Rules to define the syntax of a language along with some statistical analysis. This approach can be rigid due to the inexact nature of the human language.
Technique: Language Independent Methods use a number of modelling approaches such as unigram, bigram, trigram, n-gram models or neural networks to encode the grammatical structure of a language from examples. Much more accurate and allows the use of Feature Engineering methods.
Approach: Word Embeddings – “Embeds” words into a vector space model based on how often a word appears close to other words. With pre-trained models like word2vec and GloVe you capture the semantics of the words, so that similar words have similar vectors.
The spam classification example has recently been displaced by a new vogue: sentiment analysis. How do we capture the emotion of a corpus or document? Positive, Negative or Neutral.
Social media and feedback systems allow us to express our opinions about a product, movie or service. This provides valuable insight to the vendor/provider. Do people like or dislike my product or service?
“I loved the fact that this product didn’t work properly” – Need to allow for sarcasm.
Achieving 70% accuracy in classifying sentiment is performing about as well as humans.
Two Approaches:
Knowledge Based – classify text by affect properties of the sentences i.e. good, bad, like, hate, etc. This has limited uses but can yield quick results.
Statistical Based – Uses semantic-analysis-based approaches that analyse the grammatical structure of the sentences to yield more accurate results. Techniques such as LSA and LDA can consider the semantics of the sentences and paragraphs, allowing a better understanding of the emotion and of the entity that is the target of that emotion.
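The knowledge-based approach can be sketched in a few lines. The word lists here are tiny invented stand-ins for a real affect lexicon (such as AFINN), so this is an illustration of the idea, not a usable classifier:

```python
# Hypothetical mini-lexicon: real lexicons contain thousands of scored entries
POSITIVE = {"good", "love", "loved", "great", "like", "happy"}
NEGATIVE = {"bad", "hate", "terrible", "disappointed", "broken"}

def lexicon_sentiment(text):
    # Score = (# positive words) - (# negative words), then threshold
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_sentiment("I loved this product, great value"))  # → positive
```

Note that it would happily score the sarcastic example above ("I loved the fact that this product didn't work properly") as positive, which is exactly the limitation of the knowledge-based approach.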
********
Sentiment analysis models attempt to predict positive (“I love writing Python code”) or negative (“I hate it when people repeat themselves”) sentiment based on content, and the task has gained significant popularity thanks to the expressiveness of social media. Because companies are involved in a more general dialogue where they do not control the information channel (such as reviews of their products and services), there is a belief that sentiment analysis can assist with targeted customer support or even model corporate performance. The complexities and nuances inherent in language context make sentiment analysis less straightforward than spam detection.
The Web provides a forum for individuals to express their opinions and sentiments. For example, the product reviews on a Web site might contain text beyond the numerical ratings provided by the user. The textual content of these reviews provides useful information that is not available in the numerical ratings. From this point of view, opinion mining can be viewed as the text-centric analogue of the rating-centric techniques used in recommender systems: whereas recommender systems analyse the numerical ratings for prediction, opinion mining methods analyse the text of the opinions. It is noteworthy that opinions are often mined from settings like social media and blogs where ratings are not available.
“The movie is surprising with plenty of unsettling plot twists.” (Negative term used in a positive sense in certain domains).
Both Supervised and Unsupervised Learning approaches can be used. Unsupervised where labelled data is not available i.e. social media posts about new topics of interest.
Technique: Using Recursive Neural Networks or Recurrent Neural Networks with LSTMs allows us to utilise a ‘bag-of-keyphrases’ approach to ensure we retain the nuances and the positivity or negativity associated with a word, e.g. “terribly helpful”.
How do we take a corpus of text and create a summarised version of that text? There are two techniques available: Extractive and Abstractive.
Technique: Extractive. Uses existing sentences to create a summary.
Using a method of scoring based on Topic Word Frequencies, Latent Semantic Analysis, or Machine Learning with Supervised Learning. By matching high-frequency (or even low-frequency) words and similarity mapping, it finds high-scoring sentences in the document and uses those to create the summary.
Topic Word approaches work by removing low-frequency occurrences and high-frequency stop words; the topic words left can be used to score sentences that contain them.
Machine Learning uses Trained Models to select appropriates features of a document i.e. frequency of topic words, presence of title words, location features (beginning or end of a paragraph)
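The extractive, topic-word scoring idea can be sketched as below. The stop-word list and example text are invented for the illustration; production summarisers combine many more features (title words, sentence position, etc.) than raw topic-word frequency.

```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in", "it", "was", "on"}

def extractive_summary(text, n_sentences=1):
    # Score each sentence by the corpus frequency of its (non-stop) topic
    # words, then return the top-scoring sentences in their original order
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z]+", sentence.lower())
                  if w not in STOP_WORDS]
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in ranked)

text = ("Text analytics extracts insight from text. "
        "The weather was pleasant on Tuesday. "
        "Analytics techniques classify and cluster text data.")
print(extractive_summary(text))
```

The off-topic weather sentence scores lowest because its words are rare in the corpus, so it drops out of the summary.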
Technique: Abstractive. Re-write the sentences from the document: it uses phrases and clauses from the document, but new text is generated. This is an area of research in AI that requires coherence and fluency with semantic understanding to support summarisation. Sentence Compression, Information Fusion and Information Ordering are problems that need to be solved. This is a largely unsolved problem, but an area of great interest given its potential applications in AI.
********
I call it an art because it is still an intelligence-driven approach. You need to think about your use cases to ensure you get the right solution to support your business.
No Wrong or Right Answer / Many Approaches to Text Analytics
As you have seen there are many different techniques to analysing text and today you’ve seen a few.
They yield different insights and benefits and should be aligned to the problem you are trying to solve
They are also inter-related, so you can use one to carry out many different activities, e.g. Clustering or Semantic Analysis.
Also remember you can build competing models to see which ones yield the best solution to your problem.
There’s no wrong or right answer: see what works and measure the accuracy of the insights against prior understanding, if it’s available.
Iterative Approach to See What Works
There isn’t a one size fits all so you need to tune and tweak the models
Look at your problem space.
Understand your data if you can; otherwise look at mechanisms for dimensionality reduction to narrow down your insights, if that’s appropriate.
Mathematical Driven Statistical Modelling
It’s all still maths! Based on Probabilities. Consider those probabilities when contemplating correctness.
A statistical problem solver at heart. Some approaches are easy to implement, but they need to be validated by your experts.
Enterprise AI Solutions. Lots of Tooling Available COTS and Open Source. SAS, IBM, Microsoft, Google, etc
There are many products out there.
Commoditise and Democratise AI
Accelerate the production of your solutions by using these tools. We have looked at IBM Watson, SAS, RapidMiner, etc.
They have allowed us to create initial models in hours and days because of the pre-defined nature of the modelling capabilities available.
Hopefully you can see that Text Analytics is an incredibly useful approach to supporting your business needs!
Thank you!