Text mining 
Fuzzy document classification 
Using Elasticsearch 
Lev Ozeryansky
Identity Card 
• Merger of Ankor and We! 
• Owned by Hilan (publicly traded on the Tel Aviv Stock Exchange) 
• Fast-growing IT integration company 
• Over 2000 systems installed and maintained 
• Over 1000 leading customers: hi-tech, industry, academia, banks, insurance 
• Strong technical team: over 45 engineers, professional services staff and project managers 
• Over 120 employees 
• Four main divisions: Infrastructure, Big Data, Cloud, Cyber
Technology Edge
What is classification 
• Document classification as document categorization. 
• Using classification. 
• Our classification data source. 
• What do we do with it? 
  • Java programmer. 
  • .NET programmer.
Data source
The mathematics 
• Let C be the set of classes 
• Let D be the set of documents 
• Classification function
Classification method 
• Cosine similarity 
• Function
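For reference, the standard definition of cosine similarity is given below. The symbols are chosen here for illustration and are not taken from the slides: d is a document vector and c a class vector, each holding n term weights.

```latex
\[
\operatorname{sim}(\mathbf{d}, \mathbf{c})
  = \cos\theta
  = \frac{\mathbf{d} \cdot \mathbf{c}}{\lVert \mathbf{d} \rVert \, \lVert \mathbf{c} \rVert}
  = \frac{\sum_{i=1}^{n} d_i c_i}
         {\sqrt{\sum_{i=1}^{n} d_i^{2}} \, \sqrt{\sum_{i=1}^{n} c_i^{2}}}
\]
```

With non-negative weights the value lies between 0 (no shared terms) and 1 (same direction), and the corresponding cosine distance is 1 minus this value.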
Build document class vector 
• Java programmer 
  • Java 
  • 5 
  • Hibernate 
• .NET programmer 
  • C# 
  • 5 
  • NHibernate
Let's index the classifiers 
• Add weights manually. 
• For Java programmer: 
  • Java = 0.7 
  • 5 = 0.5 
  • Hibernate = 0.3 
• For .NET programmer: 
  • C# = 0.7 
  • 5 = 0.5 
  • NHibernate = 0.3
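As a minimal sketch, the hand-weighted classes can be held as sparse term-to-weight mappings and projected onto a shared term order. The weights come from the slide; the lowercased term spellings and the helper names `vocabulary` and `to_dense` are assumptions made for illustration, not part of the talk.

```python
# Hand-assigned term weights per class, copied from the slide.
# Terms are lowercased here on the assumption that the analyzer lowercases tokens.
CLASS_VECTORS = {
    "Java programmer": {"java": 0.7, "5": 0.5, "hibernate": 0.3},
    ".NET programmer": {"c#": 0.7, "5": 0.5, "nhibernate": 0.3},
}


def vocabulary(class_vectors):
    """Return the sorted list of every term used by any class (hypothetical helper)."""
    return sorted({term for weights in class_vectors.values() for term in weights})


def to_dense(weights, vocab):
    """Project a sparse {term: weight} mapping onto a fixed term order."""
    return [weights.get(term, 0.0) for term in vocab]


VOCAB = vocabulary(CLASS_VECTORS)
JAVA_VECTOR = to_dense(CLASS_VECTORS["Java programmer"], VOCAB)
DOTNET_VECTOR = to_dense(CLASS_VECTORS[".NET programmer"], VOCAB)
```

Keeping one shared vocabulary means every class vector has the same length and the same term in each position, which is what a vector distance function needs.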
DEMO
w-shingling 
• In natural language processing a w-shingling is a set of 
unique "shingles"—contiguous subsequences of tokens in 
a document. (Wikipedia) 
• Tokenization 
• Elasticsearch analyze mechanism
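The definition above can be sketched in a few lines of Python. The whitespace tokenizer is a deliberate simplification of what an Elasticsearch analyzer does, and `ANALYZE_BODY` is only an assumed shape for an `_analyze` request using the `shingle` token filter; check it against the Elasticsearch version in use.

```python
def tokenize(text):
    """Very rough tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def shingles(tokens, w):
    """Return the set of unique w-token shingles (contiguous subsequences)."""
    if w < 1:
        raise ValueError("w must be >= 1")
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


# Assumed request body for Elasticsearch's _analyze API (illustrative only).
ANALYZE_BODY = {
    "tokenizer": "standard",
    "filter": [
        "lowercase",
        {"type": "shingle", "min_shingle_size": 2, "max_shingle_size": 2},
    ],
    "text": "a rose is a rose is a rose",
}

rose_shingles = shingles(tokenize("a rose is a rose is a rose"), 4)
```

For the sentence used here, only three distinct 4-token shingles exist, because the sequence repeats.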
DEMO
Classification process 
• Tokens array. 
• Classification query. 
  • Use terms query when terms array == tokens array 
• Two vectors 
  • Vector of filtered tokens 
  • Classification vector
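The steps above might look roughly like this in Python. This is a sketch under assumptions: the field name `terms`, the helper names, and the use of raw token counts for the document vector are all illustrative choices, not details confirmed by the slides.

```python
from collections import Counter

# Weights for the "Java programmer" class, taken from the earlier slide.
CLASS_WEIGHTS = {"java": 0.7, "5": 0.5, "hibernate": 0.3}


def terms_query(field, tokens):
    """Build a terms query body whose terms array is the (deduplicated) tokens array."""
    return {"query": {"terms": {field: sorted(set(tokens))}}}


def build_vectors(tokens, class_weights):
    """Return (filtered-token vector, classification vector) over the class's terms."""
    terms = sorted(class_weights)
    # Keep only tokens the class knows about, counting repeats (an assumption).
    counts = Counter(t for t in tokens if t in class_weights)
    doc_vec = [float(counts[t]) for t in terms]
    class_vec = [class_weights[t] for t in terms]
    return doc_vec, class_vec


tokens = "senior java developer with java 5 experience".split()
query = terms_query("terms", tokens)  # "terms" is a hypothetical field name
doc_vec, class_vec = build_vectors(tokens, CLASS_WEIGHTS)
```

The query finds candidate classes that share at least one token with the document; the two vectors are then compared term by term.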
DEMO
Classification process 
• Use SciPy to calculate the distance between the two vectors.
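A minimal sketch of this last step, assuming the distance meant is SciPy's cosine distance (`scipy.spatial.distance.cosine`, which returns 1 minus the cosine similarity). The pure-Python fallback, the `classify` helper and the example vectors are illustrative assumptions, not the talk's code.

```python
import math

try:
    from scipy.spatial.distance import cosine as cosine_distance
except ImportError:
    # Stand-in so the sketch still runs without SciPy installed.
    def cosine_distance(u, v):
        """Return 1 - cosine similarity of two equal-length vectors."""
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return 1.0 - dot / norm


def classify(doc_vec, class_vecs):
    """Return (best class name, similarity), or (None, 0.0) for an all-zero document."""
    if not any(doc_vec):
        return None, 0.0  # no known terms: cosine distance is undefined
    distances = {name: float(cosine_distance(doc_vec, vec))
                 for name, vec in class_vecs.items()}
    best = min(distances, key=distances.get)
    return best, 1.0 - distances[best]


# Term order assumed: ["5", "c#", "hibernate", "java", "nhibernate"]
CLASS_VECS = {
    "Java programmer": [0.5, 0.0, 0.3, 0.7, 0.0],
    ".NET programmer": [0.5, 0.7, 0.0, 0.0, 0.3],
}
label, score = classify([1.0, 0.0, 0.0, 2.0, 0.0], CLASS_VECS)
```

Because the result is a similarity score rather than a yes/no answer, a threshold can be applied to reject documents that are not close to any class.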
Q&A
