Machine Learning with Apache Mahout provides an overview of machine learning algorithms like classification, clustering, and recommendation and how they are implemented in Apache Mahout. It discusses how Mahout can be used to build recommender systems, classify documents, cluster data, and evaluate relevance. Examples are given of how Mahout could be used in applications like search engines to index and classify documents and provide personalized search results.
3. Machine Learning
Machine Learning is a subset of Artificial Intelligence
[Diagram labels: Artificial Intelligence, Machine Learning]
4. NoSQL, Search and Machine Learning
NoSQL, Search and Machine Learning complement each other very well!
[Diagram labels: Machine Learning, NoSQL, Search]
5. Machine Learning algorithms
• Recommendations
Suggest recommended items to the user
• Classification
Automatically classify documents based on a given set of examples
• Clustering
Automatically discover groups within a set of documents
• Pattern mining, evolutionary algorithms, ...
9. Recommendation use cases
• Recommend items to users on e-commerce websites
And increase revenue
• Suggest features a user may be interested in within a Web application
As most features are usually unknown to users
• Filter and adjust the scoring of search engine results
Based on similar users' clicks, ...
16. Clustering with K-Means
Cluster centers are moved in order to minimize the sum of distances
[Figure: data points A, B, C, D, E, F]
17. Clustering with K-Means
The data point C is then attached to the first center as it has become the nearest
[Figure: data points A, B, C, D, E, F]
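The two K-Means steps described on these slides (attach each point to its nearest center, then move each center) can be sketched in plain Java. This is a minimal illustration and not Mahout's implementation: the class name `KMeansSketch`, the one-dimensional sample points and the starting centers are all made up for the example.

```java
import java.util.Arrays;

final class KMeansSketch {
    /** Runs K-Means; updates centers in place and returns the center index of each point. */
    static int[] cluster(double[][] points, double[][] centers, int maxIterations) {
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            boolean changed = false;
            // Assignment step: attach each point to its nearest center.
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                for (int c = 1; c < centers.length; c++) {
                    if (squaredDistance(points[p], centers[c])
                            < squaredDistance(points[p], centers[best])) {
                        best = c;
                    }
                }
                if (assignment[p] != best) {
                    assignment[p] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // no point switched center, so the clustering is stable
            }
            // Update step: move each center to the mean of its attached points.
            for (int c = 0; c < centers.length; c++) {
                double[] sum = new double[centers[c].length];
                int count = 0;
                for (int p = 0; p < points.length; p++) {
                    if (assignment[p] == c) {
                        for (int d = 0; d < sum.length; d++) {
                            sum[d] += points[p][d];
                        }
                        count++;
                    }
                }
                if (count > 0) {
                    for (int d = 0; d < sum.length; d++) {
                        centers[c][d] = sum[d] / count;
                    }
                }
            }
        }
        return assignment;
    }

    static double squaredDistance(double[] a, double[] b) {
        double total = 0;
        for (int d = 0; d < a.length; d++) {
            total += (a[d] - b[d]) * (a[d] - b[d]);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical points A..F on a line, with two arbitrary starting centers.
        double[][] points = {{1}, {2}, {4}, {7}, {8}, {9}};
        double[][] centers = {{0}, {5}};
        int[] assignment = cluster(points, centers, 100);
        // C (value 4) starts with the second center, then moves to the first
        // once the centers have shifted.
        System.out.println(Arrays.toString(assignment)); // prints [0, 0, 0, 1, 1, 1]
    }
}
```

With this sample data, point C behaves as on the slide: it is first attached to the second center and switches to the first center after the centers move.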
18. Clustering use cases
• Find key topics in a set of documents
News feeds, business documents, ...
• Find typical behaviors within a set of users
Visit frequency, buying habits, ...
20. In a few words
• Implementation of machine learning algorithms in Java
Continuously growing collection of algorithms
• Most of them come in a MapReduce implementation for Hadoop
Scalable to huge datasets
• Still quite young but growing fast
Started in early 2009
• Intended to be for Machine Learning what Lucene is for Information Retrieval
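22. Recommendation example
// Classes come from Mahout's Taste packages (org.apache.mahout.cf.taste.*), plus java.io.File and java.util.List
// Load the user, item, preference data from a CSV file
DataModel model = new FileDataModel(new File("data.csv"));
// Compare users by the Pearson correlation of their preferences
UserSimilarity similarity =
new PearsonCorrelationSimilarity(model);
// Keep the 2 most similar users as the neighborhood
UserNeighborhood neighborhood =
new NearestNUserNeighborhood(2, similarity, model);
Recommender recommender =
new GenericUserBasedRecommender(model, neighborhood, similarity);
// Ask for 1 recommendation for user 1
List<RecommendedItem> recommendations =
recommender.recommend(1, 1);
The code for a basic recommendation is pretty straightforward!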
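To make the idea behind a user-based recommender more concrete, here is a plain-Java sketch of the same steps: Pearson similarity between users, a neighborhood of the nearest users, then a similarity-weighted estimate for unseen items. It is a simplified illustration, not Mahout's code. The class `UserBasedSketch`, its methods, the sample ratings and the choice to keep only positively correlated neighbors are assumptions made for the example. It needs Java 9 or later for `Map.of` and `Map.entry`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UserBasedSketch {
    /** Pearson correlation over the items both users rated; NaN when undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        List<Long> common = new ArrayList<>();
        for (Long item : a.keySet()) {
            if (b.containsKey(item)) common.add(item);
        }
        int n = common.size();
        if (n < 2) return Double.NaN;
        double meanA = 0, meanB = 0;
        for (Long item : common) { meanA += a.get(item); meanB += b.get(item); }
        meanA /= n;
        meanB /= n;
        double cov = 0, varA = 0, varB = 0;
        for (Long item : common) {
            double da = a.get(item) - meanA, db = b.get(item) - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return (varA == 0 || varB == 0) ? Double.NaN : cov / Math.sqrt(varA * varB);
    }

    /** Returns the unrated item with the highest estimated preference, or null if none. */
    static Long recommendOne(Map<Long, Map<Long, Double>> ratings, long userId, int neighborhoodSize) {
        Map<Long, Double> mine = ratings.get(userId);
        // Neighborhood: the most similar users (assumption: positive correlation only).
        List<Map.Entry<Long, Double>> neighbors = new ArrayList<>();
        for (Long other : ratings.keySet()) {
            if (other == userId) continue;
            double s = pearson(mine, ratings.get(other));
            if (!Double.isNaN(s) && s > 0) neighbors.add(Map.entry(other, s));
        }
        neighbors.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        if (neighbors.size() > neighborhoodSize) neighbors = neighbors.subList(0, neighborhoodSize);
        // Estimate each unseen item as the similarity-weighted mean of neighbor ratings.
        Map<Long, double[]> sums = new HashMap<>(); // item -> {weighted sum, total weight}
        for (Map.Entry<Long, Double> neighbor : neighbors) {
            for (Map.Entry<Long, Double> rating : ratings.get(neighbor.getKey()).entrySet()) {
                if (mine.containsKey(rating.getKey())) continue;
                double[] acc = sums.computeIfAbsent(rating.getKey(), k -> new double[2]);
                acc[0] += neighbor.getValue() * rating.getValue();
                acc[1] += neighbor.getValue();
            }
        }
        Long best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<Long, double[]> e : sums.entrySet()) {
            double score = e.getValue()[0] / e.getValue()[1];
            if (score > bestScore) { bestScore = score; best = e.getKey(); }
        }
        return best;
    }

    /** Hypothetical ratings: user id -> (item id -> preference). */
    static Map<Long, Map<Long, Double>> sampleRatings() {
        return Map.of(
                1L, Map.of(10L, 5.0, 20L, 3.0, 30L, 1.0),
                2L, Map.of(10L, 4.0, 20L, 3.0, 30L, 2.0, 40L, 5.0, 50L, 1.0),
                3L, Map.of(10L, 1.0, 20L, 3.0, 30L, 5.0, 40L, 1.0, 50L, 5.0),
                4L, Map.of(10L, 5.0, 20L, 4.0, 30L, 1.0, 40L, 4.0, 50L, 2.0));
    }

    public static void main(String[] args) {
        // Users 2 and 4 rate like user 1, and both rate item 40 highly.
        System.out.println(recommendOne(sampleRatings(), 1L, 2)); // prints 40
    }
}
```

In the sample data, user 3 has the opposite taste to user 1 and is left out of the neighborhood, so the estimate for item 40 comes only from users 2 and 4.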
29. A Search Engine
MyCustomer Search
Document Non Disclosure Agreement 12 days ago
... MyCustomer agrees not to disclose any part of ...
Document 2010 Sales Report 1 month ago
... MyCustomer: 12 M€ with 3 deals ...
Phone Call 2 days ago
Phone Call Customer: MyCustomer Time: 9:55am Duration: 13min
Description: Invoice not received for order #2354E
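30. Indexing Pipeline
[Diagram: a PDF goes through the Text Extractor (Tika) and an Analyzer (Lucene) into the Search Index; a Phone Call goes through an Analyzer (Lucene) into the same Search Index]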
31. A more complex Search Engine
MyCustomer Search
Sales Juridic Accounting
Document 2010 Sales Report 1 month ago
... MyCustomer: 12 M€ with 3 deals ...
Phone Call 2 days ago
Phone Call Customer: MyCustomer Time: 9:55am Duration: 13min
Description: Invoice not received for order #2354E
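32. Indexing Pipeline with Mahout
[Diagram: a PDF goes through the Text Extractor (Tika), a Classifier (Mahout) and an Analyzer (Lucene) into the Search Index; a Phone Call goes through a Classifier (Mahout) and an Analyzer (Lucene) into the same Search Index]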
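The pipeline on this slide (extract text, classify, analyze, index) can be sketched in plain Java to show where a classification step fits. Nothing below uses the Tika, Mahout or Lucene APIs: the keyword-counting classifier is a stand-in for a trained model, the analyzer is a simple lowercase tokenizer, and the class `IndexingPipelineSketch`, its methods and the keyword lists are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class IndexingPipelineSketch {
    private final Map<String, Set<String>> keywordsByCategory;
    // category -> (term -> ids of the documents containing that term)
    private final Map<String, Map<String, Set<Integer>>> index = new HashMap<>();

    IndexingPipelineSketch(Map<String, Set<String>> keywordsByCategory) {
        this.keywordsByCategory = keywordsByCategory;
    }

    /** Stand-in analyzer: lowercases and splits on non-alphanumeric characters. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) tokens.add(token);
        }
        return tokens;
    }

    /** Stand-in classifier: picks the category with the most keyword hits. */
    String classify(String text) {
        List<String> tokens = analyze(text);
        String best = "Other";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : keywordsByCategory.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) hits++;
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    /** Classifies already extracted text, then indexes its terms under that category. */
    String add(int docId, String extractedText) {
        String category = classify(extractedText);
        Map<String, Set<Integer>> postings = index.computeIfAbsent(category, c -> new HashMap<>());
        for (String token : analyze(extractedText)) {
            postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
        }
        return category;
    }

    /** Looks up a term inside one category, as a category filter in the search UI would. */
    Set<Integer> search(String category, String term) {
        return index.getOrDefault(category, Map.of())
                .getOrDefault(term.toLowerCase(Locale.ROOT), Set.of());
    }

    public static void main(String[] args) {
        IndexingPipelineSketch pipeline = new IndexingPipelineSketch(Map.of(
                "Sales", Set.of("sales", "deals", "report"),
                "Juridic", Set.of("agreement", "disclose", "disclosure"),
                "Accounting", Set.of("invoice", "order")));
        System.out.println(pipeline.add(1, "Non Disclosure Agreement: MyCustomer agrees not to disclose any part of"));
        System.out.println(pipeline.add(2, "2010 Sales Report: MyCustomer, 12 M EUR with 3 deals"));
        System.out.println(pipeline.add(3, "Phone call: invoice not received for order #2354E"));
        System.out.println(pipeline.search("Accounting", "invoice"));
    }
}
```

Storing the category next to each indexed term is what lets the search screen offer filters such as Sales, Juridic and Accounting.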
33. Query pipeline
[Diagram: Query, then Analyzer (Lucene), then Search Index, then Results]
34. Query pipeline with Mahout
[Diagram: Query, then Analyzer (Lucene), then Search Index, then Custom Scoring using Mahout recommendations, then Results]
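A custom scoring step of this kind could, for instance, combine each hit's search score with a preference estimate for the current user. The sketch below is one possible formula in plain Java, not something taken from the slides or from the Lucene or Mahout APIs: the class `RescoringSketch`, the multiplicative boost and the `weight` parameter are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class RescoringSketch {
    /** Boosts a search score by an estimated preference (assumed to lie in 0..1). */
    static double combinedScore(double searchScore, double preference, double weight) {
        return searchScore * (1.0 + weight * preference);
    }

    /** Returns document ids ordered by combined score, best first. */
    static List<String> rerank(Map<String, Double> searchScores,
                               Map<String, Double> preferences,
                               double weight) {
        List<String> ids = new ArrayList<>(searchScores.keySet());
        Comparator<String> byCombinedScore = Comparator.comparingDouble(
                id -> combinedScore(searchScores.get(id), preferences.getOrDefault(id, 0.0), weight));
        ids.sort(byCombinedScore.reversed());
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical scores from the search index and from a recommender.
        Map<String, Double> searchScores = Map.of("doc-a", 1.0, "doc-b", 0.9, "doc-c", 0.5);
        Map<String, Double> preferences = Map.of("doc-b", 0.8, "doc-c", 0.2);
        System.out.println(rerank(searchScores, preferences, 0.5)); // prints [doc-b, doc-a, doc-c]
    }
}
```

Documents without a preference estimate keep their original search score, so the recommender can only promote results and never hides them.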
35. Conclusion
• Machine learning brings many valuable features to enterprises
Increased revenue, better productivity, user adoption, ...
• Mahout is growing fast and is becoming a great choice for Java apps
With easy integration into business applications
• Business people are not used to this kind of use case
Collaboration with technical folks is mandatory