Salesforce.com uses Hadoop to analyze the large volumes of customer data generated by more than 130,000 customers and 800 million daily transactions, tracking product usage and customer behavior. Key use cases include analyzing product metrics to understand feature adoption, examining user behavior to improve products, and powering collaborative-filtering recommendations. The document outlines Salesforce.com's Hadoop ecosystem and the data pipelines used to collect, process, and visualize insights from petabytes of customer data.
This document discusses how Salesforce.com uses Hadoop for product metrics and analytics use cases. It describes how they collect feature usage data from log files, process the data using Pig scripts on Hadoop, and store metrics in custom objects. The metrics are then visualized in reports and dashboards, and product managers can collaborate on features using Chatter. This process helps Salesforce track adoption of features, monitor performance, and gain insights to improve products.
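The log-to-metrics step described above (collect feature usage from log files, aggregate with Pig, store the results) can be sketched in plain Python. This is an illustrative stand-in for a Pig GROUP/COUNT step, not the actual Salesforce pipeline; the log format and field names (org_id, feature) are hypothetical:

```python
from collections import Counter

# Hypothetical log format: "timestamp org_id feature_name"
# (illustrative only; the real Salesforce log schema is not public).
def aggregate_feature_usage(log_lines):
    """Count usage events per (org, feature) pair, mirroring a
    GROUP BY ... COUNT aggregation in a Pig script."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip malformed records
        _timestamp, org_id, feature = parts
        counts[(org_id, feature)] += 1
    return counts

logs = [
    "2012-06-01T10:00:00 org1 chatter_feed",
    "2012-06-01T10:01:00 org1 chatter_feed",
    "2012-06-01T10:02:00 org2 reports",
]
print(aggregate_feature_usage(logs)[("org1", "chatter_feed")])  # 2
```

Each resulting (org, feature) count would then be written out as a row, analogous to the metric records loaded into Custom Objects for reporting.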
Hadoop is the technology of choice for processing large data sets. At salesforce.com, we serve internal and product big data use cases using a combination of Hadoop, Java MapReduce, Pig, Force.com, and machine learning algorithms.
In this webinar, you will learn about an internal use case and a product use case:
:: Product Metrics: Internally, we measure feature usage using a combination of Hadoop, Pig, and the Force.com platform (Custom Objects and Analytics).
:: Community-Based Recommendations: In Chatter, our most successful people and file recommendations are built on a collaborative filtering algorithm that is implemented on Hadoop using Java MapReduce.
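At the heart of such a collaborative filter is co-occurrence counting: items (people, files) followed by the same users are likely to interest each other's followers. A minimal sketch follows, in Python rather than Java MapReduce for brevity; the user and file names are made up:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_scores(user_items):
    """For every pair of items followed by the same user, count the
    co-occurrence. Items that frequently co-occur with things a user
    already follows become recommendation candidates."""
    pair_counts = defaultdict(int)
    for items in user_items.values():
        for a, b in combinations(sorted(set(items)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

follows = {
    "alice": ["fileA", "fileB"],
    "bob":   ["fileA", "fileB", "fileC"],
    "carol": ["fileB", "fileC"],
}
scores = cooccurrence_scores(follows)
print(scores[("fileA", "fileB")])  # 2
```

In a MapReduce formulation, the mapper would emit the item pairs per user and the reducer would sum the counts; at Chatter scale this distribution is what makes the computation feasible.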
Hadoop is used at Salesforce for several big data use cases including product metrics, user behavior analysis, capacity planning, and collaborative filtering. For product metrics, Hadoop collects and analyzes log data from over 130,000 customers to track feature usage, standard metrics, and metrics across channels. It generates reports and dashboards to provide insights to executives and product managers.
Hadoop is the technology of choice for processing large data sets. Force.com provides a great metadata layer to define Hadoop jobs and store job output (Custom Objects). Force.com also comes with a great visualization layer (Reports & Dashboards) to chart and trend the output from Hadoop jobs. In this session, we will explore a real-life use case that combines these technologies to provide a compelling big data processing framework.
Video: http://www.youtube.com/watch?v=BT8WvQMMaV0
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
This presentation provides valuable insights into effective cost-saving techniques on AWS. Learn how to optimize your AWS resources by rightsizing, increasing elasticity, picking the right storage class, and choosing the best pricing model. Additionally, discover essential governance mechanisms to ensure continuous cost efficiency. Whether you are new to AWS or an experienced user, this presentation provides clear and practical tips to help you reduce your cloud costs and get the most out of your budget.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
A Comprehensive Guide to DeFi Development Services in 2024Intelisync
DeFi represents a paradigm shift in the financial industry. Instead of relying on traditional, centralized institutions like banks, DeFi leverages blockchain technology to create a decentralized network of financial services. This means that financial transactions can occur directly between parties, without intermediaries, using smart contracts on platforms like Ethereum.
In 2024, we are witnessing an explosion of new DeFi projects and protocols, each pushing the boundaries of what’s possible in finance.
In summary, DeFi in 2024 is not just a trend; it’s a revolution that democratizes finance, enhances security and transparency, and fosters continuous innovation. As we proceed through this presentation, we'll explore the various components and services of DeFi in detail, shedding light on how they are transforming the financial landscape.
At Intelisync, we specialize in providing comprehensive DeFi development services tailored to meet the unique needs of our clients. From smart contract development to dApp creation and security audits, we ensure that your DeFi project is built with innovation, security, and scalability in mind. Trust Intelisync to guide you through the intricate landscape of decentralized finance and unlock the full potential of blockchain technology.
Ready to take your DeFi project to the next level? Partner with Intelisync for expert DeFi development services today!
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfChart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Dreamforce_2012_Hadoop_Use_Cases
1. How Salesforce.com Uses Hadoop
Some Data Science Use Cases
Narayan Bharadwaj, salesforce.com, @nadubharadwaj
Jed Crosby, salesforce.com, @JedCrosby
2. Safe Harbor
Safe harbor statement under the Private Securities Litigation Reform Act of 1995:
This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties
materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results
expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be
deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other
financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any
statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services.
The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new
functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our
operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of
intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we
operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new
releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization
and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of
salesforce.com, inc. is included in our quarterly report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012. These
documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of
our Web site.
Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently
available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based
upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-
looking statements.
3. Agenda
• Technology
• Hadoop use cases
• Use case discussion
• Product Metrics
• User Behavior Analysis
• Collaborative Filtering
• Q&A
Every time you see the elephant, we will attempt to explain a Hadoop-related concept.
4. Got “Cloud Data”?
130k customers 800 million transactions/day
Millions of users Terabytes/day
6. Hadoop Overview
- Started by Doug Cutting at Yahoo!
- Based on two Google papers
Google File System (GFS): http://research.google.com/archive/gfs.html
Google MapReduce: http://research.google.com/archive/mapreduce.html
- Hadoop is an open source Apache project
Hadoop Distributed File System (HDFS)
Distributed Processing Framework (MapReduce)
- Several related projects
HBase, Hive, Pig, Flume, ZooKeeper, Mahout, Oozie, HCatalog
12. Product Metrics – Problem Statement
Track feature usage/adoption across 130k+ customers
Eg: Accounts, Contacts, Visualforce, Apex,…
Track standard metrics across all features
Eg: #Requests, #UniqueOrgs, #UniqueUsers, AvgResponseTime,…
Track features and metrics across all channels
API, UI, Mobile
Primary audience: Executives, Product Managers
13. Data Pipeline
[Diagram] Feature Metadata (Instrumentation) defines the feature — the "what?" Storage & Processing — the "how?" — crunches it into a Daily Summary (Output), which is visualized in a fancy UI.
14. Product Metrics Pipeline
[Diagram] Log files land in Hadoop. A Log Pull workflow copies them to a client machine, where a Java program (a Pig script generator) drives processing. Summary results flow through the API into Feature Metrics and Trend Metrics custom objects, which carry formula and workflow fields plus user input, and are surfaced through reports, dashboards, and page layouts.
15. Feature Metrics (Custom Object)
Id Feature Name PM Instrumentation Metric1 Metric2 Metric3 Metric4 Status
F0001 Accounts John /001 #requests #UniqOrgs #UniqUsers AvgRT Dev
F0002 Contacts Nancy /003 #requests #UniqOrgs #UniqUsers AvgRT Review
F0003 API Eric A #requests #UniqOrgs #UniqUsers AvgRT Deployed
F0004 Visualforce Roger V #requests #UniqOrgs #UniqUsers AvgRT Decom
F0005 Apex Kim axapx #requests #UniqOrgs #UniqUsers AvgRT Deployed
F0006 Custom Objects Chun /aXX #requests #UniqOrgs #UniqUsers AvgRT Deployed
F0008 Chatter Jed chcmd #requests #UniqOrgs #UniqUsers AvgRT Deployed
F0009 Reports Steve R #requests #UniqOrgs #UniqUsers AvgRT Deployed
20. Basic Pig Script Construct
-- Define UDFs
DEFINE GFV GetFieldValue('/path/to/udf/file');
-- Load data
A = LOAD '/path/to/cloud/data/log/files' USING PigStorage();
-- Filter data
B = FILTER A BY GFV(*, 'logRecordType') == 'U';
-- Extract fields
C = FOREACH B GENERATE GFV(*, 'orgId') AS orgId, GFV(*, 'userId') AS userId, ……
-- Group
G = GROUP C BY ……
-- Compute output metrics
O = FOREACH G {
  orgs = C.orgId;
  uniqueOrgs = DISTINCT orgs;
  GENERATE group, COUNT(uniqueOrgs);
}
-- Store or dump results
STORE O INTO '/path/to/user/output';
29. Problem Statement
How do we reduce the number of clicks in the user interface?
Need to understand top user click paths. What are they typically trying to do?
What are the user clusters/personas?
Approach:
• Markov transition for click path, D3.js visuals
• K-means (unsupervised) clustering for user groups
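As an illustration of the first bullet, a first-order Markov transition matrix can be tallied from click sessions in a few lines. This is only a sketch — the session data and page names below are hypothetical, and the production version runs over log files on Hadoop:

```python
from collections import Counter, defaultdict

# Hypothetical click sessions (page-view sequences); the real input
# would be parsed from log files.
sessions = [
    ["home", "accounts", "contacts"],
    ["home", "accounts", "reports"],
    ["home", "reports", "dashboards"],
]

# Tally first-order Markov transitions between consecutive clicks.
transitions = defaultdict(Counter)
for s in sessions:
    for src, dst in zip(s, s[1:]):
        transitions[src][dst] += 1

# Normalize each row into transition probabilities.
probs = {
    src: {dst: n / sum(cnt.values()) for dst, n in cnt.items()}
    for src, cnt in transitions.items()
}

print(probs["home"]["accounts"])  # 2 of 3 sessions go home -> accounts
```

The resulting matrix feeds the click-path visuals (D3.js) and, as per-user feature vectors, the k-means clustering step.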
36. We found this relationship using item-to-item collaborative
filtering
Amazon published this algorithm in 2003.
Amazon.com Recommendations: Item-to-Item Collaborative Filtering, by
Gregory Linden, Brent Smith, and Jeremy York. IEEE Internet Computing,
January-February 2003.
At Salesforce, we adapted this algorithm for Hadoop, and we use
it to recommend files to view and users to follow.
37. Example: CF on 5 files
Vision Statement
Annual Report
Dilbert Comic
Darth Vader Cartoon
Disk Usage Report
38. View History Table
                 Annual  Vision     Dilbert  Darth Vader  Disk Usage
                 Report  Statement  Cartoon  Cartoon      Report
Miranda (CEO)      1       1          1         0            0
Bob (CFO)          1       1          1         0            0
Susan (Sales)      0       1          1         1            0
Chun (Sales)       0       0          1         1            0
Alice (IT)         0       0          1         1            1
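The table above can be re-encoded to compute the two quantities the algorithm needs — file popularities and pairwise co-view tallies. A minimal sketch, not the production MapReduce code:

```python
# The view-history matrix from the slide, columns in the same order.
files = ["Annual Report", "Vision Statement", "Dilbert Cartoon",
         "Darth Vader Cartoon", "Disk Usage Report"]
views = {
    "Miranda": [1, 1, 1, 0, 0],
    "Bob":     [1, 1, 1, 0, 0],
    "Susan":   [0, 1, 1, 1, 0],
    "Chun":    [0, 0, 1, 1, 0],
    "Alice":   [0, 0, 1, 1, 1],
}

# Popularity of a file = number of users who viewed it.
popularity = [sum(row[i] for row in views.values()) for i in range(len(files))]

# Relationship tally between two files = number of users who viewed both.
def tally(i, j):
    return sum(row[i] * row[j] for row in views.values())

print(popularity)   # Dilbert Cartoon was viewed by all 5 users
print(tally(2, 3))  # Dilbert / Darth Vader co-views
```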
39. Relationships Between the Files
[Diagram] The five files drawn as nodes — Annual Report, Vision Statement, Dilbert Cartoon, Darth Vader Cartoon, Disk Usage Report — with edges connecting files viewed by the same users.
40. Relationships Between the Files
[Diagram: co-view tallies between file pairs]
Annual Report – Vision Statement: 2
Annual Report – Dilbert Cartoon: 2
Vision Statement – Dilbert Cartoon: 3
Vision Statement – Darth Vader Cartoon: 1
Dilbert Cartoon – Darth Vader Cartoon: 3
Dilbert Cartoon – Disk Usage Report: 1
Darth Vader Cartoon – Disk Usage Report: 1
All other pairs: 0
41. Sorted Relationships for Each File
Annual Report:       Dilbert (2), Vision Stmt. (2)
Vision Statement:    Dilbert (3), Annual Rpt. (2), Darth Vader (1)
Dilbert Cartoon:     Vision Stmt. (3), Darth Vader (3), Annual Rpt. (2), Disk Usage (1)
Darth Vader Cartoon: Dilbert (3), Vision Stmt. (1), Disk Usage (1)
Disk Usage Report:   Dilbert (1), Darth Vader (1)
The popularity problem: notice that Dilbert appears first in every list. This is
probably not what we want.
The solution: divide the relationship tallies by file popularities.
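Concretely, each tally is divided by the square root of the product of the two files' popularities — a cosine-style normalization, consistent with the sqrt(3/5) worked example later in the deck. A small sketch using the counts from this example:

```python
import math

# Popularities and co-view tallies from the example slides.
popularity = {"Annual": 2, "Vision": 3, "Dilbert": 5, "Vader": 3, "Disk": 1}
tallies = {("Dilbert", "Vader"): 3, ("Annual", "Vision"): 2,
           ("Dilbert", "Disk"): 1}

def normalized(pair):
    # Divide the tally by sqrt(popularity_a * popularity_b).
    a, b = pair
    return tallies[pair] / math.sqrt(popularity[a] * popularity[b])

print(round(normalized(("Dilbert", "Vader")), 2))   # 0.77
print(round(normalized(("Annual", "Vision")), 2))   # 0.82
print(round(normalized(("Dilbert", "Disk")), 2))    # 0.45
```

These reproduce the .77, .82, and .45 scores on the normalized-relationships slide.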
42. Normalized Relationships Between the Files
[Diagram: normalized similarity scores between file pairs]
Annual Report – Vision Statement: .82
Annual Report – Dilbert Cartoon: .63
Vision Statement – Dilbert Cartoon: .77
Vision Statement – Darth Vader Cartoon: .33
Dilbert Cartoon – Darth Vader Cartoon: .77
Dilbert Cartoon – Disk Usage Report: .45
Darth Vader Cartoon – Disk Usage Report: .58
All other pairs: 0
43. Sorted relationships for each file, normalized by file popularities
Annual Report:       Vision Stmt. (.82), Dilbert (.63)
Vision Statement:    Annual Report (.82), Dilbert (.77), Darth Vader (.33)
Dilbert Cartoon:     Vision Stmt. (.77), Darth Vader (.77), Annual Report (.63), Disk Usage (.45)
Darth Vader Cartoon: Dilbert (.77), Disk Usage (.58), Vision Stmt. (.33)
Disk Usage Report:   Darth Vader (.58), Dilbert (.45)
High relationship tallies AND similar popularity values now drive closeness.
44. The Item-to-Item CF Algorithm
1) Compute file popularities
2) Compute relationship tallies and divide by file popularities
3) Sort and store the results
45. MapReduce Overview
Map Shuffle Reduce
(adapted from http://code.google.com/p/mapreduce-framework/wiki/MapReduce)
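A toy simulation of the three phases in a few lines of Python — illustrative only, since the deck's actual jobs are written as Java MapReduce on Hadoop:

```python
from collections import defaultdict
from itertools import chain

def map_reduce(records, mapper, reducer):
    # Map: each record emits zero or more (key, value) pairs.
    mapped = chain.from_iterable(mapper(r) for r in records)
    # Shuffle: group values by key.
    groups = defaultdict(list)
    for k, v in mapped:
        groups[k].append(v)
    # Reduce: fold each group of values into a final result.
    return {k: reducer(k, vs) for k, vs in groups.items()}

# Word count, the canonical MapReduce example.
lines = ["hadoop pig hive", "pig hadoop", "hadoop"]
counts = map_reduce(
    lines,
    mapper=lambda line: [(w, 1) for w in line.split()],
    reducer=lambda word, ones: sum(ones),
)
print(counts["hadoop"])  # 3
```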
46. 1. Compute File Popularities
<user, file>
Inverse identity map
<file, List<user>>
Reduce
<file, (user count)>
Result is a table of (file, popularity) pairs that you store in the Hadoop distributed cache.
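The same step, simulated outside Hadoop with a handful of hypothetical (user, file) view events:

```python
from collections import defaultdict

# Input: (user, file) view events.
events = [("Miranda", "Dilbert"), ("Bob", "Dilbert"), ("Susan", "Dilbert"),
          ("Chun", "Dilbert"), ("Alice", "Dilbert"),
          ("Miranda", "Annual Report"), ("Bob", "Annual Report")]

# Inverse identity map: emit (file, user), then shuffle groups by file.
by_file = defaultdict(set)
for user, f in events:
    by_file[f].add(user)

# Reduce: popularity = distinct viewer count per file.
popularity = {f: len(users) for f, users in by_file.items()}
print(popularity["Dilbert"])        # 5
print(popularity["Annual Report"])  # 2
```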
50. 2b. Tally the Relationship Votes − Just a Word Count, Where Each
Relationship Occurrence is a Word
<(file1, file2), Integer(1)>
Identity map
<(file1, file2), List<Integer(1)>>
Reduce: count and divide
by popularities
<file1, (file2, similarity score)>, <file2, (file1, similarity score)>
Note that we emit each result twice,
once for each file in the relationship.
51. Example 2b: the Dilbert/Darth Vader Relationship
<(Dilbert, Vader), Integer(1)>,
<(Dilbert, Vader), Integer(1)>,
<(Dilbert, Vader), Integer(1)>
Identity map
<(Dilbert, Vader), {1, 1, 1}>
Reduce: count and divide
by popularities
<Dilbert, (Vader, sqrt(3/5))>, <Vader, (Dilbert, sqrt(3/5))>
52. 3. Sort and Store Results
<file1, (file2, similarity score)>
Identity map
<file1, List<(file2, similarity score)>>
Reduce
<file1, {top n similar files}>
Store the results in your location of choice
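A sketch of the sort-and-truncate step, using the normalized scores from the example files; the top-n cutoff of 3 is an assumption for illustration:

```python
# For each file, the shuffled-together (other file, similarity) pairs.
similar = {
    "Dilbert Cartoon": [("Vision Stmt.", .77), ("Disk Usage", .45),
                        ("Darth Vader", .77), ("Annual Report", .63)],
}

TOP_N = 3  # hypothetical cutoff

# Reduce: sort by descending similarity and keep the top n.
recommendations = {
    f: sorted(pairs, key=lambda p: p[1], reverse=True)[:TOP_N]
    for f, pairs in similar.items()
}
print(recommendations["Dilbert Cartoon"][0][1])  # 0.77
```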
54. Appendix
Cosine formula and normalization trick to avoid the distributed
cache
cos(θ_AB) = (A · B) / (|A| |B|) = (A / |A|) · (B / |B|)
Mahout has CF
Asymptotic order of the algorithm is O(M·N²) in the worst case, but it is helped by sparsity.
The Google File System is a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients. MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.
Custom objects are custom database tables that allow you to store information unique to your organization.
The WSC (Web Service Connector) tool consumes the enterprise WSDL to put summary data back into Trend Metrics.