The document presents an approach to clustering web pages using a fuzzy logic-based representation and self-organizing maps. The approach, called Extended Fuzzy Combination of Criteria (EFCC), uses fuzzy logic to combine heuristic criteria based on HTML tags and word positions to represent documents. An experiment is described where EFCC is used to represent documents that are then clustered using self-organizing maps. The results and conclusion of the experiment are also presented.
This document summarizes a study investigating the directionality of higher-order auditory pathways using MEG. It presents an overview of the theoretical background, task paradigm, data analysis procedures, and hypotheses. The study aims to determine causality in source space between Broca's area and the posterior superior temporal gyrus using a picture selection task involving sentences with varying word orders. Acquisition will involve MRI, MEG, and DWI from 30 children to examine the development of dorsal pathway II and its role in early top-down modulation during language processing.
This study used signal detection theory to examine how neuroscientists identify the default mode network compared to other prominent resting-state networks. Twenty participants were asked to distinguish the default mode network from three other networks in a rapid forced-choice task, where the networks were presented at different signal thresholds. Results showed that participants more accurately identified the default mode network when it was presented at the most stringent threshold, and made the most conservative decisions when networks were not thresholded. These findings suggest that thresholding fMRI data improves accuracy in identifying brain networks.
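As a hedged sketch (not from the study), the standard signal detection measures behind such results, sensitivity (d') and response criterion (c), can be computed from hit and false-alarm rates:

```python
from statistics import NormalDist

def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Sensitivity index: z(hit rate) minus z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def criterion(hit_rate: float, fa_rate: float) -> float:
    """Response criterion c; larger values mean more conservative responding."""
    z = NormalDist().inv_cdf
    return -(z(hit_rate) + z(fa_rate)) / 2

# Illustrative rates, not the study's data: 90% hits, 20% false alarms.
print(round(d_prime(0.9, 0.2), 3))   # sensitivity well above chance
print(round(criterion(0.9, 0.2), 3)) # slightly liberal criterion
```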
New Challenges in Learning Classifier Systems: Mining Rarities and Evolving F... (Albert Orriols-Puig)
The document discusses new challenges in learning classifier systems (LCS) when dealing with domains containing rare classes. It proposes using a design decomposition approach to analyze how LCS address rare classes. Specifically, it examines how the extended classifier system (XCS) handles rare classes. It identifies five critical elements of LCS that are important for detecting small niches associated with rare classes: 1) estimating classifier parameters correctly, 2) providing representatives of rare niches during initialization, 3) generating and growing representatives of rare niches, 4) adjusting the genetic algorithm application rate, and 5) ensuring representatives of rare niches dominate their niches. The document focuses on analyzing the first element, estimating classifier parameters, for XCS when dealing with domains containing rare classes.
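A minimal sketch of that first element (illustrative Python, not from the document): XCS updates a classifier's reward prediction with the Widrow-Hoff delta rule, so a classifier in a rarely activated niche receives few updates and its estimate stays far from the true payoff.

```python
def update_prediction(p: float, reward: float, beta: float = 0.2) -> float:
    """Widrow-Hoff update: move the estimate a fraction beta toward the reward."""
    return p + beta * (reward - p)

# A classifier in a frequent niche sees many updates and converges...
p = 0.0
for _ in range(50):
    p = update_prediction(p, reward=1000.0)

# ...but one in a rare niche sees only a few and remains badly estimated.
q = 0.0
for _ in range(3):
    q = update_prediction(q, reward=1000.0)

print(round(p), round(q))  # prints: 1000 488
```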
This document provides an overview of genetic fuzzy systems. It begins with a recap of supervised and unsupervised machine learning techniques. It then discusses fuzzy logics and how they can be used to represent imprecise concepts using membership functions. Fuzzy systems that use fuzzy logic to model relationships between variables are introduced. Genetic fuzzy systems combine fuzzy systems with genetic algorithms to design the fuzzy rules, membership functions, and inference engines. The genetic algorithm evolves populations of candidate fuzzy systems through selection, crossover and mutation to optimize system performance.
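To make the idea concrete (a toy sketch, not the document's system), a bare-bones genetic algorithm can tune the peak of a triangular membership function against sample data, using exactly the selection-and-mutation loop described above:

```python
import random

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical target: a membership function peaking at 5.0.
samples = [(x, tri(x, 2.0, 5.0, 8.0)) for x in range(11)]

def fitness(b):
    # Negative squared error of a candidate peak position against the samples.
    return -sum((tri(x, 2.0, b, 8.0) - y) ** 2 for x, y in samples)

random.seed(0)
pop = [random.uniform(2.5, 7.5) for _ in range(20)]
for _ in range(40):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]  # selection: keep the fittest half
    children = [min(7.8, max(2.2, b + random.gauss(0, 0.3))) for b in parents]
    pop = parents + children  # elitism plus mutated offspring
best = max(pop, key=fitness)
print(round(best, 2))  # close to the target peak of 5.0
```

A full genetic fuzzy system would evolve whole rule bases and several membership functions at once; this one-parameter version only shows the evolutionary loop.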
Prerequisites of AI Techniques Making Robot To Perform Task With Human (autos... (ejaruuday)
The document discusses techniques for developing cognitive skills in robots to allow them to interact with humans. It describes using a combination of conceptual categories and RDF triples stored in an ORO knowledge server to represent a robot's knowledge. The robot uses various modules like SPARK for spatial reasoning and situation assessment to perceive objects, and gains knowledge through symbolic reasoning, theory of mind modeling, and a working memory model. Key techniques discussed include using OpenCyc concepts to structure a robot's commonsense knowledge and allowing it to reason about objects from different perspectives.
CCIA'2008: Can Evolution Strategies Improve Learning Guidance in XCS? Design ... (Albert Orriols-Puig)
This document proposes using evolution strategies (ES) instead of genetic algorithms (GA) in the rule-discovery component of the XCS learning classifier system. It designs an ES-based XCS with a modified classifier representation and ES-based variation operators. Experiments on real-world datasets show the ES-based XCS outperforms the GA-based XCS when only selection and mutation are used, though there is no significant difference once crossover is added. Further research is suggested to determine when different search operators should be used.
The document discusses fuzzy logic and how it can be applied to strategy games. It introduces fuzzy logic as a way to represent problems with approximate reasoning rather than binary true/false logic. It then discusses how fuzzy logic could be used to determine whether a monster should attack or flee in a game based on fuzzy sets and membership degrees. Finally, it provides some potential solutions for implementing fuzzy logic in games, such as using defuzzification methods to generate crisp outputs from fuzzy inputs.
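A minimal Python sketch of such a controller (the membership functions and the simple ratio defuzzification are assumptions for illustration, not taken from the document):

```python
def healthy(h):
    """Membership of 'healthy' over a health scale of 0..100."""
    return max(0.0, min(1.0, (h - 20) / 60))

def threatened(d):
    """Membership of 'enemy close' over distance; closer means higher."""
    return max(0.0, min(1.0, (50 - d) / 40))

def decide(health, enemy_distance):
    attack = min(healthy(health), threatened(enemy_distance))    # rule: healthy AND close
    flee = min(1 - healthy(health), threatened(enemy_distance))  # rule: weak AND close
    total = attack + flee
    # Crude defuzzification: crisp output from the relative rule strengths.
    aggression = attack / total if total else 0.5
    return "attack" if aggression >= 0.5 else "flee"

print(decide(90, 10), decide(25, 10))  # prints: attack flee
```

A full implementation would use a proper defuzzification method such as the centroid of the aggregated output set; the ratio here just keeps the sketch short.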
This document discusses integrating web GIS applications with monitoring tools for analysis and reporting. It provides an overview of GIS applications and web GIS, demonstrates a web GIS map application, and discusses monitoring the availability, performance, and usage of GIS services. The architecture of monitoring tools is explained, including data collection from GIS servers, Windows performance counters, and log files. Examples of dashboard reports on summary data, uptime, usage, and performance from the monitoring tools are also shown.
Developing Efficient Web-based GIS Applications (Swetha A)
The document discusses technologies for developing efficient web-based GIS applications. It describes mapping technologies like static map renderers, slippy maps, and Flash mapping. It also covers database technologies like Oracle, SQL Server, and normalization. Development standards discussed include web wireframing, languages like ASP and PHP, protocols like SOAP, and a three-tier architecture. The conclusion recommends Flash mapping or slippy maps, Oracle database, wireframing, SOAP protocol, and a three-tier architecture for developing efficient web-based GIS applications.
Synthetic Aperture Radar (SAR) uses signal processing techniques to synthesize a large antenna from data collected by a physically small antenna as it moves along a flight path. This allows SAR to achieve high-resolution images independent of altitude. SAR transmits microwave pulses and analyzes the returned echoes to build up images of the terrain. SAR has various applications including topographic mapping and measuring ocean waves, currents, and wind. Ocean backscatter measured by SAR is influenced by surface roughness driven by factors like wind as well as hydrodynamic effects of waves and currents.
This document provides an overview of synthetic aperture radar (SAR). SAR uses the motion of a radar antenna mounted on a moving platform to synthesize a large antenna and create high-resolution radar images. It describes the basic principles of SAR, including how successive radar pulses are transmitted and echoes received to build up an image. Applications of SAR include remote sensing, mapping, and monitoring changes over time. Spectral estimation techniques are used to process SAR data and improve resolution. Polarimetry and interferometry are additional SAR techniques. Typical SAR systems are mounted on aircraft or satellites.
MISSION TO PLANETS (CHANDRAYAAN, MAVEN, CURIOSITY, MANGALYAAN, CASSINI SOLSTICE M... (Swetha A)
This document summarizes several planetary exploration missions. It discusses Chandrayaan-1, India's first lunar orbiter mission, and its objectives to map lunar minerals and terrain. It also discusses NASA's MAVEN and Curiosity missions to Mars, with MAVEN studying the Martian atmosphere and Curiosity analyzing samples to search for evidence of past life. Additionally, it summarizes Cassini's ongoing mission in orbit around Saturn, making discoveries about the rings and moons like Titan and Enceladus.
This document provides an overview of synthetic aperture radar (SAR) basics and theory. It discusses key aspects of SAR including how it works, imaging geometry, spatial resolution, backscatter coefficients, common frequency bands, and advanced modes. SAR uses microwave radiation and can image the Earth's surface in all weather and light conditions, providing complementary data to optical remote sensing. It discusses concepts such as range and azimuth resolution, factors that influence backscatter, and challenges like speckle that SAR addresses through techniques like multi-look processing.
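The first-order resolution relations behind these concepts can be sketched as follows (standard textbook formulas, assumed rather than taken from the document): pulse compression gives a slant-range resolution of c/2B, and ideal strip-map SAR gives an azimuth resolution of L/2, which is what makes the resolution independent of altitude.

```python
C = 3.0e8  # speed of light, m/s

def range_resolution(bandwidth_hz: float) -> float:
    """Slant-range resolution of a pulse-compressed radar: c / (2B)."""
    return C / (2 * bandwidth_hz)

def azimuth_resolution(antenna_length_m: float) -> float:
    """Ideal strip-map SAR azimuth resolution: L / 2, independent of range."""
    return antenna_length_m / 2

print(range_resolution(100e6))   # 1.5 m for a 100 MHz chirp bandwidth
print(azimuth_resolution(10.0))  # 5.0 m for a 10 m physical antenna
```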
Jerry Clough presents techniques for analyzing OpenStreetMap data using QGIS. He discusses using OSM data to simulate the European Urban Atlas project and mapping retail locations. Case studies include analyzing pub density in Britain, simulating land use classification, and tracking street light and retail mappings. Challenges with OSM data like polygon overlaps and tagging variations are also covered.
Map to Image Georeferencing using ERDAS software (Swetha A)
The document provides steps to georeference a satellite image using ERDAS software. It involves opening the image and a georeferenced toposheet in separate viewers, selecting ground control points that match features in both, and using a polynomial geometric model to resample the image. At least 4 GCPs should be selected to georeference the image, which can then be verified using swipe and transparency tools to check the alignment of features.
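As an illustrative aside (hypothetical GCP values, not from the document), a first-order polynomial model of the kind described is an affine transform fitted by least squares; three GCPs determine it exactly, and a fourth lets residuals be checked, which is why at least four are recommended:

```python
import numpy as np

# Hypothetical ground control points: (pixel col, pixel row) -> (map X, map Y).
gcps_img = np.array([[0, 0], [100, 0], [0, 100], [100, 100]], float)
gcps_map = np.array([[500000, 900000], [500030, 900000],
                     [500000, 899970], [500030, 899970]], float)

# First-order polynomial: X = a0 + a1*col + a2*row, and likewise for Y.
A = np.column_stack([np.ones(len(gcps_img)), gcps_img])
coef_x, *_ = np.linalg.lstsq(A, gcps_map[:, 0], rcond=None)
coef_y, *_ = np.linalg.lstsq(A, gcps_map[:, 1], rcond=None)

def to_map(col, row):
    """Transform a pixel coordinate into the map coordinate system."""
    return (coef_x @ [1, col, row], coef_y @ [1, col, row])

print(to_map(50, 50))  # centre pixel, roughly (500015, 899985)
```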
The document discusses using hierarchical cluster analysis (HCA) to evaluate metabolomic sample processing methods. It describes two goals: 1) Use HCA to cluster samples based on raw data similarities and correlations to determine the impact of extraction and treatment methods on data variance. Extraction had the greatest effect, with ACN/IPA/water and MeOH/CHCl3/water samples most similar. 2) Use HCA to cluster metabolites based on z-scaled data and correlations to identify groups of related metabolites and evaluate the robustness of different correlation measures. Clusters extracted from the correlation-based dendrogram contained metabolites that shared biological functions.
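A toy illustration of the agglomerative idea behind HCA (a naive single-linkage sketch on 1-D values, standing in for the study's correlation-based distances):

```python
def single_linkage(points, k):
    """Naive agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]

    def dist(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(abs(a - b) for a in c1 for b in c2)

    while len(clusters) > k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)
    return clusters

print(single_linkage([1.0, 1.2, 5.0, 5.3, 9.9], 3))
# prints: [[1.0, 1.2], [5.0, 5.3], [9.9]]
```

Real HCA tools also record the merge heights to draw the dendrogram from which clusters are cut.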
The document describes how to use MATLAB's Fuzzy Logic Toolbox to solve fuzzy logic problems. It begins with an introduction to fuzzy logic and an overview of the toolbox. It then uses the example of balancing an inverted pendulum on a cart to demonstrate the fuzzy inference system design process. This involves defining membership functions, rules, and using toolbox tools to simulate the fuzzy controller.
This document provides information about an upcoming training course on advanced synthetic aperture radar (SAR) processing being offered by the Applied Technology Institute (ATI). The 2-day course will be held on May 6-7, 2009 in Chantilly, Virginia and will be instructed by Bart Huxtable. It will cover topics such as the origins of SAR, basic and advanced SAR processing techniques, interferometric SAR, spotlight-mode SAR, and polarimetric SAR. The course outline and schedule are provided along with instructor biographies and registration information. Additionally, the document advertises ATI's ability to provide on-site customized training courses.
This document discusses feature extraction and selection methods for principal component analysis. It provides an introduction to principal component analysis and how it can be used for dimensionality reduction by transforming correlated variables into a set of uncorrelated variables. The document serves as a tutorial on feature extraction, selection, and principal component analysis.
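A minimal PCA sketch (illustrative, not part of the tutorial): centre the data, eigendecompose the covariance matrix, and project onto the leading eigenvectors; the resulting scores are the uncorrelated variables the transformation produces.

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigendecomposition of the covariance matrix."""
    Xc = X - X.mean(axis=0)                 # centre each variable
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # sort descending by variance
    components = eigvecs[:, order[:n_components]]
    return Xc @ components                  # scores: uncorrelated new variables

# Correlated synthetic data: two variables mixed through a fixed matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[3.0, 1.0], [1.0, 0.5]])
scores = pca(X, 2)
print(np.corrcoef(scores, rowvar=False))  # off-diagonals near zero
```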
Radar 2009 a 14 airborne pulse doppler radar (Forward2025)
This document provides an overview of a lecture on airborne pulse Doppler radar systems. It discusses different airborne radar missions including fighter/interceptor radars like those used on F-16s and F-35s, as well as airborne early warning radars like AWACS. It covers topics like airborne radar clutter, pulse Doppler modes using different PRFs, and examples of military radars and their specifications. The goal is to explain the considerations and techniques involved in airborne pulse Doppler radar system design and operation.
This document discusses using principal component analysis (PCA) to analyze metabolomic sample data from pumpkin experiments. PCA was performed on the raw data and scaled data to identify major sources of variance. For the raw data, the first two principal components captured most of the variance and separated samples by extraction method and treatment. Several samples were identified as potential outliers. When PCA was done on autoscaled data, the loadings showed differences due to both extraction and treatment. The scaled analysis also identified some outlier samples.
Radar 2009 a 18 synthetic aperture radar (Forward2025)
This document provides an overview of a lecture on synthetic aperture radar (SAR). It begins with an introduction to SAR, including why it was developed due to limitations of conventional radar for imaging. It then discusses the basics of SAR and how it forms images using signal processing to synthesize a large antenna aperture. The document outlines the rest of the lecture topics which will cover SAR image formation techniques, examples, applications, and a history of the evolution of SAR from its origins in the 1950s to current systems.
This document discusses various geoprocessing tools available in QGIS for manipulating spatial data. It describes tools such as convex hull, buffer, union, intersect, clip, symmetrical difference, and dissolve. For each tool, it provides a definition, explains how to use the tool in QGIS, and shows an example of the output layer. The document serves as a guide to common geoprocessing tasks and spatial analysis that can be performed in QGIS.
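As an illustrative aside (not from the document), the convex hull operation these tools expose can be sketched with Andrew's monotone chain algorithm, which QGIS-style tools implement under the hood for point layers:

```python
def convex_hull(points):
    """Andrew's monotone chain: convex hull of 2-D points in CCW order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Z-component of (a - o) x (b - o); positive means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        hull = []
        for p in seq:
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower, upper = half(pts), half(reversed(pts))
    return lower[:-1] + upper[:-1]  # drop duplicated endpoints

# The interior point (1, 1) is excluded from the hull.
print(convex_hull([(0, 0), (2, 0), (1, 1), (2, 2), (0, 2)]))
```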
Remote Sensing And GIS Application In Mineral, Oil, Ground Water Mapping... (Swetha A)
Remote sensing and GIS techniques can be used to map minerals, oil, and groundwater. For minerals, accommodation zones between faults can localize magmatic material and mineralized fluids, and be identified in satellite images showing brecciation and fault patterns. Oil and gas exploration uses airborne magnetic and gravity surveys integrated with high resolution satellite imagery and DEMs for 3D visualization. Groundwater mapping involves literature review, image interpretation to create spatial databases, field reconnaissance, spatial analysis of data, and identifying recommended recharge structures by analyzing IRS satellite images, groundwater table maps, DEM elevation data, and resistivity curve modeling from electrical soundings.
Steps for Principal Component Analysis (PCA) using ERDAS software (Swetha A)
Principal component analysis is a technique that uses orthogonal transformation to convert correlated variables into a set of uncorrelated variables called principal components. The document provides steps to perform principal component analysis in ERDAS, including opening an input file, specifying the number of desired components and output file, and viewing the output layers. The first few layers highlight different features like urban areas, water regions, and vegetation.
Matlab Feature Extraction Using Segmentation And Edge Detection (DataminingTools Inc)
This document discusses several image processing techniques in Matlab:
1) Edge detection using the edge function and Sobel and Canny edge detection algorithms.
2) The radon transform which computes projections of an image along specified directions and the inverse radon transform used to reconstruct images from projections.
3) Marker-controlled watershed segmentation which separates touching objects in an image using morphological operations like gradients and markers.
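As a rough Python stand-in for the Sobel option of MATLAB's edge function (illustrative only, with an assumed step-edge test image), gradient magnitudes can be computed by sliding the two 3x3 Sobel kernels over the image:

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from Sobel kernels (valid region only: output shrinks by 2)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal gradient
    ky = kx.T                                                    # vertical gradient
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return np.hypot(gx, gy)

# A vertical step edge: nonzero response only at the transition columns.
img = np.zeros((5, 5))
img[:, 3:] = 1.0
print(sobel_magnitude(img))
```

In practice one would use a vectorized convolution (e.g. via an image-processing library) plus thresholding, as the edge function does; the explicit loop here just shows the operation.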
Fueling AI with Great Data with Airbyte Webinar (Zilliz)
This talk will focus on how to collect data from a variety of sources, leverage that data for RAG and other GenAI use cases, and finally chart your course to production.
Monitoring and Managing Anomaly Detection on OpenShift.pdf - Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
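As a taste of the modelling step the notebooks cover, here is a minimal sliding-window z-score detector in Python (the window size, threshold, and injected fault are illustrative, not taken from the tutorial):

```python
import numpy as np

def detect_anomalies(readings, window=20, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the trailing-window mean."""
    readings = np.asarray(readings, dtype=float)
    flags = []
    for i in range(window, len(readings)):
        ref = readings[i - window:i]          # trailing reference window
        mu, sigma = ref.mean(), ref.std()
        if sigma > 0 and abs(readings[i] - mu) / sigma > threshold:
            flags.append(i)
    return flags

rng = np.random.default_rng(1)
signal = rng.normal(loc=20.0, scale=0.5, size=200)  # e.g. a sensor temperature
signal[150] = 35.0                                   # injected fault
print(detect_anomalies(signal))                      # the fault at index 150 is flagged
```

This kind of lightweight statistical check is well suited to resource-constrained edge devices; heavier learned models can then run on the aggregated stream.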
Dive into the realm of operating systems (OS) with Pravash Chandra Das, a seasoned Digital Forensic Analyst, as your guide. 🚀 This comprehensive presentation illuminates the core concepts, types, and evolution of OS, essential for understanding modern computing landscapes.
Beginning with the foundational definition, Das clarifies the pivotal role of OS as system software orchestrating hardware resources, software applications, and user interactions. Through succinct descriptions, he delineates the diverse types of OS, from single-user, single-task environments like early MS-DOS iterations, to multi-user, multi-tasking systems exemplified by modern Linux distributions.
Crucial components like the kernel and shell are dissected, highlighting their indispensable functions in resource management and user interface interaction. Das elucidates how the kernel acts as the central nervous system, orchestrating process scheduling, memory allocation, and device management. Meanwhile, the shell serves as the gateway for user commands, bridging the gap between human input and machine execution. 💻
The narrative then shifts to a captivating exploration of prominent desktop OSs, Windows, macOS, and Linux. Windows, with its globally ubiquitous presence and user-friendly interface, emerges as a cornerstone in personal computing history. macOS, lauded for its sleek design and seamless integration with Apple's ecosystem, stands as a beacon of stability and creativity. Linux, an open-source marvel, offers unparalleled flexibility and security, revolutionizing the computing landscape. 🖥️
Moving to the realm of mobile devices, Das unravels the dominance of Android and iOS. Android's open-source ethos fosters a vibrant ecosystem of customization and innovation, while iOS boasts a seamless user experience and robust security infrastructure. Meanwhile, discontinued platforms like Symbian and Palm OS evoke nostalgia for their pioneering roles in the smartphone revolution.
The journey concludes with a reflection on the ever-evolving landscape of OS, underscored by the emergence of real-time operating systems (RTOS) and the persistent quest for innovation and efficiency. As technology continues to shape our world, understanding the foundations and evolution of operating systems remains paramount. Join Pravash Chandra Das on this illuminating journey through the heart of computing. 🌟
Building Production Ready Search Pipelines with Spark and Milvus - Zilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
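The serving-side search can be illustrated without a running Milvus instance; this exhaustive cosine-similarity sketch shows what the vector database computes (Milvus uses approximate indexes to do this at billion-vector scale; the embeddings below are random stand-ins for Spark-extracted vectors):

```python
import numpy as np

def top_k_cosine(query, vectors, k=3):
    """Exact nearest neighbours by cosine similarity over a small vector store."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = v @ q                               # cosine similarity to every stored vector
    idx = np.argsort(sims)[::-1][:k]           # indices of the k best matches
    return idx, sims[idx]

rng = np.random.default_rng(42)
embeddings = rng.normal(size=(1000, 64))       # stand-in for extracted representations
query = embeddings[7] + 0.01 * rng.normal(size=64)  # near-duplicate of item 7
idx, scores = top_k_cosine(query, embeddings)
print(idx[0])                                  # the closest stored vector is item 7
```

In the production pipeline, Spark writes the embeddings into a Milvus collection instead of a NumPy array, and the search call is served by Milvus's indexes.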
5th LF Energy Power Grid Model Meet-up Slides - DanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
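As a flavour of the steady-state calculations such an engine performs, here is a minimal DC power-flow sketch in NumPy (a textbook approximation on a made-up three-bus grid, not the Power Grid Model API):

```python
import numpy as np

# Minimal DC power flow: bus 0 is the slack bus; buses 1 and 2 carry
# loads of 1.0 and 0.5 per-unit. Line data and loads are illustrative.
lines = [(0, 1, 0.1), (0, 2, 0.1), (1, 2, 0.2)]  # (from, to, reactance)
P = np.array([-1.0, -0.5])                       # injections at buses 1, 2

B = np.zeros((3, 3))                             # nodal susceptance matrix
for i, j, x in lines:
    B[i, i] += 1 / x
    B[j, j] += 1 / x
    B[i, j] -= 1 / x
    B[j, i] -= 1 / x

theta = np.zeros(3)
theta[1:] = np.linalg.solve(B[1:, 1:], P)        # slack angle fixed at 0

# Active power flow on each line from the angle differences.
flows = {(i, j): (theta[i] - theta[j]) / x for i, j, x in lines}
print(flows)  # the two flows leaving the slack bus sum to the 1.5 p.u. load
```

Real engines solve the full AC equations and add losses, voltage limits, and contingency analysis, but the what-if workflow (perturb injections, re-solve, inspect flows) is the same.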
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Generating privacy-protected synthetic data using Secludy and Milvus - Zilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
NUnit vs XUnit vs MSTest: Differences Between These Unit Testing Frameworks.pdf - flufftailshop
When it comes to unit testing in the .NET ecosystem, developers have a wide range of options available. Among the most popular choices are NUnit, XUnit, and MSTest. These unit testing frameworks provide essential tools and features to help ensure the quality and reliability of code. However, understanding the differences between these frameworks is crucial for selecting the most suitable one for your projects.
Trusted Execution Environment for Decentralized Process Mining - LucaBarbaro3
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
Driving Business Innovation: Latest Generative AI Advancements & Success Story - Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
TrustArc Webinar - 2024 Global Privacy Survey - TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
leewayhertz.com - AI in predictive maintenance: use cases, technologies, benefits ... - alexjohnson7307
Predictive maintenance is a proactive approach that anticipates equipment failures before they happen. At the forefront of this innovative strategy is Artificial Intelligence (AI), which brings unprecedented precision and efficiency. AI in predictive maintenance is transforming industries by reducing downtime, minimizing costs, and enhancing productivity.
Digital Marketing Trends in 2024 | Guide for Staying Ahead - Wask
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
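A minimal sketch of that enrich-then-verify pattern, using only the Python standard library (the `enrich` helper and the `<term>` vocabulary are hypothetical, for illustration; the key point is that AI-generated markup should be re-parsed before it enters a pipeline):

```python
import xml.etree.ElementTree as ET

def enrich(text, terms):
    """Wrap known terms of a plain-text sentence in <term> tags, then parse
    the result to prove it is well-formed before accepting it."""
    words = []
    for word in text.split():
        core = word.strip(".,;")            # separate trailing punctuation
        tail = word[len(core):]
        if core in terms:
            words.append(f'<term type="{terms[core]}">{core}</term>{tail}')
        else:
            words.append(word)
    xml = "<p>" + " ".join(words) + "</p>"
    ET.fromstring(xml)                      # raises ParseError if markup is broken
    return xml

terms = {"XSLT": "language", "Schematron": "schema"}
print(enrich("Validate XSLT and Schematron output.", terms))
```

The same parse-before-accept guard applies whether the markup comes from a rule-based enricher or from a prompted language model.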
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
HCL Notes and Domino license cost reduction in the world of DLAU - panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX models have been a hot topic in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new type of licensing works and what benefits it brings you. Above all, you surely want to stay within budget and save costs wherever possible. We understand that, and we want to help!
We will explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts to save money. There are also some practices that can lead to unnecessary expenses, for example using a person document instead of a mail-in for shared mailboxes. We will show you such cases and their solutions. And of course we will explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It will give you the tools and know-how to keep an overview. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
These topics will be covered:
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Practical examples and best practices to apply immediately
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Web Page Clustering Using a Fuzzy Logic Based Representation and Self-Organizing Maps
1. Web Page Clustering Using a Fuzzy Logic Based
Representation and Self-organizing Maps
Alberto P. García-Plaza, Víctor Fresno, Raquel Martínez
NLP & IR Group, UNED
December 12, 2008
Table of Contents
1 Objectives
2 Our Approach: Extended Fuzzy Combination of Criteria (EFCC)
   1 Fuzzy Logic
   2 EFCC
   3 Linguistic Variables
   4 Knowledge Base
3 Experiment Description
   1 Dimensionality Reduction
   2 Document Map
   3 Evaluation Methods
4 Results
5 Conclusion
Objectives
Group HTML documents by content similarity.
Use Self-Organizing Maps (SOM) to organize, visualize, and navigate through the collection.
A term weighting function that takes advantage of HTML tags, combining, by means of fuzzy logic, heuristic criteria based on the inherent semantics of some HTML tags and on word positions in the document.
Hypothesis
An improvement in document representation will lead to an increase in map quality.
Fuzzy Logic
Captures human expert knowledge.
Close to natural language.
Knowledge base: defined by a set of IF-THEN rules.
Linguistic variables
Defined using natural language words and fuzzy sets.
These sets describe the degree of membership of an object in a particular class.
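These ideas can be sketched in a few lines of Python: a toy inference step with triangular membership functions and IF-THEN rules (the variable name, set shapes, and rule outputs below are illustrative, not the paper's actual definitions):

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Linguistic variable "frequency" with fuzzy sets Low / Medium / High.
frequency = {
    "low": lambda x: tri(x, -0.5, 0.0, 0.5),
    "medium": lambda x: tri(x, 0.0, 0.5, 1.0),
    "high": lambda x: tri(x, 0.5, 1.0, 1.5),
}

# Knowledge base: IF frequency IS <set> THEN relevance IS <level>.
rules = {"low": 0.1, "medium": 0.5, "high": 0.9}

def infer(x):
    """Weighted-average defuzzification over the fired rules."""
    strengths = {name: mf(x) for name, mf in frequency.items()}
    num = sum(strengths[n] * rules[n] for n in rules)
    den = sum(strengths.values())
    return num / den if den else 0.0

print(infer(0.75))  # fires "medium" and "high" equally -> relevance of about 0.7
```

An input can belong partially to several sets at once, which is exactly what lets the approach blend overlapping heuristic criteria instead of forcing hard thresholds.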
Extended Fuzzy Combination of Criteria
(Slides 8-18: EFCC figures; not included in this transcript.)
Linguistic Variables
(Slides 20-25: figures; not included in this transcript.)
Knowledge Base
(Slides 27-30: figures; not included in this transcript.)
Dimensionality Reduction
Input vector dimensions ranging from 100 to 5000.
Stopwords, punctuation marks, suffixes, and words occurring fewer than 50 times in the whole corpus were removed.
Two well-known methods:
Document frequency reduction.
Random projection method.
Three proposed rank-based methods:
Most Valued Terms.
Fixed reduction method.
More Frequent Terms until n level (MFTn).
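Document frequency reduction, the first of the well-known methods above, can be sketched as follows (a generic illustration, not the paper's implementation; the toy corpus is made up):

```python
from collections import Counter

def df_reduce(docs, max_terms):
    """Keep the `max_terms` terms that appear in the most documents.

    The rank-based variants instead score each term by its weight in the
    document representation rather than by raw document frequency.
    """
    df = Counter()
    for doc in docs:
        df.update(set(doc.lower().split()))   # count each term once per document
    return [term for term, _ in df.most_common(max_terms)]

docs = [
    "fuzzy logic for web page clustering",
    "clustering web documents with fuzzy criteria",
    "self organizing maps for document clustering",
]
print(df_reduce(docs, 3))  # 'clustering' ranks first: it appears in all three
```

Whatever the selection rule, the outcome is the same: a vocabulary small enough that document vectors and SOM unit vectors stay tractable.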
Document Map Construction
Benchmark dataset for clustering: BankSearch [1]
10,000 documents
10 classes
SOM size was set equal to the number of classes of input documents, i.e. a 5x2 map, in order to compare clustering results.
[1] M. P. Sinka and D. W. Corne. A large benchmark dataset for web document clustering. Soft Computing Systems: Design, Management, and Applications, 2002.
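The map construction relies on the standard online SOM algorithm, which can be sketched as follows (the 5x2 grid matches the experiment; the learning-rate and neighbourhood schedules, and the toy data, are illustrative):

```python
import numpy as np

def train_som(data, grid=(5, 2), epochs=50, lr=0.5, seed=0):
    """Train a tiny Self-Organizing Map; returns the unit weight vectors."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    units = rng.normal(size=(rows * cols, data.shape[1]))
    for epoch in range(epochs):
        radius = max(1.0, 2.0 * (1 - epoch / epochs))   # shrinking neighbourhood
        alpha = lr * (1 - epoch / epochs)               # decaying learning rate
        for x in rng.permutation(data):
            bmu = np.argmin(((units - x) ** 2).sum(axis=1))  # best-matching unit
            dist = np.linalg.norm(coords - coords[bmu], axis=1)
            h = np.exp(-(dist ** 2) / (2 * radius ** 2))     # neighbourhood kernel
            units += alpha * h[:, None] * (x - units)        # pull units toward x
    return units

# Two well-separated toy "document" clusters map to different units.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.1, (20, 4)), rng.normal(5, 0.1, (20, 4))])
units = train_som(data)
bmu = lambda x: int(np.argmin(((units - x) ** 2).sum(axis=1)))
print(bmu(data[0]) != bmu(data[-1]))  # the clusters land on distinct units
```

In the experiment, each document's EFCC vector plays the role of `x`, and the trained 5x2 grid provides the clusters to compare against the 10 BankSearch classes.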
Evaluation Methods
Weighted average of the F-measure for each class.
After mapping the collection onto the trained map, the class with the greatest number of documents mapped onto a neuron is selected to label that unit.
All document vectors in a neuron whose class differs from the neuron's label are counted as errors.
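That labelling-and-scoring procedure can be sketched as follows (the neuron assignments and class names are toy data for illustration):

```python
from collections import Counter

def weighted_f_measure(assignments):
    """Weighted average F-measure for a labelled SOM.

    `assignments` maps each neuron to the list of true classes of the
    documents mapped onto it; each neuron is labelled with its majority class.
    """
    total = sum(len(docs) for docs in assignments.values())
    labels = {u: Counter(docs).most_common(1)[0][0] for u, docs in assignments.items()}
    classes = {c for docs in assignments.values() for c in docs}
    score = 0.0
    for c in classes:
        n_c = sum(docs.count(c) for docs in assignments.values())
        # Precision/recall of class c over the neurons labelled c.
        tp = sum(docs.count(c) for u, docs in assignments.items() if labels[u] == c)
        assigned = sum(len(docs) for u, docs in assignments.items() if labels[u] == c)
        p = tp / assigned if assigned else 0.0
        r = tp / n_c
        f = 2 * p * r / (p + r) if p + r else 0.0
        score += (n_c / total) * f            # weight each class by its size
    return score

som = {0: ["banking", "banking", "sport"], 1: ["sport", "sport"]}
print(round(weighted_f_measure(som), 3))
```

Documents whose class disagrees with their neuron's majority label lower both precision and recall, so the weighted F-measure directly penalises the mapping errors described above.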
Best reduction for each term weighting function
(Slide 38: results figure; not included in this transcript.)
MFTn reduction provides stability
(Slide 39: results figure; not included in this transcript.)
EFCC+MFTn obtains its best results with the smallest number of features
(Slide 40: results figure; not included in this transcript.)
Conclusion
An unsupervised document representation method, based on
fuzzy logic, aimed at clustering HTML documents by means
of self-organizing maps.
MFTn is the most stable reduction in all cases.
The EFCC representation obtains better results with a
smaller vocabulary.
Fewer features are needed to represent the input documents
and the SOM unit vectors, which reduces computational cost.
Thank You!
Related Work

| Approach                                              | VSM | Topic Information | Document Type | Weighting Function | Modifies SOM |
|-------------------------------------------------------|-----|-------------------|---------------|--------------------|--------------|
| Self organization of a Massive Document Collection [2] | Yes | Yes               | Text          | Shannon's Entropy  | No           |
| Document Clustering using Phrases [3]                 | Yes | No                | Text          | Binary, TF, TF-IDF | No           |
| Document Clustering using WordNet [4]                 | Yes | Yes               | Text          | ESVM, HSVM, HyM    | No           |
| Conceptional SOM [5]                                  | Yes | No                | Text          | TF                 | Yes          |

[2] T. Kohonen, S. Kaski, K. Lagus, J. Salojarvi, J. Honkela, V. Paatero, and A. Saarela. Self organization of a massive document collection. IEEE Trans. on Neural Networks, 2000.
[3] J. Bakus, M. Hussin, and M. Kamel. A SOM-based document clustering using phrases. In ICONIP, 2002.
[4] C. Hung and S. Wermter. Neural network based document clustering using WordNet ontologies. Int. J. Hybrid Intell. Syst., 2004.
[5] Y. Liu, X. Wang, and C. Wu. ConSOM: A conceptional SOM model for text clustering. Neurocomputing, 2008.
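The Binary, TF, and TF-IDF weighting functions listed in the table are standard term-weighting schemes. The sketch below shows one common formulation of each; it is illustrative only, and the exact normalizations used by the cited systems may differ.

```python
import math

def binary_weight(tf):
    """1 if the term occurs in the document, else 0."""
    return 1.0 if tf > 0 else 0.0

def tf_weight(tf, max_tf):
    """Term frequency normalized by the document's most frequent term."""
    return tf / max_tf if max_tf else 0.0

def tf_idf_weight(tf, df, n_docs):
    """Term frequency scaled by inverse document frequency: rarer terms
    across the collection receive higher weights."""
    return tf * math.log(n_docs / df) if df else 0.0
```

For instance, a term appearing in every document gets a TF-IDF weight of 0, since log(N/N) = 0.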