RecSys 2013 workshop paper on improving a cross-validation scheme in order to increase the statistical power of the underlying significance testing.
Managing Irrelevant Contextual Categories in a Movie Recommender System - Andrej Kosir
This document presents a methodology for managing irrelevant contextual categories in movie recommender systems. It proposes detecting relevant contextual conditions that influence user decisions and identifying irrelevant conditions to merge. The methodology applies statistical tests to movie rating data under different contextual conditions to determine which can be merged without affecting rating prediction accuracy. It evaluates the approach on a dataset of movie ratings with contextual metadata, finding some categories like season could be merged while others like decision context were all relevant. The methodology aims to improve recommender systems by reducing sparsity and questionnaire size by filtering irrelevant contextual information.
This document outlines a student project involving digital image processing techniques. The project aims to 1) evaluate different image de-noising techniques using simulated data, 2) design a system to process and register brain CT and MR images, and 3) compare medical image registration and fusion techniques. The document describes simulating various types of noise, using spatial and wavelet-based filters to reduce noise, and evaluating the performance of registration and fusion algorithms. Chapters will address noise reduction, registration foundations, biomedical applications, and fusion methods.
The document summarizes a research project on single image haze removal using a variable fog-weight. It begins with an introduction on how haze degrades image quality and the need for haze removal techniques. It then discusses the motivation, literature review, objective, and main contribution of the proposed method. The method uses the dark channel prior to estimate the transmission map and atmospheric light. It then applies a variable fog-weight to modify the transmission map and reduce halo artifacts. A guided filter is used for transmission refinement before recovering the haze-free scene radiance. The method aims to improve on existing techniques by reducing time complexity and halo artifacts while enhancing image visibility.
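The dark channel prior the method starts from is compact enough to sketch; the following is a generic, dependency-free illustration of the prior itself (the patch size and toy images are invented), not the paper's variable fog-weight pipeline.

```python
# Dark channel prior (He et al.): a haze-free outdoor patch usually has at
# least one colour channel close to zero, so a high dark channel signals haze.
# The image is a list of rows of (r, g, b) tuples with values in [0, 1].

def dark_channel(image, patch=3):
    """Per-pixel minimum over RGB, followed by a local minimum filter."""
    h, w = len(image), len(image[0])
    min_rgb = [[min(image[y][x]) for x in range(w)] for y in range(h)]
    return [[min(min_rgb[yy][xx]
                 for yy in range(max(0, y - patch), min(h, y + patch + 1))
                 for xx in range(max(0, x - patch), min(w, x + patch + 1)))
             for x in range(w)]
            for y in range(h)]

# A greyish (hazy) patch keeps a high dark channel; a saturated red one does not.
hazy = [[(0.8, 0.8, 0.8)] * 8 for _ in range(8)]
clear = [[(0.9, 0.1, 0.05)] * 8 for _ in range(8)]
print(dark_channel(hazy)[4][4], dark_channel(clear)[4][4])
```

In the standard formulation the transmission map is then estimated as t = 1 - omega * dark_channel(I / A); the variable fog-weight summarized above replaces the fixed omega to reduce halo artifacts.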
Single-photon avalanche diodes (SPADs) are novel sensors that can detect individual photons with high time resolution. SPADs allow for imaging with extreme dynamic range from low to high light conditions without saturation. They also enable minimal motion blur imaging due to their ability to precisely timestamp single photons. Recent research has demonstrated burst photography using SPAD arrays that can reconstruct non-rigid scene motion and produce almost motion-blur free images in dark environments. However, challenges remain in increasing resolution, reducing data rates and power consumption before widespread commercial applications can be realized.
Microstructural Analysis and Machine Learning - PFHub
This document discusses using machine learning for microstructural analysis and semantic segmentation of x-ray tomography data. It describes using a convolutional neural network (CNN) trained on phase field simulated microstructures to perform semantic segmentation of x-ray tomography images of dendritic solidification in aluminum alloys. The CNN was able to achieve 99% accuracy when trained on 1000 small cropped images from the tomography data. Phase field modeling offers control over features to match the tomography and help determine the needed amount and size of training images for the CNN.
This paper proposes to analyze end-to-end network performance as a signal. Traditionally, network performance is measured by specially designed active probes, which can be singular packets, packet pairs, or longer packet trains, where packet pairs and trains are the default methods for useful performance metrics like available bandwidth, bottleneck capacity, jitter, etc. Probing results are notoriously noisy. This paper shows that if probing data are treated as a signal and processed as such, precision can be improved. Real network experiments and analysis are conducted specifically for available bandwidth, but the fundamental approach can be applied to any performance metric.
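As a toy illustration of the premise (not the paper's actual algorithm), treating a series of noisy per-probe bandwidth estimates as a signal and applying even a simple sliding median filter already suppresses outliers; the values and window size below are invented.

```python
# Sliding median filter over per-probe available-bandwidth estimates
# (Mbit/s). Median filtering is robust to the occasional wildly wrong probe.
from statistics import median

def median_filter(samples, window=5):
    half = window // 2
    return [median(samples[max(0, i - half): i + half + 1])
            for i in range(len(samples))]

# Synthetic "true" bandwidth of ~100 Mbit/s with two outlier probes.
raw = [100, 101, 99, 180, 100, 98, 20, 101, 100, 99]
smooth = median_filter(raw)
print(smooth)
```

The outliers at 180 and 20 are absorbed by their neighbourhoods, while the underlying level survives untouched.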
Going Smart and Deep on Materials at ALCF - Ian Foster
As we acquire large quantities of science data from experiment and simulation, it becomes possible to apply machine learning (ML) to those data to build predictive models and to guide future simulations and experiments. Leadership Computing Facilities need to make it easy to assemble such data collections and to develop, deploy, and run associated ML models.
We describe and demonstrate here how we are realizing such capabilities at the Argonne Leadership Computing Facility. In our demonstration, we use large quantities of time-dependent density functional theory (TDDFT) data on proton stopping power in various materials maintained in the Materials Data Facility (MDF) to build machine learning models, ranging from simple linear models to complex artificial neural networks, that are then employed to manage computations, improving their accuracy and reducing their cost. We highlight the use of new services being prototyped at Argonne to organize and assemble large data collections (MDF in this case), associate ML models with data collections, discover available data and models, work with these data and models in an interactive Jupyter environment, and launch new computations on ALCF resources.
Smart Sound Measurement and Control System for Smart City - IRJET Journal
This document summarizes a research paper that proposes a smart sound measurement and control system for smart cities using Internet of Things technology. The system aims to address issues with existing noise measurement devices, such as only detecting noise in limited nearby areas. The proposed system would use multiple inexpensive sound detection devices connected via WiFi that send sensor data to the cloud to be viewed on mobile devices. This would allow for averaging readings across devices and monitoring noise levels in larger spaces. The system is intended to help authorities better enforce noise regulations and view historical noise data to address noise pollution issues near hospitals, schools and other areas that require quiet environments.
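One practical detail behind "averaging readings across devices": sound levels in dB cannot be averaged arithmetically; they must be converted to linear power, averaged, and converted back. A small sketch (the sensor names and readings are invented):

```python
# Correct averaging of sound pressure levels reported in dB: average in the
# linear power domain, then convert back to dB.
import math

def average_spl(levels_db):
    powers = [10 ** (level / 10) for level in levels_db]
    return 10 * math.log10(sum(powers) / len(powers))

readings = {"sensor_a": 60.0, "sensor_b": 70.0, "sensor_c": 65.0}
print(round(average_spl(list(readings.values())), 1))
```

Note that the result (about 66.7 dB) is well above the arithmetic mean of 65 dB: the loudest sensor dominates, which is exactly why naive averaging understates noise levels.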
Construction of inexpensive Web-Cam based Optical Spectrometer using... - Soares Fernando
This document describes the construction and use of an inexpensive webcam-based optical spectrometer for quantitative spectroscopic studies. Key points:
- An inexpensive spectrometer was built from readily available materials like DVDs, cardboard, tape and glue to enable students to measure electromagnetic spectra as a function of wavelength, with resolution and accuracy within tens of nanometres.
- The spectrometer was calibrated using known emission lines from a helium source and the hydrogen emission spectrum was analyzed, matching theoretical predictions to within 0.04% error.
- The low-cost nature of this device makes it suitable for equipping large classes for hands-on spectroscopy experiments and studies in resource-limited educational settings.
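The hydrogen lines used in such calibrations follow the Rydberg formula, so the kind of agreement reported above is easy to cross-check; the reference values below are textbook air wavelengths, not the paper's measurements.

```python
# Balmer series (transitions to n = 2) from the Rydberg formula, compared
# against commonly quoted hydrogen line wavelengths.
R = 1.0967758e7  # Rydberg constant for hydrogen, 1/m

def balmer_nm(n):
    inv_wavelength = R * (1 / 2**2 - 1 / n**2)
    return 1e9 / inv_wavelength  # metres -> nanometres

for n, ref in [(3, 656.3), (4, 486.1), (5, 434.0)]:
    lam = balmer_nm(n)
    print(n, round(lam, 2), round(100 * abs(lam - ref) / ref, 3), "%")
```

The small residuals here (a few hundredths of a percent, partly vacuum-vs-air wavelength differences) are of the same order as the 0.04% agreement reported above.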
The document discusses a project that used accelerometers to recognize gestures for a virtual environment. The project utilized Wii remotes and an Acceleglove to collect accelerometer data and recognize gestures through MATLAB. A C# program extracted accelerometer data from the Wii remotes for training and testing gesture recognition algorithms in MATLAB. Key algorithms used in MATLAB included dynamic time warping, affinity propagation, and random projection for gesture recognition. The digital glove also used an artificial neural network approach for gesture recognition.
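One of the MATLAB algorithms named above, dynamic time warping, is worth sketching because it is the piece that makes gestures comparable despite timing differences; this is a generic 1-D Python version on invented traces, not the project's code.

```python
# Classic O(n*m) dynamic time warping distance with absolute-difference cost.

def dtw(a, b):
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

# A time-shifted copy of a gesture scores far closer than a different gesture.
gesture = [0, 1, 2, 3, 2, 1, 0]
shifted = [0, 0, 1, 2, 3, 2, 1, 0]
other = [3, 3, 0, 0, 3, 3, 0]
print(dtw(gesture, shifted), dtw(gesture, other))
```

This tolerance to stretching and shifting is precisely why DTW suits accelerometer gestures, where the same movement is rarely performed at the same speed twice.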
IRJET - An Robust and Dynamic Fire Detection Method using Convolutional N... - IRJET Journal
This document proposes a new fire detection method using convolutional neural networks (CNNs). Specifically, it uses the YOLOv3 object detection algorithm, which can detect objects like fire in images or videos quickly and accurately. The proposed method aims to reduce computational time and costs compared to other CNN-based approaches, while also improving detection accuracy and reducing false alarms. It discusses implementing the method using four main modules: data exploration, pre-processing, feature engineering, and model selection. The workflow involves exploring data, pre-processing images, extracting features, and selecting the YOLOv3 CNN model for fire detection. The goal is to develop a robust and dynamic fire detection system using computer vision techniques to help prevent accidents.
(Structural) Feature Interactions for Variability-Intensive Systems Testing - Gilles Perrouin
Presentation given in the "short talks" session in the Dagstuhl seminar 14281 on "Feature Interactions - the Next Generation" , Schloss Dagstuhl, Germany, July 2014.
APPLICATION OF CONVOLUTIONAL NEURAL NETWORK IN LAWN MEASUREMENT - sipij
Lawn area measurement is an application of image processing and deep learning. Researchers have used hierarchical networks, segmented images, and other methods to measure lawn area, with varying effectiveness and accuracy. In this project, a deep learning method, specifically a convolutional neural network (CNN), was applied to measure the lawn area. We used Keras and TensorFlow in Python to develop a model trained on a dataset of houses, then tuned its parameters with GridSearchCV in Scikit-Learn (a machine learning library in Python) to estimate the lawn area. The CNN achieves high accuracy (94-97%). We conclude that deep learning, and CNNs in particular, can be a good method with state-of-the-art accuracy.
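The GridSearchCV step mentioned above is, at its core, an exhaustive search over a parameter grid with the best-scoring combination retained; a dependency-free sketch of that loop, with an invented scorer standing in for a trained Keras model.

```python
# Minimal grid search: try every combination in the grid, keep the best score.
from itertools import product

def grid_search(param_grid, score_fn):
    names = list(param_grid)
    best = None
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best

# Toy scorer with a known optimum at lr=0.1, batch=32 (purely illustrative).
def fake_score(lr, batch):
    return -abs(lr - 0.1) - abs(batch - 32) / 100

grid = {"lr": [0.01, 0.1, 1.0], "batch": [16, 32, 64]}
print(grid_search(grid, fake_score))
```

Scikit-Learn's GridSearchCV adds cross-validated scoring and parallelism on top of exactly this loop, which is why its cost grows multiplicatively with each parameter added to the grid.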
This document presents a scalable heuristic called Maximum Influence Arborescence (MIA) for solving the influence maximization problem in large social networks. MIA finds maximum influence paths between nodes and uses them to construct local influence regions called arborescences. It selects seed nodes that provide the largest marginal increase in influence spread by efficiently updating activation probabilities in the arborescences. Experiments on real networks show MIA achieves a 10^3-10^4x speedup compared to previous methods while maintaining similar influence spread, making it suitable for large networks with thousands to millions of nodes.
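MIA's contribution is making the underlying greedy rule scale; the rule itself, repeatedly adding the node with the largest marginal gain in expected spread, can be sketched with brute-force Monte Carlo simulation of the independent cascade model. The toy graph, propagation probability, and run count below are all invented.

```python
# Greedy influence maximization under the independent cascade (IC) model,
# with expected spread estimated by Monte Carlo simulation.
import random

random.seed(0)
EDGES = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
P = 0.5  # uniform propagation probability on every edge

def simulate(seeds):
    """One IC cascade: each newly active node tries each out-edge once."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in EDGES[u]:
                if v not in active and random.random() < P:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def spread(seeds, runs=2000):
    return sum(simulate(seeds) for _ in range(runs)) / runs

def greedy(k):
    seeds = []
    for _ in range(k):
        best = max((n for n in EDGES if n not in seeds),
                   key=lambda n: spread(seeds + [n]))
        seeds.append(best)
    return seeds

seeds = greedy(2)
print(seeds)
```

The thousands of cascade simulations per candidate are what make this baseline infeasible at scale, and what MIA's arborescence-based probability updates replace.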
We study influence maximization in which diffusion on each step may be delayed, and the objective is to maximize influence spread within a certain deadline. Both IC and LT models are extended, and efficient algorithms are proposed and evaluated.
This work appears in AAAI 2012. For the full version of the paper, please see: http://arxiv.org/abs/1204.3074
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An... - PyData
Artificial intelligence is emerging as a new paradigm in materials science. This talk describes how physical intuition and (insightful) machine learning can solve the complicated task of structure recognition in materials at the nanoscale.
Jos van Sas - Testimonial Alcatel-Lucent Bell Labs - imec.archive
- Bell Labs Alcatel-Lucent is an innovation engine with over 1,000 scientists and researchers across 8 countries collaborating with over 300 academic institutions. It has over 27,900 active patents and publishes over 400 papers per year.
- Experimentation facilities are important for evaluating solutions under realistic conditions at scale beyond theoretical research and simulations. This allows moving research closer to eventual product development.
- Examples of projects using large-scale emulation on the iLab.t platform include FP7 OCEAN investigating scalable content-aware delivery over CDNs and FP7 ECODE adding learning-based control to networks.
- Lessons learned are that large-scale emulation is important for validation and understanding before
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea... - ijtsrd
Acoustic Scene Classification (ASC) classifies audio signals to infer the context of the recorded environment. An audio scene comprises a mixture of background sound and a variety of sound events. In this paper, we present the combination of the maximal overlap discrete wavelet packet transform (MODWPT) at level 5 with six time-domain and frequency-domain features: energy entropy, short-time energy, spectral roll-off, spectral centroid, spectral flux, and zero-crossing rate, summarized by the statistics average and standard deviation. We used the DCASE Challenge 2016 dataset to study the behaviour of machine learning classifiers. Several classifiers can address the ASC task; we compare K-nearest neighbors (KNN), Support Vector Machine (SVM), and Ensemble Bagged Trees on the combined wavelet and spectral features. Choosing the right classification methodology and feature extraction is essential for the ASC task. In this system, we extract, at level 5, MODWPT energy (32), relative energy (32), and statistic values (6) from the audio signal, and the extracted features are then fed to the different classifiers. Mie Mie Oo | Lwin Lwin Oo, "Acoustic Scene Classification by using Combination of MODWPT and Spectral Features", International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd27992.pdf Paper URL: https://www.ijtsrd.com/computer-science/multimedia/27992/acoustic-scene-classification-by-using-combination-of-modwpt-and-spectral-features/mie-mie-oo
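Two of the time-domain features named in the abstract, short-time energy and zero-crossing rate, are simple enough to sketch directly; the signals below are synthetic, and this is not the paper's MODWPT pipeline.

```python
# Short-time energy and zero-crossing rate for a single analysis frame.
import math

def short_time_energy(frame):
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)

# A 100 Hz sine sampled at 8 kHz has few crossings per frame; an alternating
# "noise-like" signal crosses zero at every sample.
sine = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(400)]
alt = [(-1) ** n for n in range(400)]
print(zero_crossing_rate(sine), zero_crossing_rate(alt))
print(short_time_energy(sine), short_time_energy(alt))
```

The contrast is the point: tonal content sits low on ZCR while broadband, noise-like content sits high, which is why ZCR helps separate scene types.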
Outstanding advancements in imaging technology have made cryogenic electron microscopy a powerful technique for the nanocharacterization of biological macromolecular complexes, reaching atomic levels of resolution and being applicable to a wider set of samples than competing technologies. The real breakthrough in the development of cryo-EM happened less than a decade ago, with the introduction of direct detection devices. These cameras allow unprecedented speed and resolution, and Lawrence Berkeley National Lab is developing a new detector, the 4D camera, which can operate at 87,000 frames per second, revealing exclusive temporal dynamics of the investigated processes.
The current bottlenecks of the 4D camera, however, are the management of the large amount of data generated (around 50 GB/s) and the intrinsic noise level characterizing the signal acquired at that speed. Yet, the high frame rate enables the recognition of single electrons when they strike the detector, as opposed to traditional electron microscopy, where the charge is accumulated for every frame. Electron counting has remarkable advantages, since it completely rejects electrical background noise as well as the variability in the electron charge deposition phenomena, and it dramatically compresses images by saving them as lists of event coordinates.
With this work, the counting efficiency of the algorithm is enhanced through the introduction of a denoising step before thresholding out the background noise, raising the precision by 7.11% with respect to the reference implementation. Furthermore, the localization of the events is refined to allow super-resolution, and a classification step is added to reduce the issue of collision losses caused by overlapping electrons. In the end, a 10000x compression ratio is achieved thanks to electron counting. A GPU acceleration of the final algorithm is also proposed, achieving a speedup of up to 284x. The timing performance of the developed tool is crucial for its real-time execution on the microscope output.
Ultimately, this work aims at enabling a more efficient data management between the microscopy center and the supercomputing facility, both involved in the data processing pipeline, by moving part of the computation towards the instrumentation and transferring only a compressed version of the datasets. The intelligent redistribution of workloads, in fact, removes the bottleneck in data transfer and grants the use of the microscope at its maximum frame rate.
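The counting-and-compression idea above can be reduced to a stdlib toy: threshold a noisy frame, keep only the coordinates of above-threshold pixels, and compare storage sizes. The frame size, noise level, and threshold below are invented for illustration and unrelated to the 4D camera's real parameters.

```python
# Toy electron counting: threshold a noisy frame, store only event
# coordinates, and estimate the resulting compression ratio.
import random

random.seed(1)
W = H = 64
# Background: Gaussian readout noise. Then deposit three bright "electrons".
frame = [[random.gauss(0.0, 0.05) for _ in range(W)] for _ in range(H)]
for x, y in [(5, 7), (20, 33), (50, 50)]:
    frame[y][x] += 1.0

THRESHOLD = 0.5
events = [(x, y) for y in range(H) for x in range(W)
          if frame[y][x] > THRESHOLD]
print(events)

# Compression: a full 32-bit frame versus one byte per event coordinate.
raw_bytes = W * H * 4
event_bytes = len(events) * 2
print(raw_bytes / event_bytes)
```

The real pipeline adds denoising before the threshold, sub-pixel localization, and collision classification, but the storage argument is the same: an event list scales with the electron count, not the pixel count.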
This document summarizes the first meeting of the robotics club. It introduces robots and lists the resources needed to build them, including money, time, electronics, mechanics, and computer programming skills. It then demonstrates some electronic components like LEDs, resistors, buttons, integrated circuits, and buzzers. The document provides an example Arduino program that uses a for loop to generate tones of increasing frequency on a buzzer.
This document describes using an artificial neural network (ANN) model to optimize the cost of reinforced concrete beams designed according to ACI 318-08 code requirements. The ANN model considers costs of concrete, reinforcement steel, and formwork. A simply supported beam was designed with variable cross-sectional dimensions to demonstrate the model. Computer models were developed using NEURO SHELL-2 software and results were compared to a classical optimization model in Excel using generalized reduced gradient methods, showing good agreement between the two approaches. The document provides details on the ANN model formulation, including design variables, constraints, and objective function to minimize total cost. An example problem is presented to optimize the design of a simply supported beam.
The use of sparse representation in direction of arrival (DoA) estimation has been around for a while. It exploits the angular sparsity of the impinging wavefronts and allows much more efficient algorithms that perform well in very challenging scenarios such as coherent sources or a low number of snapshots. In applications like channel sounding and RADAR, however, it may not be enough to have just the DoA of the signal; the offset from the carrier frequency, or the Doppler frequency, can be of equal importance in these applications.
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ... - IRJET Journal
This document proposes a system for real-time human face detection, tracking, and estimation of age, weight, and gender from face images using a Raspberry Pi processor. The system uses OpenCV for face detection and extracts facial features to classify age, estimate weight, and determine gender through a probabilistic framework. The system allows for real-time detection of multiple faces with high efficiency even from low-quality images. Evaluation shows the low-cost Raspberry Pi provides fast execution speeds suitable for real-time applications.
Automatic Test Suite Generation for Key-Points Detection DNNs using Many-Obje... - Lionel Briand
This paper proposes using many-objective search algorithms to automatically generate test suites for key-point detection deep neural networks (DNNs). The paper aims to find test images that cause DNNs to severely mispredict the locations of as many key-points as possible. It compares various search algorithms and finds that MOSA+ generates test suites that maximize both the number and severity of mispredicted key-points. Additionally, the paper builds regression trees to explain individual key-point mispredictions based on image characteristics, helping engineers assess risks and improve the DNNs.
This document outlines the steps for developing a predictive modeling project in Python:
1) Select an appropriate modeling technique based on the type of problem, amount of data, and other factors.
2) Prepare the data for modeling by formatting, mapping text to numbers, and splitting into features and targets.
3) Validate the model selection by evaluating performance on test data.
4) Implement the trained model in a production environment to make predictions on new data.
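The four steps above in miniature, with a nearest-centroid classifier standing in for a real model so the sketch stays dependency-free; the data and model choice are illustrative only, not from the document.

```python
# 1) Technique selection is assumed done: a nearest-centroid classifier.
# 2) Prepare: map text to numbers, split into features (X) and targets (y).
rows = [("short", 1.2, 0), ("short", 1.0, 0), ("tall", 1.9, 1), ("tall", 1.8, 1)]
label_map = {"short": 0, "tall": 1}
X = [[label_map[t], h] for t, h, _ in rows]
y = [target for _, _, target in rows]

def fit(X, y):
    """Average the feature vectors of each class into a centroid."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    return {c: [sum(col) / len(pts) for col in zip(*pts)]
            for c, pts in groups.items()}

def predict(model, x):
    """Return the class whose centroid is nearest (squared distance)."""
    return min(model, key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(model[c], x)))

# 3) Validate on a held-out row; 4) the same call serves new production data.
model = fit(X[:3], y[:3])
print(predict(model, X[3]), y[3])
```

Swapping in a scikit-learn estimator changes only the `fit`/`predict` internals; the workflow of preparing, splitting, validating, and deploying stays identical.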
An Example of Predictive Analytics: Building a Recommendation Engine Using Py... - PyData
This document discusses building a hybrid recommendation engine using Python to recommend Pubmed documents. It begins with an introduction to predictive analytics and recommender systems. Different types of recommender systems are described, including knowledge-based, content-based, collaborative filtering, and hybrid models. The document then outlines a hybrid model that performs content-based filtering on Pubmed documents using vector space modeling and weights documents, before applying collaborative filtering using the Python-recsys library to filter and recommend documents. Finally, it demonstrates the hybrid model on a Pubmed dataset and compares its performance to using Python-recsys alone.
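The content-based stage described above rests on vector space modeling and similarity scoring; a stdlib sketch of term-frequency vectors with cosine ranking (the documents are invented, not Pubmed data, and python-recsys is not used here).

```python
# Content-based ranking with term-frequency vectors and cosine similarity.
import math
from collections import Counter

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

docs = {
    "d1": "protein folding simulation",
    "d2": "protein structure prediction",
    "d3": "influenza vaccine trial",
}
vecs = {k: Counter(text.split()) for k, text in docs.items()}
query = Counter("protein folding".split())
ranked = sorted(vecs, key=lambda k: cosine(query, vecs[k]), reverse=True)
print(ranked)
```

In the hybrid model, a ranking like this pre-filters and weights candidates before the collaborative filtering stage re-scores them against user behaviour.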
Creating Your First Predictive Model In Python - Robert Dempsey
If you’ve been reading books and blog posts on machine learning and predictive analytics and are still left wondering how to create a predictive model and apply it to your own data, this presentation will give you the steps you need to take to do just that.
Discover why Python is better for Data Science: the whole workflow of Data Analysis is covered by Python. Tools for various tasks are shown, including: workflow, data analysis, data visualization, integration with Hadoop ecosystem, and communication.
This webinar will focus on the technical and practical aspects of creating and deploying predictive analytics. We have seen an emerging need for predictive analytics across clinical, operational, and financial domains. One pitfall we’ve seen with predictive analytics is that while many people with access to free tools can develop predictive models, many organizations fail to provide a sufficient infrastructure in which the models are deployed in a consistent, reliable way and truly embedded into the analytics environment. We will survey techniques that are used to get better predictions at scale. This webinar won’t be an intense mathematical treatment of the latest predictive algorithms, but will rather be a guide for organizations that want to embed predictive analytics into their technical and operational workflows.
Topics will include:
Reducing the time it takes to develop a model
Automating model training and retraining
Feature engineering
Deploying the model in the analytics environment
Deploying the model in the clinical environment
Construction of inexpensive Web-Cam based Optical Spectrometer usingSoares Fernando
This document describes the construction and use of an inexpensive webcam-based optical spectrometer for quantitative spectroscopic studies. Key points:
- An inexpensive spectrometer was built from readily available materials like DVDs, cardboard, tape and glue to enable students to measure electromagnetic spectra as a function of wavelength within 10s of nm resolution and accuracy.
- The spectrometer was calibrated using known emission lines from a helium source and the hydrogen emission spectrum was analyzed, matching theoretical predictions to within 0.04% error.
- The low-cost nature of this device makes it suitable for equipping large classes for hands-on spectroscopy experiments and studies in resource-limited educational settings.
The document discusses a project that used accelerometers to recognize gestures for a virtual environment. The project utilized Wii remotes and an Acceleglove to collect accelerometer data and recognize gestures through MATLAB. A C# program extracted accelerometer data from the Wii remotes for training and testing gesture recognition algorithms in MATLAB. Key algorithms used in MATLAB included dynamic time warping, affinity propagation, and random projection for gesture recognition. The digital glove also used an artificial neural network approach for gesture recognition.
IRJET - An Robust and Dynamic Fire Detection Method using Convolutional N...IRJET Journal
This document proposes a new fire detection method using convolutional neural networks (CNNs). Specifically, it uses the YOLOv3 object detection algorithm, which can detect objects like fire in images or videos quickly and accurately. The proposed method aims to reduce computational time and costs compared to other CNN-based approaches, while also improving detection accuracy and reducing false alarms. It discusses implementing the method using four main modules: data exploration, pre-processing, feature engineering, and model selection. The workflow involves exploring data, pre-processing images, extracting features, and selecting the YOLOv3 CNN model for fire detection. The goal is to develop a robust and dynamic fire detection system using computer vision techniques to help prevent accidents.
(Structural) Feature Interactions for Variability-Intensive Systems Testing Gilles Perrouin
Presentation given in the "short talks" session in the Dagstuhl seminar 14281 on "Feature Interactions - the Next Generation" , Schloss Dagstuhl, Germany, July 2014.
APPLICATION OF CONVOLUTIONAL NEURAL NETWORK IN LAWN MEASUREMENTsipij
Lawn area measurement is an application of image processing and deep learning. Researchers used
hierarchical networks, segmented images, and other methods to measure the lawn area. Methods’
effectiveness and accuracy varies. In this project, deep learning method, specifically Convolutional neural
network, was applied to measure the lawn area. We used Keras and TensorFlow in Python to develop a
model that was trained on the dataset of houses then tuned the parameters with GridSearchCV in ScikitLearn (a machine learning library in Python) to estimate the lawn area. Convolutional neural network or
shortly CNN shows high accuracy (94 -97%). We may conclude that deep learning method, especially
CNN, could be a good method with a high state-of-art accuracy.
This document presents a scalable heuristic called Maximum Influence Arborescence (MIA) for solving the influence maximization problem in large social networks. MIA finds maximum influence paths between nodes and uses them to construct local influence regions called arborescences. It selects seed nodes that provide the largest marginal increase in influence spread by efficiently updating activation probabilities in the arborescences. Experiments on real networks show MIA achieves over 103-104 speedup compared to previous methods while maintaining similar influence spread, making it suitable for large networks with thousands to millions of nodes.
We study influence maximization in which diffusion on each step may be delayed, and the objective is to maximize influence spread within a certain deadline. Both IC and LT models are extended, and efficient algorithms are proposed and evaluated.
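The deadline constraint can be illustrated with a deterministic sketch: edges carry delays, and a node counts as influenced only if the cumulative delay from a seed stays within the deadline. The toy graph is invented, and the IC/LT randomness of the paper is abstracted away (every influence attempt succeeds here).

```python
import heapq

def influence_within_deadline(graph, seeds, deadline):
    # graph: node -> list of (neighbor, delay) edges. Dijkstra-style
    # propagation; a node is active if reachable within the deadline.
    dist = {s: 0 for s in seeds}
    heap = [(0, s) for s in seeds]
    while heap:
        t, u = heapq.heappop(heap)
        if t > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, delay in graph.get(u, []):
            nt = t + delay
            if nt <= deadline and nt < dist.get(v, float("inf")):
                dist[v] = nt
                heapq.heappush(heap, (nt, v))
    return set(dist)

g = {"s": [("a", 1), ("b", 3)], "a": [("c", 1)], "b": [("d", 1)]}
active = influence_within_deadline(g, ["s"], deadline=2)
```

Node "b" would be reached at time 3, past the deadline of 2, so it and everything behind it stay inactive.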
This work appears in AAAI 2012. For the full version of the paper, please see: http://arxiv.org/abs/1204.3074
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An... (PyData)
Artificial intelligence is emerging as a new paradigm in materials science. This talk describes how physical intuition and (insightful) machine learning can solve the complicated task of structure recognition in materials at the nanoscale.
Jos van Sas - Testimonial Alcatel-Lucent Bell Labs (imec.archive)
- Bell Labs Alcatel-Lucent is an innovation engine with over 1,000 scientists and researchers across 8 countries collaborating with over 300 academic institutions. It has over 27,900 active patents and publishes over 400 papers per year.
- Experimentation facilities are important for evaluating solutions under realistic conditions at scale beyond theoretical research and simulations. This allows moving research closer to eventual product development.
- Examples of projects using large-scale emulation on the iLab.t platform include FP7 OCEAN investigating scalable content-aware delivery over CDNs and FP7 ECODE adding learning-based control to networks.
- Lessons learned are that large-scale emulation is important for validation and understanding before
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea... (ijtsrd)
Acoustic Scene Classification (ASC) classifies audio signals to infer the context of the recorded environment. An audio scene includes a mixture of background sound and a variety of sound events. In this paper, we present the combination of the maximal overlap discrete wavelet packet transform (MODWPT) at level 5 with six sets of time-domain and frequency-domain features: energy entropy, short-time energy, spectral roll-off, spectral centroid, spectral flux, and zero crossing rate, summarized by the statistics average and standard deviation. We used the DCASE Challenge 2016 dataset to study the behavior of machine learning classifiers. Several classifiers can address the ASC task; we compare K-nearest neighbors (KNN), Support Vector Machine (SVM), and Ensemble Bagged Trees using the combined wavelet and spectral features. Choosing the right classification methodology and feature extraction is essential for the ASC task. In this system, we extract, at level 5, MODWPT energy (32), relative energy (32), and statistic values (6) from the audio signal, and the extracted features are then applied to the different classifiers. Mie Mie Oo | Lwin Lwin Oo, "Acoustic Scene Classification by using Combination of MODWPT and Spectral Features", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume 3, Issue 5, August 2019. URL: https://www.ijtsrd.com/papers/ijtsrd27992.pdf Paper URL: https://www.ijtsrd.com/computer-science/multimedia/27992/acoustic-scene-classification-by-using-combination-of-modwpt-and-spectral-features/mie-mie-oo
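Two of the six features listed, short-time energy and zero crossing rate, are simple enough to compute directly. A minimal sketch (without the MODWPT stage, and on an invented toy frame) might look like:

```python
def short_time_energy(frame):
    # Sum of squared samples over the analysis frame.
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)

frame = [0.5, -0.5, 0.5, -0.5]   # toy frame: alternating signs
energy = short_time_energy(frame)
zcr = zero_crossing_rate(frame)
```

In the paper's pipeline these statistics would be computed per wavelet sub-band and averaged over frames before being fed to the classifiers.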
Outstanding advancements in imaging technology have made cryogenic electron microscopy a powerful technique for the nanocharacterization of biological macromolecular complexes, reaching atomic levels of resolution and being applicable to a wider set of samples than the other competing technologies. The real breakthrough in the development of cryo-EM has happened less than a decade ago, with the introduction of direct detection devices. These cameras allow unprecedented speed and resolution, and Lawrence Berkeley National Lab is developing a new detector, the 4D camera, that can operate at 87000 frames per second, revealing exclusive temporal dynamics of the investigated processes.
The current bottlenecks of the 4D camera, however, are the management of the large amount of data generated (around 50 GB/s) and the intrinsic noise level characterizing the signal acquired at that speed. Yet, the high frame rate enables the recognition of single electrons when they strike the detector, as opposed to traditional electron microscopy, where the charge is accumulated over every frame. Electron counting has remarkable advantages since it completely rejects electrical background noise as well as the variability in the electron charge deposition phenomena, and it dramatically compresses images by saving them as lists of event coordinates.
With this work, the counting efficiency of the algorithm is enhanced through the introduction of a denoising step before thresholding out the background noise, raising the precision by 7.11% with respect to the reference implementation. Furthermore, the localization of the events is refined to allow super-resolution, and a classification step is added to reduce the issue of collision losses, caused by overlapping electrons. In the end, a 10000x compression ratio is achieved thanks to electron counting. A GPU acceleration of the final algorithm is also proposed, achieving, in the best case, a speedup of 284x. The timing performance of the developed tool is, in fact, crucial for its real-time execution on the microscope output.
Ultimately, this work aims at enabling a more efficient data management between the microscopy center and the supercomputing facility, both involved in the data processing pipeline, by moving part of the computation towards the instrumentation and transferring only a compressed version of the datasets. The intelligent redistribution of workloads, in fact, removes the bottleneck in data transfer and grants the use of the microscope at its maximum frame rate.
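The core of the counting pipeline described above (threshold out background, then store only event coordinates) can be sketched on a toy frame; the frame values and threshold are invented, and the denoising, super-resolution, and classification refinements are omitted.

```python
def count_electrons(frame, threshold):
    # Keep only pixels above the background threshold and record their
    # coordinates: the frame compresses to a short list of events.
    return [(r, c)
            for r, row in enumerate(frame)
            for c, val in enumerate(row)
            if val > threshold]

frame = [
    [1, 2, 1, 1],
    [1, 90, 1, 1],   # one bright electron strike
    [1, 1, 1, 85],   # another strike
]
events = count_electrons(frame, threshold=50)
```

Twelve pixel values reduce to two coordinate pairs, which is the compression mechanism the abstract describes; at scale this is what turns 50 GB/s of raw frames into a manageable event stream.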
This document summarizes the first meeting of the robotics club. It introduces robots and lists the resources needed to build them, including money, time, electronics, mechanics, and computer programming skills. It then demonstrates some electronic components like LEDs, resistors, buttons, integrated circuits, and buzzers. The document provides an example Arduino program that uses a for loop to generate tones of increasing frequency on a buzzer.
This document describes using an artificial neural network (ANN) model to optimize the cost of reinforced concrete beams designed according to ACI 318-08 code requirements. The ANN model considers costs of concrete, reinforcement steel, and formwork. A simply supported beam was designed with variable cross-sectional dimensions to demonstrate the model. Computer models were developed using NEURO SHELL-2 software and results were compared to a classical optimization model in Excel using generalized reduced gradient methods, showing good agreement between the two approaches. The document provides details on the ANN model formulation, including design variables, constraints, and objective function to minimize total cost. An example problem is presented to optimize the design of a simply supported beam.
The use of sparse representation in direction of arrival (DoA) estimation has been around for a while. It exploits the angular sparsity of the impinging wavefronts and allows us to use much more efficient algorithms that perform well in very challenging scenarios such as coherent sources and a low number of snapshots. In applications like channel sounding and RADAR, however, it may not be enough to have only the DoA of the signal: the offset from the carrier frequency, or the Doppler frequency, can be of equal importance in these applications.
Human Face Detection and Tracking for Age Rank, Weight and Gender Estimation ... (IRJET Journal)
This document proposes a system for real-time human face detection, tracking, and estimation of age, weight, and gender from face images using a Raspberry Pi processor. The system uses OpenCV for face detection and extracts facial features to classify age, estimate weight, and determine gender through a probabilistic framework. The system allows for real-time detection of multiple faces with high efficiency even from low-quality images. Evaluation shows the low-cost Raspberry Pi provides fast execution speeds suitable for real-time applications.
Automatic Test Suite Generation for Key-Points Detection DNNs using Many-Obje... (Lionel Briand)
This paper proposes using many-objective search algorithms to automatically generate test suites for key-point detection deep neural networks (DNNs). The paper aims to find test images that cause DNNs to severely mispredict the locations of as many key-points as possible. It compares various search algorithms and finds that MOSA+ generates test suites that maximize both the number and severity of mispredicted key-points. Additionally, the paper builds regression trees to explain individual key-point mispredictions based on image characteristics, helping engineers assess risks and improve the DNNs.
This document outlines the steps for developing a predictive modeling project in Python:
1) Select an appropriate modeling technique based on the type of problem, amount of data, and other factors.
2) Prepare the data for modeling by formatting, mapping text to numbers, and splitting into features and targets.
3) Validate the model selection by evaluating performance on test data.
4) Implement the trained model in a production environment to make predictions on new data.
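The four steps above can be sketched end-to-end in plain Python; the toy data, the label mapping, and the nearest-centroid "model" are hypothetical stand-ins for whatever technique step 1 would actually select.

```python
# Step 2: prepare the data -- map text to numbers, split features from targets.
raw = [("small", 1.0, "cheap"), ("small", 1.2, "cheap"),
       ("large", 9.0, "pricey"), ("large", 9.5, "pricey")]
size_map = {"small": 0.0, "large": 1.0}
X = [[size_map[s], v] for s, v, _ in raw]
y = [label for _, _, label in raw]

# Steps 1 and 3: a deliberately simple model -- nearest class centroid.
def fit(X, y):
    centroids = {}
    for label in set(y):
        pts = [x for x, lbl in zip(X, y) if lbl == label]
        centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
    return centroids

def predict(centroids, x):
    # Assign the class whose centroid is closest (squared distance).
    return min(centroids, key=lambda lbl: sum(
        (a - b) ** 2 for a, b in zip(centroids[lbl], x)))

model = fit(X, y)
# Step 4: apply the trained model to a new observation.
prediction = predict(model, [1.0, 9.2])
```

Validation (step 3) would normally score `predict` on held-out data rather than a single point; the shape of the workflow is the same.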
An Example of Predictive Analytics: Building a Recommendation Engine Using Py... (PyData)
This document discusses building a hybrid recommendation engine using Python to recommend Pubmed documents. It begins with an introduction to predictive analytics and recommender systems. Different types of recommender systems are described, including knowledge-based, content-based, collaborative filtering, and hybrid models. The document then outlines a hybrid model that performs content-based filtering on Pubmed documents using vector space modeling and weights documents, before applying collaborative filtering using the Python-recsys library to filter and recommend documents. Finally, it demonstrates the hybrid model on a Pubmed dataset and compares its performance to using Python-recsys alone.
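The content-based stage of such a hybrid model scores documents by cosine similarity between term-weight vectors; a minimal sketch over invented term weights might look like:

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity over sparse term-weight vectors (dicts).
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (sqrt(sum(w * w for w in u.values()))
            * sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

# Hypothetical term-weight vectors for three documents and a user profile.
docs = {
    "doc1": {"gene": 0.8, "cancer": 0.6},
    "doc2": {"gene": 0.9, "protein": 0.4},
    "doc3": {"virus": 1.0},
}
profile = {"gene": 1.0, "cancer": 0.5}

ranked = sorted(docs, key=lambda d: cosine(profile, docs[d]), reverse=True)
```

In the hybrid pipeline, the top-ranked documents from this stage would then be re-ranked by the collaborative-filtering component.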
Creating Your First Predictive Model In Python (Robert Dempsey)
If you’ve been reading books and blog posts on machine learning and predictive analytics and are still left wondering how to create a predictive model and apply it to your own data, this presentation will give you the steps you need to take to do just that.
Discover why Python is better for Data Science: the whole workflow of Data Analysis is covered by Python. Tools for various tasks are shown, including: workflow, data analysis, data visualization, integration with Hadoop ecosystem, and communication.
This webinar will focus on the technical and practical aspects of creating and deploying predictive analytics. We have seen an emerging need for predictive analytics across clinical, operational, and financial domains. One pitfall we’ve seen with predictive analytics is that while many people with access to free tools can develop predictive models, many organizations fail to provide a sufficient infrastructure in which the models are deployed in a consistent, reliable way and truly embedded into the analytics environment. We will survey techniques that are used to get better predictions at scale. This webinar won’t be an intense mathematical treatment of the latest predictive algorithms, but will rather be a guide for organizations that want to embed predictive analytics into their technical and operational workflows.
Topics will include:
Reducing the time it takes to develop a model
Automating model training and retraining
Feature engineering
Deploying the model in the analytics environment
Deploying the model in the clinical environment
Addressing the New User Problem with a Personality Based User Similarity Measure (Marko Tkalčič)
This document proposes using a personality-based user similarity measure to address the new user problem in collaborative filtering recommender systems. It presents a methodology that models users based on their responses to a personality questionnaire and calculates similarity between users based on their personality profiles. The study finds that this personality-based approach performs better than a traditional rating-based approach under cold start conditions when users have provided few ratings. It also aims to determine the boundary between normal usage and cold start scenarios. The personality-based measure is shown to help address the new user problem but has drawbacks like requiring a personality assessment.
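A common choice for such a personality-based similarity is the Pearson correlation between users' trait vectors; the Big Five profiles below are invented for illustration, and the paper's exact similarity measure may differ.

```python
from math import sqrt

def pearson(u, v):
    # Pearson correlation between two personality profiles
    # (e.g. Big Five scores); higher means more similar users.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0

# Hypothetical Big Five profiles on a 1-5 scale.
alice = [4.0, 3.5, 2.0, 4.5, 3.0]
bob   = [4.2, 3.4, 2.1, 4.4, 3.1]   # very similar profile
carol = [1.0, 4.5, 5.0, 1.5, 2.0]   # quite different profile
sim_ab = pearson(alice, bob)
sim_ac = pearson(alice, carol)
```

Under cold start, these similarities replace the rating-overlap similarities that collaborative filtering cannot yet compute for a new user.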
This document summarizes research on semi-supervised classification methods for protein crystallization image classification. It describes self-training and YATSI (Yet Another Two Staged Idea) semi-supervised classification approaches applied to a dataset of 2250 protein crystallization images. Experimental results show that naive Bayesian and SMO classifiers benefited from self-training and YATSI, while decision trees, multilayer perceptron, and random forests did not improve. Random forest provided the best overall classification performance. Future work will investigate active learning combined with semi-supervised learning.
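The self-training loop the summary mentions can be sketched with a deliberately simple base learner (nearest class centroid) and a distance-based confidence rule; the data and threshold are invented, and real systems would use the classifiers named above.

```python
def dist2(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    return [sum(c) / len(points) for c in zip(*points)]

def self_train(labeled, unlabeled, max_dist):
    # labeled: list of (point, label); unlabeled: list of points.
    # Repeatedly label the unlabeled points that fall close enough to a
    # class centroid (the "confident" predictions) and retrain.
    labeled = list(labeled)
    changed = True
    while changed and unlabeled:
        cents = {lbl: centroid([p for p, l in labeled if l == lbl])
                 for lbl in {l for _, l in labeled}}
        changed = False
        remaining = []
        for p in unlabeled:
            lbl = min(cents, key=lambda l: dist2(p, cents[l]))
            if dist2(p, cents[lbl]) <= max_dist ** 2:
                labeled.append((p, lbl))  # confident: promote to training set
                changed = True
            else:
                remaining.append(p)       # stays unlabeled for now
        unlabeled = remaining
    return labeled

labeled = [((0.0, 0.0), "neg"), ((10.0, 10.0), "pos")]
unlabeled = [(0.5, 0.5), (9.5, 9.5), (5.0, 5.0)]  # last point stays uncertain
result = self_train(labeled, unlabeled, max_dist=2.0)
```

The ambiguous midpoint never gets promoted, which is exactly the behavior that keeps self-training from polluting the training set with low-confidence labels.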
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Multi illuminant estimation with c... (IEEEBEBTECHSTUDENTPROJECTS)
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
Violent Scenes Detection Using Mid-Level Violence Clustering (csandit)
This document proposes a system for detecting violent scenes in videos using a combination of visual and audio features analyzed at the segment level. The system applies multiple kernel learning to make full use of the multimodal nature of video data. It introduces "Mid-level Violence Clustering" which groups violent segments into clusters to implicitly learn mid-level concepts of violence without using manually tagged annotations. The system is trained on a dataset from MediaEval 2013 and evaluated using its official metric, outperforming the best score from that evaluation.
New Research Articles 2019 October Issue Signal & Image Processing An Interna... (sipij)
Signal & Image Processing: An International Journal (SIPIJ)
ISSN: 0976-710X [Online]; 2229-3922 [Print]
http://www.airccse.org/journal/sipij/index.html
Current Issue; October 2019, Volume 10, Number 5
Free- Reference Image Quality Assessment Framework Using Metrics Fusion and Dimensionality Reduction
Besma Sadou1, Atidel Lahoulou2, Toufik Bouden1, Anderson R. Avila3, Tiago H. Falk3 and Zahid Akhtar4, 1Non Destructive Testing Laboratory, University of Jijel, Algeria, 2LAOTI laboratory, University of Jijel, Algeria, 3University of Québec, Canada and 4University of Memphis, USA
Test-cost-sensitive Convolutional Neural Networks with Expert Branches
Mahdi Naghibi1, Reza Anvari1, Ali Forghani1 and Behrouz Minaei2, 1Malek-Ashtar University of Technology, Iran and 2Iran University of Science and Technology, Iran
Robust Image Watermarking Method using Wavelet Transform
Omar Adwan, The University of Jordan, Jordan
Improvements of the Analysis of Human Activity Using Acceleration Record of Electrocardiographs
Itaru Kaneko1, Yutaka Yoshida2 and Emi Yuda3, 1&2Nagoya City University, Japan and 3Tohoku University, Japan
http://www.airccse.org/journal/sipij/vol10.html
Retraining maximum likelihood classifiers using low-rank model.ppt (grssieee)
The document proposes a low-rank parameter modeling method to retrain maximum likelihood classifiers and address dataset shift between training and test data. It models test data using a low-rank approach with unknown parameter vectors estimated from the data. The method is evaluated on cloud detection using satellite images and tree cover mapping with Landsat images, showing improved classification performance over not retraining.
This document describes a data analysis method to automatically detect energy losses from shadows on a partially shaded residential PV system using only production data. The method defines an error barrier between a benchmark PV system and the studied system. Times when the error exceeds the barrier are marked in red, otherwise green. Periods with high red concentrations indicate shadowing. Shadowed times are then analyzed daily to distinguish between shaded and unshaded days, and further analyze shadowing within expected shadow hours only on shaded days. The goal is to distinguish energy losses due to shadows from other faults using just production data from the inverter.
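The error-barrier flagging at the heart of this method can be sketched directly; the hourly production values and the barrier below are invented for illustration.

```python
def flag_shading(benchmark, studied, barrier):
    # Mark each hour red when the studied system underperforms the
    # benchmark by more than the error barrier, otherwise green.
    flags = []
    for b, s in zip(benchmark, studied):
        error = b - s
        flags.append("red" if error > barrier else "green")
    return flags

# Hypothetical hourly production (kWh): a shadow hits hours 2-3.
benchmark = [1.0, 2.0, 3.0, 3.0, 2.0]
studied   = [1.0, 1.9, 1.5, 1.6, 2.0]
flags = flag_shading(benchmark, studied, barrier=0.5)
```

The daily analysis described above would then look at how the red flags cluster: recurring red runs at the same hours across days point to shading rather than a one-off fault.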
Irina Rish, Researcher, IBM Watson, at MLconf NYC 2017 (MLconf)
Irina Rish is a researcher at the AI Foundations department of the IBM T.J. Watson Research Center. She received MS in Applied Mathematics from Moscow Gubkin Institute, Russia, and PhD in Computer Science from the University of California, Irvine. Her areas of expertise include artificial intelligence and machine learning, with a particular focus on probabilistic graphical models, sparsity and compressed sensing, active learning, and their applications to various domains, ranging from diagnosis and performance management of distributed computer systems (“autonomic computing”) to predictive modeling and statistical biomarker discovery in neuroimaging and other biological data. Irina has published over 60 research papers, several book chapters, two edited books, and a monograph on Sparse Modeling, taught several tutorials and organized multiple workshops at machine-learning conferences, including NIPS, ICML and ECML. She holds 24 patents and several IBM awards. Irina currently serves on the editorial board of the Artificial Intelligence Journal (AIJ). As an adjunct professor at the EE Department of Columbia University, she taught several advanced graduate courses on statistical learning and sparse signal modeling.
Abstract Summary:
Learning About the Brain and Brain-Inspired Learning:
Quantifying mental states and identifying statistical biomarkers of mental disorders from neuroimaging data is an exciting and rapidly growing research area at the intersection of neuroscience and machine learning, with the particular focus on interpretability and reproducibility of learned models. We will discuss promises and limitations of machine-learning methods in such applications, focusing on recent applications of deep learning methods such as recurrent convnets to the analysis of “brain movies” (EEG) data. On the other hand, besides the above “AI to Brain” direction, we will also discuss the “Brain to AI”, namely, borrowing ideas from neuroscience to improve machine learning, with specific focus on adult neurogenesis and online model adaptation in representation learning.
This document summarizes the skills and experience of Yanjun Chen, including 6 years of experience in optical, electrical, and mechanical subsystem design, system integration, and image processing. Chen has a M.S. in Bioengineering from UIC and B.S. in Optoelectronic Information Engineering from Harbin Institute of Technology. Areas of expertise include optical coherence tomography, adaptive optics, and ultra-precision motion control.
This 3-credit course surveys principles of remote sensing instrumentation design with an emphasis on satellite-borne visible and near-infrared instruments. Topics include satellite remote sensing techniques, electromagnetic radiation properties, visible and near-infrared detectors, imaging systems, radiometry, instrument-spacecraft integration, and current and future remote sensing systems. The course also covers optical and passive microwave sensor design, as well as active microwave radar instrumentation. Prerequisites include undergraduate physics or engineering physics.
Program for 2015 ieee international conference on consumer electronics taiw... (supra_uny)
This document provides the program schedule for the 2015 IEEE International Conference on Consumer Electronics held in Taiwan from June 6-8, 2015. The conference included sessions on various topics related to consumer electronics on those dates in rooms IB-101, IB-201, IB-202 and other locations. Keynote speeches were given each day from 10:50-12:00 and other times. Sessions covered areas such as multimedia signal processing, security, big data analytics, wearables, medical applications and more. Social events included a welcome reception on June 6th and a banquet on June 7th.
A Comparative Case Study on Compression Algorithm for Remote Sensing Images (DR.P.S.JAGADEESH KUMAR)
This document summarizes research on compression algorithms for remote sensing images. It begins with an abstract describing the challenges of transmitting large remote sensing images from sensors to networks. The document then reviews 18 different research papers on various compression algorithms for remote sensing images, including wavelet-based algorithms, fractal coding methods, and region-based approaches. It evaluates each algorithm's performance in compressing remote sensing images while maintaining quality. The document aims to perform a comparative case study of these different compression algorithms.
The document describes a usability study comparing a video projector and an inter-PC screen broadcasting system for use in a computer laboratory. Subjects were shown text samples of varying difficulty levels using each tool, and their typing speed and accuracy were recorded. Results showed the projector was better for smaller amounts of text on one screen, while the screen broadcasting system was better for larger amounts of text. The study aimed to provide benchmarks for choosing presentation tools for computer labs.
Jing Li seeks a position as an SI/EMC engineer with extensive experience in EMC analysis and design. She has a PhD in electrical engineering from Missouri University of Science and Technology with a focus on signal integrity and computational electromagnetics. Her past work includes analyzing EMI issues at Cisco and developing absorbing materials to mitigate radiation. She has published several papers and her research analyzed the radiation mechanisms of optical links and connectors.
This is the second progress report of the project "Classification and Detection of liquid samples in hospitals and chemical labs using computer vision". The group has created a dataset of over 2000 annotated images across 12 classes. They trained 3 object detection models - YOLOv4_pro, YOLOv5n, YOLOv5s on this dataset and compared their validation mAP scores. The group has also tested these models on different size images and videos. Their next steps are to write a research paper, develop a real-time application, and test model speed on videos.
Mitchell D. Herndon is seeking an engineering position and has relevant experience constructing hardware components and cable harnesses for electronic warfare systems through co-ops at Georgia Tech Research Institute. He has a 3.6 GPA while pursuing a Bachelor of Science in Electrical Engineering from Georgia Institute of Technology graduating in Fall 2016. Additionally, he has worked as a water quality technician and tutor. Herndon has skills in programming languages like MATLAB, C, C++ and software like Labview, Quartus II and has experience with electronics hardware. He is pursuing a master's degree in electrical engineering.
Yu-Li Liang has extensive experience and skills in computer vision, image processing, machine learning, data analysis, and biological system modeling. He holds a PhD in Computer Science from the University of Colorado Boulder and has worked as a postdoctoral scholar at Caltech. His research focuses on developing algorithms for feature extraction from sequential images, obscene content detection in online videos, skin condition evaluation from photos, and geographical feature detection in satellite images. He has strong programming skills in MATLAB, C++, Python, and machine learning tools like WEKA.
Taking AI to the Next Level in Manufacturing.pdf (ssuserfac0301)
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
This presentation provides valuable insights into effective cost-saving techniques on AWS. Learn how to optimize your AWS resources by rightsizing, increasing elasticity, picking the right storage class, and choosing the best pricing model. Additionally, discover essential governance mechanisms to ensure continuous cost efficiency. Whether you are new to AWS or an experienced user, this presentation provides clear and practical tips to help you reduce your cloud costs and get the most out of your budget.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack (shyamraj55)
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Letter and Document Automation for Bonterra Impact Management (fka Social Sol... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on automated letter generation for Bonterra Impact Management using Google Workspace or Microsoft 365.
Interested in deploying letter generation automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers (akankshawande)
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Building Production Ready Search Pipelines with Spark and Milvus (Zilliz)
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
5th LF Energy Power Grid Model Meet-up Slides (DanBrown980551)
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf (Malak Abu Hammad)
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
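The brute-force core of vector search can be sketched in a few lines: rank stored embeddings by cosine similarity to a query vector and return the top k. The document ids and embeddings below are invented; a vector index like Atlas Vector Search exists precisely to make this operation fast at scale.

```python
from math import sqrt

def top_k(query, vectors, k=2):
    # Brute-force nearest-neighbor search by cosine similarity -- the
    # core operation a vector index accelerates at scale.
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (sqrt(sum(a * a for a in u))
                      * sqrt(sum(b * b for b in v)))
    return sorted(vectors, key=lambda item: cos(query, item[1]),
                  reverse=True)[:k]

# Hypothetical embeddings keyed by document id.
store = [("apple", [0.9, 0.1]), ("banana", [0.8, 0.2]), ("car", [0.0, 1.0])]
hits = top_k([1.0, 0.0], store, k=2)
```

In an LLM retrieval pipeline, `query` would be the embedding of the user's question and the returned documents would be injected into the prompt as context.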
Dive into the realm of operating systems (OS) with Pravash Chandra Das, a seasoned Digital Forensic Analyst, as your guide. 🚀 This comprehensive presentation illuminates the core concepts, types, and evolution of OS, essential for understanding modern computing landscapes.
Beginning with the foundational definition, Das clarifies the pivotal role of OS as system software orchestrating hardware resources, software applications, and user interactions. Through succinct descriptions, he delineates the diverse types of OS, from single-user, single-task environments like early MS-DOS iterations, to multi-user, multi-tasking systems exemplified by modern Linux distributions.
Crucial components like the kernel and shell are dissected, highlighting their indispensable functions in resource management and user interface interaction. Das elucidates how the kernel acts as the central nervous system, orchestrating process scheduling, memory allocation, and device management. Meanwhile, the shell serves as the gateway for user commands, bridging the gap between human input and machine execution. 💻
The narrative then shifts to a captivating exploration of prominent desktop OSs, Windows, macOS, and Linux. Windows, with its globally ubiquitous presence and user-friendly interface, emerges as a cornerstone in personal computing history. macOS, lauded for its sleek design and seamless integration with Apple's ecosystem, stands as a beacon of stability and creativity. Linux, an open-source marvel, offers unparalleled flexibility and security, revolutionizing the computing landscape. 🖥️
Moving to the realm of mobile devices, Das unravels the dominance of Android and iOS. Android's open-source ethos fosters a vibrant ecosystem of customization and innovation, while iOS boasts a seamless user experience and robust security infrastructure. Meanwhile, discontinued platforms like Symbian and Palm OS evoke nostalgia for their pioneering roles in the smartphone revolution.
The journey concludes with a reflection on the ever-evolving landscape of OS, underscored by the emergence of real-time operating systems (RTOS) and the persistent quest for innovation and efficiency. As technology continues to shape our world, understanding the foundations and evolution of operating systems remains paramount. Join Pravash Chandra Das on this illuminating journey through the heart of computing. 🌟
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxSitimaJohn
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
How to improve the statistical power of the 10-fold cross-validation scheme in Recommender Systems
1. How to improve the statistical power of the 10-fold cross-validation scheme in Recommender Systems
University of Ljubljana
[LDOS]
..: Faculty of Electrical Engineering
..: Digital Signal, Image and Video Processing Laboratory
Andrej Košir
Ante Odić
Marko Tkalčič
2. Statistical power, replicability and reproducibility
What is:
Replicability: getting the same experimental result (on the same data)
Reproducibility: getting similar experimental results that lead to the same conclusion
Mackay, R., & Oldford, R. (2000). Scientific method, statistical method, and the speed of light (Working paper 2000-02). Department of Statistics and Actuarial Science, University of Waterloo.
In terms of statistical testing:
Higher power => better reproducibility
More likely to reach the same conclusions
3. On statistical hypothesis testing
When do we need statistical tests?
The results should not change if we repeat the experiment
Needed mostly at later stages of development, where the compared results are similar
[Diagram: two recommender systems, RS 1 and RS 2, evaluated on the same test data; reported F-measures: 0.72, 0.89, 0.74 — are the differences significant?]
Elements of statistical testing:
Working hypotheses
Null and alternative hypotheses: H0 and H1
p-value: p
Risk level: α
Decision on H0
4. On errors and statistical power
Errors in test decision:
Errors of type I and type II
Effect size
Power:

           decide Ĥ0   decide Ĥ1
H0 true       OK        type I
H1 true     type II       OK

Power = Pr[Ĥ1 | H1]
For each test a new analysis is required
More power is better; power analysis is the best one can do:
Task 1 - how to select the sample size: a priori power
Task 2 - how to estimate the achieved power: post-hoc power
History:
1908, William Sealy Gosset (Student): he did not need it
Mainly ignored since then
Software: GPower
http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/
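The two power-analysis tasks above can be sketched in a few lines. The example below uses Python's statsmodels for a paired t-test (GPower exposes the same calculations in a GUI); the medium effect size d = 0.5 is an assumed illustrative value, not taken from the experiment:

```python
# A minimal sketch of Task 1 (a priori) and Task 2 (post-hoc) power
# analysis for a paired t-test; effect size d = 0.5 is an assumed value.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()

# Task 1: sample size needed for 80% power at risk level alpha = 0.05.
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)

# Task 2: power actually achieved with only 10 paired observations
# (e.g. 10 cross-validation folds).
pw_achieved = analysis.solve_power(effect_size=0.5, nobs=10, alpha=0.05)

print(n_needed, pw_achieved)
```

For nonparametric tests such as the Wilcoxon signed-rank test, GPower applies a relative-efficiency correction; the t-test figures are a close upper bound.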
5. The application we were working on: contextual variables
Which contextual variables are relevant?
What is context?
Candidates: time, weather, mood, ...
Can we simply use them all?
• Irrelevant context can worsen the performance of the RS
Test whether a given context is relevant
How: compare the RS with and without it
ODIĆ, Ante, TKALČIČ, Marko, TASIČ, Jurij F., KOŠIR, Andrej. Predicting and detecting the relevant contextual information in a movie-recommender system. Interact. Comput., 2013, vol. 25, no. 1, pp. 74-90, doi:10.1093/iwc/iws003.
ODIĆ, Ante, TKALČIČ, Marko, TASIČ, Jurij F., KOŠIR, Andrej. Impact of the context relevancy on ratings prediction in a movie-recommender system. Automatika (Zagreb), 2013, vol. 54, no. 2, pp. 252-262, doi:10.7305/automatika.54-2.258.
6. The problem we observed: the cross-validation scheme
ODIĆ, Ante, TKALČIČ, Marko, TASIČ, Jurij F., KOŠIR, Andrej. Predicting and detecting the relevant contextual information in a movie-recommender system. Interact. Comput., vol. 25, no. 1, pp. 74-90, 2013.
There were differences among the folds, but not in the conclusions
What is wrong?
Paired or unpaired?
What is usually done: the confusion matrix is computed over the pooled predictions, which is actually an unpaired comparison
7. Proposed solution
The procedure outline:
1. Select the scalar comparison measure (such as precision or the F-measure).
2. Store the evaluation results of each fold and each method separately.
3. According to the specific features of the evaluation results (distributions etc.), select the most powerful test that meets these specific features.
4. Perform the paired version of the selected test.
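Steps 2-4 of the procedure can be sketched with SciPy, using hypothetical per-fold F-measures for two recommender systems evaluated on the same 10-fold split (the numbers below are illustrative, not from the experiment):

```python
# Step 2: per-fold F-measures, stored separately for each method
# (hypothetical numbers; both rows come from the same 10 folds).
from scipy.stats import wilcoxon, mannwhitneyu

f_rs1 = [0.70, 0.72, 0.71, 0.74, 0.69, 0.73, 0.72, 0.70, 0.75, 0.71]
f_rs2 = [0.72, 0.74, 0.73, 0.75, 0.71, 0.76, 0.74, 0.73, 0.77, 0.73]

# Step 4: the paired test works on fold-wise differences, so the
# fold-to-fold variation shared by both methods cancels out.
p_paired = wilcoxon(f_rs1, f_rs2).pvalue

# For contrast: the unpaired comparison that a pooled
# confusion-matrix computation implicitly performs.
p_unpaired = mannwhitneyu(f_rs1, f_rs2).pvalue

print(p_paired, p_unpaired)
```

Because RS 2 beats RS 1 on every fold by a small margin, the paired p-value comes out much smaller than the unpaired one.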
8. Materials and methods (1)
Dataset:
Context Movie Dataset (LDOS-CoMoDa)
1611 ratings from 89 users to 946 items with associated contextual factors.
Contextual variables:
• time (morning, afternoon, evening, night),
• daytype (working day, weekend),
• season (spring, summer, autumn, winter),
• location (home, public place, friend's house),
• weather (sunny/clear, rainy, stormy, snowy, cloudy),
• social (alone, partner, friends, colleagues, parents, public, family),
• endEmo (sad, happy, scared, surprised, angry, disgusted, neutral),
• dominantEmo (sad, happy, scared, surprised, angry, disgusted, neutral),
• mood (positive, neutral, negative),
• physical (healthy, ill),
• decision (user's choice, given by other),
• interaction (first, n-th)
Publicly available:
LDOS-CoMoDa contextual dataset: available at www.ldos.si/comoda.html.
Used by 29 researchers at this moment.
9. Materials and methods (2), results
Experimental design:
10-fold cross-validation
Two procedures: ProcPaired, ProcIndep
Results - which contextual variable improves MF?
Tests: Wilcoxon signed-rank test (ProcPaired) and Mann-Whitney U test (ProcIndep)
The achieved (post-hoc) statistical power for the paired test (pw paired) and for the independent test (pw indep.), along with the computed p-values:
Id  Var 1        Var 2    pw paired  p paired  pw indep.  p indep.
1   physical     weather    0.42      0.001      0.14       0.24
2   decision     social     0.99      0.004      0.25       0.19
3   interaction  social     0.06     <0.001      0.05       0.43
10. Discussion
Power improvements:
The first combination (physical vs. weather): 0.14 → 0.42, low but useful;
The second combination (decision vs. social): 0.25 → 0.99, the difference in power is again substantial;
The third combination (interaction vs. social): 0.05 → 0.06, irrelevant;
It does not require substantial additional work
Worth the effort
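Why pairing helps can be illustrated with a small Monte Carlo sketch (hypothetical numbers, not the CoMoDa data): fold difficulty is shared by both methods, and the paired test removes that common variance:

```python
# Monte Carlo estimate of the rejection rate (empirical power) of the
# paired vs. the unpaired test over simulated 10-fold results.
import numpy as np
from scipy.stats import wilcoxon, mannwhitneyu

rng = np.random.default_rng(0)
n_folds, n_runs, alpha = 10, 500, 0.05
hits_paired = hits_indep = 0

for _ in range(n_runs):
    fold = rng.normal(0.0, 0.05, n_folds)           # shared fold difficulty
    a = 0.70 + fold + rng.normal(0, 0.01, n_folds)  # baseline RS
    b = 0.72 + fold + rng.normal(0, 0.01, n_folds)  # truly better RS
    hits_paired += wilcoxon(a, b).pvalue < alpha
    hits_indep += mannwhitneyu(a, b).pvalue < alpha

print(hits_paired / n_runs, hits_indep / n_runs)
```

With these settings the paired test detects the 0.02 improvement in most runs while the unpaired test rarely does, mirroring the power improvements reported above.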
11. Further work
We limited ourselves to 10-fold cross-validation and simple tests only. There is more out there.
We will concentrate on comparing RSs with respect to selected final tasks (such as the best five) and will not be limited to scalar performance measures (such as precision at five).
More sophisticated statistical approaches are available, such as multi-level repeated binomial regression;
my opinion: they will not be used frequently
THANK YOU
Invitation: International Conference on Automatic Face and Gesture
Recognition FG2015, http://www.fg2015.org/
12. Presentation structure
The goal
What does it have to do with replicability and reproducibility?
Selected items from statistics
Our case & problem statement
Proposed solution & comments
Experimental results
Future work
Take away notes