This document discusses principles for software effort estimation. It begins with an introduction stating that successful estimation is critical, yet projects are often over- or underestimated. It then presents 12 principles that address common questions about effort estimation, including using domain experts, pruning outliers, combining superior solo methods, and applying relevancy filtering when local data is lacking. The document advocates experimentation to determine the best practices, notes that size attributes are not always necessary, and promotes methods that reduce data needs through outlier and synonym pruning.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Neural Correlates of Technological Ambivalence: A Research Proposal Pierre-Majorique Léger
This proposal aims to study the neural correlates of technological ambivalence using electroencephalography (EEG). It hypothesizes that ambivalence involves increased activation in the anterior cingulate cortex and prefrontal cortex, which are involved in cognitive control and decision-making. The study will present participants with positive, negative, and incongruent feature combinations to induce ambivalence and measure EEG responses. Preliminary tests with 3 subjects found distinct neural activation patterns for ambivalence versus solely positive or negative states, including in the prefrontal cortex and posterior cingulate cortex. The full study will collect EEG data from 45 participants to further analyze the neural basis of technological ambivalence.
The document describes a service learning project undertaken by biomedical engineering students at California State University, Los Angeles to design an interactive game to assess fine motor skills in pediatric cerebral palsy patients. The students developed prototypes of sensor-embedded gloves and a LabVIEW program to record patient data during tests of increasing difficulty. A survey found that students highly enjoyed the project and felt it greatly benefited their education, particularly in strengthening engineering skills and gaining practical experience. The project appeared to reinforce students' interest in biomedical engineering careers without significantly changing their intended majors.
Robust face recognition by applying partitioning around medoids over eigen fa... - ijcsa
An unsupervised learning methodology for robust face recognition is proposed to enhance invariance to various changes in the face. Although face recognition is the most unobtrusive biometric modality of all, it has struggled to achieve high performance in uncontrolled environments owing to frequently occurring, unavoidable variations in the face. These changes may be due to noise, outliers, changing expressions, emotions, pose, illumination, or facial distractions such as makeup, spectacles, and hair growth. Methods for dealing with these variations have been developed in the past with varying success; however, cost and time efficiency play a crucial role in implementing any methodology in the real world. This paper presents a method that integrates the Partitioning Around Medoids technique with Eigen Faces and Fisher Faces to improve the efficiency of face recognition considerably. The resulting system has higher resistance to the impact of various changes in the face and performs well in terms of success rate, cost, and time complexity. The methodology can therefore be used to develop highly robust face recognition systems for real-time environments.
AN IMPROVE OBJECT-ORIENTED APPROACH FOR MULTI-OBJECTIVE FLEXIBLE JOB-SHOP SCH... - ijcsit
Flexible manufacturing systems are not easy to control, and it is difficult to generate controlling systems for this problem domain. The flexible job-shop scheduling problem (FJSP) is one instance in this domain. It extends the classical job-shop scheduling problem (JSP) with an additional routing sub-problem. In the routing sub-problem, each operation is assigned to a machine out of a set of capable machines. In the scheduling sub-problem, the sequence of assigned operations is determined while optimizing the objective function(s). In this work, an object-oriented (OO) approach with a simulated annealing algorithm is used for the multi-objective FJSP. Solution approaches in the literature generally use a two-string encoding scheme to represent this problem. However, OO analysis, design, and programming allow the problem to be represented effectively with a single encoding scheme (sketched below), which results in a practical integration of the solution into manufacturing control systems, where the OO paradigm is frequently used. Three objectives are considered in this paper: maximum completion time, workload of the most loaded machine, and total workload of all machines, which are the benchmark objectives used to show that the proposed system achieves effective results.
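As a rough illustration of what a single-string, object-oriented encoding can look like, the sketch below represents one FJSP solution as an ordered list of operation objects, where each object carries its routing decision (the assigned machine) and the list order gives the processing sequence. The tiny instance, class names, and greedy makespan evaluation are illustrative assumptions, not the paper's actual design or benchmarks.

```python
# Sketch of the single-object encoding idea: each operation object carries both its
# routing decision (assigned machine) and its place in the processing sequence, so one
# ordered list of operations encodes a complete FJSP solution. The instance is made up.
from dataclasses import dataclass

@dataclass
class Operation:
    job: int
    capable: dict[int, int]      # machine id -> processing time on that machine
    machine: int                 # routing decision: chosen machine

def makespan(sequence: list[Operation]) -> int:
    machine_free = {}            # earliest free time per machine
    job_free = {}                # earliest free time per job (a job's operations follow list order)
    for op in sequence:
        start = max(machine_free.get(op.machine, 0), job_free.get(op.job, 0))
        end = start + op.capable[op.machine]
        machine_free[op.machine] = end
        job_free[op.job] = end
    return max(machine_free.values())

# one encoded solution: the list order is the schedule, op.machine is the routing
solution = [
    Operation(job=0, capable={0: 3, 1: 5}, machine=0),
    Operation(job=1, capable={0: 4, 1: 2}, machine=1),
    Operation(job=0, capable={1: 6, 2: 4}, machine=2),
]
print("maximum completion time:", makespan(solution))
```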
Minimizing Musculoskeletal Disorders in Lathe Machine Workers - Waqas Tariq
In production units, workers perform their tasks under tough conditions. These tough conditions commonly give rise to various musculoskeletal disorders, which emerge in workers' bodies due to repetitive lifting, differing lifting heights, ambient conditions, and so on. It is quite difficult to model the minimization of musculoskeletal disorders with mathematical difference or differential equations. In this paper, the problem of minimizing musculoskeletal disorders is formulated using a fuzzy technique. Because such a nonlinear, complex problem is very difficult to capture, a nonlinear fuzzy model has been developed to handle these nonlinearities. The model is capable of representing solutions for minimizing the musculoskeletal disorders of workers in production units.
This document summarizes a research paper that analyzes the performance and convergence of a novel genetic algorithm model towards finding global minima. The paper introduces genetic algorithms, which are probabilistic search algorithms inspired by natural evolution. It describes the components of genetic algorithms, including chromosomes, fitness functions, reproduction, crossover, and mutation operators. It also discusses encoding solutions as chromosomes and two common genetic algorithm models: Holland's original model and the common model. The paper aims to present an analysis of applying genetic algorithms to optimize test functions and finding their global minima.
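To make the listed components concrete, here is a minimal, self-contained genetic algorithm sketch with real-valued chromosomes, a fitness function, tournament selection, one-point crossover, and Gaussian mutation, applied to minimizing the sphere test function. The test function, population size, and rates are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal genetic algorithm sketch: real-valued chromosomes, tournament selection,
# one-point crossover, Gaussian mutation, minimizing the sphere test function.
import random

DIM, POP, GENS = 5, 40, 100

def fitness(x):                       # sphere function; global minimum at the origin
    return sum(v * v for v in x)

def tournament(pop, k=3):             # pick the best of k random individuals
    return min(random.sample(pop, k), key=fitness)

def crossover(a, b):                  # one-point crossover
    p = random.randint(1, DIM - 1)
    return a[:p] + b[p:]

def mutate(x, rate=0.1, sigma=0.3):   # per-gene Gaussian mutation
    return [v + random.gauss(0, sigma) if random.random() < rate else v for v in x]

pop = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(POP)]
for _ in range(GENS):
    pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in range(POP)]

best = min(pop, key=fitness)
print("best fitness:", fitness(best))
```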
ANP-GP Approach for Selection of Software Architecture Styles - Waqas Tariq
Abstract: Selecting a software architecture for any system is a difficult task, as many different stakeholders are involved in the selection process. Stakeholders' views on quality requirements differ and at times may be conflicting. Selecting appropriate styles for the software architecture is also important, as styles impact characteristics of the software (e.g., reliability, performance). Moreover, styles influence how software is built, as they determine architectural elements (e.g., components, connectors) and rules on how to integrate these elements in the architecture. Selecting the best style is difficult because multiple factors are involved, such as project risk, corporate goals, and limited availability of resources. Therefore, this study presents a method, called SSAS, for the selection of software architecture styles. This selection is a multi-criteria decision-making problem in which different goals and objectives must be taken into consideration. In this paper, we suggest an improved selection methodology that reflects interdependencies among evaluation criteria and alternatives using the analytic network process (ANP) within a zero-one goal programming (ZOGP) model. Keywords: Software Architecture; Selection of Software Architecture Styles; Multi-Criteria Decision Making; Interdependence; Analytic Network Process (ANP); Zero-One Goal Programming (ZOGP)
This document discusses two-system models of decision making and presents results from experiments investigating the relationship between executive function and decisions in the Ultimatum Game. It finds that updating ability, but not switching or inhibition, is positively related to more rational decision making. However, this relationship is fragile and depends on factors like whether unfair feedback primes affective responses. Overall, the evidence for a connection between executive function and System 2 decision making is mixed, possibly due to limitations of the within-subject paradigm used.
Application of Genetic Algorithm and Particle Swarm Optimization in Software ... - IOSR Journals
This document discusses using genetic algorithms and particle swarm optimization techniques to optimize software testing by finding the most error-prone paths in a program. It begins by providing background on software testing and the need for automated techniques. It then describes how genetic algorithms and particle swarm optimization work as meta-heuristic search techniques that can be applied to the problem of generating optimal test cases. The document presents pseudocode for each algorithm and provides a sample implementation of genetic algorithms to optimize a mathematical function. It similarly provides an overview of implementing particle swarm optimization to minimize another mathematical function. The goal is to generate test cases using these algorithms and do a comparative study of their effectiveness.
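For orientation, a minimal particle swarm optimization loop of the kind such a study might implement is sketched below, minimizing a simple mathematical function. The objective, swarm size, and coefficients are illustrative assumptions; the paper's own fitness function targets error-prone program paths.

```python
# Minimal particle swarm optimization sketch minimizing a simple test function.
# Inertia, cognitive, and social coefficients are common textbook values.
import random

def objective(x):                        # sphere test function; global minimum at the origin
    return sum(v * v for v in x)

DIM, SWARM, ITERS = 2, 30, 200
W, C1, C2 = 0.7, 1.5, 1.5                # inertia, cognitive, social weights

pos = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(SWARM)]
vel = [[0.0] * DIM for _ in range(SWARM)]
pbest = [p[:] for p in pos]              # personal best positions
gbest = min(pbest, key=objective)        # global best position

for _ in range(ITERS):
    for i in range(SWARM):
        for d in range(DIM):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (W * vel[i][d]
                         + C1 * r1 * (pbest[i][d] - pos[i][d])
                         + C2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if objective(pos[i]) < objective(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest, key=objective)

print("best position:", gbest, "value:", objective(gbest))
```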
Quantitative Analysis of Infant’s Computer-supported Sketch and Design of Dra... - Mohd Syahmi
This study analyzed infants' computer-supported drawing behavior quantitatively and compared it to traditional paper and crayon drawing. Researchers conducted experiments where infants drew their favorite animals using different tools: paper and crayon, and drawing software with varying levels of features. The infants' drawing actions were recorded on video and analyzed using time and event sampling observation methods. The results showed that drawing software with small color palettes made color selection difficult for infants and limited creative stimulation. The study proposes that drawing software for infants should have large color palettes and easily movable coloring tools to reduce mental load and promote creativity.
A DISCUSSION ON IMAGE ENHANCEMENT USING HISTOGRAM EQUALIZATION BY VARIOUS MET... - pharmaindexing
This document summarizes several papers on image enhancement techniques using histogram equalization. It discusses papers that propose sub-region histogram equalization to improve contrast while preserving spatial relationships. It also discusses a 3D histogram equalization method that produces a uniform 1D grayscale histogram to overcome issues with previous color histogram methods. Another paper proposes using total variation minimization for cartoon-texture decomposition prior to histogram equalization to reduce intensity saturation effects. Further, a technique called gain controllable clipped histogram equalization is presented to enhance contrast while preserving original brightness. Finally, a method called bi-histogram equalization with neighborhood metrics is described which divides histograms to improve local contrast while maintaining brightness.
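All of these variants refine the same baseline operation: plain global histogram equalization, which remaps intensities through the normalized cumulative histogram. A minimal NumPy sketch of that baseline follows, assuming an 8-bit grayscale image; the surveyed methods differ in how they split, clip, or weight this mapping.

```python
# Plain global histogram equalization for an 8-bit grayscale image with NumPy.
import numpy as np

def equalize(img: np.ndarray) -> np.ndarray:
    hist, _ = np.histogram(img.flatten(), bins=256, range=(0, 256))
    cdf = hist.cumsum()                                            # cumulative histogram
    cdf_norm = (cdf - cdf.min()) * 255 / (cdf.max() - cdf.min())   # stretch to 0..255
    return cdf_norm[img].astype(np.uint8)                          # map each pixel through the CDF

# usage on a synthetic low-contrast image
img = np.clip(np.random.normal(100, 10, (64, 64)), 0, 255).astype(np.uint8)
print(img.min(), img.max(), "->", equalize(img).min(), equalize(img).max())
```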
The document discusses two-system models of decision making that propose an automatic, emotional System 1 and a controlled, rational System 2. It reports on several experiments that tested how executive function and working memory load affect decisions in the Ultimatum Game. The results provided little evidence that executive function is directly related to System 2 decision making. Relationships found were fragile and dependent on other factors like feedback received. Further research is needed to better understand how different cognitive systems interact in decision making.
FACIAL AGE ESTIMATION USING TRANSFER LEARNING AND BAYESIAN OPTIMIZATION BASED... - sipij
The document summarizes research on facial age estimation using transfer learning and Bayesian optimization based on gender information. Specifically:
1) A convolutional neural network is trained to classify gender from facial images. This gender classification CNN is then used as input for an age estimation model.
2) Bayesian optimization is applied to the pre-trained gender classification CNN to fine-tune it for the age estimation task. This reduces error on validation data.
3) Experiments on the FERET and FG-NET datasets show the proposed approach of using gender information and Bayesian optimization outperforms state-of-the-art methods, achieving a mean absolute error of 1.2 and 2.67 respectively.
This document summarizes theories of divided attention from psychological literature. It describes dual task experiments and factors like task similarity, difficulty, and practice that influence performance. Early theories proposed either a single, limited central processor (Kahneman) or multiple specialized modules (Allport). Later theories like multiple resource theory (Navon & Gopher) and Baddeley's model of working memory provided a synthesis, combining a central executive with modality-specific subsystems to better explain dual task findings. However, all theories have limitations in fully specifying the cognitive architecture underlying divided attention.
1) The document discusses inhibitory control in task switching. It describes how humans must maintain and flexibly switch between task representations.
2) A key effect is the "N-2 repetition cost", where response times are slower when repeating the task performed two trials prior (e.g. ABA), compared to repeating a different task (e.g. CBA). This effect is proposed to reflect inhibitory control.
3) However, the reliability and underlying mechanisms of the N-2 repetition cost are debated. Episodic retrieval effects could also explain the cost. The document reviews evidence both supporting inhibition and suggesting alternative accounts or modulatory factors like practice and cue transparency.
This document describes a study that uses a neural network to predict the success level (flop, hit, or superhit) of Indian movies. The researchers:
1. Collected data on factors that influence movie success, such as the actors, directors, producers, etc. and their past movie performances.
2. Used this historical data to assign weights to each factor and develop a methodology to classify movies into success levels based on thresholds.
3. Trained a neural network using the weighted input data to automate the movie success prediction process.
4. Evaluated the model and found it could accurately predict success levels for 93.3% of movies, based on a confusion matrix analysis of actual versus predicted outcomes.
1) The document proposes measuring human learning ability through complexity measures like Rademacher complexity and algorithmic stability, which are commonly used to analyze machine learning algorithms.
2) An experiment was designed to estimate average human Rademacher complexity and algorithmic stability for students on different types of tasks (shape and word problems).
3) The results showed that human algorithmic stability provided more useful insights into human learning than Rademacher complexity, as it does not require fixing a function class or assuming optimal performance like Rademacher complexity does.
Nakayama Estimation Of Viewers Response For Contextual Understanding Of Tasks... - Kalle
To estimate viewers' contextual understanding, features of their eye movements while viewing question statements in response to definition statements were extracted and compared between correct and incorrect responses. Twelve directional features of eye movements across a two-dimensional space were created and compared between correct and incorrect responses. A procedure for estimating the response was developed with Support Vector Machines using these features, and the estimation performance and accuracy were assessed across combinations of features. The number of definition statements that had to be memorized to answer the question statements during the experiment affected the estimation accuracy. These results provide evidence that features of eye movements during the reading of statements can be used as an index of contextual understanding.
The Influence of Task Characteristics on Multiple Objective and Subjective Co... - Pierre-Majorique Léger
Authors: Mahdi Mirhoseini, Pierre-Majorique Léger, Sylvain Sénécal
Abstract. Using electroencephalography (EEG), this study aims at extracting three features from an instantaneous mental workload measure and linking them to different aspects of the workload construct. An experiment was designed to investigate the effect of two workload inductors (task difficulty and uncertainty) on the extracted features along with a subjective measure of mental workload. Results suggest that both subjective and objective measures of workload are able to capture the effect of task difficulty; however, only accumulated load was found to be sensitive to task uncertainty. We discuss how the three EEG measures derived from instantaneous workload can be used as criteria for designing more efficient information systems.
The study measured differences in search interactions, eye gaze patterns, and knowledge gain between users who showed low versus high levels of learning during an online search task. Users searched for health information on two topics. Knowledge gain was measured using pre- and post-search assessments of topic knowledge, while eye gaze and search interactions were also tracked. Results showed that users with low learning fixated more and longer on pages but asked less specialized questions and felt less workload. Those with high learning showed more efficient search behavior and knowledge acquisition. The findings help understand factors influencing online learning.
IRJET- An Extensive Study of Sentiment Analysis Techniques and its Progressio... - IRJET Journal
This document discusses the progression of sentiment analysis techniques from traditional machine learning approaches to modern deep learning methods. It begins with an overview of traditional techniques like Naive Bayes and support vector machines. It then discusses how these methods were improved through techniques like feature selection, handling negation, and scaling to big data. The document traces how research increasingly focused on applying neural networks to sentiment analysis. It aims to provide insight into how state-of-the-art deep learning models are replacing earlier algorithms for sentiment analysis.
Genetic Approach to Parallel Scheduling - IOSR Journals
Genetic algorithms were used to solve the parallel task scheduling problem of minimizing overall completion time. The genetic algorithm represents each schedule as a chromosome. It initializes a population of random schedules and evaluates their fitness based on completion time. Selection, crossover, and mutation operators evolve the population over generations, and the best schedule found assigns tasks to processors so as to minimize completion time (see the sketch below). Testing on task graphs of varying sizes showed that the genetic algorithm finds improved schedules over generations and that tournament selection works better than roulette wheel selection.
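The sketch below illustrates the encoding and fitness idea in the simplest possible form: a chromosome maps each task to a processor, and fitness is the makespan of that assignment. Independent tasks with made-up durations are assumed; the actual study works on task graphs, which add precedence and communication constraints that this sketch omits.

```python
# Sketch of the chromosome/fitness idea: a schedule is encoded as a list mapping each
# task to a processor, and fitness is the resulting completion time (makespan).
import random

durations = [4, 2, 7, 3, 5, 1, 6, 2]      # illustrative task durations
N_PROC = 3

def makespan(chrom):
    load = [0] * N_PROC
    for task, proc in enumerate(chrom):
        load[proc] += durations[task]
    return max(load)                       # completion time of the busiest processor

def tournament(pop, k=3):                  # tournament selection, as in the study
    return min(random.sample(pop, k), key=makespan)

pop = [[random.randrange(N_PROC) for _ in durations] for _ in range(30)]
for _ in range(100):
    child_pop = []
    for _ in range(len(pop)):
        a, b = tournament(pop), tournament(pop)
        cut = random.randint(1, len(durations) - 1)
        child = a[:cut] + b[cut:]          # one-point crossover
        if random.random() < 0.2:          # mutation: reassign one task
            child[random.randrange(len(child))] = random.randrange(N_PROC)
        child_pop.append(child)
    pop = child_pop

print("best makespan found:", makespan(min(pop, key=makespan)))
```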
Game balancing with ecosystem mechanism - Anand Bhojan
To adapt game difficulty to a game character's strength, Dynamic Difficulty Adjustment (DDA) and other learning strategies have been applied in commercial game designs. However, most existing approaches cannot ensure diversity in results and rarely attempt to coordinate content generation and behaviour control together. This paper suggests a solution based on a multi-level swarm model and an ecosystem mechanism, in order to provide a more flexible way of controlling game balance.
This document presents an overview of a lesson on the technology acceptance model (TAM). The objective is to introduce key concepts of TAM, have student groups design concept maps of TAM, and summarize. TAM is presented as attempting to understand technology acceptance in organizations. It includes four versions and draws from other models like the theory of reasoned action. The core idea is that two key factors, perceived usefulness and perceived ease of use, influence users' decisions about adopting technologies.
Towards Accurate Estimation of Fingerprint Ridge Orientation Using BPNN and T... - IOSR Journals
This document proposes a new methodology using a neural network and ternarization process to estimate fingerprint ridge orientation. Ridge orientation is important for fingerprint enhancement and matching. The methodology first divides an image into blocks and generates a feature vector for each block based on gradient, intensity, and ridge properties. This vector is input to a trained neural network that responds with a value indicating ridge orientation quality. Twelve orientations are considered and the highest response identifies the orientation. Ternarization removes falsely identified high-response blocks while retaining correctly identified blocks, improving orientation estimation. Experimental results showed the proposed method estimates orientation better than traditional gradient-based approaches.
The document proposes and evaluates techniques for generating test input data to raise divide-by-zero exceptions in software systems. It compares various meta-heuristic techniques like hill climbing strategies, simulated annealing, and genetic algorithms on three case studies. The results show that genetic algorithms and constraint programming are most effective at generating inputs to trigger divide-by-zero exceptions with the proposed novel fitness function outperforming the original fitness function. Future work could explore applying these techniques to additional software units.
This document discusses principles for software effort estimation. It begins with an introduction explaining the importance of accurate estimation. It then discusses publications in the area and lists eight key questions about effort estimation, which it answers in the form of twelve principles. It covers issues such as using multiple methods, improving analogy-based estimation, handling the lack of local data, and determining the essential data needed. The principles promote methods that compensate for missing size attributes, combine outlier and synonym pruning, and account for sampling-method trade-offs.
ICSE’14 Workshop Keynote Address: Emerging Trends in Software Metrics (WeTSOM’14).
Data about software projects is not stored separately in metric1, metric2, …, but is shared between them in some underlying shape. Not every project has the same underlying simple shape; many projects have different, albeit simple, shapes. We can exploit that shape to great effect: for better local predictions, for transferring lessons learned, and for privacy-preserving data mining.
A Hybrid Approach to Expert and Model Based Effort Estimation - CS, NcState
Daniel Baker defended his thesis on a hybrid approach to expert and model-based software effort estimation. He automated several expert judgment best practices in his 2CEE tool and found that feature selection sometimes improved model performance while bagging and boosting did not. Evaluation of methods at NASA JPL found median error was greatly reduced. The thesis achieved a more robust uncertainty representation and revealed unstable COCOMO calibrations versus previous reports.
Pareto-Optimal Search-Based Software Engineering (POSBSE): A Literature Survey - Abdel Salam Sayyad
Paper presented at the 2nd International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE’13), San Francisco, USA. May 2013.
This document discusses parameter tuning versus using default values for test data generation using the EvoSuite tool. It finds that while parameter tuning can improve performance on average, default values perform relatively well. The available search budget, or time and resources, has a strong impact on which parameter settings should be used. Parameter tuning becomes computationally expensive and does not always lead to significant improvements over default values.
These are slides used at an Arithmer seminar given by Dr. Masaaki Uesaka at Arithmer Inc.
They summarize recent methods for quality assurance of machine learning models.
The Arithmer Seminar is held weekly, with professionals from within the company giving lectures on their respective areas of expertise.
Arithmer Inc. is a mathematics company that originated from the Graduate School of Mathematical Sciences at the University of Tokyo. We apply modern mathematics to bring advanced new AI systems to solutions in a wide range of fields. Our work is to consider how to use AI effectively to make work more efficient and to produce results that are useful to people.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
This document discusses a study on the impact of classification techniques on the performance of defect prediction models. Unlike prior work that grouped techniques into two similar ranks, this study, using non-overlapping statistical ranks, an expanded scope, and clean data sources, grouped techniques into four distinct ranks, showing that top techniques like logistic model trees and logistic regression outperform others. This suggests that the choice of classification technique matters more than previously thought. The study concludes that experimenting with the different available techniques could improve defect prediction model performance.
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri... - Abdel Salam Sayyad
The document summarizes a study that replicated an earlier empirical study on parameter tuning in search-based software engineering. The original study found that different parameter settings can significantly impact performance and that default parameter settings do not always perform optimally. The replication confirmed these findings, and also found that default settings generally performed poorly compared to best tuned settings. The replication also found that IBEA's best tuned performance was generally better than NSGA-II's best tuned performance. Additionally, parameter tuning on a sample of problems did not necessarily lead to the best settings for a new problem, but was generally better than default settings.
The paper presents a new language called UDITA for describing tests. UDITA is a Java-based language that includes non-deterministic choice operators and an interface for generating linked data structures. This allows for more efficient and effective test generation compared to previous approaches. The language aims to make test specification easier while generating tests that are faster, of higher quality, and less complex than traditional manually written or randomly generated tests.
This project aims to build a binary classifier model to label unlabeled DNA sequences as either positive (p) or negative (n) based on labeled training sequences. The team will take two approaches: 1) A k-mer approach that generates all DNA sequence fragments of length K and counts frequencies to use as attributes for classification models. 2) A PWM approach that uses motif finding tools to generate position weight matrices and score sequences to use as attributes. The approaches will be evaluated individually and combined to obtain the best performing model. Key challenges include deriving meaningful attributes from the sequence data alone. Parameters like k-mer length, number of motifs, and motif lengths will be tuned to optimize model performance.
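A minimal sketch of the k-mer feature extraction step described above is shown here: slide a window of length K over each sequence, count fragment frequencies, and emit a fixed-length attribute vector. K = 3 and the toy sequences are illustrative assumptions; the project would tune K and feed the vectors into whatever classifiers it evaluates.

```python
# Sketch of the k-mer approach: slide a window of length K over each DNA sequence,
# count fragment frequencies, and use the counts as attributes for a classifier.
from collections import Counter
from itertools import product

K = 3
ALL_KMERS = ["".join(p) for p in product("ACGT", repeat=K)]   # fixed attribute order

def kmer_features(seq: str) -> list[int]:
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    return [counts.get(kmer, 0) for kmer in ALL_KMERS]

# toy labeled training data: 'p' = positive, 'n' = negative
train = [("ACGTACGTAC", "p"), ("TTTTGGGGCC", "n")]
X = [kmer_features(seq) for seq, _ in train]
y = [label for _, label in train]
print(len(X[0]), "attributes per sequence;", X[0][:8])
```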
State of the Art in Machine Learning, by Thomas Dietterich, Distinguished Professor Emeritus in the School of EECS at Oregon State University and Chief Scientist of BigML.
*MLSEV 2020: Virtual Conference.
The application of machine learning algorithms for SEE - KiranKumar671235
The document discusses using machine learning algorithms to accurately estimate software development effort (SDE). It proposes using a modified Jaya optimization algorithm to select important features which are then input to an extreme gradient boosting model for SDE estimation. The key objectives are to develop a novel feature selection method, propose an ensemble model for accurate prediction, and improve prediction ability using deep learning stacking. It reviews related work applying metaheuristic and machine learning techniques for SDE estimation and outlines the proposed approach of using modified Jaya optimization and extreme gradient boosting.
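Purely as a hedged stand-in for the pipeline shape described above (the paper pairs a modified Jaya optimizer with extreme gradient boosting), the sketch below uses a plain univariate feature selector and scikit-learn's GradientBoostingRegressor on synthetic data to show the two-stage select-then-boost idea; none of the components or data reflect the paper's actual method.

```python
# Select-then-boost pipeline shape for effort estimation (illustrative stand-in only).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))                  # 15 synthetic project attributes
effort = 3 * X[:, 0] + 2 * X[:, 3] + rng.normal(scale=0.5, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, effort, random_state=0)
model = make_pipeline(SelectKBest(f_regression, k=5),          # stage 1: feature selection
                      GradientBoostingRegressor(random_state=0))  # stage 2: boosted trees
model.fit(X_tr, y_tr)
pred = model.predict(X_te)
print("mean absolute error:", float(np.mean(np.abs(pred - y_te))))
```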
This document provides an overview of a survey of multi-objective evolutionary algorithms for data mining tasks. It discusses key concepts in multi-objective optimization and evolutionary algorithms. It also reviews common data mining tasks like feature selection, classification, clustering, and association rule mining that are often formulated as multi-objective problems and solved using multi-objective evolutionary algorithms. The survey focuses on reviewing applications of multi-objective evolutionary algorithms for feature selection and classification in part 1, and applications for clustering, association rule mining and other tasks in part 2.
In the modern world, we are permanently using, leveraging, interacting with, and relying upon systems of ever higher sophistication, ranging from our cars, recommender systems in eCommerce, and networks when we go online, to integrated circuits when using our PCs and smartphones, security-critical software when accessing our bank accounts, and spreadsheets for financial planning and decision making. The complexity of these systems coupled with our high dependency on them implies both a non-negligible likelihood of system failures, and a high potential that such failures have significant negative effects on our everyday life. For that reason, it is a vital requirement to keep the harm of emerging failures to a minimum, which means minimizing the system downtime as well as the cost of system repair. This is where model-based diagnosis comes into play.
Model-based diagnosis is a principled, domain-independent approach that can be generally applied to troubleshoot systems of a wide variety of types, including all the ones mentioned above. It exploits and orchestrates techniques for knowledge representation, automated reasoning, heuristic problem solving, intelligent search, learning, stochastics, statistics, decision making under uncertainty, as well as combinatorics and set theory to detect, localize, and fix faults in abnormally behaving systems.
In this talk, we will give an introduction to the topic of model-based diagnosis, point out the major challenges in the field, and discuss a selection of approaches from our research addressing these challenges. For instance, we will present methods for the optimization of the time and memory performance of diagnosis systems, show efficient techniques for a semi-automatic debugging by interacting with a user or expert, and demonstrate how our algorithms can be effectively leveraged in important application domains such as scheduling or the Semantic Web.
Strategies for Optimization of an OLED Device - David Lee
Every experiment yields multiple data types, each requiring unique analyses and controls due to the sub-micron nature of an innovative organic light-emitting diode (OLED). Three specific data methods will be discussed. First, the premise of the study centers on a six-factor definitive screening design that was built utilizing new features incorporated in JMP 13 for improved power and signal detection. Multiple responses were modeled with a defect model generated via use of the Profiler and Simulation studies. Second, devices are continually monitored for radiance loss in an accelerated fade test. Frequently, devices are removed from the test prior to reaching their failure point. Predicted failure times can be estimated by utilizing a custom nonlinear model in either the Reliability Degradation or Nonlinear Model platforms. Estimated failure times were then incorporated into traditional parametric survival techniques, as well as new features in the Generalized Regression platform. Lastly, radiance data is collected across the visual spectrum, resulting in approximately 100 correlated responses.
A Software Measurement Using Artificial Neural Network and Support Vector Mac...ijseajournal
Today, Software measurement are based on various techniques such that neural network, Genetic
algorithm, Fuzzy Logic etc. This study involves the efficiency of applying support vector machine using
Gaussian Radial Basis kernel function to software measurement problem to increase the performance and
accuracy. Support vector machines (SVM) are innovative approach to constructing learning machines that
Minimize generalization error. There is a close relationship between SVMs and the Radial Basis Function
(RBF) classifiers. Both have found numerous applications such as in optical character recognition, object
detection, face verification, text categorization, and so on. The result demonstrated that the accuracy and
generalization performance of SVM Gaussian Radial Basis kernel function is better than RBFN. We also
examine and summarize the several superior points of the SVM compared with RBFN.
Similar to Ekrem Kocaguneli PhD Defense Presentation (20)
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
Generating privacy-protected synthetic data using Secludy and MilvusZilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Webinar: Designing a schema for a Data WarehouseFederico Razzoli
Are you new to data warehouses (DWH)? Do you need to check whether your data warehouse follows the best practices for a good design? In both cases, this webinar is for you.
A data warehouse is a central relational database that contains all measurements about a business or an organisation. This data comes from a variety of heterogeneous data sources, which includes databases of any type that back the applications used by the company, data files exported by some applications, or APIs provided by internal or external services.
But designing a data warehouse correctly is a hard task, which requires gathering information about the business processes that need to be analysed in the first place. These processes must be translated into so-called star schemas, which means, denormalised databases where each table represents a dimension or facts.
We will discuss these topics:
- How to gather information about a business;
- Understanding dictionaries and how to identify business entities;
- Dimensions and facts;
- Setting a table granularity;
- Types of facts;
- Types of dimensions;
- Snowflakes and how to avoid them;
- Expanding existing dimensions and facts.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfChart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on integration of Salesforce with Bonterra Impact Management.
Interested in deploying an integration with Salesforce for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
2. 2
Agenda
• Introduction
• Publications
• What to Know
• 8 Questions
• Answers
• 12 Principles
• Validity Issues
• Future Work
3. 3
Introduction
Software effort estimation (SEE) is the process of estimating the total
effort required to complete a software project (Keung2008 [1]).
Successful estimation is critical for software organizations.
Over-estimation: promising projects may be killed.
Under-estimation: the entire effort may be wasted! E.g., NASA’s
launch-control system was cancelled after its initial estimate of
$200M was overrun by another $200M [22].
Among IT projects developed in 2009, only 32% were
completed on time and with full functionality [23].
4. 4
Introduction (cntd.)
We will discuss algorithms, but it would be irresponsible to say
that SEE is merely an algorithmic problem. Organizational factors
are just as important, e.g. the common experiences of data collection
and user interaction in organizations operating in different domains.
5. 5
Introduction (cntd.)
This presentation is not about a single algorithm or answer targeting a
single problem, because there is not just one question.
It is (unfortunately) not everything there is to know about SEE.
It brings together the critical questions and the related solutions.
6. 6
What to know?
1 When do I have perfect data?
2 What is the best effort estimation method?
3 Can I use multiple methods?
4 ABE methods are easy to use. How can I improve them?
5 What if I lack resources for local data?
6 I don’t believe in size attributes. What can I do?
7 Are all attributes and all instances necessary?
8 How to experiment, which sampling method to use?
7. 7
Publications
Journals
• E. Kocaguneli, T. Menzies, J. Keung, “On the Value of Ensemble Effort Estimation”, IEEE Transactions on
Software Engineering, 2011.
• E. Kocaguneli, T. Menzies, A. Bener, J. Keung, “Exploiting the Essential Assumptions of Analogy-based
Effort Estimation”, IEEE Transactions on Software Engineering, 2011.
• E. Kocaguneli, T. Menzies, J. Keung, “Kernel Methods for Software Effort Estimation”, Empirical
Software Engineering Journal, 2011.
• J. Keung, E. Kocaguneli, T. Menzies, “A Ranking Stability Indicator for Selecting the Best Effort Estimator
in Software Cost Estimation”, Journal of Automated Software Engineering, 2012.
Journals under review
• E. Kocaguneli, T. Menzies, J. Keung, “Active Learning for Effort Estimation”, third round review at IEEE
Transactions on Software Engineering.
• E. Kocaguneli, T. Menzies, E. Mendes, “Transfer Learning in Effort Estimation”, submitted to ACM
Transactions on Software Engineering.
• E. Kocaguneli, T. Menzies, “Software Effort Models Should be Assessed Via Leave-One-Out Validation”,
under second round review at Journal of Systems and Software.
• E. Kocaguneli, T. Menzies, E. Mendes, “Towards Theoretical Maximum Prediction Accuracy Using D-
ABE”, submitted to IEEE Transactions on Software Engineering.
Conference
• E. Kocaguneli, T. Menzies, J. Hihn, Byeong Ho Kang, “Size Doesn’t Matter? On the Value of Software Size
Features for Effort Estimation”, Predictive Models in Software Engineering (PROMISE) 2012.
• E. Kocaguneli, T. Menzies, “How to Find Relevant Data for Effort Estimation”, International Symposium
on Empirical Software Engineering and Measurement (ESEM) 2011
• E. Kocaguneli, G. Gay, Y. Yang, T. Menzies, “When to Use Data from Other Projects for Effort Estimation”,
International Conference on Automated Software Engineering (ASE) 2010, Short-paper.
8. 8
1 When do I have the perfect data?
Principle #1: Know your domain
Domain knowledge is important in every step (Fayyad1996 [2])
Yet, this knowledge takes time and effort to gain,
e.g. percentage commit information
Principle #2: Let the experts talk
Initial results may be off according to domain experts
Success is to create discussion, interest, and suggestions
Principle #3: Suspect your data
“Curiosity” to question is a key characteristic (Rauser2011 [3])
e.g. in an SEE project, 200+ test cases, 0 bugs
Principle #4: Data collection is cyclic
Any step, from mining to presentation, may be repeated
9. 9
2 What is the best effort estimation method?
There is no agreed-upon best estimation method (Shepperd2001 [4]);
methods change ranking w.r.t. conditions such as data sets and
error measures (Myrtveit2005 [5]).
Experimenting with 90 solo-methods, 20 public data sets, and 7 error
measures, the top 13 methods are CART and ABE methods (1NN, 5NN).
10. 10
3 How to use a superior subset of methods?
We have a set of superior methods to recommend, and assembling
solo-methods may be a good idea, e.g. the fusion of 3 biometric
modalities (Ross2003 [20]).
But the previous evidence on assembling multiple methods in SEE is
discouraging: Baker2007 [7], Kocaguneli2009 [8], and Khoshgoftaar2009 [9]
failed to outperform solo-methods.
We combine the top 2, 4, 8, and 13 solo-methods via mean, median,
and IRWM (sketched below).
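A minimal sketch of how such a multi-method could be formed, assuming estimates from several solo-methods are already available for one project. The method names, effort values, and the reading of IRWM as an inverse-rank weighted mean are illustrative assumptions, not the exact setup of the study.

```python
# Hedged sketch: combine the estimates of the top-ranked solo methods for one
# project. "estimates" maps a hypothetical method name to its prediction and
# "ranking" lists methods from best to worst. The IRWM weighting below is one
# plausible reading (rank 1 gets the largest weight), not a confirmed formula.
from statistics import mean, median

def combine_estimates(estimates, ranking, top_n=4, scheme="irwm"):
    top = ranking[:top_n]                         # keep only the superior solo-methods
    values = [estimates[m] for m in top]
    if scheme == "mean":
        return mean(values)
    if scheme == "median":
        return median(values)
    weights = [top_n - i for i in range(top_n)]   # e.g. 4, 3, 2, 1 for top_n = 4
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Illustrative solo-method estimates in person-months
estimates = {"CART": 12.0, "1NN": 15.0, "5NN": 13.5, "LSR": 20.0}
ranking = ["CART", "1NN", "5NN", "LSR"]           # best first
print(combine_estimates(estimates, ranking, top_n=4, scheme="irwm"))
```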
11. 11
2 What is the best effort estimation method?
3 How to use a superior subset of methods?
Principle #5: Use a ranking stability indicator
Principle #6: Assemble superior solo-methods
A method to identify successful methods using their rank changes
A novel scheme for assembling solo-methods
Multi-methods that outperform all solo-methods
This research was published in:
• E. Kocaguneli, T. Menzies, J. Keung, “On the Value of Ensemble Effort Estimation”, IEEE Transactions on
Software Engineering, 2011.
• J. Keung, E. Kocaguneli, T. Menzies, “A Ranking Stability Indicator for Selecting the Best Effort Estimator in
Software Cost Estimation”, Journal of Automated Software Engineering, 2012.
12. 12
4 How can we improve ABE methods?
Analogy-based estimation (ABE) methods make use of similar past
projects for estimation. They are very widely used (Walkerden1999 [10]) because:
• No model calibration to local data is needed
• They can better handle outliers
• They can work with 1 or more attributes
• They are easy to explain
Two promising research areas:
• Weighting the selected analogies (Mendes2003 [11], Mosley2002 [12])
• Improving design options (Keung2008 [1])
A basic ABE estimator is sketched below.
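To make the idea concrete, here is a minimal, hedged sketch of a plain analogy-based estimator: find the k most similar past projects and return the median of their effort values. The feature vectors, the Euclidean distance, and the example data are illustrative assumptions, not the exact ABE variants evaluated in this work.

```python
# Minimal analogy-based estimation (ABE) sketch: k nearest past projects by
# Euclidean distance over (assumed pre-normalized) features, median of their
# effort values as the estimate. Data values are purely illustrative.
import math

def abe_estimate(train, test_features, k=3):
    # train: list of (feature_vector, effort) pairs from past projects
    by_distance = sorted(train, key=lambda t: math.dist(t[0], test_features))
    neighbors = sorted(effort for _, effort in by_distance[:k])
    mid = len(neighbors) // 2
    if len(neighbors) % 2:
        return neighbors[mid]
    return (neighbors[mid - 1] + neighbors[mid]) / 2

past_projects = [([0.2, 0.5], 10.0), ([0.3, 0.4], 12.0), ([0.9, 0.8], 40.0)]
print(abe_estimate(past_projects, [0.25, 0.45], k=2))
```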
13. 13
How can we improve ABE methods?
(cntd.)
Building on the previous research (Mendes2003 [11], Mosley2002 [12],
Keung2008 [1]), we adopted two different strategies.
a) Weighting analogies
We used kernel weighting to weigh the selected analogies and compared
the performance of each k-value with and without weighting.
In none of the scenarios did we see a significant improvement;
a similar experience was reported in defect prediction.
(A sketch of kernel weighting follows below.)
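For illustration only, this is one way kernel weighting of analogies can look: each selected analogy contributes to the estimate with a weight given by a Gaussian kernel of its distance. The kernel choice, the bandwidth h, and the data are assumptions of the sketch; as noted above, this line of work did not yield significant improvements.

```python
# Hedged sketch of kernel-weighted analogies: instead of an unweighted mean or
# median of the k nearest efforts, weight each analogy by a Gaussian kernel of
# its distance to the test project. Bandwidth h is an illustrative parameter.
import math

def kernel_weighted_estimate(analogies, h=0.5):
    # analogies: list of (distance, effort) for the selected nearest projects
    weights = [math.exp(-(d * d) / (2 * h * h)) for d, _ in analogies]
    total = sum(weights)
    return sum(w * e for w, (_, e) in zip(weights, analogies)) / total

print(kernel_weighted_estimate([(0.1, 10.0), (0.4, 12.0), (0.9, 40.0)]))
```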
14. 14
How can we improve ABE methods?
(cntd.)
b) Designing ABE methods
Easy-path: remove training instances that violate assumptions. TEAK will be discussed later.
D-ABE: built on the theoretical maximum prediction accuracy (TMPA) (Keung2008 [1]):
• Get the best estimates of all training instances
• Remove all the training instances within half of the worst MRE (according to TMPA)
• Return the closest neighbor’s estimate to the test instance
[Figure: training instances close to the worst MRE are removed; the estimate of the closest remaining neighbor is returned for the test instance.]
A sketch of this easy-path design follows below.
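A minimal sketch of the easy-path pruning described above, under illustrative assumptions: each training project is scored by the MRE of its closest-neighbor estimate, projects whose MRE lies within half of the worst MRE are removed, and the test project is answered by its closest remaining neighbor. The distance, MRE details, and sample data are stand-ins, not the study's exact D-ABE implementation.

```python
# Hedged sketch of the D-ABE easy-path idea: score every training instance by
# the MRE of its own closest-neighbor estimate, drop instances whose MRE falls
# within half of the worst MRE (i.e. the instances close to the worst MRE),
# then answer the test instance with its closest remaining neighbor.
import math

def mre(actual, predicted):
    return abs(actual - predicted) / actual

def dabe_estimate(train, test_features):
    # train: list of (features, effort)
    scores = []
    for i, (feats, effort) in enumerate(train):
        others = [t for j, t in enumerate(train) if j != i]
        _, nearest_effort = min(others, key=lambda t: math.dist(t[0], feats))
        scores.append(mre(effort, nearest_effort))
    worst = max(scores)
    kept = [t for t, s in zip(train, scores) if s < worst / 2]  # prune bad instances
    _, estimate = min(kept, key=lambda t: math.dist(t[0], test_features))
    return estimate

train = [([0.2, 0.5], 10.0), ([0.3, 0.4], 12.0), ([0.9, 0.8], 40.0), ([0.85, 0.75], 11.0)]
print(dabe_estimate(train, [0.25, 0.45]))
```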
15. 15
How can we improve ABE methods?
(cntd.)
[Figure: D-ABE compared to ABE with static k, w.r.t. MMRE and w.r.t. win/tie/loss counts.]
16. 16
How can we improve ABE methods?
(cntd.)
Principle #7: Weighting analogies is over-elaboration
Principle #8: Use easy-path design
Investigation of an unexplored and promising ABE option, kernel weighting
A negative result published in the Empirical Software Engineering Journal
An ABE design option that can be applied to different ABE methods (D-ABE, TEAK)
This research was published in:
• E. Kocaguneli, T. Menzies, A. Bener, J. Keung, “Exploiting the Essential Assumptions of Analogy-based Effort
Estimation”, IEEE Transactions on Software Engineering, 2011.
• E. Kocaguneli, T. Menzies, J. Keung, “Kernel Methods for Software Effort Estimation”, Empirical Software
Engineering Journal, 2011.
17. 17
5 How to handle lack of local data?
Finding enough local training data is a fundamental problem (Turhan2009 [13]),
and the merit of using cross-data from another company is questionable
(Kitchenham2007 [14]).
We use a relevancy filtering method called TEAK on public and proprietary
data sets. TEAK prunes regions of similar projects with dissimilar effort
values (hence high variance) and estimates from regions of similar projects
with similar effort values (hence low variance); a simplified sketch follows below.
Cross data works as well as within data for 6 out of 8 proprietary data sets
and 19 out of 21 public data sets after TEAK’s relevancy filtering.
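A much-simplified, hedged sketch of variance-based relevancy filtering in that spirit: group the (possibly cross-company) candidate projects, discard the groups whose effort values vary the most, and estimate from what remains. TEAK itself operates on GAC trees; the crude grouping, the keep ratio, and the data below are illustrative assumptions only.

```python
# Hedged, simplified relevancy-filtering sketch in the spirit of TEAK:
# keep only low-variance groups of training projects (similar projects with
# similar efforts) and estimate from the closest surviving project.
import math
from statistics import pvariance

def filter_by_effort_variance(train, n_groups=3, keep_ratio=0.5):
    # train: list of (features, effort); grouping here uses a crude 1-D projection
    ordered = sorted(train, key=lambda t: sum(t[0]))
    size = max(1, len(ordered) // n_groups)
    groups = [ordered[i:i + size] for i in range(0, len(ordered), size)]
    groups.sort(key=lambda g: pvariance([e for _, e in g]) if len(g) > 1 else 0.0)
    kept = groups[:max(1, int(len(groups) * keep_ratio))]
    return [project for g in kept for project in g]      # low-variance projects only

def nearest_effort(train, test_features):
    _, effort = min(train, key=lambda t: math.dist(t[0], test_features))
    return effort

cross_data = [([0.2, 0.5], 10.0), ([0.3, 0.4], 12.0), ([0.28, 0.45], 11.0),
              ([0.9, 0.8], 40.0), ([0.88, 0.79], 9.0), ([0.5, 0.5], 20.0)]
relevant = filter_by_effort_variance(cross_data)
print(nearest_effort(relevant, [0.25, 0.45]))
```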
18. 18
How to handle lack of local data?
(cntd.)
Principle #9: Use relevancy filtering
A novel method to handle lack of local data
Successful application on public as well as proprietary data
This research was published in:
• E. Kocaguneli, T. Menzies, “How to Find Relevant Data for Effort Estimation”, International Symposium on
Empirical Software Engineering and Measurement (ESEM) 2011
• E. Kocaguneli, G. Gay, Y. Yang, T. Menzies, “When to Use Data from Other Projects for Effort Estimation”,
International Conference on Automated Software Engineering (ASE) 2010, Short-paper.
19. 19
E(k) matrices & Popularity
This concept helps with the next two problems, size features and the
essential content, via the pop1NN and QUICK algorithms, respectively.
20. 20
E(k) matrices & Popularity (cntd.)
Outlier pruning: sample steps
1. Calculate the “popularity” of instances
2. Sort by popularity
3. Label one instance at a time
4. Find the stopping point
5. Return the closest neighbor from the active pool as the estimate
Finding the stopping point:
1. All popular instances are exhausted, or
2. there is no MRE improvement for n consecutive times, or
3. the ∆ between the best and the worst error of the last n times is very small (∆ = 0.1; n = 3).
A sketch of the popularity computation follows below.
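A small, hedged sketch of the popularity idea: an instance's popularity is taken here as the number of times it is the nearest neighbor of the other instances (a reverse-nearest-neighbor count), after which instances would be labeled in popularity order. The active-learning loop and the stopping rule above are not implemented in the sketch, and the data are illustrative.

```python
# Hedged popularity sketch: count how often each instance is the nearest
# neighbor of the others, then order instances by that count. The labeling
# loop and stopping rule described above are omitted here.
import math

def popularity(features):
    counts = [0] * len(features)
    for i, feats in enumerate(features):
        j = min((j for j in range(len(features)) if j != i),
                key=lambda j: math.dist(features[j], feats))
        counts[j] += 1                       # instance j is the nearest neighbor of i
    return counts

projects = [[0.2, 0.5], [0.3, 0.4], [0.28, 0.45], [0.9, 0.8]]
pops = popularity(projects)
order = sorted(range(len(projects)), key=lambda i: -pops[i])   # most popular first
print(pops, order)
```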
21. 21
E(k) matrices & Popularity (cntd.)
[Figure: labeling curve. Picking a random training instance is not a good idea; more popular instances in the active pool decrease error, until one of the stopping-point conditions fires.]
22. 22
6 Do I have to use size attributes?
At the heart of widely accepted SEE methods lie the software size attributes:
COCOMO uses LOC (Boehm1981 [15]), whereas FP (Albrecht1983 [16]) uses
logical transactions.
Size attributes are beneficial if used properly (Lum2002 [17]);
e.g. DoD and NASA use them successfully.
Yet, size attributes may not be trusted, or may not be estimable at the
early stages. That disrupts the adoption of SEE methods.
“Measuring software productivity by lines of code is like measuring progress
on an airplane by how much it weighs” – B. Gates
“This is a very costly measuring unit because it encourages the writing of
insipid code” – E. Dijkstra
23. 23
Do I have to use size attributes? (cntd.)
pop1NN (without size) vs. 1NN and CART (with size):
Given enough resources for correct collection and estimation,
size features may be helpful.
If not, then outlier pruning helps.
24. 24
Do I have to use size attributes? (cntd.)
Principle #10: Use outlier pruning
Promotion of SEE methods that can compensate for the lack of
software size features
A method called pop1NN, which shows that size features are not a “must”
This research was published in:
• E. Kocaguneli, T. Menzies, J. Hihn, Byeong Ho Kang, “Size Doesn’t Matter? On the Value of Software Size
Features for Effort Estimation”, Predictive Models in Software Engineering (PROMISE) 2012.
25. 25
7 What is the essential content of SEE data?
SEE is populated with overly complex methods for marginal performance
increase (Jorgensen2007 [18]). In a matrix of N instances and F features,
the essential content is N′ ∗ F′.
QUICK is an active learning method that combines outlier removal and
synonym pruning.
Synonym pruning:
1. Calculate the popularity of features
2. Select the non-popular features
Similar tasks both remove cells in the hypercube of all cases times all
columns: removal based on distance seemed to be reserved for instances
(Lipowezky1998 [24]), and an ABE method as a two-dimensional reduction
(Ahn2007 [25]). In our lab, a variance-based feature selector is used as
a row selector.
A sketch of synonym pruning follows below.
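For illustration, a hedged sketch of synonym pruning following the two steps above: treat the transposed data matrix as the instances, compute each feature's popularity as a nearest-neighbor count among features, and keep only the non-popular (non-redundant) features. The zero-popularity threshold, the distance, and the data are assumptions of the sketch rather than QUICK's exact rules.

```python
# Hedged synonym-pruning sketch: features that are the closest neighbors of
# other features are "popular" (likely synonyms) and are dropped; non-popular
# features are kept. Threshold and data are illustrative only.
import math

def feature_popularity(matrix):
    # matrix: list of rows (projects); columns are features
    cols = list(zip(*matrix))                  # transpose: one vector per feature
    counts = [0] * len(cols)
    for i, col in enumerate(cols):
        j = min((j for j in range(len(cols)) if j != i),
                key=lambda j: math.dist(cols[j], col))
        counts[j] += 1                         # feature j is the closest to feature i
    return counts

def prune_synonyms(matrix):
    counts = feature_popularity(matrix)
    keep = [i for i, c in enumerate(counts) if c == 0]    # non-popular features
    return [[row[i] for i in keep] for row in matrix], keep

data = [[1.0, 1.1, 9.0], [2.0, 2.1, 7.0], [3.0, 2.9, 8.0]]
reduced, kept_features = prune_synonyms(data)
print(kept_features, reduced)
```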
26. 26
What is the essential content of SEE data? (cntd.)
The essential content amounts to at most 31% of all the cells, and 10% on median.
Intrinsic dimensionality: “There is a consensus in the high-dimensional data
analysis community that the only reason any methods work in very high
dimensions is that, in fact, the data are not truly high-dimensional.”
(Levina & Bickel 2005)
Performance?
27. 27
What is the essential content of SEE data? (cntd.)
QUICK vs. passiveNN (1NN) and QUICK vs. CART:
There is only one data set where QUICK is significantly worse than passiveNN,
and 4 such data sets when QUICK is compared to CART.
28. 28
What is the essential content of SEE
data? (cntd.)
Principle #11: Combine outlier and synonym pruning
An unsupervised method to find the essential content of
SEE data sets and reduce the data needs
Promoting research that elaborates on the data, not on the algorithm
This research is under 3rd-round review:
• E. Kocaguneli, T. Menzies, J. Keung, “Active Learning for Effort Estimation”, third round review at IEEE
Transactions on Software Engineering.
29. 29
8 How should I choose the right sampling method (SM)?
Expectation (Kitchenham2007 [14]) vs. what we observed:
No significant difference in bias and variance (B&V) values among the 90 methods
Only minutes of run-time difference (<15)
LOO is not probabilistic, and results can be easily shared (a sketch follows below)
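A minimal, hedged sketch of leave-one-out (LOO) validation for an effort estimator: each project is held out once, estimated from the remaining projects, and the MRE values are aggregated. The nearest-neighbor estimator and the data are illustrative stand-ins; the point is that LOO is deterministic, so the results are easy to reproduce and share.

```python
# Hedged leave-one-out (LOO) validation sketch: hold out each project once,
# estimate it from the rest, and report the mean MRE (MMRE). LOO has no random
# splits, so repeated runs give identical, shareable results.
import math

def loo_mmre(data, estimator):
    # data: list of (features, effort)
    mres = []
    for i, (feats, actual) in enumerate(data):
        train = data[:i] + data[i + 1:]
        predicted = estimator(train, feats)
        mres.append(abs(actual - predicted) / actual)
    return sum(mres) / len(mres)

def nearest_neighbor(train, test_features):
    _, effort = min(train, key=lambda t: math.dist(t[0], test_features))
    return effort

projects = [([0.2, 0.5], 10.0), ([0.3, 0.4], 12.0), ([0.9, 0.8], 40.0), ([0.85, 0.75], 35.0)]
print(loo_mmre(projects, nearest_neighbor))
```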
30. 30
How should I choose the right SM?
(cntd.)
Principle #12: Be aware of sampling method trade-off
The first experimental investigation of B&V trade-off in SEE
Recommendations based on experimental concerns
This research is under 2nd-round review:
• E. Kocaguneli, T. Menzies, “Software Effort Models Should be Assessed Via Leave-One-Out Validation”,
under second round review at Journal of Systems and Software.
31. 31
What to know?
1 When do I have perfect data?
Principle #1: Know your domain; Principle #2: Let the experts talk; Principle #3: Suspect your data; Principle #4: Data collection is cyclic
2 What is the best effort estimation method?
Principle #5: Use a ranking stability indicator
3 Can I use multiple methods?
Principle #6: Assemble superior solo-methods
4 ABE methods are easy to use. How can I improve them?
Principle #7: Weighting analogies is over-elaboration; Principle #8: Use easy-path design
5 What if I lack resources for local data?
Principle #9: Use relevancy filtering
6 I don’t believe in size attributes. What can I do?
Principle #10: Use outlier pruning
7 Are all attributes and all instances necessary?
Principle #11: Combine outlier and synonym pruning
8 How to experiment, which sampling method to use?
Principle #12: Be aware of the sampling method trade-off
32. 32
Validity Issues
Construct validity, i.e. do we measure what we intend to measure?
We use previously recommended estimation methods, error measures, and data sets.
External validity, i.e. can we generalize the results outside the current specifications?
It is difficult to assert that the results will definitely hold.
Yet we use almost all of the publicly available SEE data sets: the median
number of projects used by the studies reviewed is 186 (Kitchenham2007 [14]),
while our experimentation uses 1000+ projects.
33. 33
Future Work
Application on publicly accessible big data sets, e.g. 300K projects and 2M users, or 250K open source projects.
Smarter, larger-scale algorithms for general conclusions: current methods may
face scalability issues, so we plan to improve common ideas for scalability,
e.g. linear-time NN methods.
Application to different domains, e.g. defect prediction.
Combining intrinsic dimensionality techniques from ML to lower-bound the
dimensions of SEE data sets (Levina2004 [27]).
36. 36
References
[1] J. W. Keung, “Theoretical Maximum Prediction Accuracy for Analogy-Based Software Cost
Estimation,” 15th Asia-Pacific Software Engineering Conference, pp. 495– 502, 2008.
[2] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth, “The kdd process for extracting useful knowledge
from volumes of data,” Commun. ACM, vol. 39, no. 11, pp. 27–34, Nov. 1996.
[3] J. Rauser, “What is a career in big data?” 2011. [Online]. Available: http:
//strataconf.com/stratany2011/public/schedule/speaker/10070
[4] M. Shepperd and G. Kadoda, “Comparing Software Prediction Techniques Using Simulation,” IEEE
Trans. Softw. Eng., vol. 27, no. 11, pp. 1014–1022, 2001.
[5] I. Myrtveit, E. Stensrud, and M. Shepperd, “Reliability and validity in comparative studies of
software prediction models,” IEEE Trans. Softw. Eng., vol. 31, no. 5, pp. 380–391, May 2005.
[6] E. Alpaydin, “Techniques for combining multiple learners,” Proceedings of Engineering of Intelligent
Systems, vol. 2, pp. 6–12, 1998.
[7] D. Baker, “A hybrid approach to expert and model-based effort estimation,” Master’s thesis, Lane
Department of Computer Science and Electrical Engineering, West Virginia University, 2007.
[8] E. Kocaguneli, Y. Kultur, and A. Bener, “Combining multiple learners induced on multiple datasets
for software effort prediction,” in International Symposium on Software Reliability Engineering (ISSRE),
2009, student Paper.
[9] T. M. Khoshgoftaar, P. Rebours, and N. Seliya, “Software quality analysis by combining multiple
projects and learners,” Software Quality Control, vol. 17, no. 1, pp. 25–49, 2009.
[10] F. Walkerden and R. Jeffery, “An empirical study of analogy-based software effort estima- tion,”
Empirical Software Engineering, vol. 4, no. 2, pp. 135–158, 1999.
[11] E. Mendes, I. D. Watson, C. Triggs, N. Mosley, and S. Counsell, “A comparative study of cost
estimation models for web hypermedia applications,” Empirical Software Engineering, vol. 8, no. 2, pp.
163–196, 2003.
37. 37
[12] E. Mendes and N. Mosley, “Further investigation into the use of CBR and stepwise regression to
predict development effort for web hypermedia applications,” in International Symposium on Empirical
Software Engineering, 2002.
[13] B. Turhan, T. Menzies, A. Bener, and J. Di Stefano, “On the relative value of cross-company and
within-company data for defect prediction,” Empirical Software Engineering, vol. 14, no. 5, pp. 540–
578, 2009.
[14] B. A. Kitchenham, E. Mendes, and G. H. Travassos, “Cross versus within-company cost
estimation studies: A systematic review,” IEEE Trans. Softw. Eng., vol. 33, no. 5, pp. 316– 329, 2007.
[15] B. W. Boehm, C. Abts, A. W. Brown, S. Chulani, B. K. Clark, E. Horowitz, R. Madachy, D. J.
Reifer, and B. Steece, Software Cost Estimation with Cocomo II. Upper Saddle River, NJ, USA:
Prentice Hall PTR, 2000.
[16] A. Albrecht and J. Gaffney, “Software function, source lines of code and development effort
prediction: A software science validation,” IEEE Trans. Softw. Eng., vol. 9, pp. 639–648, 1983.
[17] K. Lum, J. Powell, and J. Hihn, “Validation of spacecraft cost estimation models for flight and
ground systems,” in ISPA’02: Conference Proceedings, Software Modeling Track, 2002.
[18] M. Jorgensen and M. Shepperd, “A systematic review of software development cost estimation
studies,” IEEE Trans. Softw. Eng., vol. 33, no. 1, pp. 33–53, 2007.
[19] B. A. Kitchenham, E. Mendes, and G. H. Travassos, “Cross versus within-company cost
estimation studies: A systematic review,” IEEE Trans. Softw. Eng., vol. 33, no. 5, pp. 316– 329, 2007.
[20] A. Ross, “Information fusion in biometrics,” Pattern Recognition Letters, vol. 24, no. 13, pp. 2115–
2125, Sep. 2003.
[21] Raymond P. L. Buse, Thomas Zimmermann: Information needs for software development
analytics. ICSE 2012: 987-996
[22] Spaceref.com. NASA to shut down checkout & launch control system, August 26, 2002.
http://www.spaceref.com/news/viewnews.html?id=475.
[23] Standish Group (2004). CHAOS Report. West Yarmouth, Massachusetts: Standish
Group.
[24] U. Lipowezky, “Selection of the optimal prototype subset for 1-NN classification,” Pattern
Recognition Letters, vol. 19, pp. 907–918, 1998.
[25] Hyunchul Ahn, Kyoung-jae Kim, Ingoo Han, A case-based reasoning system with the two-
dimensional reduction technique for customer classification, Expert Systems with Applications, Volume
32, Issue 4, May 2007, Pages 1011-1019, ISSN 0957-4174, 10.1016/j.eswa.2006.02.021.
[26] Elke Achtert, Christian Böhm, Peer Kröger, Peter Kunath, Alexey Pryakhin, and Matthias Renz.
2006. Efficient reverse k-nearest neighbor search in arbitrary metric spaces. In Proceedings of the
2006 ACM SIGMOD international conference on Management of data (SIGMOD '06)
[27] E. Levina and P.J. Bickel. Maximum likelihood estimation of intrinsic dimension. In Advances in
Neural Information Processing Systems, volume 17, Cambridge, MA, USA, 2004. The MIT Press.