Mental health disorders affect many aspects of patient’s lives, including emotions, cognition, and especially behaviors. E-health technology helps to collect information wealth in a non-invasive manner, which represents a promising opportunity to construct health behavior markers. Combining such user behavior data can provide a more comprehensive and contextual view than questionnaire data. Due to behavioral data, we can train machine learning models to understand the data pattern and also use prediction algorithms to know the next state of a person’s behavior. The remaining challenges for this issue are how to apply mathematical formulations to textual datasets and find metadata that aids to identify the person’s life pattern and also predict the next state of his comportment. The main idea of this work is to use a hidden Markov model (HMM) to predict user behavior from social media applications by analyzing and detecting states and symbols from the user behavior dataset. To achieve this goal, we need to analyze and detect the states and symbols from the user behavior dataset, then convert the textual data to mathematical and numerical matrices. Finally, apply the HMM model to predict the hidden user behavior states. We tested our program and identified that the log-likelihood was higher and better when the model fits the data. In any case, the results of the study indicated that the program was suitable for the purpose and yielded valuable data.
Collusion-resistant multiparty data sharing in social networksIJECEIAES
The number of users on online social networks (OSNs) has grown tremendously over the past few years, with sites like Facebook amassing over a billion users. With the popularity of OSNs, the increase in privacy risk from the large volume of sensitive and private data is inevitable. While there are many features for access control for an individual user, most OSNs still need concrete mechanisms to preserve the privacy of data shared between multiple users. The proposed method uses metrics such as identity leakage (IL) and strength of interaction (SoI) to fine-tune the scenarios that use privacy risk and sharing loss to identify and resolve conflicts. In addition to conflict resolution, bot detection is also done to mitigate collusion attacks. The final decision to share the data item is then ascertained based on whether it passes the threshold condition for the above metrics.
Towards Decision Support and Goal AchievementIdentifying Ac.docxturveycharlyn
Towards Decision Support and Goal Achievement:
Identifying Action-Outcome Relationships From Social
Media
Emre Kıcıman
Microsoft Research
[email protected]
Matthew Richardson
Microsoft Research
[email protected]
ABSTRACT
Every day, people take actions, trying to achieve their per-
sonal, high-order goals. People decide what actions to take
based on their personal experience, knowledge and gut in-
stinct. While this leads to positive outcomes for some peo-
ple, many others do not have the necessary experience, knowl-
edge and instinct to make good decisions. What if, rather
than making decisions based solely on their own personal
experience, people could take advantage of the reported ex-
periences of hundreds of millions of other people?
In this paper, we investigate the feasibility of mining the
relationship between actions and their outcomes from the
aggregated timelines of individuals posting experiential mi-
croblog reports. Our contributions include an architecture
for extracting action-outcome relationships from social me-
dia data, techniques for identifying experiential social media
messages and converting them to event timelines, and an
analysis and evaluation of action-outcome extraction in case
studies.
1. INTRODUCTION
While current structured knowledge bases (e.g., Freebase)
contain a sizeable collection of information about entities,
from celebrities and locations to concepts and common ob-
jects, there is a class of knowledge that has minimal cov-
erage: actions. Simple information about common actions,
such as the effect of eating pasta before running a marathon,
or the consequences of adopting a puppy, are missing. While
some of this information may be found within the free text of
Wikipedia articles, the lack of a structured or semi-structured
representation make it largely unavailable for computational
usage. With computing devices continuing to become more
embedded in our everyday lives, and mediating an increasing
degree of our interactions with both the digital and physical
world, knowledge bases that can enable our computing de-
vices to represent and evaluate actions and their likely out-
comes can help individuals reason about actions and their
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than the
author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific permission
and/or a fee. Request permissions from [email protected]
KDD’15, August 10-13, 2015, Sydney, NSW, Australia.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-3664-2/15/08 ...$15.00.
DOI: http://dx.doi.org/10.1145 ...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...ijaia
The goal of this research project is to analyze the dynamics of social networks using machine learning techniques to locate maximal cliques and to find clusters for the purpose of identifying a target demographic. Unsupervised machine learning techniques are designed and implemented in this project to analyze a dataset from YouTube to discover communities in the social network and find central nodes. Different clustering algorithms are implemented and applied to the YouTube dataset. The well-known Bron-Kerbosch algorithm is used effectively in this research to find maximal cliques. The results obtained from this research could be used for advertising purposes and for building smart recommendation systems. All algorithms were implemented using Python programming language. The experimental results show that we were able to successfully find central nodes through clique-centrality and degree centrality. By utilizing clique detection algorithms, the research shown how machine learning algorithms can detect close knit groups within a larger network.
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...gerogepatton
The goal of this research project is to analyze the dynamics of social networks using machine learning techniques to locate maximal cliques and to find clusters for the purpose of identifying a target demographic. Unsupervised machine learning techniques are designed and implemented in this project to analyze a dataset from YouTube to discover communities in the social network and find central nodes. Different clustering algorithms are implemented and applied to the YouTube dataset. The well-known Bron-Kerbosch algorithm is used effectively in this research to find maximal cliques. The results obtained from this research could be used for advertising purposes and for building smart recommendation systems. All algorithms were implemented using Python programming language. The experimental results show that we were able to successfully find central nodes through clique-centrality and degree centrality. By utilizing clique detection algorithms, the research shown how machine learning algorithms can detect close knit groups within a larger network.
Novel Machine Learning Algorithms for Centrality and Cliques Detection in You...gerogepatton
The goal of this research project is to analyze the dynamics of social networks using machine learning techniques to locate maximal cliques and to find clusters for the purpose of dentifying a target
demographic. Unsupervised machine learning techniques are designed and implemented in this project to analyze a dataset from YouTube to discover communities in the social network and find central nodes. Different clustering algorithms are implemented and applied to the YouTube dataset. The well-known Bron-Kerbosch algorithm is used effectively in this research to find maximal cliques. The results obtained
from this research could be used for advertising purposes and for building smart recommendation systems.
All algorithms were implemented using Python programming language. The experimental results show that
we were able to successfully find central nodes through clique-centrality and degree centrality. By utilizing
clique detection algorithms, the research shown how machine learning algorithms can detect close knit
groups within a larger network.
Machine learning based recommender system for e-commerceIAESIJAI
Nowadays, e-commerce is becoming an essential part of business for many reasons, including the simplicity, availability, richness and diversity of products and services, flexibility of payment methods and the convenience of shopping remotely without losing time. These benefits have greatly optimized the lives of users, especially with the technological development of mobile devices and the availability of the Internet anytime and anywhere. Because of their direct impact on the revenue of e-commerce companies, recommender systems are considered a must in this field. Recommender systems detect items that match the customer's needs based on the customer's previous actions and make them appear in an interesting way. Such a customized experience helps to increase customer engagement and purchase rates as the suggested items are tailored to the customer's interests. Therefore, perfecting recommendation systems that allow for more personalized and accurate item recommendations is a major challenge in the e-marketing world. In our study, we succeeded in developing an algorithm to suggest personal recommendations to customers using association rules via the Frequent-Pattern-Growth algorithm. Our technique generated good results with a high average probability of purchasing the next product suggested by the recommendation system.
A Dynamic Intelligent Policies Analysis Mechanism for Personal Data Processin...Konstantinos Demertzis
The evolution of the Internet of Things is significantly a
ected by legal restrictions imposed for personal data handling, such as the European General Data Protection Regulation (GDPR).
The main purpose of this regulation is to provide people in the digital age greater control over their personal data, with their freely given, specific, informed and unambiguous consent to collect and process the data concerning them. ADVOCATE is an advanced framework that fully complies with the requirements of GDPR, which, with the extensive use of blockchain and artificial intelligence technologies, aims to provide an environment that will support users in maintaining control of their personal data in the IoT ecosystem. This paper proposes and presents the Intelligent Policies Analysis Mechanism (IPAM) of the ADVOCATE framework, which, in an intelligent and fully automated manner, can identify conflicting rules or consents of the user, which may lead to the collection of personal data that can be used for profiling. In order to clearly identify and implement IPAM, the problem of recording user data from smart entertainment devices using Fuzzy Cognitive Maps (FCMs) was simulated. FCMs are an intelligent decision-making system that simulates the processes of a complex system, modeling the correlation base, knowing the behavioral and balance specialists of the system. Respectively, identifying conflicting rules that can lead to a profile, training is done using Extreme Learning Machines (ELMs), which are highly ecient neural systems of small and flexible architecture that can work optimally in complex environments.
Collusion-resistant multiparty data sharing in social networksIJECEIAES
The number of users on online social networks (OSNs) has grown tremendously over the past few years, with sites like Facebook amassing over a billion users. With the popularity of OSNs, the increase in privacy risk from the large volume of sensitive and private data is inevitable. While there are many features for access control for an individual user, most OSNs still need concrete mechanisms to preserve the privacy of data shared between multiple users. The proposed method uses metrics such as identity leakage (IL) and strength of interaction (SoI) to fine-tune the scenarios that use privacy risk and sharing loss to identify and resolve conflicts. In addition to conflict resolution, bot detection is also done to mitigate collusion attacks. The final decision to share the data item is then ascertained based on whether it passes the threshold condition for the above metrics.
Towards Decision Support and Goal AchievementIdentifying Ac.docxturveycharlyn
Towards Decision Support and Goal Achievement:
Identifying Action-Outcome Relationships From Social
Media
Emre Kıcıman
Microsoft Research
[email protected]
Matthew Richardson
Microsoft Research
[email protected]
ABSTRACT
Every day, people take actions, trying to achieve their per-
sonal, high-order goals. People decide what actions to take
based on their personal experience, knowledge and gut in-
stinct. While this leads to positive outcomes for some peo-
ple, many others do not have the necessary experience, knowl-
edge and instinct to make good decisions. What if, rather
than making decisions based solely on their own personal
experience, people could take advantage of the reported ex-
periences of hundreds of millions of other people?
In this paper, we investigate the feasibility of mining the
relationship between actions and their outcomes from the
aggregated timelines of individuals posting experiential mi-
croblog reports. Our contributions include an architecture
for extracting action-outcome relationships from social me-
dia data, techniques for identifying experiential social media
messages and converting them to event timelines, and an
analysis and evaluation of action-outcome extraction in case
studies.
1. INTRODUCTION
While current structured knowledge bases (e.g., Freebase)
contain a sizeable collection of information about entities,
from celebrities and locations to concepts and common ob-
jects, there is a class of knowledge that has minimal cov-
erage: actions. Simple information about common actions,
such as the effect of eating pasta before running a marathon,
or the consequences of adopting a puppy, are missing. While
some of this information may be found within the free text of
Wikipedia articles, the lack of a structured or semi-structured
representation make it largely unavailable for computational
usage. With computing devices continuing to become more
embedded in our everyday lives, and mediating an increasing
degree of our interactions with both the digital and physical
world, knowledge bases that can enable our computing de-
vices to represent and evaluate actions and their likely out-
comes can help individuals reason about actions and their
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than the
author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific permission
and/or a fee. Request permissions from [email protected]
KDD’15, August 10-13, 2015, Sydney, NSW, Australia.
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-3664-2/15/08 ...$15.00.
DOI: http://dx.doi.org/10.1145 ...
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...ijaia
The goal of this research project is to analyze the dynamics of social networks using machine learning techniques to locate maximal cliques and to find clusters for the purpose of identifying a target demographic. Unsupervised machine learning techniques are designed and implemented in this project to analyze a dataset from YouTube to discover communities in the social network and find central nodes. Different clustering algorithms are implemented and applied to the YouTube dataset. The well-known Bron-Kerbosch algorithm is used effectively in this research to find maximal cliques. The results obtained from this research could be used for advertising purposes and for building smart recommendation systems. All algorithms were implemented using Python programming language. The experimental results show that we were able to successfully find central nodes through clique-centrality and degree centrality. By utilizing clique detection algorithms, the research shown how machine learning algorithms can detect close knit groups within a larger network.
NOVEL MACHINE LEARNING ALGORITHMS FOR CENTRALITY AND CLIQUES DETECTION IN YOU...gerogepatton
The goal of this research project is to analyze the dynamics of social networks using machine learning techniques to locate maximal cliques and to find clusters for the purpose of identifying a target demographic. Unsupervised machine learning techniques are designed and implemented in this project to analyze a dataset from YouTube to discover communities in the social network and find central nodes. Different clustering algorithms are implemented and applied to the YouTube dataset. The well-known Bron-Kerbosch algorithm is used effectively in this research to find maximal cliques. The results obtained from this research could be used for advertising purposes and for building smart recommendation systems. All algorithms were implemented using Python programming language. The experimental results show that we were able to successfully find central nodes through clique-centrality and degree centrality. By utilizing clique detection algorithms, the research shown how machine learning algorithms can detect close knit groups within a larger network.
Novel Machine Learning Algorithms for Centrality and Cliques Detection in You...gerogepatton
The goal of this research project is to analyze the dynamics of social networks using machine learning techniques to locate maximal cliques and to find clusters for the purpose of dentifying a target
demographic. Unsupervised machine learning techniques are designed and implemented in this project to analyze a dataset from YouTube to discover communities in the social network and find central nodes. Different clustering algorithms are implemented and applied to the YouTube dataset. The well-known Bron-Kerbosch algorithm is used effectively in this research to find maximal cliques. The results obtained
from this research could be used for advertising purposes and for building smart recommendation systems.
All algorithms were implemented using Python programming language. The experimental results show that
we were able to successfully find central nodes through clique-centrality and degree centrality. By utilizing
clique detection algorithms, the research shown how machine learning algorithms can detect close knit
groups within a larger network.
Machine learning based recommender system for e-commerceIAESIJAI
Nowadays, e-commerce is becoming an essential part of business for many reasons, including the simplicity, availability, richness and diversity of products and services, flexibility of payment methods and the convenience of shopping remotely without losing time. These benefits have greatly optimized the lives of users, especially with the technological development of mobile devices and the availability of the Internet anytime and anywhere. Because of their direct impact on the revenue of e-commerce companies, recommender systems are considered a must in this field. Recommender systems detect items that match the customer's needs based on the customer's previous actions and make them appear in an interesting way. Such a customized experience helps to increase customer engagement and purchase rates as the suggested items are tailored to the customer's interests. Therefore, perfecting recommendation systems that allow for more personalized and accurate item recommendations is a major challenge in the e-marketing world. In our study, we succeeded in developing an algorithm to suggest personal recommendations to customers using association rules via the Frequent-Pattern-Growth algorithm. Our technique generated good results with a high average probability of purchasing the next product suggested by the recommendation system.
A Dynamic Intelligent Policies Analysis Mechanism for Personal Data Processin...Konstantinos Demertzis
The evolution of the Internet of Things is significantly a
ected by legal restrictions imposed for personal data handling, such as the European General Data Protection Regulation (GDPR).
The main purpose of this regulation is to provide people in the digital age greater control over their personal data, with their freely given, specific, informed and unambiguous consent to collect and process the data concerning them. ADVOCATE is an advanced framework that fully complies with the requirements of GDPR, which, with the extensive use of blockchain and artificial intelligence technologies, aims to provide an environment that will support users in maintaining control of their personal data in the IoT ecosystem. This paper proposes and presents the Intelligent Policies Analysis Mechanism (IPAM) of the ADVOCATE framework, which, in an intelligent and fully automated manner, can identify conflicting rules or consents of the user, which may lead to the collection of personal data that can be used for profiling. In order to clearly identify and implement IPAM, the problem of recording user data from smart entertainment devices using Fuzzy Cognitive Maps (FCMs) was simulated. FCMs are an intelligent decision-making system that simulates the processes of a complex system, modeling the correlation base, knowing the behavioral and balance specialists of the system. Respectively, identifying conflicting rules that can lead to a profile, training is done using Extreme Learning Machines (ELMs), which are highly ecient neural systems of small and flexible architecture that can work optimally in complex environments.
Appling tracking game system to measure user behavior toward cybersecurity p...IJECEIAES
Institutions wrestle to protect their information from threats and cybercrime. Therefore, it is dedicating a great deal of their concern to improving the information security infrastructure. Users’ behaviors were explored by applying traditional questionnaire as a research instrument in data collocate process. But researchers usually suffer from a lack of respondents' credibility when asking someone to fill out a questionnaire, and the credibility may decline further if the research topic relates to aspects of the use and implementation of information security policies. Therefore, there is insufficient reliability of the respondent's answers to the questionnaire’s questions, and the responses might not reflect the actual behavior based on the human bias when facing the problems theoretically. The current study creates a new idea to track and study the behavior of the respondents by building a tracking game system aligned with the questionnaire whose results are required to be known. The system will allow the respondent to answer the survey questions related to the compliance with the information security policies by tracking their behavior while using the system.
Most of the network habitats retain on facing an ever increasing number of security threats. In early times,
firewalls are used as a security examines point in the network environment. Recently the use of Intrusion
Detection System (IDS) has greatly increased due to its more constructive and robust working than
firewall. An IDS refers to the process of constantly observing the incoming and outgoing traffic of a
network in order to diagnose suspicious behavior. In real scenario most of the environments are dynamic
in nature, which leads to the problem of concept drift, is perturbed with learning from data whose
statistical attribute change over time. Concept drift is impenetrable if the dataset is class-imbalanced. In
this review paper, study of IDS along with different approaches of incremental learning is carried out.
From this study, by applying voting rule to incremental learning a new approach is proposed. Further, the
comparison between existing Fuzzy rule method and proposed approach is done.
Singapore Management UniversityInstitutional Knowledge at Si.docxjennifer822
Singapore Management University
Institutional Knowledge at Singapore Management University
Research Collection Lee Kong Chian School Of
Business
Lee Kong Chian School of Business
10-2016
Big data and data science methods for management
research: From the Editors
Gerard GEORGE
Singapore Management University, [email protected]
Ernst C. OSINGA
Singapore Management University, [email protected]
Dovev LAVIE
Technion
Brent A. SCOTT
Michigan State University
DOI: https://doi.org/10.5465/amj.2016.4005
Follow this and additional works at: https://ink.library.smu.edu.sg/lkcsb_research
Part of the Management Sciences and Quantitative Methods Commons, and the Strategic
Management Policy Commons
This Editorial is brought to you for free and open access by the Lee Kong Chian School of Business at Institutional Knowledge at Singapore
Management University. It has been accepted for inclusion in Research Collection Lee Kong Chian School Of Business by an authorized administrator
of Institutional Knowledge at Singapore Management University. For more information, please email [email protected]
Citation
GEORGE, Gerard; Ernst C. OSINGA; LAVIE, Dovev; and SCOTT, Brent A.. Big data and data science methods for management
research: From the Editors. (2016). Academy of Management Journal. 59, (5), 1493-1507. Research Collection Lee Kong Chian School
Of Business.
Available at: https://ink.library.smu.edu.sg/lkcsb_research/4964
https://ink.library.smu.edu.sg?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://doi.org/10.5465/amj.2016.4005
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/637?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
mailto:[email protected]
1
FROM THE EDITORS
BIG DATA AND DATA SCIENCE METHODS FOR MANAGEMENT RESEARCH
Published in Academy of Management Journal, October 2016, 59 (5), pp. 1493-1507.
http://doi.org/10.5465/amj.2016.4005
The recent advent of remote sensing, mobile technologies, novel transaction systems, and
high performance computing offers opportunities to un.
Singapore Management UniversityInstitutional Knowledge at Si.docxedgar6wallace88877
Singapore Management University
Institutional Knowledge at Singapore Management University
Research Collection Lee Kong Chian School Of
Business
Lee Kong Chian School of Business
10-2016
Big data and data science methods for management
research: From the Editors
Gerard GEORGE
Singapore Management University, [email protected]
Ernst C. OSINGA
Singapore Management University, [email protected]
Dovev LAVIE
Technion
Brent A. SCOTT
Michigan State University
DOI: https://doi.org/10.5465/amj.2016.4005
Follow this and additional works at: https://ink.library.smu.edu.sg/lkcsb_research
Part of the Management Sciences and Quantitative Methods Commons, and the Strategic
Management Policy Commons
This Editorial is brought to you for free and open access by the Lee Kong Chian School of Business at Institutional Knowledge at Singapore
Management University. It has been accepted for inclusion in Research Collection Lee Kong Chian School Of Business by an authorized administrator
of Institutional Knowledge at Singapore Management University. For more information, please email [email protected]
Citation
GEORGE, Gerard; Ernst C. OSINGA; LAVIE, Dovev; and SCOTT, Brent A.. Big data and data science methods for management
research: From the Editors. (2016). Academy of Management Journal. 59, (5), 1493-1507. Research Collection Lee Kong Chian School
Of Business.
Available at: https://ink.library.smu.edu.sg/lkcsb_research/4964
https://ink.library.smu.edu.sg?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://doi.org/10.5465/amj.2016.4005
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/637?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
mailto:[email protected]
1
FROM THE EDITORS
BIG DATA AND DATA SCIENCE METHODS FOR MANAGEMENT RESEARCH
Published in Academy of Management Journal, October 2016, 59 (5), pp. 1493-1507.
http://doi.org/10.5465/amj.2016.4005
The recent advent of remote sensing, mobile technologies, novel transaction systems, and
high performance computing offers opportunities to un.
Social Media Datasets for Analysis and Modeling Drug Usageijtsrd
This paper based on the research carried out in the area of data mining depends for managing bulk amount of data with mining in social media on using composite applications for performing more sophisticated analysis. Enhancement of social media may address this need. The objective of this paper is to introduce such type of tool which used in social network to characterised Medicine Usage. This paper outlined a structured approach to analyse social media in order to capture emerging trends in medicine abuse by applying powerful methods like Machine Learning. This paper describes how to fetch important data for analysis from social network. Then big data techniques to extract useful content for analysis are discussed. Sindhu S. B | Dr. B. N Veerappa "Social Media Datasets for Analysis and Modeling Drug Usage" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd25246.pdfPaper URL: https://www.ijtsrd.com/engineering/computer-engineering/25246/social-media-datasets-for-analysis-and-modeling-drug-usage/sindhu-s-b
Assessment of the main features of the model of dissemination of information ...IJECEIAES
Social networks provide a fairly wide range of data that allows one way or another to evaluate the effect of the dissemination of information. This article presents the results of a study that describes methods for determining the key parameters of the model needed to analyze and predict the dissemination of information in social networks. An approach based on the analysis of statistical data on user behavior in social networks is proposed. The process of evaluating the main features of the model is described, including the mathematical methods used for data analysis and information dissemination modeling. The study aims to understand the processes of information dissemination in social networks and develop recommendations for the effective use of social networks as a communication and brand promotion tool, as well as to consider the analytical properties of the classical susceptible-infected-removed (SIR) model and evaluate its applicability to the problem of information dissemination. The results of the study can be used to create algorithms and techniques that will effectively manage the process of information dissemination in social networks.
Depression and anxiety detection through the Closed-Loop method using DASS-21TELKOMNIKA JOURNAL
The change of information and communication technology has brought many changes in daily
life. The way humans interacting is changing. It is possible to express each form of communication directly
and instantly. Social media has contributed data in size, diversity and capacity and quality. Based on it,
the idea was to see and measure the tendency of depression and anxiety through social media using
the Closed-Loop method using Facebook text mining posts. Through the stages of pre-processing
including text extraction using the Naïve Bayes machine learning model for text classification, the early
signs of depression and anxiety are measured using DASS-21 parameter. In total, 22,934 Facebook posts
were contributed as training and learning data collected from July 2017 until July 2018. As a results,
analysis and mapping of social demographics of users that are usually as a trigger of depression, and
anxiety, such as grief, illness, household affairs, children education and others are available.
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...MITAILibrary
Vishal Coodye is an MIT fellow who has contributed to the Robotics & AI Technology since 2010. His contributions to the scientific comminity brings the world to new horizons. MIT. Library. USA.
Intelligent analysis of the effect of internetIJCI JOURNAL
This paper analyzes the effect of Information Technology on professionals, academicians and students in
perspective of their relations, education, job purpose, health, entertainment and electronic business that
can bring changes in society. Technology can have both positive and negative consequences for people of
different walks of life at different times. The need is to understand the true impact of internet on the society
so that people can start thinking and build a healthy society. In this paper, an empirical study is considered
60 persons; a causal loop model is formed relating the parameters on the basis of data collected. These
parameters are used to form the fuzzy dynamic model to analyze the effect of the internet on the society.
The model is analyzed and suitable solutions are proposed to counter the negative effect of internet on our
society.
APPLICATIONS OF DEEP LEARNING AND MACHINE LEARNING IN HEALTHCARE DOMAIN – A L...IAEME Publication
Artificial intelligence (AI) has been developing rapidly in recent years in terms of software algorithms, hardware implementation, and applications in a vast number of areas. In this review, we summarize the latest developments of applications of AI in biomedicine, including disease diagnostics, living assistance, biomedical information processing, and biomedical research. Various automated systems and tools like Braincomputer interfaces (BCIs), arterial spin labelling (ASL) imaging, ASL-MRI, biomarkers, Natural language processing (NLP) and various algorithms helps to minimize errors and control disease progression. The computer assisted diagnosis, decision support systems, expert systems and implementation of software may assist physicians to minimize the intra and inter-observer variability. In this paper, a detailed literature review on application and implementation of Machine Learning, Deep Learning and Artificial Intelligence in the healthcare industry by various researchers.
Online Data Preprocessing: A Case Study ApproachIJECEIAES
Besides the Internet search facility and e-mails, social networking is now one of the three best uses of the Internet. A tremendous number of volunteers every day write articles, share photos, videos and links at a scope and scale never imagined before. However, because social network data are huge and come from heterogeneous sources, the data are highly susceptible to inconsistency, redundancy, noise, and loss. For data scientists, preparing the data and getting it into a standard format is critical because the quality of data is going to directly affect the performance of mining algorithms that are going to be applied next. Low-quality data will certainly limit the analysis and lower the quality of mining results. To this end, the goal of this study is to provide an overview of the different phases involved in data preprocessing, with a focus on social network data. As a case study, we will show how we applied preprocessing to the data that we collected for the Malaysian Flight MH370 that disappeared in 2014.
Framework for A Personalized Intelligent Assistant to Elderly People for Acti...CSCJournals
The increasing population of elderly people is associated with the need to meet their increasing requirements and to provide solutions that can improve their quality of life in a smart home. In addition to fear and anxiety towards interfacing with systems; cognitive disabilities, weakened memory, disorganized behavior and even physical limitations are some of the problems that elderly people tend to face with increasing age. The essence of providing technology-based solutions to address these needs of elderly people and to create smart and assisted living spaces for the elderly; lies in developing systems that can adapt by addressing their diversity and can augment their performances in the context of their day to day goals. Therefore, this work proposes a framework for development of a Personalized Intelligent Assistant to help elderly people perform Activities of Daily Living (ADLs) in a smart and connected Internet of Things (IoT) based environment. This Personalized Intelligent Assistant can analyze different tasks performed by the user and recommend activities by considering their daily routine, current affective state and the underlining user experience. To uphold the efficacy of this proposed framework, it has been tested on a couple of datasets for modelling an "average user" and a "specific user" respectively. The results presented show that the model achieves a performance accuracy of 73.12% when modelling a "specific user", which is considerably higher than its performance while modelling an "average user", this upholds the relevance for development and implementation of this proposed framework.
Sentimental classification analysis of polarity multi-view textual data using...IJECEIAES
The data and information available in most community environments is complex in nature. Sentimental data resources may possibly consist of textual data collected from multiple information sources with different representations and usually handled by different analytical models. These types of data resource characteristics can form multi-view polarity textual data. However, knowledge creation from this type of sentimental textual data requires considerable analytical efforts and capabilities. In particular, data mining practices can provide exceptional results in handling textual data formats. Besides, in the case of the textual data exists as multi-view or unstructured data formats, the hybrid and integrated analysis efforts of text data mining algorithms are vital to get helpful results. The objective of this research is to enhance the knowledge discovery from sentimental multi-view textual data which can be considered as unstructured data format to classify the polarity information documents in the form of two different categories or types of useful information. A proposed framework with integrated data mining algorithms has been discussed in this paper, which is achieved through the application of X-means algorithm for clustering and HotSpot algorithm of association rules. The analysis results have shown improved accuracies of classifying the sentimental multi-view textual data into two categories through the application of the proposed framework on online polarity user-reviews dataset upon a given topics.
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change
that is evident in unusual flooding and draughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing the climate change. The analysis's
findings discussed the relevant to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency and recommendations on how
to increase role of the women in addressing the climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from gridconnected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
More Related Content
Similar to Predicting user behavior using data profiling and hidden Markov model
Appling tracking game system to measure user behavior toward cybersecurity p...IJECEIAES
Institutions wrestle to protect their information from threats and cybercrime. Therefore, it is dedicating a great deal of their concern to improving the information security infrastructure. Users’ behaviors were explored by applying traditional questionnaire as a research instrument in data collocate process. But researchers usually suffer from a lack of respondents' credibility when asking someone to fill out a questionnaire, and the credibility may decline further if the research topic relates to aspects of the use and implementation of information security policies. Therefore, there is insufficient reliability of the respondent's answers to the questionnaire’s questions, and the responses might not reflect the actual behavior based on the human bias when facing the problems theoretically. The current study creates a new idea to track and study the behavior of the respondents by building a tracking game system aligned with the questionnaire whose results are required to be known. The system will allow the respondent to answer the survey questions related to the compliance with the information security policies by tracking their behavior while using the system.
Most of the network habitats retain on facing an ever increasing number of security threats. In early times,
firewalls are used as a security examines point in the network environment. Recently the use of Intrusion
Detection System (IDS) has greatly increased due to its more constructive and robust working than
firewall. An IDS refers to the process of constantly observing the incoming and outgoing traffic of a
network in order to diagnose suspicious behavior. In real scenario most of the environments are dynamic
in nature, which leads to the problem of concept drift, is perturbed with learning from data whose
statistical attribute change over time. Concept drift is impenetrable if the dataset is class-imbalanced. In
this review paper, study of IDS along with different approaches of incremental learning is carried out.
From this study, by applying voting rule to incremental learning a new approach is proposed. Further, the
comparison between existing Fuzzy rule method and proposed approach is done.
Singapore Management UniversityInstitutional Knowledge at Si.docxjennifer822
Singapore Management University
Institutional Knowledge at Singapore Management University
Research Collection Lee Kong Chian School Of
Business
Lee Kong Chian School of Business
10-2016
Big data and data science methods for management
research: From the Editors
Gerard GEORGE
Singapore Management University, [email protected]
Ernst C. OSINGA
Singapore Management University, [email protected]
Dovev LAVIE
Technion
Brent A. SCOTT
Michigan State University
DOI: https://doi.org/10.5465/amj.2016.4005
Follow this and additional works at: https://ink.library.smu.edu.sg/lkcsb_research
Part of the Management Sciences and Quantitative Methods Commons, and the Strategic
Management Policy Commons
This Editorial is brought to you for free and open access by the Lee Kong Chian School of Business at Institutional Knowledge at Singapore
Management University. It has been accepted for inclusion in Research Collection Lee Kong Chian School Of Business by an authorized administrator
of Institutional Knowledge at Singapore Management University. For more information, please email [email protected]
Citation
GEORGE, Gerard; Ernst C. OSINGA; LAVIE, Dovev; and SCOTT, Brent A.. Big data and data science methods for management
research: From the Editors. (2016). Academy of Management Journal. 59, (5), 1493-1507. Research Collection Lee Kong Chian School
Of Business.
Available at: https://ink.library.smu.edu.sg/lkcsb_research/4964
https://ink.library.smu.edu.sg?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://doi.org/10.5465/amj.2016.4005
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/637?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
mailto:[email protected]
1
FROM THE EDITORS
BIG DATA AND DATA SCIENCE METHODS FOR MANAGEMENT RESEARCH
Published in Academy of Management Journal, October 2016, 59 (5), pp. 1493-1507.
http://doi.org/10.5465/amj.2016.4005
The recent advent of remote sensing, mobile technologies, novel transaction systems, and
high performance computing offers opportunities to un.
Singapore Management UniversityInstitutional Knowledge at Si.docxedgar6wallace88877
Singapore Management University
Institutional Knowledge at Singapore Management University
Research Collection Lee Kong Chian School Of
Business
Lee Kong Chian School of Business
10-2016
Big data and data science methods for management
research: From the Editors
Gerard GEORGE
Singapore Management University, [email protected]
Ernst C. OSINGA
Singapore Management University, [email protected]
Dovev LAVIE
Technion
Brent A. SCOTT
Michigan State University
DOI: https://doi.org/10.5465/amj.2016.4005
Follow this and additional works at: https://ink.library.smu.edu.sg/lkcsb_research
Part of the Management Sciences and Quantitative Methods Commons, and the Strategic
Management Policy Commons
This Editorial is brought to you for free and open access by the Lee Kong Chian School of Business at Institutional Knowledge at Singapore
Management University. It has been accepted for inclusion in Research Collection Lee Kong Chian School Of Business by an authorized administrator
of Institutional Knowledge at Singapore Management University. For more information, please email [email protected]
Citation
GEORGE, Gerard; Ernst C. OSINGA; LAVIE, Dovev; and SCOTT, Brent A.. Big data and data science methods for management
research: From the Editors. (2016). Academy of Management Journal. 59, (5), 1493-1507. Research Collection Lee Kong Chian School
Of Business.
Available at: https://ink.library.smu.edu.sg/lkcsb_research/4964
https://ink.library.smu.edu.sg?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://ink.library.smu.edu.sg/lkcsb?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
https://doi.org/10.5465/amj.2016.4005
https://ink.library.smu.edu.sg/lkcsb_research?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/637?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
http://network.bepress.com/hgg/discipline/642?utm_source=ink.library.smu.edu.sg%2Flkcsb_research%2F4964&utm_medium=PDF&utm_campaign=PDFCoverPages
mailto:[email protected]
1
FROM THE EDITORS
BIG DATA AND DATA SCIENCE METHODS FOR MANAGEMENT RESEARCH
Published in Academy of Management Journal, October 2016, 59 (5), pp. 1493-1507.
http://doi.org/10.5465/amj.2016.4005
The recent advent of remote sensing, mobile technologies, novel transaction systems, and
high performance computing offers opportunities to un.
Social Media Datasets for Analysis and Modeling Drug Usageijtsrd
This paper based on the research carried out in the area of data mining depends for managing bulk amount of data with mining in social media on using composite applications for performing more sophisticated analysis. Enhancement of social media may address this need. The objective of this paper is to introduce such type of tool which used in social network to characterised Medicine Usage. This paper outlined a structured approach to analyse social media in order to capture emerging trends in medicine abuse by applying powerful methods like Machine Learning. This paper describes how to fetch important data for analysis from social network. Then big data techniques to extract useful content for analysis are discussed. Sindhu S. B | Dr. B. N Veerappa "Social Media Datasets for Analysis and Modeling Drug Usage" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd25246.pdfPaper URL: https://www.ijtsrd.com/engineering/computer-engineering/25246/social-media-datasets-for-analysis-and-modeling-drug-usage/sindhu-s-b
Assessment of the main features of the model of dissemination of information ...IJECEIAES
Social networks provide a fairly wide range of data that allows one way or another to evaluate the effect of the dissemination of information. This article presents the results of a study that describes methods for determining the key parameters of the model needed to analyze and predict the dissemination of information in social networks. An approach based on the analysis of statistical data on user behavior in social networks is proposed. The process of evaluating the main features of the model is described, including the mathematical methods used for data analysis and information dissemination modeling. The study aims to understand the processes of information dissemination in social networks and develop recommendations for the effective use of social networks as a communication and brand promotion tool, as well as to consider the analytical properties of the classical susceptible-infected-removed (SIR) model and evaluate its applicability to the problem of information dissemination. The results of the study can be used to create algorithms and techniques that will effectively manage the process of information dissemination in social networks.
Depression and anxiety detection through the Closed-Loop method using DASS-21TELKOMNIKA JOURNAL
The change of information and communication technology has brought many changes in daily
life. The way humans interacting is changing. It is possible to express each form of communication directly
and instantly. Social media has contributed data in size, diversity and capacity and quality. Based on it,
the idea was to see and measure the tendency of depression and anxiety through social media using
the Closed-Loop method using Facebook text mining posts. Through the stages of pre-processing
including text extraction using the Naïve Bayes machine learning model for text classification, the early
signs of depression and anxiety are measured using DASS-21 parameter. In total, 22,934 Facebook posts
were contributed as training and learning data collected from July 2017 until July 2018. As a results,
analysis and mapping of social demographics of users that are usually as a trigger of depression, and
anxiety, such as grief, illness, household affairs, children education and others are available.
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...MITAILibrary
Vishal Coodye is an MIT fellow who has contributed to the Robotics & AI Technology since 2010. His contributions to the scientific comminity brings the world to new horizons. MIT. Library. USA.
Intelligent analysis of the effect of internetIJCI JOURNAL
This paper analyzes the effect of Information Technology on professionals, academicians and students in
perspective of their relations, education, job purpose, health, entertainment and electronic business that
can bring changes in society. Technology can have both positive and negative consequences for people of
different walks of life at different times. The need is to understand the true impact of internet on the society
so that people can start thinking and build a healthy society. In this paper, an empirical study is considered
60 persons; a causal loop model is formed relating the parameters on the basis of data collected. These
parameters are used to form the fuzzy dynamic model to analyze the effect of the internet on the society.
The model is analyzed and suitable solutions are proposed to counter the negative effect of internet on our
society.
APPLICATIONS OF DEEP LEARNING AND MACHINE LEARNING IN HEALTHCARE DOMAIN – A L...IAEME Publication
Artificial intelligence (AI) has been developing rapidly in recent years in terms of software algorithms, hardware implementation, and applications in a vast number of areas. In this review, we summarize the latest developments of applications of AI in biomedicine, including disease diagnostics, living assistance, biomedical information processing, and biomedical research. Various automated systems and tools like Braincomputer interfaces (BCIs), arterial spin labelling (ASL) imaging, ASL-MRI, biomarkers, Natural language processing (NLP) and various algorithms helps to minimize errors and control disease progression. The computer assisted diagnosis, decision support systems, expert systems and implementation of software may assist physicians to minimize the intra and inter-observer variability. In this paper, a detailed literature review on application and implementation of Machine Learning, Deep Learning and Artificial Intelligence in the healthcare industry by various researchers.
Online Data Preprocessing: A Case Study ApproachIJECEIAES
Besides the Internet search facility and e-mails, social networking is now one of the three best uses of the Internet. A tremendous number of volunteers every day write articles, share photos, videos and links at a scope and scale never imagined before. However, because social network data are huge and come from heterogeneous sources, the data are highly susceptible to inconsistency, redundancy, noise, and loss. For data scientists, preparing the data and getting it into a standard format is critical because the quality of data is going to directly affect the performance of mining algorithms that are going to be applied next. Low-quality data will certainly limit the analysis and lower the quality of mining results. To this end, the goal of this study is to provide an overview of the different phases involved in data preprocessing, with a focus on social network data. As a case study, we will show how we applied preprocessing to the data that we collected for the Malaysian Flight MH370 that disappeared in 2014.
Framework for A Personalized Intelligent Assistant to Elderly People for Acti...CSCJournals
The increasing population of elderly people is associated with the need to meet their increasing requirements and to provide solutions that can improve their quality of life in a smart home. In addition to fear and anxiety towards interfacing with systems; cognitive disabilities, weakened memory, disorganized behavior and even physical limitations are some of the problems that elderly people tend to face with increasing age. The essence of providing technology-based solutions to address these needs of elderly people and to create smart and assisted living spaces for the elderly; lies in developing systems that can adapt by addressing their diversity and can augment their performances in the context of their day to day goals. Therefore, this work proposes a framework for development of a Personalized Intelligent Assistant to help elderly people perform Activities of Daily Living (ADLs) in a smart and connected Internet of Things (IoT) based environment. This Personalized Intelligent Assistant can analyze different tasks performed by the user and recommend activities by considering their daily routine, current affective state and the underlining user experience. To uphold the efficacy of this proposed framework, it has been tested on a couple of datasets for modelling an "average user" and a "specific user" respectively. The results presented show that the model achieves a performance accuracy of 73.12% when modelling a "specific user", which is considerably higher than its performance while modelling an "average user", this upholds the relevance for development and implementation of this proposed framework.
Sentimental classification analysis of polarity multi-view textual data using...IJECEIAES
The data and information available in most community environments is complex in nature. Sentimental data resources may possibly consist of textual data collected from multiple information sources with different representations and usually handled by different analytical models. These types of data resource characteristics can form multi-view polarity textual data. However, knowledge creation from this type of sentimental textual data requires considerable analytical efforts and capabilities. In particular, data mining practices can provide exceptional results in handling textual data formats. Besides, in the case of the textual data exists as multi-view or unstructured data formats, the hybrid and integrated analysis efforts of text data mining algorithms are vital to get helpful results. The objective of this research is to enhance the knowledge discovery from sentimental multi-view textual data which can be considered as unstructured data format to classify the polarity information documents in the form of two different categories or types of useful information. A proposed framework with integrated data mining algorithms has been discussed in this paper, which is achieved through the application of X-means algorithm for clustering and HotSpot algorithm of association rules. The analysis results have shown improved accuracies of classifying the sentimental multi-view textual data into two categories through the application of the proposed framework on online polarity user-reviews dataset upon a given topics.
Similar to Predicting user behavior using data profiling and hidden Markov model (20)
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change
that is evident in unusual flooding and draughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing the climate change. The analysis's
findings discussed the relevant to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency and recommendations on how
to increase role of the women in addressing the climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from gridconnected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems that are associated to power systems is islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis shows that the overall
weight of passive methods (24.7%), active methods (7.8%), hybrid methods
(5.6%), remote methods (14.5%), signal processing-based methods (26.6%),
and computational intelligent-based methods (20.8%) based on the
comparison of all criteria together. Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m2
, the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronize
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed,the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students’ learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embeddedsystem design approach targeting FPGA technologies that are fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that allows users to seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapidly and remotely monitoring and receiving the solar cell systems status parameters, solar irradiance, temperature, and humidity, are critical issues in enhancement their efficiency. Hence, in the present article an improved smart prototype of internet of things (IoT) technique based on embedded system through NodeMCU ESP8266 (ESP-12E) was carried out experimentally. Three different regions at Egypt; Luxor, Cairo, and El-Beheira cities were chosen to study their solar irradiance profile, temperature, and humidity by the proposed IoT system. The monitoring data of solar irradiance, temperature, and humidity were live visualized directly by Ubidots through hypertext transfer protocol (HTTP) protocol. The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m 2 respectively during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system made it a strong candidate for application in monitoring solar cell systems. On the other hand, the obtained solar power radiation results of the three considered regions strongly candidate Luxor and Cairo as suitable places to build up a solar cells system station rather than El-Beheira.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threat to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to miss-classification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data to be trained. Consequently, such complex datasets are impossible to source in a large network like IoT. To address this problem, this proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with the related works, the experimental outcome shows that the model performs well in a benchmark dataset. It accomplishes an improved detection accuracy of approximately 99.21%.
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
This research is developing an incubator system that integrates the internet of things and artificial intelligence to improve care for premature babies. The system workflow starts with sensors that collect data from the incubator. Then, the data is sent in real-time to the internet of things (IoT) broker eclipse mosquito using the message queue telemetry transport (MQTT) protocol version 5.0. After that, the data is stored in a database for analysis using the long short-term memory network (LSTM) method and displayed in a web application using an application programming interface (API) service. Furthermore, the experimental results produce as many as 2,880 rows of data stored in the database. The correlation coefficient between the target attribute and other attributes ranges from 0.23 to 0.48. Next, several experiments were conducted to evaluate the model-predicted value on the test data. The best results are obtained using a two-layer LSTM configuration model, each with 60 neurons and a lookback setting 6. This model produces an R 2 value of 0.934, with a root mean square error (RMSE) value of 0.015 and a mean absolute error (MAE) of 0.008. In addition, the R 2 value was also evaluated for each attribute used as input, with a result of values between 0.590 and 0.845.
A review on internet of things-based stingless bee's honey production with im...IJECEIAES
Honey is produced exclusively by honeybees and stingless bees which both are well adapted to tropical and subtropical regions such as Malaysia. Stingless bees are known for producing small amounts of honey and are known for having a unique flavor profile. Problem identified that many stingless bees collapsed due to weather, temperature and environment. It is critical to understand the relationship between the production of stingless bee honey and environmental conditions to improve honey production. Thus, this paper presents a review on stingless bee's honey production and prediction modeling. About 54 previous research has been analyzed and compared in identifying the research gaps. A framework on modeling the prediction of stingless bee honey is derived. The result presents the comparison and analysis on the internet of things (IoT) monitoring systems, honey production estimation, convolution neural networks (CNNs), and automatic identification methods on bee species. It is identified based on image detection method the top best three efficiency presents CNN is at 98.67%, densely connected convolutional networks with YOLO v3 is 97.7%, and DenseNet201 convolutional networks 99.81%. This study is significant to assist the researcher in developing a model for predicting stingless honey produced by bee's output, which is important for a stable economy and food security.
A trust based secure access control using authentication mechanism for intero...IJECEIAES
The internet of things (IoT) is a revolutionary innovation in many aspects of our society including interactions, financial activity, and global security such as the military and battlefield internet. Due to the limited energy and processing capacity of network devices, security, energy consumption, compatibility, and device heterogeneity are the long-term IoT problems. As a result, energy and security are critical for data transmission across edge and IoT networks. Existing IoT interoperability techniques need more computation time, have unreliable authentication mechanisms that break easily, lose data easily, and have low confidentiality. In this paper, a key agreement protocol-based authentication mechanism for IoT devices is offered as a solution to this issue. This system makes use of information exchange, which must be secured to prevent access by unauthorized users. Using a compact contiki/cooja simulator, the performance and design of the suggested framework are validated. The simulation findings are evaluated based on detection of malicious nodes after 60 minutes of simulation. The suggested trust method, which is based on privacy access control, reduced packet loss ratio to 0.32%, consumed 0.39% power, and had the greatest average residual energy of 0.99 mJoules at 10 nodes.
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersIJECEIAES
In real world applications, data are subject to ambiguity due to several factors; fuzzy sets and fuzzy numbers propose a great tool to model such ambiguity. In case of hesitation, the complement of a membership value in fuzzy numbers can be different from the non-membership value, in which case we can model using intuitionistic fuzzy numbers as they provide flexibility by defining both a membership and a non-membership functions. In this article, we consider the intuitionistic fuzzy linear programming problem with intuitionistic polygonal fuzzy numbers, which is a generalization of the previous polygonal fuzzy numbers found in the literature. We present a modification of the simplex method that can be used to solve any general intuitionistic fuzzy linear programming problem after approximating the problem by an intuitionistic polygonal fuzzy number with n edges. This method is given in a simple tableau formulation, and then applied on numerical examples for clarity.
The performance of artificial intelligence in prostate magnetic resonance im...IJECEIAES
Prostate cancer is the predominant form of cancer observed in men worldwide. The application of magnetic resonance imaging (MRI) as a guidance tool for conducting biopsies has been established as a reliable and well-established approach in the diagnosis of prostate cancer. The diagnostic performance of MRI-guided prostate cancer diagnosis exhibits significant heterogeneity due to the intricate and multi-step nature of the diagnostic pathway. The development of artificial intelligence (AI) models, specifically through the utilization of machine learning techniques such as deep learning, is assuming an increasingly significant role in the field of radiology. In the realm of prostate MRI, a considerable body of literature has been dedicated to the development of various AI algorithms. These algorithms have been specifically designed for tasks such as prostate segmentation, lesion identification, and classification. The overarching objective of these endeavors is to enhance diagnostic performance and foster greater agreement among different observers within MRI scans for the prostate. This review article aims to provide a concise overview of the application of AI in the field of radiology, with a specific focus on its utilization in prostate MRI.
Seizure stage detection of epileptic seizure using convolutional neural networksIJECEIAES
According to the World Health Organization (WHO), seventy million individuals worldwide suffer from epilepsy, a neurological disorder. While electroencephalography (EEG) is crucial for diagnosing epilepsy and monitoring the brain activity of epilepsy patients, it requires a specialist to examine all EEG recordings to find epileptic behavior. This procedure needs an experienced doctor, and a precise epilepsy diagnosis is crucial for appropriate treatment. To identify epileptic seizures, this study employed a convolutional neural network (CNN) based on raw scalp EEG signals to discriminate between preictal, ictal, postictal, and interictal segments. The possibility of these characteristics is explored by examining how well timedomain signals work in the detection of epileptic signals using intracranial Freiburg Hospital (FH), scalp Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT) databases, and Temple University Hospital (TUH) EEG. To test the viability of this approach, two types of experiments were carried out. Firstly, binary class classification (preictal, ictal, postictal each versus interictal) and four-class classification (interictal versus preictal versus ictal versus postictal). The average accuracy for stage detection using CHB-MIT database was 84.4%, while the Freiburg database's time-domain signals had an accuracy of 79.7% and the highest accuracy of 94.02% for classification in the TUH EEG database when comparing interictal stage to preictal stage.
Analysis of driving style using self-organizing maps to analyze driver behaviorIJECEIAES
Modern life is strongly associated with the use of cars, but the increase in acceleration speeds and their maneuverability leads to a dangerous driving style for some drivers. In these conditions, the development of a method that allows you to track the behavior of the driver is relevant. The article provides an overview of existing methods and models for assessing the functioning of motor vehicles and driver behavior. Based on this, a combined algorithm for recognizing driving style is proposed. To do this, a set of input data was formed, including 20 descriptive features: About the environment, the driver's behavior and the characteristics of the functioning of the car, collected using OBD II. The generated data set is sent to the Kohonen network, where clustering is performed according to driving style and degree of danger. Getting the driving characteristics into a particular cluster allows you to switch to the private indicators of an individual driver and considering individual driving characteristics. The application of the method allows you to identify potentially dangerous driving styles that can prevent accidents.
Hyperspectral object classification using hybrid spectral-spatial fusion and ...IJECEIAES
Because of its spectral-spatial and temporal resolution of greater areas, hyperspectral imaging (HSI) has found widespread application in the field of object classification. The HSI is typically used to accurately determine an object's physical characteristics as well as to locate related objects with appropriate spectral fingerprints. As a result, the HSI has been extensively applied to object identification in several fields, including surveillance, agricultural monitoring, environmental research, and precision agriculture. However, because of their enormous size, objects require a lot of time to classify; for this reason, both spectral and spatial feature fusion have been completed. The existing classification strategy leads to increased misclassification, and the feature fusion method is unable to preserve semantic object inherent features; This study addresses the research difficulties by introducing a hybrid spectral-spatial fusion (HSSF) technique to minimize feature size while maintaining object intrinsic qualities; Lastly, a soft-margins kernel is proposed for multi-layer deep support vector machine (MLDSVM) to reduce misclassification. The standard Indian pines dataset is used for the experiment, and the outcome demonstrates that the HSSF-MLDSVM model performs substantially better in terms of accuracy and Kappa coefficient.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Predicting user behavior using data profiling and hidden Markov model
1. International Journal of Electrical and Computer Engineering (IJECE)
Vol. 13, No. 5, October 2023, pp. 5444~5453
ISSN: 2088-8708, DOI: 10.11591/ijece.v13i5.pp5444-5453 5444
Journal homepage: http://ijece.iaescore.com
Predicting user behavior using data profiling and hidden
Markov model
Bahaa Eddine Elbaghazaoui, Mohamed Amnai, Youssef Fakhri
Laboratory of Computer Sciences, Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco
Article Info ABSTRACT
Article history:
Received Oct 16, 2022
Revised Jan 19, 2023
Accepted Mar 9, 2023
Mental health disorders affect many aspects of patient’s lives, including
emotions, cognition, and especially behaviors. E-health technology helps to
collect information wealth in a non-invasive manner, which represents a
promising opportunity to construct health behavior markers. Combining
such user behavior data can provide a more comprehensive and contextual
view than questionnaire data. Due to behavioral data, we can train machine
learning models to understand the data pattern and also use prediction
algorithms to know the next state of a person’s behavior. The remaining
challenges for this issue are how to apply mathematical formulations to
textual datasets and find metadata that aids to identify the person’s life
pattern and also predict the next state of his comportment. The main idea of
this work is to use a hidden Markov model (HMM) to predict user behavior
from social media applications by analyzing and detecting states and symbols
from the user behavior dataset. To achieve this goal, we need to analyze and
detect the states and symbols from the user behavior dataset, then convert
the textual data to mathematical and numerical matrices. Finally, apply the
HMM model to predict the hidden user behavior states. We tested our
program and identified that the log-likelihood was higher and better when
the model fits the data. In any case, the results of the study indicated that the
program was suitable for the purpose and yielded valuable data.
Keywords:
Data profiling
Hidden Markov model
Machine learning
User behavior
User profiling
This is an open access article under the CC BY-SA license.
Corresponding Author:
Bahaa Eddine Elbaghazaoui
Laboratory of Computer Sciences, Faculty of Sciences, Ibn Tofail University
Kenitra, Morocco
Email: bahaaeddine.elbaghazaoui@uit.ac.ma
1. INTRODUCTION
Personal data is getting more difficult to obtain [1], and the most recent legislation allows users to
connect with a service without providing personal information. Furthermore, a trustworthy personal profile
must be constructed utilizing a variety of attributes that are frequently hidden and can only be deduced using
predictive models. Implicit feedback, on the other hand, is easy to collect but highly rare. Although strategies
based on factorization of the user-item matrix [2] are simple to implement (given the ability to use
parallelized algorithms), the gap between the number of ratings and products is usually too large to allow
reliable estimation.
Various probabilistic methods have been devised by researchers to understand user behavior
patterns in social data [3]. These models often incorporate a hidden parameter, representing a user’s interest
in a particular topic, estimated based on their interactions with messages, views, and other related content.
The user data, such as their frequency of visits, number of messages inspected and received, play a crucial
role in recommending services through web applications. Therefore, it is imperative to consider these factors
while describing user behavior [4].
2. Int J Elec & Comp Eng ISSN: 2088-8708
Predicting user behavior using data profiling and hidden Markov model (Bahaa Eddine Elbaghazaoui)
5445
Thoughts are sometimes associated with user behavior [5]. We can use innate behavioral
information to better comprehend a person’s psychological and health behavior. It can also provide a method
for predicting future states such as mood, which has a wide range of applications in psychology [6],
medicine, and other economic transactions [7], where user mood is a major factor in decision-making.
Online user behavior is creating an unprecedented amount of data that is implicitly public. First, the
amount of data is expanding as users spend more time on media applications and engage more through
postings, comments, and other means [8]. Second, users almost always contribute information. However,
what is less clear is that the capacity to swiftly collect, combine, and analyze that information can provide
extremely detailed information about the user. Individual pieces of information are harmless, but when joined
with others, they might lead to unexpected revelations.
Psychological profiling is a technique used by psychologists and law enforcement agencies to
understand the behavior and motivation of criminals or suspects [9]. Behavioral profiling, on the other hand,
involves the use of machine learning and sophisticated analytics to generate profiles of user behavior in
various fields such as marketing, healthcare, and finance. The use of behavioral profiling can help
organizations better understand and anticipate user needs. However, it is important to balance the benefits of
behavioral profiling with privacy concerns and ethical considerations.
The term “prediction” is on everyone’s lips. In the corporate sector, predictive analytics is
exploding, and every company wants to know what their customers will do next. Predictive analysis can
deliver great business value even when individual predictions are not always accurate [10].
The majority of prediction analyses are based on typical machine learning (ML) models such as
linear regression, random forest (RF), and natural language processing (NLP). The difficulty with these
models is that they use the entire dataset as input and provide a single column as a result [11]. The goal is to
figure out the data pattern. Regrettably, the outcomes of those models differ from one database to the next.
This article aims to predict user behavior using data profiling and the hidden Markov model (HMM)
technique [12]. However, HMM allows profiling the most likely sequence of states. This paper will begin by
profiling the dataset to better comprehend the data’s worth and identify common user behavior states. Then,
for easy application of the HMM model, we need to convert all textual data to a mathematical matrix.
The primary goal of our paper is to identify the users’ behavioral directions, as well as the states of
their behavior and the necessary actions that they must perform. The HMM model is one of the most
applicable models for dealing with this problem. Indeed, after profiling the data, we can extract information
from our database (min, max, repetition of an activity, ...). Using this metadata, we can then easily apply the
HMM model. However, due to the Daylio application, we collected a user lifelog dataset that contains every
action this user took in time as well as his mood at that time. After cleaning and analyzing the dataset, we
apply the HMM model and predict the moods of the user based on his activities.
This paper outlines its structure, which aims to summarize previously discussed ideas. It starts by
discussing pertinent attempts and significant challenges related to its approach in the first section. The second
section covers data profiling and the HMM method. The paper's primary contribution is detailed in the third
section, followed by a discussion of the implementation strategy. Lastly, the study finishes with a thorough
discussion of the results and recommendations for potential future research.
2. BACKGROUND
Lifelogging is the practice of digitally recording a user’s everyday activities for a variety of
objectives and at various levels of detail [13]. Such data can be saved to track daily activities and improve the
human experience. Lifelogging is already being used in a variety of settings. It has, for example, been used to
recall human memories [14], anticipate and diagnose physical health issues [15], detect chronic diseases [16],
create themed and digitized diaries, record, identify, and review everyday activities, and gather and analyze
data on aged health. Researchers in both the lifelogging and physical health fields have shown great interest
in studying physical health due to its potential to enhance the quality of life for individuals.
Self-revelation behavior plays an essential role in various social media networks, affecting
self-presentation and social desirability. This conduct caters to people’s demands and influences their actions
and attitudes [17]. Recently, there has been an increasing interest in exploring the role of emotions in
self-disclosure behavior [18]. By evaluating online encounters that can be shared through an emotional
perspective, a more comprehensive understanding of individuals’ capacity to convey, comprehend, manage,
and exchange their self-reports to gain social recognition is achievable [19]. According to present studies,
online communication facilitates the sharing of opinions, perceptions, emotions, and experiences more than
in-person interactions [20].
Our preliminary research findings on comprehending consumers psychologically and tracking their
mental health in four psychological categories using life logs were published in Dang-Nguyen et al. [21].
These categories consist of measuring the big five personality “BIG5” traits [22], predicting mood and sleep
3. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 13, No. 5, October 2023: 5444-5453
5446
quality [23], and detecting music type and mood [24]. We concluded that using lifelog data, we can
psychologically analyze and model the person [25].
According to past psychological research, various elements influence a person’s behavior.
Environmental factors [26], exercise and physical activities [27], [28], weather and air pollution [29], [30],
sleep duration and quality, working hours, heart rate, blood pressure, and an individual’s personality,
particularly in terms of extraversion and neuroticism [31], are some of these factors. Temperature, wind
speed, sunshine, precipitation, air pressure, and photoperiod are all factors that affect mood, according to one
study by Denissen et al. [29]. Many studies on the relationship between body measurements and mood have
shown that biometrics are mood markers and are influenced by mood [32], [33].
For researchers that are all heading in the same direction, some of them tried to anticipate and
identify moods by gathering information from trial users’ profiles, such as social participation, gender,
linguistic style, and a variety of psychological data [34]. According to this study, user behaviors and postings
can be used as behavioral clues to characterize Twitter users’ tweets as positive, negative, or neutral [30].
Roshanaei et al. [35] the same group of researchers looked at predicting user emotions based on their mobile
phone habits. To collect valuable and significant amounts of user data, they created an Android app that
captures user emotions as well as certain physical data about their lives, such as activities they engage in,
places they visit, and apps they are currently using. They offer correlations between user moods and these
characteristics in their work, as well as create algorithms to predict user moods with promising accuracy.
ML involves several internal states that are challenging to identify or observe. An alternative
approach is to infer these states from observable external factors [36]. This is precisely what HMM does. For
instance, in speech recognition, we listen to a spoken utterance (the observable) to deduce the underlying text
(the internal state representing the speech).
Our method would be to apply a mathematical approach using historical lifelog data to forecast
future user behavior. We can forecast the next and hidden states of user behavior using the HMM. This
strategy will be realized when we can convert all of our textual data into mathematical form.
3. PROBLEMATIC ISSUE
Several researchers have focused on collecting metadata from web applications to gain a
comprehensive understanding of user behavior, which is commonly referred to as data profiling [37].
Studying user behavior can help to enhance the caliber of different services and goods [38] and offer them in
a way that satisfies consumers’ needs. To effectively market their products and services, large corporations
are constantly trying to predict and comprehend their clients behavior. However, determining users’
behavioral patterns, their states, and the crucial actions they must take [39] can be challenging. According to
our research, the HMM model is the most suitable method for addressing this issue. We can extract
information from our database after profiling the data (such as minimum, maximum, and activity frequency),
and then use this metadata to apply the HMM model easily.
4. PRELIMINARIES
This section will focus on two methods that will be used for the project: data profiling application
and HMM approach. The data profiling application will be utilized to extract meaningful insights from the
given dataset. On the other hand, the HMM approach will be used to build a model to predict the next
possible event based on the previous sequence of events.
4.1. Data profiling
Profiling is a procedure that involves gathering data from various data sources (databases and files)
and compiling statistics and information on such data [40]. As a result, the data analysis is quite close [37].
The process of data profiling is useful in various scenarios where maintaining data quality is crucial. It can
help with data warehousing, business intelligence, and linked data by assisting in the identification of
potential problems and the implementation of necessary fixes. Moreover, data profiling is crucial for data
migration and conversion since it reveals data quality problems that could be overlooked during translation or
adaption.
It may come as a surprise that we engage in more data profiling while cleaning and preparing the
data than during the preview stage. During this stage, we often handle, clean, remove, or repair various data
elements such as NULL values, errors, missing values, noise, or unexpected data artifacts. Some activities
necessitate expertise in a particular field or domain. For example, converting an attribute to a specific
physical unit, generating a new explanatory variable by calculating the ratio of two specific attributes, or
converting an IP address to a geographic location [41].
4. Int J Elec & Comp Eng ISSN: 2088-8708
Predicting user behavior using data profiling and hidden Markov model (Bahaa Eddine Elbaghazaoui)
5447
Accurate and effective data science models rely on a comprehensive understanding of the data being
analyzed [42]. Therefore, data profiling is considered the most reliable and efficient way to “get to know the
data” for analytics and machine learning initiatives [43]. Data profiling involves examining and evaluating
data from multiple sources to identify patterns, quality, completeness, and consistency of the data. This
process enables data scientists to gain insights into the data and determine the necessary steps required to
clean, transform, and prepare the data for modeling.
4.2. HMM
The HMM is a modified version of the Markov chain, which is a mathematical model that provides
information about the likelihood of sets of random variables or states, such as words, tags, or symbols
representing various things such as the weather. The Markov chain assumes that the current state is the only
one that matters when predicting future events in a sequence, and that previous states have no influence on
the future state. This is like to forecasting the weather for tomorrow without considering the weather from
yesterday.
𝑃(𝑞𝑖 = 𝑎|𝑞1. . . 𝑞𝑖−1) = 𝑃(𝑞𝑖 = 𝑎|𝑞𝑖−1) (1)
HMM provides a more probabilistic approach to modeling time-series data. Unlike traditional
methods, which are based around mathematical equations, HMM models a set of discrete states and
transitions between them to capture the underlying patterns in the data. This approach allows for more
accurate and robust predictions, as it can capture complex relationships and patterns between the data points.
In addition, the use of stochastic models provides a means to model uncertainty and variability in the data,
which can be beneficial in many contexts. Finally, HMM models are typically more computationally efficient
than traditional methods, as they require fewer parameters and can be scaled up or down to adjust the
complexity of the model.
Consider a list of state variables, denoted as qi, can be modeled using a Markov model. This model
is based on the assumption that the future probability of the sequence depends only on the present state, and
not on the past. Thus, the Markov model simplifies the prediction of future states by disregarding the
influence of the past states.
The HMM is a probabilistic model that uses observable data to infer unobserved information [44]. It
is a statistical modeling technique that’s utilized in speech recognition, handwriting recognition, as well as
other applications [45]. In many circumstances, what is observed is not reality, and the HMM is one method
for recovering the hidden truth. However, it is not magic for everything, and to utilize it, one must meet
certain assumptions, which are the foundations of the HMM.
The focus is on the observations sequence rather than the sequence of states in HMM, a variation of
Markov models where the states generating observable data are not explicitly observable [46]. Given the
value of the hidden variable at moment 𝑡 − 1, the hidden variable’s value at time 𝑡 dictates both the value of
the observable variable at time t and the conditional probability distribution of the hidden variable at
instant 𝑡. As a result, HMMs are particularly useful for imitating situations when a system can only be
partially seen.
The HMM assumes a discrete state space for the hidden variables and a continuous Gaussian
distribution for the observations. The HMM has two parameters: the probabilities of state transitions and the
probabilities of output. Given a hidden state at time 𝑡 − 1, the state transition probabilities determine the
selection of the hidden state at time 𝑡. This yields one of N possible values for a set of hidden states modeled
by a categorical distribution. Figure 1 depicts the HMM process, where the current state and transition
probability matrix A determine the Markov process, which is hidden beneath the dashed line.
The higher-level Markov model can be transformed into a hidden Markov model, as depicted in
Figure 2, for a better understanding. In this model, the state variables are hidden, and only observable
outputs, or emissions, are visible. The model assumes that the probability distribution of each output depends
only on the current hidden state, which can be inferred using the observable emissions.
Markov chain model is clearly visible: the “start” state, the two “state 1” and “state 2” states, as well
as the transitions and the corresponding probability. Two new observations have been added: “A” and “B”.
Each state has a given probability of emitting each observation, which we call “probability of emission. This
probability could be null, indicating that the state is unable to issue the requested observation. In our
example, state 1 has a 50% probability (0.5) of producing a “A” and a 50% chance of producing a “B”
whereas state 2 has a 90% chance (0.9) of producing a “A” and a 10% chance (0.1) of producing a “B”. The
sum of the state’s emissions probabilities must always be equal to one, just as it must be for transitions
leaving that state.
5. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 13, No. 5, October 2023: 5444-5453
5448
Figure 1. Hidden Markov model
Figure 2. Example of HMM
The basic components of the HMM model is as: i) hidden states, ii) observation symbols the return
to the initial hidden state when the initial state has changed, iii) transition to the terminal state probability
distribution, iv) distribution of state transition probabilities, and v) distribution of state emission probabilities.
HMM is divided into two sections: hidden and observable. The hidden component is made up of
hidden states that cannot be seen directly but are detected by observation symbols that hidden states emit. For
example, we do not know what mood our user is in (mood is a hidden state), but we may see their actions
(observable symbols) and infer what hidden state he is in from those actions.
There are three common problems that an HMM can be used to solve: i) calculate the probability of
a specific sequence using an automated system (using the forward algorithm); ii) finding the most likely state
(hidden) that led to the generation of a certain sortie sequence (using the Viterbi algorithm); and iii) given a
flight sequence, determine the most likely set of states as well as the likelihood of each state’s sorties. The
Baum-Welch algorithm, often known as the forward-backward algorithm, is used.
Let’s take the on-screen keyboard on a mobile phone as an example [47]. We may occasionally
mistype the character next to what we meant to type. The observed data is the character you mistyped, while
the unobserved data is the one you wanted to enter in your head. Another example is that owing to random
noise, your global positioning system (GPS) measurements (observed) may jump into your actual location
(unobserved).
A hidden Markov model can be evaluated using one of two approaches HMM.
a) Likelihood of test data: In this strategy, some test data should be kept and the likelihood of the test
sequences computed using the forward algorithm.
b) Predicting data parts based on other data parts: The application determines whether or not a prediction
task is significant. You might be interested in foreseeing the future, for example. You can utilize the
forward method to follow the state of the hmm in this scenario.
5. METHOD AND CONTRIBUTION
This study presents our workflow approach, which is illustrated in Figure 3. The initial stage is to
gather information about user behavior and activities. In the second step, we use data cleaning to identify
relevant and powerful data, followed by data profiling to gather metadata for our HMM model. The final step
will be to apply HMM to our metadata.
To make things clearer. In the first phase, we must choose a dataset or an application programming
interface (API) that contains information about user behavior as well as its daily activities. These details will
make it easier for us to put our strategy into action. In the second section, we must clean up our database of
incorrect, incomplete, and null information [48]. As a result, for each state, calculate the average, minimum,
6. Int J Elec & Comp Eng ISSN: 2088-8708
Predicting user behavior using data profiling and hidden Markov model (Bahaa Eddine Elbaghazaoui)
5449
and maximum values to get a broad picture of the data. In the final section, you must calculate the matrix of
state transitions and the emission matrix between each state and activity. The initial value of each state can be
taken as 1 divided by the number of states. Finally, we’ll apply our HMM model.
Figure 3. Approach workflow
In this case, our issue is based on the second problem, as mentioned in the preliminary portion of the
HMM section. The Viterbi algorithm’s goal is to draw an inference based on some observed data and a
trained model [49], [50]. It works by posing the following question: given the trained parameter matrices and
data, what are the states that maximize the joint probability? In other words, given the data and the trained
model, what is the most likely option? The following algorithm 1 can be used to represent this statement, and
the answer is dependent on the facts.
Algorithm 1. Viterbi
Input Initial probabilities vector (π), Observation sequence (O), State transition
probabilities matrix (A), Emission probabilities matrix (B)
Output The likely sequence of hidden states (Q)
1 Begin
2 //Initialization
3 Let N be the number of states
4 Let T be the length of observation sequence
5 Initialize a 2D array to store the probabilities of most likely path:
6 V[1..N, 1..T]
7 //Recursion
8 For each state 1 to N, do
9 V[state, 1]=π[state]*B[state, O[1]]
10 For each time 2 to T, do
11 For each state 1 to N, do
12 V[state, t]=max(V[1..N, t-1]*A[1..N, state]*B[state, O[t]])
13 //Termination
14 Let prob be the maximum of V[1..N, T]
15 Let Q[1..T] be an array to store the most likely sequence of states
16 Q[T]=argmax(V[1..N, T])
17 For t=T-1 to 1, do
18 Q[t]=argmax(V[1..N, t]*A[1..N, Q[t+1]])
19 Return Q
20 End
The algorithm 1 is a dynamic programming method that is used to identify the sequence of states
with the highest probability in an HMM. The purpose of this algorithm is to maximize the probability of the
most likely sequence of states given a sequence of observations. To achieve this, the algorithm initializes a
two-dimensional array called V, which is used to store the probabilities of the most likely path. Then, for
each state, the algorithm assigns probabilities of the observed sequence. Next, the algorithm iterates over
each time step and, for each state, finds the maximum probability of the most likely sequence of states given
the previous probabilities. The algorithm then utilizes the array with the maximum probability to identify the
most likely set of states. The program then produces the most probable series of states.
The accuracy of an HMM can be determined using Python by calculating the model log-likelihood.
The log-likelihood is a measure of how well the model fits the data. To calculate the log-likelihood of an
HMM, you need to first fit the model to the data using the fit() method. This will create an instance of the
HMM with its estimated parameters. Then, you can use the score() method to compute the log-likelihood.
The better the model fits the data, the higher the log-likelihood.
7. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 13, No. 5, October 2023: 5444-5453
5450
6. IMPLEMENTATION AND RESULTS
Based on the workflow that we provided in our contribution. The steps for our implementation will
be as follows. In the first instance, we need to gather a set of user-relevant data from a donation database or
an API. Following whatever search one conducts, one discovers a database collected by the application
Daylio that is tailored to our needs. The dataset may be found on the Kaggle website [51]. The dataset is
Abid Ali Awan’s lifelogs with goals, and it comprises full data about time, emotions, and activities that affect
mood from 03/02/2018 to 16/04/2021 as represented in Table 1.
The dataset has 940 data rows. It has many columns as shown in the previous Table 1, but the last
two columns (activities and mood) are the ones that interest us in our approach. The user “Abid Ali Awan”
frequently alternates between five moods (good, normal, amazing, awful, and bad), and there are numerous
activities to choose from, such as reading, learning, and praying...).
Following the data quality step, we applied a Python script to clean the dataset. However, we take
the columns that we are interested in and also delete the rows that have null or empty values. After that, we
started using data profiling. We used the pandas-profiling library to get a description of metadata such as
min, max, and the most frequent values, as seen in Table 2. Data profiling in our work is based on extracting
metadata that will help us predict emotional states based on activities. To accomplish so, we must define the
transition matrix between states (between moods) as well as the transmission matrix between states and
symbols (moods and activities).
It is time to talk about transition matrixes. We use the term “transition” from one specific state to
another. In fact, a transition between “Good” and “Bad” is a day when the user is “Good” and the next day is
“Bad”. The transition value between these two states is the value of transition between these two states
divided by the total number of transitions. Table 3 shows the values of the transition matrix in our dataset.
Now, we must generate the emission matrix, which contains the probabilities for each state of
emitting each of the potential observations. In fact, if we choose the state “Good” and the symbol
“designing”, the emission value is the number of days that the user is “Good” as well as performing the
activity “designing” divided by the number of days that the user is “Good”. Following the extraction of all
the necessary data. For our preferred language “Python”, we used the hmmlearn library with Gaussian mixture
emissions. The hmmlearn is a collection of techniques for unsupervised learning and inferring HMM.
Table 1. Abid Ali Awan’s lifelog in the Daylio app
Index Full date Date Weekday Time Activities Mood
0 16/04/2021 Apr 16 Friday 8:00 pm Reading, Art, Prayer... Good
1 15/04/2021 Apr 15 Thursday 2:37 am Reading, Learning, Art... Good
2 14/04/2021 Apr 14 Wednesday 2:39 am Reading, Learning, Prayer... Normal
.. ... ... ... ... ... ...
938 03/02/2018 Feb 03 Saturday 7:52 pm Write, Dota 2, Streaming ... Awfull
939 03/02/2018 Feb 03 Saturday 3:12 pm Walk, Meditation, Dota 2... Normal
Table 2. Profiling Daylio dataset
Activities Mood
Count 12,614 940
Unique 58 5
Top YouTube Good
Freq 770 487
Table 3. Transition matrix result
Good Normal Amazing Awful Bad
Good 0.6057 0.2032 0.127 0.0266 0.0369
Normal 0.4756 0.2540 0.1027 0.0918 0.0756
Amazing 0.4071 0.0838 0.4790 0.0179 0.0119
Awful 0.4117 0.1764 0.0392 0.1960 0.1764
Bad 0.2857 0.3469 0.0816 0.1632 0.1224
7. DISCUSSION
The following experience is used as an application. We give our program the following lists of
activities. This set of activities represents an example, if a user does these activities, what states can be
obtained according to our program? Indeed, we tried in the first tests, random activities that were different
and not repeated, for example:
8. Int J Elec & Comp Eng ISSN: 2088-8708
Predicting user behavior using data profiling and hidden Markov model (Bahaa Eddine Elbaghazaoui)
5451
[‘Write’, ‘Prayer’, ‘Shower’, ‘Dota’, ‘Good’, ‘Streaming’, ‘YouTube’, ‘Research’, ‘Power’]
According to those activities, the program predicts the following hidden states:
[‘Bad’, ‘Awful’, ‘Bad’, ‘Awful’, ‘Amazing’, ‘Good’, ‘Normal’, ‘Amazing’, ‘Good’]
After that, we gave our program a list of activities that it repeated as follows:
[‘Streaming’, ‘Reading’, ‘Streaming’, ‘YouTube’, ‘YouTube’, ‘YouTube’, ‘YouTube’, ‘YouTube’,
‘YouTube’]
Based on those activities, the program predicts the following hidden states.
[‘Good’, ‘Normal’, ‘Bad’, ‘Awful’, ‘Awful’, ‘Awful’, ‘Awful’, ‘Awful’, ‘Awful’]
According to our tests, if we repeat some activities that the user must do, we predict that the mood
states will repeat too (i.e., if the user watches YouTube several times, the mood states will be “Awful”). For
this, we find that user activities influence the future state of mood. This validates the HMM hypothesis and
enhances the results of our program.
The evaluation of our system can also be based on the Likelihood results. The log-likelihood value
of a regression model is a measure of how well the model fits the data. A higher log-likelihood value
indicates a better fit to the dataset. The log-likelihood method is useful when comparing multiple models. In
practice, multiple regression models are often fitted to a dataset, and the model with the highest log-
likelihood value is chosen as the best fit.
A degenerate solution will result from fitting a model with 34 free scalar parameters with just
5 data points and a 𝑙𝑜𝑔(𝑙𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑) =165.2432. If we take 9 activities from our dataset, we get
𝑙𝑜𝑔(𝑙𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑) =14.3623. If we take 100 activities from our dataset, we get 𝑙𝑜𝑔(𝑙𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑) =-238.5843.
The results indicate that the model is overfitted when using a small data set of 5 data points with 34
free scalar parameters. The log-likelihood score is much higher than when using larger datasets of 9 and 100
activities. This suggests that the model is better at predicting outcomes when more data points are available.
Additionally, the log-likelihood score is much lower when using a dataset of 100 activities as opposed to 9,
indicating that the model is more accurate when more data points are used.
8. CONCLUSION AND FUTURE DIRECTION
HMM has the ability to capture the underlying structure of user behavior, allowing for more
accurate predictions. This has been used to great success in a variety of different applications, from
predicting user behavior on websites to predicting stock market movements. By leveraging the power of
HMM, we can better understand user behavior, predict their next state, and ultimately improve user
experience. The objective of this project is to combine basic duties. However, storage and data collection,
state and symbol profiling from our data, and the extraction of the transition matrix and emission matrix are
all steps in the process. Then, using the HMM model, we forecast hidden states based on observable symbols.
This method will aid us in comprehending and anticipating customer behavior, and the current product will
be tailored to meet their needs.
In our future directions, we will explore the potential of combining HMMs with machine learning
techniques to create more efficient and powerful solutions. We will investigate the use of HMMs to improve
the accuracy of machine learning algorithms, as well as explore their potential to enable more intelligent and
dynamic machine learning models. We will also investigate the potential of HMMs to act as an intermediary
between machine learning algorithms, potentially allowing for more efficient and interactive learning
systems.
REFERENCES
[1] M. Naeem et al., “Trends and future perspective challenges in big data,” in Smart Innovation, Systems and Technologies, vol. 253,
Springer Singapore, 2022, pp. 309–325, doi: 10.1007/978-981-16-5036-9_30.
[2] T.-H. Pham et al., “A novel machine learning framework for automated detection of arrhythmias in ECG segments,” Journal of
Ambient Intelligence and Humanized Computing, vol. 12, no. 11, pp. 10145–10162, Nov. 2021, doi: 10.1007/s12652-020-02779-1.
[3] J.-H. Kang, K. Lerman, and L. Getoor, “LA-LDA: A limited attention topic model for social recommendation,” in Lecture Notes
in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7812,
Springer Berlin Heidelberg, 2013, pp. 211–220, doi: 10.1007/978-3-642-37210-0_23.
[4] H. Hao, F. Feng, W. Shao, and W. Huang, “Understanding the influence of contextual factors and individual social capital on
American public mask wearing in response to COVID–19,” Health Place, vol. 68, 2021, doi: 10.1016/j.healthplace.2021.102537.
9. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 13, No. 5, October 2023: 5444-5453
5452
[5] D. A. Reed, D. B. Gannon, and J. R. Larus, “Imagining the future: thoughts on computing,” Computer, vol. 45, no. 1, pp. 25–30,
Jan. 2012, doi: 10.1109/MC.2011.327.
[6] R. Buehler, C. McFarland, V. Spyropoulos, and K. C. H. Lam, “Motivated prediction of future feelings: effects of negative mood
and mood orientation on affective forecasts,” Personality and Social Psychology Bulletin, vol. 33, no. 9, pp. 1265–1278, Sep.
2007, doi: 10.1177/0146167207303014.
[7] J. Bollen, H. Mao, and X. Zeng, “Twitter mood predicts the stock market,” Journal of Computational Science, vol. 2, no. 1,
pp. 1–8, Mar. 2011, doi: 10.1016/j.jocs.2010.12.007.
[8] S. Khalid, “Motion-based behaviour learning, profiling and classification in the presence of anomalies,” Pattern Recognition,
vol. 43, no. 1, pp. 173–186, Jan. 2010, doi: 10.1016/j.patcog.2009.04.025.
[9] R. N. Turco, “Psychological profiling,” International Journal of Offender Therapy and Comparative Criminology, vol. 34, no. 2,
pp. 147–154, Sep. 1990, doi: 10.1177/0306624X9003400207.
[10] M. F. Zanuy, “Speaker recognition by means of a combination of linear and nonlinear predictive models,” in 6th European
Conference on Speech Communication and Technology (Eurospeech 1999), 1999, pp. 763–766, doi: 10.21437/Eurospeech.1999-185.
[11] P. Petousis and V. Stylianou, “A big data COVID-19 literature pattern discovery using NLP,” BioRxiv, Jun. 2022, doi:
10.1101/2022.06.01.494451.
[12] S. Li and Y. Bai, “Deep learning and improved HMM training algorithm and its analysis in facial expression recognition of sports
athletes,” Computational Intelligence and Neuroscience, vol. 2022, pp. 1–12, Jan. 2022, doi: 10.1155/2022/1027735.
[13] S. Ali, S. Khusro, A. Khan, and H. Khan, “Smartphone-based lifelogging: toward realization of personal big data,” in
EAI/Springer Innovations in Communication and Computing, Springer International Publishing, 2022, pp. 249–309, doi:
10.1007/978-3-030-75123-4_12.
[14] J. E. Kragel and J. L. Voss, “Looking for the neural basis of memory,” Trends in Cognitive Sciences, vol. 26, no. 1, pp. 53–65,
Jan. 2022, doi: 10.1016/j.tics.2021.10.010.
[15] J. Holt-Lunstad and A. Steptoe, “Social isolation: An underappreciated determinant of physical health,” Current Opinion in
Psychology, vol. 43, pp. 232–237, Feb. 2022, doi: 10.1016/j.copsyc.2021.07.012.
[16] V. Singh, V. K. Asari, and R. Rajasekaran, “A deep neural network for early detection and prediction of chronic kidney disease,”
Diagnostics, vol. 12, no. 1, Jan. 2022, doi: 10.3390/diagnostics12010116.
[17] N. Andalibi, “Disclosure, privacy, and stigma on social media: Examining non-disclosure of distressing experiences,” ACM
Transactions on Computer-Human Interaction, vol. 27, no. 3, pp. 1–43, Jun. 2020, doi: 10.1145/3386600.
[18] M. Burke, J. Cheng, and B. de Gant, “Social comparison and Facebook: feedback, positivity, and opportunities for comparison,”
in Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Apr. 2020, pp. 1–13, doi:
10.1145/3313831.3376482.
[19] O. L. Haimson and T. C. Veinot, “Coming out to doctors, coming out to ‘everyone’: understanding the average sequence of
transgender identity disclosures using social media data,” Transgender Health, vol. 5, no. 3, pp. 158–165, Sep. 2020, doi:
10.1089/trgh.2019.0045.
[20] K. L. Hinderliter, S. R. Puhl, and L. A. Skinta, “Mental health disparities among transgender and gender-diverse adults: A review
of the literature,” LGBT Health, vol. 8, no. 6, pp. 341–353, 2021, doi: 10.1089/lgbt.2020.0353.
[21] D.-T. Dang-Nguyen, L. Zhou, R. Gupta, M. Riegler, and C. Gurrin, “Building a disclosed lifelog dataset: Challenges, principles
and processes,” in Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, Jun. 2017, pp. 1–6,
doi: 10.1145/3095713.3095736.
[22] C. MacCann, Y. Jiang, L. E. R. Brown, K. S. Double, M. Bucich, and A. Minbashian, “Emotional intelligence predicts academic
performance: a meta-analysis.,” Psychological Bulletin, vol. 146, no. 2, pp. 150–186, Feb. 2020, doi: 10.1037/bul0000219.
[23] L. Wang, S. Huang, L. Huangfu, B. Liu, and X. Zhang, “Multi-label out-of-distribution detection via exploiting sparsity and co-
occurrence of labels,” Image and Vision Computing, vol. 126, p. 104548, 2022, doi: 10.1016/j.imavis.2022.104548.
[24] A. Palazzi, B. Wagner Fritzen, and G. Gauer, “Music-induced emotion effects on decision-making,” Psychology of Music, vol. 47,
no. 5, pp. 621–643, Sep. 2019, doi: 10.1177/0305735618779224.
[25] H. Tong et al., “Music mood classification based on lifelog,” in Lecture Notes in Computer Science (including subseries Lecture
Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11168, Springer International Publishing, 2018,
pp. 55–66, doi: 10.1007/978-3-030-01012-6_5.
[26] R. Küller, S. Ballal, T. Laike, B. Mikellides, and G. Tonello, “The impact of light and colour on psychological mood: A cross-
cultural study of indoor work environments,” Ergonomics, vol. 49, no. 14, pp. 1496–1507, Nov. 2006, doi:
10.1080/00140130600858142.
[27] K. Scrivener, C. Sherrington, and K. Schurr, “Amount of exercise in the first week after stroke predicts walking speed and
unassisted walking,” Neurorehabilitation and Neural Repair, vol. 26, no. 8, pp. 932–938, 2012, doi: 10.1177/1545968312439628.
[28] R. R. Yeung, “The acute effects of exercise on mood state,” Journal of Psychosomatic Research, vol. 40, no. 2, pp. 123–141, Feb.
1996, doi: 10.1016/0022-3999(95)00554-4.
[29] J. J. A. Denissen, L. Butalid, L. Penke, and M. A. G. van Aken, “The effects of weather on daily mood: A multilevel approach,”
Emotion, vol. 8, no. 5, pp. 662–667, 2008, doi: 10.1037/a0013497.
[30] M. Roshanaei, R. Han, and S. Mishra, “Having fun?: Personalized activity-based mood prediction in social media,” in Lecture
Notes in Social Networks, 2017, pp. 1–18, doi: 10.1007/978-3-319-51049-1_1.
[31] J. M. L. Poon, “Mood: a review of its antecedents and consequences,” International Journal of Organization Theory and
Behavior, vol. 4, no. 3/4, pp. 357–388, Mar. 2001, doi: 10.1108/IJOTB-04-03-04-2001-B008.
[32] J. Warland, “Low blood pressure, low mood?,” BMC Pregnancy and Childbirth, vol. 12, 2012, doi: 10.1186/1471-2393-12-S1-A9.
[33] J. Bakker, M. Pechenizkiy, and N. Sidorova, “What’s your current stress level? detection of stress patterns from GSR sensor
data,” in 2011 IEEE 11th International Conference on Data Mining Workshops, 2011, pp. 573–580, doi:
10.1109/ICDMW.2011.178.
[34] J. Burger et al., “Reporting standards for psychological network analyses in cross-sectional data,” Psychological Methods, Apr.
2022, doi: 10.1037/met0000471.
[35] M. Roshanaei, R. Han, and S. Mishra, “EmotionSensing: Predicting mobile user emotion,” in Proceedings of the 2017 IEEE/ACM
International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017, Jul. 2017, pp. 325–330, doi:
10.1145/3110025.3110127.
[36] D. R. K. Srivastava and D. Pandey, “Speech recognition using HMM and soft computing,” Materials Today: Proceedings, 2022,
vol. 51, pp. 1878–1883. doi: 10.1016/j.matpr.2021.10.097.
[37] B. E. Elbaghazaoui, M. Amnai, and A. Semmouri, “Data profiling over big data area,” in Advances in Intelligent Systems and
Computing, pp. 111–123, 2021, doi: 10.1007/978-3-030-72588-4_8.
10. Int J Elec & Comp Eng ISSN: 2088-8708
Predicting user behavior using data profiling and hidden Markov model (Bahaa Eddine Elbaghazaoui)
5453
[38] M. Giannakis, R. Dubey, S. Yan, K. Spanaki, and T. Papadopoulos, “Social media and sensemaking patterns in new product
development: demystifying the customer sentiment,” Annals of Operations Research, vol. 308, no. 1–2, pp. 145–175, Jan. 2022,
doi: 10.1007/s10479-020-03775-6.
[39] A. Vitetta, “Risk reduction in transport system in emergency conditions: a framework for network design problems,” in WIT
Transactions on the Built Environment, 2021, vol. 206, pp. 267–274, doi: 10.2495/SAFE210221.
[40] L. Ehrlinger and W. Wöß, “A survey of data quality measurement and monitoring tools,” Frontiers in Big Data, vol. 5, Mar.
2022, doi: 10.3389/fdata.2022.850611.
[41] C. Gokhale, “Network analysis of dark web traffic through the geo-location of South African IP address space,” in EAI/Springer
Innovations in Communication and Computing, Springer International Publishing, 2020, pp. 201–219, doi: 10.1007/978-3-030-
14718-1_10.
[42] E. Peer, D. Rothschild, A. Gordon, Z. Evernden, and E. Damer, “Data quality of platforms and panels for online behavioral
research,” Behavior Research Methods, vol. 54, no. 4, pp. 1643–1662, Sep. 2021, doi: 10.3758/s13428-021-01694-3.
[43] Y. Liu, Y. Li, and J. Wang, “Data profiling for big data analytics,” Journal of Intelligent and Fuzzy Systems, vol. 38, no. 5, pp.
6081–6092, 2020, doi: 10.3233/JIFS-189726.
[44] J. H. Jung, “Hand gesture recognition techniques: a review,” Journal of Imaging Science and Technology, vol. 64, no. 2, pp.
20501–20511, 2020, doi: 10.2352/J.ImagingSci.Technol.2020.64.2.020501.
[45] Q. Deng and D. Soffker, “A review of HMM-based approaches of driving behaviors recognition and prediction,” IEEE
Transactions on Intelligent Vehicles, vol. 7, no. 1, pp. 21–31, Mar. 2022, doi: 10.1109/TIV.2021.3065933.
[46] P. J. Schweitzer, “A survey of aggregation-disaggregation in large Markov chains,” in Numerical Solution of Markov Chains,
Boca Raton: CRC Press, 2021, pp. 63–88, doi: 10.1201/9781003210160-4.
[47] N. Kimura et al., “SilentSpeller: Towards mobile, hands-free, silent speech text entry using electropalatography,” in CHI
Conference on Human Factors in Computing Systems, Apr. 2022, pp. 1–19, doi: 10.1145/3491102.3502015.
[48] C. Chai, J. Wang, Y. Luo, Z. Niu, and G. Li, “Data management for machine learning: a survey,” IEEE Transactions on
Knowledge and Data Engineering, p. 1, 2022, doi: 10.1109/TKDE.2022.3148237.
[49] G. Rathee, C. A. Kerrache, and M. A. Ferrag, “A blockchain-based intrusion detection system using viterbi algorithm and indirect
trust for IIoT systems,” Journal of Sensor and Actuator Networks, vol. 11, no. 4, 2022, doi: 10.3390/jsan11040071.
[50] A. Jandera and T. Skovranek, “Customer behaviour hidden Markov model,” Mathematics, vol. 10, no. 8, Apr. 2022,
doi: 10.3390/math10081230.
[51] A. A. Awan, “Daylio mood tracker,” Kaggle, May 18, 2021. Accessed Oct. 16, 2022. [Online]. Available:
https://www.kaggle.com/datasets/kingabzpro/daylio-mood-tracker
BIOGRAPHIES OF AUTHORS
Bahaa Eddine Elbaghazaoui began his studies with a Mathematical Science
Scientific baccalaureate option. In 2013, he passed the preparatory classes that were integrated
into the National School of Applied Sciences of Khouribga, and then he progressed into the
computer engineering sector, earning an engineering diploma in software engineering in
2019. Bahaa Eddine El. is Ph.D. student, he does research at the Laboratory of Computer
Science. Ibn Tofail University in Kenitra, Morocco. He can be contacted at email
bahaaeddine.elbaghazaoui@uit.ac.ma and elbaghazaoui.bahaa@gmail.com.
Mohamed Amnai completed his master's degree in Computers, Electronics,
Electrical, and Automation (IEEA) from Molay Ismail University in Errachidia in 2000. Later
in 2007, he obtained his master's degree from Ibn Tofail University in Kenitra. In 2011, he
received his Ph.D. in Computer Science and Telecommunications from Ibn Tofail University
in Kenitra, Morocco. Currently, he serves as an assistant at the National School of Applied
Sciences Khouribga of Settat University since March 2014. In 2018, he was appointed as an
Associate Professor at the Department of Computer Science and Mathematics, Faculty of
Sciences of Kenitra, Ibn Tofail University in Morocco. He is also a member of the Networks
and Telecommunications Team at the Kenitra Faculty of Science and a member of the
Research Laboratory in Computer Science and Telecommunications (LaRIT). He can be
contacted at email mohamed.amnai@uit.ac.ma.
Youssef Fakhri obtained a Bachelor of Science (B.S.) degree in Electronic
Physics from the University Mohammed V’s Faculty of Sciences in Rabat, Morocco, in 2001.
He pursued his Master’s Degree (DESA) in Computer and Telecommunication at the same
university, where he completed his Master’s Project with the Moroccan ICI Company in 2003.
Later in 2007, he received his Ph.D. from the University Mohammed V-Agdal in Rabat,
Morocco, in collaboration with the Polytechnic University of Catalonia (UPC) in Spain.
Following his academic accomplishments, he was appointed as an Associate Professor at the
Ibn Tofail University's Faculty of Sciences of Kenitra in Morocco in 2009, where he teaches in
the Department of Computer Science and Mathematics. Additionally, he serves as an
Associate Researcher at the Rabat Faculty of Sciences and is currently the Laboratory Head at
LaRIT. He can be contacted at email fakhri@uit.ac.ma.