One success factor for an online affiliate is the quality of its content source. Affiliate marketplaces therefore need an objective assessment to retrieve the content data used to choose the right product under the appropriate product filter. In practice, selection is often not made with a sound, measurable system, so product content is chosen subjectively, based only on how it looks. Analysis with a sound, measurable system, by contrast, yields objective product content and can benefit users, because the selection rests on factual data. The purpose of this research is to analyze the potential of the affiliate marketplace by combining cosine similarity with vision-based page segmentation. This combination is proposed as a new way to optimize retrieval of the content that best matches the required criteria. The work produces a set of product recommendations suitable for publication, which are then used for comparison against the required criteria. In a limited evaluation, the proposed model performed satisfactorily: all 5 tested queries returned the expected results.
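The scoring step the abstract names, cosine similarity over product content, can be sketched in a few lines of pure Python. The whitespace tokenizer and the `text` field are illustrative assumptions, not the paper's implementation, which additionally uses vision-based page segmentation to extract the content blocks:

```python
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between two texts using raw term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in set(va) & set(vb))
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_products(query: str, products: list[dict]) -> list[dict]:
    """Rank extracted product blocks by similarity to the filter query."""
    return sorted(products, key=lambda p: cosine_similarity(query, p["text"]),
                  reverse=True)
```

A query such as "running shoes" would then surface the product block whose text overlaps it most.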
E-commerce online review for detecting influencing factors users perception (journalBEEI)
To date, online shopping using e-commerce services has become a trend. The emergence of e-commerce truly helps people shop more effectively and efficiently. However, there are still some problems in e-commerce, especially from the user perspective. This research aims to explore user review data, particularly the factors that influence user perception of e-commerce applications, to classify them, and to identify potential solutions to the problems found in e-commerce applications. Data are collected using web scraping techniques and classified using an appropriate machine learning method, i.e., a support vector machine (SVM). Text association and fishbone analysis are performed on the classified user review data. The results of this study show that user satisfaction problems can be captured. Furthermore, various services that should be provided as potential solutions to customers' experienced problems, or to application users' perception problems, can be generated. A detailed discussion of these findings is available in this article.
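The abstract names an SVM as the review classifier. A production system would use a library, but the core of a linear SVM can be sketched with a Pegasos-style hinge-loss subgradient update; the feature vectors and labels below are toy stand-ins for the vectorized reviews, not the study's data:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient training of a linear SVM.
    X: list of feature vectors, y: labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink weights (regularization), then step on the hinge subgradient.
            w = [(1 - eta * lam) * wj for wj in w]
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
```

Real review classification would first map each review to a feature vector (e.g. term frequencies) and label classes such as positive/negative perception.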
Data Exchange Design with SDMX Format for Interoperability Statistical Data (Nooria Sukmaningtyas)
Today’s concept of Open Government Data (OGD), promoting openness, transparency, and ease of access to data owned by government agencies, has become increasingly important. This initiative emerges from the demand of data users for the data belonging to government agencies. Data services that are easy to access, cheap, fast, and interoperable are needed by users and have become an important performance indicator for the respective government agencies. Statistical Data and Metadata Exchange (SDMX) is a new standard format for data dissemination activities, particularly the exchange of statistical data and metadata via the Internet; in this respect, SDMX supports the implementation of OGD projects. This paper concerns the technical design, development, and implementation of a data and metadata exchange service for statistical data using the SDMX format to support data interoperability through web services. Three results are proposed: (i) a framework for standardizing the structure of the statistical publications data model with SDMX; (ii) a design architecture for the data sharing model; and (iii) a web service implementation of the data and metadata exchange service using the Service Oriented Analysis and Design (SOAD) method. Implementation at Statistics Indonesia (BPS) is chosen as a case study to prove the design concept. Quantitative assessment and black-box testing show that the design achieves its objective.
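The shape of an SDMX-style data message can be illustrated by building a minimal fragment programmatically. The element names below loosely follow the SDMX-ML generic data style and are a simplified assumption, not BPS's actual schema; the dimension ids and observation values are invented:

```python
import xml.etree.ElementTree as ET

def build_generic_data(series_key: dict, observations: dict) -> ET.Element:
    """Build a simplified SDMX-style generic data series: a key
    (dimension name/value pairs) plus period-stamped observations."""
    series = ET.Element("Series")
    key = ET.SubElement(series, "SeriesKey")
    for dim, val in series_key.items():
        ET.SubElement(key, "Value", id=dim, value=val)
    for period, value in observations.items():
        obs = ET.SubElement(series, "Obs")
        ET.SubElement(obs, "ObsDimension", value=period)
        ET.SubElement(obs, "ObsValue", value=str(value))
    return series

series = build_generic_data(
    {"FREQ": "A", "REF_AREA": "ID"},   # annual series, Indonesia (illustrative key)
    {"2020": 270.2, "2021": 272.7},    # hypothetical observation values
)
xml_text = ET.tostring(series, encoding="unicode")
```

A web service endpoint would serialize such a structure and return it to the consuming agency, which can parse it back with any XML tooling.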
Enhanced Privacy Preserving Access Control in Incremental Data using microagg... (Ravi Sharma)
Privacy preservation here means the use of one or more techniques designed to make it impossible, or at least more difficult, to identify a particular individual from stored data related to them.
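The microaggregation idea behind such techniques can be sketched in one dimension: sort the records, group them into cells of at least k, and publish each cell's mean in place of the raw values, so no individual value is distinguishable within its group. This toy version is an illustration of the general technique, not the paper's algorithm:

```python
def microaggregate(values: list[float], k: int = 3) -> list[float]:
    """Replace each value with the mean of its size->=k group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        # The last group absorbs the remainder so every group has >= k members.
        j = len(order) if len(order) - i < 2 * k else i + k
        group = order[i:j]
        mean = sum(values[g] for g in group) / len(group)
        for g in group:
            out[g] = mean
        i = j
    return out
```

After aggregation, every published value is shared by at least k records, which is the k-anonymity-style guarantee microaggregation targets.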
A Generic Model for Student Data Analytic Web Service (SDAWS) (Editor IJCATR)
Any university management system accumulates a wealth of data, and analytics can be applied to it to gather useful information that aids the academic decision-making process. This paper is a novel attempt to demonstrate the significance of a data analytic web service in the education domain. The service can be integrated easily with the university management system or any other application of the university. Analytics as a web service offers many benefits over traditional analysis methods. The web service can be hosted on a web server and accessed over the internet, or hosted on the campus's private cloud. Data from various courses in different departments can be uploaded and analyzed easily. In this paper we design a web service framework for educational data mining that provides analysis as a service.
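The core of such a service is an analysis function sitting behind a web endpoint. A minimal sketch of the JSON-in/JSON-out handler is below; the record fields (`course`, `score`) are invented for illustration, and a real deployment would expose `handle_request` through an actual web framework rather than call it directly:

```python
import json

def analyze_courses(records: list[dict]) -> dict:
    """Aggregate per-course statistics from uploaded student records.
    Each record is assumed to look like {"course": str, "score": float}."""
    stats: dict[str, list[float]] = {}
    for r in records:
        stats.setdefault(r["course"], []).append(r["score"])
    return {
        course: {"count": len(scores), "mean": sum(scores) / len(scores)}
        for course, scores in stats.items()
    }

def handle_request(body: str) -> str:
    """JSON-in/JSON-out, as a web service endpoint would expose it."""
    return json.dumps(analyze_courses(json.loads(body)))
```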
Unstructured multidimensional array multimedia retrieval model based on XML database (eSAT Journals)
Abstract: Drawing on the ideas of the data warehouse, the data cube, and XML, this paper presents a new database structure model that organizes unstructured data into a multidimensional data cube based on an XML database. In this XML data cube, clustered data are stored in an instance table, and the corresponding leading data are stored in a dimension table. The relational model is helpful for constructing data models but lacks flexibility; the new data model can compensate for this shortcoming. When querying, the leading data are first obtained from the XML dimension table, and the unstructured data are then retrieved through XQuery. This increases the flexibility of the XML database. Keywords: XML, multimedia, multi-dimension, database, retrieval model, multidimensional array, unstructured data.
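The two-step query path the abstract describes, a dimension-table lookup followed by instance retrieval, can be sketched with the standard library's XPath subset. The element names are illustrative, and plain XPath stands in here for the XQuery engine the paper assumes:

```python
import xml.etree.ElementTree as ET

CUBE = ET.fromstring("""
<cube>
  <dimension name="topic">
    <entry key="video" ref="i1"/>
    <entry key="audio" ref="i2"/>
  </dimension>
  <instances>
    <item id="i1">unstructured video metadata blob</item>
    <item id="i2">unstructured audio metadata blob</item>
  </instances>
</cube>
""")

def query(cube, dim, key):
    """Dimension-table lookup first, then instance-table retrieval."""
    entry = cube.find(f"dimension[@name='{dim}']/entry[@key='{key}']")
    if entry is None:
        return None
    item = cube.find(f"instances/item[@id='{entry.get('ref')}']")
    return item.text if item is not None else None
```

The dimension table keeps queries cheap: only one small lookup runs before the (potentially large) unstructured payload is touched.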
- Developed an enterprise software architecture model for a startup company as a consultant.
- Identified and developed all parts of the enterprise software middleware, such as the business motivation model, business capability model, information architecture model, application architecture, and more.
- Developed the project planning, roadmap, and governance for the enterprise.
A tutorial on secure outsourcing of large-scale computation for big data (redpel dot com)
Nowadays smartphones are equipped with touch-based screens, enabling the user to interact in an easier and more efficient way than with standard buttons. However, such screens require visual navigation, which is difficult for the visually impaired to use. One of the technologies that helps a blind person make calls on an Android mobile is virtual reality. Even when virtual reality is employed to carry out operations, the blind person's attention remains one of the most important parameters: if the user makes a mistake in using the application, it leads to a wrong call. So, in this paper we suggest a technology to reduce the burdens of a blind person: haptic technology. Haptics is the science of applying tactile sensation to human interaction with computers. In this paper we discuss the basic concepts behind haptics, the haptic devices, and how these devices interact to produce the sense of touch and force-feedback mechanisms. The implementation of this mechanism by means of haptic rendering and contact detection is also discussed. Further, we explain the storage and retrieval of haptic data while working with haptic devices.
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ... (Eswar Publications)
In today’s dynamic marketplace, telecommunication organizations, both private and public, are increasingly abandoning antiquated marketing philosophies and strategies in favor of more customer-driven initiatives that seek to understand, attract, retain, and build intimate, long-term relationships with profitable customers. This paradigm shift has undoubtedly led to the growing interest in Customer Relationship Management (CRM) initiatives that aim at ensuring customer identification and interaction. The urgent market requirement is to identify automated methods that can assist businesses in the complex task of predicting customer churn.
The immediate requirement of the market is to have systems that can perform accurate
(i) identification of loyal customers (so that companies can offer more services to retain them)
(ii) prediction of churners to ensure that only the customers who are planning to switch their service
providers are being targeted for retention
A Framework for Visualizing Association Mining Results (ertekg)
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/a-framework-for-visualizing-association-mining-results/
Association mining is one of the most used data mining techniques due to its interpretable and actionable results. In this study we propose a framework to visualize association mining results, specifically frequent itemsets and association rules, as graphs. We demonstrate the applicability and usefulness of our approach through a Market Basket Analysis (MBA) case study in which we visually explore the data mining results for a supermarket data set. In this case study we derive several interesting insights regarding the relationships among the items and suggest how they can be used as a basis for decision making in retailing.
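Before such rules can be drawn as a graph, they have to exist as one. A minimal sketch builds a directed graph where each single-item rule a → b becomes an edge weighted by its confidence; the toy transactions and thresholds are invented, not the case study's data:

```python
from itertools import combinations
from collections import Counter

def rule_graph(transactions, min_support=2, min_conf=0.5):
    """Directed graph of single-item association rules a -> b,
    edge weight = confidence(a -> b) = support(a,b) / support(a)."""
    item_count, pair_count = Counter(), Counter()
    for t in transactions:
        items = sorted(set(t))
        item_count.update(items)
        pair_count.update(combinations(items, 2))
    graph = {}
    for (a, b), n in pair_count.items():
        if n < min_support:
            continue
        for src, dst in ((a, b), (b, a)):
            conf = n / item_count[src]
            if conf >= min_conf:
                graph.setdefault(src, {})[dst] = conf
    return graph
```

The resulting adjacency structure can be handed directly to any graph-drawing tool, with confidence mapped to edge thickness.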
Linking Behavioral Patterns to Personal Attributes through Data Re-Mining (ertekg)
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/linking-behavioral-patterns-to-personal-attributes-through-data-re-mining/
A fundamental challenge in behavioral informatics is the development of methodologies and systems that can achieve its goals and tasks, including behavior pattern analysis. This study presents such a methodology, which can be converted into a decision support system through the appropriate integration of existing tools for association mining and graph visualization. The methodology enables the linking of behavioral patterns to personal attributes through the re-mining of colored association graphs that represent item associations. The methodology is described and mathematically formalized, and is demonstrated in a case study related to the retail industry.
Latent semantic analysis and cosine similarity for hadith search engine (TELKOMNIKA JOURNAL)
Search engine technology is used to find needed information easily, quickly, and efficiently, including information about the hadith, which, besides the Holy Qur'an, is a second guideline of life for Muslims. This study aimed to build a specialized search engine for complete hadith information in the Indonesian language. In this research, the search engine works using latent semantic analysis (LSA) and cosine similarity on the keywords entered. The LSA and cosine similarity methods are used to form structured representations of the text data and to calculate the similarity between the entered keywords and the hadith text data, so that the hadith information returned matches what was searched for. Based on tests conducted 50 times, the LSA and cosine similarity approach had a high success rate in finding hadith information, with an average recall of 87.83%; however, relatively few of the retrieved hadith were semantically precise, as indicated by an average precision of 36.25%.
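The reported averages (recall 87.83%, precision 36.25%) follow the standard set-based definitions, which can be stated compactly; the query results below are toy sets, not the study's data:

```python
def precision_recall(retrieved: set, relevant: set):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def averages(runs):
    """Mean precision and recall over (retrieved, relevant) runs,
    as averaged over the study's 50 test queries."""
    pairs = [precision_recall(ret, rel) for ret, rel in runs]
    n = len(pairs)
    return (sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n)
```

High recall with low precision, as in the study, means most relevant hadith are found but many retrieved results are off-topic.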
An Efficient Framework for Predicting and Recommending M-Commerce Patterns Ba... (Editor IJCATR)
Mobile commerce, also known as M-Commerce or mCommerce, is the ability to conduct commerce using a mobile device. The research mines and predicts mobile users' commerce behaviors, such as their purchase transactions. The shortcomings of the PMCP-Mine algorithm are overcome by an efficient framework based on a graph diffusion method. The main objective is to construct this graph-based diffusion method: a graph is constructed from the items purchased by mobile users, the frequently purchased items are found, and a ranking method ranks the items based on the transactions. Then, by analyzing mobile users' behavior, the ranked items are recommended. This framework produces more efficient and accurate item recommendations than the MCE framework.
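The graph-diffusion ranking step can be sketched as PageRank-style score propagation over a co-purchase graph. The damping factor and the toy graph are assumptions for illustration, not the framework's published parameters:

```python
def diffuse_scores(graph, iterations=20, damping=0.85):
    """PageRank-style diffusion: each item shares its score equally
    with the items co-purchased after it. graph: {item: [neighbours]}."""
    items = list(graph)
    score = {i: 1.0 / len(items) for i in items}
    for _ in range(iterations):
        nxt = {i: (1 - damping) / len(items) for i in items}
        for i, neighbours in graph.items():
            if not neighbours:
                continue
            share = damping * score[i] / len(neighbours)
            for j in neighbours:
                nxt[j] += share
        score = nxt
    return score

def recommend(graph, top_n=2):
    score = diffuse_scores(graph)
    return sorted(score, key=score.get, reverse=True)[:top_n]
```

Items that many purchase paths flow into accumulate score and rise to the top of the recommendation list.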
An Improved Support Vector Machine Classifier Using AdaBoost and Genetic Algo... (Eswar Publications)
Predicting the objective of internet users has diverse applications in areas such as e-commerce, online entertainment, and several other internet-based applications. The critical part is classifying internet queries based on obtainable features, namely contextual information, keywords, and their semantic relationships. This paper presents an improved support vector machine classifier that uses an AdaBoost and genetic algorithmic approach to web interaction mining. Around 31 participants were chosen and given topics for which to search web content. Precision, recall, and F1 score are used to compare the proposed classifier with the classical support vector machine. Results show that the proposed classifier achieves better performance than the conventional SVM.
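The boosting half of the proposed classifier follows the standard AdaBoost weight update, shown here with one-dimensional threshold stumps as the weak learners. The data are toy; the paper pairs boosting with a genetic algorithm and an SVM, neither of which is reproduced here:

```python
import math

def train_adaboost(xs, ys, rounds=5):
    """AdaBoost with threshold stumps on 1-D data; ys in {-1, +1}."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []  # (alpha, threshold, polarity) per round
    for _ in range(rounds):
        best = None
        for thr in sorted(set(xs)):
            for pol in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if (pol if x >= thr else -pol) != y)
                if best is None or err < best[0]:
                    best = (err, thr, pol)
        err, thr, pol = best
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Up-weight the examples this stump got wrong, then renormalize.
        w = [wi * math.exp(-alpha * y * (pol if x >= thr else -pol))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict_boost(ensemble, x):
    s = sum(alpha * (pol if x >= thr else -pol) for alpha, thr, pol in ensemble)
    return 1 if s >= 0 else -1
```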
Paper Explained: Deep learning framework for measuring the digital strategy o... (Devansh16)
Companies today are racing to leverage the latest digital technologies, such as artificial intelligence, blockchain, and cloud computing. However, many companies report that their strategies did not achieve the anticipated business results. This study is the first to apply state-of-the-art NLP models to unstructured data to understand the different clusters of digital strategy patterns that companies are adopting. We achieve this by analyzing earnings calls from Fortune Global 500 companies between 2015 and 2019. We use a Transformer-based architecture for text classification, which shows a better understanding of the conversation context. We then investigate digital strategy patterns by applying clustering analysis. Our findings suggest that Fortune 500 companies use four distinct strategies: product-led, customer-experience-led, service-led, and efficiency-led. This work provides an empirical baseline for companies and researchers to enhance our understanding of the field.
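The clustering step that yields the four strategy groups can be sketched with plain k-means over document embedding vectors. The 2-D toy points stand in for transformer embeddings, which are not reproduced here, and the code is a generic k-means, not the study's pipeline:

```python
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters
```

With k=4, each resulting cluster would correspond to one strategy pattern, labeled afterwards by inspecting its member documents.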
A LINK-BASED APPROACH TO ENTITY RESOLUTION IN SOCIAL NETWORKS (csandit)
Social networks were initially places for people to contact each other and find friends or new acquaintances, and as such they have long proved interesting for machine-aided analysis. Recent developments, however, have turned social networks into some of the main venues of information exchange, opinion expression, and debate. As a result, there is growing interest in both analyzing and integrating social network services. In this environment, efficient information retrieval is hindered by the vast amount and varying quality of user-generated content. Guiding users to relevant information is a valuable service and also a difficult task, and a crucial part of the process is accurately resolving duplicate entities to real-world ones. In this paper we propose a novel approach that utilizes the principles of link mining to extend the methodology of entity resolution to multitype problems. The proposed method is presented using an illustrative social-network-based real-world example and validated by comprehensive evaluation of the results.
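The link-mining intuition, that two profiles likely denote the same real-world entity when their connection neighbourhoods largely overlap, can be sketched with a Jaccard threshold. The threshold value and the toy profiles are assumptions for illustration, not the paper's multitype method:

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two link neighbourhoods, in [0, 1]."""
    return len(a & b) / len(a | b) if a | b else 0.0

def resolve_entities(profiles: dict, threshold=0.5):
    """profiles: {profile_id: set of linked entity ids}.
    Returns pairs of profile ids judged to be the same entity."""
    ids = sorted(profiles)
    return [
        (p, q)
        for i, p in enumerate(ids)
        for q in ids[i + 1:]
        if jaccard(profiles[p], profiles[q]) >= threshold
    ]
```

The multitype extension in the paper would compare neighbourhoods of several link types (friends, groups, posts) rather than a single set.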
Abstract—Since the demand for information retrieval is increasing quickly, indexing structures have become an important issue in supporting fast information retrieval. In this paper, a new data structure for information retrieval called the Dynamic Ordered Multi-field Index (DOMI) is introduced. It is based on radix trees organized into segments, together with a hash table that points to the root of each segment, where each segment is dedicated to storing the values of a single field. The hash table is used to access the needed segments directly without traversing the upper segments, so DOMI improves look-up performance for queries addressing a single field. For queries addressing multiple fields, each segment of the radix tree is traversed sequentially without visiting unrelated branches. The segmentation in the proposed DOMI also provides flexibility for minimizing communication overhead in a distributed system: every field in the radix tree is represented by one segment, and each segment can be stored as one block. In addition, the proposed DOMI consumes less space compared to indexes built using B or B+ trees. Hence, it is more suitable for data-intensive workloads such as Big Data.
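The layout described, a hash table mapping each field to its own tree segment, can be sketched with plain character tries standing in for radix trees (a radix tree would additionally merge single-child chains). This is a simplification of DOMI, not its implementation:

```python
class Trie:
    """Character trie; the '$' key marks the record list at a terminal node."""
    def __init__(self):
        self.root = {}

    def insert(self, key: str, record_id):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node.setdefault("$", []).append(record_id)

    def lookup(self, key: str):
        node = self.root
        for ch in key:
            if ch not in node:
                return []
            node = node[ch]
        return node.get("$", [])

class SegmentedIndex:
    """DOMI-like layout: a hash table maps each field name to its own
    segment, so a single-field query touches exactly one segment."""
    def __init__(self):
        self.segments: dict[str, Trie] = {}

    def insert(self, field: str, value: str, record_id):
        self.segments.setdefault(field, Trie()).insert(value, record_id)

    def lookup(self, field: str, value: str):
        seg = self.segments.get(field)
        return seg.lookup(value) if seg else []
```

Because segments are independent, each one can be shipped or stored as its own block, which is the property the paper exploits for distributed settings.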
PERFORMANCE ANALYSIS OF MODIFIED ALGORITHM FOR FINDING MULTILEVEL ASSOCIATION R... (cseij)
Multilevel association rules explore the concept hierarchy at multiple levels, which provides more specific information. The Apriori algorithm explores single-level association rules, and many implementations of it are available. A fast Apriori implementation is modified to develop a new algorithm for finding multilevel association rules. In this study, the performance of this new algorithm is analyzed in terms of running time in seconds.
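The multilevel idea is commonly realized by encoding each item with its concept-hierarchy path (e.g. "food/milk/skim") and running Apriori per level. The level-wise frequent-itemset core that any such variant builds on can be sketched as follows; the transactions and support threshold are toy assumptions:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise Apriori: grow candidate itemsets one item at a time,
    keeping only those whose support meets the threshold."""
    items = sorted({i for t in transactions for i in t})
    tsets = [set(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in tsets if itemset <= t)

    frequent, size = {}, 1
    current = [frozenset([i]) for i in items]
    while current:
        level = {c: support(c) for c in current if support(c) >= min_support}
        frequent.update(level)
        keys = sorted({i for c in level for i in c})
        size += 1
        # Apriori pruning: keep a candidate only if all its subsets are frequent.
        current = [frozenset(c) for c in combinations(keys, size)
                   if all(frozenset(s) in level for s in combinations(c, size - 1))]
    return frequent
```

The modified multilevel algorithm would run this pass once per hierarchy level, typically with a support threshold that loosens at deeper levels.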
Service Level Comparison for Online Shopping using Data Mining (IIRindia)
The term knowledge discovery in databases (KDD) denotes the analysis step of data mining. The goal of data mining is to extract knowledge and patterns from large data sets, not the data extraction itself. Big-data computing is a critical challenge for the ICT industry, and engineers and researchers are dealing with petabyte data sets in the cloud computing paradigm. Thus the demand for building a service stack to distribute, manage, and process massive data sets has risen drastically. We investigate the problem of a single source node broadcasting a big chunk of data to a set of nodes so as to minimize the maximum completion time; these nodes may be located in the same datacenter or across geo-distributed datacenters. The big-data broadcasting problem is modeled as a Lock-Step Broadcast Tree (LSBT) problem. The main idea of LSBT is to define a basic unit of upload bandwidth r: a node with capacity c broadcasts data to a set of ⌊c/r⌋ children at rate r, where r is a parameter to be optimized as part of the LSBT problem. The broadcast data are further divided into m chunks, which can then be broadcast down the LSBT in a pipelined manner. In a homogeneous network environment in which each node has the same upload capacity c, the optimal uplink rate r of the LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. For heterogeneous environments, an O(n log² n) algorithm is presented to select an optimal uplink rate r and to construct an optimal LSBT. The numerical results show better performance, with lower computational complexity and low maximum completion time. The methodology includes building and broadcasting various web applications, followed by the gateway application and batch processing over the TSV data, after which web crawling for resources and the MapReduce process take place, and finally picking products from recommendations and purchasing them.
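The homogeneous-case trade-off between c/2 and c/3 can be illustrated with a standard pipelined-broadcast estimate: fan-out d = ⌊c/r⌋ gives a tree of depth about ⌈log_d n⌉, and pipelining m chunks costs roughly (depth + m − 1) chunk transmissions. This is a back-of-the-envelope sketch under those assumptions, not the paper's exact analysis:

```python
import math

def completion_time(n, c, r, m, data_size=1.0):
    """Estimated LSBT completion time: depth of a d-ary tree (d = floor(c/r))
    plus pipeline fill, times the per-chunk transmission time at rate r."""
    d = int(c // r)
    if d < 1:
        raise ValueError("rate r exceeds capacity c")
    depth = 1 if n <= 1 else (math.ceil(math.log(n, d)) if d > 1 else n - 1)
    chunk_time = (data_size / m) / r
    return (depth + m - 1) * chunk_time

def best_rate(n, c, m, candidates):
    """Pick the candidate uplink rate with the smaller estimated time."""
    return min(candidates, key=lambda r: completion_time(n, c, r, m))
```

A lower rate buys a wider, shallower tree but slower links; the comparison between r = c/2 and r = c/3 captures exactly that tension.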
-Developed Enterprise software architecture model for startup company as consultants
-Identified and developed all part of enterprise software middleware such as business motivation model, business capability model, Information architecture model, application architecture and much more.
- Developed the project planning, roadmap and governance for the enterprise.
A tutorial on secure outsourcing of large scalecomputation for big dataredpel dot com
A tutorial on secure outsourcing of large scalecomputation for big data
for more ieee paper / full abstract / implementation , just visit www.redpel.com
Nowadays the smart phones are equipped with touch based screens, enabling the user to interact in an easier and
efficient way, compared to standard buttons. However, such screens require visual navigation, apparently different
to access for the visually impaired.One of the technologies which aid the blind person to call in android mobile is
Virtual Reality. Even though virtual reality is employed to carry out operations the blind person attention is one of
the most important parameter. If the user commits any mistake to use the application it leads to a wrong call. So, in
this paper we suggest technology to reduce the burdens of a blind person “Haptic Technology”. Haptic is the
science of applying tactile sensation to human interaction with computers. In our paper we have discussed the basic
concepts behind haptic along with the haptic devices and how these devices are interacted to produce sense of touch
and force feedback mechanisms. Also the implementation of this mechanism by means of haptic rendering and
contact detection were discussed. Further we explained the storage and retrieval of haptic data while working with
haptic devices.
Optimized Feature Extraction and Actionable Knowledge Discovery for Customer ...Eswar Publications
In today’s dynamic marketplace, telecommunication organizations, both private and public, are increasingly leaving antiquated marketing philosophies and strategies to the adoption of more customer-driven initiatives that seek to understand, attract, retain and build intimate long term relationship with profitable customers. This paradigm shift has undauntedly led to the growing interest in Customer Relationship Management (CRM) initiatives that aim at ensuring customer identification and interactions. The urgent market requirement is to identify automated methods that can assist businesses in the complex task of predicting customer churning.
The immediate requirement of the market is to have systems that can perform accurate
(i) identification of loyal customers (so that companies can offer more services to retain them)
(ii) prediction of churners to ensure that only the customers who are planning to switch their service
providers are being targeted for retention
A Framework for Visualizing Association Mining Resultsertekg
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/a-framework-for-visualizing-association-mining-results/
Association mining is one of the most used data mining techniques due to interpretable and actionable results. In this study we pro-pose a framework to visualize the association mining results, speci¯cally frequent itemsets and association rules, as graphs. We demonstrate the applicability and usefulness of our approach through a Market Basket Analysis (MBA) case study where we visually explore the data mining results for a supermarket data set. In this case study we derive several
interesting insights regarding the relationships among the items and sug-gest how they can be used as basis for decision making in retailing.
Linking Behavioral Patterns to Personal Attributes through Data Re-Miningertekg
Download Link >https://ertekprojects.com/gurdal-ertek-publications/blog/linking-behavioral-patterns-to-personal-attributes-through-data-re-mining/
A fundamental challenge in behavioral informatics is the development of methodologies and systems that can achieve its goals and tasks, including be-havior pattern analysis. This study presents such a methodology, that can be con-verted into a decision support system, by the appropriate integration of existing tools for association mining and graph visualization. The methodology enables the linking of behavioral patterns to personal attributes, through the re-mining of colored association graphs that represent item associations. The methodology is described and mathematically formalized, and is demonstrated in a case study related with retail industry.
Latent semantic analysis and cosine similarity for hadith search engineTELKOMNIKA JOURNAL
Search engine technology was used to find information as needed easily, quickly and efficiently, including in searching the information about the hadith which was a second guideline of life for muslim besides the Holy Qur'an. This study was aim to build a specialized search engine to find information about a complete and eleven hadith in Indonesian language. In this research, search engines worked by using latent semantic analysis (LSA) and cosine similarity based on the keywords entered. The LSA and cosine similarity methods were used in forming structured representations of text data as well as calculating the similarity of the keyword text entered with hadith text data, so the hadith information was issued in accordance with what was searched. Based on the results of the test conducted 50 times, it indicated that the LSA and cosine similarity had a success rate in finding high hadith information with an average recall value was 87.83%, although from all information obtained level of precision hadith was found semantically not many, it was indicated by the average precision value was 36.25%.
An Efficient Framework for Predicting and Recommending M-Commerce Patterns Ba...Editor IJCATR
Mobile Commerce, also known as M-Commerce or mCommerce, is the ability to conduct commerce using a mobile device. Research is done by Mining and Prediction of Mobile Users’ Commerce Behaviors such as their purchase transactions. The problem of PMCP-Mine algorithm has been overcome by the efficient framework based graph diffusion method. The main objective is to construct the graph based diffusion method. Graph is constructed for the items purchased by the Mobile users and then finding the frequently purchased item. By using ranking method, we are ranking the items based on the transactions. Then, by analyzing the mobile users behavior and recommending the ranked items. This framework produces more efficient and accurate item recommendation than the MCE framework.
An Improved Support Vector Machine Classifier Using AdaBoost and Genetic Algo...Eswar Publications
Predicting the objective of internet users has divergent applications in the areas such as e-commerce, entertainment in online, and several internet-based applications. The critical part of the classifying internet queries based on obtainable features namely contextual information, keywords and their semantic relationships. This research paper presents an improved support vector machine classifier that makes use of ad boost genetic algorithmic approach towards web interaction mining. Around 31 participants are chosen and given topics to
search web contents. Parameters such as precision, recall and F1 score are taken for comparing the proposed classifier with the classical support vector machine. Results proved that the proposed classifier achieves better performance than that of the conventional SVM.
Paper Explained: Deep learning framework for measuring the digital strategy o...Devansh16
Companies today are racing to leverage the latest digital technologies, such as artificial intelligence, blockchain, and cloud computing. However, many companies report that their strategies did not achieve the anticipated business results. This study is the first to apply state-of-the-art NLP models to unstructured data to understand the different clusters of digital strategy patterns that companies are adopting. We achieve this by analyzing earnings calls from Fortune Global 500 companies between 2015 and 2019. We use a Transformer-based architecture for text classification, which shows a better understanding of the conversation context. We then investigate digital strategy patterns by applying clustering analysis. Our findings suggest that Fortune 500 companies use four distinct strategies: product led, customer experience led, service led, and efficiency led. This work provides an empirical baseline for companies and researchers to enhance our understanding of the field.
A LINK-BASED APPROACH TO ENTITY RESOLUTION IN SOCIAL NETWORKScsandit
Social networks were initially places for people to contact each other and find friends or new acquaintances, and as such they have long proved interesting for machine-aided analysis. Recent developments, however, have pivoted social networks to being among the main fields of information exchange, opinion expression, and debate. As a result, there is growing interest in both analyzing and integrating social network services. In this environment, efficient information retrieval is hindered by the vast amount and varying quality of user-generated content. Guiding users to relevant information is a valuable service and also a difficult task, in which a crucial part of the process is accurately resolving duplicate entities to real-world ones. In this paper we propose a novel approach that utilizes the principles of link mining to extend the methodology of entity resolution to multitype problems. The proposed method is presented using an illustrative social-network-based real-world example and validated by comprehensive evaluation of the results.
Abstract—Since the demand for information retrieval is increasing quickly, indexing structures have become an important means of supporting fast information retrieval. This paper introduces a new data structure for information retrieval called the Dynamic Ordered Multi-field Index (DOMI). It is based on radix trees organized in segments, together with a hash table that points to the root of each segment, where each segment is dedicated to storing the values of a single field. The hash table is used to access the needed segments directly without traversing the other segments, so DOMI improves look-up performance for queries addressing a single field. For queries addressing multiple fields, each radix-tree segment is traversed sequentially without visiting unrelated branches. The segmentation in DOMI also provides flexibility for minimizing communication overhead in distributed systems: every field is represented by one segment, and each segment can be stored as one block. In addition, DOMI consumes less space compared to indexes built using B or B+ trees, making it more suitable for data-intensive settings such as Big Data.
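The segment-per-field layout can be sketched as a hash table of per-field tries. This is a simplified stand-in for DOMI's radix-tree segments (a plain character trie instead of a compressed radix tree); the class and field names are invented.

```python
class Trie:
    """Minimal trie segment storing record ids per key."""
    def __init__(self):
        self.root = {}

    def insert(self, key, record_id):
        node = self.root
        for ch in key:
            node = node.setdefault(ch, {})
        node.setdefault("$", []).append(record_id)  # "$" marks end of key

    def lookup(self, key):
        node = self.root
        for ch in key:
            if ch not in node:
                return []
            node = node[ch]
        return node.get("$", [])

class MultiFieldIndex:
    """One trie segment per field, reached through a hash table (dict)."""
    def __init__(self, fields):
        self.segments = {f: Trie() for f in fields}

    def insert(self, record_id, record):
        for field, value in record.items():
            self.segments[field].insert(value, record_id)

    def query(self, field, value):
        # the hash table jumps straight to the right segment,
        # so other fields' trees are never traversed
        return self.segments[field].lookup(value)

index = MultiFieldIndex(["name", "city"])
index.insert(1, {"name": "alice", "city": "oslo"})
index.insert(2, {"name": "bob", "city": "oslo"})
```

A single-field query touches exactly one segment, which is the look-up advantage the abstract claims over a single combined tree.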
PERFORMANCE ANALYSIS OF MODIFIED ALGORITHM FOR FINDING MULTILEVEL ASSOCIATION R...cseij
Multilevel association rules explore the concept hierarchy at multiple levels, which provides more specific information, whereas the Apriori algorithm explores only single-level association rules. Many implementations of the Apriori algorithm are available. A fast Apriori implementation is modified here to develop a new algorithm for finding multilevel association rules. In this study the performance of the new algorithm is analyzed in terms of running time in seconds.
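Single-level Apriori, the baseline the modified algorithm builds on, can be sketched as follows. The sketch shows candidate generation with the join and prune steps; the transactions are invented and the multilevel (concept hierarchy) extension is omitted.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return frequent itemsets (as frozensets) with their support counts."""
    baskets = [set(t) for t in transactions]
    items = {i for b in baskets for i in b}
    frequent = {}
    candidates = [frozenset([i]) for i in items]
    k = 1
    while candidates:
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # join step: size-k candidates from frequent (k-1)-itemsets
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # prune step: every (k-1)-subset must itself be frequent
        candidates = [c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# invented market-basket transactions
tx = [["milk", "bread"], ["milk", "bread", "butter"], ["bread", "butter"]]
result = apriori(tx, min_support=2)
```

The multilevel variant the paper studies runs this same loop per level of the concept hierarchy (e.g. "dairy" before "milk"), usually with a reduced support threshold at deeper levels.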
Service Level Comparison for Online Shopping using Data MiningIIRindia
Knowledge discovery in databases (KDD) is the analysis step of data mining; the goal of data mining is to extract knowledge and patterns from large data sets, not the data extraction itself. Big-data computing is a critical challenge for the ICT industry, with engineers and researchers dealing with petabyte data sets in the cloud computing paradigm, so the demand for a service stack to distribute, manage, and process massive data sets has risen drastically. We investigate the problem of a single source node broadcasting a big chunk of data to a set of nodes so as to minimize the maximum completion time; these nodes may be located in the same datacenter or across geo-distributed data centers. The big-data broadcasting problem is modeled as a LockStep Broadcast Tree (LSBT) problem. The main idea of LSBT is to define a basic unit of upload bandwidth r, such that a node with capacity c broadcasts data to a set of ⌊c/r⌋ children at rate r; note that r is a parameter to be optimized as part of the LSBT problem. The broadcast data are further divided into m chunks, which can then be broadcast down the LSBT in a pipelined manner. In a homogeneous network environment in which each node has the same upload capacity c, the optimal uplink rate r of the LSBT is either c/2 or c/3, whichever gives the smaller maximum completion time. For heterogeneous environments, an O(n log2 n) algorithm is presented to select an optimal uplink rate r and to construct an optimal LSBT. The numerical results show better performance, with lower computational complexity and low maximum completion time. The methodology includes building and broadcasting various web applications, followed by the gateway application and batch processing over the TSV data, after which web crawling for resources and the MapReduce process take place, and finally products are picked from the recommendations and purchased.
An efficient enhanced k-means clustering algorithm for best offer prediction...IJECEIAES
Telecom companies usually offer several rate plans or bundles to satisfy customers' different needs. Finding and recommending the best offer that matches a customer's needs is crucial to maintaining customer loyalty and the company's revenue in the long run. This paper presents an effective method of detecting a group of customers who have the potential to upgrade their telecom package. The data used is an actual dataset extracted from the call detail records (CDRs) of a telecom operator. The method utilizes an enhanced k-means clustering model based on customer profiling. The results show that the proposed k-means-based clustering algorithm identifies potential customers willing to upgrade to a higher-tier package more effectively than the traditional k-means algorithm: the proposed clustering model's accuracy was over 90%, while the traditional k-means accuracy was under 70%.
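The clustering step can be illustrated with plain Lloyd's k-means on a one-dimensional usage profile. This is a minimal sketch; the paper's enhanced, profile-based variant and the actual CDR features are not reproduced, and the usage numbers are invented.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on 1-D points; returns centroids and labels."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # random initial centroids
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[j].append(p)
        # update step: centroid moves to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [min(range(k), key=lambda i: abs(p - centroids[i])) for p in points]
    return centroids, labels

# invented monthly data usage in GB; two natural groups: light and heavy users
usage = [1.0, 1.2, 0.8, 9.5, 10.1, 9.9]
centroids, labels = kmeans(usage, k=2)
```

The heavy-usage cluster is the natural upgrade-candidate group; the paper's enhancement lies in how customers are profiled into the feature space before clustering.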
E-commerce Website Recommender System Based on Dissimilarity and Association ...Nooria Sukmaningtyas
Analyzing current electronic commerce recommendation algorithms, this paper puts forward a recommendation algorithm that combines dissimilarity-based clustering with association rules. The algorithm clusters website shopping user data using a dissimilarity measure and then applies the association rule algorithm to the clustering results to generate recommendations. Experiments show that, compared with the traditional clustering association algorithm, the proposed algorithm requires fewer iterations and improves operational efficiency. The method is further validated against actual users' purchases of the recommended items, providing evidence of the algorithm's effectiveness in recommendation.
Processing the data generated from everyday transactions, which can amount to thousands of records per day, requires software that enables users to search the necessary data; data mining is a solution to this problem. To that end, many large vendors have created software for such data processing. Because data mining software from the big vendors is costly, communities such as universities provide open-source alternatives for users who simply want to learn or deepen their knowledge of data mining, while many commercial vendors market their own products. WEKA and Salford Systems are both data mining software packages, each with its advantages and disadvantages. This study compares them using several attributes so that users can select which software is more suitable for their daily activities.
International Journal of Computational Engineering Research (IJCER) is an international online journal published monthly in English. The journal publishes original research work that contributes significantly to furthering scientific knowledge in engineering and technology.
These are our contributions to Data Science projects, as developed at our startup. They are part of partner trainings and of the in-house design, development, and testing of course material and concepts in Data Science and Engineering, covering data ingestion, data wrangling, feature engineering, data analysis, data storage, data extraction, querying, and formatting and visualizing data for various dashboards. Data is prepared for accurate ML model predictions and generative AI apps.
This is our project work at our startup for Data Science. It is part of our internal training and focuses on data management for AI, ML, and generative AI apps.
International Journal of Computational Engineering Research(IJCER)ijceronline
Square transposition: an approach to the transposition process in block cipherjournalBEEI
The transposition process is needed in cryptography to create a diffusion effect in the data encryption standard (DES) and advanced encryption standard (AES) algorithms, the standard information security algorithms of the National Institute of Standards and Technology. The problem with the DES and AES algorithms is that their transposition index values form patterns rather than random values. This condition makes it easier for a cryptanalyst to look for relationships between ciphertexts, because some processes are predictable. This research designs a transposition algorithm called square transposition. Each process uses an 8 × 8 square as the place to insert and retrieve 64 bits. Pairing an input scheme and a retrieval scheme with unequal flows is an important factor in producing a good transposition. The square transposition can generate random, pattern-free indices, so transposition can be done better than in DES and AES.
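The insert/retrieve idea can be sketched as follows. The boustrophedon read-out scheme below is invented for illustration and is not the paper's actual pairing of schemes; the point is only that writing with one flow and reading with an unequal flow permutes the 64 bit positions.

```python
def transpose_block(bits, read_order):
    """Write 64 bits row-by-row into an 8x8 square, then read them back
    in the order given by read_order (a permutation of (row, col) pairs)."""
    assert len(bits) == 64
    square = [bits[r * 8:(r + 1) * 8] for r in range(8)]
    return [square[r][c] for r, c in read_order]

# illustrative retrieval scheme (not the paper's): columns left to right,
# alternating top-down and bottom-up (a boustrophedon walk)
read_order = [(r if c % 2 == 0 else 7 - r, c) for c in range(8) for r in range(8)]

block = list(range(64))            # stand-in for 64 plaintext bits
cipher = transpose_block(block, read_order)
```

Because `read_order` is a permutation, the operation is lossless and invertible, which is the property any block-cipher transposition layer must preserve.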
Hyper-parameter optimization of convolutional neural network based on particl...journalBEEI
Deep neural networks have made enormous progress in tackling many problems. More specifically, the convolutional neural network (CNN) is a category of deep network that has become a dominant technique in computer vision tasks. Although these deep neural networks are highly effective, the ideal structure is still an issue that needs much investigation. A deep CNN model is usually designed manually through trial and repeated testing, which enormously constrains its application. Many hyper-parameters of the CNN can affect model performance, including the depth of the network, the number of convolutional layers, and the number of kernels and their sizes. It is therefore a huge challenge to design an appropriate CNN model that uses optimized hyper-parameters and reduces reliance on manual involvement and domain expertise. In this paper, a design architecture method for CNNs is proposed that utilizes the particle swarm optimization (PSO) algorithm to learn the optimal CNN hyper-parameter values. In the experiments, we used the Modified National Institute of Standards and Technology (MNIST) database of handwritten digits. The experiments showed that our proposed approach can find an architecture that is competitive with state-of-the-art models, with a testing error of 0.87%.
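The PSO loop for hyper-parameter search can be sketched with a cheap surrogate objective standing in for CNN training, since training a real network per particle is far too expensive for an illustration. The surrogate function, the bounds, and the PSO coefficients below are all assumptions, not the paper's values.

```python
import random

def surrogate_error(lr, n_kernels):
    """Invented stand-in for CNN validation error; real use would
    train a model with these hyper-parameters and return its error."""
    return (lr - 0.01) ** 2 * 1e4 + (n_kernels - 32) ** 2 * 1e-3

def pso(objective, bounds, n_particles=20, iters=60, seed=1):
    """Standard global-best PSO with inertia 0.7 and c1 = c2 = 1.5."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(*p) for p in pos]
    gbest = pbest[min(range(n_particles), key=lambda i: pbest_val[i])][:]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                # clamp positions to the search bounds
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            val = objective(*pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < objective(*gbest):
                    gbest = pos[i][:]
    return gbest

# assumed search space: learning rate and kernel count (kept continuous here;
# a real run would round the kernel count to an integer)
best = pso(surrogate_error, bounds=[(1e-4, 0.1), (8, 64)])
```

Each particle is one candidate hyper-parameter setting; in the paper's setting the expensive part is that evaluating a particle means training and validating a CNN.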
Supervised machine learning based liver disease prediction approach with LASS...journalBEEI
In this contemporary era, machine learning techniques are increasingly used in medical science for detecting diseases such as liver disease (LD), a deadly disease from which a large number of people die around the globe. By diagnosing the disease at a primary stage, early treatment can help cure the patient. In this paper, a method is proposed to diagnose LD using supervised machine learning classification algorithms, namely logistic regression, decision tree, random forest, AdaBoost, KNN, linear discriminant analysis, gradient boosting, and support vector machine (SVM). We also deployed the least absolute shrinkage and selection operator (LASSO) feature selection technique on the dataset to identify the attributes most highly correlated with LD. The predictions made by the algorithms with 10-fold cross-validation (CV) are evaluated in terms of accuracy, sensitivity, precision, and F1-score. The decision tree algorithm performs best, with accuracy, precision, sensitivity, and F1-score of 94.295%, 92%, 99%, and 96% respectively with the inclusion of LASSO. Furthermore, a comparison with recent studies is shown to demonstrate the significance of the proposed system.
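The 10-fold cross-validation protocol used above can be sketched as follows. A nearest-centroid classifier stands in for the paper's eight algorithms, and the toy one-feature data replaces the liver-disease dataset; both are invented for illustration.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

def nearest_centroid_predict(X_train, y_train, x):
    """Predict the class whose feature centroid is closest to x."""
    classes = sorted(set(y_train))
    centroids = {}
    for c in classes:
        rows = [X_train[i] for i in range(len(y_train)) if y_train[i] == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(classes, key=lambda c: dist(x, centroids[c]))

# invented, well-separated one-feature "patient" data
X = [[float(i)] for i in range(10)] + [[100.0 + i] for i in range(10)]
y = [0] * 10 + [1] * 10

correct = 0
for train, test in k_fold_indices(len(X), k=10):
    Xt = [X[i] for i in train]
    yt = [y[i] for i in train]
    correct += sum(nearest_centroid_predict(Xt, yt, X[i]) == y[i] for i in test)
accuracy = correct / len(X)
```

Every sample is tested exactly once while never appearing in its own training fold, which is what makes the reported accuracy an honest estimate; LASSO would be fitted inside each training fold before classification.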
A secure and energy saving protocol for wireless sensor networksjournalBEEI
The wireless sensor network (WSN) research domain has been studied extensively thanks to innovative technologies and research directions addressing the usability of WSNs under various schemes. This domain permits dependable monitoring of a diversity of environments for both military and civil applications. The key management mechanism is a primary protocol for keeping the privacy and confidentiality of the data transmitted among the different sensor nodes in a WSN. Since the nodes are small, they are intrinsically limited by inadequate resources such as battery lifetime and memory capacity. The proposed secure and energy saving protocol (SESP) for wireless sensor networks has a significant impact on overall network lifetime and energy dissipation. To encrypt sent messages, SESP uses the concept of public-key cryptography and relies on the sensor nodes' identities (IDs) to prevent message replay, so that the security goals of authentication, confidentiality, integrity, availability, and freshness are achieved. Finally, simulation results show that the proposed approach achieved better energy consumption and network lifetime than the low-energy adaptive clustering hierarchy (LEACH) protocol: sensors die after 900 rounds under the proposed SESP protocol, versus 750 rounds under the LEACH scheme.
Plant leaf identification system using convolutional neural networkjournalBEEI
This paper proposes a leaf identification system using a convolutional neural network (CNN). The proposed system can identify five types of local Malaysian leaves: acacia, papaya, cherry, mango, and rambutan. Using a CNN from deep learning, the network is trained for image classification on a database of leaf images captured by mobile phone. ResNet-50 was the architecture used for the neural network image classification and for training the network for leaf identification. Recognizing leaf photographs requires several steps, starting with image pre-processing, feature extraction, plant identification, matching, and testing, and finally extracting the results, all achieved in MATLAB. The testing sets consist of three types of images: white background, noise added, and random background. Finally, an interface for the leaf identification system was developed as the end software product using MATLAB App Designer. The accuracy achieved for each training set on the five leaf classes was above 98%, so the recognition process was successfully implemented.
Customized moodle-based learning management system for socially disadvantaged...journalBEEI
This study aims to develop a Moodle-based LMS with customized learning content and a modified user interface to facilitate pedagogical processes during the COVID-19 pandemic, and to investigate how teachers of socially disadvantaged schools perceived its usability and technology acceptance. A co-design process was conducted with two activities: 1) a needs assessment phase using an online survey and interview sessions with the teachers, and 2) the development phase of the LMS. The system was evaluated by 30 teachers from socially disadvantaged schools for relevance to their distance learning activities. We employed the computer software usability questionnaire (CSUQ) to measure perceived usability, and the technology acceptance model (TAM) with its 3 original variables (i.e., perceived usefulness, perceived ease of use, and intention to use) and 5 external variables (i.e., attitude toward the system, perceived interaction, self-efficacy, user interface design, and course design). The average CSUQ rating exceeded 5.0 on a 7-point scale, indicating that teachers agreed that the information quality, interaction quality, and user interface quality were clear and easy to understand. The TAM results concluded that the LMS design was usable, interactive, and well developed. Teachers reported an effective user interface that allows effective teaching operations, leading to prompt adoption of the system.
Understanding the role of individual learner in adaptive and personalized e-l...journalBEEI
Dynamic learning environments have emerged as a powerful platform in modern e-learning systems. Constantly changing learning situations have forced learning platforms to adapt and personalize their learning resources for students. Evidence suggests that adaptation and personalization of e-learning systems (APLS) can be achieved by utilizing learner modeling, domain modeling, and instructional modeling. In the APLS literature, questions have been raised about which individual characteristics are relevant for adaptation; with several options available, a further problem arises in that the student attributes used in APLS often overlap and are unrelated across studies. Therefore, this study proposes a list of learner model attributes in dynamic learning to support adaptation and personalization. The study was conducted by exploring concepts from literature selected according to best-fit criteria. We then describe the important concepts in student modeling and provide definitions and examples of the data values that researchers have used. We also discuss the implementation of the selected learner model in providing adaptation in dynamic learning.
Prototype mobile contactless transaction system in traditional markets to sup...journalBEEI
One way to prevent and reduce the spread of the COVID-19 pandemic is through physical distancing. This research aims to develop a prototype contactless transaction system, using digital payment mechanisms and QR code technology, to be applied in traditional markets. The electronic market system was developed using a prototyping approach. QR codes and digital payments are applied as a solution to minimize the cash-exchange contact that is common in traditional markets. The results show that the system was able to accelerate and facilitate buying and selling transactions in a traditional market environment. Alpha testing shows that all system functions run well, while beta testing shows that users accept the system very well. The results also show acceptance of the usefulness of the system and the optimism of its users about taking advantage of it both technologically and functionally, so it can be part of the digital transformation of the traditional market into an electronic market and one of the solutions for reducing the spread of the current COVID-19 pandemic.
Wireless HART stack using multiprocessor technique with laxity algorithmjournalBEEI
The demarcation of industrial wireless sensor network (IWSN) stacks requires the use of a real-time operating system (RTOS). In the industrial world, a vast number of sensors are utilised to gather various types of data, and the data gathered by the sensors cannot be prioritised ahead of time because all of the information is equally essential. As a result, a protocol stack is employed to guarantee that data is acquired and processed fairly; in IWSN, the protocol stack is implemented using an RTOS. The data collected from IWSN sensor nodes is processed using non-preemptive scheduling and the protocol stack, and then sent in parallel to the IWSN's central controller. The RTOS mediates between hardware and software. Packets must be sent at a certain time, and some packets may collide during transmission; this project is undertaken to get around such collisions. As a prototype, the project is divided into two parts: the first uses an RTOS and the LPC2148 as the master node, while the second serves as a standard data collection node to which sensors are attached. Any controller may be used in the second part, depending on the situation. WirelessHART allows the two nodes to communicate with each other.
Implementation of double-layer loaded on octagon microstrip yagi antennajournalBEEI
A double layer loaded on an octagon microstrip Yagi antenna (OMYA) in the 5.8 GHz industrial, scientific and medical (ISM) band is investigated in this paper. The double layer consists of two double positive (DPS) substrates. The OMYA overlaid with the double-layer configuration was simulated, fabricated, and measured, and good agreement was observed between the computed and measured gain of the antenna. The comparison results show that a 2.5 dB improvement in OMYA gain can be obtained by applying the double layer on top of the OMYA, while the measured bandwidth of the OMYA with the double layer is 14.6%. This indicates that the double layer can be used to increase OMYA performance in terms of gain and bandwidth.
The calculation of the field of an antenna located near the human headjournalBEEI
In this work, a numerical calculation was carried out in one of the universal programs for automatic electrodynamic design. The calculation is aimed at obtaining numerical values for the specific absorption rate (SAR); it is the SAR value that can be used to determine the effect of a wireless device's antenna on biological objects. The dipole parameters were selected for GSM1800. Investigation of the influence of the distance to a cell phone on radiation shows that the effect of electromagnetic radiation absorbed in a person's head on the brain decreases by a factor of three. This is a very important result: the SAR value decreased by almost three times, which is an acceptable result.
Exact secure outage probability performance of uplink-downlink multiple access...journalBEEI
In this paper, we study uplink-downlink non-orthogonal multiple access (NOMA) systems, considering secure performance at the physical layer. In the considered system model, the base station acts as a relay to allow two users on the left side to communicate with two users on the right side. Under imperfect channel state information (CSI), secure performance needs to be studied, since an eavesdropper wants to overhear the signals processed on the downlink. To provide a secure performance metric, we derive exact expressions for the secrecy outage probability (SOP) and evaluate the impact of the main parameters on the SOP metric. The important finding is that higher secrecy performance can be achieved at high signal-to-noise ratio (SNR). Moreover, the numerical results demonstrate that the SOP tends to a constant at high SNR. Finally, our results show that the power allocation factors and target rates are the main factors affecting the secrecy performance of the considered uplink-downlink NOMA systems.
Design of a dual-band antenna for energy harvesting applicationjournalBEEI
This report presents an investigation into improving a current dual-band antenna to achieve better antenna parameters for energy harvesting applications, and into developing and validating a new design operating at 2.4 GHz and 5.4 GHz. At 5.4 GHz, more data can be transmitted than at 2.4 GHz; however, 2.4 GHz has a longer radiation range, so it can be used far from the antenna module, whereas 5 GHz has a short radiation range. The development of this project includes designing and testing the antenna using Computer Simulation Technology (CST) 2018 software and vector network analyzer (VNA) equipment. In the design process, the fundamental parameters of the antenna are measured and validated in order to identify the better antenna performance.
Transforming data-centric eXtensible markup language into relational database...journalBEEI
eXtensible markup language (XML) appeared internationally as the format for data representation over the web. Yet, most organizations are still utilising relational databases as their database solutions. As such, it is crucial to provide seamless integration via effective transformation between these database infrastructures. In this paper, we propose XML-REG to bridge these two technologies based on node-based and path-based approaches. The node-based approach is good to annotate each positional node uniquely, while the path-based approach provides summarised path information to join the nodes. On top of that, a new range labelling is also proposed to annotate nodes uniquely by ensuring the structural relationships are maintained between nodes. If a new node is to be added to the document, re-labelling is not required as the new label will be assigned to the node via the new proposed labelling scheme. Experimental evaluations indicated that the performance of XML-REG exceeded XMap, XRecursive, XAncestor and Mini-XML concerning storing time, query retrieval time and scalability. This research produces a core framework for XML to relational databases (RDB) mapping, which could be adopted in various industries.
Key performance requirement of future next wireless networks (6G)journalBEEI
Given the massive potentials of 5G communication networks and their foreseeable evolution, what should there be in 6G that is not in 5G or its long-term evolution? 6G communication networks are estimated to integrate the terrestrial, aerial, and maritime communications into a forceful network which would be faster, more reliable, and can support a massive number of devices with ultra-low latency requirements. This article presents a complete overview of potential 6G communication networks. The major contribution of this study is to present a broad overview of key performance indicators (KPIs) of 6G networks that cover the latest manufacturing progress in the environment of the principal areas of research application, and challenges.
Noise resistance territorial intensity-based optical flow using inverse confi...journalBEEI
This paper presents the use of the inverse confidential technique on a bilateral function with territorial intensity-based optical flow to prove its effectiveness in noisy environments. In general, an image's motion vector is coded by the technique called optical flow, in which sequences of images are used to determine the motion vector. However, the accuracy of the motion vector is reduced when the source image sequences are corrupted by noise. This work proves that the inverse confidential technique on a bilateral function can increase the accuracy of motion vector determination by territorial intensity-based optical flow in noisy environments. We tested with several kinds of non-Gaussian noise on several standard image-sequence patterns, analyzing the resulting motion vectors in the form of the error vector magnitude (EVM) and comparing them with several noise resistance techniques in the territorial intensity-based optical flow method.
Modeling climate phenomenon with software grids analysis and display system i...journalBEEI
This study aims to model climate change based on rainfall, air temperature, pressure, humidity, and wind using GrADS software, and to create a global warming module. The research uses the 3D (define, design, and develop) model. The modeling results for the five climate elements are as follows: the annual average temperature in Indonesia in 2009-2015 was between 29 °C and 30.1 °C; the horizontal distribution of the annual average pressure in Indonesia in 2009-2018 was between 800 mBar and 1000 mBar; the horizontal distribution of the average annual humidity in Indonesia ranged between 27-57 in 2009 and 2011 and between 30-60 in 2012-2015, 2017, and 2018; and during the east monsoon, wind circulation moved from northern Indonesia to southern Indonesia, while during the west monsoon it moved from the southern part of Indonesia to the northern part. The global warming module produced for SMA/MA is feasible to use, in accordance with the validator's score of 69, which is in the appropriate category, and with teacher and student responses of 91% on a questionnaire.
An approach of re-organizing input dataset to enhance the quality of emotion ...journalBEEI
The purpose of this paper is to propose an approach of re-organizing input data to recognize emotion based on short signal segments and increase the quality of emotional recognition using physiological signals. MIT's long physiological signal set was divided into two new datasets, with shorter and overlapped segments. Three different classification methods (support vector machine, random forest, and multilayer perceptron) were implemented to identify eight emotional states based on statistical features of each segment in these two datasets. By re-organizing the input dataset, the quality of recognition results was enhanced. The random forest shows the best classification result among three implemented classification methods, with an accuracy of 97.72% for eight emotional states, on the overlapped dataset. This approach shows that, by re-organizing the input dataset, the high accuracy of recognition results can be achieved without the use of EEG and ECG signals.
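The re-organization into shorter, overlapping segments with per-segment statistical features can be sketched in a few lines. The window length, overlap, and signal values below are invented; the paper's feature set and the MIT dataset are not reproduced.

```python
import statistics

def segment(signal, length, overlap):
    """Split a signal into fixed-length windows with the given overlap."""
    step = length - overlap
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, step)]

def features(window):
    """Simple per-window statistical features (mean, std, min, max)."""
    return [statistics.mean(window), statistics.pstdev(window),
            min(window), max(window)]

# invented stand-in for a physiological signal
signal = [0.1, 0.4, 0.35, 0.8, 0.75, 0.9, 0.2, 0.1]
windows = segment(signal, length=4, overlap=2)
X = [features(w) for w in windows]   # one feature row per window
```

Overlapping windows multiply the number of training rows extracted from the same recording, which is the mechanism behind the accuracy gain the abstract reports.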
Parking detection system using background subtraction and HSV color segmentationjournalBEEI
Manual system vehicle parking makes finding vacant parking lots difficult, so it has to check directly to the vacant space. If many people do parking, then the time needed for it is very much or requires many people to handle it. This research develops a real-time parking system to detect parking. The system is designed using the HSV color segmentation method in determining the background image. In addition, the detection process uses the background subtraction method. Applying these two methods requires image preprocessing using several methods such as grayscaling, blurring (low-pass filter). In addition, it is followed by a thresholding and filtering process to get the best image in the detection process. In the process, there is a determination of the ROI to determine the focus area of the object identified as empty parking. The parking detection process produces the best average accuracy of 95.76%. The minimum threshold value of 255 pixels is 0.4. This value is the best value from 33 test data in several criteria, such as the time of capture, composition and color of the vehicle, the shape of the shadow of the object’s environment, and the intensity of light. This parking detection system can be implemented in real-time to determine the position of an empty place.
Quality of service performances of video and voice transmission in universal ...journalBEEI
The universal mobile telecommunications system (UMTS) has distinct benefits in that it supports a wide range of quality of service (QoS) criteria that users require in order to fulfill their requirements. The transmission of video and audio in real-time applications places a high demand on the cellular network, therefore QoS is a major problem in these applications. The ability to provide QoS in the UMTS backbone network necessitates an active QoS mechanism in order to maintain the necessary level of convenience on UMTS networks. For UMTS networks, investigation models for end-to-end QoS, total transmitted and received data, packet loss, and throughput providing techniques are run and assessed and the simulation results are examined. According to the results, appropriate QoS adaption allows for specific voice and video transmission. Finally, by analyzing existing QoS parameters, the QoS performance of 4G/UMTS networks may be improved.
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERSveerababupersonal22
It consists of cw radar and fmcw radar ,range measurement,if amplifier and fmcw altimeterThe CW radar operates using continuous wave transmission, while the FMCW radar employs frequency-modulated continuous wave technology. Range measurement is a crucial aspect of radar systems, providing information about the distance to a target. The IF amplifier plays a key role in signal processing, amplifying intermediate frequency signals for further analysis. The FMCW altimeter utilizes frequency-modulated continuous wave technology to accurately measure altitude above a reference point.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Marketplace affiliates potential analysis using cosine similarity and vision-based page segmentation
Bulletin of Electrical Engineering and Informatics
Vol. 9, No. 6, December 2020, pp. 2492~2498
ISSN: 2302-9285, DOI: 10.11591/eei.v9i6.2018
Journal homepage: http://beei.org
Marketplace affiliates potential analysis using cosine similarity
and vision-based page segmentation
Wildan Budiawan Zulfikar1, Mohamad Irfan2, Muhammad Ghufron3, Jumadi4, Esa Firmansyah5
1,3 Department of Informatics, UIN Sunan Gunung Djati, Indonesia
2 Department of ICT, Asia E University, Malaysia
4 School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Indonesia
5 Department of Informatics, STMIK Sumedang, Indonesia
Article history: Received Aug 15, 2019; Revised Jan 28, 2020; Accepted Mar 1, 2020

ABSTRACT
One success factor of an online affiliate is determined by the quality of the content source. Therefore, affiliate marketplaces need an objective assessment to retrieve the content data used to choose the right product under the appropriate product filter. Usually, the selection is not made with a sound, measurable system, so product content is chosen subjectively, based only on what is seen. Analyzed with a sound and measurable system instead, the selection produces objective product content and can have a positive impact on users, because it is based on factual data. The purpose of this research is to analyze the potential of the affiliate marketplace by combining cosine similarity with vision-based page segmentation. This combination is a new approach to retrieving the best content according to the required criteria. The work produces a set of product recommendations that are suitable for publication and can then be used for comparison against the required criteria. At the limited evaluation stage, the proposed model obtained satisfactory performance: all five tested queries returned the expected results.
Keywords:
Cosine similarity
Marketplace affiliates
Page segmentation
Vision
Web scraping
This is an open access article under the CC BY-SA license.
Corresponding Author:
Wildan Budiawan Zulfikar,
Department of Informatics,
UIN Sunan Gunung Djati,
105th A.H. Nasution Street, Bandung, 40614, Indonesia.
Email: wildan.b@uinsgd.ac.id
1. INTRODUCTION
Nowadays, information technology has created new kinds of business opportunity, and more and more business transactions are made online, so anyone can easily carry out buying and selling transactions [1-3]. Many companies offer a variety of products through this medium [4, 5]. One benefit of the internet is as a medium for promoting a product: a product that is online can bring huge benefits to entrepreneurs because it becomes known throughout the world [4, 6].
Web scraping is the process of extracting information from a website. It is the alternative chosen when the required data is not available through an API or another source such as a shared database or data warehouse, or when no API is provided at all [7-12]. This research used product attribute data obtained from several marketplace affiliates through web scraping, specifically the vision-based page segmentation method. Vision-based page segmentation is an algorithm that partitions a web page into blocks based on its visual structure and metadata. Based on previous research, this method of extracting tag-tree data can detect content
structures quickly [13, 14]. It transforms the deep web into a visual tree [13, 15]. The result is divided into several segments that can be processed with a DOM parser before finally being processed and modeled [13].
In addition, the proposed model applies cosine similarity and TF-IDF. Cosine similarity is an algorithm for comparing the similarity between documents; in this case, a query is compared with a training document [16-18]. To calculate cosine similarity, first take the scalar (dot) product of the query and document vectors; then compute the length of the document vector and the length of the query vector, each as the square root of its sum of squares [16, 19-23]. Finally, the scalar product is divided by the product of the document and query lengths.
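The computation described above can be sketched in a few lines of Python (a stdlib-only illustration; the term weights below are hypothetical values, not data from this work):

```python
import math

def cosine_similarity(query, doc):
    """Cosine similarity between two term-weight vectors given as dicts."""
    # Scalar (dot) product between the query and the document.
    dot = sum(w * doc.get(term, 0.0) for term, w in query.items())
    # Vector lengths: square root of the sum of squared weights.
    q_len = math.sqrt(sum(w * w for w in query.values()))
    d_len = math.sqrt(sum(w * w for w in doc.values()))
    if q_len == 0.0 or d_len == 0.0:
        return 0.0
    # Dot product divided by the product of the two lengths.
    return dot / (q_len * d_len)

# Hypothetical TF-IDF weights for a query and one training document.
query = {"samsung": 0.5, "galaxy": 0.5}
doc = {"samsung": 0.3, "galaxy": 0.3, "s7": 0.2}
print(round(cosine_similarity(query, doc), 4))
```

Identical vectors score 1.0 and vectors with no shared terms score 0.0, which is what makes the measure usable for ranking documents against a query.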
2. RESEARCH METHOD
This section explains the attribute data, which is sourced from several marketplaces. Data sources are taken directly from the original websites; the online marketplace data collected is product data that is still active in the product category list. The marketplace affiliate data used in this work are detailed in Table 1.
Table 1. List of marketplace affiliates
Marketplace   URL                         Role
Tokopedia     https://www.tokopedia.com   Main marketplace
Bukalapak     https://www.bukalapak.com   2nd marketplace
Blanja        https://www.blanja.com      3rd marketplace
Lazada        https://www.lazada.co.id    4th marketplace
In this work, the main marketplace is Tokopedia; one of its products is then compared with the other marketplaces. The method is divided into two processes. The first process scrapes product data from all selected marketplace websites, using the category id and category name attributes of each product. During web scraping, the product data is divided into several categories using the cosine similarity and vision-based page segmentation methods. After the data is formed as an HTML DOM, the system determines which data to use for display. The category is the reference for this filtering process because it indicates the level of each product within its category and supports retrieving accurate data, filtered and ordered by price and rating. Product availability is the second factor because it supports product competitiveness.
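A minimal sketch of the attribute-extraction step, using only the standard library's HTML parser (the HTML snippet and the class names `prod_name`, `prod_price`, and `prod_cat_name` are hypothetical, not the actual markup of any of the four marketplaces):

```python
from html.parser import HTMLParser

class ProductParser(HTMLParser):
    """Collects the text of elements whose class matches a product attribute."""
    FIELDS = {"prod_name", "prod_price", "prod_cat_name"}

    def __init__(self):
        super().__init__()
        self.current = None   # attribute field currently being read
        self.product = {}     # extracted attribute -> text

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if cls in self.FIELDS:
            self.current = cls

    def handle_data(self, data):
        if self.current:
            self.product[self.current] = data.strip()
            self.current = None

# Hypothetical product card, as one segment might look after page segmentation.
html_doc = """
<div class="prod_name">Samsung Galaxy S7</div>
<div class="prod_price">Rp 4.500.000</div>
<div class="prod_cat_name">Smartphone and tablet</div>
"""

parser = ProductParser()
parser.feed(html_doc)
print(parser.product)
```

In practice the segment boundaries come from the vision-based page segmentation step; the parser above only shows how attributes can be lifted out of one segment's DOM.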
2.1. Cosine similarity
The following is a simulated example of the data used in the process. Category data can be seen in Table 2; the conditions are adjusted to each category, which in this case is limited to six categories. The product attributes analyzed in this work are detailed in Table 3.
Table 2. Product categories
Categories Code
Fashion Cat_001
Health Cat_002
Beauty Cat_003
Smartphone and tablet Cat_004
Laptop Cat_005
Computer Cat_006
Table 3. Product attributes
Attributes Code
Name prod_name
ID prod_id
SKU prod_sku
Link prod_link
Figure prod_fig
Price prod_price
Category code prod_cat_id
Category description prod_cat_name
Advertiser prod_ads
Table 4 shows the query data that will be weighted using TF-IDF for each specific query. This work involves six queries and four affiliate marketplaces, as listed in Table 4. To make it easier to see which data are valid and which are the same or nearly the same, Table 4 is also visualized as the graph diagram in Figure 1, which matches each query against the marketplace list.
Table 4. TF-IDF
Query Marketplace 1 (D1) Marketplace 2 (D2) Marketplace 3 (D3) Marketplace 4 (D4)
Galaxy s7 1 1 0 0
Samsung 1 1 1 1
iPhone X 128Gb Black 1 0 0 0
Galaxy+S7 1 1 1 1
MacBook Air 1 0 1 1
Keyboard Razer 1 0 1 0
Figure 1. TF-IDF data on graph
The first stage of the weighting process is to determine the initial weight of each query manually. For example, the first weight is set from the Samsung query and the second from the Galaxy query, giving Centroid 1 = 0.3 and Centroid 2 = 0.3, as explained in Table 5 and visualized in Figure 2.
Table 5. TF-IDF data and weighting
Code D1 D2 Description
Oppo 0 0
Lenovo 0 0
Samsung 0.3 0 Weight 1
Asus 0 0
Galaxy 0 0.3 Weight 2
Iphone 0 0
Figure 2. TF-IDF data in a graph diagram with weighted queries
Next, calculate the distance of each data point from each weight using the Euclidean distance in (1) [24, 25]:

d(D, W) = √( Σ_k (D_k − W_k)² )   (1)
The next step is to compare each query with each weighted query; the query data weighting can be seen in Table 6. Then, the average of the members of each weight is taken as the new query weight, namely:
W1 new: (0.3, 0.3)
W2 new: (0.0, 0.0)
This step is repeated until the stopping condition is met: no change in the weighting of the data source, meaning there is no difference between the query assignments and those of the previous iteration. The second iteration is then performed using the new weighting.
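The iterative loop described above can be sketched as a small k-means-style routine (the paper does not name the algorithm; the centroid averaging and the stop-on-no-change rule follow the text, while the 2-D query vectors below are illustrative, with the two initial weights taken from Table 5):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster(queries, weights):
    """Assign each query to its nearest weight, re-average the weights,
    and repeat until the assignments stop changing."""
    assignment = None
    while True:
        new_assignment = [
            min(range(len(weights)), key=lambda j: euclidean(q, weights[j]))
            for q in queries
        ]
        if new_assignment == assignment:
            return assignment, weights
        assignment = new_assignment
        # New weight = average of the query vectors assigned to it.
        for j in range(len(weights)):
            members = [q for q, a in zip(queries, assignment) if a == j]
            if members:
                weights[j] = [sum(col) / len(members) for col in zip(*members)]

# Hypothetical 2-D TF-IDF query vectors; initial weights as in Table 5.
queries = [(0.3, 0.0), (0.3, 0.3), (0.0, 0.3), (0.0, 0.0)]
weights = [[0.3, 0.0], [0.0, 0.3]]
assignment, final_weights = cluster(queries, weights)
print(assignment)
```

With these toy vectors the loop converges in two passes, grouping the first, second, and fourth queries under the first weight.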
In this experiment, the algorithm completed in the third iteration. The final results are presented as a Cartesian diagram to make it easier to see the closeness between each weight and each data point, as explained in Table 6. Table 6 lists the queries included in each category: the first weight contains Q3, Q5, and Q6, and the second weight contains Q1, Q2, and Q4.
Table 6. Clustering results
W1 W2
Q3 Q1
Q5 Q2
Q6 Q4
2.2. Vision-based page segmentation
The query retrieved is adjusted to the query that has been selected: the higher the weighting value of the selected query, the higher-weighted query is used; conversely, the lower the weighting of the query, the lower-weighted query is used. Next, the data are normalized according to the vision-based page segmentation formula and multiplied by the weights determined at the initialization stage, namely (2) for benefit attributes and (3) for cost attributes:

r_ij = x_ij / max_i {x_ij}   (2)

r_ij = min_i {x_ij} / x_ij   (3)
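Assuming the usual max/min (benefit/cost) normalization for this step, it can be sketched as follows; the raw attribute values here are hypothetical, chosen so that the normalized rows reproduce Table 7:

```python
def normalize(matrix, is_benefit):
    """Column-wise normalization: x / max for benefit columns, min / x for cost columns."""
    cols = list(zip(*matrix))  # column views of the decision matrix
    result = []
    for row in matrix:
        normalized = []
        for j, x in enumerate(row):
            if is_benefit[j]:
                normalized.append(x / max(cols[j]))
            else:
                normalized.append(min(cols[j]) / x)
        result.append(normalized)
    return result

# Hypothetical raw values for three queries; the last column is a cost attribute.
raw = [
    [14, 8, 4, 22, 2],
    [16, 8, 4, 21, 1],
    [15, 7, 5, 20, 1],
]
is_benefit = [True, True, True, True, False]
for row in normalize(raw, is_benefit):
    print([round(v, 3) for v in row])
```

Each benefit value is divided by its column maximum and each cost value is inverted against its column minimum, so every normalized entry falls in (0, 1], matching the scale of the values shown in Table 7.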
3. RESULTS AND DISCUSSION
After the previous weighting phase, the product data search is performed based on the vision-based page segmentation algorithm. From the data already classified, only the one product that matches the previous TF-IDF process is taken, to be compared using this algorithm. The query group to be selected is based on the query being searched: if the query is appropriate, the corresponding group is taken; when the position is not appropriate (a 404 page), the product page is not selected. In this case, the first query taken is Samsung Galaxy. The following are the steps of the vision-based page segmentation algorithm described in the previous section:
a. The first step is determining the new product data. Figure 3 describes the default position of the product detail, including the name, dimensions, price, and any related product data.
Figure 3. Scheme design of vision-based page segmentation product data scraping
b. Then, each product attribute is parsed by its category and subcategory, as described in Figure 4.
Figure 4. Web page product data scraping process
c. Extract the data to be searched using (4):

(4)

where w(1) is the input query data with the total weight to be searched from the weight value R, and r is the number of segments related to the query that already have a value.
d. Validate the processed data and normalize it according to (5):

r_ij = x_ij / max_i {x_ij}   (5)
The results of the data normalization are explained in Table 7.
Table 7. The result of data normalization
No.  Code  Div.Id   Div.Attr  Div.Class  Cache        Alpha
1    Q1    0.875    1         0.8        1            0.5
2    Q2    1        1         0.8        0.954545455  1
3    Q4    0.9375   0.875     1          0.909090909  1
e. The normalization results are multiplied by the weights and summed to obtain the final preference value with (6); the final preference results are explained in Table 8:

V_i = ∑_j (w_j × r_ij)   (6)
Table 8. Final preference results
No.  Code  Div.Id   Div.Attr  Div.Class  Cache        Alpha  Result
     W     0.3      0.2       0.2        0.15         0.15
1    Q1    0.875    1         0.8        1            0.5    0.8475
2    Q2    1        1         0.8        0.954545455  1      0.953181818
3    Q4    0.9375   0.875     1          0.909090909  1      0.942613636
Based on Table 8, the most recommended query data is Q2, with a preference value of 0.953, about 0.01 higher than Q4.
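The weighted-sum preference of equation (6) can be checked directly against the Table 8 values:

```python
# Attribute weights (row W of Table 8) and normalized values (Table 7).
weights = [0.3, 0.2, 0.2, 0.15, 0.15]
normalized = {
    "Q1": [0.875, 1.0, 0.8, 1.0, 0.5],
    "Q2": [1.0, 1.0, 0.8, 21 / 22, 1.0],
    "Q4": [0.9375, 0.875, 1.0, 20 / 22, 1.0],
}

# Preference value V_i = sum_j w_j * r_ij; the highest value wins.
preferences = {
    code: sum(w * r for w, r in zip(weights, row))
    for code, row in normalized.items()
}
best = max(preferences, key=preferences.get)
print(best, round(preferences[best], 4))  # Q2 scores highest, as in Table 8
```

Running this reproduces the Result column of Table 8 (0.8475, 0.9532, 0.9426) and confirms Q2 as the recommended query.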
4. CONCLUSION
Evaluated on performance, the proposed model obtains the expected results: all five tested queries behaved as expected. The cosine similarity algorithm successfully improved the vision-based page segmentation algorithm and was able to match one product to other eligible products by searching the product data processed with TF-IDF. As further work, we suggest comparing this model with different methods.
REFERENCES
[1] Y. U. Chandra, S. Karya, and M. Hendrawaty, “Decision support systems for customer to buy products with an integration of reviews and comments from marketplace e-commerce sites in Indonesia: A proposed model,” International Journal on Advanced Science, Engineering and Information Technology, vol. 9, no. 4, pp. 1171-1176, 2019.
[2] I. O. Sfenrianto, A. Christiano, and M. P. Mulani, “Impact of e-service on customer loyalty in marketplace in
Indonesia,” Journal of Theoretical and Applied Information Technology, vol. 96, no. 20, pp. 6795-6805, 2018.
[3] G. J. A. Santoso and T. A. Napitupulu, “Factors affecting seller loyalty in business e-marketplace: A case of Indonesia,” Journal of Theoretical and Applied Information Technology, vol. 96, no. 1, pp. 162-171, Jan. 2018.
[4] M. A. Fauzan, A. S. Nisafani, and A. Wibisono, “Seller reputation impact on sales performance in public
e-marketplace Bukalapak,” TELKOMNIKA Telecommunication Computing Electronic and Control, vol. 17, no. 4,
pp. 1810-1817, Aug. 2019.
[5] H. Yoganarasimhan, “The value of reputation in an online freelance marketplace,” Marketing Science, vol. 32,
no. 6, pp. 860-891, Nov. 2013.
[6] M. N. Alraja and M. A. Said Kashoob, “Transformation to electronic purchasing: an empirical investigation,”
TELKOMNIKA Telecommunication Computing Electronic and Control, vol. 17, no. 3, pp. 1209-1219, June 2019.
[7] S. K. Malik and S. Rizvi, “Information extraction using web usage mining, web scrapping and semantic
annotation,” 2011 International Conference on Computational Intelligence and Communication Networks,
Gwalior, pp. 465-469, 2011.
[8] K. Sriraghav, S. Jayanthi, N. Vidya, and V. S. Felix Enigo, “ScrAnViz-A tool to scrap, analyze and visualize
unstructured-data using attribute-based opinion mining algorithm,” 2017 Innovations in Power and Advanced
Computing Technologies (i-PACT), Vellore, pp. 1-5, 2017.
[9] R. Murali, “An intelligent web spider for online e-commerce data extraction,” 2018 Second International
Conference on Green Computing and Internet of Things (ICGCIoT), Bangalore, India, pp. 332-339, 2018.
[10] S. Mehak, R. Zafar, S. Aslam, and S. M. Bhatti, “Exploiting filtering approach with web scrapping for smart online
shopping: Penny wise: A wise tool for online shopping,” 2019 2nd International Conference on Computing,
Mathematics and Engineering Technologies (iCoMET), Sukkur, Pakistan, pp. 1-5, 2019.
[11] Đ. Petrović and I. Stanišević, “Web scrapping and storing data in a database, a case study of the used cars
market,” 2017 25th Telecommunication Forum (TELFOR), Belgrade, pp. 1-4, 2017.
[12] J. G. Thomsen, E. Ernst, C. Brabrand, and M. Schwartzbach, “WebSelF: A web scraping framework,”
International Conference on Web Engineering 2012, Lecture Notes in Computer Science, Springer, Berlin,
Heidelberg, vol. 7387, pp. 347-361, 2012.
[13] M. Cormer, R. Mann, K. Moffatt, and R. Cohen, “Towards an improved vision-based web page segmentation
algorithm,” 2017 14th Conference on Computer and Robot Vision (CRV), Edmonton, AB, pp. 345-352, 2017.
[14] A. Bhardwaj and V. Mangat, “A novel approach for content extraction from web pages,” 2014 Recent Advances in
Engineering and Computational Sciences (RAECS), Chandigarh, pp. 1-4, 2014.
[15] P. Ko, S. Kang, and H. Kumar, “Web page dependent vision based segmentation for web sites,” Seventh IEEE/ACIS International Conference on Computer and Information Science (ICIS 2008), Portland, OR, pp. 690-694, 2008.
[16] D. Xue and Y. Wang, “Applying cosine similarity to discount evidence,” 2017 10th International Symposium on
Computational Intelligence and Design (ISCID), Hangzhou, pp. 516-519, 2017.
[17] X. Wang, Z. Xu, X. Xia, and C. Mao, “Computing user similarity by combining SimRank++ and cosine similarities
to improve collaborative filtering,” 2017 14th Web Information Systems and Applications Conference (WISA),
Liuzhou, pp. 205-210, 2017.
[18] P. P. Gokul, B. K. Akhil, and K. K. M. Shiva, “Sentence similarity detection in Malayalam language using cosine
similarity,” 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information &
Communication Technology (RTEICT), Bangalore, pp. 221-225, 2017.
[19] M. Alodadi and V. P. Janeja, “Similarity in patient support forums using TF-IDF and cosine similarity metrics,”
2015 International Conference on Healthcare Informatics, Dallas, TX, pp. 521-522, 2015.
[20] Hilary I. Okagbue, Sheila A. Bishop, Pelumi E. Oguntunde, Patience I. Adamu, Abiodun A. Opanuga, and Elvir M.
Akhmetshin, “Modified CiteScore metric for reducing the effect of self-citations,” TELKOMNIKA
Telecommunication Computing Electronic and Control, vol. 17, no. 6, pp. 3044-3049, Dec. 2019.
[21] A. Hamdy and M. Elsayed, “Towards more accurate automatic recommendation of software design patterns,”
Journal of Theoretical & Applied Information Technology, vol. 96, no. 15, pp. 5069-5079, 2018.
[22] D. Jayasri and D. D. Manimegalai, “An efficient cross ontology-based similarity measure for bio-document
retrieval system,” Journal of Theoretical & Applied Information Technology, vol. 54, no. 2, pp. 245-258, 2013.
[23] Y. Kawada, “Cosine similarity and the Borda rule,” Social Choice and Welfare, vol. 51, no. 1, pp. 1-11, Jun. 2018.
[24] W. Uther, “TF–IDF,” in C. Sammut and G. I. Webb (Ed.), Encyclopedia of Machine Learning, Boston, MA:
Springer US, pp. 986-987, 2011.
[25] Li-Ping Jing, Hou-Kuan Huang, and Hong-Bo Shi, “Improved feature selection approach TFIDF in text mining,”
Proceedings. International Conference on Machine Learning and Cybernetics, Beijing, vol. 2, pp. 944-946, 2002.
BIOGRAPHIES OF AUTHORS
Wildan Budiawan Zulfikar received the B.Eng degree from UIN Sunan Gunung Djati, Indonesia, and the M.Cs from STMIK LIKMI, Indonesia. He is currently a lecturer at UIN Sunan Gunung Djati, Indonesia. His research areas are information systems and data mining.
Mohamad Irfan received the B.Eng degree from UIN Sunan Gunung Djati, Indonesia, and the M.Cs from STMIK LIKMI, Indonesia. He is currently a Ph.D student at Asia E University, Malaysia. His research is focused on information systems.
Muhammad Ghufron received the B.Eng degree from UIN Sunan Gunung Djati, Indonesia. He is currently a research assistant in the Department of Informatics. His research interest is business information systems.
Jumadi received the B.Eng degree from Dharma Negara Business and Informatics School, Indonesia, and the M.Cs from Universitas Gadjah Mada, Indonesia. He is currently a Ph.D student at the School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Indonesia. His research is focused on semantic information retrieval.
Esa Firmansyah received the B.Eng degree from STMIK PMBI Bandung and the M.Kom from STTIBI Jakarta, Indonesia. He is currently a Ph.D student at Asia E University, Malaysia. His research is focused on information technology and information systems.