The document describes a study on detecting offensive language in multilingual documents using machine learning models. The proposed framework consists of three phases: preprocessing the text, representing it with BERT models, and classifying it as offensive or non-offensive. The study examines different strategies for handling multilingualism, such as training one classification model for multiple languages or translating all texts into one language before classification. Experimental results on a bilingual dataset show that the translation-based approach using Arabic BERT achieves over 93% F1-score and 91% precision for offensive language detection in multilingual texts.
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Hate speech detection on Indonesian text using word embedding method-global v... (IAESIJAI)
Hate speech is defined as communication directed toward a specific individual or group that involves hatred or anger. Language that uses strong arguments to steer someone's opinion can cause social conflict. Online platforms give individuals ample opportunity to communicate their thoughts, because the number of Internet users globally, including in Indonesia, is continually rising. This study aims to observe the impact of pre-trained global vector (GloVe) word embedding on accuracy in the classification of hate speech and non-hate speech. Combining pre-trained GloVe embeddings (Indonesian text) with single-layer and multi-layer long short-term memory (LSTM) classifiers yields performance that is more resistant to overfitting than a pre-trainable embedding for hate-speech detection. The accuracy is 81.5% with a single-layer LSTM and 80.9% with a double-layer LSTM. Future work is to provide pre-trained embeddings built from formal and non-formal language corpora; pre-processing to handle non-formal words is very challenging.
Marathi-English CLIR using detailed user query and unsupervised corpus-based WSD (IJERA Editor)
With the rapid growth of multilingual information on the Internet, Cross Language Information Retrieval (CLIR) is becoming a necessity. It lets users query in their native language and retrieve information in any language. However, CLIR performs poorly compared to monolingual retrieval because of lexical ambiguity, mismatched query terms and out-of-vocabulary words. In this paper, we propose an algorithm for improving the performance of a Marathi-English CLIR system. The system first finds possible translations of the input query in the target language, disambiguates them and then submits the English queries to a search engine for relevant document retrieval. The disambiguation is based on an unsupervised corpus-based method that uses an English dictionary as an additional resource. The experiment is performed on the FIRE 2011 (Forum of Information Retrieval Evaluation) dataset using the “Title” and “Description” fields as inputs. The experimental results show that the proposed approach improves the performance of the Marathi-English CLIR system with a good precision level.
Hate Speech Detection in multilingual Text using Deep Learning (IRJET Journal)
This paper aims to detect hate speech in multilingual text using deep learning techniques. Specifically, it focuses on English-Hindi code-mixed text from social media. The paper combines three existing datasets to create a larger consolidated dataset of over 20,000 tweets and comments annotated as hate speech or non-hate speech. It then applies and compares various machine learning and deep learning models on this dataset. The experimental results show that a CNN-BiLSTM deep learning model achieves the best performance with 87% accuracy, 82% precision, and 85% F1 score, outperforming existing approaches.
Identification of monolingual and code-switch information from English-Kannad... (IJECEIAES)
Code-switching is a very common occurrence in social media communication, predominantly found in multilingual countries like India. Using more than one language in communication is known as code-switching or code-mixing. Some of the important applications of code-switching are machine translation (MT), shallow parsing, dialog systems, and semantic parsing. Identifying code-switch and monolingual information is useful for better communication on online networking websites. In this paper, we apply a character-level n-gram approach to identify monolingual and code-switch information from English-Kannada social media data. We compared various machine learning techniques such as naïve Bayes (NB), support vector classifier (SVC), logistic regression (LR) and neural network (NN) on English-Kannada code-switch (EKCS) data. It is observed that the character-level n-gram approach provides an improvement of 1.8% to 4.1% in accuracy and 1.6% to 3.8% in F1-score. It is also observed that the SVC and NN techniques perform best in terms of accuracy (97.9%) and F1-score (98%) with character-level n-grams.
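To make the character-level n-gram idea concrete, here is a minimal sketch of how such features could be counted before being passed to a classifier. The function name `char_ngrams` and its parameters are illustrative assumptions, not the paper's implementation, and the abstract does not say which n-gram lengths were used.

```python
from collections import Counter


def char_ngrams(text, n_min=1, n_max=3):
    """Count character n-grams of length n_min..n_max (hypothetical helper).

    The text is lowercased first; spaces are kept so that n-grams can
    span word boundaries.
    """
    text = text.lower()
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n over the string.
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


# Bigrams only, on a toy string.
bigrams = char_ngrams("abcab", n_min=2, n_max=2)
print(bigrams)
```

Counts like these can be turned into a fixed-length vector (one dimension per n-gram seen in training) and fed to any of the classifiers named above.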
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND... (dannyijwest)
As Web 3.0 is blooming, ontologies augment the semantic Web with semi-structured knowledge. Industrial ontologies can help in improving online commercial communication and marketing. In addition, conceptualizing the enterprise knowledge can improve information retrieval for industrial applications. Having ontologies combine multiple languages can help in delivering the knowledge to a broad sector of Internet users. In addition, multi-lingual ontologies can also help in commercial transactions. This research paper provides a framework model for building industrial multilingual ontologies which includes Corpus Determination, Filtering, Analysis, Ontology Building, and Ontology Evaluation.
A FRAMEWORK FOR BUILDING A MULTILINGUAL INDUSTRIAL ONTOLOGY: METHODOLOGY AND ... (IJwest)
As Web 3.0 is blooming, ontologies augment the semantic Web with semi-structured knowledge. Industrial ontologies can help in improving online commercial communication and marketing. In addition, conceptualizing the enterprise knowledge can improve information retrieval for industrial applications. Having ontologies combine multiple languages can help in delivering the knowledge to a broad sector of Internet users. In addition, multi-lingual ontologies can also help in commercial transactions. This research paper provides a framework model for building industrial multilingual ontologies which includes Corpus Determination, Filtering, Analysis, Ontology Building, and Ontology Evaluation. It also addresses factors to be considered when modeling multilingual ontologies. A case study for building a bilingual English-Arabic ontology for smart phones is presented. The ontology was illustrated using an ontology editor and visualization tool. The built ontology consists of 67 classes and 18 instances presented in both Arabic and English. In addition, applications for using the ontology are presented. Future research directions for the built industrial ontology are presented.
SENTIMENT ANALYSIS OF MIXED CODE FOR THE TRANSLITERATED HINDI AND MARATHI TEXTS (ijnlc)
The evolution of information technology has led to the collection of large amounts of data, the volume of which has increased to the extent that in the last two years the data produced is greater than all the data ever recorded in human history. This has necessitated the use of machines to understand, interpret and apply data without manual involvement. A lot of these texts are available in transliterated code-mixed form, which, due to their complexity, are very difficult to analyze. Work in this area is progressing at a great pace, and this work hopes to push it further. The designed system automatically classifies transliterated (Romanized) Hindi and Marathi text documents using supervised learning methods (KNN, Naïve Bayes and Support Vector Machine (SVM)) and ontology-based classification; the results are compared in order to decide which methodology is better suited to handling these documents. As we will see, the plain machine learning algorithms perform just as well as, or in many cases much better than, the more analytical approach.
Contextual Analysis for Middle Eastern Languages with Hidden Markov Models (ijnlc)
Displaying a document in Middle Eastern languages requires contextual analysis due to different presentational forms for each character of the alphabet. The words of the document will be formed by the joining of the correct positional glyphs representing corresponding presentational forms of the characters. A set of rules defines the joining of the glyphs. As usual, these rules vary from language to language and are subject to interpretation by the software developers.
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition... (IJCI JOURNAL)
Speech technology is a field that encompasses various techniques and tools used to enable machines to interact with speech, such as automatic speech recognition (ASR), spoken dialog systems, and others, allowing a device to capture spoken words through a microphone from a human speaker. End-to-end approaches such as Connectionist Temporal Classification (CTC) and attention-based methods are the most widely used for the development of ASR systems. However, these techniques have mostly been used in research and development for high-resourced languages with large amounts of speech data for training and evaluation, leaving low-resource languages relatively underdeveloped. While the CTC method has been successfully used for other languages, its effectiveness for the Sepedi language remains uncertain. In this study, we present the evaluation of a Sepedi-English code-switched automatic speech recognition system. This end-to-end system was developed using the Sepedi Prompted Code Switching corpus and the CTC approach. The performance of the system was evaluated using both the NCHLT Sepedi test corpus and the Sepedi Prompted Code Switching corpus. The model produced a lowest WER of 41.9%; however, it faced challenges in recognizing Sepedi-only text.
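For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition; the function name is an assumption, and the abstract does not say which tool the authors used to score their system.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper: both inputs are whitespace-tokenized strings.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.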
A Survey Of Current Datasets For Code-Switching Research (Jim Webb)
This document provides a survey of existing datasets for code-switching research. It begins by defining code-switching and discussing its increased prevalence in social media interactions. It then proposes quality metrics for evaluating code-switching datasets, including number of words, vocabulary size, number of sentences, and average sentence length. The document reviews available datasets categorized by common NLP tasks like language identification, named entity recognition, sentiment analysis, and machine translation. Several datasets for language pairs like English-Hindi, Spanish-English, and Mandarin-English are discussed. In conclusion, the survey finds that while interest in code-switching research is growing, availability of suitable annotated datasets remains limited.
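The four dataset quality metrics listed above are simple corpus counts. As a rough sketch, assuming whitespace tokenization and a case-insensitive vocabulary (both are assumptions here, since the summary does not specify them), they could be computed as follows; `corpus_stats` is a hypothetical name.

```python
def corpus_stats(sentences):
    """Compute word count, vocabulary size, sentence count and average length.

    Hypothetical helper: `sentences` is a list of strings, one per sentence,
    tokenized on whitespace.
    """
    tokenized = [sentence.split() for sentence in sentences]
    n_sentences = len(tokenized)
    n_words = sum(len(tokens) for tokens in tokenized)
    vocabulary = {token.lower() for tokens in tokenized for token in tokens}
    return {
        "sentences": n_sentences,
        "words": n_words,
        "vocabulary_size": len(vocabulary),
        # Guard against an empty corpus.
        "avg_sentence_length": n_words / n_sentences if n_sentences else 0.0,
    }


print(corpus_stats(["a b c", "A d"]))
```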
ADVERSARIAL GRAMMATICAL ERROR GENERATION: APPLICATION TO PERSIAN LANGUAGE (kevig)
Grammatical error correction (GEC) greatly benefits from large quantities of high-quality training data. However, the preparation of a large amount of labelled training data is time-consuming and prone to human errors. These issues have become major obstacles in training GEC systems. Recently, the performance of English GEC systems has drastically been enhanced by the application of deep neural networks that generate a large amount of synthetic data from limited samples. While GEC has extensively been studied in languages such as English and Chinese, no attempts have been made to generate synthetic data for improving Persian GEC systems. Given the substantial grammatical and semantic differences of the Persian language, in this paper, we propose a new deep learning framework to create large enough synthetic sentences that are grammatically incorrect for training Persian GEC systems. A modified version of sequence generative adversarial net with policy gradient is developed, in which the size of the model is scaled down and the hyperparameters are tuned. The generator is trained in an adversarial framework on a limited dataset of 8000 samples. Our proposed adversarial framework achieved bilingual evaluation understudy (BLEU) scores of 64.5% on BLEU-2, 44.2% on BLEU-3, and 21.4% on BLEU-4, and outperformed the conventional supervised-trained long short-term memory using maximum likelihood estimation and recently proposed sequence labeler using neural machine translation augmentation. This shows promise toward improving the performance of GEC systems by generating a large amount of training data.
This paper describes machine translation systems submitted by the IIITT team for the LoResMT 2021 shared task on translating between English-Marathi and English-Irish, which are low-resource languages. They fine-tuned IndicTrans, a pretrained multilingual NMT model, on additional parallel corpora for English-Marathi translation. For English-Irish, they used a pretrained Helsinki-NLP model. Their systems achieved BLEU scores of 24.2, 25.8 and 34.6 for English-Marathi, Irish-English and English-Irish respectively, ranking highly in the shared task.
Mediterranean Arabic Language and Speech Technology Resources (Hend Al-Khalifa)
The MEDAR survey collected responses from 57 players involved in human language technologies and language resources for Arabic. The responses came from a variety of countries, with the highest numbers coming from Egypt (11), Morocco (10), and West Bank & Gaza Strip (9). The majority of respondents (36) answered on behalf of institutions, while 17 answered as independent experts. The survey gathered information on the respondents' profiles, language resources, needs for resources, and market information to understand the current state of language technologies for Arabic.
New Research Articles 2020 May Issue International Journal of Software Engin... (ijseajournal)
This document proposes an agent-based approach to systematically specify auditability requirements during goal-oriented requirements engineering. It presents a case study applying this approach to the design of a system called LawDisTrA that distributes lawsuits among judges in a transparent manner. The approach uses an interdependency graph to capture different facets of transparency and their operationalization. An evaluation of an implemented LawDisTrA system that distributed over 300,000 lawsuits demonstrated the ability of the presented approach to address the cross-organizational nature of transparency through adequate auditability techniques.
MULTILINGUAL SPEECH TO TEXT USING DEEP LEARNING BASED ON MFCC FEATURES (mlaij)
The proposed methodology presented in the paper deals with solving the problem of multilingual speech recognition. Current text and speech recognition and translation methods have a very low accuracy in translating sentences which contain a mixture of two or more different languages. The paper proposes a novel approach to tackling this problem and highlights some of the drawbacks of current recognition and translation methods.
Annotating For Hate Speech The MaNeCo Corpus And Some Input From Critical Di... (Sabrina Baloi)
This document presents a novel annotation scheme for hate speech in online comments. It was developed based on analyzing comments reacting to news on migration and LGBTQ+ issues in Malta. The scheme aims to address challenges in annotating hate speech, as different people may have different thresholds for what constitutes hate speech. It proposes a multi-layer scheme that was tested against a binary hate speech/not hate speech classification and showed higher agreement between annotators. It also introduces the MaNeCo corpus, a large collection of online newspaper comments from Malta over 10 years that the scheme will be applied to.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC... (kevig)
In this paper, phoneme sequences are used as language information to perform code-switched language identification (LID). With the one-pass recognition system, the spoken sounds are converted into phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based bigram language models (LM) are integrated into speech decoding to eliminate possible phone mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic information of mixed-language speech based on recognized phone sequences. As the back-end decision is taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to classify language identity. The speech corpus was tested on Sepedi and English languages that are often mixed. Our system is evaluated by measuring both the ASR performance and the LID performance separately. The systems have obtained a promising ASR accuracy with data-driven phone merging approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Live Sign Language Translation: A Survey (IRJET Journal)
The document discusses various approaches that have been used for live sign language translation. It reviews 20 research papers that used techniques like convolutional neural networks, support vector machines, k-nearest neighbors, and LSTM networks to classify hand gestures and translate sign language into text, with accuracy ranging from 62.3% to 99.9%. Deep learning models using CNNs and LSTMs achieved the highest accuracy compared to traditional classifiers. The paper aims to help other researchers in the field understand past approaches and how to potentially improve sign language translation systems.
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA... (kevig)
Many Natural Language Processing (NLP) applications involve Named Entity Recognition (NER) as an important task that improves their overall performance. In this paper, deep learning techniques are used to perform NER on Hindi text data, since Hindi NER has received far less attention than English NER. This is a barrier for resource-scarce languages, as many resources are not readily available. Researchers have used various techniques, such as rule-based, machine-learning-based and hybrid approaches, to solve this problem. Deep learning based algorithms are now being developed at large scale as an innovative approach for advanced NER models, with the aim of achieving the best results. In this paper we devise a novel architecture based on a residual network architecture for Bidirectional Long Short Term Memory (BiLSTM) with fastText word embedding layers. For this purpose we use pre-trained word embeddings to represent the words in the corpus, where the NER tags of the words are defined by the annotated corpora used. Development of an NER system for Indian languages is a comparatively difficult task. In this paper, we run various experiments to compare the results of NER with normal embedding and fastText embedding layers, and to analyse the performance of word embeddings with different batch sizes when training the deep learning models. We report state-of-the-art results for this approach in terms of F1 score.
An evolutionary approach to comparative analysis of detecting Bangla abusive ... (journalBEEI)
The use of Bangla abusive texts has accelerated with the growing use of social media. Through this platform, one can spread hatred or negativity in a viral form. Plenty of research has been done on detecting abusive text in the English language, but Bangla abusive text detection has not been studied to a great extent. In this experimental study, we have applied three distinct approaches to a comprehensive dataset to obtain a better outcome. In the first experiment, a large dataset collected from Facebook and YouTube is used to detect abusive texts. After extensive pre-processing and feature extraction, a set of deliberately selected supervised machine learning classifiers, i.e. multinomial Naïve Bayes (MNB), multilayer perceptron (MLP), support vector machine (SVM), decision tree, random forest, stochastic gradient descent (SGD), ridge, perceptron and k-nearest neighbors (k-NN), is applied to determine the best result. The second experiment is conducted on a balanced dataset constructed by randomly undersampling the majority class. Finally, a Bengali stemmer is applied to the dataset and the third experiment is conducted. Across all three experiments, SVM with the full dataset obtained the highest accuracy of 88%.
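The balancing step in the second experiment can be illustrated with a short sketch of random undersampling: every class is cut down to the size of the smallest class by sampling without replacement. The function name, the fixed seed and the toy data are assumptions for illustration; the abstract does not describe the authors' actual tooling.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so that every class has as many as the rarest one.

    Hypothetical helper: expects at least one sample, and returns the kept
    samples and their labels grouped by class.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    # Size of the minority class sets the target for all classes.
    target = min(len(items) for items in by_label.values())

    kept_samples, kept_labels = [], []
    for label, items in by_label.items():
        for sample in rng.sample(items, target):
            kept_samples.append(sample)
            kept_labels.append(label)
    return kept_samples, kept_labels


# Toy data: 8 non-abusive (0) and 2 abusive (1) items.
texts = [f"text{i}" for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_texts, balanced_labels = random_undersample(texts, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Undersampling discards majority-class data, which is consistent with the reported outcome that the full dataset gave the best SVM accuracy.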
An in-depth review on News Classification through NLPIRJET Journal
This document provides an in-depth literature review of news classification through natural language processing (NLP). It discusses several existing approaches to news classification, including models that use convolutional neural networks (CNNs), graph-based approaches, and attention mechanisms. The document also notes that current search engines often return too many irrelevant results, so classification could help layer search results. It concludes that while many techniques have been developed, inconsistencies remain in effectively classifying news, so further research on combining NLP, feature extraction, and fuzzy logic is needed.
A SURVEY ON CROSS LANGUAGE INFORMATION RETRIEVALIJCI JOURNAL
Nowadays, the number of web users accessing information over the Internet is increasing day by day. A huge
amount of information on the Internet is available in different languages and can be accessed by anybody at any
time. Information Retrieval (IR) deals with finding useful information in a large collection of
unstructured, structured and semi-structured data. Information Retrieval can be classified into different
classes such as monolingual information retrieval, cross language information retrieval and multilingual
information retrieval (MLIR). In the current scenario, the diversity of information and language
barriers are serious issues for communication and cultural exchange across the world. To overcome such
barriers, cross language information retrieval (CLIR) systems are now in strong demand. CLIR refers
to information retrieval activities in which the query and the documents may appear in different languages.
This paper gives an overview of the new application areas of CLIR and reviews the approaches used in
CLIR research for query and document translation. Further, based on the available literature, a
number of challenges and issues in CLIR are identified and discussed.
A prior case study of natural language processing on different domain IJECEIAES
This document summarizes a prior case study on natural language processing across different domains. It begins with an introduction to natural language processing, describing how it is a branch of artificial intelligence that allows computers to understand human language. It then reviews several existing studies that applied natural language processing techniques such as named entity recognition and text mining to tasks like identifying technical knowledge in resumes, enhancing reading skills for deaf students, and predicting student performance. The document concludes by highlighting some of the challenges in developing new natural language processing models.
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES ijnlc
Stemming is the process of term conflation. It conflates all the word variants to a common form called the stem. It plays a significant role in numerous Natural Language Processing (NLP) applications like morphological analysis, parsing, document summarization, text classification, part-of-speech tagging, question-answering systems, machine translation, word sense disambiguation, information retrieval (IR), etc. Each of these tasks requires some pre-processing to be done. Stemming is one of the important building blocks for all these applications. This paper presents an overview of various stemming techniques, evaluation criteria for stemmers and various existing stemmers for Indic languages.
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
1) The document discusses the Sungal Tunnel project in Jammu and Kashmir, India, which is being constructed using the New Austrian Tunneling Method (NATM).
2) NATM involves continuous monitoring during construction to adapt to changing ground conditions, and makes extensive use of shotcrete for temporary tunnel support.
3) The methodology section outlines the systematic geotechnical design process for tunnels according to Austrian guidelines, and describes the various steps of NATM tunnel construction including initial and secondary tunnel support.
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
This study examines the effect of response reduction factors (R factors) on reinforced concrete (RC) framed structures through nonlinear dynamic analysis. Three RC frame models with varying heights (4, 8, and 12 stories) were analyzed in ETABS software under different R factors ranging from 1 to 5. The results showed that displacement increased as the R factor decreased, indicating less linear behavior for lower R factors. Drift also decreased proportionally with increasing R factors from 1 to 5. Shear forces in the frames decreased with higher R factors. In general, R factors of 3 to 5 produced more satisfactory performance with less displacement and drift. The displacement variations between different building heights were consistent at different R factors. This study evaluated how R factors influence
More Related Content
Similar to AUTOMATIC DETECTION AND LANGUAGE IDENTIFICATION OF MULTILINGUAL DOCUMENTS
Contextual Analysis for Middle Eastern Languages with Hidden Markov Modelsijnlc
Displaying a document in Middle Eastern languages requires contextual analysis due to different presentational forms for each character of the alphabet. The words of the document will be formed by the joining of the correct positional glyphs representing corresponding presentational forms of the
characters. A set of rules defines the joining of the glyphs. As usual, these rules vary from language to language and are subject to interpretation by the software developers.
The Evaluation of a Code-Switched Sepedi-English Automatic Speech Recognition...IJCI JOURNAL
Speech technology is a field that encompasses various techniques and tools used to enable machines to interact with speech, such as automatic speech recognition (ASR), spoken dialog systems, and others, allowing a device to capture spoken words through a microphone from a human speaker. End-to-end approaches such as Connectionist Temporal Classification (CTC) and attention-based methods are the most widely used for the development of ASR systems. However, these techniques have mostly been used in research and development for high-resourced languages with large amounts of speech data for training and evaluation, leaving low-resource languages relatively underdeveloped. While the CTC method has been successfully used for other languages, its effectiveness for the Sepedi language remains uncertain. In this study, we present the evaluation of a Sepedi-English code-switched automatic speech recognition system. This end-to-end system was developed using the Sepedi Prompted Code Switching corpus and the CTC approach. The performance of the system was evaluated using both the NCHLT Sepedi test corpus and the Sepedi Prompted Code Switching corpus. The model produced a lowest WER of 41.9%; however, it faced challenges in recognizing Sepedi-only text.
A Survey Of Current Datasets For Code-Switching ResearchJim Webb
This document provides a survey of existing datasets for code-switching research. It begins by defining code-switching and discussing its increased prevalence in social media interactions. It then proposes quality metrics for evaluating code-switching datasets, including number of words, vocabulary size, number of sentences, and average sentence length. The document reviews available datasets categorized by common NLP tasks like language identification, named entity recognition, sentiment analysis, and machine translation. Several datasets for language pairs like English-Hindi, Spanish-English, and Mandarin-English are discussed. In conclusion, the survey finds that while interest in code-switching research is growing, availability of suitable annotated datasets remains limited.
ADVERSARIAL GRAMMATICAL ERROR GENERATION: APPLICATION TO PERSIAN LANGUAGEkevig
Grammatical error correction (GEC) greatly benefits from large quantities of high-quality training data.
However, the preparation of a large amount of labelled training data is time-consuming and prone to
human errors. These issues have become major obstacles in training GEC systems. Recently, the
performance of English GEC systems has drastically been enhanced by the application of deep neural
networks that generate a large amount of synthetic data from limited samples. While GEC has extensively
been studied in languages such as English and Chinese, no attempts have been made to generate synthetic
data for improving Persian GEC systems. Given the substantial grammatical and semantic differences of
the Persian language, in this paper, we propose a new deep learning framework to create large enough
synthetic sentences that are grammatically incorrect for training Persian GEC systems. A modified version
of sequence generative adversarial net with policy gradient is developed, in which the size of the model is
scaled down and the hyperparameters are tuned. The generator is trained in an adversarial framework on
a limited dataset of 8000 samples. Our proposed adversarial framework achieved bilingual evaluation
understudy (BLEU) scores of 64.5% on BLEU-2, 44.2% on BLEU-3, and 21.4% on BLEU-4, and
outperformed the conventional supervised-trained long short-term memory using maximum likelihood
estimation and recently proposed sequence labeler using neural machine translation augmentation. This
shows promise toward improving the performance of GEC systems by generating a large amount of
training data.
This paper describes machine translation systems submitted by the IIITT team for the LoResMT 2021 shared task on translating between English-Marathi and English-Irish, which are low-resource languages. They fine-tuned IndicTrans, a pretrained multilingual NMT model, on additional parallel corpora for English-Marathi translation. For English-Irish, they used a pretrained Helsinki-NLP model. Their systems achieved BLEU scores of 24.2, 25.8 and 34.6 for English-Marathi, Irish-English and English-Irish respectively, ranking highly in the shared task.
Mediterranean Arabic Language and Speech Technology ResourcesHend Al-Khalifa
The MEDAR survey collected responses from 57 players involved in human language technologies and language resources for Arabic. The responses came from a variety of countries, with the highest numbers coming from Egypt (11), Morocco (10), and West Bank & Gaza Strip (9). The majority of respondents (36) answered on behalf of institutions, while 17 answered as independent experts. The survey gathered information on the respondents' profiles, language resources, needs for resources, and market information to understand the current state of language technologies for Arabic.
New Research Articles 2020 May Issue International Journal of Software Engin...ijseajournal
This document proposes an agent-based approach to systematically specify auditability requirements during goal-oriented requirements engineering. It presents a case study applying this approach to the design of a system called LawDisTrA that distributes lawsuits among judges in a transparent manner. The approach uses an interdependency graph to capture different facets of transparency and their operationalization. An evaluation of an implemented LawDisTrA system that distributed over 300,000 lawsuits demonstrated the ability of the presented approach to address the cross-organizational nature of transparency through adequate auditability techniques.
MULTILINGUAL SPEECH TO TEXT USING DEEP LEARNING BASED ON MFCC FEATURESmlaij
The proposed methodology presented in the paper deals with solving the problem of multilingual speech
recognition. Current text and speech recognition and translation methods have a very low accuracy in
translating sentences which contain a mixture of two or more different languages. The paper proposes a
novel approach to tackling this problem and highlights some of the drawbacks of current recognition and
translation methods.
Annotating For Hate Speech The MaNeCo Corpus And Some Input From Critical Di...Sabrina Baloi
This document presents a novel annotation scheme for hate speech in online comments. It was developed based on analyzing comments reacting to news on migration and LGBTQ+ issues in Malta. The scheme aims to address challenges in annotating hate speech, as different people may have different thresholds for what constitutes hate speech. It proposes a multi-layer scheme that was tested against a binary hate speech/not hate speech classification and showed higher agreement between annotators. It also introduces the MaNeCo corpus, a large collection of online newspaper comments from Malta over 10 years that the scheme will be applied to.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Live Sign Language Translation: A SurveyIRJET Journal
The document discusses various approaches that have been used for live sign language translation. It reviews 20 research papers that used techniques like convolutional neural networks, support vector machines, k-nearest neighbors, and LSTM networks to classify hand gestures and translate sign language into text, with accuracy ranging from 62.3% to 99.9%. Deep learning models using CNNs and LSTMs achieved the highest accuracy compared to traditional classifiers. The paper aims to help other researchers in the field understand past approaches and how to potentially improve sign language translation systems.
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
This study compares the use of Stark Steel and TMT Steel as reinforcement materials in a two-way reinforced concrete slab. Mechanical testing is conducted to determine the tensile strength, yield strength, and other properties of each material. A two-way slab design adhering to codes and standards is executed with both materials. The performance is analyzed in terms of deflection, stability under loads, and displacement. Cost analyses accounting for material, durability, maintenance, and life cycle costs are also conducted. The findings provide insights into the economic and structural implications of each material for reinforcement selection and recommendations on the most suitable material based on the analysis.
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
This document discusses a study analyzing the effect of camber, position of camber, and angle of attack on the aerodynamic characteristics of airfoils. Sixteen modified asymmetric NACA airfoils were analyzed using computational fluid dynamics (CFD) by varying the camber, camber position, and angle of attack. The results showed the relationship between these parameters and the lift coefficient, drag coefficient, and lift to drag ratio. This provides insight into how changes in airfoil geometry impact aerodynamic performance.
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
This document reviews the progress and challenges of aluminum-based metal matrix composites (MMCs), focusing on their fabrication processes and applications. It discusses how various aluminum MMCs have been developed using reinforcements like borides, carbides, oxides, and nitrides to improve mechanical and wear properties. These composites have gained prominence for their lightweight, high-strength and corrosion resistance properties. The document also examines recent advancements in fabrication techniques for aluminum MMCs and their growing applications in industries such as aerospace and automotive. However, it notes that challenges remain around issues like improper mixing of reinforcements and reducing reinforcement agglomeration.
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
This document discusses research on using graph neural networks (GNNs) for dynamic optimization of public transportation networks in real-time. GNNs represent transit networks as graphs with nodes as stops and edges as connections. The GNN model aims to optimize networks using real-time data on vehicle locations, arrival times, and passenger loads. This helps increase mobility, decrease traffic, and improve efficiency. The system continuously trains and infers to adapt to changing transit conditions, providing decision support tools. While research has focused on performance, more work is needed on security, socio-economic impacts, contextual generalization of models, continuous learning approaches, and effective real-time visualization.
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
This document summarizes a research project that aims to compare the structural performance of conventional slab and grid slab systems in multi-story buildings using ETABS software. The study will analyze both symmetric and asymmetric building models under various loading conditions. Parameters like deflections, moments, shears, and stresses will be examined to evaluate the structural effectiveness of each slab type. The results will provide insights into the comparative behavior of conventional and grid slabs to help engineers and architects select appropriate slab systems based on building layouts and design requirements.
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
This document summarizes and reviews a research paper on the seismic response of reinforced concrete (RC) structures with plan and vertical irregularities, with and without infill walls. It discusses how infill walls can improve or reduce the seismic performance of RC buildings, depending on factors like wall layout, height distribution, connection to the frame, and relative stiffness of walls and frames. The reviewed research paper analyzes the behavior of infill walls, effects of vertical irregularities, and seismic performance of high-rise structures under linear static and dynamic analysis. It studies response characteristics like story drift, deflection and shear. The document also provides literature on similar research investigating the effects of infill walls, soft stories, plan irregularities, and different
This document provides a review of machine learning techniques used in Advanced Driver Assistance Systems (ADAS). It begins with an abstract that summarizes key applications of machine learning in ADAS, including object detection, recognition, and decision-making. The introduction discusses the integration of machine learning in ADAS and how it is transforming vehicle safety. The literature review then examines several research papers on topics like lightweight deep learning models for object detection and lane detection models using image processing. It concludes by discussing challenges and opportunities in the field, such as improving algorithm robustness and adaptability.
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
The document analyzes temperature and precipitation trends in Asosa District, Benishangul Gumuz Region, Ethiopia from 1993 to 2022 based on data from the local meteorological station. The results show:
1) The average maximum and minimum annual temperatures have generally decreased over time, with maximum temperatures decreasing by a factor of -0.0341 and minimum by -0.0152.
2) Mann-Kendall tests found the decreasing temperature trends to be statistically significant for annual maximum temperatures but not for annual minimum temperatures.
3) Annual precipitation in Asosa District showed a statistically significant increasing trend.
The conclusions recommend development planners account for rising summer precipitation and declining temperatures in
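The Mann-Kendall test mentioned in the results can be illustrated with a minimal sketch. It assumes a series with no tied values (so no tie correction is applied) and uses the normal approximation for the two-sided p-value. The function name `mann_kendall` and the sample series are hypothetical and are not the study's data.

```python
import math


def mann_kendall(series):
    """Return (S, Z, p) for the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")

    # S counts later-minus-earlier pairs: +1 for an increase, -1 for a decrease.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null hypothesis of no trend (assumes no ties).
    var_s = n * (n - 1) * (2 * n + 5) / 18

    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p


# Synthetic, strictly increasing series used only for illustration.
rising = [10.0, 10.4, 10.9, 11.1, 11.8, 12.0, 12.5, 12.9, 13.4, 13.6]
s_up, z_up, p_up = mann_kendall(rising)
s_down, z_down, p_down = mann_kendall(list(reversed(rising)))
```

A positive S with a small p-value suggests an increasing trend, and a negative S with a small p-value suggests a decreasing one. A common (but not universal) significance threshold is p < 0.05.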
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
This document discusses the design and analysis of pre-engineered building (PEB) framed structures using STAAD Pro software. It provides an overview of PEBs, including that they are designed off-site with building trusses and beams produced in a factory. STAAD Pro is identified as a key tool for modeling, analyzing, and designing PEBs to ensure their performance and safety under various load scenarios. The document outlines modeling structural parts in STAAD Pro, evaluating structural reactions, assigning loads, and following international design codes and standards. In summary, STAAD Pro is used to design and analyze PEB framed structures to ensure safety and code compliance.
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
This document provides a review of research on innovative fiber integration methods for reinforcing concrete structures. It discusses studies that have explored using carbon fiber reinforced polymer (CFRP) composites with recycled plastic aggregates to develop more sustainable strengthening techniques. It also examines using ultra-high performance fiber reinforced concrete to improve shear strength in beams. Additional topics covered include the dynamic responses of FRP-strengthened beams under static and impact loads, and the performance of preloaded CFRP-strengthened fiber reinforced concrete beams. The review highlights the potential of fiber composites to enable more sustainable and resilient construction practices.
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
This document summarizes a survey on securing patient healthcare data in cloud-based systems. It discusses using technologies like facial recognition, smart cards, and cloud computing combined with strong encryption to securely store patient data. The survey found that healthcare professionals believe digitizing patient records and storing them in a centralized cloud system would improve access during emergencies and enable more efficient care compared to paper-based systems. However, ensuring privacy and security of patient data is paramount as healthcare incorporates these digital technologies.
Review on studies and research on widening of existing concrete bridgesIRJET Journal
This document summarizes several studies that have been conducted on widening existing concrete bridges. It describes a study from China that examined load distribution factors for a bridge widened with composite steel-concrete girders. It also outlines challenges and solutions for widening a bridge in the UAE, including replacing bearings and stitching the new and existing structures. Additionally, it discusses two bridge widening projects in New Zealand that involved adding precast beams and stitching to connect structures. Finally, safety measures and challenges for strengthening a historic bridge in Switzerland under live traffic are presented.
React based fullstack edtech web applicationIRJET Journal
The document describes the architecture of an educational technology web application built using the MERN stack. It discusses the frontend developed with ReactJS, backend with NodeJS and ExpressJS, and MongoDB database. The frontend provides dynamic user interfaces, while the backend offers APIs for authentication, course management, and other functions. MongoDB enables flexible data storage. The architecture aims to provide a scalable, responsive platform for online learning.
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
This paper proposes integrating Internet of Things (IoT) and blockchain technologies to help implement objectives of India's National Education Policy (NEP) in the education sector. The paper discusses how blockchain could be used for secure student data management, credential verification, and decentralized learning platforms. IoT devices could create smart classrooms, automate attendance tracking, and enable real-time monitoring. Blockchain would ensure integrity of exam processes and resource allocation, while smart contracts automate agreements. The paper argues this integration has potential to revolutionize education by making it more secure, transparent and efficient, in alignment with NEP goals. However, challenges like infrastructure needs, data privacy, and collaborative efforts are also discussed.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
This document provides a review of research on the performance of coconut fibre reinforced concrete. It summarizes several studies that tested different volume fractions and lengths of coconut fibres in concrete mixtures with varying compressive strengths. The studies found that coconut fibre improved properties like tensile strength, toughness, crack resistance, and spalling resistance compared to plain concrete. Volume fractions of 2-5% and fibre lengths of 20-50mm produced the best results. The document concludes that using a 4-5% volume fraction of coconut fibres 30-40mm in length with M30-M60 grade concrete would provide benefits based on previous research.
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
The document discusses optimizing business management processes through automation using Microsoft Power Automate and artificial intelligence. It provides an overview of Power Automate's key components and features for automating workflows across various apps and services. The document then presents several scenarios applying automation solutions to common business processes like data entry, monitoring, HR, finance, customer support, and more. It estimates the potential time and cost savings from implementing automation for each scenario. Finally, the conclusion emphasizes the transformative impact of AI and automation tools on business processes and the need for ongoing optimization.
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
The document describes the seismic design of a G+5 steel building frame located in Roorkee, India according to Indian codes IS 1893-2002 and IS 800. The frame was analyzed using the equivalent static load method and the response spectrum method, and the resulting displacements and shear forces were compared. Based on the analysis, the frame was designed as a seismic-resistant steel structure according to IS 800:2007. The software STAAD Pro was used for the analysis and design.
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
This research paper explores using plastic waste as a sustainable and cost-effective construction material. The study focuses on manufacturing pavers and bricks using recycled plastic and partially replacing concrete with plastic alternatives. Initial results found that pavers and bricks made from recycled plastic demonstrate comparable strength and durability to traditional materials while providing environmental and cost benefits. Additionally, preliminary research indicates incorporating plastic waste as a partial concrete replacement significantly reduces construction costs without compromising structural integrity. The outcomes suggest adopting plastic waste in construction can address plastic pollution while optimizing costs, promoting more sustainable building practices.
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
TIME DIVISION MULTIPLEXING TECHNIQUE FOR COMMUNICATION SYSTEMHODECEDSIET
Time Division Multiplexing (TDM) is a method of transmitting multiple signals over a single communication channel by dividing transmission time into many short segments, called time slots. These time slots are then allocated to different data streams, allowing multiple signals to share the same transmission medium efficiently. TDM is widely used in telecommunications and data communication systems.
### How TDM Works
1. **Time Slots Allocation**: The core principle of TDM is to assign distinct time slots to each signal. During each time slot, the respective signal is transmitted, and then the process repeats cyclically. For example, if there are four signals to be transmitted, the TDM cycle will divide time into four slots, each assigned to one signal.
2. **Synchronization**: Synchronization is crucial in TDM systems to ensure that the signals are correctly aligned with their respective time slots. Both the transmitter and receiver must be synchronized to avoid any overlap or loss of data. This synchronization is typically maintained by a clock signal that ensures time slots are accurately aligned.
3. **Frame Structure**: TDM data is organized into frames, where each frame consists of a set of time slots. Each frame is repeated at regular intervals, ensuring continuous transmission of data streams. The frame structure helps in managing the data streams and maintaining the synchronization between the transmitter and receiver.
4. **Multiplexer and Demultiplexer**: At the transmitting end, a multiplexer combines multiple input signals into a single composite signal by assigning each signal to a specific time slot. At the receiving end, a demultiplexer separates the composite signal back into individual signals based on their respective time slots.
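
The four steps above can be sketched as a small round-robin multiplexer and demultiplexer. This is a minimal illustration, not a model of any particular system: the function names, the use of `None` as an idle-slot marker, and the sample streams are all assumptions, and synchronization is taken for granted because both sides simply share the slot order.

```python
from itertools import zip_longest


def tdm_multiplex(streams, idle=None):
    """Build frames with one time slot per input stream, in fixed order."""
    frames = []
    # Each pass takes the next unit from every stream; exhausted streams
    # still occupy their slot, filled with the idle marker.
    for slots in zip_longest(*streams, fillvalue=idle):
        frames.append(list(slots))
    return frames


def tdm_demultiplex(frames, n_streams, idle=None):
    """Route each slot back to its stream based on slot position."""
    outputs = [[] for _ in range(n_streams)]
    for frame in frames:
        for slot_index, unit in enumerate(frame):
            if unit is not idle:
                outputs[slot_index].append(unit)
    return outputs


# Four hypothetical input signals, as in the four-slot example above.
streams = [["A1", "A2", "A3"], ["B1", "B2"], ["C1", "C2", "C3"], ["D1"]]
frames = tdm_multiplex(streams)
recovered = tdm_demultiplex(frames, len(streams))
```

Here each inner list of `frames` is one frame, and the position of a unit inside the frame is its time slot, which is all the demultiplexer needs to separate the signals again.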
### Types of TDM
1. **Synchronous TDM**: In synchronous TDM, time slots are pre-assigned to each signal, regardless of whether the signal has data to transmit or not. This can lead to inefficiencies if some time slots remain empty due to the absence of data.
2. **Asynchronous TDM (or Statistical TDM)**: Asynchronous TDM addresses the inefficiencies of synchronous TDM by allocating time slots dynamically based on the presence of data. Time slots are assigned only when there is data to transmit, which optimizes the use of the communication channel.
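
The difference between the two types can be made concrete by comparing slot utilization on the same traffic. The sketch below is illustrative only: the function names, the queue contents, and the choice of two slots per statistical frame are assumptions. It also ignores the addressing overhead that statistical TDM needs in practice, since each dynamically assigned slot must identify its source.

```python
def synchronous_tdm(queues):
    """Fixed assignment: every source owns one slot per frame, used or not."""
    queues = [list(q) for q in queues]
    frames = []
    while any(queues):
        frames.append([q.pop(0) if q else None for q in queues])
    return frames


def statistical_tdm(queues, slots_per_frame):
    """Dynamic assignment: only sources with data receive a slot."""
    if slots_per_frame < 1:
        raise ValueError("slots_per_frame must be at least 1")
    queues = [list(q) for q in queues]
    frames = []
    start = 0  # rotate the scan start so no source is always served first
    n = len(queues)
    while any(queues):
        frame = []
        for offset in range(n):
            src = (start + offset) % n
            if queues[src] and len(frame) < slots_per_frame:
                # Tag the slot with its source index (the "address").
                frame.append((src, queues[src].pop(0)))
        start = (start + 1) % n
        frames.append(frame)
    return frames


def utilization(frames, slots_per_frame):
    """Fraction of transmitted slots that actually carried data."""
    used = sum(1 for frame in frames for slot in frame if slot is not None)
    return used / (len(frames) * slots_per_frame)


# Hypothetical bursty traffic: one source is silent, others are uneven.
queues = [["a1", "a2", "a3", "a4"], [], ["c1"], ["d1", "d2"]]
sync_frames = synchronous_tdm(queues)
stat_frames = statistical_tdm(queues, slots_per_frame=2)
sync_util = utilization(sync_frames, slots_per_frame=4)
stat_util = utilization(stat_frames, slots_per_frame=2)
```

With this traffic, the synchronous scheme sends many empty slots for the silent and short sources, while the statistical scheme fills most of a narrower frame.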
### Applications of TDM
- **Telecommunications**: TDM is extensively used in telecommunication systems, such as in T1 and E1 lines, where multiple telephone calls are transmitted over a single line by assigning each call to a specific time slot.
- **Digital Audio and Video Broadcasting**: TDM is used in broadcasting systems to transmit multiple audio or video streams over a single channel, ensuring efficient use of bandwidth.
- **Computer Networks**: TDM is used in network protocols and systems to manage the transmission of data from multiple sources over a single network medium.
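
As a worked example for the T1 and E1 lines mentioned above, the line rates follow directly from the frame structure. The sketch uses the standard figures: 8000 frames per second, 8 bits per time slot, 24 slots plus one framing bit for T1, and 32 slots for E1. The helper name `line_rate_bps` is hypothetical.

```python
FRAMES_PER_SECOND = 8000  # one frame per 125-microsecond voice sampling interval
BITS_PER_SLOT = 8         # one 8-bit sample per channel per frame


def line_rate_bps(slots, framing_bits=0):
    """Line rate in bit/s for a frame of `slots` time slots plus framing bits."""
    bits_per_frame = slots * BITS_PER_SLOT + framing_bits
    return bits_per_frame * FRAMES_PER_SECOND


channel_rate = line_rate_bps(1)             # a single 64 kbit/s channel
t1_rate = line_rate_bps(24, framing_bits=1) # 24 channels + 1 framing bit
e1_rate = line_rate_bps(32)                 # 32 slots, framing carried in slot 0
```

This gives 64 kbit/s per channel, 1.544 Mbit/s for T1, and 2.048 Mbit/s for E1.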
### Advantages of TDM
- **Efficient Use of Bandwidth**: TDM all
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. First, a doubly fed induction generator model was constructed. A control law is then formulated to govern the flow of energy between the stator of the DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their results are compared in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations. MATLAB/Simulink was used to conduct the simulations. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODELgerogepatton
As digital technology becomes more deeply embedded in power systems, protecting the communication networks of Smart Grids (SG) has emerged as a critical concern. Distributed Network Protocol 3 (DNP3) is a multi-tiered application layer protocol extensively used in Supervisory Control and Data Acquisition (SCADA)-based smart grids to facilitate real-time data gathering and control. Because these networks are interconnected, they are vulnerable to a variety of cyberattacks, so robust Intrusion Detection Systems (IDS) are necessary for early threat detection and mitigation. To address this issue, this paper develops a hybrid Deep Learning (DL) model specifically designed for intrusion detection in smart grids. The proposed approach combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network. We employed a recent intrusion detection dataset (DNP3), which focuses on unauthorized commands and Denial of Service (DoS) cyberattacks, to train and test our model. The results of our experiments show that our CNN-LSTM method is much better at detecting smart grid intrusions than other deep learning algorithms used for classification. In addition, our proposed approach improves accuracy, precision, recall, and F1 score, achieving a detection accuracy of 99.50%.
Introduction - e-waste - definition - sources of e-waste - hazardous substances in e-waste - effects of e-waste on environment and human health - need for e-waste management - e-waste handling rules - waste minimization techniques for managing e-waste - recycling of e-waste - disposal treatment methods of e-waste - mechanism of extraction of precious metal from leaching solution - global scenario of e-waste - e-waste in India - case studies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet has pushed the United Nations and governments to promote green energy and electric transportation. Deployments of photovoltaic (PV) and electric vehicle (EV) systems have gained momentum due to their numerous advantages over fossil-fuel alternatives. These advantages go beyond sustainability to include financial support and stability. This paper introduces a hybrid PV and EV system to support industrial and commercial plants. It covers the theoretical framework of the proposed hybrid system, including the equations required to complete the cost analysis when PV and EV are present. In addition, it presents the proposed design diagram, which sets the priorities and requirements of the system. The proposed approach allows installations to improve their power stability, especially during power outages. The presented information helps researchers and plant owners complete the necessary analysis while promoting the deployment of clean energy. The results of a case study of a dairy farmer support the theoretical work and highlight the approach's benefits to existing plants. The short return on investment supports the novelty of the proposed approach for sustainable electrical systems. In addition, the proposed system allows for an isolated power setup without the need for a transmission line, which enhances the safety of the electrical network.
ACEP Magazine edition 4th launched on 05.06.2024Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on lifetime achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper functioning of integrated circuits (ICs) in a harsh electromagnetic environment has been a serious concern throughout the decades of revolution in electronics, from discrete devices to today's integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry, and smart vehicles in particular, face design issues such as susceptibility to electromagnetic interference (EMI). Because of EMI, electronic control devices calculate incorrect outputs and sensors give misleading values, which can prove fatal in automotive applications. In this paper, the authors review, non-exhaustively, research work concerned with the investigation of EMI in ICs and the prediction of this EMI using various modelling methodologies and measurement setups.