This document summarizes a student's final thesis presentation on storing and analyzing synchrophasor data from phasor measurement units (PMUs) for event detection using Hadoop. The student proposes a new distributed storage and computation model using Apache Hadoop, HBase, and MapReduce to address the challenges of storing and processing large volumes of high-frequency PMU data in a parallelized manner, as an improvement over the existing centralized software system. The presentation provides background on PMUs, India's power grid monitoring initiatives, and shows how the new model is designed to store and analyze PMU data for fault detection and grid monitoring.
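For time-series stores like HBase, the row-key layout largely determines whether one PMU's samples can be scanned contiguously. The sketch below shows one common key design (PMU id followed by a zero-padded timestamp); the key format is an illustrative assumption, since the thesis's actual schema is not given in the summary.

```python
# Illustrative sketch of a time-series row-key layout for PMU samples
# in an HBase-style store (the key format is an assumption for
# illustration; the thesis's actual schema is not shown in the summary).

def make_row_key(pmu_id, epoch_ms):
    """Compose a scan-friendly key: PMU id, then zero-padded timestamp,
    so one PMU's samples sort contiguously in time order."""
    return f"{pmu_id:>8}|{epoch_ms:013d}"

keys = [make_row_key("pmu-042", t) for t in (1000, 999, 1001)]
# The smallest key is the earliest sample for that PMU:
print(sorted(keys)[0])
```

Because the keys for one PMU share a common prefix and sort by time, a range scan over `pmu-042|<start>` to `pmu-042|<end>` retrieves a time window without touching other devices' rows.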
Tdtd-Edr: Time Orient Delay Tolerant Density Estimation Technique Based Data ... – theijes
The International Journal of Engineering & Science aims to provide a platform for researchers, engineers, scientists, and educators to publish their original research results, exchange new ideas, and disseminate information on innovative designs, engineering experiences, and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal are blind peer-reviewed; only original articles are published.
Papers for publication in The International Journal of Engineering & Science are selected through rigorous peer review to ensure originality, timeliness, relevance, and readability.
Detection of malicious attacks by Meta classification algorithms – Eswar Publications
We address the problem of malicious node detection in a network based on characteristics of the network's behavior. This issue has motivated a challenging body of recent research, since it contributes a critical component to securing the network, and the solution strategies continue to evolve. In this work, we construct learning models carefully, with cautious selection of attributes, parameter thresholds, and number of iterations. We present an approach to evaluate the performance of a set of meta classifier algorithms (AdaBoost, Attribute Selected Classifier, Bagging, Classification via Regression, Filtered Classifier, LogitBoost, and Multiclass Classifier). The ratio between training and testing data is chosen so that the data patterns in both sets are compatible. This set of supervised machine learning schemes with meta classifiers was applied to the selected dataset to predict the attack risk of the network environment. The trained models can then be used to predict attack risk in a web server environment, or by any network administrator or security expert. The prediction accuracy of the classifiers was evaluated using 10-fold cross-validation, and the results were compared.
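The 10-fold cross-validation protocol used above can be sketched in a few lines of plain Python. The "classifier" here is a deliberately trivial stand-in (a majority-class rule), not one of the paper's meta classifiers; the point is only the fold/train/test accounting.

```python
# Minimal sketch of k-fold cross-validation (k = 10), as used in the
# abstract above to estimate prediction accuracy. The "classifier" is a
# placeholder majority-class rule; a real meta classifier (AdaBoost,
# Bagging, ...) would be plugged in at the same point.

def k_fold_indices(n, k=10):
    """Split range(n) into k contiguous, nearly equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def majority_class(labels):
    """Placeholder 'training': remember the most common label."""
    return max(set(labels), key=labels.count)

def cross_validate(X, y, k=10):
    """Return mean accuracy over k train/test splits."""
    folds = k_fold_indices(len(X), k)
    accuracies = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in test_set]
        model = majority_class([y[i] for i in train_idx])
        correct = sum(1 for i in test_idx if y[i] == model)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Toy dataset: 20 samples, 70% labeled "normal", 30% "attack".
y = ["normal"] * 14 + ["attack"] * 6
X = list(range(len(y)))
print(round(cross_validate(X, y, k=10), 2))  # → 0.7
```

Each of the 10 folds serves once as the held-out test set, and the reported accuracy is the mean across folds, which is exactly how the abstract's 10-fold evaluation averages out per-split variance.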
Optimization of workload prediction based on map reduce frame work in a cloud... – eSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars, and students of related fields of Engineering and Technology.
Impact of GPS Signal Loss and Spoofing on Power System Synchrophasor Applicat... – Luigi Vanfretti
This presentation shows an experimental assessment of the impact of time synchronization spoofing attacks (TSSA) on synchrophasor-based Wide-Area Monitoring, Protection and Control applications. Phase Angle Monitoring (PAM), anti-islanding protection, and power oscillation damping applications are investigated. TSSA are created using a real-time IRIG-B signal generator, and power system models are executed using a real-time simulator with commercial phasor measurement units (PMUs) coupled to them as hardware-in-the-loop. Because PMUs utilize time synchronization signals to compute synchrophasors, an error in the PMUs' time input introduces a proportional phase error in the voltage or current phase measurements provided by the PMU. The experiments conclude that a phase angle monitoring application will show erroneous power transfers, whereas the anti-islanding protection mal-operates and the damping controller introduces negative damping in the system, as a result of the time synchronization error incurred in the PMUs due to TSSA.
The proposed test-bench and TSSA approach can be used to investigate the impact of TSSA on any WAMPAC application and to determine the time synchronization error threshold that can be tolerated by these WAMPAC applications.
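The "proportional phase error" in the abstract above follows directly from the definition of a synchrophasor: a time offset Δt at frequency f shifts the measured phase by Δθ = 360·f·Δt degrees. A minimal numeric sketch (the 50 Hz frequency and 1 ms offset are illustrative values, not from the presentation):

```python
def phase_error_deg(time_error_s, freq_hz):
    """Phase error (degrees) induced in a synchrophasor by a time
    synchronization error: delta_theta = 360 * f * delta_t."""
    return 360.0 * freq_hz * time_error_s

# A 1 ms spoofed time offset on a 50 Hz system:
print(phase_error_deg(1e-3, 50.0))  # → 18.0
```

An 18-degree phase shift is far beyond normal inter-area angle differences, which is why a PAM application fed by a spoofed PMU reports erroneous power transfers.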
Auto-scaling Techniques for Elastic Data Stream Processing – Zbigniew Jerzak
An elastic data stream processing system is able to handle changes in workload by dynamically scaling out and scaling in. This allows unexpected load spikes to be handled without constant overprovisioning. One of the major challenges for an elastic system is to find the right point in time to scale in or scale out. Finding such a point is difficult, as it depends on constantly changing workload and system characteristics. In this paper we investigate the application of different auto-scaling techniques to this problem. Specifically: (1) we formulate basic requirements for an auto-scaling technique used in an elastic data stream processing system, (2) we use the formulated requirements to select the best auto-scaling techniques, and (3) we evaluate the selected auto-scaling techniques using real-world data. Our experiments show that the auto-scaling techniques used in existing elastic data stream processing systems perform worse than the strategies used in our work.
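The baseline that such papers typically compare against is a reactive, threshold-based rule. The sketch below illustrates that style of auto-scaler; the thresholds and the hysteresis band are illustrative assumptions, not values from the paper.

```python
# Toy reactive auto-scaler: scale out when utilization exceeds an upper
# threshold, scale in below a lower one. The gap between the two
# thresholds (hysteresis) avoids oscillating on small load changes.

def decide_scaling(utilization, hosts, high=0.8, low=0.3):
    """Return the new host count for one observed utilization sample."""
    if utilization > high:
        return hosts + 1          # scale out
    if utilization < low and hosts > 1:
        return hosts - 1          # scale in
    return hosts                  # stay within the hysteresis band

# Replay a short utilization trace against the rule:
hosts = 2
for u in [0.5, 0.9, 0.85, 0.4, 0.2, 0.2]:
    hosts = decide_scaling(u, hosts)
print(hosts)  # → 2
```

A purely reactive rule like this only responds after the spike has happened, which is one reason predictive auto-scaling techniques can outperform it on streaming workloads.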
DGBSA: A BATCH JOB SCHEDULING ALGORITHM WITH GA WITH REGARD TO THE THRESHOLD ... – IJCSEA Journal
In this paper, we provide a scheduler for batch jobs that uses a genetic algorithm (GA) together with a threshold detector. The proposed algorithm schedules independent batch jobs with a new technique so that the schedule can be optimized. To do this, we use a threshold detector; among the selected jobs, processing resources then process batch jobs by priority. The hierarchy of tasks in each batch is also determined using the DGBSA algorithm. Building on previous work, we provide an algorithm that adds specific parameters to the fitness functions of earlier algorithms, yielding the optimized fitness function used in the proposed algorithm. According to our assessment, the DGBSA algorithm performs better than similar algorithms. The effective parameters used in the proposed algorithm reduce the total wasted time compared with previous algorithms, and the algorithm improves on earlier problems in batch processing with a new technique.
REDUCING THE COGNITIVE LOAD ON ANALYSTS THROUGH HAMMING DISTANCE BASED ALERT ... – IJNSA Journal
Previous work introduced the idea of grouping alerts at a Hamming distance of 1 to achieve lossless alert aggregation; such aggregated meta-alerts were shown to increase alert interpretability. However, a mean of 84,023 daily Snort alerts was reduced to a still formidable 14,099 meta-alerts. In this work, we address this limitation by investigating several approaches that all contribute towards reducing the burden on the analyst and providing timely analysis. We explore minimizing the number of both alerts and data elements by aggregating at Hamming distances greater than 1. We show how increasing bin sizes can improve aggregation rates. And we provide a new aggregation algorithm that operates up to an order of magnitude faster at Hamming distance 1. Lastly, we demonstrate the broad applicability of this approach through empirical analysis of Windows security alerts, Snort alerts, netflow records, and DNS logs. The result is a reduction in the cognitive load on analysts by minimizing the overall number of alerts and the number of data elements that need to be reviewed in order for an analyst to evaluate the set of original alerts.
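The Hamming-distance grouping described above can be sketched by modeling each alert as a fixed-length tuple of fields and merging alerts that differ in at most one field. The field names and the greedy merge strategy below are illustrative assumptions, not the paper's algorithm.

```python
# Sketch of alert aggregation at Hamming distance 1. An alert is a
# tuple of field values; each meta-alert keeps all member alerts, so
# the aggregation remains lossless (originals are recoverable).

def hamming(a, b):
    """Number of positions at which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def aggregate(alerts, max_distance=1):
    """Greedily merge each alert into the first meta-alert whose
    representative is within max_distance; otherwise start a new one."""
    metas = []  # list of (representative_alert, [member_alerts])
    for alert in alerts:
        for rep, members in metas:
            if hamming(alert, rep) <= max_distance:
                members.append(alert)
                break
        else:
            metas.append((alert, [alert]))
    return metas

# Fields: (signature, src_ip, dst_port) -- illustrative, not from the paper.
alerts = [
    ("scan", "10.0.0.1", 22),
    ("scan", "10.0.0.2", 22),   # differs from the representative in one field
    ("scan", "10.0.0.3", 22),   # differs from the representative in one field
    ("dos",  "10.0.0.9", 80),   # differs in all three fields
]
print(len(aggregate(alerts)))  # → 2
```

Raising `max_distance` above 1 merges more aggressively, which mirrors the paper's trade-off: fewer meta-alerts for the analyst at the cost of coarser groups.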
Wireless data broadcast is an efficient way of disseminating data to users in mobile computing environments. From the server's point of view, how to place data items on channels is a crucial issue, with the objective of minimizing the average access time and tuning time. Similarly, how to schedule the data retrieval process for a given request at the client side, such that all the requested items can be downloaded in a short time, is also an important problem. In this paper, we investigate multi-item data retrieval scheduling in push-based multichannel broadcast environments. The most important issues in mobile computing are energy efficiency and query response efficiency; however, in data broadcast the objectives of reducing access latency and energy cost can be contradictory. Consequently, we define two new problems: the Minimum Cost Data Retrieval (MCDR) problem and the Large Number Data Retrieval (LNDR) problem. We also develop a heuristic algorithm to download a large number of items efficiently. When there is no replicated item in a broadcast cycle, we show that an optimal retrieval schedule can be obtained in polynomial time.
Job Resource Ratio Based Priority Driven Scheduling in Cloud Computing – ijsrd.com
Cloud computing is an emerging technology in the area of parallel and distributed computing. Clouds consist of a collection of virtualized resources, including both computational and storage facilities, that can be provisioned on demand depending on users' needs. Job scheduling is one of the major activities performed in all computing environments, and in cloud computing it is performed to use the environment efficiently and gain maximum profit. In this paper we propose a new scheduling algorithm based on priority, where the priority is based on the ratio of job to resource. To calculate the priority of a job we use the analytic hierarchy process. We also compare results with other algorithms such as first-come-first-serve and round robin.
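The core idea of ratio-driven priority scheduling can be sketched as follows. The abstract does not spell out how the job-to-resource ratio is computed (the paper derives priorities via the analytic hierarchy process), so the ratio below (requested units over available units) is an illustrative assumption.

```python
# Toy priority scheduler: jobs are ordered by a job-to-resource ratio,
# here (requested CPU units) / (units the cloud currently has free).
# In this sketch, a higher ratio means the job is more demanding
# relative to capacity and is scheduled first.

def schedule_by_ratio(jobs, available_units):
    """Return job names ordered by descending job/resource ratio."""
    ranked = sorted(jobs, key=lambda j: j["units"] / available_units,
                    reverse=True)
    return [j["name"] for j in ranked]

jobs = [
    {"name": "render", "units": 8},
    {"name": "backup", "units": 2},
    {"name": "etl",    "units": 5},
]
print(schedule_by_ratio(jobs, available_units=16))
# → ['render', 'etl', 'backup']
```

Unlike first-come-first-serve or round robin, which ignore resource demand entirely, any ratio-based ordering of this kind lets the scheduler weigh how heavy each job is relative to current capacity.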
An Improved Differential Evolution Algorithm for Data Stream Clustering – IJECEIAES
Several algorithms have been proposed for clustering data streams. Most of them require the number of clusters (K) to be fixed by the user based on the input data and kept fixed throughout the clustering process, and stream clustering has faced difficulties in picking K. In this paper, we propose an efficient approach for data stream clustering by adopting an Improved Differential Evolution (IDE) algorithm. The IDE algorithm is a fast, robust, and efficient global optimization approach for automatic clustering. In our approach, we additionally apply an entropy-based method for detecting concept drift in the data stream and thereby updating the clustering procedure online. We show that, compared with a Genetic Algorithm, the proposed method is the more proficient optimization algorithm. The performance of the proposed technique is assessed: it achieves an accuracy of 92.29%, a precision of 86.96%, a recall of 90.30%, and an F-measure of 88.60%.
Assessing Software Reliability Using SPC – An Order Statistics Approach – IJCSEA Journal
There are many software reliability models that are based on the times of occurrences of errors in the debugging of software. It is shown that it is possible to do asymptotic likelihood inference for software reliability models based on order statistics or Non-Homogeneous Poisson Processes (NHPP), with asymptotic confidence levels for interval estimates of parameters. In particular, interval estimates from these models are obtained for the conditional failure rate of the software, given the data from the debugging process. The data can be grouped or ungrouped. For someone deciding when to market software, the conditional failure rate is an important parameter. Order statistics are used in a wide variety of practical situations; their use in characterization problems, detection of outliers, linear estimation, study of system reliability, life-testing, survival analysis, data compression, and many other fields is documented in many books. Statistical Process Control (SPC) can monitor the forecasting of software failures and thereby contribute significantly to the improvement of software reliability. Control charts are widely used for software process control in the software industry. In this paper we propose a control mechanism based on order statistics of the cumulative quantity between observations of time-domain failure data, using the mean value function of the Half Logistic Distribution (HLD) based on an NHPP.
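For concreteness, the mean value function of an HLD-based NHPP is usually written as the expected total number of failures scaled by the half-logistic CDF. The form below is a standard sketch with assumed notation (a for the expected total failure count, b for the scale parameter); it is not copied from the paper.

```latex
% NHPP mean value function based on the Half Logistic Distribution,
% m(t) = a F(t), with F the half-logistic CDF:
m(t) \;=\; a\,\frac{1 - e^{-bt}}{1 + e^{-bt}}, \qquad t \ge 0,\; a > 0,\; b > 0.
```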
Checkpoint and recovery protocols are commonly used in distributed applications for providing fault tolerance. A distributed system may require taking checkpoints from time to time to keep it free of arbitrary failures. In case of failure, the system rolls back to checkpoints where global consistency is preserved. Checkpointing is one of the fault-tolerant techniques used to recover from faults and restart jobs quickly, and algorithms for checkpointing on distributed systems have been under study for years.
It is known that checkpointing and rollback recovery are widely used techniques that allow a distributed computation to progress in spite of a failure. There are two fundamental approaches to checkpointing and recovery. One is the asynchronous approach, in which processes take their checkpoints independently; taking checkpoints is thus very simple, but the absence of a recent consistent global checkpoint may cause a rollback of the computation. The synchronous checkpointing approach assumes that a single process, other than the application processes, invokes the checkpointing algorithm periodically to determine a consistent global checkpoint.
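The synchronous (coordinated) approach described above can be sketched as a simple two-phase exchange: a coordinator asks every process for a tentative checkpoint and commits the global checkpoint only if all of them succeed. The process model below is an illustrative toy, not a protocol from the text.

```python
# Toy coordinated checkpointing: a coordinator collects a tentative
# checkpoint from each process and commits the global checkpoint only
# if every process succeeds, so the committed set is consistent by
# construction (no process is left with a stale committed state).

class Process:
    def __init__(self, name, state):
        self.name, self.state = name, state
        self.committed = None              # last committed checkpoint

    def take_tentative(self):
        """Record this process's current state as a tentative checkpoint."""
        return {"proc": self.name, "state": self.state}

def coordinate_checkpoint(processes):
    """Return the committed global checkpoint, or None on failure."""
    tentative = [p.take_tentative() for p in processes]
    if any(t is None for t in tentative):  # some process failed to record
        return None                        # discard; old checkpoints remain
    for p, t in zip(processes, tentative): # all succeeded: commit everywhere
        p.committed = t
    return tentative

procs = [Process("p1", {"x": 1}), Process("p2", {"x": 2})]
snapshot = coordinate_checkpoint(procs)
print(len(snapshot))  # → 2
```

A real protocol must also quiesce or log in-flight messages between the two phases; this sketch only captures the all-or-nothing commit that makes the global checkpoint consistent.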
Intelligent flood disaster warning on the fly: developing IoT-based managemen... – journalBEEI
The number of natural disasters occurring yearly is increasing at an alarming rate, which has caused great concern over the well-being of human lives and economic sustenance. The rainfall pattern has also been affected, and this has caused a large number of flood cases in recent times. Flood disasters are damaging to the economy and human lives; yearly, millions of people are affected by floods in Asia alone. This has brought the attention of governments to developing flood forecasting methods to reduce flood casualties. In this article, a flood mitigation method is evaluated that incorporates a miniaturized flow and water level sensor and a pressure gauge. The data from the two sensors are used to predict flood status using a 2-class neural network. Real-time streaming of the sensor data into a ThingSpeak channel was made possible with a NodeMCU ESP8266. Furthermore, Microsoft's Azure Machine Learning (AzureML) has a built-in 2-class neural network, which was used to predict flood status according to predefined rules. The prediction model was published as a web service through the AzureML service, which enables prediction as new data become available. The experimental results showed that using 3 hidden layers gives the highest accuracy, 98.9%, and a precision of 100% when the 2-class neural network is used.
Write a brief report (one or two paragraphs) that (1) define middl....docx – ericbrooks84875
Write a brief report (one or two paragraphs) that (1) defines middleware in your own words, (2) describes a daily or weekly activity you do that uses middleware, (3) names the software you use to do the activity, and (4) as specifically as possible, states what middleware software the activity uses and justifies your explanation.
Your goal is to convince me with your report that you understand what middleware is and what role it plays in the delivery of modern software applications.
Resources that you can use to help you answer this questions are
1. Middleware Concepts (This video mentioned three vendors of middleware such as Oracle, SAP, and IBM with Web Sphere).
2. The New Middleware (This is in a pdf file that I attached on this assignment.
3. Internet. You can use the web to help you understand what software the term middleware encompasses. Also don’t forget to cite your information from the web.
TH VV
U S I N G M I D D L E -
W A R E , C U S T O M E R S
C A N D E P L O Y C O S T -
E F F E C T I V E A N D
H I G H L Y
F U N C T I O N A L
C L I E N T / S E R V E R
A P P L I C A T I O N S
O N C E T H E Y W O R K
O U T T H E K I N K S .
BY RICH FINKELSTEIN
B
roadly speaking, middleware
ties application components
together. This definition en-
c o m p a s s e s too many prod-
ucts to cover in one article,
so instead I'll look at signifi-
cant products that can be used to manage
cross-platform connectivity in a database
environment. Within this context, mid-
dleware products provide several impor-
tant functions, including:
• d a t a a c c e s s to h e t e r o g e n e o u s data-
bases;
• g a t e w a y s to r e m o t e mainframe data-
bases;
• appficafion partitioning across multiple
hardware platforms; and
• distributed updates across a homoge-
neous or h e t e r o g e n e o u s set of relational
databases.
M a n y m i d d l e w a r e p r o d u c t s have ap-
p e a r e d o v e r the l a s t s e v e r a l y e a r s t h a t
have attempted to solve the various prob-
l e m s a s s o c i a t e d with a c c e s s i n g and up-
dating databases over local and wide area
n e t w o r k s (LANs and WANs). F o r the
most part, p r o g r e s s in deploying produc-
tion applications that utilize middleware
has been slow - - much slower than what
you m i g h t first think. C u s t o m e r refer-
e n c e s for m i d d l e w a r e p r o d u c t s are still
few and far between. M o s t applications
that use middleware s e e m to have a rela-
tively simple application profile; that is,
low v o l u m e s of read-only t r a n s a c t i o n s .
However, in some cases, middleware has
b e e n u s e d Successfully to s u p p o r t very
large, mission-critical systems. Develop-
ing t h e s e types of systems usually takes
a substantial amount of time, resources,
and c o m m i t m e n t ; in o t h e r w o r d s , you
m u s t have a .
Unit 8 - Information and Communication Technology (Paper I).pdfThiyagu K
This slides describes the basic concepts of ICT, basics of Email, Emerging Technology and Digital Initiatives in Education. This presentations aligns with the UGC Paper I syllabus.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...Levi Shapiro
Letter from the Congress of the United States regarding Anti-Semitism sent June 3rd to MIT President Sally Kornbluth, MIT Corp Chair, Mark Gorenberg
Dear Dr. Kornbluth and Mr. Gorenberg,
The US House of Representatives is deeply concerned by ongoing and pervasive acts of antisemitic
harassment and intimidation at the Massachusetts Institute of Technology (MIT). Failing to act decisively to ensure a safe learning environment for all students would be a grave dereliction of your responsibilities as President of MIT and Chair of the MIT Corporation.
This Congress will not stand idly by and allow an environment hostile to Jewish students to persist. The House believes that your institution is in violation of Title VI of the Civil Rights Act, and the inability or
unwillingness to rectify this violation through action requires accountability.
Postsecondary education is a unique opportunity for students to learn and have their ideas and beliefs challenged. However, universities receiving hundreds of millions of federal funds annually have denied
students that opportunity and have been hijacked to become venues for the promotion of terrorism, antisemitic harassment and intimidation, unlawful encampments, and in some cases, assaults and riots.
The House of Representatives will not countenance the use of federal funds to indoctrinate students into hateful, antisemitic, anti-American supporters of terrorism. Investigations into campus antisemitism by the Committee on Education and the Workforce and the Committee on Ways and Means have been expanded into a Congress-wide probe across all relevant jurisdictions to address this national crisis. The undersigned Committees will conduct oversight into the use of federal funds at MIT and its learning environment under authorities granted to each Committee.
• The Committee on Education and the Workforce has been investigating your institution since December 7, 2023. The Committee has broad jurisdiction over postsecondary education, including its compliance with Title VI of the Civil Rights Act, campus safety concerns over disruptions to the learning environment, and the awarding of federal student aid under the Higher Education Act.
• The Committee on Oversight and Accountability is investigating the sources of funding and other support flowing to groups espousing pro-Hamas propaganda and engaged in antisemitic harassment and intimidation of students. The Committee on Oversight and Accountability is the principal oversight committee of the US House of Representatives and has broad authority to investigate “any matter” at “any time” under House Rule X.
• The Committee on Ways and Means has been investigating several universities since November 15, 2023, when the Committee held a hearing entitled From Ivory Towers to Dark Corners: Investigating the Nexus Between Antisemitism, Tax-Exempt Universities, and Terror Financing. The Committee followed the hearing with letters to those institutions on January 10, 202
This slide is special for master students (MIBS & MIFB) in UUM. Also useful for readers who are interested in the topic of contemporary Islamic banking.
2024.06.01 Introducing a competency framework for languag learning materials ...Sandy Millin
http://sandymillin.wordpress.com/iateflwebinar2024
Published classroom materials form the basis of syllabuses, drive teacher professional development, and have a potentially huge influence on learners, teachers and education systems. All teachers also create their own materials, whether a few sentences on a blackboard, a highly-structured fully-realised online course, or anything in between. Despite this, the knowledge and skills needed to create effective language learning materials are rarely part of teacher training, and are mostly learnt by trial and error.
Knowledge and skills frameworks, generally called competency frameworks, for ELT teachers, trainers and managers have existed for a few years now. However, until I created one for my MA dissertation, there wasn’t one drawing together what we need to know and do to be able to effectively produce language learning materials.
This webinar will introduce you to my framework, highlighting the key competencies I identified from my research. It will also show how anybody involved in language teaching (any language, not just English!), teacher training, managing schools or developing language learning materials can benefit from using the framework.
Normal Labour/ Stages of Labour/ Mechanism of LabourWasim Ak
Normal labor is also termed spontaneous labor, defined as the natural physiological process through which the fetus, placenta, and membranes are expelled from the uterus through the birth canal at term (37 to 42 weeks
Executive Directors Chat Leveraging AI for Diversity, Equity, and InclusionTechSoup
Let’s explore the intersection of technology and equity in the final session of our DEI series. Discover how AI tools, like ChatGPT, can be used to support and enhance your nonprofit's DEI initiatives. Participants will gain insights into practical AI applications and get tips for leveraging technology to advance their DEI goals.
Operation “Blue Star” is the only event in the history of Independent India where the state went into war with its own people. Even after about 40 years it is not clear if it was culmination of states anger over people of the region, a political game of power or start of dictatorial chapter in the democratic setup.
The people of Punjab felt alienated from main stream due to denial of their just demands during a long democratic struggle since independence. As it happen all over the word, it led to militant struggle with great loss of lives of military, police and civilian personnel. Killing of Indira Gandhi and massacre of innocent Sikhs in Delhi and other India cities was also associated with this movement.
1. National Institute of Science & Technology
M.Tech Final Thesis Presentation
“STORAGE AND ANALYSIS OF SYNCHROPHASOR DATA FOR EVENT DETECTION USING HADOOP”
Presented by: G Hemanta Kumar
Roll No. CSE201390009, Registration No. 13072005
Under the supervision of Dr. Diptendu Sinha Roy, Associate Professor, Department of Computer Science & Engineering, National Institute of Science and Technology
2. Presentation Outline
Introduction
Motivation
Problem Statement
Contribution & Development of a New Model
Synchrophasors/Phasor Measurement Units (PMU)
Indian Power Grid & Synchrophasor Initiatives Program
PMU Data Sample
A New Proposed Model Designed and Implemented
PMU Data Sample Storage Model Designed
MapReduce Implementation of Effective Values Finding
Client Interaction for Power Grid Status Observation
Conclusion
Future Work
References
3. Introduction
A Power Grid (PG) is composed of three main components:
Generation, Transmission, Distribution
Modern PGs are massively interconnected, interacting networks monitored through a “Wide Area Measurement System (WAMS)”.
WAMS is logically the monitoring, decision-making & control network of the PG, a system in which electrical engineers and computer science engineers work together.
To keep the PG stable and accurately maintained, fast and reliably accurate decision-making information system software models are needed.
4. Motivation
Synchrophasors/PMUs are devices that measure electrical properties, characterized by high data sampling rates and time synchronization.
Due to their high sampling rate and rapid deployment, storing and processing PMU data samples is becoming a Big Data problem. Processing and storing such a vast amount of data is a major computational challenge.
NLDC uses a central software package called “SYNCHROWAVE Central Software” to collect the data samples generated by PMUs.
5. Problem Statement
SYNCHROWAVE Central Software is implemented on vertically scalable hardware units. Such hardware units are more expensive.
It stores all received data in a file-system-based database implemented over a non-distributed storage architecture.
It processes the stored PMU datasets in a non-parallel fashion.
6. Contribution & Development of a New Model
My idea is to develop a new operative software model for PG usage built on current computational and storage system architectures.
STORAGE SYSTEM module development:
I configured Apache Hadoop, which provides a distributed, reliable file system known as the Hadoop Distributed File System (HDFS).
For random access to PMU datasets with the timestamp as the key value, I configured Apache HBase over HDFS.
Apache HBase is a distributed non-relational database that supports multi-dimensional storage.
For storing PMU datasets I designed three multi-dimensional HBase tables, namely GRID_PROPERTIES, PMU_PROPERTIES and PMU_DATA_TREND.
7. Contribution & Development of a New Model…
COMPUTATIONAL SYSTEM module development:
A MapReduce implementation of a statistical analysis method was designed to obtain effective electrical properties over overlapping moving windows of a specific size.
Other programs were written to upload the PMU datasets into HBase.
An Apache Thrift gateway was configured so that this new distributed server can be accessed remotely, for PMU dataset record retrieval and analysis through an R client, a Java client, etc.
8. Block Diagram of My Proposed Model
[Block diagram: within HDFS, a NameNode and a Zookeeper coordinate several machines each running a Region Server + DataNode (distributed storage); MapReduce provides parallel processing over this cluster; the admin works through the HBase shell and HBase API over the file system, while a Thrift gateway exposes the server to R clients, Java clients and others.]
9. Synchrophasors/Phasor Measurement Unit (PMU)
A Synchrophasor/PMU is a synchronized power-signal measurement device deployed at a minimum voltage level of 110 kV (i.e. at sub-stations).
60 samples per second is the maximum sampling rate of a PMU (the rate is configurable).
PMUs have Ethernet & serial ports to connect to the WAMS communication network.
The messages PMUs send are defined in the IEEE C37.118.2 standard.
Generally, PMUs send their information to a PDC.
10. Indian Power Grid & Synchrophasor Initiatives Program
The Indian PG is divided into five regions, and those inter-dependent regions are centrally controlled by the National Load Dispatch Centre (NLDC) located at New Delhi:
North Regional Load Despatch Centre (NRLDC)
North East Regional Load Despatch Centre (NERLDC)
East Regional Load Despatch Centre (ERLDC)
South Regional Load Despatch Centre (SRLDC)
West Regional Load Despatch Centre (WRLDC)
The Indian PG started a Synchrophasor Initiatives Program under which 52 PMUs were deployed nationwide, all under the control of NLDC. NLDC uses a commercial central software package called SYNCHROWAVE Central Software to collect the data samples generated by the PMUs.
The PMU data samples are collected and transferred to a Regional Phasor Data Concentrator (PDC) located at each Regional Centre; from the Regional PDCs all samples are then transferred to the Central PDC located at NLDC. It is a hierarchical data collection structure.
11. SYNCHROWAVE Central Software & PMU Data Samples
SYNCHROWAVE Central Software provides both real-time situational awareness and historical analytics.
It includes four main components:
Historian: Connects to multiple synchrophasor sources (PMUs or PDCs) and stores all received data in a database (i.e. a file system).
Services: Connects the Historian to the central web-based applications (i.e. Central and Event Viewer).
Admin: Configuration application for the system.
Central and Event Viewer: The primary web interface.
12.
SYNCHROWAVE Central Software… (Block Diagram)
13. Collecting PMU Data Using SYNCHROWAVE Central Software
14. PMU Data Sample
PMU datasets consist of timestamp, voltage, current, frequency, phase angles, etc.
Each PMU's whole day of readings is recorded and stored in a separate file with the extension .swave by the SYNCHROWAVE software at NLDC.
Each day, 52 files are generated in the Historian storage.
Each file contains approximately 24*60*60*25 (= 2,160,000) records.
Each file has around 62 columns.
The first column holds the timestamp.
The timestamp format is “<YYYY/MM/DD HH:MM:SS:MMM>”.
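The per-file record count follows directly from the sampling rate, and the timestamp layout above can be parsed mechanically. A minimal sketch, assuming the 25 samples-per-second rate implied by the count (`parse_pmu_timestamp` is an illustrative helper, not part of the deck's software):

```python
from datetime import datetime

# Records per daily file: 24 h * 60 min * 60 s at 25 samples per second.
SAMPLES_PER_SECOND = 25
records_per_day = 24 * 60 * 60 * SAMPLES_PER_SECOND
print(records_per_day)  # 2160000

def parse_pmu_timestamp(ts: str) -> datetime:
    # The slide's "<YYYY/MM/DD HH:MM:SS:MMM>" layout; %f consumes the
    # millisecond field (strptime accepts 1-6 fractional digits).
    return datetime.strptime(ts, "%Y/%m/%d %H:%M:%S:%f")

t = parse_pmu_timestamp("2015/06/01 13:45:07:040")
print(t.microsecond)  # 40000
```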
16. Uses of PMU Data Samples for Fault Detection
Fault detection & classification: The main objectives are to understand the reasons that led to an event, the performance of the protective equipment, and the remedial actions taken to avoid its occurrence in the future.
Transmission line faults make up 85-87% of the total number of faults occurring in a power system.
Transmission line faults are classified as Single Line-to-Ground faults, Line-to-Line faults, Double Line-to-Ground faults and Three Phase faults.
17. Uses of PMU Data Samples for Fault Detection…
Researchers across the globe are working on fault detection and fault classification in order to develop smart auto-alarming control system models. Those models can be broadly classified into the following categories:
Complex mathematical optimization models
The Minimum Volume Enclosing Ellipsoid (MVEE) method, known as the ellipsoid model, by Makarov et al.
Statistical data analysis models
18. A New Proposed Model Designed and Implemented
This model is broadly divided into the following modules:
Storing PMU data samples into HBase tables
A MapReduce implementation of the computation finding the max & min effective values in each overlapping moving window
Plotting graphs to identify possible fault events using an R client
Infrastructure: an Apache Hadoop cluster, Apache HBase, and an Apache Thrift thread pool
19. A New Proposed Model Designed and Implemented…
[Data-flow diagram: a non-MapReduce program sinks the datasets from the Historian storage sink folder into the HBase table “GRID_PROPERTIES”; a MapReduce program computes the effective values and writes them into the HBase table “PMU_DATA_TREND”; an Apache Thrift thread pool then serves client access for plotting graphs and visualisation.]
20. Apache Hadoop Overview
Hadoop is an open-source framework that allows big data to be stored and processed in a distributed environment across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. The Hadoop framework includes the following modules:
Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data. HDFS uses a master/slave architecture where the master consists of a single NameNode that manages the file system metadata, and one or more slave DataNodes store the actual data.
Hadoop MapReduce: A system for the parallel processing of large data sets. The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per cluster node.
21.
Apache Hadoop Overview (Block Diagram)
22. Apache HBase Overview
HBase is an open-source, horizontally scalable, non-relational database model designed to provide quick random access to huge amounts of structured data.
HBase is a distributed column-oriented database where:
a table is a collection of rows;
a row is a collection of column families;
a column family is a collection of columns;
a column is a collection of key-value pairs (key: timestamp, value: cell value).
It provides low-latency access to single rows from billions of records.
The HBase framework includes the following components: Master Server, Region Servers, Zookeeper, the HBase client API and the HBase shell.
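The table → row → column family → column → timestamped-cell hierarchy above can be mimicked with plain dictionaries. A minimal sketch of the logical data model only; the `put`/`get_latest` helpers are illustrative and are not the HBase client API:

```python
# Toy model of HBase's logical layout:
# row key -> column family -> column -> {timestamp: value}
table = {}

def put(table, row_key, family, column, timestamp, value):
    # Each cell keeps multiple timestamped versions, as HBase cells do.
    cell = (table.setdefault(row_key, {})
                 .setdefault(family, {})
                 .setdefault(column, {}))
    cell[timestamp] = value

def get_latest(table, row_key, family, column):
    # HBase reads return the newest version of a cell by default.
    versions = table[row_key][family][column]
    return versions[max(versions)]

put(table, "2015/06/01 13:45:07:040", "PMU1", "volt", 1, 398.7)
put(table, "2015/06/01 13:45:07:040", "PMU1", "volt", 2, 399.1)
print(get_latest(table, "2015/06/01 13:45:07:040", "PMU1", "volt"))  # 399.1
```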
23.
Apache HBase Overview (Block Diagram)
24. HBase Tables Designed to Store PMU Data Samples
We designed three multi-dimensional tables on HBase:
GRID_PROPERTIES table:
It stores the readings of all 52 PMUs in a single record, which makes it easy to retrieve the power properties of the whole grid at a specific timestamp through a single record-selection query.
The timestamp is used as the RowKey.
Each column family represents a particular PMU, and its cells store that PMU's voltage, current, frequency, etc. readings.
PMU_PROPERTIES table:
It stores only the PMU identification properties, such as the installation voltage level (i.e. the voltage level of the grid station), region details, feeder details, supplier details, etc. It helps to identify the PMU.
PMU_DATA_TREND table:
It stores the computed output of the MapReduce program.
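Using the timestamp as the RowKey of GRID_PROPERTIES pays off because HBase keeps rows sorted by key, so a time-range query becomes a contiguous scan. An illustrative sketch of that property (not the HBase API; the sample values are made up):

```python
# Rows sorted by timestamp RowKey, as HBase stores them on disk.
rows = {
    "2015/06/01 13:45:07:000": {"PMU1": {"volt": 398.7, "fr": 49.98}},
    "2015/06/01 13:45:07:040": {"PMU1": {"volt": 399.1, "fr": 49.99}},
    "2015/06/01 13:45:07:080": {"PMU1": {"volt": 182.0, "fr": 49.70}},
}

def scan(rows, start_key, stop_key):
    # HBase's Scan(startRow, stopRow) walks sorted keys, start inclusive,
    # stop exclusive; plain string comparison works here because the
    # timestamp format is fixed-width.
    return {k: v for k, v in sorted(rows.items()) if start_key <= k < stop_key}

window = scan(rows, "2015/06/01 13:45:07:000", "2015/06/01 13:45:07:080")
print(len(window))  # 2
```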
25.
GRID_PROPERTIES HBase Table Model Designed
26.
GRID_PROPERTIES HBase Tables Implemented
GRID_PROPERTIES Table:
27.
PMU_PROPERTIES HBase Table Model Designed
28.
PMU_PROPERTIES HBase Tables Implemented
29.
PMU_DATA_TREND HBase Table Model Designed
30.
PMU_DATA_TREND HBase Table
31. Mathematical Representation of the Designed Model
We consider W, an overlapping moving data window, as a set of 300 records, W = {R1, R2, ..., Rn} where n = 300, i.e. |W| = 300.
R represents one record of this table. Each record covers the set of PMUs located in the grid (i.e. 52 PMUs) and is represented as R = {RowKey, P1, P2, ..., Pm-1} where m = 53, so |R| = 53 (the RowKey plus the 52 PMUs).
An individual PMU is represented by P, which is a column family. Each P has five cells, P = {volt, cur, vAng, cAng, fr},
where volt, cur, vAng, cAng and fr represent voltage, current, voltage phasor angle, current phasor angle and frequency, respectively.
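The equations themselves were on an image slide and did not survive extraction. A plausible reconstruction, under the assumption (suggested by the Max-Min and 3SD plots later in the deck) that each window yields the per-PMU maximum, minimum, their difference, the mean and the standard deviation of the voltage:

```latex
% Effective values of the voltage of PMU j over window W_k (|W_k| = 300);
% the same forms apply to current, frequency and the phase angles.
\begin{align*}
V^{\max}_{j,k} &= \max_{R \in W_k} R.P_j.\mathrm{volt} \\
V^{\min}_{j,k} &= \min_{R \in W_k} R.P_j.\mathrm{volt} \\
\Delta V_{j,k} &= V^{\max}_{j,k} - V^{\min}_{j,k} \\
\bar{V}_{j,k}  &= \frac{1}{|W_k|}\sum_{R \in W_k} R.P_j.\mathrm{volt} \\
\sigma_{j,k}   &= \sqrt{\frac{1}{|W_k|}\sum_{R \in W_k}\bigl(R.P_j.\mathrm{volt} - \bar{V}_{j,k}\bigr)^2}
\end{align*}
```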
32.
Mathematical Representation of Designed Model [Equations]…
33.
MapReduce Implementation of Overlapping Moving Window
34.
Compute Max & Min For Each Overlapping Moving Window
Max-Min MapReduce Algorithm:
INPUT: scan from the GRID_PROPERTIES table
OUTPUT: put into the PMU_DATA_TREND table
Map Task:
map (RowKey, columns) {
    if row_count == 300 then:
        frame_count = frame_count + 1
        row_count = 0
    row_count = row_count + 1
    key = columns.name + frame_count
    x = getValue(RowKey, columns:cell)
    context.write(key, x)
}
Reduce Task:
reduce (key, values) {
    for each x in values do:
        // accumulate x to compute the effective values for this frame
    // create a new object holding the computed effective values
    context.write(key, new object)
}
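The frame logic of the pseudocode above can be simulated outside Hadoop. A minimal sketch, with pure functions standing in for the Mapper and Reducer, a 3-record frame instead of 300 for brevity, and made-up voltage values:

```python
from collections import defaultdict

FRAME_SIZE = 3  # the deck uses 300-record windows; 3 keeps the demo short

def map_phase(records, frame_size=FRAME_SIZE):
    """Emit (column_name + frame_id, value) pairs, as the Map task does."""
    pairs = []
    for i, (row_key, columns) in enumerate(records):
        frame = i // frame_size  # which frame this record falls into
        for name, value in columns.items():
            pairs.append((f"{name}#{frame}", value))
    return pairs

def reduce_phase(pairs):
    """Group values by key and compute the effective values (max and min)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: {"max": max(vs), "min": min(vs)} for key, vs in groups.items()}

records = [
    ("t0", {"PMU1:volt": 398.7}), ("t1", {"PMU1:volt": 401.2}),
    ("t2", {"PMU1:volt": 399.0}), ("t3", {"PMU1:volt": 182.0}),
    ("t4", {"PMU1:volt": 400.3}), ("t5", {"PMU1:volt": 398.9}),
]
out = reduce_phase(map_phase(records))
print(out["PMU1:volt#1"])  # {'max': 400.3, 'min': 182.0}
```

Note this sketch uses tumbling frames, as the pseudocode's frame counter does; overlapping windows would additionally emit each record under the neighbouring frame keys it overlaps.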
35. Plotting Graphs To Identify Possible Fault Events Using R
Analyse the data, scan out events, and plot visualizations for a better understanding of the grid status; store the more informative data samples for future reference.
INPUT: scan records from PMU_DATA_TREND
OUTPUT: events (i.e. effective data samples), plots
Scan columns from the HBase table “PMU_DATA_TREND”
Plot graphs for event observation (reading)
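The Max-Min plot on a later slide draws the average voltage difference and a 3SD line, which suggests windows are flagged when their max-min voltage spread exceeds the mean by three standard deviations. A sketch of that flagging rule under this assumption (the data values are made up; a real run would scan them from PMU_DATA_TREND):

```python
from statistics import mean, stdev

def flag_events(voltage_diffs, k=3.0):
    """Return the indices of windows whose max-min voltage spread
    exceeds mean + k standard deviations of all spreads."""
    mu = mean(voltage_diffs)
    sigma = stdev(voltage_diffs)
    threshold = mu + k * sigma
    return [i for i, d in enumerate(voltage_diffs) if d > threshold]

# 29 quiet windows and one disturbed window with a large voltage spread.
diffs = [0.9, 1.1] * 14 + [1.0, 50.0]
print(flag_events(diffs))  # [29]
```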
36.
Plotting Graphs To Identify Possible Fault Events Using R
Max~Min Graph: Red dots: max volt. & Green dots: min volt.
37. Plotting Graphs To Identify Possible Fault Events Using R
Voltage difference plot:
38.
Plotting Graphs To Identify Possible Fault Events Using R
Max-Min Method plot: Blue dots: volt. Diff., Red line: 3SD, Green line: Average volt.
39. Plotting Graphs To Identify Possible Fault Events Using R
Histogram plot of voltage difference:
40. Software & Hardware Infrastructure
Software configuration used:
“Ubuntu 14.04 LTS, 3.13.0-57-generic kernel”
“Hadoop cluster setup of Apache Hadoop v1.2.1”
“HBase cluster setup of Apache HBase v0.94.27”
“Java version 1.7.0_71”
“RStudio version 0.98.1091”
“Apache Thrift v0.9.0” (helps to connect to the HBase server and access HBase table data from R code)
“rhbase” package (provides the necessary methods to connect to and access HBase tables from R)
Hardware configuration used:
Intel Core 2 Duo CPU T6600 @ 2.20GHz x 2 processor, 2GB main memory, 250GB physical disk space
41. Conclusion
From the experiments it is concluded that:
PMU data can be stored more reliably in a decentralized fashion, with SSH security.
Because it runs on low-cost commodity hardware units (i.e. Hadoop clusters and HDFS), the system is more economically cost-efficient than the current vertically scalable high-end server systems.
Using HBase on top of HDFS makes data reads and writes random-access.
MapReduce is a new concept of parallel programming for handling large-scale data sets, and it gives much faster processing.
HBase provides a non-relational tabular database.
HBase provides two types of access: a command line using the HBase query language, and Java code using the HBase client API.
42. Future Work
Future work will look at the development of MapReduce-based clustering techniques and machine learning techniques for classifying the power faults detected, as well as for detecting unique events. Machine learning is a pattern-recognition technique that can help to fetch the specific patterns present inside the power signals, since every different event or fault develops a unique type of signal pattern [3]. This leads towards the development of a self-learning neural network model for signal pattern identification and early event prediction.
43.
References
[1] Phadke, A. G. "Synchronized phasor measurements in power systems." Computer Applications in Power, IEEE 6.2 (1993): 10-15.
[2] Aghaei, Jamshid, and Mohammad-Iman Alizadeh. "Demand response in smart electricity grids equipped with renewable energy sources: A review." Renewable and Sustainable Energy Reviews 18 (2013): 64-72.
[3] http://posoco.in/2013-03-12-10-34-42/synchrophasors
[14] Agrawal, V. K., P. K. Agarwal, and POSOCO DGM. "Experience of commissioning of PMUs pilot project in the northern region of India." dynamics 7 (2010): 8.
[15] http://hadoop.apache.org/docs/r1.2.1/
[16] White, Tom. Hadoop: The Definitive Guide. 2nd ed. O'Reilly.
[17] http://hadoop.apache.org/docs/r1.2.1/
[18] http://hbase.apache.org/
[19] Song, Yaqi, Yongli Zhu, and Li Li. "Large scale data storage and processing of insulator leakage current using HBase and MapReduce." Power System Technology (POWERCON), 2014 International Conference on. IEEE, 2014.
[20] Bach, Felix, et al. "Power grid time series data analysis with Pig on a Hadoop cluster compared to multi core systems." Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on. IEEE, 2013.
[21] https://thrift.apache.org/
[22] Allen, Alicia, et al. PMU Data Event Detection: A User Guide for Power Engineers. National Renewable Energy Laboratory, 2014.
[23] George, Lars. HBase: The Definitive Guide. O'Reilly.