“ERROR DETECTION IN
BIG DATA ON CLOUD”
Group Id : 15
Names: Shweta Dolhare (B120344212)
Snehal Gaikwad (B120344215)
Poonam Ghorpade (B120344219)
Project Title : “Error Detection in Big Data
on Cloud”
Internal Guide : S.B.Jadhav
Project Definition:
To develop an approach that efficiently reduces the time needed to detect errors in big sensor data on cloud. If an error is found, the approach also covers error recovery and storing the data in its original format.
Technical Keywords :
Cloud Computing, Service Composition, Online Web Services, Hadoop, MySQL, MapReduce.
Introduction :
We need to develop an approach that efficiently reduces the time needed to detect errors in big sensor data on cloud. If an error is found, the approach also covers error recovery and storing the data in its original format.
Based on the error types and on features of scale-free networks, we propose a time-efficient strategy for detecting and locating errors in big data sets on cloud. The main aim is to reduce the time required to detect errors and to provide error-free transmission of data.
Motivation of the Project:
Based on the error types and on features of scale-free networks, we propose a time-efficient strategy for detecting and locating errors in big data sets on cloud.
The main aim is to reduce the time required to detect errors and to provide error-free transmission of data.
Big Data Processing on Cloud
• Service over a network.
• Ideal platform for big data storage.
• Stream-based data management.
• Hadoop-based framework.
• Workload distribution.
• Scalability.
• Data filtering.
Error Detection on Cloud
• Error detection.
• Error localization.
• Complexity analysis.
• Algorithm calibration on cloud.
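As an illustration of error detection and localization, the sketch below splits a file into fixed-size blocks and keeps one CRC-32 checksum per block, so that a later comparison reports which blocks are corrupted rather than only that the file changed. The function names, the 256-byte block size, and the use of Python's `zlib.crc32` are assumptions made for this example, not details taken from the project.

```python
import zlib

BLOCK_SIZE = 256  # hypothetical block size, chosen only for the example


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return one CRC-32 checksum per fixed-size block of data."""
    return [
        zlib.crc32(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]


def locate_errors(expected: list[int], received: bytes,
                  block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indices of blocks whose checksum no longer matches.

    Assumes the received data has the same length as the original.
    """
    actual = block_checksums(received, block_size)
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


original = bytes(range(256)) * 4           # 1024 bytes, i.e. 4 blocks
expected = block_checksums(original)

corrupted = bytearray(original)
corrupted[300] ^= 0x01                     # flip one bit inside block 1

print(locate_errors(expected, bytes(corrupted)))  # [1]
```

Only the blocks reported by `locate_errors` would need to be repaired, which is one way a localization step can shorten recovery for large files.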
MODULES :
Module 1 : Create a big data set
Module 2 : Implementation of the algorithm for error detection
Module 3 : Implementation of file recovery
Module 4 : Testing
Flow Diagram of Project :
[Flow diagram: Upload a file to cloud → Store the file on cloud → Check if the file is equal to the original file → Yes: no action; No: recover the error]
System Architecture
Design of Project:
A) Mathematical Model :
Let ‘S’ be the final set representing the system “Error detection in big data”.
Identify the inputs as D
S = {D, L, A}
D = {D1, D2, D3, D4 | ‘D’ gives the data files}
Identify the outputs as O
L = {L1, L2 | ‘L’ gives the log files for upload, download and repair}
A = {A1, A2, A3 | ‘A’ gives the alerts}
Identify the functions as ‘F’
S = {D, L, A, F}
F = {F1(), F2(), F3(), F4(), F5(), F6()}
F1(V) :: Upload
F2(V) :: Integrity check
F3(V) :: Log generation
F4(T) :: Alert the system
F5(D) :: Restore the file
F6(V) :: Download the data file
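The model above can be read as a small set of operations over the data files D, the logs L and the alerts A. The following is a minimal sketch of that reading, not the project's implementation: the class name, the in-memory dictionaries standing in for cloud storage, the backup replica used for restoring, and the choice of SHA-1 digests for the integrity check are all assumptions made for illustration.

```python
import hashlib


class ErrorDetectionSystem:
    """Toy model of S = {D, L, A, F} using in-memory storage (assumption)."""

    def __init__(self):
        self.D = {}        # data files, keyed by name
        self.L = []        # log entries for upload, repair and download
        self.A = []        # alerts raised when an error is detected
        self.backup = {}   # hypothetical replica used by F5 to restore
        self.digests = {}  # digest recorded at upload time

    def upload(self, name: str, data: bytes) -> None:       # F1
        self.D[name] = data
        self.backup[name] = data
        self.digests[name] = hashlib.sha1(data).hexdigest()
        self.log("upload", name)

    def integrity_check(self, name: str) -> bool:           # F2
        return hashlib.sha1(self.D[name]).hexdigest() == self.digests[name]

    def log(self, event: str, name: str) -> None:           # F3
        self.L.append((event, name))

    def alert(self, message: str) -> None:                  # F4
        self.A.append(message)

    def restore(self, name: str) -> None:                   # F5
        self.D[name] = self.backup[name]
        self.log("repair", name)

    def download(self, name: str) -> bytes:                 # F6
        if not self.integrity_check(name):
            self.alert(f"error detected in {name}")
            self.restore(name)
        self.log("download", name)
        return self.D[name]


system = ErrorDetectionSystem()
system.upload("D1", b"sensor readings")
system.D["D1"] = b"sensor readinqs"   # simulate corruption in storage
print(system.download("D1"))           # b'sensor readings'
print(system.A)                        # ['error detected in D1']
```

Here the download path runs the integrity check first, so a corrupted file triggers an alert and a restore before the data is returned.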
Feasibility Analysis:
This is concerned with specifying the equipment and software that will successfully satisfy the user requirements. The technical needs of the system may vary considerably, but might include:
• The facility to produce outputs in a given time.
• Response time under certain conditions.
• Ability to process a certain volume of transactions at a particular speed.
Technical Feasibility :
• The facility to produce outputs in a given time.
• Response time under certain conditions.
• Ability to process a certain volume of transactions at a particular speed.
NP-hard :
A problem is NP-hard if every problem in NP can be reduced to it in polynomial time.
NP-complete :
A problem is NP-complete if it is (a) in NP, and (b) NP-hard.
In short:
NP-complete: the most difficult problems in NP.
Our project comes under NP-complete.
Algorithms :
1. Cyclic Redundancy Check.
2. Hamming Code.
3. Secure Hash Algorithm.
Cyclic Redundancy Check :
In CRC, a sequence of redundant bits, called cyclic redundancy check bits, is appended to the end of the data unit so that the resulting data unit becomes exactly divisible by a second, predetermined binary number. The basic idea of CRC algorithms is to treat the message as an enormous binary number, divide it by another fixed binary number using modulo-2 arithmetic, and use the remainder from this division as the checksum.
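The division described above can be sketched on bit strings, where each subtraction step of the long division is an XOR. This is a teaching example under stated assumptions: the function names are hypothetical, the 4-bit divisor `1011` is chosen only to keep the numbers small, and a production system would more likely use a standard polynomial through a library routine.

```python
def crc_remainder(message_bits: str, divisor_bits: str) -> str:
    """Return the CRC check bits for message_bits (modulo-2 long division)."""
    n = len(divisor_bits) - 1                      # number of check bits
    bits = [int(b) for b in message_bits] + [0] * n
    divisor = [int(b) for b in divisor_bits]
    for i in range(len(message_bits)):
        if bits[i] == 1:                           # divisor "goes into" this position
            for j, d in enumerate(divisor):
                bits[i + j] ^= d                   # subtraction is XOR in modulo-2
    return "".join(str(b) for b in bits[-n:])


def crc_check(message_bits: str, check_bits: str, divisor_bits: str) -> bool:
    """Return True if message plus check bits is exactly divisible by the divisor."""
    n = len(divisor_bits) - 1
    bits = [int(b) for b in message_bits + check_bits]
    divisor = [int(b) for b in divisor_bits]
    for i in range(len(message_bits)):
        if bits[i] == 1:
            for j, d in enumerate(divisor):
                bits[i + j] ^= d
    return not any(bits[-n:])


message = "11010011101100"
check = crc_remainder(message, "1011")
print(check)                                        # 100
print(crc_check(message, check, "1011"))            # True
print(crc_check("11010011101101", check, "1011"))   # False, last bit flipped
```

The sender transmits the message followed by the check bits; the receiver repeats the division and treats any non-zero remainder as an error.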
Hamming Code :
• High information rate for a single-error-correcting code.
• It encodes and decodes code words.
• Detects errors of weight up to 2 when used for detection only (up to 3 with an additional overall parity bit).
• Corrects errors of weight 1.
• The key to the Hamming code is the use of extra parity bits to allow the identification of a single error.
SHA-1
• Cryptographic hash function.
• Produces a 160-bit hash value, known as a message digest.
• The value is typically rendered as a hexadecimal number, 40 digits long.
• SHA-1 forms part of several widely used security applications and protocols, including TLS and SSL, PGP, SSH, S/MIME, and IPsec. Those applications can also use MD5; both MD5 and SHA-1 are descended from MD4.
• SHA-1 hashing is also used in distributed revision control systems like Git, Mercurial, and Monotone to identify revisions, and to detect data corruption or tampering.
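The digest properties listed above can be checked with Python's standard `hashlib` module. The helper names `sha1_hex` and `is_intact` are hypothetical and exist only for this example; the slides do not describe how the project computes or stores its hash keys.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


def is_intact(data: bytes, expected_digest: str) -> bool:
    """Return True if data still hashes to the digest recorded earlier."""
    return sha1_hex(data) == expected_digest


digest = sha1_hex(b"abc")
print(digest)                        # a9993e364706816aba3e25717850c26c9cd0d89d
print(len(digest))                   # 40 hexadecimal digits, i.e. 160 bits
print(is_intact(b"abc", digest))     # True
print(is_intact(b"abd", digest))     # False, the data has changed
```

Comparing a freshly computed digest with the one recorded at upload time reveals that a file has changed, although by itself it does not say where the change occurred.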
Data Flow Diagram:
(Level 0): [Diagram: User, System, Web service]
(Level 1): [Diagram: User, System, Web service, Hadoop, Generate Hash Key, Hash Key Checker]
(Level 2): [Diagram: User, System, Web service, Hadoop, Generate Hash Key, Hash Key Checker; flows: Upload file, Repair file, No error message, Restore file]
Class Diagram :
State Transition Diagram :
Use Case Diagram :
Activity Diagram :
Component Diagram :
Deployment Diagram :
Sequence Diagram :
Advantages:
1. Time-efficient approach for detecting and correcting errors.
2. It works for text, audio and video files.
Limitations:
1. It works only for specific kinds of errors.
2. Block size is limited to 1 GB.
3. Works on public cloud.
Paper Submission Details :
Paper Title : “ERROR DETECTION IN BIG DATA”
The paper has been accepted for publication in the International Education and Research Journal (IERJ), E-ISSN: 2454-9916.
Thank you
