2. A distinct IT environment that is designed for
the purpose of remotely provisioning scalable
and measured IT resources.
Instead of all the computer hardware and software
sitting on your desktop, it is provided as a service by
another party and accessed over the Internet.
3. Pros
Lower upfront and infrastructure costs.
Easy to grow applications.
Scale up or down at short notice.
Only pay for what you use.
Everything managed under SLAs.
Overall environmental benefit (lower carbon
emissions)
4. Cons
Higher ongoing operating costs.
Greater dependency on service providers.
Potential privacy and security risks.
Dependency on a reliable Internet connection.
5. Access Control and Accountability
Data Security and Privacy Issues
◦ Third-party publication of data; security challenges
associated with data outsourcing
◦ Data at the different sites has to be protected, with the
end results still being made available; querying encrypted data
◦ Secure query processing and updates in the cloud
Secure Storage
Security Related to Virtualization
Cloud Monitoring
6. The problem we tackle in this project is
privacy in cloud computing, since it is the
most relevant and prevalent problem in
the domain.
7. Before MapReduce
◦ Large scale data processing was difficult!
Managing hundreds or thousands of processors
Managing parallelization and distribution
I/O Scheduling
Status and monitoring
Fault/crash tolerance
MapReduce provides all of these, easily!
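To make the programming model concrete, here is a minimal single-process sketch of the three MapReduce phases, using word counting as the example task. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any real MapReduce library; an actual framework would run the map and reduce calls on many machines and handle scheduling, monitoring and fault tolerance itself.

```python
from collections import defaultdict


def map_phase(document):
    """User code: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Framework's job: group all emitted values by key between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """User code: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence (sequential stand-in for a cluster)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Only `map_phase` and `reduce_phase` would be written by the user; everything else stands in for what the framework provides.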
9. Input: a set of text files.
Data Store: a set of words on the basis of
which these files are segregated.
Features: the data store is self-expanding.
11. Finally, this intermediate output is
processed and the files are moved to a safe
location. The Data Store also expands itself.
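The segregation flow described on these slides could be sketched roughly as follows. This is an assumption-heavy illustration, not the project's actual code: files are held in memory as a name-to-text dictionary, the names `map_file`, `segregate` and `expand_data_store` are hypothetical, and the expansion rule (add any word that appears in at least `min_files` of the classified files) is invented for the example because the slides do not say how the data store grows. Physically moving files to the safe location is left out.

```python
from collections import Counter, defaultdict


def map_file(name, text, data_store):
    """Map step: emit (keyword, file name) for each data-store word in the file."""
    words = set(text.lower().split())
    matches = sorted(words & data_store)
    if not matches:
        yield "unclassified", name
    for keyword in matches:
        yield keyword, name


def segregate(files, data_store):
    """Reduce step: group file names by keyword (the intermediate output)."""
    groups = defaultdict(list)
    for name, text in files.items():
        for keyword, file_name in map_file(name, text, data_store):
            groups[keyword].append(file_name)
    return dict(groups)


def expand_data_store(files, groups, data_store, min_files=2):
    """Hypothetical rule: add words seen in at least min_files classified files."""
    classified = {
        name
        for keyword, names in groups.items()
        if keyword != "unclassified"
        for name in names
    }
    # Count each word once per file (document frequency).
    counts = Counter(
        word for name in classified for word in set(files[name].lower().split())
    )
    return data_store | {word for word, n in counts.items() if n >= min_files}
```

With a data store of `{"password"}`, two files mentioning "password" and "token" would be grouped under "password", and "token" would then be added to the data store under the assumed rule.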
12. Creates an abstraction for dealing with
complex overhead
◦ The computations are simple, the overhead is
messy
Removing the overhead makes programs
much smaller and thus easier to use
◦ Less testing is required as well. The MapReduce
libraries can be assumed to work properly, so
only user code needs to be tested
Division of labor is also handled by the
MapReduce libraries, so programmers only
need to focus on the actual computation
13. Segregation of data can be one of the key
solutions to the majority of the security issues
that arise with cloud computing.
There is still scope for further improvement in
this project.
Combining the vast domains of Big Data and
Cloud Computing would be a real boon.