CAE-2
Introduction to Distributed
Computing
Contents
• Introduction
– Motivation
– History
– Goal
– Characteristics
• Example of Applications
• Scenario
• Advantages and Disadvantages
• Issues and Challenges
• Conclusion
• References
Introduction
• Definition
– “A distributed system consists of multiple
autonomous computers that communicate
through a computer network.”
– “Distributed computing utilizes a network of many
computers, each accomplishing a portion of an
overall task, to achieve a computational result
much more quickly than with a single computer.”
– “Distributed computing is any computing that
involves multiple computers remote from each
other that each have a role in a computation
problem or information processing.”
Introduction
• A distributed system is one in which hardware or
software components located at networked computers
communicate and coordinate their actions only by
message passing.
• In the term distributed computing, the word
distributed means spread out across space.
Thus, distributed computing is an activity performed on a
spatially distributed system.
• These networked computers may be in the same
room, same campus, same country, or in different
continents
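The message-passing idea above can be illustrated with a minimal sketch. It assumes two in-process "nodes" simulated with Python threads that exchange data only through queues; the names `worker` and `run_demo` are hypothetical, and a real distributed system would send the same messages over a network rather than an in-memory queue.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical node: receives numbers and replies with their squares."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: shut the node down
            break
        outbox.put(("result", msg, msg * msg))


def run_demo() -> list:
    """Send three requests to the worker node and collect the replies."""
    to_worker: queue.Queue = queue.Queue()
    from_worker: queue.Queue = queue.Queue()
    node = threading.Thread(target=worker, args=(to_worker, from_worker))
    node.start()
    for n in (2, 3, 4):
        to_worker.put(n)  # the only interaction is sending a message
    results = [from_worker.get(timeout=5) for _ in range(3)]
    to_worker.put(None)
    node.join(timeout=5)
    return results


if __name__ == "__main__":
    print(run_demo())
```

The two sides never read each other's variables; all coordination happens through the messages placed on the queues.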
Introduction
[Figure: agents cooperating over the Internet to serve a large-scale application; labels: Cooperation, Distribution, Agent, Resource Management, Subscription, Job Request]
Motivation
• Inherently distributed applications
• Performance/cost
• Resource sharing
• Flexibility and extensibility
• Availability and fault tolerance
• Scalability
• Network connectivity is increasing.
• Combination of cheap processors often more cost-
effective than one expensive fast system.
• Potential increase of reliability.
History
• 1975 – 1985
– Parallel computing was favored in the early years
– Primarily vector-based at first
– Gradually more thread-based parallelism was introduced
– The first distributed computing programs were a pair of
programs called Creeper and Reaper, written in the 1970s
– Ethernet was invented in the 1970s.
– ARPANET e-mail was invented in the early 1970s and
probably the earliest example of a large-scale distributed
application.
History
• 1985 -1995
– Massively parallel architectures start rising and message
passing interface and other libraries developed
– Bandwidth was a big problem
– The first Internet-based distributed computing project
was started in 1988 by the DEC System Research Center.
– Distributed.net, a project founded in 1997, is
considered the first to use the Internet to distribute data
for calculation and collect the results.
History
• 1995 – Today
– Cluster/grid architecture increasingly dominant
– Special node machines eschewed in favor of COTS
technologies
– Web-wide cluster software
– Google takes this to the extreme (thousands of
nodes/cluster)
– SETI@Home started in May 1999 - analyze the radio
signals that were being collected by the Arecibo Radio
Telescope in Puerto Rico.
Goal
• Making Resources Accessible
– Data sharing and device sharing
• Distribution Transparency
– Access, location, migration, relocation, replication,
concurrency, failure
• Communication
– Make human-to-human communication easier, e.g.
electronic mail
• Flexibility
– Spread the work load over the available machines
in the most cost effective way
• To coordinate the use of shared resources
• To solve large computational problems
Characteristics
• Resource Sharing
• Openness
• Concurrency
• Scalability
• Fault Tolerance
• Transparency
Architecture
• Client-server
• 3-tier architecture
• N-tier architecture
• loose coupling, or tight coupling
• Peer-to-peer
• Space based
• Examples of commercial application :
– Database Management System
– Distributed computing using mobile agents
– Local intranet
– Internet (World Wide Web)
– JAVA Remote Method Invocation (RMI)
Application
Distributed Computing Using Mobile Agents
• Mobile agents can be wandering around in a network using free
resources for their own computations.
Local Intranet
• A portion of Internet that is separately administered & supports
internal sharing of resources (file/storage systems and printers) is
called local intranet.
Internet
• The Internet is a global system of interconnected computer networks that use
the standardized Internet Protocol Suite (TCP/IP).
JAVA RMI
• Embedded in language Java:-
– Object variant of remote procedure call
– Adds naming compared with RPC (Remote Procedure Call)
– Restricted to Java environments
RMI Architecture
Categories of Applications in
distributed computing
• Science
• Life Sciences
• Cryptography
• Internet
• Financial
• Mathematics
• Language
• Art
• Puzzles/Games
• Miscellaneous
• Distributed Human Project
• Collaborative Knowledge Bases
• Charity
Example of applications
• Internet – Gomez Distributed PEER Client
(peerReview)
– Evaluate the performance of large websites to find
bottlenecks.
• Life Sciences - Compute Against Cancer® (CAC)
– Create immediate impact in the lives of cancer
patients and their families today, while at the
same time empowering the research that will
result in improved therapies — and perhaps even
the cure.
Example of applications
• Collaborative Knowledge Bases – Wikipedia
– A collaborative project to produce a complete, free
encyclopedia from scratch.
– The encyclopedia is available in many non-English languages.
• Distributed Human Projects – Open Mind Indoor
Common Sense
– Help teach indoor mobile robots to be smarter. It will
create a repository of knowledge which will enable
people to create more intelligent mobile robots for use
in home and office environments.
• Modern computing scenarios can be distinguished by scale: micro, medium and global.
• Micro scale
– Research focuses on small computer-based components that can be deployed in an environment and
can coordinate their actions (i.e. local sensing and effecting of specific environmental conditions)
with the goal of enriching our physical world with specific “smart” functionalities. Potential
applications range from simple monitoring activities to smart materials and self-assembly of
computational beings
• Medium scale
– Our world is populated by local ad-hoc networks (e.g. the ensemble of Bluetooth-enabled devices we
carry with us or find in our cars) and network-based furniture (e.g., Web-enabled fridges
and ovens able to interact with each other and effectively support our cooking activities)
• Global scale
– In Internet computing, there is need to access data and services according to a variety of patterns,
independently of the availability/location of specific servers
Scenario
Scenario
Mechanisms of Self-organization in Modern
Distributed Computing Scenarios
Advantages
• Economics:-
– Computers harnessed together give a better price/performance ratio
than mainframes.
• Speed:-
– A distributed system may have more total computing power than a
mainframe.
• Inherent distribution of applications:-
– Some applications are inherently distributed. E.g., an ATM-banking
application.
• Reliability:-
– If one machine crashes, the system as a whole can still survive if you
have multiple server machines and multiple storage devices
(redundancy).
• Extensibility and Incremental Growth:-
– Possible to gradually scale up (in terms of processing power and
functionality) by adding more sources (both hardware and software).
This can be done without disruption to the rest of the system.
Disadvantages
• Complexity :-
– Lack of experience in designing, and implementing a
distributed system. E.g. which platform (hardware and OS) to
use, which language to use etc.
• Network problem:-
– If the network underlying a distributed system saturates or
goes down, then the distributed system will be effectively
disabled thus negating most of the advantages of the
distributed system.
• Security:-
– Security is a major hazard since easy access to data means
easy access to secret data as well.
Issues and Challenges
• Heterogeneity of components :-
– variety or differences that apply to computer hardware,
network, OS, programming language and
implementations by different developers.
– All differences in representation must be dealt with
before messages can be exchanged.
– Example: the calls for exchanging messages in UNIX
differ from those in Windows.
• Openness:-
– System can be extended and re-implemented in various
ways.
– Cannot be achieved unless the specification and
documentation are made available to software developer.
– The main challenge for designers is to tackle the complexity
of a distributed system built from components designed by different people.
Issues and Challenges cont…
• Transparency:-
– Aim: make certain aspects of distribution
invisible to the application programmer,
who can then focus on the design of their particular
application.
– Programmers need not be concerned with the location of
resources or the details of how they operate, whether replicated or
migrated.
– Failures can be presented to application
programmers in the form of exceptions, which
must be handled.
Issues and Challenges cont…
• Transparency:-
– This concept can be summarized as shown in
this figure:
Issues and Challenges cont…
• Security:-
– Security for information resources in a
distributed system has 3 components:
a. Confidentiality : protection against disclosure to
unauthorized individuals.
b. Integrity : protection against alteration/corruption
c. Availability : protection against interference with the
means to access the resources.
– The challenge is to send sensitive
information over Internet in a secure
manner and to identify a remote user or
other agent correctly.
Issues and Challenges cont..
• Scalability :-
– Distributed computing operates at many
different scales, ranging from small Intranet
to Internet.
– A system is scalable if it remains effective when there is a significant
increase in the number of resources and
users.
– The challenges are:
a. controlling the cost of physical resources.
b. controlling the performance loss.
c. preventing software resources from running out.
d. avoiding performance bottlenecks.
Issues and Challenges cont…
• Failure Handling :-
– Failures in a distributed system are partial –
some components fail while others can function.
– That is why handling failures is difficult
a. Detecting failures : some failures cannot be detected but may only
be suspected; the challenge is to manage in their presence.
b. Masking failures : hiding a failure is not guaranteed to work in the worst
case.
• Concurrency :-
– Where applications/services process requests
concurrently, their operations may conflict
with one another and produce inconsistent
results.
– Each resource must be designed to be safe in a
concurrent environment.
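As a minimal sketch of a resource that is "safe in a concurrent environment", assume several client threads incrementing one shared counter. The names `SafeCounter` and `run_clients` are hypothetical; the lock serializes the read-modify-write step so that concurrent updates cannot overwrite each other.

```python
import threading


class SafeCounter:
    """Hypothetical shared resource whose updates are guarded by a lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one thread at a time may perform the read-modify-write.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_clients(num_clients: int = 8, increments: int = 1000) -> int:
    """Start concurrent clients and return the final counter value."""
    counter = SafeCounter()

    def client() -> None:
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=client) for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_clients())
```

With the lock in place the final value equals `num_clients * increments`; without it, interleaved updates could be lost.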
What is Distributed
Computing/System?
• Distributed computing
–A field of computing
science that studies
distributed system.
–The use of distributed
systems to solve
computational
problems.
• Distributed system
– Wikipedia
• There are several autonomous computational
entities, each of which has its own local
memory.
• The entities communicate with each other by
message passing.
– Operating System Concept
• The processors communicate with one another
through various communication lines, such as
high-speed buses or telephone lines.
• Each processor has its own local memory.
What is Distributed
Computing/System?
• Distributed program
–A computing program that runs in a
distributed system
• Distributed programming
–The process of writing distributed program
What is Distributed Computing/System?
• Common properties
– Fault tolerance
• When one or more nodes fail, the whole
system can still work, apart from reduced
performance.
• Need to check the status of each node
– Each node plays a partial role
• Each computer has only a
limited, incomplete view of the system. Each
computer may know only one part of the
input.
– Resource sharing
• Each user can share the computing power
and storage resources in the system with
other users
– Load sharing
• Dispatching tasks across the nodes
helps spread the load over the whole system.
– Easy to expand
• It is easy to expand the system by adding
nodes.
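Load sharing can be sketched with a simple round-robin dispatcher. This is only one possible policy, chosen here for illustration; the function name `dispatch_round_robin` and the node and task names are hypothetical.

```python
from itertools import cycle
from typing import Dict, List, Sequence


def dispatch_round_robin(tasks: Sequence[str], nodes: Sequence[str]) -> Dict[str, List[str]]:
    """Assign tasks to nodes in turn so that each node gets a similar share."""
    assignment: Dict[str, List[str]] = {node: [] for node in nodes}
    for task, node in zip(tasks, cycle(nodes)):
        assignment[node].append(task)
    return assignment


if __name__ == "__main__":
    print(dispatch_round_robin(["t1", "t2", "t3", "t4", "t5"], ["node-a", "node-b"]))
```

Adding a node to the `nodes` list is enough to spread the same tasks more thinly, which mirrors the "easy to expand" property above.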
Why Distributed Computing?
• The nature of application
• Performance
– Computing intensive
• The task could consume a lot of time on computing.
– Data intensive
• The task deals with a large number of files or very large files. For
example, Facebook, Orkut, Twitter, etc.
• Robustness
– No SPOF (Single Point Of Failure)
– Other nodes can execute the same task executed on
failed node.
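The "no single point of failure" idea can be sketched as a failover loop: if one node fails, the same task is tried on the next one. Everything here is an assumption for illustration, including the `NodeFailure` exception and the two stand-in node functions.

```python
from typing import Callable, Sequence


class NodeFailure(Exception):
    """Hypothetical error raised when a node cannot execute a task."""


def run_with_failover(task: str, nodes: Sequence[Callable[[str], str]]) -> str:
    """Try each node in order and return the first successful result."""
    for node in nodes:
        try:
            return node(task)
        except NodeFailure:
            continue  # this node failed; another node executes the same task
    raise NodeFailure(f"all {len(nodes)} nodes failed for task {task!r}")


def failed_node(task: str) -> str:
    """Stand-in for a node that is down."""
    raise NodeFailure("node A is down")


def healthy_node(task: str) -> str:
    """Stand-in for a node that completes the task."""
    return f"{task} done on node B"


if __name__ == "__main__":
    print(run_with_failover("job-1", [failed_node, healthy_node]))
```

The caller only sees a failure when every node has failed, which is the situation redundancy is meant to make rare.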
Common Architectures
• Communicate and coordinate works among
concurrent processes
– Processes communicate by sending/receiving
messages
– Synchronous/Asynchronous
Common Architectures
• Master/Slave architecture
– Master/slave is a model of
communication where one device
or process has unidirectional
control over one or more other
devices
• Database replication
–The source database can be
treated as a master and the
destination database can be
treated as a slave.
• Client-server
–web browsers and web
servers
Best Practice /Characteristics
• Data Intensive or Computing Intensive
– Data size and the amount of data
• The attribute of data you consume
• Computing intensive
–We can move data to the nodes where
we can execute jobs
• Data Intensive
–We can separate/replicate data to
different nodes, then we can execute
our tasks on these nodes
–Reduce data replication when executing
tasks
Best Practice /Characteristics
• Master nodes need to know data location
• No data loss when incidents happen
–SAN (Storage Area Network)
–Data replication on different nodes
• Synchronization
–When splitting tasks to different nodes, how
can we make sure these tasks are
synchronized?
Best Practice /Characteristics
• Robustness
– Still safe when one or some nodes fail
– Need to recover when failed nodes come back online.
Little or no further action is needed
– Failure detection
• When any node fails, master nodes can
detect this situation.
–Eg: Heartbeat detection
– App/Users don’t need to know if any partial
failure happens.
• Restart tasks on other nodes for users
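Heartbeat detection, mentioned above, can be sketched as follows. This is a simplified illustration under stated assumptions: the class name `HeartbeatMonitor`, the timeout value and the injectable `clock` parameter are all hypothetical, and a silent node is only suspected of failure, since a slow network looks the same as a crash.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Hypothetical master-side record of the last heartbeat from each node."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the behaviour can be tested
        self.last_seen: Dict[str, float] = {}

    def beat(self, node_id: str) -> None:
        """Record that a heartbeat from node_id has just arrived."""
        self.last_seen[node_id] = self.clock()

    def suspected_failed(self) -> List[str]:
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=3.0)
    monitor.beat("node-1")
    print(monitor.suspected_failed())
```

A master could restart the tasks of every node returned by `suspected_failed()` on other nodes, so users do not need to notice the partial failure.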
Best Practice /Characteristics
• Network issue
– Bandwidth
• Need to consider bandwidth when copying files from one
node to other nodes, if we want to execute the task on
nodes that do not hold the data.
• Scalability
– Easy to expand
• Hadoop – configuration modification and start daemon
• Optimization
– What can we do if the performance of some nodes
is not good?
• Monitor the performance of each node
– Based on exchanged information such as heartbeats or
logs
• Resume the same task on other nodes
summary
• Distributed computing is an efficient way to optimize the
use of computing resources.
• Distributed computing is everywhere: intranet, Internet
or mobile ubiquitous computing
(laptops, PDAs, pagers, smart watches, hi-fi systems)
• It deals with hardware and software systems that
contain more than one processing / storage element and run
concurrently.
• The main motivating factor is resource sharing, such as files,
printers, web pages or database records.
• Grid computing and cloud computing are forms of
distributed computing.
Case study - Condor
Case study - Hadoop
The History and Future of Cloud
Computing
What is Cloud Computing?
• Cloud Computing is a general term used to describe a new class of
network-based computing services that take place over the
Internet:
– basically a step on from Utility Computing
– a collection/group of integrated and networked
hardware, software and Internet infrastructure (called
a platform)
– using the Internet for communication and transport, it
provides hardware, software and networking services
to clients
• These platforms hide the complexity and details of the underlying
infrastructure from users and applications by providing very simple
graphical interface or API (Applications Programming Interface).
What is Cloud Computing?
• In addition, the platform provides on-demand
services that are always
on: anywhere, anytime.
• Pay per use, as needed; elastic
– scale up and down in capacity and functionalities
• The hardware and software services are
available to
– the general public, enterprises, corporations and
business markets
Cloud Summary
• Cloud computing is an umbrella term used to refer to
Internet based development and services
• A number of characteristics define cloud
data, applications services and infrastructure:
– Remotely hosted: Services or data are hosted on
remote infrastructure.
– Ubiquitous: Services or data are available from
anywhere.
– Commodified: The result is a utility computing model
similar to that of traditional utilities, like
gas and electricity: you pay for what you use.
Cloud Architecture
What is Cloud Computing
Adapted from: Effectively and Securely Using the Cloud Computing Paradigm by Peter Mell, Tim Grance
• Shared pool of configurable computing resources
• On-demand network access
• Provisioned by the Service Provider
Cloud Computing Characteristics
Common Characteristics:
• Massive Scale
• Elastic Computing
• Homogeneity
• Geographic Distribution
• Virtualization
• Service Orientation
• Low Cost Software
• Advanced Security
Essential Characteristics:
• On Demand Self-Service
• Broad Network Access
• Rapid Elasticity
• Resource Pooling
• Measured Service
Essential Characteristics of cloud computing
• On-demand self-service: A client can provision
computer resources without the need for interaction with
cloud service provider personnel.
• Broad network access: Access to resources in the cloud
is available over the network using standard methods in a
manner that provides platform-independent access to
clients of all types.
– This includes a mixture of heterogeneous operating
systems, and thick and thin platforms such as
laptops and mobile phones.
• Resource pooling: A cloud service provider creates
resources that are pooled together in a system that
supports multi-tenant usage.
Benefits of cloud computing
• Rapid elasticity: Resources can be rapidly and
elastically provisioned.
• Measured service: The use of cloud system
resources is measured, audited, and reported to the
customer based on a metered system.
Cloud Service Models
• Software as a Service (SaaS): e.g. SalesForce CRM, LotusLive
• Platform as a Service (PaaS): e.g. Google App Engine
• Infrastructure as a Service (IaaS)
Type of Cloud
• Deployment models: This refers to the location
and management of the cloud's infrastructure.
• Service models: This consists of the particular
types of services that you can access on a cloud
computing platform.
Classification of Clouds
• Cloud services can also be categorized into
three types based on access and location
– A public cloud is available to anyone on the
Internet
– Any user can sign up to use the public cloud
(e.g. Microsoft Windows Azure)
– A private cloud is a proprietary cloud
environment that only provides cloud
services to a limited number of users
– Private clouds are usually within your own
data center behind your firewall
– A hybrid cloud, sometimes called a
virtual private cloud, provides
services that run on a public cloud
infrastructure, but limits access to it
with a virtual private network (VPN)
[Oracle 2009]
Public Cloud
• The main benefits of using a public cloud service are:
– Managed infrastructure: easy and inexpensive set-up because
hardware (+storage) is managed by the provider
– Scalability/Elasticity: you get more/less processing power, storage,
etc. when needed
– Eliminates complex procurement cycles, improving the time-to-market
for its users
– Removes undifferentiated "heavy lifting": lets its users focus on
delivering differentiating business value instead of wasting valuable
resources on the undifferentiated heavy lifting that makes up most of
IT infrastructure
– Pay as you go: no wasted resources because you pay for what you
use
• Public clouds are run by third parties, and applications from
different customers are likely to be mixed together on the cloud’s
servers, storage systems, and networks.
Public Cloud
• The main drawbacks of using a public cloud
service are
– Vendor lock-in for PaaS: you cannot easily
change your software environment (e.g. from
Windows Azure with C# to Google App Engine
with Java)
• Examples of public clouds
– Amazon Elastic Compute Cloud (EC2)
– Google App Engine
– Microsoft Windows Azure
Private Cloud
• The main benefits of using a private cloud service are
– Exclusive use: provides the utmost control over
data, security, and quality of service
– Ownership: the organization has control over infrastructure and how
applications are deployed
– Managed by the company's own IT organization
– High level of control
• The main drawbacks are
– Infrastructure costs (e.g. capital expenses for storage)
– Operating costs for running the data center
– Average utilization is not always known in advance, so the right
scale is hard to choose
• Examples of private clouds
– Eucalyptus
Hybrid Cloud
• The main benefits of using a hybrid cloud
service are
– Augment a private cloud with the resources of a
public cloud: provides on-demand, externally
provisioned scale [Oracle 2009]
• The main drawbacks are
– Complexity: how to distribute applications
across both a public and private cloud [Oracle
2009]
[Oracle 2009]
Service models
Advantage of cloud computing
• Lower costs: Because cloud networks operate at higher
efficiencies and with greater utilization, significant cost
reductions are often encountered.
• Ease of utilization: Depending upon the type of service being
offered, you may find that you do not require hardware or
software licenses to implement your service.
• Quality of Service: The Quality of Service (QoS) is something
that you can obtain under contract from your vendor.
• Reliability: The scale of cloud computing networks and their
ability to provide load balancing and failover makes them highly
reliable, often much more reliable than what you can achieve in
a single organization.
Advantage of cloud computing
• Outsourced IT management: A cloud computing
deployment lets someone else manage your computing
infrastructure while you manage your business. In most
instances, you achieve considerable reductions in IT
staffing costs.
• Simplified maintenance and upgrade: Because the
system is centralized, you can easily apply patches and
upgrades. This means your users always have access to
the latest software versions.
• Low Barrier to Entry: In particular, upfront capital
expenditures are dramatically reduced. In cloud
computing, anyone can be a giant at any time.
Disadvantages of Cloud Computing
• Requires a constant Internet connection:
– Cloud computing is impossible if you cannot
connect to the Internet.
– Since you use the Internet to connect to both your
applications and documents, if you do not have an
Internet connection you cannot access anything,
even your own documents.
– A dead Internet connection means no work and in
areas where Internet connections are few or
inherently unreliable, this could be a deal-breaker.
Disadvantages of Cloud Computing
• Does not work well with low-speed connections:
– Similarly, a low-speed Internet connection, such as
that found with dial-up services, makes cloud
computing painful at best and often impossible.
– Web-based applications require a lot of bandwidth to
download, as do large documents.
• Features might be limited:
– This situation is bound to change, but today many
web-based applications simply are not as full-featured
as their desktop-based counterparts.
• For example, you can do a lot more with Microsoft
PowerPoint than with Google Presentation's web-based
offering
Disadvantages of Cloud Computing
• Can be slow:
– Even with a fast connection, web-based applications
can sometimes be slower than accessing a similar
software program on your desktop PC.
– Everything about the program, from the interface to
the current document, has to be sent back and forth
from your computer to the computers in the cloud.
– If the cloud servers happen to be backed up at that
moment, or if the Internet is having a slow day, you
would not get the instantaneous access you might
expect from desktop applications.
Disadvantages of Cloud Computing
• Stored data might not be secure:
– With cloud computing, all your data is stored on the
cloud.
• The question is: how secure is the cloud?
– Can unauthorised users gain access to your confidential
data?
• Stored data can be lost:
– Theoretically, data stored in the cloud is safe, replicated
across multiple machines.
– But on the off chance that your data goes missing, you
have no physical or local backup.
• Put simply, relying on the cloud puts you at risk if the cloud lets
you down.
Measuring the Cloud's Value
• Profit: The economies of scale can make this a profitable business.
• Optimization: The infrastructure already exists and isn't fully utilized.
– This was certainly the case for Amazon Web Services.
• Strategic: A cloud computing platform extends the company's products
and their franchise.
– This is the case for Microsoft's Windows Azure Platform.
• Extension: A branded cloud computing platform can extend customer
relationships by offering additional service options.
– This is the case with IBM Global Services and the various IBM cloud
services.
Measuring the Cloud's Value
• Presence: Establish a presence in a market before a large
competitor can emerge.
– Google App Engine allows a developer to scale an
application immediately. For Google, its office applications
can be rolled out quickly and to large audiences.
• Platform: A cloud computing provider can become a hub
master at the centre of many ISV's (Independent Software
Vendor) offerings.
Basic Cloud Characteristics
• The “no-need-to-know” in terms of the underlying details of
infrastructure, applications interface with the infrastructure
via the APIs.
• The “flexibility and elasticity” allows these systems to scale
up and down at will
– utilising the resources of all kinds
• CPU, storage, server capacity, load balancing, and databases
• The “pay as much as used and needed” type of utility
computing and the “always on, anywhere and any place”
type of network-based computing.
Basic Cloud Characteristics
• Clouds are transparent to users and applications; they can be
built in multiple ways
– branded products, proprietary or open
source, hardware or software
– In general, they are built on clusters of PC servers
and off-the-shelf components plus Open Source
software combined with in-house applications
and/or system software.
Accessing the Cloud
• Cloud APIs allow software to request data and
computations from one or more services through a direct
or indirect interface.
• Cloud APIs most commonly expose their features via REST
(Representational State Transfer) and/or SOAP.
• Vendor specific and cross-platform interfaces are available
for specific functions.
• Cross-platform interfaces have the advantage of allowing
applications to access services from multiple providers
without rewriting, but may have less functionality or other
limitations vs vendor-specific solutions
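To make the REST style concrete, here is a minimal sketch that builds (but does not send) an HTTP request to a cloud API using Python's standard library. The base URL, the `/instances` resource, the JSON fields and the bearer token are all hypothetical placeholders, not any real provider's interface.

```python
import json
import urllib.request

# Hypothetical endpoint; a real provider documents its own base URL and paths.
BASE_URL = "https://api.example.com/v1"


def build_create_instance_request(name: str, size: str, token: str) -> urllib.request.Request:
    """Build a REST-style POST request that asks the service for a new resource."""
    body = json.dumps({"name": name, "size": size}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/instances",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    request = build_create_instance_request("web-1", "small", "example-token")
    print(request.get_method(), request.full_url)
    # Sending it would be urllib.request.urlopen(request) against a real service.
```

A cross-platform interface would hide provider differences behind one such function, at the cost of exposing only the features the providers have in common.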
Cloud API Available for
• Infrastructure
• Infrastructure APIs modify the resources available to operate the
application.
• Service
• Service APIs provide an interface into a specific capability provided by a service
explicitly created to enable that capability.
• Database, messaging, web portals, mapping, e-commerce and storage are all
examples of service APIs.
• Application
• Application APIs provide methods to interface and extend applications on
the web.
• Application APIs connect to applications such as CRM, ERP, social media
and help desk.
Cloud storage compare
Dropbox
• Free space: 2GB
• Premium space: $99/year for 100GB
• File size limit: Unlimited
• Platforms: Windows, Mac, Linux, iOS, Android, BlackBerry
• Best for: Seamless syncing
Google Drive
• Free space: 15GB
• Premium space: $59.88/year for 100GB
• File size limit: 10GB
• Platforms: Windows, Mac, iOS, Android
• Best for: Web apps
Apple iCloud
• Free space: 5GB
• Premium space: $100/year for 50GB
• File size limit: 25MB free/250MB paid
• Platforms: Mac, iOS, Windows
• Best for: Heavy iTunes/Mac users
Microsoft SkyDrive
• Free space: 7GB
• Premium space: $50/year for 100GB
• File size limit: 2GB
• Platforms: Windows, Mac, iOS, Android, Windows Phone
• Best for: Windows/Office integration
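The premium prices listed above can be compared on a dollar-per-GB basis with a short calculation. The sketch below uses only the figures from this comparison; the names `PLANS`, `price_per_gb` and `cheapest` are introduced here for illustration.

```python
# Yearly premium price in USD and included storage in GB, as listed above.
PLANS = {
    "Dropbox": (99.00, 100),
    "Google Drive": (59.88, 100),
    "Apple iCloud": (100.00, 50),
    "Microsoft SkyDrive": (50.00, 100),
}


def price_per_gb(plans: dict) -> dict:
    """Return the yearly cost per GB for each provider, rounded to 4 decimals."""
    return {name: round(price / gb, 4) for name, (price, gb) in plans.items()}


def cheapest(plans: dict) -> str:
    """Return the provider with the lowest yearly cost per GB."""
    costs = price_per_gb(plans)
    return min(costs, key=costs.get)


if __name__ == "__main__":
    for name, cost in sorted(price_per_gb(PLANS).items(), key=lambda item: item[1]):
        print(f"{name}: ${cost:.4f} per GB per year")
```

On these listed prices SkyDrive works out to $0.50 per GB per year, Google Drive to about $0.60, Dropbox to $0.99 and iCloud to $2.00.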
Free space
• All 3 (Dropbox, Google Drive and SkyDrive) start with a
certain amount of free space allotted to every user, which
can be increased upon payment, with rates varying for
each provider.
• Google Drive starts with the most free space, 15GB.
SkyDrive starts with 7GB, while Dropbox starts with
the least, 2GB. You can, however, considerably increase
your Dropbox storage capacity by referrals, beta
testing, camera upload through your phone, and many
other tweaks. Or if you happen to buy an HTC device, you
start off with 25GB of space straightaway.
• If you’re not planning to pay for your storage – SkyDrive or
Dropbox offer the most value.
File Type Support
• Any file type can be uploaded on to these cloud services – but you can only view
file types that are supported. Keeping that in mind, here’s a comparison on file
type support for the three platforms:
• Dropbox doesn’t support online viewing of any file type. All files must be downloaded and nothing
can be opened online. It’s not a major issue though: if you’re using Dropbox on
your phone, you can edit files right from your phone (via an editor of course) and
have them updated. The same would apply for your computer.
• Google Drive supports an unusual and, in a way, diverse range of file types, like
Autodesk AutoCAD files, Photoshop (.psd) files, and even Adobe Illustrator files.
But at the same time, it lacks basics. You can only view, but not edit, Microsoft
Office documents. All such files are converted to their Google Docs equivalent for
editing. That can be troublesome if you’re using a phone. Since Google Docs can only
be accessed online, there is hardly any offline usability. Serious disadvantage here.
• SkyDrive, being Microsoft’s child, will let you open and edit any Office documents.
There is even support for audio formats, but limited to MP4 and WMV only.
Online features
Dropbox
• Dropbox is the oldest of the three providers, and is thus more
established. It is bound to attract more users thanks to features like
referral, which earn you additional space.
• It is also the only service to support Linux and Blackberry systems.
• It even has passcode locks for its mobile apps.
• Although it charges higher dollar-per-GB, it offers a clean interface,
is easier to use and integrates very nicely with your phone.
• Purchasing an HTC gets you 25GB of free space. If you own an HTC,
a Blackberry or run a Linux – Dropbox is the place to go.
Google Drive
• Google Drive supports more file types, which makes online
sharing and editing much easier.
• It also offers more value for dollar-per-GB.
• Plus, if you purchase additional space, you get 25GB extra on to
your Gmail storage.
• But the fact that you have to convert office documents to Google
Docs before you can begin editing can be a big pain.
SkyDrive
• SkyDrive gives you 7GB of free space to start with. It also lets
you remotely access your PC online.
• SkyDrive belonging to Microsoft, it has a head start in terms of
targeting Windows users.
• SkyDrive integrates superbly with Windows Phone and Windows
8.
• Office 2013’s default file-saving location is your SkyDrive account.
• It may be termed an unfair advantage, but if you have a Windows
Phone or have recently purchased Windows 8, SkyDrive is a pretty
good option.
Apple iCloud
• iCloud isn't much of a Dropbox competitor, but if all you need synced between
your devices are text documents, it can be a pretty seamless solution.
• Various apps such as Pages and iA Writer have iCloud sync capabilities, saving
your work after every keystroke and instantly sending changes to Apple's servers.
• Once you open up iA Writer or Pages on another Apple device or Mac with OS X
Lion, you'll already be working with the most recent version of your document.
• Additionally, Lion saves versions of your documents locally using Time Machine
so you can return to older versions of your document, but only on your machine.
• While iCloud is a very elementary document-syncing solution, it might also be
the simplest one to use. And if you need to stream music or videos you've
purchased from the cloud, you can do that, too.
Application Types History
Enterprise Applications
• Characteristics of Enterprise Applications
–complex (business) logic,
–huge, complex data,
–special security requirements,
–focus is on transactions,
–many clients,
–support for different platforms,
–heterogeneous development environments:
C, C++, Java, C#, COBOL, ...
• Application Requirements are a driving force for
Software Architecture.
Mainframe Applications
• Approach
– Mainframe provides all
resources and logic
– Terminals for user
interaction (thin clients)
• Advantages
– Simple deployment
• Problems and Limitations
– Limited GUI capabilities
– Every client allocates
server resources →
– Limited scalability
Mainframe:
Application and Data
Terminals: UI
Client/Server Architecture
• Approach
– Increased capabilities of PCs
– Logic was (partially)
transferred to clients → thick
clients
– Central database server
• Advantages
– Better user experience (GUIs)
• Problems and Limitations
– Every client has a permanent
stateful connection
– Every client holds server
resources (db connections, …)
– Limited scalability
Server: Data
Thick Clients : UI + Logic
PC
Mac Linux-
Workstation
Middle-Tier Architecture
• Approach
– Logic bundled in Middle-tier
– Clients have no permanent
connection to Middle-tier →
stateless connections
• Advantages
– Factoring of business logic
– Middleware provides services
(pooling, security, …)
– Resources are shared between
clients → improved scalability
• Problems
– New programming model
Thin Clients
Data-Tier
Middle-Tier:
Logic + Services
(Middleware)
Stateful Services
• The service host provides a service instance for
every client during the whole lifetime of a session.
• Comparable to client/server architecture
– The service is an “extension” of the client
• Technologies
– EJB: Stateful Session Beans
– WCF: Per-Session Services
– CORBA
– Java RMI
Client 1
Proxy
Service Host
Instance 1
Instance 2
Client 2
Proxy
Stateful Services – Problems
• Every service instance holds server resources
– Only a limited number of clients can be served.
• What happens with the service instance when the client
dies unexpectedly?
– Distributed garbage collection, session timeouts, …
• What was the state of the instance when the connection
was interrupted?
– Reestablishing a connection may be painful.
• Transactions may manage state differently
– Transactions should be bound to service operations, not to
sessions.
– When the transaction is rolled back, database state and
session state are not in synch.
Stateless Services
• A service instance is created for each invocation of a
service operation.
Client 1
Proxy
Service Host
Instance 1
Instance 2
Instance 3
Client 2
Proxy
• There is no permanent connection between the client and
the service.
• Resources can be shared between clients → improved scalability.
• Technologies
– EJB: Stateless Session Beans
– WCF: Per-Call Services
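The difference between the two service styles can be sketched in a few lines of Python. This is an illustrative model only, not EJB or WCF code; the class names `StatefulCartService` and `StatelessCartService` and the shopping-cart scenario are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StatefulCartService:
    """One instance per client session; the state lives in server memory."""
    items: list = field(default_factory=list)

    def add_item(self, item: str) -> int:
        self.items.append(item)
        return len(self.items)


class StatelessCartService:
    """A fresh instance per call; the caller passes in the state it needs."""

    def add_item(self, cart: list, item: str) -> list:
        return cart + [item]  # returns a new cart and keeps nothing


# Stateful: the host must keep `session` alive between the two calls.
session = StatefulCartService()
session.add_item("book")
stateful_count = session.add_item("pen")

# Stateless: any instance can serve any call, so instances can be shared.
cart = StatelessCartService().add_item([], "book")
cart = StatelessCartService().add_item(cart, "pen")
```

In the stateless variant nothing has to be cleaned up when a client disappears, which is the scalability argument made on the slide.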
Windows Azure
Internet-Scale Application
• 2011 stats:
– +200B pageviews/month
– >3.9T feed actions/day
– +300M active users
– >1B chat mesgs/day
– 100M search queries/day
– >6B minutes spent/day (ranked #2
on Internet)
– +20B photos, +2B/month growth
– 600,000 photos served / sec
– 25TB log data / day processed
– 120M queries /sec
• Scaling the “relational” data:
– Keeps data
normalized, randomly
distributed, accessed at high
volumes
– Uses “shared nothing”
architecture
Internet-Scale Application
• 2011 stats:
– +20 petabytes of data processed / day by +100K MapReduce jobs
– 1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks
– +200B clusters, each at 1-5K nodes, handling +5 petabytes of storage
• ~40 GB/sec aggregate
read/write throughput
across the cluster
• +500 servers for each search
query < 500ms
• Scaling the process:
– MapReduce: parallel
processing framework
– BigTable: structured hash
database
– Google File System: massively
scalable distributed storage
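To make the MapReduce idea concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases for a word count. It is not Google's implementation; the function names and the input splits are assumptions, and a real framework would run each phase on many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all values by key, as the framework does between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input splits; each could be mapped on a different machine.
splits = ["the cloud scales", "the cloud stores data"]
mapped = chain.from_iterable(map_phase(split) for split in splits)
word_counts = reduce_phase(shuffle(mapped))
```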
Microsoft in the Cloud
(15 years)
450M+
active users
(13 years)
550M
users/mth
(12 years)
Largest non-
ICP/IP cloud
service
x100M users
(11 years)
320M+
active
users
(11 years)
2B
queries/mth
(15 years)
450M+
active users
(7 years)
5B conf
min/yr
(6 years)
4B emails/day
Windows Azure Platform
• The Windows Azure Platform is a group of
cloud technologies to be used by applications
running in Microsoft’s data centers, on-
premises and on various devices.
• Windows Azure is a platform for running
Windows applications and storing data in the
cloud.
Windows Azure applications run in Microsoft
data centres and are accessed via the Internet.
Working
• Windows Azure runs on machines in Microsoft data
centres.
• Rather than providing software that Microsoft customers install and run
on their own computers, Windows Azure provides a service.
• Customers use it to run applications and store
data on Internet-accessible machines owned by Microsoft.
• Those applications might provide services to businesses, to
consumers, or both.
Applications that might be built on Windows
Azure
• An independent software vendor (ISV):
– It could create an application that targets business users, an approach
that’s often referred to as Software as a Service (SaaS).
– ISVs can use Windows Azure as a foundation for a variety of
business-oriented SaaS applications.
– An ISV might create a SaaS application that targets consumers,
e.g. Facebook, OLX, etc.
• Enterprises might use Windows Azure to build and run applications
that are used by their own employees.
Three main components of Windows Azure
Windows Azure has three main components:
• Compute: The Compute service runs applications.
• Storage: The Storage service stores data.
• Fabric: It provides a common way to manage and monitor
applications that use this cloud platform.
Compute
Physical Machine 1
Compute Resources: Web and Worker Roles
• Web Role: Interactive .NET application hosted in IIS:
– Web Application or Web Service (WCF)
• Worker Role: Durable background process
– Often isolated from outside world
– There are ways to make it reachable for external applications
• Fabric Agent collects resource metrics (usage, failures, …)
Hardware
Load
Balancer
Virtual Machine 2
Fabric Controller
Virtual Machine 1
Web Role
Instance
IIS
Fabric Agent
Physical Machine 2
Virtual Machine 2
Virtual Machine 1
Worker Instance
Fabric Agent
HTTP/HTTPS
Web Role
• A Web role instance can accept incoming HTTP
or HTTPS requests.
• To allow this, it runs in a VM that includes
Internet Information Services (IIS) 7.
• Developers can create Web role instances
using ASP.NET, WCF, or another .NET
technology that works with IIS.
Worker Role
• A Worker role instance can’t accept requests directly from the outside world.
• A Worker role instance initiates its own requests for
input.
• It can read messages from a queue and it can open
connections with the outside world.
• Worker role instances can be viewed as similar to a
batch job or a Windows service.
Instance1
Instance 2
Instance 3
Instance 4
Instance 5
Windows Azure Storage provides
Blobs, Tables, and Queues.
Blob: binary large object
Windows Azure Storage
• Cloud Applications can store data in
– Blobs
– Tables
– Queues
• Blobs provide storage for unstructured data.
– Blobs are organized in containers.
• Tables are structured collections of entities.
– Entities are collections of name/value pairs.
– Tables have no schema.
• Queues provide storage for small data items on a FIFO
basis.
– Queues enable communication between service instances.
Windows Azure Storage Architecture
Windows Azure Storage
Cloud Application/External Application
Storage Account
Blob Container
REST API
Blob Blob
Queue Table
Entity
Entity
Entity
.NET API (StorageClient)
Blob Blob
REST API
Storage Features
• Windows Azure Storage
– provides huge storage resources,
– is massively scalable, and
– is a highly reliable persistence mechanism.
• Data is stored on large servers.
• Scalability
– Data can be distributed across many storage nodes.
– Storage access is load-balanced.
• Reliability
– Data is replicated to different storage nodes when a write
operation occurs (3 replicas in different fault domains).
– When a device fails, data is replicated to a new storage node.
Storage Account
• The storage account is the entry point for all storage
services.
• Storage accounts can be created on the Azure
Development Portal.
• Storage accounts can be accessed with a secret key:
– On the client side a hash code is computed (with SHA256).
– The hash code and the secret key are used to compute an
HMAC-SHA256 signature (Hash-based Message Authentication
Code).
– The HMAC is attached to the REST request.
– The storage server uses the signature to authenticate the
request.
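The signing steps above can be sketched with Python's standard `hmac` module. This is a simplified illustration: the exact string that the storage service expects to be signed is defined by the service and is not reproduced here, and `demo_key` and `demo_string` are made-up values.

```python
import base64
import hashlib
import hmac


def sign_request(string_to_sign: str, secret_key_b64: str) -> str:
    """Client side: return a Base64 HMAC-SHA256 signature over the string."""
    key = base64.b64decode(secret_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(string_to_sign: str, secret_key_b64: str, signature: str) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(string_to_sign, secret_key_b64)
    return hmac.compare_digest(expected, signature)


# Hypothetical key and request description, for illustration only.
demo_key = base64.b64encode(b"not-a-real-key").decode("ascii")
demo_string = "PUT\n/myaccount/mycontainer/myblockblob"
demo_signature = sign_request(demo_string, demo_key)
```

Because both sides hold the same secret key, the server can authenticate the request without the key ever travelling over the network.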
Accessing Storage via REST
• Every storage item is identified by a Request URI:
– Blob: <http|https>://<account name>.blob.core.windows.net/<container>/<blob name>
– Table: <http|https>://<account name>.table.core.windows.net/<table name>
– Queue: <http|https>://<account name>.queue.core.windows.net/<queue name>
• HTTP verb represents the action executed on the resource:
– PUT: create container/queue/table, save blob/entity/queue item, set metadata, …
– GET: list blobs in container, get blob/entity/queue item, …
– DELETE: delete container/queue/table/blob/entity/queue item, …
• The REST API provides a platform-independent interface to
Azure storage.
• Example:
PUT http://myaccount.blob.core.windows.net/mycontainer/myblockblob HTTP/1.1
x-ms-version: 2009-09-19
x-ms-date: Sun, 27 Sep 2009 22:33:35 GMT
Content-Type: application/octet-stream
x-ms-blob-type: BlockBlob
x-ms-meta-m1: some metadata
Authorization: SharedKey myaccount:YjN4fAR8/AmBrqBz7MG2uFinQh4dscbj598g=
Content-Length: nn
Request Body
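As a small illustration of the URI scheme listed above, the following helpers build the request URIs from their parts. The function names are assumptions; only the URI patterns come from the slide.

```python
def blob_uri(account: str, container: str, blob: str, scheme: str = "https") -> str:
    """URI of a blob inside a container."""
    return f"{scheme}://{account}.blob.core.windows.net/{container}/{blob}"


def table_uri(account: str, table: str, scheme: str = "https") -> str:
    """URI of a table."""
    return f"{scheme}://{account}.table.core.windows.net/{table}"


def queue_uri(account: str, queue: str, scheme: str = "https") -> str:
    """URI of a queue."""
    return f"{scheme}://{account}.queue.core.windows.net/{queue}"


# Rebuilds the URI used in the PUT example above.
example_uri = blob_uri("myaccount", "mycontainer", "myblockblob", scheme="http")
```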
Blobs
• Blob Containers provide a grouping of a set of
blobs.
– Sharing policies are set at container level: public
read or private.
– E.g. a Facebook album (Public, Friends, Private)
– Containers can have metadata.
– E.g. date/time of creation
• Blobs store big chunks of data.
– Blobs can have metadata.
Storage Account
Blob Container 1
Page
Blob 1
Page
Blob 2
Block Blob 1
B1 B2 B3
Block Blob 1
B1 B2 B3
Blob Container 2
Block Blob
blocks
Page Blob
pages
Two types of Blobs
Uploading a Block Blob
Page Blob – Random Read/Write
Blobs (cont.)
• Windows Azure Blob storage is a service for
storing large amounts of unstructured data that
can be accessed from anywhere in the world via
HTTP or HTTPS.
• Common uses of Blob storage include:
• Serving images, videos or documents directly to a browser
• Storing files for distributed access
• Streaming video and audio
• Performing secure backup and disaster recovery
Blob Storage Concepts
Account / Container / Blob
cohowinery
images
PIC01.JPG
PIC02.JPG
videos VID1.AVI
http://<account>.blob.core.windows.net/<container>/<blobname>
Blobs
Blob Containers
Multiple Containers per Account
Special $root container
Blob Container
A container holds a set of blobs
Set access policies at the container level
Associate Metadata with Container
List the blobs in a container
Including Blob Metadata and MD5
No search/query, i.e. no WHERE MetadataValue = ?
Blobs Throughput
Effectively in Partition of 1
Target of 60MB/s per Blob
Blob Features and Functions
PutBlob
GetBlob
DeleteBlob
CopyBlob
<name, value>
Windows Azure Tables
Tables
Table Storage Concepts
Account / Table / Entity
cohowinery
Comments
Name =…
Comment= …
Advertisements
URI =…
Description =…
Name =…
Comment= …
URI =…
Description =…
A closer look at tables
[Figure: Windows Azure Storage. A storage account contains tables; each table contains entities; each entity contains properties; each property has a name, a type and a value.]
Tables
• Tables provide structured storage.
• Tables contain a set of entities.
• Entities contain a set of properties.
• Properties have
– a name, and
– a typed value (Int32, Int64, double, string, bool,
DateTime, byte array).
• Tables have no fixed schema, i.e. the structure of
each entity may be different.
• Each entity has two key properties:
– The partition key, and
– the row key.
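A table without a fixed schema can be modelled as a dictionary of dictionaries. The sketch below is not the Azure Storage client API; the helper names, the mapping of the listed property types to Python types, and the `Employer` property are assumptions made for the example.

```python
from datetime import datetime

# Property types from the list above, mapped to Python types (an assumption).
ALLOWED_TYPES = (int, float, str, bool, datetime, bytes)


def make_entity(partition_key: str, row_key: str, **properties) -> dict:
    """Build an entity: two mandatory keys plus free-form typed properties."""
    for name, value in properties.items():
        if not isinstance(value, ALLOWED_TYPES):
            raise TypeError(f"unsupported type for property {name!r}")
    return {"PartitionKey": partition_key, "RowKey": row_key, **properties}


def insert(table: dict, entity: dict) -> None:
    """Store the entity under its (PartitionKey, RowKey) pair."""
    key = (entity["PartitionKey"], entity["RowKey"])
    if key in table:
        raise KeyError(f"entity {key} already exists")
    table[key] = entity


# Two entities in the same table with different sets of properties.
people = {}
insert(people, make_entity("2", "1", Name="Meyers", DOB=datetime(1999, 5, 5)))
insert(people, make_entity("2", "2", Name="Gates", Employer="Microsoft"))
```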
Storage Account
Table1
Table 2
Entity 1
Property 1
Property 2
…
Entity 2
Property 1
Property 2
…
Partitioning of Tables
• The application controls the distribution of
entities by assigning partition keys to entities.
• If two entities have different partition keys they
may be stored in different storage nodes.
• Only entities stored within the same table and the
same partition can take part in an entity group
transaction.
• Example: if partition key 2 = GHRCE, all friends’ names and
DOBs are stored in Node 2.
If partition key 1 = Shivaji College, all friends’ names and
DOBs are stored in Node 1.
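One way to picture how a partition key decides placement is a hash of the key taken modulo the number of nodes. This is an assumption for illustration; the slides do not say how the service actually maps partitions to storage nodes, and the function names are hypothetical.

```python
import hashlib


def node_for(partition_key: str, node_count: int) -> int:
    """Map a partition key to a node index; equal keys always share a node."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def can_share_transaction(a: dict, b: dict) -> bool:
    """Entity group transactions need the same table and the same partition."""
    return a["Table"] == b["Table"] and a["PartitionKey"] == b["PartitionKey"]


friend_1 = {"Table": "Person", "PartitionKey": "GHRCE", "RowKey": "1"}
friend_2 = {"Table": "Person", "PartitionKey": "GHRCE", "RowKey": "2"}
friend_3 = {"Table": "Person", "PartitionKey": "Shivaji College", "RowKey": "3"}
```

Entities with the same partition key always land on the same node, while entities with different keys may or may not share one.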
Storage Node 2
Person
Name Meyers
DOB 1999/5/5
Row Key 1
Partition Key 2
Name Gates
Row Key 2
Partition Key 2
Storage Node 1
Person
Name Gosling
DOB 1955/9/6
Row Key 3
Partition Key 1
DOB 1955/4/3
Entity Properties
Entity can have up to 255 properties
Up to 1MB per entity
Mandatory Properties for every entity
PartitionKey & RowKey
Uniquely identifies an entity
Defines the sort order
No fixed schema for other properties
Each property is stored as a <name, typed value> pair
No schema stored for a table
No Fixed Schema
Table Operations
Create, Delete
Insert
Update
Merge
Replace
Delete
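The difference between Merge and Replace is easy to miss, so here is a sketch on plain dictionaries. It models the semantics only (Replace overwrites the whole entity, Merge keeps properties that the update does not mention); the function names are assumptions, not calls from a storage library.

```python
KEY_PROPERTIES = ("PartitionKey", "RowKey")


def replace_entity(stored: dict, update: dict) -> dict:
    """Replace: the stored entity is overwritten; omitted properties are dropped."""
    keys = {name: stored[name] for name in KEY_PROPERTIES}
    return {**update, **keys}


def merge_entity(stored: dict, update: dict) -> dict:
    """Merge: supplied properties overwrite, all other properties are kept."""
    keys = {name: stored[name] for name in KEY_PROPERTIES}
    return {**stored, **update, **keys}


stored = {"PartitionKey": "2", "RowKey": "1", "Name": "Meyers", "DOB": "1999/5/5"}
update = {"Name": "Meyers-Smith"}

replaced = replace_entity(stored, update)  # DOB is gone
merged = merge_entity(stored, update)      # DOB is kept
```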
Queue
Queue Storage Concepts
Account / Queue / Message
order processing
customer ID
order ID
http://…
customer ID
order ID
http://…
cohowinery
Queue
Using Queues
[Figure: a Web Role (ASP.NET, WCF, etc.) hands work to a Worker Role (main() { … }) through a queue]
1) Web Role: receive work
2) Web Role: put work in queue
3) Worker Role: get work from queue
4) Worker Role: do work
Queues
Queues
• Queues provide a reliable asynchronous message
delivery mechanism.
• Reliability
– A message is redelivered until a consumer
confirms that it has been successfully processed.
– Messages are processed at least once.
• Decoupling of producer and consumer
– Different parts of the system can be implemented
using different technologies and languages.
– Producer and consumer needn’t be alive at the
same time.
• Scalability
– Queue length grows → number of consumer
instances can be increased.
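The decoupling of producer and consumer can be tried out locally with Python's thread-safe `queue.Queue`. This is only an in-process stand-in for a cloud queue; the producer plays the Web role, the three consumer threads play Worker role instances, and all names are assumptions.

```python
import queue
import threading

work_queue = queue.Queue()
results = []
results_lock = threading.Lock()


def producer(order_ids):
    """Web-role side: put work items into the queue and return at once."""
    for order_id in order_ids:
        work_queue.put(order_id)


def consumer():
    """Worker-role side: take items from the queue until told to stop."""
    while True:
        order_id = work_queue.get()
        if order_id is None:  # sentinel value: no more work
            work_queue.task_done()
            break
        with results_lock:
            results.append(f"processed {order_id}")
        work_queue.task_done()


# More consumers can be started when the queue grows.
workers = [threading.Thread(target=consumer) for _ in range(3)]
for worker in workers:
    worker.start()
producer(range(10))
for _ in workers:
    work_queue.put(None)
work_queue.join()
for worker in workers:
    worker.join()
```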
Message
Producer
Message
Consumer
CloudQueue
Message
CloudQueue
Message
Queue
Inter-role Communication
• Role instances can communicate asynchronously via queues.
– Preferred method for reliable messaging
Role
Instance 1
Queue
Role
Instance 2
Role
Instance 3
Role
Instance 4
• Role instances can also communicate directly using TCP or HTTP(S)
connections.
External
App
Role
Instance 2
Load
Balancer
Role
Instance 1
Role
Instance 3
Role
Instance 4
External
Endpoint
Internal
Endpoint
Queues – Features and Limitations
• Queue
– The number of messages is not limited.
– Metadata can be associated.
– There is no guaranteed return order of messages.
– Messages are delivered at least once → consumer must be able to
deal with messages that are delivered more than once.
• Message
– A message can be up to 8KB in size.
– Message structure:
• MessageID: a GUID
• VisibilityTimeout: If processing of the message is not confirmed within this time
range, the message reappears in the queue (default = 30 seconds)
• MessageTTL: Time-to-live interval for messages (maximum = default = 7
days)
• Payload: String or byte array.
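The visibility timeout is what produces at-least-once delivery, and a small in-memory model shows why. `SketchQueue` is a hypothetical class written for this example, not the Azure queue API; time is passed in explicitly so the behaviour is easy to follow, and MessageTTL is not modelled.

```python
import uuid


class SketchQueue:
    """In-memory model of a queue with a visibility timeout."""

    def __init__(self, visibility_timeout: float = 30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> (payload, time it becomes visible)

    def put(self, payload: str) -> str:
        message_id = str(uuid.uuid4())
        self._messages[message_id] = (payload, 0.0)
        return message_id

    def get(self, now: float):
        """Return one visible message and hide it until the timeout expires."""
        for message_id, (payload, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[message_id] = (payload, now + self.visibility_timeout)
                return message_id, payload
        return None

    def delete(self, message_id: str) -> None:
        """Confirm successful processing; the message will not reappear."""
        self._messages.pop(message_id, None)


orders = SketchQueue(visibility_timeout=30.0)
orders.put("order-1")
first_delivery = orders.get(now=0.0)    # delivered, then hidden for 30 s
while_hidden = orders.get(now=10.0)     # nothing visible yet
second_delivery = orders.get(now=31.0)  # not confirmed in time: delivered again
orders.delete(second_delivery[0])       # confirmed, so it is gone for good
after_delete = orders.get(now=100.0)
```

Since the same message can be handed out twice, the consumer's processing should be safe to repeat.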
Loosely Coupled Interaction with Queues
Azure Queue
Input Queue
(Work Items)
Web
Role
Web
Role
Web
Role
Hello Cloud
THE FABRIC
• All Windows Azure applications and all of the
data in Windows Azure Storage live in some
Microsoft data center.
• Within that data center, the set of machines
dedicated to Windows Azure is organized into
a fabric (structure, framework).
• Windows Azure Fabric consists of a (large)
group of machines, all of which are managed
by software called the fabric controller.
Cont…
• The fabric controller is replicated across a
group of five to seven machines.
• Because it can communicate with a fabric
agent on every computer, it’s also aware of
every Windows Azure application in this fabric.
• It monitors all running applications, for
example, giving it an up-to-the-minute picture
of what’s happening in the fabric.
CREATING A SCALABLE WEB APPLICATION
• To create a scalable Web application on Windows Azure, a developer can use Web roles and tables.
• The clients are browsers, and so the application logic
might be implemented using ASP.NET or another Web
technology.
• It’s also possible to create a scalable Web application
that exposes RESTful and/or SOAP-based Web services
using WCF.
• The fabric controller also monitors the Web role
instances, making sure that the requested number is
always available.
• For data storage, the application uses Windows Azure
Storage tables, which provide scale-out storage capable
of handling very large amounts of data.
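The idea that the fabric controller keeps the requested number of instances available can be sketched as a reconciliation step. This is a guess at the general shape of such logic, not Microsoft's implementation; the function name and the instance names are hypothetical.

```python
def reconcile(requested: int, running: set, healthy: set):
    """Return the instances to keep and how many new ones must be started."""
    alive = running & healthy
    to_start = max(0, requested - len(alive))
    return alive, to_start


# Three instances were requested, but instance "b" has failed.
kept, missing = reconcile(3, {"a", "b", "c"}, {"a", "c"})
```

Running such a step repeatedly brings the number of healthy instances back to the requested count after a failure.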
So what is meant by a scalable application?
• In a conventional data center, handling peak hours requires always having
enough machines on hand, even though most of those systems go unused
most of the time.
• But if the application is built on Windows Azure, the
organization running it can expand the number of
instances it’s using only when needed, then shrink back
to a smaller number.
• Examples: an online IPL ticket-selling site, customer call
centers, video sites with occasional hot
stories, government online exam web sites that are
used mostly at certain times of day, and others.
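A short calculation shows why expanding and shrinking the number of instances pays off. The load figures and the capacity of 100 requests per second per instance are invented for the example.

```python
import math


def instances_needed(load: float, capacity_per_instance: float, minimum: int = 1) -> int:
    """Smallest number of instances that can carry the given load."""
    return max(minimum, math.ceil(load / capacity_per_instance))


# Hypothetical requests per second in four consecutive hours, with one peak.
hourly_load = [50, 50, 900, 120]
elastic = [instances_needed(load, 100) for load in hourly_load]

elastic_instance_hours = sum(elastic)                  # scale up only when needed
fixed_instance_hours = max(elastic) * len(hourly_load)  # provision for the peak all day
```

With these numbers the elastic deployment uses 13 instance-hours, while a data center sized for the peak uses 36.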
A PARALLEL PROCESSING APPLICATION
• The parallel work is done by some number of Worker role
instances running simultaneously, each using blob data.
• Since Windows Azure imposes no limit on how long an
instance can run, each one can perform an arbitrary
amount of work.
• Through a Web role instance that acts as the user interface, the user might
determine how many Worker instances should
run, start and stop those instances, get
results, and more.
• Communication between the Web role
instance and the Worker role instances relies
on Windows Azure Storage queues.
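The pattern of several workers each processing its own blob can be imitated locally with a thread pool. The blob names and contents below are made up, and a dictionary stands in for blob storage; on Windows Azure each worker would be a separate Worker role instance.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for blob storage: blob name -> blob content (hypothetical data).
blobs = {
    "part-0": b"alpha beta gamma",
    "part-1": b"delta epsilon",
    "part-2": b"zeta",
}


def count_words(blob_name: str):
    """Work done by one worker: count the words in a single blob."""
    return blob_name, len(blobs[blob_name].split())


# Three workers run side by side, one blob each.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_counts = dict(pool.map(count_words, blobs))

total_words = sum(partial_counts.values())
```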
A parallel processing application can
communicate with an on-premises application.
• Rather than relying on a Web role instance running on
Windows Azure, the user might instead interact with the
Worker role instances via an on-premises application.
• Multiple Worker role instances run simultaneously, each
interacting with the outside world via queues.
• Here, however, work is put into those queues directly by
an on-premises application.
• In a scenario like this, the user might have no idea that
the on-premises application he’s using relies on
Windows Azure for parallel processing.
• Example: a mobile tracking application.
On-premises software is installed
and run on computers on the
premises (in the building) of the
person or organisation using the
software, rather than at a remote
facility, such as at a server or
cloud somewhere on the
Internet.
CREATING A SCALABLE WEB APPLICATION WITH
BACKGROUND PROCESSING
• The majority of today’s applications are built to provide a
browser interface.
• Applications that do nothing but accept and respond to browser
requests are useful, but they’re also limiting.
• There are lots of situations where Web-accessible
software also needs to initiate work that runs in the
background, independently from the request/response
part of the application.
• For example, online recommendation systems,
Facebook, etc.
• Instead, the part of the application that accepts
browser requests should be able to initiate a
background task that carries out this work
without the knowledge of users.
• Windows Azure Web roles and Worker roles can
be used together to address this scenario, i.e.
background processing.
• Two other major parts of the Windows Azure Platform are:
– Azure AppFabric
– SQL Azure
AppFabric
• Windows Azure AppFabric provides a comprehensive cloud
middleware platform for developing, deploying and managing
applications on the Windows Azure Platform.
• It enables bridging your existing applications to the cloud
through secure connectivity across network and geographic
boundaries.
• Example: letting an existing WhatsApp application connect to Facebook’s cloud
application.
• It delivers additional developer productivity, adding in higher-
level Platform-as-a-Service (PaaS) capabilities on top of the
familiar Windows Azure application model.
• There are three key components of Windows Azure
AppFabric:
• Middleware Services: platform capabilities as
services, which raise the level of abstraction and
reduce complexity of cloud development.
• Composite Applications: a set of new innovative
frameworks, tools and composition engine to easily
assemble, deploy, and manage a composite
application as a single logical entity.
• Scale-out application infrastructure: optimized for
cloud-scale services and mid-tier components.
AppFabric
AppFabric
• Middleware Services
• Windows Azure AppFabric provides pre-built, higher level
middleware services that raise the level of abstraction and
reduce complexity of cloud development.
• These services are open and interoperable across languages
(.NET, Java, Ruby, PHP…) and give developers a powerful
pre-built “class library" for next-gen cloud applications.
• Developers can use each of the services stand-alone, or
combine services to provide a composite solution.
AppFabric
• Service Bus: provides secure messaging and connectivity
capabilities that enable building distributed and disconnected
applications in the cloud, as well as hybrid applications across
both on-premises and the cloud.
• It enables using various communication and messaging
protocols and patterns, and removes the need for the
developer to worry about delivery assurance, reliable
messaging and scale.
• Access Control: enables an easy way to
provide identity and access control to web
applications and services, while integrating
with standards-based identity providers,
including enterprise directories such as
Active Directory®, and web identities such as
Windows Live ID, Google, Yahoo! and
Facebook.
• Caching: accelerates performance of Windows Azure and SQL Azure
based apps by providing a distributed, in-memory application
cache, provided entirely as a service (no installation or
management of instances, dynamically increase/decrease
cache size as needed).
• Pre-integration with ASP.NET enables easy acceleration of
web applications without having to modify application code.
• Integration: provides common integration capabilities on Windows Azure, using out-of-box
integration patterns to accelerate and simplify development.
• It will also deliver higher level business user enablement
capabilities such as Business Activity Monitoring and Rules, as
well as self-service trading partner community portal and
provisioning of business-to-business pipelines.
• Composite App: it delivers a complete hosting
environment for web services built using Windows
Communication Foundation (including WCF Data Services and
WCF RIA Services) and workflows built using Windows
Workflow Foundation.
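The benefit of the Caching service described above can be illustrated with the common cache-aside pattern. This sketch uses a plain in-process dictionary rather than a distributed cache, and the class name, the loader function and the product data are assumptions.

```python
class CacheAside:
    """Look in the cache first; fall back to the database only on a miss."""

    def __init__(self, load_from_database):
        self._load = load_from_database
        self._cache = {}
        self.database_reads = 0

    def get(self, key):
        if key not in self._cache:
            self.database_reads += 1
            self._cache[key] = self._load(key)
        return self._cache[key]


# Hypothetical loader standing in for a slow database query.
catalog = CacheAside(lambda product_id: {"id": product_id, "name": f"product-{product_id}"})
first = catalog.get(7)   # miss: reads the database
second = catalog.get(7)  # hit: served from memory
```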
More Info
Who’s Live on Windows Azure
SQL Azure
• SQL Server hosted in the cloud.
• It provides relational database features on a
platform that is scalable, highly available and load-balanced.
SQL Azure Database

More Related Content

What's hot

What's hot (20)

Cloud Infrastructure Mechanisms
Cloud Infrastructure MechanismsCloud Infrastructure Mechanisms
Cloud Infrastructure Mechanisms
 
Virtual machine security
Virtual machine securityVirtual machine security
Virtual machine security
 
PPT on Cloud computing
PPT on Cloud computingPPT on Cloud computing
PPT on Cloud computing
 
Green cloud computing
Green cloud computingGreen cloud computing
Green cloud computing
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Distributed Server
Distributed ServerDistributed Server
Distributed Server
 
Network security
Network security Network security
Network security
 
Server virtualization
Server virtualizationServer virtualization
Server virtualization
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Proxy Server
Proxy ServerProxy Server
Proxy Server
 
PRESENTATION ON CLOUD COMPUTING
PRESENTATION ON CLOUD COMPUTINGPRESENTATION ON CLOUD COMPUTING
PRESENTATION ON CLOUD COMPUTING
 
Implementation levels of virtualization
Implementation levels of virtualizationImplementation levels of virtualization
Implementation levels of virtualization
 
Containerized Cloud Computing - Redhat
Containerized Cloud Computing - RedhatContainerized Cloud Computing - Redhat
Containerized Cloud Computing - Redhat
 
Distributed computing
Distributed computingDistributed computing
Distributed computing
 
Top 10 cloud service providers
Top 10 cloud service providersTop 10 cloud service providers
Top 10 cloud service providers
 
Virtualization in cloud computing ppt
Virtualization in cloud computing pptVirtualization in cloud computing ppt
Virtualization in cloud computing ppt
 
Firewalls
FirewallsFirewalls
Firewalls
 
Mobile cloud Computing
Mobile cloud ComputingMobile cloud Computing
Mobile cloud Computing
 
Physical design of io t
Physical design of io tPhysical design of io t
Physical design of io t
 
Data center virtualization
Data center virtualizationData center virtualization
Data center virtualization
 

Similar to Concepts of Distributed Computing & Cloud Computing

cloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdfcloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdfArchanaPandiyan
 
Grid and Cloud Computing Lecture 1a.pptx
Grid and Cloud Computing Lecture 1a.pptxGrid and Cloud Computing Lecture 1a.pptx
Grid and Cloud Computing Lecture 1a.pptxDrAdeelAkram2
 
_Cloud_Computing_Overview.pdf
_Cloud_Computing_Overview.pdf_Cloud_Computing_Overview.pdf
_Cloud_Computing_Overview.pdfTyStrk
 
Week 1 Lecture_1-5 CC_watermark.pdf
Week 1 Lecture_1-5 CC_watermark.pdfWeek 1 Lecture_1-5 CC_watermark.pdf
Week 1 Lecture_1-5 CC_watermark.pdfJohn422973
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material ccAnkit Gupta
 
vssutcloud computing.pptx
vssutcloud computing.pptxvssutcloud computing.pptx
vssutcloud computing.pptxMunmunSaha7
 
Computing notes
Computing notesComputing notes
Computing notesthenraju24
 
20IT703_PDS_PPT_Unit_I.ppt
20IT703_PDS_PPT_Unit_I.ppt20IT703_PDS_PPT_Unit_I.ppt
20IT703_PDS_PPT_Unit_I.pptsuganthi66742
 
Lect 1 Distributed System.pptx
Lect 1 Distributed System.pptxLect 1 Distributed System.pptx
Lect 1 Distributed System.pptxPardonSamson
 
CS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptx
CS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptxCS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptx
CS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptxMALATHYANANDAN
 
Lect 2 Types of Distributed Systems.pptx
Lect 2 Types of Distributed Systems.pptxLect 2 Types of Distributed Systems.pptx
Lect 2 Types of Distributed Systems.pptxPardonSamson
 
UNIT I DIS.pptx
UNIT I DIS.pptxUNIT I DIS.pptx
UNIT I DIS.pptxSamPrem3
 
OIT552 Cloud Computing Material
OIT552 Cloud Computing MaterialOIT552 Cloud Computing Material
OIT552 Cloud Computing Materialpkaviya
 
introduction to cloud computing for college.pdf
introduction to cloud computing for college.pdfintroduction to cloud computing for college.pdf
introduction to cloud computing for college.pdfsnehan789
 
01Introduction to Cloud Computing .pptx
01Introduction to Cloud Computing  .pptx01Introduction to Cloud Computing  .pptx
01Introduction to Cloud Computing .pptxssuser586772
 

Similar to Concepts of Distributed Computing & Cloud Computing (20)

cloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdfcloudcomputingdistributedcomputing-171208050503 (1).pdf
cloudcomputingdistributedcomputing-171208050503 (1).pdf
 
Grid and Cloud Computing Lecture 1a.pptx
Grid and Cloud Computing Lecture 1a.pptxGrid and Cloud Computing Lecture 1a.pptx
Grid and Cloud Computing Lecture 1a.pptx
 
_Cloud_Computing_Overview.pdf
_Cloud_Computing_Overview.pdf_Cloud_Computing_Overview.pdf
_Cloud_Computing_Overview.pdf
 
Week 1 Lecture_1-5 CC_watermark.pdf
Week 1 Lecture_1-5 CC_watermark.pdfWeek 1 Lecture_1-5 CC_watermark.pdf
Week 1 Lecture_1-5 CC_watermark.pdf
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material cc
 
vssutcloud computing.pptx
vssutcloud computing.pptxvssutcloud computing.pptx
vssutcloud computing.pptx
 
CCUnit1.pdf
CCUnit1.pdfCCUnit1.pdf
CCUnit1.pdf
 
1.intro. to distributed system
1.intro. to distributed system1.intro. to distributed system
1.intro. to distributed system
 
Computing notes
Computing notesComputing notes
Computing notes
 
20IT703_PDS_PPT_Unit_I.ppt
20IT703_PDS_PPT_Unit_I.ppt20IT703_PDS_PPT_Unit_I.ppt
20IT703_PDS_PPT_Unit_I.ppt
 
Unit 1
Unit 1Unit 1
Unit 1
 
Lect 1 Distributed System.pptx
Lect 1 Distributed System.pptxLect 1 Distributed System.pptx
Lect 1 Distributed System.pptx
 
CS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptx
CS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptxCS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptx
CS8791 CLOUD COMPUTING_UNIT-I_FINAL_ppt (1).pptx
 
Unit 1
Unit 1Unit 1
Unit 1
 
Lect 2 Types of Distributed Systems.pptx
Lect 2 Types of Distributed Systems.pptxLect 2 Types of Distributed Systems.pptx
Lect 2 Types of Distributed Systems.pptx
 
UNIT I DIS.pptx
UNIT I DIS.pptxUNIT I DIS.pptx
UNIT I DIS.pptx
 
Unit 1
Unit 1Unit 1
Unit 1
 
OIT552 Cloud Computing Material
OIT552 Cloud Computing MaterialOIT552 Cloud Computing Material
OIT552 Cloud Computing Material
 
introduction to cloud computing for college.pdf
introduction to cloud computing for college.pdfintroduction to cloud computing for college.pdf
introduction to cloud computing for college.pdf
 
01Introduction to Cloud Computing .pptx
01Introduction to Cloud Computing  .pptx01Introduction to Cloud Computing  .pptx
01Introduction to Cloud Computing .pptx
 

More from Hitesh Kumar Markam (19)

Data guard
Data guardData guard
Data guard
 
Tunning overview
Tunning overviewTunning overview
Tunning overview
 
Resize sga
Resize sgaResize sga
Resize sga
 
Rman offline backup
Rman offline backupRman offline backup
Rman offline backup
 
Oracle shutdown
Oracle shutdownOracle shutdown
Oracle shutdown
 
Log miner in oracle.ppt
Log miner in oracle.pptLog miner in oracle.ppt
Log miner in oracle.ppt
 
Dba in 2 days unit no 9
Dba in 2 days unit no 9Dba in 2 days unit no 9
Dba in 2 days unit no 9
 
5 backuprecoveryw imp
5 backuprecoveryw imp5 backuprecoveryw imp
5 backuprecoveryw imp
 
Rmanpres
RmanpresRmanpres
Rmanpres
 
Pl sql
Pl sqlPl sql
Pl sql
 
Oracle archi ppt
Oracle archi pptOracle archi ppt
Oracle archi ppt
 
Lecture2 oracle ppt
Lecture2 oracle pptLecture2 oracle ppt
Lecture2 oracle ppt
 
Dbms objective and subjective notes
Dbms objective and subjective notesDbms objective and subjective notes
Dbms objective and subjective notes
 
Dba in 2 days
Dba in 2 daysDba in 2 days
Dba in 2 days
 
Creating database
Creating databaseCreating database
Creating database
 
1 plsql introduction1
1 plsql introduction11 plsql introduction1
1 plsql introduction1
 
Store programs
Store programsStore programs
Store programs
 
javascript code for mysql database connection
javascript code for mysql database connectionjavascript code for mysql database connection
javascript code for mysql database connection
 
Advanced Planning And Optimization
Advanced Planning And OptimizationAdvanced Planning And Optimization
Advanced Planning And Optimization
 

Concepts of Distributed Computing & Cloud Computing

  • 3. Contents • Introduction – Motivation – History – Goal – Characteristics • Example of Applications • Scenario • Advantages and Disadvantages • Issues and Challenges • Conclusion • References
  • 4. Introduction • Definition – “A distributed system consists of multiple autonomous computers that communicate through a computer network.” – “Distributed computing utilizes a network of many computers, each accomplishing a portion of an overall task, to achieve a computational result much more quickly than with a single computer.” – “Distributed computing is any computing that involves multiple computers remote from each other that each have a role in a computation problem or information processing.”
  • 5. Introduction • A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing. • In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system. • These networked computers may be in the same room, on the same campus, in the same country, or on different continents.
  • 7. Motivation • Inherently distributed applications • Performance/cost • Resource sharing • Flexibility and extensibility • Availability and fault tolerance • Scalability • Network connectivity is increasing. • A combination of cheap processors is often more cost-effective than one expensive fast system. • Potential increase in reliability.
  • 8. History • 1975 – 1985 – Parallel computing was favored in the early years – Primarily vector-based at first – Gradually more thread-based parallelism was introduced – The first distributed computing programs were a pair of programs called Creeper and Reaper, invented in the 1970s – Ethernet was invented in the 1970s. – ARPANET e-mail was invented in the early 1970s and is probably the earliest example of a large-scale distributed application.
  • 9. History • 1985 – 1995 – Massively parallel architectures started rising, and message passing interface and other libraries were developed – Bandwidth was a big problem – The first Internet-based distributed computing project was started in 1988 by the DEC System Research Center. – Distributed.net was a project founded in 1997, considered the first to use the Internet to distribute data for calculation and collect the results.
  • 10. History • 1995 – Today – Cluster/grid architecture increasingly dominant – Special node machines eschewed in favor of COTS technologies – Web-wide cluster software – Google takes this to the extreme (thousands of nodes per cluster) – SETI@Home started in May 1999 to analyze the radio signals collected by the Arecibo Radio Telescope in Puerto Rico.
  • 11. Goal • Making Resources Accessible – Data sharing and device sharing • Distribution Transparency – Access, location, migration, relocation, replication, concurrency, failure • Communication – Make human-to-human communication easier, e.g., electronic mail • Flexibility – Spread the workload over the available machines in the most cost-effective way • To coordinate the use of shared resources • To solve large computational problems
  • 12. Characteristics • Resource Sharing • Openness • Concurrency • Scalability • Fault Tolerance • Transparency
  • 13. Architecture • Client-server • 3-tier architecture • N-tier architecture • loose coupling, or tight coupling • Peer-to-peer • Space based
  • 14. Examples of commercial applications: – Database Management System – Distributed computing using mobile agents – Local intranet – Internet (World Wide Web) – Java Remote Method Invocation (RMI) application
  • 15. Distributed Computing Using Mobile Agents • Mobile agents can wander around a network, using free resources for their own computations.
  • 16. Local Intranet • A local intranet is a portion of the Internet that is separately administered and supports internal sharing of resources (file/storage systems and printers).
  • 17. Internet • The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP).
  • 18. Java RMI • Embedded in the Java language:- – Object variant of remote procedure call – Adds naming compared with RPC (Remote Procedure Call) – Restricted to Java environments RMI Architecture
  • 19. Categories of Applications in distributed computing • Science • Life Sciences • Cryptography • Internet • Financial • Mathematics • Language • Art • Puzzles/Games • Miscellaneous • Distributed Human Project • Collaborative Knowledge Bases • Charity
  • 20. Example of applications • Internet – Gomez Distributed PEER Client (peerReview) – Evaluate the performance of large websites to find bottlenecks. • Life Sciences - Compute Against Cancer® (CAC) – Create immediate impact in the lives of cancer patients and their families today, while at the same time empowering the research that will result in improved therapies — and perhaps even the cure.
  • 21. Example of applications • Collaborative Knowledge Bases – Wikipedia – A collaborative project to produce a complete, free encyclopedia from scratch. – The encyclopedia is available in many non-English languages. • Distributed Human Projects – Open Mind Indoor Common Sense – Help teach indoor mobile robots to be smarter. It will create a repository of knowledge that will enable people to create more intelligent mobile robots for use in home and office environments.
  • 22. Scenario • The key characteristics of a variety of modern computing scenarios, distinguishing between micro-, medium- and global-scale scenarios. • Micro scale – Research focuses on small computer-based components that can be deployed in an environment and can coordinate their actions (i.e., local sensing and effecting of specific environmental conditions) with the goal of enriching our physical world with specific “smart” functionalities. Potential applications range from simple monitoring activities to smart materials and self-assembly of computational beings. • Medium scale – Our world is populated by local ad-hoc networks (e.g., the ensemble of Bluetooth-enabled devices we carry or find in our cars) and network-based furniture (e.g., Web-enabled fridges and ovens able to interact with each other and effectively support our cooking activities). • Global scale – In Internet computing, there is a need to access data and services according to a variety of patterns, independently of the availability or location of specific servers.
  • 23. Scenario Mechanisms of Self-organization in Modern Distributed Computing Scenarios
  • 24. Advantages • Economics:- – Computers harnessed together give a better price/performance ratio than mainframes. • Speed:- – A distributed system may have more total computing power than a mainframe. • Inherent distribution of applications:- – Some applications are inherently distributed, e.g., an ATM-banking application. • Reliability:- – If one machine crashes, the system as a whole can still survive if you have multiple server machines and multiple storage devices (redundancy). • Extensibility and Incremental Growth:- – It is possible to scale up gradually (in terms of processing power and functionality) by adding more resources (both hardware and software). This can be done without disruption to the rest of the system.
  • 25. Disadvantages • Complexity:- – Lack of experience in designing and implementing distributed systems, e.g., which platform (hardware and OS) to use, which language to use, etc. • Network problem:- – If the network underlying a distributed system saturates or goes down, the distributed system is effectively disabled, negating most of its advantages. • Security:- – Security is a major hazard, since easy access to data means easy access to secret data as well.
  • 26. Issues and Challenges • Heterogeneity of components:- – Variety or differences that apply to computer hardware, networks, operating systems, programming languages and implementations by different developers. – All differences in representation must be dealt with for messages to be exchanged. – Example: the calls for exchanging messages in UNIX differ from those in Windows. • Openness:- – The system can be extended and re-implemented in various ways. – This cannot be achieved unless the specification and documentation are made available to software developers. – The main challenge for designers is to tackle the complexity of distributed systems designed by different people.
  • 27. Issues and Challenges cont… • Transparency:- – Aim: make certain aspects of distribution invisible to application programmers, so that they can focus on the design of their particular application. – They need not be concerned with the location of resources or the details of how they operate, whether replicated or migrated. – Failures can be presented to application programmers in the form of exceptions, which must be handled.
  • 28. Issues and Challenges cont… • Transparency:- – This concept can be summarized as shown in this figure:
  • 29. Issues and Challenges cont… • Security:- – Security for information resources in a distributed system has three components: a. Confidentiality: protection against disclosure to unauthorized individuals. b. Integrity: protection against alteration or corruption. c. Availability: protection against interference with the means to access the resources. – The challenge is to send sensitive information over the Internet in a secure manner and to identify a remote user or other agent correctly.
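The integrity component above can be illustrated with a message authentication code. The following is a minimal sketch, not a complete security design: it assumes sender and receiver already share a secret key (the key value and the `sign`/`verify` helper names are hypothetical), and it addresses integrity and sender authenticity only, not confidentiality or availability.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real system would provision and store this securely.
SHARED_KEY = b"example-shared-key"


def sign(message, key=SHARED_KEY):
    """Return an HMAC-SHA256 tag that the sender attaches to the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message, tag, key=SHARED_KEY):
    """Recompute the tag and compare in constant time.

    False means the message or tag was altered in transit,
    or the sender did not hold the same key.
    """
    return hmac.compare_digest(sign(message, key), tag)
```

A receiver would call `verify(message, tag)` on every incoming message and discard any message for which it returns `False`.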
  • 30. Issues and Challenges cont… • Scalability:- – Distributed computing operates at many different scales, ranging from a small intranet to the Internet. – A system is scalable if it remains effective when there is a significant increase in the number of resources and users. – The challenges are: a. controlling the cost of physical resources. b. controlling the performance loss. c. preventing software resources from running out. d. avoiding performance bottlenecks.
  • 31. Issues and Challenges cont… • Failure Handling:- – Failures in a distributed system are partial: some components fail while others continue to function. – That is why handling failures is difficult: a. Detecting failures: some failures cannot be detected but may be suspected, and the system must manage in their presence. b. Masking failures: hiding failures is not guaranteed to work in the worst case. • Concurrency:- – Where applications or services process requests concurrently, their operations may conflict with one another and produce inconsistent results. – Each resource must be designed to be safe in a concurrent environment.
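As a small illustration of a resource that is safe in a concurrent environment, the sketch below guards a shared counter with a lock so that concurrent updates cannot interleave. The `SafeCounter` class and `run_workers` helper are hypothetical names introduced for this example, and threads within one process stand in for concurrent clients.

```python
import threading


class SafeCounter:
    """A shared resource whose update is protected by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below is a critical section:
        # only one thread at a time may execute it.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, workers=8, increments=1000):
    """Start several threads that all update the same counter, then wait for them."""
    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run_workers(SafeCounter(), 8, 1000)` always returns 8000; without it, updates could be lost and the result would be inconsistent.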
  • 32. What is Distributed Computing/System? • Distributed computing –A field of computing science that studies distributed systems. –The use of distributed systems to solve computational problems.
  • 33. • Distributed system – Wikipedia • There are several autonomous computational entities, each of which has its own local memory. • The entities communicate with each other by message passing. – Operating System Concept • The processors communicate with one another through various communication lines, such as high-speed buses or telephone lines. • Each processor has its own local memory.
  • 34. What is Distributed Computing/System? • Distributed program –A computer program that runs in a distributed system • Distributed programming –The process of writing distributed programs
  • 35. What is Distributed Computing/System? • Common properties – Fault tolerance • When one or more nodes fail, the whole system can still work, although with reduced performance. • Need to check the status of each node – Each node plays a partial role • Each computer has only a limited, incomplete view of the system. Each computer may know only one part of the input.
  • 36. – Resource sharing • Each user can share the computing power and storage resources in the system with other users – Load sharing • Dispatching tasks across the nodes spreads the load over the whole system. – Easy to expand • It is easy to expand the system by adding nodes.
  • 37. Why Distributed Computing? • The nature of the application • Performance – Computing intensive • The task could consume a lot of computing time. – Data intensive • The task deals with a large number of files or with very large files, for example Facebook, Orkut, Twitter, etc. • Robustness – No SPOF (Single Point Of Failure) – Other nodes can execute the task that was running on a failed node.
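The robustness point above, that another node can run the task of a failed node, can be sketched as a simple failover loop. Everything here is an assumption made for illustration: nodes are modelled as `(name, callable)` pairs, `NodeDown` is a hypothetical exception, and the task is assumed to be safe to run more than once.

```python
class NodeDown(Exception):
    """Raised by a (hypothetical) node that cannot run a task."""


def run_with_failover(task, nodes):
    """Try the task on each node in turn.

    `nodes` is a list of (name, execute) pairs, where execute(task)
    returns a result or raises NodeDown. Returns (name, result)
    from the first node that succeeds.
    """
    failed = []
    for name, execute in nodes:
        try:
            return name, execute(task)
        except NodeDown:
            failed.append(name)  # remember the failure and move on
    raise RuntimeError(f"task failed on all {len(failed)} nodes")
```

For example, with one node that raises `NodeDown` and a second that doubles its input, `run_with_failover(21, nodes)` returns the second node's name and 42.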
  • 38. Common Architectures • Communicate and coordinate work among concurrent processes – Processes communicate by sending/receiving messages – Synchronous/Asynchronous
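A minimal sketch of message passing between two concurrent activities follows. It uses threads and in-memory queues as stand-ins for processes and a network, which is an assumption of this example; the `worker` and `send_all` names are hypothetical. The sends are asynchronous: the sender queues all messages without waiting for replies.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive messages until a None sentinel arrives; reply with the squared value."""
    while True:
        msg = inbox.get()      # blocks until a message is available
        if msg is None:
            break
        outbox.put(("result", msg * msg))


def send_all(values):
    """Send every value to the worker asynchronously, then collect the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for v in values:
        inbox.put(v)           # asynchronous send: does not wait for a reply
    inbox.put(None)            # sentinel: no more messages
    t.join()
    replies = []
    while not outbox.empty():
        replies.append(outbox.get())
    return replies
```

A synchronous variant would call `outbox.get()` immediately after each `inbox.put(v)`, so the sender blocks until the reply to that message arrives.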
  • 39. Common Architectures • Master/Slave architecture – Master/slave is a model of communication where one device or process has unidirectional control over one or more other devices • Database replication –The source database can be treated as a master and the destination database can be treated as a slave. • Client-server –Web browsers and web servers
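The master/slave idea can be sketched as a master that splits a job into chunks, hands each chunk to a worker, and combines the partial results. This is an illustrative sketch only: a thread pool in one process stands in for separate machines, and `split`, `worker_sum` and `master_sum` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def worker_sum(chunk):
    """Work done by one worker: sum its own chunk."""
    return sum(chunk)


def master_sum(data, workers=4):
    """The master dispatches one chunk per worker and combines the partial sums."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(worker_sum, chunks))
    return sum(partials)
```

The control flow is one-directional, as in the definition above: the master decides what each worker does, and workers only return results.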
  • 40. Best Practice /Characteristics • Data Intensive or Computing Intensive – Data size and the amount of data • The attribute of data you consume • Computing intensive –We can move data to the nodes where we can execute jobs • Data Intensive –We can separate/replicate data to different nodes, then we can execute our tasks on these nodes –Reduce data replication when executing tasks
  • 41. Best Practice /Characteristics • Master nodes need to know data location • No data loss when incidents happen –SAN (Storage Area Network) –Data replication on different nodes • Synchronization –When splitting tasks to different nodes, how can we make sure these tasks are synchronized?
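One way a master could both decide and later recall where the copies of a data block live is a deterministic placement rule. The sketch below is an assumption made for illustration, not how any particular system (such as Hadoop) places replicas: it hashes a hypothetical block identifier to pick a starting node and uses the next nodes in the list for the remaining copies.

```python
import hashlib


def place_replicas(block_id, nodes, copies=3):
    """Return `copies` distinct nodes that should hold the given block.

    The placement is deterministic, so any node that knows the node list
    can recompute where a block is stored without a lookup table.
    """
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    digest = hashlib.sha256(block_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]
```

Because each block lands on several different nodes, losing a single node does not lose the data, which is the "no data loss when incidents happen" goal above.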
  • 42. Best Practice /Characteristics • Robustness – Still safe when one or several nodes fail – Need to recover when failed nodes come back online, with little or no further action needed – Failure detection • When any node fails, master nodes can detect the situation. – E.g., heartbeat detection – Apps/users do not need to know if a partial failure happens. • Restart tasks on other nodes for users
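Heartbeat detection can be sketched as a monitor that records when each node last reported in and suspects any node that has been silent for longer than a timeout. The `HeartbeatMonitor` class, its method names and the timeout value are hypothetical; the clock is injectable only so the behaviour can be demonstrated without waiting.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and reports nodes that have gone silent."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before a node is suspected
        self.clock = clock       # injectable time source
        self.last_seen = {}

    def heartbeat(self, node):
        """Called whenever a heartbeat message arrives from `node`."""
        self.last_seen[node] = self.clock()

    def suspected(self):
        """Return the sorted names of nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(n for n, t in self.last_seen.items() if now - t > self.timeout)
```

Note that a silent node is only suspected, not known to have failed: a slow network produces the same symptom, which matches the earlier point that failures may be suspected rather than detected.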
  • 43. Best Practice /Characteristics • Network issue – Bandwidth: need to consider bandwidth when copying files from one node to other nodes, if we want to execute the task on nodes that do not hold the data. • Scalability – Easy to expand – Hadoop: configuration modification and starting the daemon • Optimization – What can we do if the performance of some nodes is not good? – Monitor the performance of each node, according to information exchange such as heartbeats or logs – Resume the same task on another node
  • 44. Summary • The concept of distributed computing is the most efficient way to achieve optimization. • Distributed computing is everywhere: intranet, Internet or mobile ubiquitous computing (laptops, PDAs, pagers, smart watches, hi-fi systems) • It deals with hardware and software systems that contain more than one processing or storage element and run concurrently. • The main motivating factor is resource sharing, such as files, printers, web pages or database records. • Grid computing and cloud computing are forms of distributed computing.
  • 45. Case study - Condor Case study - Hadoop
  • 46. The History and Future of Cloud Computing
  • 47. What is Cloud Computing? • Cloud computing is a general term used to describe a new class of network-based computing that takes place over the Internet: – basically a step on from utility computing – a collection/group of integrated and networked hardware, software and Internet infrastructure (called a platform) – uses the Internet for communication and transport to provide hardware, software and networking services to clients • These platforms hide the complexity and details of the underlying infrastructure from users and applications by providing a very simple graphical interface or API (Application Programming Interface).
  • 48. What is Cloud Computing? • In addition, the platform provides on-demand services that are always on, anywhere, anytime and any place. • Pay per use and as needed; elastic: scale up and down in capacity and functionality • The hardware and software services are available to – the general public, enterprises, corporations and business markets
  • 49. Cloud Summary • Cloud computing is an umbrella term used to refer to Internet-based development and services • A number of characteristics define cloud data, applications, services and infrastructure: – Remotely hosted: Services or data are hosted on remote infrastructure. – Ubiquitous: Services or data are available from anywhere. – Commodified: The result is a utility computing model similar to that of traditional utilities, like gas and electricity: you pay for what you want!
  • 51. What is Cloud Computing • Shared pool of configurable computing resources • On-demand network access • Provisioned by the service provider • Adapted from: Effectively and Securely Using the Cloud Computing Paradigm by Peter Mell, Tim Grance
  • 52. Cloud Computing Characteristics • Common Characteristics: Low Cost Software, Virtualization, Service Orientation, Advanced Security, Homogeneity, Massive Scale, Elastic Computing, Geographic Distribution • Essential Characteristics: Resource Pooling, Broad Network Access, Rapid Elasticity, Measured Service, On-Demand Self-Service • Adapted from: Effectively and Securely Using the Cloud Computing Paradigm by Peter Mell, Tim Grance
  • 53. Essential Characteristics of cloud computing • On-demand self-service: A client can provision computer resources without the need for interaction with cloud service provider personnel. • Broad network access: Access to resources in the cloud is available over the network using standard methods in a manner that provides platform-independent access to clients of all types. – This includes a mixture of heterogeneous operating systems, and thick and thin platforms such as laptops and mobile phones. • Resource pooling: A cloud service provider creates resources that are pooled together in a system that supports multi-tenant usage.
  • 54. Benefits of cloud computing • Rapid elasticity: Resources can be rapidly and elastically provisioned. • Measured service: The use of cloud system resources is measured, audited, and reported to the customer based on a metered system.
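Measured service amounts to multiplying metered usage by a unit price for each resource. The sketch below shows that arithmetic; the resource names and rates in `RATES` are invented for illustration and do not correspond to any provider's price list.

```python
from decimal import Decimal

# Hypothetical unit prices per metered resource (currency units per unit of usage).
RATES = {
    "cpu_hours": Decimal("0.05"),
    "gb_stored": Decimal("0.02"),
    "gb_transferred": Decimal("0.09"),
}


def bill(usage):
    """Compute the charge for a usage report mapping resource name to quantity."""
    total = Decimal("0")
    for resource, quantity in usage.items():
        if resource not in RATES:
            raise KeyError(f"no rate defined for {resource!r}")
        total += RATES[resource] * Decimal(str(quantity))
    # Round to two decimal places, as on an invoice.
    return total.quantize(Decimal("0.01"))
```

For instance, 100 CPU hours and 50 GB stored at the rates above come to 5.00 + 1.00 = 6.00, and a customer with no usage is charged nothing.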
  • 55. Cloud Service Models • Software as a Service (SaaS) • Platform as a Service (PaaS) • Infrastructure as a Service (IaaS) • Examples shown: Google App Engine, SalesForce CRM, LotusLive • Adapted from: Effectively and Securely Using the Cloud Computing Paradigm by Peter Mell, Tim Grance
  • 56. Type of Cloud • Deployment models: This refers to the location and management of the cloud's infrastructure. • Service models: This consists of the particular types of services that you can access on a cloud computing platform.
  • 57. Classification of Clouds • Cloud services can also be categorized into three types based on access and location – A public cloud is available to anyone on the Internet – Any user can sign up to use the public cloud (e.g. Microsoft Windows Azure) – A private cloud is a proprietary cloud environment that only provides cloud services to a limited number of users – Private clouds are usually within your own data center behind your firewall
  • 58. Classification of Clouds – A hybrid cloud, sometimes called a virtual private cloud, provides services that run on a public cloud infrastructure, but limits access to it with a virtual private network (VPN) [Oracle 2009]
  • 59. Public Cloud • The main benefits of using a public cloud service are: – Managed infrastructure: easy and inexpensive set-up because hardware (and storage) is managed by the provider – Scalability/elasticity: you get more or less processing power, storage, etc. when needed – Eliminates complex procurement cycles, improving the time-to-market for its users – Removes undifferentiated “heavy lifting”: lets its users focus on delivering differentiating business value instead of wasting valuable resources on the undifferentiated heavy lifting that makes up most of IT infrastructure – Pay as you go: no wasted resources because you pay for what you use • Public clouds are run by third parties, and applications from different customers are likely to be mixed together on the cloud’s servers, storage systems, and networks.
  • 60. Public Cloud • The main drawbacks of using a public cloud service are – Vendor lock-in for PaaS: you cannot easily change your software environment (e.g., from Windows Azure with C# to Google App Engine with Java) • Examples of public clouds – Amazon Elastic Compute Cloud (EC2) – Google App Engine – Microsoft Windows Azure
  • 61. Private Cloud • The main benefits of using a private cloud service are – Exclusive use, providing the utmost control over data, security, and quality of service – Ownership: the company has control over the infrastructure and how applications are deployed – Managed by the company's own IT organization – High level of control • The main drawbacks are – Infrastructure costs (e.g., capital expenses for storage) – Operating costs for running the data center – Average utilization is not always known in advance, so the right scale is hard to choose • Examples of private clouds – Eucalyptus
  • 62. Hybrid Cloud • The main benefits of using a hybrid cloud service are – Augment a private cloud with the resources of a public cloud to provide on-demand, externally provisioned scale [Oracle 2009] • The main drawbacks are – Complexity: how to distribute applications across both a public and a private cloud [Oracle 2009]
  • 71. Advantages of cloud computing • Lower costs: Because cloud networks operate at higher efficiencies and with greater utilization, significant cost reductions are often encountered. • Ease of utilization: Depending upon the type of service being offered, you may find that you do not require hardware or software licenses to implement your service. • Quality of Service: The Quality of Service (QoS) is something that you can obtain under contract from your vendor. • Reliability: The scale of cloud computing networks and their ability to provide load balancing and failover makes them highly reliable, often much more reliable than what you can achieve in a single organization.
  • 72. Advantages of cloud computing • Outsourced IT management: A cloud computing deployment lets someone else manage your computing infrastructure while you manage your business. In most instances, you achieve considerable reductions in IT staffing costs. • Simplified maintenance and upgrade: Because the system is centralized, you can easily apply patches and upgrades. This means your users always have access to the latest software versions. • Low barrier to entry: In particular, upfront capital expenditures are dramatically reduced. In cloud computing, anyone can be a giant at any time.
  • 73. Disadvantages of Cloud Computing • Requires a constant Internet connection: – Cloud computing is impossible if you cannot connect to the Internet. – Since you use the Internet to connect to both your applications and documents, if you do not have an Internet connection you cannot access anything, even your own documents. – A dead Internet connection means no work and in areas where Internet connections are few or inherently unreliable, this could be a deal-breaker.
  • 74. Disadvantages of Cloud Computing • Does not work well with low-speed connections: – Similarly, a low-speed Internet connection, such as that found with dial-up services, makes cloud computing painful at best and often impossible. – Web-based applications require a lot of bandwidth to download, as do large documents. • Features might be limited: – This situation is bound to change, but today many web-based applications simply are not as full-featured as their desktop-based counterparts. • For example, you can do a lot more with Microsoft PowerPoint than with Google Presentation's web-based offering.
  • 75. Disadvantages of Cloud Computing • Can be slow: – Even with a fast connection, web-based applications can sometimes be slower than accessing a similar software program on your desktop PC. – Everything about the program, from the interface to the current document, has to be sent back and forth from your computer to the computers in the cloud. – If the cloud servers happen to be backed up at that moment, or if the Internet is having a slow day, you would not get the instantaneous access you might expect from desktop applications.
  • 76. Disadvantages of Cloud Computing • Stored data might not be secure: – With cloud computing, all your data is stored in the cloud. • The question is: how secure is the cloud? – Can unauthorised users gain access to your confidential data? • Stored data can be lost: – Theoretically, data stored in the cloud is safe because it is replicated across multiple machines. – But on the off chance that your data goes missing, you have no physical or local backup. • Put simply, relying on the cloud puts you at risk if the cloud lets you down.
  • 77. Measuring the Cloud's Value • Profit: The economies of scale can make this a profitable business. • Optimization: The infrastructure already exists and isn't fully utilized. – This was certainly the case for Amazon Web Services. • Strategic: A cloud computing platform extends the company's products and their franchise. – This is the case for Microsoft's Windows Azure Platform. • Extension: A branded cloud computing platform can extend customer relationships by offering additional service options. – This is the case with IBM Global Services and the various IBM cloud services.
  • 78. Measuring the Cloud's Value • Presence: Establish a presence in a market before a large competitor can emerge. – Google App Engine allows a developer to scale an application immediately. For Google, its office applications can be rolled out quickly and to large audiences. • Platform: A cloud computing provider can become a hub master at the centre of many ISVs' (independent software vendors') offerings.
  • 79. Basic Cloud Characteristics • “No need to know”: users need not know the underlying details of the infrastructure; applications interface with the infrastructure via APIs. • “Flexibility and elasticity”: these systems can scale up and down at will, utilising resources of all kinds – CPU, storage, server capacity, load balancing, and databases • “Pay as much as used and needed”: utility computing, combined with “always on, anywhere and any place” network-based computing.
  • 80. Basic Cloud Characteristics • Clouds are transparent to users and applications, and they can be built in multiple ways – branded products, proprietary or open source, hardware or software – In general, they are built on clusters of PC servers and off-the-shelf components plus open source software combined with in-house applications and/or system software.
  • 81. Accessing the Cloud • Cloud APIs allow software to request data and computations from one or more services through a direct or indirect interface. • Cloud APIs most commonly expose their features via REST (Representational State Transfer) and/or SOAP. • Vendor-specific and cross-platform interfaces are available for specific functions. • Cross-platform interfaces have the advantage of allowing applications to access services from multiple providers without rewriting, but may have less functionality or other limitations compared with vendor-specific solutions.
  • 82. Cloud API Available for • Infrastructure • Infrastructure APIs modify the resources available to operate the application. • Service • Service APIs provide an interface into a specific capability provided by a service explicitly created to enable that capability. • Database, messaging, web portals, mapping, e-commerce and storage are all examples of service APIs. • Application • Application APIs provide methods to interface and extend applications on the web. • Application APIs connect to applications such as CRM, ERP, social media and help desk.
  • 86. Free space • Dropbox – Free space: 2GB – Premium space: $99/year for 100GB – File size limit: Unlimited – Platforms: Windows, Mac, Linux, iOS, Android, BlackBerry – Best for: Seamless syncing • Google Drive – Free space: 15GB – Premium space: $59.88/year for 100GB – File size limit: 10GB – Platforms: Windows, Mac, iOS, Android – Best for: Web apps • Apple iCloud – Free space: 5GB – Premium space: $100/year for 50GB – File size limit: 25MB free/250MB paid – Platforms: Mac, iOS, Windows – Best for: Heavy iTunes/Mac users • Microsoft SkyDrive – Free space: 7GB – Premium space: $50/year for 100GB – File size limit: 2GB – Platforms: Windows, Mac, iOS, Android, Windows Phone – Best for: Windows/Office integration
  • 87. • All three (Dropbox, Google Drive and SkyDrive) start with a certain amount of free space allotted to every user, which can be increased upon payment, with rates varying by provider. • Google Drive starts with the most free space, 15GB. SkyDrive starts with 7GB, while Dropbox starts with the least, 2GB. You can, however, considerably increase your Dropbox storage capacity through referrals, beta testing, camera upload from your phone, and many other tweaks. Or, if you happen to buy an HTC device, you start off with 25GB of space straightaway. • If you’re not planning to pay for your storage, compare these free allotments first: Google Drive offers the most space up front, while Dropbox can grow considerably through bonuses.
  • 88. File Type Support • Any file type can be uploaded to these cloud services, but you can only view file types that are supported. Keeping that in mind, here’s a comparison of file type support for the three platforms: • Dropbox doesn’t support online viewing of any file type. All files must be downloaded and nothing can be opened online. It’s not a major issue though: if you’re using Dropbox on your phone, you can edit files right from your phone (via an editor, of course) and have them updated. The same applies to your computer. • Google Drive supports an unusual and, in a way, diverse range of file types, such as Autodesk AutoCAD files, Photoshop (.psd) files, and even Adobe Illustrator files. But at the same time, it lacks the basics. You can only view, but not edit, Microsoft Office documents. All such files are converted to their Google Docs equivalent for editing. That can be troublesome if you’re using a phone. Since Google Docs can only be accessed online, there is hardly any offline usability, which is a serious disadvantage. • SkyDrive, being Microsoft’s child, will let you open and edit any Office document. There is even support for media formats, but limited to MP4 and WMV only.
  • 89. Online features • Dropbox is the oldest of the three providers, and is thus more established. It is bound to attract more users thanks to features like referrals, which earn you additional space. • It is also the only service to support Linux and BlackBerry systems. • It even has passcode locks for its mobile apps. • Although it charges more per GB, it offers a clean interface, is easier to use and integrates very nicely with your phone. • Purchasing an HTC device gets you 25GB of free space. If you own an HTC or a BlackBerry, or run Linux, Dropbox is the place to go.
  • 91. Google Drive • Google Drive supports more file types, which makes online sharing and editing much easier. • It also offers better value per GB. • Plus, if you purchase additional space, you get an extra 25GB added to your Gmail storage. • But the fact that you have to convert Office documents to Google Docs before you can begin editing can be a big pain.
  • 93. SkyDrive • SkyDrive gives you a generous 7GB of free space to start with. It also lets you remotely access your PC online. • Because SkyDrive belongs to Microsoft, it has a head start in terms of targeting Windows users. • SkyDrive integrates superbly with Windows Phone and Windows 8. • Office 2013’s default file-saving location is your SkyDrive account. • It may be termed an unfair advantage, but if you have a Windows Phone or have recently purchased Windows 8, SkyDrive is a pretty good option.
  • 95. Apple iCloud • iCloud isn't much of a Dropbox competitor, but if all you need synced between your devices are text documents, it can be a pretty seamless solution. • Various apps such as Pages and iA Writer have iCloud sync capabilities, saving your work after every keystroke and instantly sending changes to Apple's servers. • Once you open up iA Writer or Pages on another Apple device or Mac with OS X Lion, you'll already be working with the most recent version of your document. • Additionally, Lion saves versions of your documents locally using Time Machine so you can return to older versions of your document, but only on your machine. • While iCloud is a very elementary document-syncing solution, it might also be the simplest one to use. And if you need to stream music or videos you've purchased from the cloud, you can do that, too
  • 97. Enterprise Applications • Characteristics of Enterprise Applications – complex (business) logic, – huge, complex data, – special security requirements, – focus on transactions, – many clients, – support for different platforms, – heterogeneous development environments: C, C++, Java, C#, COBOL, ... • Application requirements are a driving force for software architecture.
  • 98. Mainframe Applications • Approach – Mainframe provides all resources and logic – Terminals for user interaction (thin clients) • Advantages – Simple deployment • Problems and Limitations – Limited GUI capabilities – Every client allocates server resources → limited scalability • Diagram: the mainframe holds application and data; the terminals provide the UI.
  • 99. Client/Server Architecture • Approach – Increased capabilities of PCs – Logic was (partially) transferred to clients → thick clients – Central database server • Advantages – Better user experience (GUIs) • Problems and Limitations – Every client has a permanent stateful connection – Every client holds server resources (db connections, …) – Limited scalability • Diagram: the server holds the data; thick clients (PC, Mac, Linux workstation) hold UI and logic.
  • 100. Middle-Tier Architecture • Approach – Logic bundled in the middle tier – Clients have no permanent connection to the middle tier → stateless connections • Advantages – Factoring of business logic – Middleware provides services (pooling, security, …) – Resources are shared between clients → improved scalability • Problems – New programming model • Diagram: thin clients; middle tier (logic + middleware services); data tier.
  • 101. Stateful Services • The service host provides a service instance for every client during the whole lifetime of a session. • Comparable to client/server architecture – The service is an “extension” of the client • Technologies – EJB: Stateful Session Beans – WCF: Per-Session Services – CORBA – Java RMI • Diagram: Client 1 and Client 2 each call through a proxy into the service host, which holds Instance 1 and Instance 2.
  • 102. Stateful Services – Problems • Every service instance holds server resources – Only a limited number of clients can be served. • What happens to the service instance when the client dies unexpectedly? – Distributed garbage collection, session timeouts, … • What was the state of the instance when the connection was interrupted? – Reestablishing a connection may be painful. • Transactions may manage state differently – Transactions should be bound to service operations, not to sessions. – When a transaction is rolled back, database state and session state are no longer in sync.
  • 103. Stateless Services • A service instance is created for each invocation of a service operation. • There is no permanent connection between the client and the service. • Resources can be shared between clients → improved scalability. • Technologies – EJB: Stateless Session Beans – WCF: Per-Call Services • Diagram: Client 1 and Client 2 call through proxies into the service host, which holds Instances 1, 2 and 3.
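The contrast between the stateful and stateless service slides can be sketched in a few lines. This is only an illustration under assumptions: the class and function names (`StatefulCounterService`, `stateless_add`) are hypothetical and are not EJB, WCF, CORBA or RMI APIs.

```python
class StatefulCounterService:
    """Hypothetical stateful service: one instance per client session."""

    def __init__(self):
        self.total = 0  # session state held in the service host

    def add(self, amount):
        self.total += amount
        return self.total


def stateless_add(current_total, amount):
    """Hypothetical stateless operation: the caller passes all state in."""
    return current_total + amount


# Stateful: the host keeps one live instance (and its resources) per client.
sessions = {
    "client-1": StatefulCounterService(),
    "client-2": StatefulCounterService(),
}
sessions["client-1"].add(5)
sessions["client-1"].add(3)

# Stateless: any instance can serve any call, so resources can be shared.
total = stateless_add(0, 5)
total = stateless_add(total, 3)
```

In the stateful variant the host must decide what to do with `sessions["client-1"]` if that client disappears; the stateless variant has nothing to clean up.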
  • 105. Internet-Scale Application • 2011 stats: – +200B pageviews/month – >3.9T feed actions/day – +300M active users – >1B chat mesgs/day – 100M search queries/day – >6B minutes spent/day (ranked #2 on Internet) – +20B photos, +2B/month growth – 600,000 photos served / sec – 25TB log data / day processed – 120M queries /sec • Scaling the “relational” data: – Keeps data normalized, randomly distributed, accessed at high volumes – Uses “shared nothing” architecture
  • 106. Internet-Scale Application • 2011 stats: – +20 petabytes of data processed / day by +100K MapReduce jobs – 1 petabyte sort took ~6 hours on ~4K servers replicated onto ~48K disks – +200 clusters, each at 1-5K nodes, handling +5 petabytes of storage • ~40 GB/sec aggregate read/write throughput across the cluster • +500 servers for each search query < 500ms • Scaling the process: – MapReduce: parallel processing framework – BigTable: structured hash database – Google File System: massively scalable distributed storage
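As a rough illustration of the MapReduce idea named on this slide, the following single-process sketch counts words with separate map, shuffle and reduce steps. It is not Google's implementation; the function names are assumptions chosen for the example, and a real framework would run the phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the cloud scales", "the cloud stores data"])
```

Because each `map_phase` call touches only one document and each `reduce_phase` call touches only one key, both phases could be spread over many workers without coordination.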
  • 107. Microsoft in the Cloud (15 years) 450M+ active users (13 years) 550M users/mth (12 years) Largest non- ICP/IP cloud service x100M users (11 years) 320M+ active users (11 years) 2B queries/mth (15 years) 450M+ active users (7 years) 5B conf min/yr (6 years) 4B emails/day
  • 108. Windows Azure Platform • The Windows Azure Platform is a group of cloud technologies to be used by applications running in Microsoft’s data centers, on-premises and on various devices. • Windows Azure is a platform for running Windows applications and storing data in the cloud.
  • 109. Windows Azure applications run in Microsoft data centres and are accessed via the Internet.
  • 110. Working • Windows Azure runs on machines in Microsoft data centres. • Rather than providing software that Microsoft customers install and run on their own computers, Windows Azure is a service. • Customers use it to run applications and store data on Internet-accessible machines owned by Microsoft. • Those applications might provide services to businesses, to consumers, or both.
  • 111. Applications that might be built on Windows Azure • An independent software vendor (ISV): – It could create an application that targets business users, an approach that’s often referred to as Software as a Service (SaaS). – ISVs can use Windows Azure as a foundation for a variety of business-oriented SaaS applications. – An ISV might also create a SaaS application that targets consumers. – Examples of consumer-facing applications: Facebook, OLX, etc. • Enterprises might use Windows Azure to build and run applications that are used by their own employees.
  • 112. Three main components of Windows Azure :
  • 113. Windows Azure has three main components: • Compute: The Compute service runs applications. • Storage: The Storage service stores data. • Fabric: It provides a common way to manage and monitor applications that use this cloud platform.
  • 115. Compute Resources: Web and Worker Roles • Web Role: Interactive .NET application hosted in IIS: – Web Application or Web Service (WCF) • Worker Role: Durable background process – Often isolated from the outside world – There are ways to make it reachable for external applications • Fabric Agent collects resource metrics (usage, failures, …) • Diagram: HTTP/HTTPS traffic passes through a hardware load balancer; Physical Machine 1 and Physical Machine 2 each host Virtual Machine 1 and Virtual Machine 2; one virtual machine runs a Web role instance (IIS) with a fabric agent, another runs a Worker instance with a fabric agent; a fabric controller is shown alongside.
  • 116. Web Role • A Web role instance can accept incoming HTTP or HTTPS requests. • To allow this, it runs in a VM that includes Internet Information Services (IIS) 7. • Developers can create Web role instances using ASP.NET, WCF, or another .NET technology that works with IIS.
  • 117. Worker Role • A Worker role instance can’t accept requests from the outside world. • Instead, it initiates its own requests for input. • It can read messages from a queue and it can open connections with the outside world. • Worker role instances can be viewed as similar to a batch job or a Windows service.
  • 119. Windows Azure Storage provides Blobs, Tables, and Queues. Blob: binary large object
  • 120. Windows Azure Storage • Cloud applications can store data in – Blobs – Tables – Queues • Blobs provide storage for unstructured data. – Blobs are organized in containers. • Tables are structured collections of entities. – Entities are collections of name/value pairs. – Tables have no schema. • Queues provide storage for small data items on a FIFO basis. – Queues enable communication between service instances.
  • 121. Windows Azure Storage Architecture • Diagram: a cloud application or external application reaches Windows Azure Storage through the REST API or the .NET API (StorageClient); a storage account contains blob containers (holding blobs), queues, and tables (holding entities).
  • 122. Storage Features • Windows Azure Storage – provides huge storage resources, – is massively scalable, and – is a highly reliable persistence mechanism. • Data is stored on large servers. • Scalability – Data can be distributed across many storage nodes. – Storage access is load-balanced. • Reliability – Data is replicated to different storage nodes when a write operation occurs (3 replicas in different fault domains). – When a device fails, data is replicated to a new storage node.
  • 123. Storage Account • The storage account is the entry point for all storage services. • Storage accounts can be created on the Azure Development Portal. • Storage accounts are accessed with a secret key: – On the client side a hash code is computed (with SHA256). – The hash code and the secret key are used to compute an HMAC-SHA256 signature (Hash-based Message Authentication Code). – The HMAC is attached to the REST request. – The storage server uses the signature to authenticate the request.
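The signing steps on this slide can be sketched with Python's standard `hmac` module. This is a minimal sketch under assumptions: the string-to-sign below is a simplified stand-in rather than the service's real canonicalization rules, and the key, account name and helper names (`sign_request`, `verify_request`) are made up for illustration.

```python
import base64
import hashlib
import hmac


def sign_request(string_to_sign, account_key_b64):
    """Compute a Base64-encoded HMAC-SHA256 signature over the request text."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(string_to_sign, account_key_b64, signature):
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(string_to_sign, account_key_b64)
    return hmac.compare_digest(expected, signature)


# Hypothetical key and simplified string-to-sign (assumptions, not real values).
demo_key = base64.b64encode(b"not-a-real-account-key").decode("ascii")
string_to_sign = (
    "PUT\n"
    "x-ms-date:Sun, 27 Sep 2009 22:33:35 GMT\n"
    "/myaccount/mycontainer/myblockblob"
)
signature = sign_request(string_to_sign, demo_key)
authorization_header = f"SharedKey myaccount:{signature}"
```

The secret key never travels with the request; only the signature does, and the server can check it because it holds the same key.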
  • 124. Accessing Storage via REST • Every storage item is identified by a Request URI: – Blob: <http|https>://<account name>.blob.core.windows.net/<container>/<blob name> – Table: <http|https>://<account name>.table.core.windows.net/<table name> – Queue: <http|https>://<account name>.queue.core.windows.net/<queue name> • The HTTP verb represents the action executed on the resource: – PUT: create container/queue/table, save blob/entity/queue item, set metadata, … – GET: list blobs in container, get blob/entity/queue item, … – DELETE: delete container/queue/table/blob/entity/queue item, … • The REST API provides a platform-independent interface to Azure storage. • Example:
      PUT http://myaccount.blob.core.windows.net/mycontainer/myblockblob HTTP/1.1
      x-ms-version: 2009-09-19
      x-ms-date: Sun, 27 Sep 2009 22:33:35 GMT
      Content-Type: application/octet-stream
      x-ms-blob-type: BlockBlob
      x-ms-meta-m1: some metadata
      Authorization: SharedKey myaccount:YjN4fAR8/AmBrqBz7MG2uFinQh4dscbj598g=
      Content-Length: nn

      Request Body
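The Request URI patterns listed on this slide can be assembled mechanically. The helper below is a hypothetical sketch (`storage_uri` is not part of any SDK) and performs no URL-encoding of names.

```python
def storage_uri(account, service, *path, scheme="https"):
    """Build a request URI following the slide's pattern.

    `service` must be one of the three storage services named on the slide.
    Path segments are joined as-is; no URL-encoding is applied in this sketch.
    """
    if service not in ("blob", "table", "queue"):
        raise ValueError(f"unknown storage service: {service}")
    return f"{scheme}://{account}.{service}.core.windows.net/" + "/".join(path)


blob_uri = storage_uri("myaccount", "blob", "mycontainer", "myblockblob", scheme="http")
queue_uri = storage_uri("myaccount", "queue", "orders")
```

With these arguments `blob_uri` matches the URI used in the slide's PUT example; the queue name `orders` is an invented placeholder.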
  • 125. Blobs • Blob containers provide a grouping of a set of blobs. – Sharing policies are set at container level: public read or private. – E.g. a Facebook album (Public, Friends, Private) – Containers can have metadata. – E.g. date/time of creation • Blobs store big chunks of data. – Blobs can have metadata. • Diagram: a storage account holds Blob Container 1 and Blob Container 2; a container holds page blobs (Page Blob 1, Page Blob 2) and block blobs, each block blob made up of blocks B1, B2, B3.
  • 128. Page Blob – Random Read/Write
  • 129. Blobs (cont.) • Windows Azure Blob storage is a service for storing large amounts of unstructured data that can be accessed from anywhere in the world via HTTP or HTTPS. • Common uses of Blob storage include: – Serving images, videos or documents directly to a browser – Storing files for distributed access – Streaming video and audio – Performing secure backup and disaster recovery
  • 131. Blobs
  • 132. Blob Containers • Multiple containers per account – Special $root container • Blob container – A container holds a set of blobs – Set access policies at the container level – Associate metadata with a container – List the blobs in a container, including blob metadata and MD5 – No search/query, i.e. no WHERE MetadataValue = ? • Blob throughput – Effectively in a partition of 1 – Target of 60MB/s per blob
  • 135. Tables
  • 136. Table Storage Concepts • Account: cohowinery • Tables: Comments, Advertisements • Entities in Comments: Name = …, Comment = … • Entities in Advertisements: URI = …, Description = …
  • 137. A closer look at tables • Windows Azure Storage: storage accounts contain tables; a table contains entities; an entity contains properties; each property has a name, a type and a value.
  • 138. Tables • Tables provide structured storage. • Tables contain a set of entities. • Entities contain a set of properties. • Properties have – a name, and – a typed value (Int32, Int64, double, string, bool, DateTime, byte array). • Tables have no fixed schema, i.e. the structure of each entity may be different. • Each entity has two key properties: – the partition key, and – the row key. • Diagram: a storage account holds Table 1 and Table 2; a table holds Entity 1, Entity 2, …, each with Property 1, Property 2, …
  • 139. Partitioning of Tables • The application controls the distribution of entities by assigning partition keys to entities. • If two entities have different partition keys they may be stored in different storage nodes. • Only entities stored within the same table and the same partition can take part in an entity group transaction. • Example: if partition key 2 = GHRCE, the names and dates of birth of all those friends are stored in Node 2; if partition key 1 = Shivaji College, the names and dates of birth of all those friends are stored in Node 1. • Diagram labels: Storage Node 2: Person, Name Meyers, DOB 1999/5/5, Row Key 1, Partition Key 2; Name Gates, Row Key 2, Partition Key 2. Storage Node 1: Person, Name Gosling, DOB 1955/9/6, Row Key 3, Partition Key 1; DOB 1955/4/3.
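The partitioning rules on this slide can be illustrated with plain dictionaries standing in for entities. This is a sketch, not the Table service API: the helper names are hypothetical, the sample values loosely follow the slide's figure, and all entities are assumed to belong to the same table.

```python
from collections import defaultdict

# Entities are schema-less: apart from the two keys, properties may differ.
entities = [
    {"PartitionKey": "2", "RowKey": "1", "Name": "Meyers", "DOB": "1999/5/5"},
    {"PartitionKey": "2", "RowKey": "2", "Name": "Gates"},
    {"PartitionKey": "1", "RowKey": "3", "Name": "Gosling", "DOB": "1955/9/6"},
]


def group_by_partition(rows):
    """Group entities by partition key; each group may live on its own node."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["PartitionKey"]].append(row)
    return partitions


def can_share_transaction(a, b):
    """Entities of one table can join a group transaction only within a partition."""
    return a["PartitionKey"] == b["PartitionKey"]


partitions = group_by_partition(entities)
```

Choosing the partition key is therefore a design decision: entities that must be updated together belong in one partition, while spreading keys spreads load across storage nodes.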
  • 140. Entity Properties • An entity can have up to 255 properties – Up to 1MB per entity • Mandatory properties for every entity: PartitionKey and RowKey – Uniquely identify an entity – Define the sort order • No fixed schema for other properties – Each property is stored as a <name, typed value> pair – No schema stored for a table
  • 143. Queue
  • 144. Queue Storage Concepts • Account: cohowinery • Queue: order processing • Messages: customer ID, order ID, http://…
  • 145. Using Queues • Web Role (ASP.NET, WCF, etc.): 1) Receive work 2) Put work in queue • Worker Role (main() { … }): 3) Get work from queue 4) Do work
  • 146. Queues
  • 147. Queues • Queues provide a reliable asynchronous message delivery mechanism. • Reliability – Messages are redelivered until a consumer confirms that they have been successfully processed. – Messages are processed at least once. • Decoupling of producer and consumer – Different parts of the system can be implemented using different technologies and languages. – Producer and consumer needn’t be alive at the same time. • Scalability – When the queue length grows, the number of consumer instances can be increased. • Diagram: a message producer puts CloudQueue messages into the queue; a message consumer takes them out.
  • 148. Inter-role Communication • Role instances can communicate asynchronously via queues. – Preferred method for reliable messaging • Role instances can also communicate directly using TCP or HTTP(S) connections. • Diagram: Role Instance 1 sends through a queue to Role Instances 2, 3 and 4; an external app reaches role instances through a load balancer and an external endpoint, while role instances reach each other through internal endpoints.
  • 149. Queues – Features and Limitations • Queue – The number of messages is not limited. – Metadata can be associated. – There is no guaranteed return order of messages. – Messages are delivered at least once → the consumer must be able to deal with messages that are delivered more than once. • Message – A message can be up to 8KB in size. – Message structure: • MessageID: a GUID • VisibilityTimeout: if processing of the message is not confirmed within this time range, the message reappears in the queue (default = 30 seconds) • MessageTTL: time-to-live interval for messages (maximum = default = 7 days) • Payload: string or byte array.
  • 150. Loosely Coupled Interaction with Queues • Diagram: several Web role instances put work items into an Azure input queue.
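The visibility timeout and at-least-once delivery described on this slide can be modelled with a small in-memory queue. This is a sketch under assumptions: `SketchQueue` is hypothetical, it uses integer IDs instead of GUIDs, and the clock is passed in explicitly so the behaviour is deterministic.

```python
import itertools


class SketchQueue:
    """In-memory model of a queue with a visibility timeout (not the Azure API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [payload, time at which it is visible]
        self._ids = itertools.count(1)

    def put(self, payload):
        msg_id = next(self._ids)
        self._messages[msg_id] = [payload, 0.0]
        return msg_id

    def get(self, now):
        """Return a visible message and hide it for `visibility_timeout` seconds."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Confirm successful processing; the message is gone for good."""
        self._messages.pop(msg_id, None)


q = SketchQueue(visibility_timeout=30)
q.put("order-1")
first = q.get(now=0)     # a consumer takes the message, then crashes
hidden = q.get(now=10)   # within the timeout the message stays invisible
second = q.get(now=31)   # after the timeout it reappears for another consumer
q.delete(second[0])      # this consumer confirms processing
after = q.get(now=100)
```

The same payload is handed out twice here, which is why consumers must tolerate duplicates, for example by making their processing idempotent.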
  • 154. • All Windows Azure applications and all of the data in Windows Azure Storage live in some Microsoft data center. • Within that data center, the set of machines dedicated to Windows Azure is organized into a fabric (structure, framework). • Windows Azure Fabric consists of a (large) group of machines, all of which are managed by software called the fabric controller.
  • 155. Cont… • The fabric controller is replicated across a group of five to seven machines. • Because it can communicate with a fabric agent on every computer, it’s also aware of every Windows Azure application in this fabric. • It monitors all running applications, for example, giving it an up-to-the-minute picture of what’s happening in the fabric.
  • 156. CREATING A SCALABLE WEB APPLICATION • To create a scalable Web application on Windows Azure, a developer can use Web roles and tables.
  • 157. • The clients are browsers, and so the application logic might be implemented using ASP.NET or another Web technology. • It’s also possible to create a scalable Web application that exposes RESTful and/or SOAP-based Web services using WCF. • The fabric controller also monitors these instances, making sure that the requested number is always available. • For data storage, the application uses Windows Azure Storage tables, which provide scale-out storage capable of handling very large amounts of data
  • 158. So what do you mean by Scalable Application? • In a conventional data center, handling load requires always having enough machines on hand for the peak hours, even though most of those systems go unused most of the time. • But if the application is built on Windows Azure, the organization running it can expand the number of instances it’s using only when needed, then shrink back to a smaller number. • Examples: an online IPL ticket-selling site, customer call centers, video sites with occasional hot stories, government online exam websites that are used mostly at certain times of day, and others.
  • 159. A PARALLEL PROCESSING APPLICATION
  • 160. • The parallel work is done by some number of Worker role instances running simultaneously, each using blob data. • Since Windows Azure imposes no limit on how long an instance can run, each one can perform an arbitrary amount of work.
  • 161. • Through this interface, the user might determine how many Worker instances should run, start and stop those instances, get results, and more. • Communication between the Web role instance and the Worker role instances relies on Windows Azure Storage queues.
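The pattern described on these two slides, one front end handing work to several workers through a queue, can be sketched with threads standing in for role instances. This is an illustration only: `run_job` and `worker` are hypothetical names, Python's in-process `queue.Queue` replaces Windows Azure Storage queues, and squaring a number stands in for real work on blob data.

```python
import queue
import threading


def worker(work_queue, results, lock):
    """Stand-in for a Worker role instance: pull work until told to stop."""
    while True:
        item = work_queue.get()
        if item is None:  # sentinel: no more work for this worker
            break
        with lock:
            results.append(item * item)  # placeholder for real processing


def run_job(items, worker_count=3):
    """Stand-in for the Web role: start workers, enqueue work, collect results."""
    work_queue = queue.Queue()
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work_queue, results, lock))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for item in items:
        work_queue.put(item)
    for _ in threads:
        work_queue.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(results)


squares = run_job([1, 2, 3, 4])
```

Changing `worker_count` changes how many workers drain the queue without touching the producer, which mirrors how the number of Worker role instances can be adjusted independently.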
  • 162. A parallel processing application can communicate with an on-premises application.
  • 163. • Rather than relying on a Web role instance running on Windows Azure, the user might instead interact with the Worker role instances via an on-premises application. • Multiple Worker role instances run simultaneously, each interacting with the outside world via queues. • Here, however, work is put into those queues directly by an on-premises application. • In a scenario like this, the user might have no idea that the on-premises application he’s using relies on Windows Azure for parallel processing. • Example: a mobile tracking application. • On-premises software is installed and run on computers on the premises (in the building) of the person or organisation using the software, rather than at a remote facility, such as a server or cloud somewhere on the Internet.
  • 164. CREATING A SCALABLE WEB APPLICATION WITH BACKGROUND PROCESSING
  • 165. • The majority of today’s applications are built to provide a browser interface. • Applications that do nothing but accept and respond to browser requests are useful, but they’re also limiting. • There are lots of situations where Web-accessible software also needs to initiate work that runs in the background, independently from the request/response part of the application. • Examples: online recommendation systems, Facebook, etc.
  • 166. • Instead, the part of the application that accepts browser requests should be able to initiate a background task that carries out this work without the knowledge of users. • Windows Azure Web roles and Worker roles can be used together to address this scenario, i.e. background processing.
  • 167. • Two further major parts of the Windows Azure Platform are: – Azure AppFabric – SQL Azure
  • 168. AppFabric • Windows Azure AppFabric provides a comprehensive cloud middleware platform for developing, deploying and managing applications on the Windows Azure Platform. • It enables bridging your existing applications to the cloud through secure connectivity across network and geographic boundaries. • Example: letting an existing WhatsApp application connect with Facebook’s cloud application. • It delivers additional developer productivity, adding higher-level Platform-as-a-Service (PaaS) capabilities on top of the familiar Windows Azure application model.
  • 169. AppFabric • There are three key components of Windows Azure AppFabric: • Middleware Services: platform capabilities as services, which raise the level of abstraction and reduce the complexity of cloud development. • Composite Applications: a set of new frameworks, tools and a composition engine to easily assemble, deploy, and manage a composite application as a single logical entity. • Scale-out application infrastructure: optimized for cloud-scale services and mid-tier components.
  • 171. AppFabric: Middleware Services • Windows Azure AppFabric provides pre-built, higher-level middleware services that raise the level of abstraction and reduce the complexity of cloud development. • These services are open and interoperable across languages (.NET, Java, Ruby, PHP…) and give developers a powerful pre-built “class library” for next-gen cloud applications. • Developers can use each of the services stand-alone, or combine services to provide a composite solution.
  • 172. • Service Bus: provides secure messaging and connectivity capabilities that enable building distributed and disconnected applications in the cloud, as well as hybrid applications across both on-premises systems and the cloud. • It enables using various communication and messaging protocols and patterns, and removes the need for the developer to worry about delivery assurance, reliable messaging and scale.
  • 173. • Access Control: provides an easy way to add identity and access control to web applications and services, while integrating with standards-based identity providers, including enterprise directories such as Active Directory®, and web identities such as Windows Live ID, Google, Yahoo! and Facebook.
  • 174. • Caching: accelerates the performance of Windows Azure and SQL Azure based apps by providing a distributed, in-memory application cache, provided entirely as a service (no installation or management of instances; cache size can be dynamically increased or decreased as needed). • Pre-integration with ASP.NET enables easy acceleration of web applications without having to modify application code.
  • 175. • Integration: provides integration capabilities on Windows Azure, using out-of-box integration patterns to accelerate and simplify development. • It will also deliver higher-level business user enablement capabilities such as Business Activity Monitoring and Rules, as well as a self-service trading partner community portal and provisioning of business-to-business pipelines.
  • 176. • Composite App: delivers a complete hosting environment for web services built using Windows Communication Foundation (including WCF Data Services and WCF RIA Services) and workflows built using Windows Workflow Foundation.
  • 177. Who’s Live on Windows Azure
  • 178. SQL Azure • SQL Server hosted in the cloud. • It provides relational database features on a platform that is scalable, highly available and load-balanced.

Editor's Notes

  1. The Gomez PEER is a secure, Java-based application that runs in the background of your PC. You may even forget it's there, because it will not disrupt the way you use your computer. Using advanced, peer-to-peer distributed computing technology, the Gomez PEER combines the spare capacity of PCs around the world to measure the performance of Web sites. After you install the Gomez PEER, it will periodically communicate with Gomez servers via the Internet, signal that it's available for work, and request a work unit. And you will be credited for your online time and work processed approximately every 15 minutes that the application is running. Based on your computer's characteristics (Internet connection type, geographic location and Internet service provider), your PC will receive instructions to autonomously test the performance of Web sites, gathering important metrics that help identify network bottlenecks and performance problems. All of this happens "behind the scenes," even when you're away from your PC or asleep. Finally, your PC will send its results back to Gomez, where they are added to the work of thousands of other PCs around the world. The Gomez PEER is available for Windows 2000/NT/XP/2003.
  3. Distributed computing is getting more and more decentralized, dynamic, and intertwined with the physical world. These characteristics appear very clearly in a variety of emerging scenarios, such as peer-to-peer (P2P) networks, mobile ad-hoc networks and sensor networks. As a consequence, the complexity of these scenarios is such that they can no longer be managed with traditional approaches to distributed systems engineering. There is a need for approaches that support distributed systems in autonomously self-configuring and self-adapting their activities in their operational environment.
  4. Source: John Rothschild, VP of Technology, Facebook
  5. Source: Jeffrey Dean and Sanjay Ghemawat, Google
  6. Slide Objective: Understand uploading a block blob. Value Prop: Block blobs let you upload large blobs efficiently. Block blobs are composed of blocks, each of which is identified by a block ID. Speaking notes: When you upload a block to a blob in your storage account, it is associated with the specified block blob, but it does not become part of the blob until you commit a list of blocks that includes the new block's ID. New blocks remain in an uncommitted state until they are specifically committed or discarded. Writing a block does not update the last modified time of an existing blob. With a block blob, you can upload multiple blocks in parallel to decrease upload time. Each block can include an MD5 hash to verify the transfer, so you can track upload progress and re-send blocks as needed. You can upload blocks in any order, and determine their sequence in the final block list commitment step.
  7. Slide Objective: Understand page blobs. Value Prop: Page blobs are a collection of 512-byte pages optimized for random read and write operations. Speaking notes: The maximum size for a page blob is 1 TB. To create a page blob, you initialize the page blob and specify the maximum size to which it will grow. To add or update the contents of a page blob, you write a page or pages by specifying an offset and a range that align to 512-byte page boundaries. A write to a page blob can overwrite just one page, some pages, or up to 4 MB of the page blob. Writes to page blobs happen in place and are immediately committed to the blob.
  8. Slide Objective: Understand containers. Speaker Notes: An account can contain an unlimited number of containers. The root container is useful when serving Silverlight and Flash out of Blob storage; you may need to store cross-domain access policy files in the root of the domain. Metadata is up to 8 KB of name-value pairs per container. Notes: http://msdn.microsoft.com/en-us/library/dd179361.aspx http://msdn.microsoft.com/en-us/library/ee395424.aspx A root container serves as a default container for your storage account. A storage account may have one root container. The root container must be explicitly created and must be named $root. A blob stored in the root container may be addressed without referencing the root container name, so that a blob can be addressed at the top level of the storage account hierarchy. For example, you can now reference a blob that resides in the root container in the following manner:
  9. Slide Objectives: Understand tables and entities. Speaker Notes: Tables store data as entities. An entity is a collection of named properties and their values, similar to a row, although the Table service is not an RDBMS. Tables are partitioned to support load balancing across storage nodes. Each entity's first property is a partition key that specifies the partition the entity belongs to. The second property is a row key that identifies the entity within a given partition. The combination of the partition key and the row key forms a primary key that identifies each entity uniquely within the table. The Table service does not enforce any schema. A developer may choose to implement and enforce a schema on the client side. Notes: http://msdn.microsoft.com/en-us/library/dd573356.aspx http://msdn.microsoft.com/en-us/library/dd179338.aspx
  10. Slide Objectives: Understand flexible entities. Speaker Notes: Tables store data as entities. A table can contain entities of any shape. There is no fixed schema. There is no schema checking. There is no strong typing; note that Birthdate is stored as both a datetime value and as a string. Note that we can add additional columns. Notes: http://msdn.microsoft.com/en-us/library/dd573356.aspx
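
The entity model described in notes 9 and 10 can be illustrated with a small in-memory sketch. This is not the Windows Azure Table service or its SDK; the `Table` class, its `insert` and `get` methods, and the sample names and values are hypothetical, chosen only to show how a partition key plus a row key identifies an entity and how entities in one table may differ in shape and property type.

```python
from datetime import datetime


class Table:
    """Toy in-memory stand-in for a schemaless table (hypothetical, not the Azure SDK)."""

    def __init__(self):
        # partition key -> {row key -> dict of properties}
        self._partitions = {}

    def insert(self, partition_key, row_key, **properties):
        """Store an entity; (partition_key, row_key) acts as the primary key."""
        partition = self._partitions.setdefault(partition_key, {})
        if row_key in partition:
            raise KeyError(f"duplicate key ({partition_key!r}, {row_key!r})")
        # No schema check: any property names and value types are accepted.
        partition[row_key] = dict(properties)

    def get(self, partition_key, row_key):
        """Look up a single entity by its full primary key."""
        return self._partitions[partition_key][row_key]


people = Table()
# Same property name, different types: a datetime in one entity, a string in another.
people.insert("Smith", "John", Birthdate=datetime(1980, 5, 1))
# A second entity in the same partition carries an extra property ("column").
people.insert("Smith", "Jane", Birthdate="1982-07-15", FavoriteColor="blue")
```

Because nothing in the store checks types or property names, any validation would have to live in client code, which matches the note that a developer may enforce a schema on the client side.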