This presentation shows how the cloud is useful in big data analytics. It gives a brief introduction to cloud service models and the 4 V's of Big Data, and describes how the cloud is used in the telecom and finance domains and how it improves on traditional methods.
Introduction to Cloud Computing and Big Data-Hadoop, by Nagarjuna D.N
Cloud Computing Evolution
Why is Cloud Computing needed?
Cloud Computing Models
Cloud Solutions
Cloud job opportunities
Criteria for Big Data
Big Data challenges
Technologies to process Big Data: Hadoop
Hadoop History and Architecture
Hadoop Eco-System
Hadoop Real-time Use cases
Hadoop Job opportunities
Hadoop and SAP HANA integration
Summary
Cloud computing & big data for service innovation & learning (2016)
Cloud Computing and Big Data for Service Innovations & Learning
Until now, most adoption of cloud computing has focused on the automation and consolidation of traditional IT services. As such, the gains are confined to uniformity of control, cost reduction, and better governance. Recent adoption of the cloud has gradually moved to tactical and even strategic levels, demonstrating substantial gains from using the cloud for business transformation and innovation. Such benefits include dynamism in business model composition and speed and ease in orchestrating service innovations in the cloud. This talk will shed light on how the massive and rapid accumulation of data in the cloud can support human-machine cooperative problem solving and redefine the landscape of Open Innovation and Connectionist Learning via a Knowledge Cloud.
Adopting Hadoop to manage your Big Data is an important step, but not the end-solution to your Big Data challenges. Here are some of the additional considerations you must face:
Choosing the right cloud for the job: The massive computing and storage resources that are needed to support Big Data applications make cloud environments an ideal fit, and more than ever, there is a growing number of choices of cloud infrastructure types and providers. Given the diverse options, and the dynamic environments involved, it becomes ever more important to maintain the flexibility for all your IT needs.
Big Data is a complex beast: it involves many different moving parts, in large clusters, and it is continually growing and evolving. Managing such an environment manually is not a viable option. The question is, how can you achieve automation of all this complexity?
The world beyond Hadoop: Big Data is not just Hadoop – there is a whole rapidly growing ecosystem to contend with, including NoSQL, data processing, and analytics tools, as well as your own application services. How can you manage deployment, configuration, scaling, and failover of all the different pieces in a consistent way?
In this session, you’ll learn how to deploy and manage your Hadoop cluster on any Cloud, as well as manage the rest of your big data application stack using a new open source framework called Cloudify.
This talk describes the cloud infrastructure required for big data, discussing the object storage and virtualization it requires, with Ceph as an example.
The rise of “Big Data” on cloud computing: Review and open research issues
Paper Link: https://www.researchgate.net/publication/264624667_The_rise_of_Big_Data_on_cloud_computing_Review_and_open_research_issues
Disclaimer :
The images, company, product and service names that are used in this presentation, are for illustration purposes only. All trademarks and registered trademarks are the property of their respective owners.
Data/Image collected from various sources from Internet.
Intention was to present the big picture of Big Data & Hadoop
re:Invent re:Cap - Big Data & IoT at Any Scale, by Adrian Hornsby
This session covers the most recent Big Data & IoT announcements at re:Invent. Learn about trends and use cases for understanding your data and implementing an Internet of Things (IoT) project. Hear about how AWS customers are using AWS IoT to connect their devices to the cloud and solve business challenges with IoT.
The web-conference hosted by CRISIL Global Research & Analytics on “Big Data’s Big Impact on Businesses” on January 29, 2013, saw participation from senior officials of global multinationals from 9 countries. The presentation described how data analytics is helping businesses make “evidence-based” decisions, thereby creating a positive impact. It also spoke about the opportunities opening up in the Big Data space in India and across the globe.
Hosted by:
Sanjeev Sinha, President, CRISIL Global Research & Analytics
Gaurav Dua, Director & Practice Leader (Technology, Media & Telecom), CRISIL Global Research & Analytics
I bumped into the Internet of Things today and jumped in to understand what it is. With IoT, I can't help but see logs in a totally different paradigm.
Learn Big Data and Hadoop online at Easylearning Guru. We offer instructor-led online training and a lifetime LMS (Learning Management System). Join our free live demo classes on Big Data and Hadoop.
Big Data may well be the Next Big Thing in the IT world
Big data burst upon the scene in the first decade of the 21st century
The first organizations to embrace it were online and startup firms. Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.
Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings.
Big Data brings big promise and also big challenges, the primary and most important one being the ability to deliver value to business stakeholders who are not data scientists!
The impact of emerging IoT technology and Big Data. This is the slide presentation I gave at http://globalbigdatabootcamp.com/speakers/sanjay-sabnis/
Big Data made easy in the era of the Cloud, by Demi Ben-Ari
A talk about the ease of using and handling Big Data technologies in the cloud, using Google Cloud Platform and Amazon Web Services and all of the tools around them.
It shows the problems involved and how we can solve them with simple tools.
BIG DATA
Prepared By
Muhammad Abrar Uddin
Introduction
· Big Data may well be the Next Big Thing in the IT world.
· Big data burst upon the scene in the first decade of the 21st century.
· The first organizations to embrace it were online and startup firms. Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.
· Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings.
What is BIG DATA?
· ‘Big Data’ is similar to ‘small data’, but bigger in size
· Because the data is bigger, it requires different approaches: techniques, tools, and architecture
· The aim is to solve new problems, or old problems in a better way
· Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.
What is BIG DATA?
· Walmart handles more than 1 million customer transactions every hour.
· Facebook handles 40 billion photos from its user base.
· Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
Three Characteristics of Big Data (the 3 Vs)
· Volume: data quantity
· Velocity: data speed
· Variety: data types
1st Characteristic of Big Data
Volume
· A typical PC might have had 10 gigabytes of storage in 2000.
· Today, Facebook ingests 500 terabytes of new data every day.
· A Boeing 737 will generate 240 terabytes of flight data during a single flight across the US.
· Smartphones, the data they create and consume, and sensors embedded into everyday objects will soon result in billions of new, constantly updated data feeds containing environmental, location, and other information, including video.
2nd Characteristic of Big Data
Velocity
· Clickstreams and ad impressions capture user behavior at millions of events per second
· High-frequency stock trading algorithms reflect market changes within microseconds
· Machine-to-machine processes exchange data between billions of devices
· Infrastructure and sensors generate massive log data in real time
· Online gaming systems support millions of concurrent users, each producing multiple inputs per second.
3rd Characteristic of Big Data
Variety
· Big Data isn't just numbers, dates, and strings. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
· Traditional database systems were designed to address smaller volumes of structured data, fewer updates or a predictable, consistent data structure.
· Big Data analysis includes different types of data
Storing Big Data
· Analyzing your data characteristics
· Selecting data sources for analysis
· Eliminating redundant data
· Establishing the role of NoSQL
· Overview of Big Data stores
· Data models: key-value, graph, document, column-family (see the sketch after this list)
· Hadoop Distributed File System
· H.
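As a rough, hypothetical sketch of how those data models differ (the customer record, field names, and the use of plain Python dictionaries are invented for illustration, not any particular store's API), the same record could be laid out like this:

# One customer record, laid out per data model (illustrative only).

# Key-value: a single opaque value looked up by key.
key_value_store = {
    "customer:42": '{"name": "Asha", "city": "Pune", "orders": 3}',
}

# Document: the value is a structured, queryable document.
document_store = {
    "customer:42": {"name": "Asha", "city": "Pune", "orders": 3},
}

# Column-family: columns grouped into families under each row key.
column_family_store = {
    "customer:42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"orders": 3},
    },
}

# Graph: entities as nodes, relationships as edges.
graph_store = {
    "nodes": {"customer:42": {"name": "Asha"}, "order:7": {"total": 120.0}},
    "edges": [("customer:42", "PLACED", "order:7")],
}

print(document_store["customer:42"]["city"])  # a document store can query by field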
Short introduction to Big Data Analytics, the Internet of Things, and their s... by Andrei Khurshudov
Invited talk at the 26th ASME annual conference on Information Storage and Processing Systems (ISPS 2017), held at the Hilton San Francisco District, San Francisco, California, USA, August 29–30, 2017.
3. What Is the Internet of Things?
The general concept of the Internet of Things is that we can put a sensor on anything and have it send data back to a database through the Internet. In this way we can monitor everything, everywhere, and build smarter systems that are more interactive than ever before.
(Dan Rowinski, article on ReadWrite.com)
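As a minimal, hypothetical sketch of that concept (the collector URL, device name, and reading below are assumptions made for illustration, not something from the slides), a device could periodically push a sensor reading to a web service, which would then store it in a database:

import json
import time
import urllib.request

COLLECTOR_URL = "https://example.com/iot/readings"  # hypothetical ingestion endpoint

def read_temperature_c() -> float:
    """Stand-in for a real sensor driver; returns a dummy value."""
    return 21.5

def push_reading() -> None:
    payload = {
        "device_id": "greenhouse-01",   # assumed device name
        "metric": "temperature_c",
        "value": read_temperature_c(),
        "timestamp": time.time(),
    }
    req = urllib.request.Request(
        COLLECTOR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()  # the collector is assumed to write the reading to its database

if __name__ == "__main__":
    push_reading()  # in practice this would run on a timer or event loop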
6. How Many Things
• Projections Vary
– Use cases still largely theoretical
• Depends on what’s counted
– E.g. include smart phones
• Beware of motivations
– Defining markets, setting agenda
Example Estimates:
• 2020: 50 billion (Cisco Systems, 2013)
• 2020: 26 billion (Gartner)
7. Internet of Things
• “Internet of Things”
– Popularized in 1999 by an MIT research group
– Sensor networks & content tagging to enable interaction with physical and logical objects
– Core technologies: nanotechnology, intelligent embedded systems, RFID, sensor technology,…
– Platform technologies: IPv6, Big Data, Virtualization, Cloud Computing
– Standards & APIs
(Image caption: entrance to SARAH, the artificially intelligent "home of the future" in Syfy's Eureka.)
9. Internet of Things
• Smart Planet (IBM), Planetary Skin (Cisco), CeNSE (HP),…
• Connections: it's not just about people anymore
• IoT is about
– Data
– Sensors
– Control
– Analytics
– Networks
10. Internet of Things
• Applications of IoT
– Manufacturing: robotics, analytics, smart meters
– Retail: inventory tracking
– Healthcare: remote monitoring, records
– Transportation: autonomous vehicles, GPS
– Home: monitoring, SMART devices, locks
(Image: thermostat, nest.com)
11. Internet of Things
• Concerns
– Competence
– Technocracy
– Panopticon
– Profiling
– Hacking
– Complexity
– Inevitability
– Data Ownership
– Costs
– Energy
(Image: disney.wikia.com; "I'm not bad, I'm just drawn that way." – Jessica Rabbit)
12. Privacy & The Internet
"If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place." – Google CEO Eric Schmidt, December 2009 CNBC interview
15. Virtualization
(Diagram: before virtualization, apps run on a single host operating system directly on the hardware (CPU, RAM, HDD); after virtualization, a hypervisor runs on the hardware and hosts Virtual Machine #1 through #n, each with its own operating system and apps.)
Virtualization is a methodology of dividing the resources of computer hardware and software into multiple execution environments, known as virtual machines.
16. Virtualization Benefits
Server Consolidation
Power Savings
Data Center Space Decrease
Ease of use
High Availability
Simplified Management
Improved Go-To-Market Time
Enhanced Security
Reduced Networking Costs
Reduced Carbon Footprint
Reduced TCO
Desktop Consolidation
Why People Love it
Remember: Business cares about results, not virtualization
19. Cloud Computing
(Diagram: storage and compute happen in the cloud; input and consumption of the resulting information happen at the user's end.)
20. Cloud Computing
• Computer scientist John McCarthy predicted in the 1960s that "computation may someday be organized as a public utility."
• What is cloud computing? In one of the more bare-bones definitions, it is the ability to process information on someone else's device.
• Cloud computing essentially transfers computing tasks to the Internet.
21. "Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" -- NIST
Slide callouts annotating the definition:
• The service is provided over a network as opposed to directly cabled. Broadband or high-speed network access is assumed.
• There are many use cases and many possible solutions for cloud computing, so any definition should be a model that can be modified to fit a particular use case.
• The resources have a management user interface allowing the user to configure and customize the resources as needed.
• The asset or service can be requested (provisioned) more quickly than in the traditional IT model, usually in seconds or minutes.
• The service provider should be able to "set it and forget it": after the initial deployment, ongoing maintenance requirements should be minimal.
• This implies a self-service model: the user has direct access to the resources and can provision (request) and release (return) an asset with minimum overhead.
• The relevant compute resources are pooled and then logically divided and shared amongst the users. Related terms include multi-tenancy and elastic.
22. Cloud Computing – Deployment Models
Private Cloud: for the exclusive use of a single organization; can have multiple consumers, e.g. business units; may exist on or off premises.
Community Cloud: for the exclusive use of a community with a shared concern or purpose; owned and managed by the community and/or a third party; may exist on or off premises.
Public Cloud: available to the general public; owned and managed by a service provider organization; exists on the premises of the service provider.
Hybrid Cloud: a composition of two or more distinct cloud infrastructures, bound together by standard or proprietary technology yet otherwise distinct; enables data and app portability.
24. Big Data
"There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy." – Hamlet
From the beginning of recorded time until 2003, we created 5 billion gigabytes of data. In 2011 the same amount was created every two days. By 2013 that time will shrink to 10 minutes. – The Human Face of Big Data, Rick Smolan & Jennifer Erwitt
25. Units of data (SI decimal prefix and value; hard drive storage counts in decimal, processor/virtual storage in binary):
Binary digit (bit): 1 bit
Byte: 8 bits
Kilobyte (10^3): 1000 bytes decimal / 1024 bytes binary
Megabyte (10^6): 1000 kilobytes decimal / 1024 kilobytes binary
Gigabyte (10^9): 1000 megabytes decimal / 1024 megabytes binary
Terabyte (10^12): 1000 gigabytes decimal / 1024 gigabytes binary
Petabyte (10^15): 1000 terabytes decimal / 1024 terabytes binary
Exabyte (10^18): 1000 petabytes decimal / 1024 petabytes binary
Zettabyte (10^21): 1000 exabytes decimal / 1024 exabytes binary
Yottabyte (10^24): 1000 zettabytes decimal / 1024 zettabytes binary
Brontobyte (10^27): 1000 yottabytes decimal / 1024 yottabytes binary
Geopbyte (10^30): 1000 brontobytes decimal / 1024 brontobytes binary
(Slide graphic: illustrative equivalences for these units, e.g. a binary decision, a text character, half a page, one minute of MP3 audio, 894,784 plaintext pages, 4,581,298 books, 268,435,456 MP3 files, 245 million DVDs, 375 trillion digital pictures.)
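One practical consequence of the decimal-versus-binary split above (a small illustrative calculation, not from the slides): a drive marketed as "1 TB" in decimal units holds noticeably less than one binary terabyte (tebibyte).

decimal_tb = 10 ** 12   # 1 TB as marketed: 1,000,000,000,000 bytes
binary_tb = 2 ** 40     # 1 TiB as many operating systems count it: 1,099,511,627,776 bytes

drive_bytes = 1 * decimal_tb
print(drive_bytes / binary_tb)  # ~0.909, so a "1 TB" drive appears as roughly 0.91 TiB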
26. What Is Big Data?
• Big data
– Too big in size, created too fast & with little or no standard structure
– Structured + Semi-Structured + Unstructured Data
– Social media, sensors, retail, weather, etc.
– Difficult to process using traditional tools
• Big data spans three dimensions: Volume, Velocity and Variety.
28. Big Data Velocity Examples
• In one second on the internet there are…
– 197 Reddit votes cast
– 463 Instagram Photos posted
– 833 Tumblr posts posted
– 1024 Skype calls made
– 3935 Tweets tweeted
– 11574 Dropbox files uploaded
– 33,333 Google searches made
– 46,333 YouTube videos viewed
– 52,083 Facebook likes
Captured in 2014
29. Big Data
• Why does Big Data matter?
– New Data
– Unlock Value
– Shape The Future
– Knowledge Is Power
• Data Sources
– Interconnectivity
– Machines
– Historical
Image from commons.wikimedia.org
30. Analyzing Big Data
• New Tools
– Data Scientist
– Scale-out Hardware
– Parallel Programming Algorithms
• MapReduce
• Hadoop
– Fast Storage:
• In-memory
• SSD
• DAS
– Languages for data mining/analytics: R, NoSQL, Pig/Hive/Hadoop, C/C++, Perl,…
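To make the MapReduce idea on this slide concrete, here is a minimal, single-process word-count sketch in Python; it only illustrates the map, shuffle, and reduce phases, whereas Hadoop runs the same pattern in parallel across a cluster (the sample input lines are made up for illustration):

from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all counts for one word."""
    return key, sum(values)

lines = ["big data needs big tools", "hadoop processes big data"]  # illustrative input
pairs = chain.from_iterable(map_phase(line) for line in lines)
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'big': 3, 'data': 2, 'needs': 1, 'tools': 1, 'hadoop': 1, 'processes': 1}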
31. Criticizing Big Data
• Privacy
• “Truthiness”
• “Forest for the trees”
• Decisions based on the past
• Correlation does not mean causation.
• “Big Judgment”
– Providing context for big data outcomes
…not everything that can be counted counts, and not everything that counts can be counted. – William Bruce Cameron
32. Big Data Analytics
"In a world where every click can be tracked and recorded, we shouldn't be managing customers by putting them into groups of similar people, we really shouldn't be guessing. We should be able to read the signals that customers are telling us to figure out what they want. I call that personalization."
– Nilan Peiris, CMTO HolidayExtras.com
33. Big Data
Known Knowns: things we know and we know that we know (working data)
Known Unknowns: things we know that we do NOT know (data to be acquired)
Unknown Knowns: things we do NOT know that we know (forgotten data)
Unknown Unknowns: things we do NOT know that we do NOT know (data ignorance)
Popularized partly by Donald Rumsfeld
34. Big Data
(Slide diagram: a progression from Data to Information to Knowledge to Understanding to Wisdom, i.e. from raw data through context, experience, and models to prediction, with data engineer, data analyst, data miner, and data scientist as the roles working along that progression.)
The Data Scientist can obtain actionable insight from an analysis of the data.
35. Smart Dust
The Sensors That Track Every Thing, Everywhere
Sensors are ubiquitous, not just in devices