- All the data on a company's servers, whether or not it is categorized, is collectively called big data.
- For example: customers' shopping history, web surfing history, likes and comments on Facebook, tweets, etc.
Big data management
1. Bilwa Upadhye - FPM03
Chetna Chauhan – FPM04
Leon Dukkipati – PGP0686
Manzoor Ul Akram – FPM05
Soumya Soni – PGP06105
IIM Rohtak
2. • The exponential growth and availability of data, both
structured and unstructured.
Structured Data
• Data that resides in a fixed field within a record or file is
called structured data. This includes data contained in
relational databases and spreadsheets.
Unstructured Data
• Text and multimedia content like e-mail messages, word
processing documents, videos, photos, audio files,
presentations, webpages and many other kinds of
business documents. This data doesn't fit neatly in a
database.
• 80–90% of the data in any organization is unstructured
12/6/2015 Big Data 2
3. • eBay – 100 PB
• Google – 100 PB
• Facebook - 600 PB
• Twitter – 100 TB
• NSA – 29 TB
• 90% of the data in the world today has been created in
the last two years alone
Examples :
• Sensors used to gather climate information, posts to
social media sites, digital pictures and videos, purchase
transaction records, cell phone GPS signals, UID
information, patient information etc.
Source: http://wikibon.org
5. • Organization, administration and governance of large
volumes of both structured and unstructured data
• Tools used:
Hadoop, NoSQL, Platfora
• Big data management is important to business, and
society, because more data may lead to more accurate
analyses.
6. RDBMS
• Relational database management system
• Structured data
• ER model defined perfectly
• Smaller amounts of data
• Applications: IIM Rohtak
Big data management technologies
• Semi-structured and unstructured data
• No perfect ER model
• Large amounts of data
• Node based flat structure
• Applications: healthcare, retail, Google, IBM
7.
8. • Open source software framework written in Java
• Fundamental assumption
• Storage part: HDFS (Hadoop Distributed File System)
• Processing part: MapReduce
• Working of Hadoop
10. • Non-SQL (NoSQL) database
• Provides a mechanism for storage and retrieval of data
• Horizontal scaling: capacity grows by adding more machines
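Horizontal scaling can be illustrated with a minimal sketch, assuming a simple hash-modulo placement of keys across nodes. The names `shard_for` and `ShardedStore` are hypothetical and do not come from any particular NoSQL product; real systems typically use more elaborate schemes (for example consistent hashing) so that adding a node moves fewer keys.

```python
import hashlib


def shard_for(key, num_nodes):
    """Pick a node index for a key using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


class ShardedStore:
    """Toy key-value store that spreads keys over several in-memory 'nodes'."""

    def __init__(self, num_nodes):
        # Each dict stands in for one machine in the cluster.
        self.nodes = [dict() for _ in range(num_nodes)]

    def put(self, key, value):
        self.nodes[shard_for(key, len(self.nodes))][key] = value

    def get(self, key):
        return self.nodes[shard_for(key, len(self.nodes))].get(key)


store = ShardedStore(num_nodes=4)
store.put("user:1", {"likes": 12})
store.put("user:2", {"likes": 3})
print(store.get("user:1"))
```

Each key lives on exactly one node, so storage and request load are divided among the machines instead of being concentrated on one large server.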
Platfora
• Software that works with the open source Hadoop
framework
• When a user queries the database, the software delivers
the answer in real time
11. • Highly fault-tolerant and designed to be deployed on
low-cost hardware
• Provides high throughput access to application data and
is suitable for applications that have large data sets
• Relaxes a few POSIX requirements to enable streaming
access to file system data
12. Properties of HDFS
• Large: thousands of server machines
• Replicated data blocks
• Failure is the norm
• Fast detection and recovery of faults
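The idea behind replicated data blocks can be sketched as follows. This is a toy model and not the HDFS API: the function names, the tiny block size, and the round-robin placement are all assumptions made for illustration.

```python
def split_into_blocks(data, block_size):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin style."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    return {
        block
        for block, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    }


nodes = ["n1", "n2", "n3", "n4"]
blocks = split_into_blocks(b"abcdefghij", block_size=4)
placement = place_replicas(len(blocks), nodes, replication=3)
print(placement)
print(readable_blocks(placement, failed_nodes={"n1", "n2"}))
```

With three replicas per block, every block remains readable after two of the four nodes fail, which is why the design can treat machine failure as a normal event.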
13. • Programming model for processing large data sets
• Developed by Google for internal search applications
• Currently used by Yahoo, Amazon, IBM, etc.
• The runtime partitions the input and provides it to
different Map instances
14. • Partitioning the input
• Mapping of instances
• Map: (key, value) → (key’, value’)
• Collection of the (key’, value’) pairs
• Distribution to reduce functions
• Each reduce produces a single output file
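The flow above can be sketched as a single-process word count. This is only an illustration of the programming model, not Hadoop code: the function names are hypothetical, everything runs in one process, and the result is returned as a dictionary rather than written to one output file per reduce.

```python
from collections import defaultdict


def partition(lines, num_parts):
    """Split the input into roughly equal parts, one per Map instance."""
    return [lines[i::num_parts] for i in range(num_parts)]


def map_fn(_key, line):
    """Map: (key, value) -> list of (key', value') pairs, here (word, 1)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Collect the (key', value') pairs and group the values by key'."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines, num_parts=2):
    mapped = []
    for part in partition(lines, num_parts):
        for offset, line in enumerate(part):
            mapped.extend(map_fn(offset, line))
    grouped = shuffle(mapped)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


print(run_job(["big data", "big value"]))
```

In a real cluster the Map instances and reduce functions run on different machines, and the grouping step moves data across the network between them.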
16. • $300 billion potential annual value to US health care.
• $600 billion potential annual consumer surplus from using personal location data.
• 60% potential increase in retailers’ operating margins.
17. • Gaining traction
• Huge market opportunities for IT services (82.9% of revenues) and analytics firms (17.1%)
• Current market size is $200 million, projected to reach $1 billion by 2015
• The opportunity for Indian service providers lies in offering services around Big Data implementation and analytics for global multinationals
18. Big Data Challenges
• Hard to quantify value to the enterprise
• Data scientist roles are difficult to fill
• Difficult to design effective visualization and reporting of new data sets
19. • Goal of improving retention and
graduation rates
• Developing a more proactive
relationship with students to help
them be more successful during
their studies and after graduation
• Approach:
1. Online Applications for
Education
2. Forums
3. Help desk
4. Student Demographic and
Operational Information
22. What makes Modi’s use of big data so impressive
Volume of data: 814 million voters
Variety of data:
-- 12 different languages
-- 900,000 PDFs amounting to 25 million pages
-- heterogeneous, non-uniform data
For what purpose did he use big data?
-> To drive donations, enroll volunteers, and improve
the effectiveness of everything from door knocks…to social media
The BJP’s website planted cookies on all computers that visited it, for customised
advertisements.
#IndiaVotes
Source : dataconomy.com