A peek into the future

A PEEK INTO THE FUTURE OF DATA
ORM, NoSQL, Big Data

Presented by:
PRATEEK CHAUHAN
10ESKCS738

BEFORE STARTING…
• Are relational tables the most efficient way to manage data?
• Do companies like Facebook and Twitter really use a traditional relational DBMS to manage their data?

ORM
O: OBJECT
R: RELATIONAL
M: MAPPING

WAYS TO ACCESS DATABASE
• Using a GUI-based DBMS
• Using a console-based DBMS
• Using a database embedded in an application (most important).

THE BRIDGE?
[Diagram: an Application Programming Interface (API) acts as the bridge to the database]

THE BRIDGE: JDBC
• The standard Java API for database-independent connectivity between the Java programming language and a wide range of databases.
• JDBC provides a flexible architecture for writing database-independent applications that can run on different platforms and interact with different DBMSs without modification.
• JDBC includes APIs for each of the tasks commonly associated with database usage:
– Making a connection to a database.
– Creating SQL statements.
– Executing SQL queries in the database.
– Viewing & modifying the resulting records.
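
JDBC itself is a Java API. As a rough, language-neutral sketch of the same four tasks, the example below uses Python's built-in sqlite3 module as a stand-in (an assumption made only to keep the example self-contained; the student table and its contents are hypothetical).

```python
import sqlite3

# 1. Make a connection to a database (in-memory SQLite, nothing is written to disk).
conn = sqlite3.connect(":memory:")

# 2. Create SQL statements (hypothetical "student" table).
create_sql = "CREATE TABLE student (roll_no TEXT PRIMARY KEY, name TEXT)"
insert_sql = "INSERT INTO student (roll_no, name) VALUES (?, ?)"
select_sql = "SELECT roll_no, name FROM student ORDER BY roll_no"

# 3. Execute the SQL in the database.
conn.execute(create_sql)
conn.execute(insert_sql, ("S001", "Asha"))
conn.execute(insert_sql, ("S002", "Ravi"))

# 4. View and modify the resulting records.
rows = conn.execute(select_sql).fetchall()
conn.execute("UPDATE student SET name = ? WHERE roll_no = ?", ("Ravi K.", "S002"))
conn.commit()
updated_rows = conn.execute(select_sql).fetchall()
conn.close()
```

In JDBC the same steps would go through a connection, a statement, and a result set; only the overall flow is meant to carry over from this sketch.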

JDBC
Pros of JDBC
• Clean and simple SQL processing
• Good performance with small data
• Very good for small applications
• Simple syntax, so easy to learn

Cons of JDBC
• Complex when used in large projects
• Large programming overhead
• No encapsulation
• Hard to implement the MVC concept
• Queries are DBMS-specific

The Problem
• Mapping member variables to columns
• Mapping relationships
• Handling data types (esp. Boolean)
• Managing changes to object state

The Problem
[Diagram: Object ↔ Relational: Mapping!]

Saving without ORM
• Database Configuration
• The Model Object
• Service method to create the model object
• Database Design
• DAO method to save the object using SQL queries
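
The five pieces above can be sketched roughly as follows. This is a minimal illustration in Python with the built-in sqlite3 module, not the presenter's code; the User class, the users table, and the function names are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass

# Database configuration (hypothetical: an in-memory SQLite database).
DB_URL = ":memory:"

# The model object.
@dataclass
class User:
    user_id: int
    name: str
    active: bool

# Database design: the table is written by hand and must match the class.
SCHEMA = "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, active INTEGER)"

# Service method to create the model object.
def create_user(user_id, name):
    return User(user_id=user_id, name=name, active=True)

# DAO method: hand-written SQL, each member variable mapped to a column.
def save_user(conn, user):
    conn.execute(
        "INSERT INTO users (user_id, name, active) VALUES (?, ?, ?)",
        (user.user_id, user.name, 1 if user.active else 0),  # Boolean stored as 0/1
    )

# DAO method: map the row back to an object, converting 0/1 back to a Boolean.
def load_user(conn, user_id):
    row = conn.execute(
        "SELECT user_id, name, active FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    return User(user_id=row[0], name=row[1], active=bool(row[2]))

conn = sqlite3.connect(DB_URL)
conn.execute(SCHEMA)
save_user(conn, create_user(1, "Asha"))
loaded = load_user(conn, 1)
```

Every new field means touching the class, the schema, and both DAO methods, which is the programming overhead the mapping problem refers to.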

The ORM Way
• JDBC Database Configuration – ORM-specific configuration
• The Model Object – annotations
• Service method to create the model object – use the ORM framework API
• Database Design – not needed!
• DAO method to save the objects using SQL queries – not needed!
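
To make the idea concrete, here is a toy sketch in Python of what an ORM does behind the scenes: it derives the table and the SQL from the model class itself. ToySession is a hypothetical stand-in written for this example, not the API of any real ORM framework, and the Student model is invented.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple

# Python type -> SQL column type (Booleans are stored as INTEGER 0/1).
SQL_TYPES = {int: "INTEGER", str: "TEXT", bool: "INTEGER", float: "REAL"}

class ToySession:
    """Hypothetical, minimal stand-in for an ORM session."""

    def __init__(self, conn):
        self.conn = conn

    def create_table(self, model):
        # "Database design" is derived from the class definition.
        table = model.__name__.lower()
        cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(model))
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")

    def save(self, obj):
        # The INSERT statement is generated, not hand-written.
        table = type(obj).__name__.lower()
        names = ", ".join(f.name for f in fields(obj))
        marks = ", ".join("?" for _ in fields(obj))
        values = tuple(int(v) if isinstance(v, bool) else v for v in astuple(obj))
        self.conn.execute(f"INSERT INTO {table} ({names}) VALUES ({marks})", values)

    def load_all(self, model):
        # Rows are converted back into objects using the declared field types.
        table = model.__name__.lower()
        names = ", ".join(f.name for f in fields(model))
        rows = self.conn.execute(f"SELECT {names} FROM {table}").fetchall()
        return [model(*(f.type(v) for f, v in zip(fields(model), row))) for row in rows]

# The model object: the field annotations play the role of mapping annotations.
@dataclass
class Student:
    roll_no: int
    name: str
    enrolled: bool

session = ToySession(sqlite3.connect(":memory:"))
session.create_table(Student)
session.save(Student(1, "Asha", True))
students = session.load_all(Student)
```

Real frameworks add relationships, change tracking, and caching on top; this sketch covers only the column mapping and the Boolean handling.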

THE ONLY DISADVANTAGE
• Boilerplate code
=> XML configuration files
=> XML system files
=> Extra classes like POJO, etc.

NoSQL: THE NAME
• SQL: in general, “traditional relational DBMS”.
• Past decade: an RDBMS isn’t always the best solution.
• NoSQL: “No SQL” => not using a traditional RDBMS

ISSUES WITH RDBMS
• Primary issue: an RDBMS is a big package with all the features, but sometimes we don’t need all of them:
COMPROMISES
• Convenient
• Multi-user

SIMILAR
• Safety
• Persistent

BOOSTS
• Reliable
• MASSIVE (big data)
• Efficient

NoSQL SYSTEMS
Alternative to traditional RDBMS
Pros
• Flexible schema
• Quicker/cheaper to set up
• Massive scalability: handle big data
• Relaxed consistency: higher performance & availability

Cons
• No declarative query language: more programming
• Relaxed consistency: fewer guarantees

Example: Social-Network Graph
Each record: User ID1, User ID2 …
Separate records: User Id, name, age, gender …
[Figure: friendship graph over users A through L]

Example: Social-Network Graph
• TASK: Find all friends of a given user.
• TASK: Find all friends of friends of a given user.
• TASK: Find all women friends of men friends of a given user.
• TASK: Find all friends of friends of … friends of a given user.
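
The four tasks can be sketched over an in-memory adjacency map as follows. The friendship pairs and user records are made-up sample data (they are not the edges of the graph on the slide), and the function names are hypothetical.

```python
from collections import deque

# Hypothetical data: friendship records (user_id1, user_id2) and separate user records.
friendships = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"), ("E", "F")]
users = {
    "A": {"gender": "F"}, "B": {"gender": "M"}, "C": {"gender": "F"},
    "D": {"gender": "F"}, "E": {"gender": "M"}, "F": {"gender": "F"},
}

# Build an undirected adjacency map from the friendship records.
adjacency = {u: set() for u in users}
for u, v in friendships:
    adjacency[u].add(v)
    adjacency[v].add(u)

def friends(user):
    return set(adjacency[user])

def friends_of_friends(user):
    # Two hops; may include direct friends if they are also two hops away.
    result = set()
    for f in adjacency[user]:
        result |= adjacency[f]
    return result - {user}

def women_friends_of_men_friends(user):
    result = set()
    for f in adjacency[user]:
        if users[f]["gender"] == "M":
            result |= {w for w in adjacency[f] if users[w]["gender"] == "F"}
    return result - {user}

def reachable(user):
    # Friends of friends of ... friends: breadth-first search to any depth.
    seen = {user}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {user}
```

In a relational table of friendship pairs, each extra hop is one more self-join, and the last task needs an unbounded number of hops, which is why it is awkward to express with a fixed set of joins.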

INCARNATIONS OF NoSQL
• MapReduce Framework: OLAP (big operations)
• Key-Value Store: OLTP (small operations)

• Document Stores
• Graph database systems
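
As a rough illustration of the key-value model (small operations addressed by key, no query language), here is a minimal in-memory sketch in Python. KeyValueStore and the key names are hypothetical and do not correspond to any particular product.

```python
class KeyValueStore:
    """Hypothetical in-memory key-value store: access by key only."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

store = KeyValueStore()
store.put("user:1", {"name": "Asha", "age": 21})
store.put("user:2", {"name": "Ravi"})  # flexible schema: no "age" field here
store.delete("user:2")
```

A real system distributes the keys across many machines and replicates them; the interface, however, stays about this small.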

MapReduce Framework
• Originally from Google; open-source implementation: Hadoop.
• Two main functions:
1. Map: divides the problem into sub-problems.
2. Reduce: operates on the sub-problems and combines their outputs to give the result.
• Languages built on top:
1. Hive: SQL-like language
2. Pig: statement-based language
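
The classic word-count example shows the two functions at work. This is a single-process Python sketch with invented function names and input; in a framework like Hadoop the map and reduce calls run in parallel on many machines and the grouping step is done by the framework.

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit a (key, value) pair for every word in one input record.
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    # Group all values by key (handled by the framework on a real cluster).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)

documents = ["big data is big", "data about data"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
```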

Graph Database Systems
• Data model: nodes and edges.
• Nodes may have properties.
• Edges may have labels or roles.
• Examples: Neo4j, FlockDB, Pregel
[Figure: example graph with nodes ID: 1, ID: 2, and ID: 3 connected by edges labeled Friends and Likes]

AGAIN, SOME QUESTIONS…
• What is the maximum file size you’ve dealt with so far?
• What is the maximum download speed you get?
• How much time is required just to transfer the data?

What is Big Data?
• Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone.
• From the beginning of recorded time until 2003, we created 5 billion gigabytes (5 exabytes) of data.
• In 2011, the same amount was created every two days.
• In 2013, the same amount is created every 10 minutes.
THIS IS “BIG DATA”

What is Big Data? FINALLY…
• ‘Big data’ is similar to ‘small data’, but bigger
• But bigger data requires different approaches:
– Techniques, tools, architecture
• With an aim to solve new problems
– Or old problems in a better way

Types of Data
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
– Social Network, Semantic Web (RDF), …
• Streaming Data
– You can only scan the data once
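
The single-scan constraint on streaming data means statistics have to be updated incrementally instead of being computed over a stored copy. A small Python sketch, with a hypothetical sensor_readings generator standing in for a real stream:

```python
def running_stats(stream):
    # One pass over the stream: keep only a count, a running mean, and a maximum.
    count = 0
    mean = 0.0
    maximum = None
    for x in stream:
        count += 1
        mean += (x - mean) / count  # incremental mean update
        if maximum is None or x > maximum:
            maximum = x
    return count, mean, maximum

def sensor_readings():
    # Hypothetical stream: a generator can be consumed only once.
    for value in [3, 5, 4, 8]:
        yield value

stats = running_stats(sensor_readings())
```

Memory use stays constant no matter how long the stream is, because no element is kept after it has been processed.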

What to do with these data?
• Aggregation and Statistics
– Data warehouse and OLAP

• Indexing, Searching, and Querying
– Keyword based search
– Pattern matching (XML/RDF)

• Knowledge discovery
– Data Mining
– Statistical Modeling

Big Data Analytics Technologies
• NoSQL: non-relational database solutions such as HBase, Cassandra, MongoDB, Riak, CouchDB, and many others.

• Hadoop: an ecosystem of software packages, including MapReduce, HDFS, and a whole host of others.

Summarizing…
• Key enablers for the appearance and growth of ‘Big Data’ are:
+ Increase in storage capabilities
+ Increase in processing power
+ Availability of data
