Building Big Data applications with Stratio Sparta
SPARTA 2.0
José Carlos García Serrano
Big Data Architect at Stratio.
I am from Granada and a Computer Science Engineer from the
ETSII, with a postgraduate degree in Big Data and certifications in Spark and
AWS
def fanBoy(): Seq[Skills] = {
  val functional = Seq(Scala, Akka)
  val processing = Seq(Spark)
  val noSql = Seq(MongoDB, Cassandra)
  functional ++ processing ++ noSql
}
def aLongTimeAgo(): Seq[Skills] = {
  val programming = Seq(Delphi, C++)
  val processing = Seq(Hadoop)
  val sql = Seq(Interbase, FireBird)
  programming ++ processing ++ sql
}
Sparktan one
Sparktan two
Javier Yuste Checa
Product Owner at Stratio.
I am from Madrid and I hold a Master's in Computer Science from
the UPM, with 10+ years of hands-on software development
experience.
I love travelling and motorbikes
I like:
● Software development
● Agile methodologies
● Product development
● Making things happen
Index
1. What is Sparta?
2. New version
3. Use case - Demo
4. Questions
WHAT IS SPARTA?
1
© Stratio 2016. Confidential, All Rights Reserved.
Towards a generic real-time aggregation platform
At Stratio, we have implemented several real-time analytics projects based on Apache Spark,
Kafka, Flume, Cassandra, and MongoDB.
These technologies were always a perfect fit, but we soon found ourselves writing the same
pieces of integration code over and over again.
This is how SPARTA was born.
SPARTA - Beginning
The goals
○ Pure Spark!
○ No coding needed, only declarative workflows
○ Data continuously streamed in and processed in near real-time
○ Ready to use out of the box
○ Plug & play: flexible workflows (inputs, outputs, parsers, etc…)
○ High performance
○ Scalable and fault tolerant
○ Stateful operations with OLAP engine
○ Execute SQL over streaming and batch data
SPARTA 1.0
Kafka
Flume
Crossdata
RabbitMQ
Socket
WebSocket
HDFS/S3
Twitter
MongoDB
Cassandra
ElasticSearch
Redis
JDBC
Crossdata
CSV
Parquet
Http
Kafka
HDFS/S3
Http Rest
Avro
NEW VERSION
2
SPARTA 2.0
Security
○ Data
○ Resource
○ Service
Distributed Multiprocessing
○ Streaming
○ Batch
○ SQL / OLAP
○ Machine learning
SaaS & Multi Tenancy
○ Sparta as a Service
○ Multi User
○ Multi Tenant
○ Multi Instance
EOS - Data centric Suite
USE CASES
DEMO
3
© Stratio 2017. Confidential, All Rights Reserved.
Big Data use case
• Different technologies and skills needed
• Experience solving complex problems
• Knowledge of Big Data architectures
Simple use case
Input Test:
{
  "event": "Big data spain",
  "speech": "Sparta",
  "attendees": 10,
  "address": "Kinepolis"
}
Select:
  speech, attendees
Cube:
  dimensions:
    speech
  operations:
    sum(attendees)
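To make the select and cube steps concrete, here is a minimal sketch in plain Scala collections of what this workflow computes. It is not Sparta's API or configuration format: the `Event` case class, the `SimpleCubeSketch` object, and the second sample event (5 attendees) are assumptions added purely for illustration, and the sketch runs over a fixed batch rather than a stream.

```scala
// Hypothetical event type mirroring the fields of the input test message.
final case class Event(event: String, speech: String, attendees: Int, address: String)

object SimpleCubeSketch {
  // "Select: speech, attendees" keeps only the two fields the cube needs.
  def select(events: Seq[Event]): Seq[(String, Int)] =
    events.map(e => (e.speech, e.attendees))

  // "Cube: dimensions = speech, operations = sum(attendees)":
  // group rows by the dimension and sum the measure within each group.
  def cube(selected: Seq[(String, Int)]): Map[String, Int] =
    selected.groupBy(_._1).map { case (speech, rows) =>
      speech -> rows.map(_._2).sum
    }

  def main(args: Array[String]): Unit = {
    val events = Seq(
      Event("Big data spain", "Sparta", 10, "Kinepolis"),
      // Assumed second event, not from the slides, so the sum has something to add.
      Event("Big data spain", "Sparta", 5, "Kinepolis")
    )
    println(cube(select(events)))
  }
}
```

With these two sample events, the cube collapses both rows into a single entry for the speech "Sparta" with 15 attendees. In a streaming setting the same aggregation would be kept as running state and updated as new events arrive.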
BIG DATA
CHILD'S PLAY
THANKS
Contacts:
jcgarcia@stratio.com
es.linkedin.com/in/gserranojc
jyuste@stratio.com
es.linkedin.com/in/jyustecheca