Avoiding CRUD operations lock-in
in NoSQL databases: extension of
the CPIM library
Candidate: Fabio Arcidiacono (799001)
Advisor: Prof. Elisabetta Di Nitto
Co-advisor: Eng. Marco Scavuzzo
School of Industrial and Information Engineering
Master of Science in Computer Engineering
Academic Year 2013-2014
Master's Thesis – Fabio Arcidiacono
Data management systems
RDBMS
• Well-structured data
• Relational model
• ACID transactions
• Vertical scaling
• SQL
NoSQL
• Non-structured data
• Various data models
• BASE properties
• Horizontal scaling
• Proprietary API
NoSQL common language approaches
Meta-model
• Apache MetaModel
• SOS platform
SQLification
• Apache Phoenix
• UnQL
• Native support
ORM
• Kundera
• PlayORM
• Spring-data
• Apache Gora
Work objectives
Integrate Kundera in the CPIM library
Contribute to the open source project Kundera
Integrate the migration and synchronization system Hegira
Evaluation
Kundera
A JPA 2.1 ORM Library for NoSQL databases
ORM operations (through the EntityManager interface)
JPQL queries (DELETE and UPDATE)
On-premises databases:
• Cassandra
• HBase
• MongoDB
• Oracle NoSQL
• Redis
• Neo4j
• CouchDB
• Elasticsearch
• MySQL
Why Kundera
• Open source
• Developed with extensibility as a primary goal
• Support for many different NoSQL databases
• Polyglot persistence
• In the field since 2010, with an active community
• Already used in production
Contributions to Kundera
Two newly developed clients
• Azure Tables [1]
• GAE Datastore [2]
Paradigm shift
• Off-premises databases → DaaS solutions
• Bug fix for deploying Kundera on PaaS
[1] https://github.com/deib-polimi/kundera-azure-table
[2] https://github.com/deib-polimi/kundera-gae-datastore
Developed clients
master
Exploit consistency mechanisms as much as possible
• GAE Datastore → no Ancestor Path support
• Azure Tables → manage partition key and row key
migration
Limited support for consistency mechanisms, but achieves interoperability
• GAE Datastore → no Ancestor Path support
• Azure Tables → fix partition key to table name
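The two Azure Tables key strategies named on this slide can be illustrated with a small sketch. Azure Tables addresses an entity by a partition key plus a row key; everything else here (the `TableKey` and `KeyStrategies` classes, the method names, the sample values) is hypothetical and is not taken from the Kundera client code.

```java
// Hypothetical value type: an Azure Tables entity is addressed by these two keys.
final class TableKey {
    final String partitionKey;
    final String rowKey;

    TableKey(String partitionKey, String rowKey) {
        this.partitionKey = partitionKey;
        this.rowKey = rowKey;
    }

    @Override
    public String toString() {
        return partitionKey + "/" + rowKey;
    }
}

// Hypothetical helpers, one per strategy listed on the slide.
final class KeyStrategies {
    // "manage partition key and row key": the application supplies both parts.
    static TableKey userManaged(String partitionKey, String rowKey) {
        return new TableKey(partitionKey, rowKey);
    }

    // "fix partition key to table name": only the row key varies per entity.
    static TableKey fixedToTable(String tableName, String entityId) {
        return new TableKey(tableName, entityId);
    }

    public static void main(String[] args) {
        System.out.println(userManaged("sales", "emp-42"));
        System.out.println(fixedToTable("Employee", "emp-42"));
    }
}
```

Under the fixed strategy every entity of a table ends up in the same partition, so the key no longer depends on application-level choices, at the cost of giving up control over partitioning.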
CPIM
Abstracts application logic from the specific PaaS provider to overcome vendor lock-in
Many supported services:
• Blob
• NoSQL
• Memcache
• Queue
• Mail
• SQL
Original CPIM NoSQL service implementation
• Many JPA providers
• Duplicated code
• No complete code portability
• Choice of NoSQL database strictly bound to the cloud provider (e.g. App Engine → Datastore)
• Limited NoSQL database support
[Diagram: CloudEntityManager and CloudEntityManagerFactory sit on top of vendor-specific EntityManager and EntityManagerFactory classes (Azure, AWS, GAE), which rely on three JPA providers: jpa4Azure, SimpleJPA, Google JPA]
Kundera integration
• Single persistence provider
• Complete code portability
• NoSQL support inherited from Kundera
• Easier configuration through a standard persistence.xml
[Diagram: CloudEntityManager and CloudEntityManagerFactory sit on top of Kundera alone]
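The idea of a single persistence provider behind one facade can be sketched as follows. This is only an illustration of the delegation pattern: `EntityStore`, `SingleProviderStore`, `PortableEntityManager`, and `PortableEntityManagerFactory` are hypothetical names, an in-memory map stands in for the real provider, and none of it reproduces the actual CPIM or Kundera classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal persistence contract standing in for JPA's EntityManager.
interface EntityStore {
    void persist(String id, Object entity);
    Object find(String id);
}

// Hypothetical single provider; an in-memory map stands in for a real NoSQL client.
final class SingleProviderStore implements EntityStore {
    private final Map<String, Object> data = new HashMap<>();

    @Override
    public void persist(String id, Object entity) {
        data.put(id, entity);
    }

    @Override
    public Object find(String id) {
        return data.get(id);
    }
}

// Facade the application codes against; it never branches on the cloud vendor.
final class PortableEntityManager implements EntityStore {
    private final EntityStore delegate;

    PortableEntityManager(EntityStore delegate) {
        this.delegate = delegate;
    }

    @Override
    public void persist(String id, Object entity) {
        delegate.persist(id, entity);
    }

    @Override
    public Object find(String id) {
        return delegate.find(id);
    }
}

final class PortableEntityManagerFactory {
    // In a real setup the target database would come from configuration;
    // here the unit name is accepted but not interpreted.
    static PortableEntityManager create(String persistenceUnit) {
        return new PortableEntityManager(new SingleProviderStore());
    }

    public static void main(String[] args) {
        PortableEntityManager em = create("demo-unit");
        em.persist("1", "alice");
        System.out.println(em.find("1"));
    }
}
```

Because the facade holds exactly one delegate, there is no per-vendor code path to duplicate, which is the property the slide contrasts with the original implementation.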
Data migration
• Move the application to another cloud provider
• Move data to a database that better fits requirements
• Load balancing, system expansion, failure recovery, costs, etc.
• Modern computer systems are expected to be up continuously
• Data synchronization between the two involved systems
Hegira support
• Transparently intercept user operations (DMQ)
• Translate operations to SQL statements
• Send them to the Hegira commit-log
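The three steps above can be sketched as a tiny interceptor. This is a minimal illustration under stated assumptions: `CommitLog` is an in-memory stand-in for the Hegira commit-log, `SqlInterceptor` and its method names are hypothetical, and the SQL shape is a plausible guess rather than the format actually produced by the CPIM integration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for the Hegira commit-log: an in-memory, append-only list.
final class CommitLog {
    private final List<String> entries = new ArrayList<>();

    void append(String statement) {
        entries.add(statement);
    }

    List<String> entries() {
        return entries;
    }
}

// Hypothetical interceptor: turns each write operation into a SQL string and logs it.
final class SqlInterceptor {
    private final CommitLog log;

    SqlInterceptor(CommitLog log) {
        this.log = log;
    }

    // Numbers are emitted as-is; anything else becomes a quoted SQL string literal.
    static String literal(Object value) {
        if (value instanceof Number) {
            return value.toString();
        }
        return "'" + String.valueOf(value).replace("'", "''") + "'";
    }

    void onInsert(String table, Map<String, Object> columns) {
        String names = String.join(", ", columns.keySet());
        String values = columns.values().stream()
                .map(SqlInterceptor::literal)
                .collect(Collectors.joining(", "));
        log.append("INSERT INTO " + table + " (" + names + ") VALUES (" + values + ")");
    }

    void onUpdate(String table, String idColumn, Object id, Map<String, Object> changes) {
        String sets = changes.entrySet().stream()
                .map(e -> e.getKey() + " = " + literal(e.getValue()))
                .collect(Collectors.joining(", "));
        log.append("UPDATE " + table + " SET " + sets
                + " WHERE " + idColumn + " = " + literal(id));
    }

    void onDelete(String table, String idColumn, Object id) {
        log.append("DELETE FROM " + table + " WHERE " + idColumn + " = " + literal(id));
    }

    public static void main(String[] args) {
        CommitLog log = new CommitLog();
        SqlInterceptor interceptor = new SqlInterceptor(log);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "Ada");
        interceptor.onInsert("users", row);
        interceptor.onDelete("users", "id", 1);
        log.entries().forEach(System.out::println);
    }
}
```

The application keeps calling its usual persist, update, and remove operations; only the interceptor knows that a SQL rendering of each one is also appended to the log.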
Cloud Serving Benchmark
Framework for evaluating the performance of different NoSQL databases
Compare the Kundera clients with direct use of the low-level APIs for the same operations
• Development of a new adapter for operations through Kundera
• Development of a new adapter for operations through the low-level API
Workload: 100,000 entities
• Load phase (write) produces the write operation report
• Transaction phase (read) produces the read operation report
Environment setup
• YCSB + YCSB adapters + Kundera GAE Datastore client → Datastore
• YCSB + YCSB adapters + Kundera Azure Tables client → Azure Tables
• Machines: 4 cores, 7 GB RAM; 4 cores, 3.6 GB RAM
Results comparison
Azure Tables
               Read latency   Read throughput   Write latency   Write throughput
Kundera        42.44 ms       689.67 ops/sec    40.701 ms       707.12 ops/sec
low-level API  36.74 ms       787.22 ops/sec    38.809 ms       758.54 ops/sec
overhead       13.43 %        12.39 %           4.75 %          6.78 %

Google Datastore
               Read latency   Read throughput   Write latency   Write throughput
Kundera        139.13 ms      212.74 ops/sec    151.159 ms      194.64 ops/sec
low-level API  132.36 ms      222.5 ops/sec     150.018 ms      198.67 ops/sec
overhead       4.36 %         4.39 %            0.76 %          2.03 %
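The slides do not say how the overhead row was computed. One definition that reproduces several of the reported figures (for example the Azure Tables read latency and read throughput) is the gap between the two measurements divided by the larger of the two. The sketch below assumes that definition; the `Overhead` class and its `percent` method are hypothetical names, and the formula may not match every cell of the table.

```java
import java.util.Locale;

final class Overhead {
    // Assumed definition: absolute gap between the two measurements,
    // expressed as a percentage of the larger one.
    static double percent(double kundera, double lowLevel) {
        return Math.abs(kundera - lowLevel) / Math.max(kundera, lowLevel) * 100.0;
    }

    public static void main(String[] args) {
        // Azure Tables read figures from the table above.
        System.out.printf(Locale.ROOT, "read latency overhead: %.2f %%%n",
                percent(42.44, 36.74));
        System.out.printf(Locale.ROOT, "read throughput overhead: %.2f %%%n",
                percent(689.67, 787.22));
    }
}
```

Dividing by the larger value keeps the result between 0 and 100 % and lets the same function serve both latency (where lower is better) and throughput (where higher is better).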
Conclusions
Contributions:
● Integration of Kundera in CPIM library
● New Kundera clients to support Google Datastore and Azure Tables
● Hegira integration in the CPIM library
Future work:
● Compare the performance of the developed clients with that of the other clients developed by the Kundera team
THANK YOU
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 

Master thesis

  • 1. Avoiding CRUD operations lock-in in NoSQL databases: extension of the CPIM library. Candidate: Fabio Arcidiacono (799001). Advisor: Prof. Elisabetta Di Nitto. Co-advisor: Ing. Marco Scavuzzo. School of Industrial and Information Engineering, Master of Science in Computer Engineering, Academic Year 2013-2014.
  • 2. Data management systems. RDBMS: well-structured data, relational model, ACID transactions, vertical scaling, SQL. NoSQL: non-structured data, various data models, BASE properties, horizontal scaling, proprietary API.
  • 3. NoSQL common language approaches. Meta-model: Apache MetaModel, SOS platform. SQLification: Apache Phoenix, UnQL, native support. ORM: Kundera, PlayORM, Spring-data, Apache Gora.
  • 4. Work objectives: integrate Kundera in the CPIM library; contribute to the open source project Kundera; integrate the migration and synchronization system Hegira; evaluation.
  • 5. Work objectives: integrate Kundera in the CPIM library; contribute to the open source project Kundera; integrate the migration and synchronization system Hegira; evaluation.
  • 6. Kundera: a JPA 2.1 ORM library for NoSQL databases. ORM operations (through the EntityManager interface). JPQL queries (DELETE and UPDATE). On-premises databases: Cassandra, HBase, MongoDB, Oracle NoSQL, Redis, Neo4j, CouchDB, Elasticsearch, MySQL.
  • 7. Why Kundera: open source; developed with extensibility as a primary goal; support for many different NoSQL databases; polyglot persistence; in the field since 2010 with an active community; already used in production.
  • 8. Contributions to Kundera. Two newly developed clients: Azure Tables [1], GAE Datastore [2]. Paradigm shift: off-premises databases → DaaS solutions. Bug fix: Kundera deploy on PaaS.
      [1] https://github.com/deib-polimi/kundera-azure-table
      [2] https://github.com/deib-polimi/kundera-gae-datastore
  • 9. Developed clients.
      master: exploit consistency mechanisms as much as possible. GAE Datastore → no Ancestor Path support. Azure Tables → manage partition key and row key.
      migration: limited support for consistency mechanisms, but achieves interoperability. GAE Datastore → no Ancestor Path support. Azure Tables → fix partition key to table name.
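The two Azure Tables key strategies named on slide 9 can be sketched as follows. This is a minimal illustration, not code from the kundera-azure-table repository: the class `AzureKeySketch`, the `TableKey` type, and both factory methods are hypothetical names. It relies only on the fact that Azure Tables addresses each entity by a (PartitionKey, RowKey) pair.

```java
import java.util.Objects;

/** Hypothetical illustration of the two key strategies; not the actual client code. */
final class AzureKeySketch {

    /** Azure Tables addresses an entity by a (PartitionKey, RowKey) pair. */
    static final class TableKey {
        final String partitionKey;
        final String rowKey;

        TableKey(String partitionKey, String rowKey) {
            this.partitionKey = Objects.requireNonNull(partitionKey);
            this.rowKey = Objects.requireNonNull(rowKey);
        }

        @Override
        public String toString() {
            return partitionKey + "/" + rowKey;
        }
    }

    /** "master" strategy (assumed shape): the application chooses both keys. */
    static TableKey explicitKey(String partitionKey, String rowKey) {
        return new TableKey(partitionKey, rowKey);
    }

    /** "migration" strategy (assumed shape): the partition key is always the table name. */
    static TableKey fixedPartitionKey(String tableName, String entityId) {
        return new TableKey(tableName, entityId);
    }

    public static void main(String[] args) {
        System.out.println(explicitKey("europe", "user-1"));
        System.out.println(fixedPartitionKey("Users", "user-1"));
    }
}
```

With the fixed strategy every entity of a table lands in a single partition, so the identifier carries no store-specific partition information; this is one plausible reading of why the slide associates it with interoperability.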
  • 10. Work objectives: integrate Kundera in the CPIM library; contribute to the open source project Kundera; integrate the migration and synchronization system Hegira; evaluation.
  • 11. CPIM: abstracts application logic from the specific PaaS provider to overcome vendor lock-in. Many supported services: Blob, NoSQL, Memcache, Queue, Mail, SQL.
  • 12. Original CPIM NoSQL service implementation: many JPA providers; duplicated code; no complete code portability; choice of the NoSQL database strictly bound to the cloud provider (e.g. App Engine → Datastore); limited NoSQL database support.
      Diagram labels: CloudEntityManagerFactory, CloudEntityManager; Azure EntityManagerFactory, Azure EntityManager (jpa4Azure); AWS EntityManagerFactory, AWS EntityManager (SimpleJPA); GAE EntityManagerFactory, GAE EntityManager (Google JPA).
  • 13. Kundera integration: single persistence provider; complete code portability; NoSQL support inherited from Kundera; easier configuration through the standard persistence.xml.
      Diagram labels: CloudEntityManagerFactory, CloudEntityManager, Kundera.
  • 14. Work objectives: integrate Kundera in the CPIM library; contribute to the open source project Kundera; integrate the migration and synchronization system Hegira; evaluation.
  • 15. Data migration: move the application to another cloud provider; move data to a database that better fits the requirements; load balancing, system expansion, failure recovery, costs, etc.; modern computer systems are expected to be up continuously; data synchronization between the two involved systems.
  • 16. Hegira support: transparently intercept user operations (DMQ); translate operations to SQL statements; send them to the Hegira commit log.
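The three steps on slide 16 (intercept, translate to SQL, send to the commit log) can be sketched as below. This is an assumption-laden illustration, not the CPIM or Hegira code: `CommitLogSketch`, `persist`, `toInsertStatement`, and the in-memory list standing in for the Hegira commit log are all hypothetical, and the SQL dialect and quoting rules are simplified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch of "intercept, translate to SQL, append to a commit log". */
final class CommitLogSketch {

    /** In-memory stand-in for the Hegira commit log (assumption for illustration). */
    static final List<String> commitLog = new ArrayList<>();

    /** Renders a value as a simplified SQL literal. */
    static String literal(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Number) {
            return value.toString();
        }
        // Escape single quotes by doubling them.
        return "'" + value.toString().replace("'", "''") + "'";
    }

    /** Translates an insert of the given column values into an SQL INSERT statement. */
    static String toInsertStatement(String table, Map<String, Object> columns) {
        String names = String.join(", ", columns.keySet());
        String values = columns.values().stream()
                .map(CommitLogSketch::literal)
                .collect(Collectors.joining(", "));
        return "INSERT INTO " + table + " (" + names + ") VALUES (" + values + ")";
    }

    /** Intercepted persist call: the statement is logged instead of hitting the store directly. */
    static void persist(String table, Map<String, Object> columns) {
        commitLog.add(toInsertStatement(table, columns));
        // A real wrapper would decide here whether to also delegate to the
        // underlying EntityManager; that policy is not described on the slide.
    }

    public static void main(String[] args) {
        Map<String, Object> row = new java.util.LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "O'Brien");
        persist("Employee", row);
        commitLog.forEach(System.out::println);
    }
}
```

The caller keeps using a persist-style call and never sees the SQL text, which is the sense in which the interception is transparent.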
  • 17. Work objectives: integrate Kundera in the CPIM library; contribute to the open source project Kundera; integrate the migration and synchronization system Hegira; evaluation.
  • 18. Cloud Serving Benchmark: framework for evaluating the performance of different NoSQL databases. Compare the Kundera client w.r.t. the use of the low-level API for the same operations: development of a new adapter for operations through Kundera; development of a new adapter for operations through the low-level API.
      Workload: 100,000 entities. The load phase (write) produces the write operation report; the transaction phase (read) produces the read operation report.
  • 19. Environment setup.
      YCSB + YCSB adapters + Kundera GAE Datastore client → Datastore.
      YCSB + YCSB adapters + Kundera Azure Tables client → Azure Tables.
      Machine specifications shown: 4 cores, 7 GB RAM; 4 cores, 3.6 GB RAM.
  • 20. Results comparison.
      Azure Tables
                       Read latency   Read throughput   Write latency   Write throughput
        Kundera        42.44 ms       689.67 ops/sec    40.701 ms       707.12 ops/sec
        low-level API  36.74 ms       787.22 ops/sec    38.809 ms       758.54 ops/sec
        overhead       13.43 %        12.39 %           4.75 %          6.78 %
      Google Datastore
                       Read latency   Read throughput   Write latency   Write throughput
        Kundera        139.13 ms      212.74 ops/sec    151.159 ms      194.64 ops/sec
        low-level API  132.36 ms      222.5 ops/sec     150.018 ms      198.67 ops/sec
        overhead       4.36 %         4.39 %            0.76 %          2.03 %
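The slide does not state how the overhead row was computed. As a sketch under one assumption, taking throughput overhead as the relative drop from the low-level API to Kundera, (low-level − Kundera) / low-level, gives values that round to the four reported throughput figures. The class and method names below are hypothetical, and the latency columns are left out because their formula is not given.

```java
import java.util.Locale;

/** Hypothetical helper: throughput overhead as relative loss versus the low-level API. */
final class OverheadSketch {

    /** Returns (lowLevel - kundera) / lowLevel, expressed as a percentage. */
    static double throughputOverheadPercent(double kunderaOpsPerSec, double lowLevelOpsPerSec) {
        return (lowLevelOpsPerSec - kunderaOpsPerSec) / lowLevelOpsPerSec * 100.0;
    }

    static void report(String label, double kundera, double lowLevel) {
        System.out.println(String.format(Locale.ROOT, "%s: %.2f %%",
                label, throughputOverheadPercent(kundera, lowLevel)));
    }

    public static void main(String[] args) {
        // Throughput values taken from the results table on slide 20.
        report("Azure Tables read", 689.67, 787.22);
        report("Azure Tables write", 707.12, 758.54);
        report("Google Datastore read", 212.74, 222.5);
        report("Google Datastore write", 194.64, 198.67);
    }
}
```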
  • 21. Conclusions. Contributions: integration of Kundera in the CPIM library; new Kundera clients to support Google Datastore and Azure Tables; Hegira integration in the CPIM library. Future work: compare the performance of the developed clients with that of the other clients developed by the Kundera team.
  • 22. THANK YOU 42