Role of data models in the Virtual Observatory, and an overview of IVOA data models. Part of the Virtual Observatory course by Juan de Dios Santander Vela, as taught in the MTAF (Métodos y Técnicas Avanzadas en Física, Advanced Methods and Techniques in Physics) Master at the University of Granada (UGR).
The talk presents a new Data Aggregation System for the CMS experiment at CERN. We use a MongoDB database as a caching layer to query multiple data providers (backed by RDBMSs) and aggregate data across them.
The talk was presented at the ICCS 2010 conference.
Balancing Replication and Partitioning in a Distributed Java Database (Ben Stopford)
This talk, presented at JavaOne 2011, describes the ODC, a distributed, in-memory database built in Java that holds objects in a normalized form in a way that alleviates the traditional degradation in performance associated with joins in shared-nothing architectures. The presentation describes the two patterns that lie at the core of this model. The first is an adaptation of the Star Schema model, used to hold data either replicated or partitioned, depending on whether the data is a fact or a dimension. In the second pattern, the data store tracks arcs on the object graph to ensure that only the minimum amount of data is replicated. Through these mechanisms, almost any join can be performed across the various entities stored in the grid, without the need for key shipping or iterative wire calls.
Mindtree is one of the first IT service providers to invest in emerging technologies and has developed various technology assets. Customers in product engineering services benefit heavily from our domain expertise.
Some of the technology assets developed include short-range wireless connectivity technologies such as Bluetooth and UWB, Video Analytic Algorithms, Acoustic Echo Cancellation, Audio Codecs, VoIP Stacks, etc.
Amit Sheth, "Semantic Interoperability and Information Brokering in Global Information Systems," keynote given at IEEE Meta-Data, Bethesda, MD, April 6, 1999.
Java Tech & Tools | Beyond the Data Grid: Coherence, Normalisation, Joins and... (JAX London)
2011-11-02 | 02:25 PM - 03:15 PM
In 2009, RBS set out to build a single store of trade and risk data that all applications in the bank could access simultaneously. This talk discusses a number of novel techniques that were developed as part of this work. Based on Oracle Coherence, the ODC departs from the trend set by most caching solutions by holding its data in a normalised form, making it both memory efficient and easy to change. However, it does this in a novel way that supports most arbitrary queries without the usual problems associated with distributed joins. We'll be discussing these patterns as well as others that allow linear scalability, fault tolerance, and millisecond latencies.
Detecting and Recognising Highly Arbitrary Shaped Texts from Product Images (Databricks)
Extracting texts of various sizes, shapes, and orientations from images containing multiple objects is an important problem in many contexts: e-commerce, augmented-reality assistance in natural scenes, content moderation on social media platforms, etc. At the scale at which Walmart operates, the text in a product image can be a richer and more accurate source of data than human input, and can be used in several applications such as attribute extraction, offensive-text classification, and compliance use cases. Accurately extracting text from product images is a challenge, given that product images come with a lot of variation, including small, highly oriented, arbitrarily shaped texts with fancy fonts. Typical word-level text detectors fail to capture these variations, and even when such texts are detected, recognition models without transformation layers fail to recognize and accurately extract highly oriented or arbitrarily shaped texts.
Embracing Observability in CI/CD with OpenTelemetry (Cyrille Le Clerc)
Discover how observability and OpenTelemetry offer new solutions for both CI/CD administrators and dev teams to troubleshoot CI platforms and solve many more problems, thanks to a vibrant community and a growing ecosystem. Using real-life CI/CD pipelines built with Jenkins, Maven, and Ansible, we will see how OpenTelemetry helps troubleshoot software delivery pipelines, and how its open-source, standards-based nature enables a vibrant ecosystem of OpenTelemetry-aware CI/CD tools that observe the entire software supply chain and help DevOps teams solve problems that go well beyond the usual observability use cases.
https://community.cncf.io/events/details/cncf-cloud-native-canada-presents-november-2021-eastern-canadian-cncf-meetup-kubernetes-123-release-update-and-cicd-observability/
Open Source Monitoring for Java with JMX and Graphite (GeeCON 2013) (Cyrille Le Clerc)
Fast feedback from monitoring is key to Continuous Delivery. JMX is the right Java API for it, but it has unfortunately stayed underused and underappreciated because it was difficult to connect to monitoring and graphing systems.
Throw the poor solutions based on log files and weakly secured web interfaces into the sin bin! A new generation of open-source tooling makes it easy to graph Java application metrics and integrate them into traditional monitoring systems like Nagios.
Following the logic of DevOps, we will look at how best to integrate the monitoring dimension into a project: from design to development, to QA, and finally to production, both in traditional deployments and in the Cloud.
Come and discover how the JmxTrans-Graphite duo can make your life easier.
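As a taste of what JMX exposes, here is a minimal, self-contained sketch (standard JDK only; the class and method names are mine) that reads the heap-usage attribute a tool like JmxTrans polls and forwards to Graphite:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class JmxHeapProbe {
    // Reads the JVM's heap usage through the platform MBeanServer --
    // the same JMX attribute a JmxTrans query would poll periodically.
    static long usedHeapBytes() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName memory = new ObjectName("java.lang:type=Memory");
        CompositeData heap =
                (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
        return (Long) heap.get("used");
    }

    public static void main(String[] args) throws Exception {
        // A metric shipper would send this value to Graphite on a schedule.
        System.out.println("heap.used=" + usedHeapBytes());
    }
}
```

The value is a plain number, which is exactly why graphing tools integrate so easily once the JMX plumbing is in place.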
Similar to GeeCon 2011 - NoSQL and In Memory Data Grids from a developer perspective
Open Source Monitoring for Java with JmxTrans, Graphite and Nagios - DevoxxFR... (Cyrille Le Clerc)
The fast feedback offered by monitoring is an essential element of good Continuous Delivery practice. The Java ecosystem has a robust component dedicated to this: JMX.
However, the difficulty of connecting JMX to supervision and graphing tools has long hindered its adoption.
Throw away the shaky solutions based on application logs or poorly protected web interfaces, and come discover an open path. A new generation of open-source tools makes it simple to graph your applications' metrics and feed them to a supervision and alerting system.
In a DevOps spirit, we will look together at how to integrate the monitoring dimension into a project: from the design of metrics by developers, to the integration of the needs of Ops and QA teams, in traditional deployments or in the Cloud. JmxTrans, Graphite, and Nagios: this trio can make your life easier; come discover how.
Demo application: http://demo-cocktail.jmxtrans.cloudbees.net
Demo application source code: https://github.com/jmxtrans/embedded-jmxtrans-samples/tree/master/embedded-jmxtrans-webapp-coktail
Embedded JmxTrans: https://github.com/jmxtrans/embedded-jmxtrans
Paris NoSQL User Group - In Memory Data Grids in Action (without transactions...) (Cyrille Le Clerc)
In Memory Data Grids in Action with Oracle Coherence, presented to NoSQL users.
The "transactions" chapter is missing, as it was rescheduled to another session.
Production-Ready Java Applications: Best Practices (Cyrille Le Clerc)
Best practices for Java applications that are ready for production.
The stakes:
* Improve application availability
* Shorten the project life cycle
* Improve the platforms
* Reduce operating costs
The key areas:
* Deployment
* Supervision and monitoring
* Log management
* Robustness
* Organization
Cyrille Le Clerc (Xebia), Erwan Alliaume (Xebia), and Jean Michel Bea (FastConnect) presented the principles of the Data Grid to the Paris Java User Group.
Distributed cache, Network Attached Memory, Data Grid, and Cloud Computing are fashionable terms that all belong to the same trend.
During this evening we will present the path that led us from a simple Ehcache to grids of hundreds of gigabytes of data spread across data centers.
DISTRIBUTED CACHES
Distributed caches have become commonplace with the open-source frameworks JBoss Cache and distributed Ehcache. Where do we stand today?
- What are the use cases for a distributed cache? What gains can we expect from it?
- How do we migrate from a local cache to a distributed cache? Are our frameworks suited to these distributed caches?
- How does a distributed cache work?
NETWORK ATTACHED MEMORY
The Network Attached Memory concept took off in the Java world with Terracotta, offering our applications a memory space that was unimaginable not long ago. What lies behind it?
- What are the use cases for Network Attached Memory technologies?
- Doesn't this virtually infinite memory introduce constraints?
- If memory is shared, what about processing?
- What are the prospects for Network Attached Memory technologies?
DATA GRID
The data grid concept was popularized by services such as Google BigTable and Amazon S3, but also by sites like eBay that announce gigantic data centers. Will it reach mainstream IT?
- What is a data grid? How does it work?
- Who needs a data grid? Is it reserved for hyper-scalable sites like eBay or Facebook? How did we manage before? Do I need one?
- How should an application be structured to use a data grid? Does it change the way we program?
- Is MapReduce a pattern usable with a data grid? Is it the only one?
- Will data grids replace traditional databases? How can they coexist?
DATA GRID, CLOUD, AND THE OTHERS
Data Grid, Grid Computing, Cloud Computing, and eXtreme Transaction Processing (XTP) are frequently associated.
How does the Data Grid relate to these technologies?
How are the players in this space positioned? Amazon S3 & EC2? Coherence? GigaSpaces? Google App Engine & BigTable? GridGain? Terracotta? WebSphere eXtreme Scale?
And where do mainframes fit into all this?
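The MapReduce question above can be illustrated with a tiny sketch (all names are mine; plain Java streams stand in for a real grid): each partition computes a local result, and only the small partials cross the network.

```java
import java.util.Arrays;
import java.util.List;

public class GridMapReduce {
    // Hypothetical illustration of the map-reduce pattern on a data grid:
    // each "partition" computes a local sum (map), then the client combines
    // the partial results (reduce), instead of shipping all the data around.
    static int totalInventory(List<List<Integer>> partitions) {
        return partitions.parallelStream()                                   // one task per partition
                .mapToInt(p -> p.stream().mapToInt(Integer::intValue).sum()) // local map + sum
                .sum();                                                      // global reduce
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(3, 5), Arrays.asList(7), Arrays.asList(2, 8));
        System.out.println(totalInventory(partitions)); // prints 25
    }
}
```

Real grids (Coherence, GigaSpaces, eXtreme Scale) run the map step on the nodes that own the data, which is the whole point of co-locating processing and storage.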
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud, and open source: exploring how these areas are likely to mature and develop over the short and long term, and how organisations can position themselves to adapt and thrive.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality (Inflectra)
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Neuro-symbolic is not enough, we need neuro-*semantic* (Frank van Harmelen)
Neuro-symbolic (NeSy) AI is on the rise. However, simply applying machine learning to just any symbolic structure is not sufficient to really harvest the gains of NeSy. These gains will only be realized when the symbolic structures have an actual semantics. I give an operational definition of semantics as "predictable inference".
All of this is illustrated with link prediction over knowledge graphs, but the argument is general.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Transcript: Selling digital books in 2024: Insights from industry leaders - T... (BookNet Canada)
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
JMeter webinar - integration with InfluxDB and Grafana (RTTS)
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring of JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
8. On the Web side
Similar needs for Web giants:
• Huge amount of data
• High availability
• Fault tolerance
• Scalability on commodity hardware
- Amazon (< 40 min of unavailability per year): created Dynamo
- Google (stores every webpage of the Internet): created BigTable & MapReduce
9. Amazon: the birth of Dynamo
Order pipeline: Fill cart → Checkout → Payment → Process order → Prepare → Send
- Filling the cart requires high availability; a key-value store is enough
- The later steps require complex requests; temporary unavailability is acceptable
10. On the Financial side
Needs within financial markets:
• Very low latency
• Rich queries & transactions
• Scalability
• Data consistency
- Coherence, released in 2001: started as a distributed cache
- GigaSpaces XAP, released in 2001: routes the request to where the data lives
17. Partitioned Data Modeling
[Diagram: Booking, Passenger (name, reduction), Seat (number, price), Train (code, type), TrainStop (date), TrainStation (code, name)]
Typical relational data model
18. Partitioned Data Modeling
Partitioning-ready entity tree
[Diagram: the same entities arranged as a tree under a root entity (Train), with the sub-entities duplicated in each partition and TrainStation kept as referenced data]
Find the root entity and denormalize
19. Partitioned Data Modeling
Remove unused data
[Diagram: the entity tree again; Booking and Passenger (name, reduction) are dropped, and Seat gains a booked flag alongside number and price]
20. Partitioned Data Modeling
Sharding-ready data structure
[Diagram: Train (code, type) as the root, with Seat (number, price, booked), TrainStop (date), and TrainStation (code, name)]
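The sharding-ready structure of slide 20 can be sketched as plain Java classes. Field names follow the diagram; the class layout and the main method are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Train is the root entity: the whole aggregate (seats, stops) is keyed by
// the train code, so it lives in a single partition and joins stay local.
class Seat {
    int number;
    double price;
    boolean booked;
}

class TrainStop {
    String stationCode; // reference to a TrainStation, duplicated per partition
    String date;
}

public class Train {
    String code; // partition key: all data for one train is co-located
    String type;
    List<Seat> seats = new ArrayList<>();
    List<TrainStop> stops = new ArrayList<>();

    public static void main(String[] args) {
        Train t = new Train();
        t.code = "TGV-1"; // hypothetical train code
        Seat s = new Seat();
        s.number = 12;
        s.price = 45.0;
        t.seats.add(s);
        System.out.println(t.code + " seats=" + t.seats.size());
    }
}
```

Because every query for a given train hits exactly one partition, no cross-node join is ever needed.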
33. Request Driven Data Modeling
• Relational data modeling is business driven
Adaptation to requests comes with tuning
• With partitioning, data modeling has to be adapted to the requests
Because network latency matters
• NoSQL & data grid data modeling is request driven
Two requests may require storing the data twice
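"Storing the data twice" for two request paths can be sketched as follows (all names are illustrative; two in-memory maps stand in for a key-value store):

```java
import java.util.HashMap;
import java.util.Map;

public class RequestDrivenStore {
    // In a request-driven model the same profile is written twice -- once per
    // query path -- because a key-value store can only fetch by key, and a
    // join or secondary-index lookup would cost extra network hops.
    final Map<String, String> byId = new HashMap<>();
    final Map<String, String> byEmail = new HashMap<>();

    void saveProfile(String id, String email, String profileJson) {
        byId.put(id, profileJson);        // serves "GET profile by id"
        byEmail.put(email, profileJson);  // serves "GET profile by email"
    }

    public static void main(String[] args) {
        RequestDrivenStore store = new RequestDrivenStore();
        store.saveProfile("42", "john@doe.com", "{\"name\":\"John\"}");
        System.out.println(store.byEmail.get("john@doe.com"));
    }
}
```

The trade-off is classic denormalization: writes get heavier so that each read stays a single key lookup.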
34. Key-Value Store
Persistence options: in memory; in memory with async persistence; persistent
35. Example with a user profile
johndoe → user profile as byte[]
Similar to a Java HashMap
36. Write Example with Riak
RiakClient riak = new RiakClient("http://server1:8098/riak");
RiakObject userProfileObj =
    new RiakObject("bucket", "johndoe", serializer.serialize(userProfile));
riak.store(userProfileObj);
Inserts a user profile into Riak
37. Read Example with Riak
FetchResponse response = riak.fetch("bucket", "johndoe");
if (response.hasObject()) {
userProfileObj = response.getObject();
}
Fetches a user profile by its key from Riak
39. Column Families Store
For each row ID we have a list of key-value pairs
Key-value pairs are sorted by keys
[Diagram: relational DB vs. column families DB]
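An in-memory analogy of a column family (illustrative only; a TreeMap per row gives the sorted-columns property the slide describes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class ColumnFamilySketch {
    // Each row key maps to a column map kept sorted by column name --
    // this sorting is what makes slice queries (ranges of columns) cheap.
    final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Returns the columns in [from, to) for a row, like a Cassandra slice.
    SortedMap<String, String> slice(String rowKey, String from, String to) {
        return rows.getOrDefault(rowKey, new TreeMap<>()).subMap(from, to);
    }

    public static void main(String[] args) {
        ColumnFamilySketch cf = new ColumnFamilySketch();
        cf.put("johndoe", "17:21", "Iphone");
        cf.put("johndoe", "17:32", "DVD Player");
        cf.put("johndoe", "17:44", "MacBook");
        System.out.println(cf.slice("johndoe", "17:00", "17:40"));
    }
}
```

This mirrors the shopping-cart example on the next slide: the row key is the user, the column names are timestamps, and a slice fetches a time range.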
40. Example with a shopping cart
johndoe:   17:21 → Iphone, 17:32 → DVD Player, 17:44 → MacBook
willsmith: 6:10 → Camera, 8:29 → Ipad
pitdavis:  14:45 → PlayStation, 15:01 → Asus EEE, 15:03 → Iphone
41. Write Example with Cassandra
Cluster cluster =
HFactory.getOrCreateCluster("cluster", new CassandraHostConfigurator("server1:9160"));
Keyspace keyspace = HFactory.createKeyspace("EcommerceKeyspace", cluster);
Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);
mutator.insert("johndoe", "ShoppingCartColumnFamily",
HFactory.createStringColumn("14:21", "Iphone"));
Inserts a column into the ShoppingCartColumnFamily
42. Read Example with Cassandra
SliceQuery<String, String, String> query =
HFactory.createSliceQuery(keyspace,
stringSerializer, stringSerializer, stringSerializer);
query.setColumnFamily("ShoppingCartColumnFamily")
.setKey("johndoe")
.setRange("", "", false, 10);
QueryResult<ColumnSlice<String, String>> result = query.execute();
Reads a slice of 10 columns from the ShoppingCartColumnFamily
44. Example with an item of a catalog
item_1 →
{
  "name": "Iphone",
  "price": 559.0,
  "vendor": "Apple",
  "rating": 4.6,
  "tags": [ "phone", "touch" ]
}
The database is aware of the document's fields and can offer complex queries
45. Write Example with MongoDB
Mongo mongo = new Mongo("mongos_1", 27017);
DB db = mongo.getDB("Ecommerce");
DBCollection catalog = db.getCollection("Catalog");
BasicDBObject doc = new BasicDBObject();
doc.put("name", "Iphone");
doc.put("price", 559.0);
catalog.insert(doc);
Inserts an item document into MongoDB
46. Read Example with MongoDB
BasicDBObject query = new BasicDBObject();
query.put("price", new BasicDBObject("$lt", 600));
DBCursor cursor = catalog.find(query);
while(cursor.hasNext()) {
System.out.println(cursor.next());
}
Queries for all items with a price lower than 600
48. Example with train booking with IBM eXtreme Scale
@Entity(schemaRoot=true)
public class Train {
    @Id
    String code;

    @Index
    @Basic
    String name;

    @OneToMany(cascade=CascadeType.ALL)
    List<Seat> seats = new ArrayList<Seat>();

    @Version
    int version;
    ...
}
[Diagram: Seat (number, price, booked), Train (code, type), TrainStop (date)]
With Data Grids, sub-entities can have cross relations
49. Write Example with IBM eXtreme Scale
eXtreme Scale provides a JPA-style API
void persist(Train train) {
    entityManager.persist(train);
}
Inserts a train into eXtreme Scale
50. Read Example with IBM eXtreme Scale
/** Find by key */
Train findById(String id) {
return (Train) entityManager.find(Train.class, id);
}
/** Query Language */
Train findByTrain(String code) {
Query q = entityManager.createQuery("select t from Train t where t.code=:code");
q.setParameter("code", code);
return (Train) q.getSingleResult();
}
Simple and complex queries with eXtreme Scale
51. More APIs
• Another Java EE versus Spring battle? JSR 347 Data Grids vs. Spring Data
• A unified API on top of relational, document, column, and key-value stores?
• An object-to-tuple projection API
64. Transactions with Manual Compensation
• Code "do" & "undo" & chain execution
• What about interrupted chain execution? Data corruption?
65. Transactions with Manual Compensation
• Code "do" & "undo" & chain execution
• What about interrupted chain execution? Data corruption?
• Data-store-managed transaction chain execution
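The do/undo chain of slide 64 can be sketched as follows (interface and class names are mine). Note that a crash between a completed step and the undo replay still corrupts data, which is exactly why slide 65 moves chain execution into the data store:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CompensationChain {
    // Each completed step pushes itself onto an undo stack; if a later step
    // fails, the compensating actions are replayed in reverse order.
    interface Step {
        void doIt();
        void undoIt();
    }

    static boolean run(Step... steps) {
        Deque<Step> done = new ArrayDeque<>();
        try {
            for (Step s : steps) {
                s.doIt();
                done.push(s);
            }
            return true; // whole chain succeeded
        } catch (RuntimeException e) {
            // Compensate completed steps in reverse order.
            while (!done.isEmpty()) {
                done.pop().undoIt();
            }
            return false;
        }
    }
}
```

If the process dies mid-chain, the in-memory undo stack is lost; a store-managed chain survives because the pending compensations are persisted alongside the data.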
67. Key-Value Store
• Get and set by key
Simple, but enough for a lot of use cases
• Riak and Voldemort provide great scalability
Great for persisting continuously growing datasets
• Memcached and Redis offer low overhead and latency
Great for caches and live data
68. Column Families Store
• Get and set by key of a list of columns
Makes it possible to fetch and update partial data
• Queries are simple, but column slice fetching is possible
Great for pagination
• The data model is too low level for much complex data modeling
Should typically be used for the largest scalability needs
69. Document Store
• Schemaless
Great for continuously evolving schemas
• Complex queries are available
Necessary for filtering and search
• Scalability may be limited when not querying by partition key
Can be handled using multiple stores and limited queries
70. In Memory Data Grid
• Very low latency & eXtreme Transaction Processing (XTP)
Investment banking, booking & inventory systems
• In memory, no persistence
Most of the time backed by a database
• High budget and developer skills required
Some open-source alternatives are appearing
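"Backed by a database" is commonly done as a read-through cache. A minimal sketch (illustrative names; a `Function` stands in for the backing database):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ReadThroughCache {
    // Misses fall through to a loader (the database); hits stay in memory.
    final Map<String, String> memory = new HashMap<>();
    final Function<String, String> database; // stands in for the backing store
    int databaseReads = 0;

    ReadThroughCache(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        return memory.computeIfAbsent(key, k -> {
            databaseReads++;               // only misses touch the database
            return database.apply(k);
        });
    }

    public static void main(String[] args) {
        ReadThroughCache cache = new ReadThroughCache(k -> "profile-of-" + k);
        cache.get("user:1"); // miss: loaded from the "database"
        cache.get("user:1"); // hit: served from memory
        System.out.println("database reads: " + cache.databaseReads);
    }
}
```

Grid products pair this with write-through or write-behind so the database stays the durable copy while the grid absorbs the read load.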
71. Polyglot storage for eCommerce
One application, one store per need:
- Product search: Solr
- Product catalog: MongoDB
- User account and shopping cart: Cassandra
- Warehouse inventory: Coherence
72. Why NoSQL & Data Grids matter
• Polyglot storage: databases that fit the needs of every type of data
• Linear scalability: being able to handle any further business requirements
• High availability: multi-server and multi-datacenter
• Elasticity: natural integration with the Cloud Computing philosophy
• Some new use cases are now within reach