J-Fall2013- Lucas Jellema: Integrity in Java apps handouts

The accuracy, internal quality, and reliability of data is frequently referred to using the term 'data integrity'. Without it, data is less valuable or even useless. This session takes a close look at what data integrity entails and how it can be enforced in multi-tier application architectures using distributed data sources and global transactions. The discussion will make clear which elements are required from any robust implementation of data oriented business rules aka data constraints and it will explain how most existing solutions are not as watertight as is frequently assumed. Steps for achieving reliable constraint enforcement are demonstrated.

  • Alternative UI on top of the same data service. Request manipulation. Increasingly complex constraints => consider (also) server-side validation. Using Bean Validation (1.0: JSR 303, Java EE 6; 1.1: JSR 349, Java EE 7). Validation hooks in framework. Custom code.
  • And web application running against it. Numeric data type. Last Name required and no more than 30 characters. Check constraint: Age >= 6. Check 18 and != kids. UK: one session per attendee per slot. UK: one keynote session per slot.
  • Alternative: No more planned session attendances are allowed than the capacity of the room in which the session is scheduled to take place. Violating events: insert of attendance (for session) (update of attendance not allowed; delete cannot violate the constraint); update of designated room (of session); change of room capacity (of room). Let’s focus on the first violating event: insert of attendance. Create a post-insert trigger that calls a function that counts the number of session attendances and compares it with the session’s room capacity; raise an exception when number > capacity. Note: the function can also be called from the middle tier after posting data. If, after post (statement & trigger fire) and before commit, something similar is done in a second session (new attendance, validation) and subsequently both transactions commit, the end result is invalid. Locking is required! Lock down the entire database: no change, no integrity violation! Lock the Attendance table: too prohibitive. Also lock the Rooms and Sessions tables to prevent change of session room assignment and room capacity. More fine-grained: lock on SESSION_ATTENDANCE_ROOM_CAPACITY_CONSTRAINT_. For any transaction trying to perform a data manipulation that potentially violates the SESSION_ATTENDANCE_ROOM_CAPACITY_CONSTRAINT for a certain session in a certain room, validation/enforcement is required; the fine-grained lock needs to be acquired; if it cannot be, somebody else is validating this rule for this room/session combination. When done, the changes are committed and the session can proceed, with all committed changes from the other session.
  • Against JPA. Single thread: constraint enforced. Without lock, two threads: constraint by-passed. Transaction part 1. In database? Using triggers: same problem! DB trick: MV. Statement level plus fine-grained lock. Demo UK.
  • Suppose the UI does not support update of a SessionAttendance. If a person wants to switch from one session to another in a certain slot, (s)he has to create a new one and remove the existing one. In the right order?! Demonstrate. Both steps are DML statements. Constraints are typically enforced at statement level. However, sometimes they can be made deferred [to transaction commit time].
  • Implementation from previous slide. Discuss synchronization: another session (i.e. thread) also manipulated the collection; does the constraint evaluation cover the complete set of data? Depends on timing. To be sure: use synchronized access. To the collection: safe, but slow. To something else? Safe and much more scalable and elegant.
  • UK on FirstName, LastName. Update existing person to John Smith (S1, T1). Update same person in different session (either name or another attribute) => runs into lock. Insert new person John Smith (S2, T2): runs into a lock! A new record blocked by a lock? What is the lock on? Locking on a logical ‘semaphore’ can be done in a very fine-grained way. For example: {EMP_NAME_UK1, John, Smith}
  • Example with conference: register as conference attendee & payment. Conference registration should only succeed when payment is complete. Payment should not be done when registration fails on some business rule.
  • Common with Web Services and BPEL. Not trivial at all! If resources are known to participate in this type of transaction, they could have a logical state on records (staging, reserved, …) that requires confirmation (or compensation) to become truly applied or undone. (WS-Transaction is an attempt.)

Presentation Transcript

  • On the integrity of data in Java Applications Lucas Jellema (AMIS) NLJUG JFall 2013 6th November 2013, Nijkerk, The Netherlands
  • Agenda • What is integrity? • Enforcing data constraints – throughout the application architecture • Transactions • Exclusive Access to … • The Distributed World
  • 3 Definition of Integrity • Truth – Nothing but the truth • The Only Truth • [Degree of] success or completeness of actions is known
  • 4 Sufficient Integrity Integrity Integrity 7,0 48,23 π Uncorrupted 33,0000002 “five” Corruptible 42 Correct Complete Consistent Reliable
  • 5 Conference Application
  • 6 Conference Application Client (HTML 5 & Java Script) Web Tier JavaServer Faces POJO Domain Model Business Tier JPA RDBMS EJB
  • 7 Validation at entry time
  • 8 Validation at entry time Client and View
  • 9 Validation at entry time Client and View
  • 10 More validation at entry time – Bean Validation
  • 11 Validation at entry time Bean Validation in View
  • 12 Engage Bean Validation in Web Tier
  • 13 Record (Type) level rules • Program should be Kids when age < 18; either Developer or Management when age > 18 • Using JavaScript – when either field changes (handle nulls) – on submit of the entire record • Using Bean Validation: custom type validator – in either web-tier or JPA
  • 14 Type Level Constraints with Bean Validation
  • 15 Type Level Bean Validation: Custom Validator
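The record-level rule from slide 13 (Kids program below 18, Developer or Management above 18) can be sketched without any framework. This is a minimal illustration of the check a custom type-level validator would perform, not the code from the slides: the class names `AttendeeForm` and `ProgramAgeRule` and the field names are assumptions made for this example, and a real Bean Validation implementation would wrap the same logic in a constraint annotation and validator class.

```java
import java.util.Optional;

/** Hypothetical stand-in for the attendee record; names are assumptions. */
final class AttendeeForm {
    enum Program { KIDS, DEVELOPER, MANAGEMENT }

    final Integer age;      // may be null while the form is incomplete
    final Program program;  // may be null while the form is incomplete

    AttendeeForm(Integer age, Program program) {
        this.age = age;
        this.program = program;
    }
}

/** Record-level (tuple) rule: it needs two fields of the same record. */
final class ProgramAgeRule {
    /** Returns an error message when the rule is violated, otherwise empty. */
    static Optional<String> validate(AttendeeForm form) {
        // Nulls are left to attribute-level "required" checks.
        if (form.age == null || form.program == null) {
            return Optional.empty();
        }
        boolean kids = form.program == AttendeeForm.Program.KIDS;
        if (form.age < 18 && !kids) {
            return Optional.of("Attendees under 18 must be in the Kids program");
        }
        if (form.age > 18 && kids) {
            return Optional.of("Attendees over 18 must choose Developer or Management");
        }
        // age == 18 is not covered by the rule as stated on the slide.
        return Optional.empty();
    }
}
```

Because the rule only looks at one record, the same function could in principle run in the web tier and again in the business tier; the multi-record rules discussed later cannot be handled this way.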
  • 16 Validation Implementation options & considerations Native Mobile Client Native HTML 5; JavaScript Client (pure HTML 5 & Java Script) Native HTML 5; JavaScript Client (JSF based HTML 5 & Java Script) Custom; Web Tier JSF Validator; Bean JavaServer Faces Validation Custom; Bean Validation RESTful Services POJO Domain Model Business Tier JPA RDBMS EJB Custom; Bean Validation
  • 17 But wait – there is more! • More User Interfaces • More Attendee Instances • More Entities & More types of Constraints • More Users, Sessions, and Transactions • More Nodes in the Middle Tier Cluster • More Data Stores
  • 18 Domain model • Attendee • Speaker • Session • Room • Slot • Attendance – Booked – Realized
  • 19 Multiple-Instances-of-Single-Entity constraints • Constraints that cover multiple same type objects/instances – Attendee’s Registration Id is unique – No more than 5 conference attendees from the same company – Not more than two sessions by the same speaker – At most one session scheduled per room per slot – Only one keynote session in a slot – Sessions from up to a maximum of three tracks can be scheduled in the same room
  • 20 Inter entity constraints • Attendees can only attend one hands-on session during the conference • A person cannot attend another session in a slot in which the session (s)he is speaker of is scheduled • No more planned session attendances are allowed than the capacity of the room in which the session is scheduled to take place • If the room capacity is smaller than 100, then no more than 2 people from the same company may sign up for it • Attendees from Amsterdam cannot attend sessions in room 010 • Common challenge: – Many data change events can lead to constraint violation
  • 21 Event Analysis for Inter Entity Constraint • No more planned session attendances are allowed than the capacity of the room in which the session is scheduled to take place Create, Update (session reference) Update (room reference) Update (capacity [decrease])
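The event analysis above implies that one constraint has to be checked from several different change events. The following is a small in-memory sketch of that idea, assuming a hypothetical `ConferencePlanner` class with simple maps in place of the Room, Session and Attendance entities; it covers creating an attendance, changing a session's room, and changing a room's capacity, and leaves out moving an attendance to another session.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for rooms, sessions and attendances; all names are assumptions. */
final class ConferencePlanner {
    private final Map<String, Integer> capacityByRoom = new HashMap<>();
    private final Map<String, String> roomBySession = new HashMap<>();
    private final Map<String, Integer> attendancesBySession = new HashMap<>();

    /** The single rule: planned attendances may not exceed room capacity. */
    private static void require(int planned, int capacity, String session) {
        if (planned > capacity) {
            throw new IllegalStateException("Session " + session + ": " + planned
                    + " planned attendances exceed room capacity " + capacity);
        }
    }

    /** Event 1: create an attendance for a session. */
    void addAttendance(String session) {
        int planned = attendancesBySession.getOrDefault(session, 0) + 1;
        String room = roomBySession.get(session);
        if (room != null) {
            require(planned, capacityByRoom.getOrDefault(room, 0), session);
        }
        attendancesBySession.put(session, planned);
    }

    /** Event 2: update the room reference of a session. */
    void assignRoom(String session, String room) {
        require(attendancesBySession.getOrDefault(session, 0),
                capacityByRoom.getOrDefault(room, 0), session);
        roomBySession.put(session, room);
    }

    /** Event 3: update the capacity of a room (only a decrease can violate). */
    void setCapacity(String room, int capacity) {
        for (Map.Entry<String, String> e : roomBySession.entrySet()) {
            if (e.getValue().equals(room)) {
                require(attendancesBySession.getOrDefault(e.getKey(), 0), capacity, e.getKey());
            }
        }
        capacityByRoom.put(room, capacity);
    }
}
```

Each mutator validates the would-be state before applying the change, so a rejected change leaves the data untouched. The sketch is single-threaded on purpose; concurrent sessions need the locking discussed further on.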
  • 22 Constraint classification • Based on event analysis (when can the constraint get violated) we discern these categories of constraints – Attribute – Tuple – Entity – Inter Entity • Each category has its own implementation methods, options and considerations – Multi record instance rules cannot meaningfully be enforced in client/web-tier
  • 23 Nous ne sommes pas ‘Sans Famille’
  • 24 Nous ne sommes pas ‘Sans Famille’ Mobile Client Client (pure HTML 5 & Java Script) Client (JSF based HTML 5 & Java Script) Web Tier JavaServer Faces RESTful Services POJO Domain Model Business Tier JPA RDBMS EJB
  • 25 Multiple clients for Data Source Client (pure HTML 5 & Java Script) Mobile Client Client (JSF based HTML 5 & Java Script) Web Tier JavaServer Faces RESTful Services POJO Domain Model EJB Business Tier JPA Mobile Client Client (pure HTML 5 & Java Script) Client (JSF based HTML 5 & Java Script) Web Tier JavaServer Faces RESTful Services POJO Domain Model Business Tier JPA EJB .NET ESB DBA/ Application Admin RDBMS Batch
  • 26 Integrity Enforcement in the Persistent Store • All data is available • Persistent store is the final stop: the buck stops here – Any alternative data manipulation (channel) has to go to the persistent store – Mobile, Batch, DBA, ESB • Built-in (native) mechanisms for constraint enforcement – Productive development, proven robustness, scalable performance – For example: Column Type, PK/UK, FK, Check; trigger • Transactions • Enforcing integrity is integral part of persisting data – Without final validation, persistent store cannot take responsibility for integrity
  • 27 Multiple-Instances-of-Single-Entity constraints • No more than 5 conference attendees from the same company
  • 28 Implementation Consideration for Multiple-Entity-Instance rule • Implementation – how and where? – Is the entire set of data available? – Is all associated info available? – Is the data set stable? – Can the constraint elegantly be implemented (natively? good framework support?) – Are all data access paths covered?
  • 29 Implementing Multi-Instance constraint ‘5 max per company’ Register New Attendee – method A - Ensure L2 Cache is up to date in terms of Attendees (fetch all attendees into cache) - Inspect the collection of attendees for same company - Persist Attendee if collection does not hold 5 (or more) POJO Domain Model Register New Attendee – method B - Select count of attendees in same company from the Data Store - Inspect the long value - Persist Attendee if long is < 5 Business Tier JPA Attendees L2 Cache Attendees
  • 30 Max 5 per company JPA Facade enforcement
  • 31 Max 5 per Company – Flaws in JPA Enforcement • Persist does not [always] ‘post to database’ – When more than one attendee is added in a transaction, prior ones are not counted when the latter are validated Thread 1 POJO Domain Model select count persist select count persist Facade Business Tier JPA Attendees
  • 32 One thread persisting two attendees in a row – no flush
  • 33 Max 5 per Company – Flaws in JPA Enforcement • Persist does not [always] ‘post to database’ – When more than one attendee is added in a transaction, prior ones are not counted when the latter are validated Thread 1 POJO Domain Model select count persist select count persist commit Facade Business Tier JPA Attendees
  • 34 Flush after persist for complete picture
  • 35 JPA Facade enforcement in a multi-threaded world Client HTML 5 & Java Script Session A Client HTML 5 & Java Script Session B Web Tier Thread 1 POJO Domain Model Thread 2 select count persist select count persist Facade Business Tier JPA Attendees
  • 36 JPA Facade enforcement in a multi-threaded world Client HTML 5 & Java Script Session A Client HTML 5 & Java Script Session B Web Tier Thread 1 POJO Domain Model Thread 2 select count persist commit select count persist commit Facade Business Tier JPA Attendees
  • 37 Two threads inter-leaving
  • 38 Database Solution?
  • 39 Data Trick – Materialized View with Check Constraint
  • 40 Transactions • Logically consistent set of data manipulations – Atomic units of work – Succeed or fail together – Any changes inside a transaction are invisible to other sessions/transactions until the transaction completes (commits) – Note: during a transaction, constraints may be violated; the only thing that matters: commit [time] – Transaction ends with successful commit or rollback – In both cases, transaction-owned locks are released • ACID (in RDBMS) – vs BASE (in NoSQL) • Note: post vs. commit with RDBMS – Post means do [all] data manipulation (insert, update, delete) but do not commit [yet] – Only upon commit are changes persisted and published
  • 41 Perfect Integrity
  • 42 Fine grained locking Transaction 1 Transaction 2 insert … ('John','Doe',…) Attendees Unique Key UK1 on (FirstName, LastName)
  • 43 Fine grained locking Transaction 1 insert … ('John','Doe',…) Transaction 2 insert … ('Jane','Doe',…) update <JANE> set firstname ='John' Attendees Unique Key UK1 on (FirstName, LastName)
  • 44 Fine grained locking Transaction 1 insert … ('John','Doe',…) Lock on UK1_JOHN_ DOE Transaction 2 insert … ('Jane','Doe',…) update <JANE> set firstname ='John' commit Attendees Unique Key UK1 on (FirstName, LastName)
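The fine-grained lock in slides 42 to 44 is a lock on a logical key (constraint name plus the values it protects) rather than on a row or table. A minimal sketch of such a lock table follows; `LogicalLockTable`, the `|`-separated key format and the string transaction ids are assumptions chosen for illustration, and a database would keep the equivalent bookkeeping internally.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Locks on logical keys such as {UK1, John, Doe}; names and key format are assumptions. */
final class LogicalLockTable {
    // logical key -> id of the transaction that currently owns it
    private final ConcurrentMap<String, String> ownerByKey = new ConcurrentHashMap<>();

    /** Builds a key from the constraint name and the values it guards. */
    static String key(String constraint, String... values) {
        return constraint + "|" + String.join("|", values);
    }

    /** Returns true when the transaction holds the key afterwards, false when another one owns it. */
    boolean tryAcquire(String key, String transactionId) {
        String current = ownerByKey.putIfAbsent(key, transactionId);
        return current == null || current.equals(transactionId);
    }

    /** Called on commit or rollback: releases everything the transaction owns. */
    void releaseAll(String transactionId) {
        ownerByKey.values().removeIf(transactionId::equals);
    }
}
```

In the slide's scenario, transaction 1 would hold `UK1|John|Doe` after its insert. Transaction 2 can insert Jane Doe freely, but its update to John has to wait (here: `tryAcquire` returns false) until transaction 1 commits or rolls back.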
  • 45 JPA Facade enforcement Exclusive Constraint Checking Client HTML 5 & Java Script Session A Client HTML 5 & Java Script Session B Web Tier Thread 1 POJO Domain Model Thread 2 take lock select count persist Facade commit take lock… select count rollback Business Tier JPA LockMgr ATT_MAX Attendees
  • 46 Two threads and Lock on Constraint
  • 47 Two threads and Lock on Constraint
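The "take lock, select count, persist, commit" sequence from slide 45 can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the demo code: `AttendeeFacade` is hypothetical, an in-memory list stands in for the Attendees table, and a `ReentrantLock` per company plays the role of the slide's LockMgr / ATT_MAX. In a real facade the lock would have to be held until the database transaction has committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

/** Enforces "max 5 attendees per company" with one lock per constraint instance. */
final class AttendeeFacade {
    static final int MAX_PER_COMPANY = 5;

    // Stand-in for the Attendees table: one entry per attendee, holding the company name.
    private final List<String> attendeeCompanies = new ArrayList<>();
    // One lock per (constraint, company) so unrelated companies do not block each other.
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Returns true when the attendee was registered, false when the rule rejected it. */
    boolean register(String company) {
        ReentrantLock lock = locks.computeIfAbsent("ATT_MAX|" + company, k -> new ReentrantLock());
        lock.lock();                    // take lock
        try {
            if (countFor(company) >= MAX_PER_COMPANY) {
                return false;           // rule violated: "rollback"
            }
            persist(company);           // insert while still holding the lock
            return true;                // "commit", then release in finally
        } finally {
            lock.unlock();
        }
    }

    long countFor(String company) {
        synchronized (attendeeCompanies) {
            return attendeeCompanies.stream().filter(company::equals).count();
        }
    }

    private void persist(String company) {
        synchronized (attendeeCompanies) {
            attendeeCompanies.add(company);
        }
    }
}
```

Without the per-company lock, two threads could both read a count of 4 and both insert, which is the interleaving shown on slide 37. A JVM-local lock like this only helps on a single node; a cluster needs a lock manager that is shared across JVMs.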
  • 48 Distributed or Global Transaction • One logical unit of work - involving data manipulations in multiple resources (global transaction composed of local transactions) Mobile Client Client (pure HTML 5 & Java Script) Client (JSF based HTML 5 & Java Script) Web Tier JavaServer Faces RESTful Services POJO Domain Model RDBMS EJB Business Tier RDBMS JCA JMS ERP
  • 49 Implementation for Distributed Transaction • Typical approach: two-phase commit – Each resource locks and validates – then reports OK or NOK back to the transaction overseer – When all resources have indicated OK then phase two: all resources commit and release locks – When one or more resources signal NOK, then phase two: all resources roll back/undo changes and release locks • With regards to integrity: – With a distributed transaction, the integrity for each participant is handled as before; this will result in ‘constraint-locks’ in multiple separate resources
  • 50 Distributed (aka global) transaction inside container • Java EE containers (and various non-EE JTA implementations) support global (distributed) transactions within a JVM – JTA (JSR-907) – based on X/Open XA architecture • Key elements are the Transaction Monitor (the container) and Resource Managers (JDBC, EJB, JMS, JCA) • One non-XA resource can participate (file system, email, …) in a global transaction: – All XA-resources perform Phase One – The non-XA resource does its thing – Upon success of the non-XA resource: others perform Phase Two by committing – Upon failure of the non-XA resource: others roll back
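The two-phase commit flow described on slide 49 can be sketched as a tiny coordinator. The `Participant` interface and the class names below are assumptions made for this illustration; they are not the JTA or XA API, and the sketch ignores timeouts, crashes and recovery, which are what make real transaction managers hard.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical resource contract for this sketch (not the XAResource interface). */
interface Participant {
    boolean prepare();   // phase one: lock and validate, then vote OK (true) or NOK (false)
    void commit();       // phase two after a unanimous OK: commit and release locks
    void rollback();     // phase two after any NOK: undo changes and release locks
}

final class TwoPhaseCoordinator {
    /** Returns true when the global transaction committed, false when it rolled back. */
    static boolean run(List<? extends Participant> participants) {
        boolean allOk = true;
        for (Participant p : participants) {
            if (!p.prepare()) {      // a single NOK decides the outcome
                allOk = false;
                break;
            }
        }
        for (Participant p : participants) {
            if (allOk) {
                p.commit();
            } else {
                p.rollback();        // sent to everyone; harmless for resources not yet prepared
            }
        }
        return allOk;
    }
}

/** Test double that records the calls it receives. */
final class RecordingParticipant implements Participant {
    private final boolean vote;
    final List<String> log = new ArrayList<>();

    RecordingParticipant(boolean vote) { this.vote = vote; }

    @Override public boolean prepare() { log.add("prepare"); return vote; }
    @Override public void commit() { log.add("commit"); }
    @Override public void rollback() { log.add("rollback"); }
}
```

Each participant keeps its own constraint locks between `prepare` and phase two, which is the point the slide makes about integrity: the locks now live in several separate resources at once.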
  • 51 Distributed transactions across/outside containers Step 2: Payment Mobile Client Client (pure HTML 5 & Java Script) Client (JSF based HTML 5 & Java Script) Web Tier JavaServer Faces RESTful Services POJO Domain Model Business Tier JPA RDBMS EJB
  • 52 Distributed transactions across/outside containers • Transaction involving remote containers, Web Services, File System or any stateless transaction participant • There is no actual common, shared vehicle (like a global XA transaction) – There is not really a coordinated two-phase commit • Transaction consists of – Any resource does its thing – lock, validate, commit (or rollback), report back – If all resources report success: great, done – If one resource reports failure then all other resources should perform ‘compensation’ – i.e. rollback/undo effects of a committed transaction commit Container Local Enterprise Resource Transaction compensate commit Remote/Stateless Enterprise Resource Remote/Stateless Enterprise Resource
  • 53 Compensation • How to implement a compensation mechanism? • How long after the commit can compensation be requested? • What is the state of the enterprise resource between commit and the compensation expiry time? • Should the invoker notify the resource that compensation is no longer required (so the ‘logical locks’/’temporary state’ can be updated) – i.e. the global distributed transaction has successfully completed commit compensate Enterprise Resource
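One possible shape for a compensation mechanism is sketched below: every step commits locally, and when a later step fails, the steps that already committed are compensated in reverse order. The `CompensableStep` interface and helper classes are assumptions for this example. The sketch does not answer the questions on the slide about expiry time or intermediate state; it only shows the control flow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical contract: a step commits on its own and can undo itself later. */
interface CompensableStep {
    void execute() throws Exception;  // lock, validate and commit locally
    void compensate();                // undo the effects of an already committed execute()
}

final class CompensatingTransaction {
    /** Returns true when every step committed, false when compensation was triggered. */
    static boolean run(List<? extends CompensableStep> steps) {
        Deque<CompensableStep> committed = new ArrayDeque<>();
        for (CompensableStep step : steps) {
            try {
                step.execute();
                committed.push(step);
            } catch (Exception e) {
                // Undo in reverse order; the failed step itself never committed.
                while (!committed.isEmpty()) {
                    committed.pop().compensate();
                }
                return false;
            }
        }
        return true;
    }
}

/** Test double that writes what happened to a shared log. */
final class NamedStep implements CompensableStep {
    private final String name;
    private final boolean fails;
    private final List<String> log;

    NamedStep(String name, boolean fails, List<String> log) {
        this.name = name;
        this.fails = fails;
        this.log = log;
    }

    @Override public void execute() throws Exception {
        if (fails) {
            log.add(name + ":failed");
            throw new Exception(name + " failed");
        }
        log.add(name + ":committed");
    }

    @Override public void compensate() { log.add(name + ":compensated"); }
}
```

Applied to the conference example, a "register" step followed by a failing "pay" step would leave the log as `register:committed`, `pay:failed`, `register:compensated`. Between the first commit and the compensation, other sessions can see the registration, which is why the notes suggest a logical state such as "reserved" on the record.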
  • 54 RESTful transaction is a distributed transaction Client Resource A Resource B Domain Model/JPA Cache Resource C
  • 55 RESTful transaction is a distributed transaction Client Resource A Resource B Domain Model/JPA Resource C
  • 56 Distributed Constraints • Constraints that involve data collections in multiple enterprise resources Mobile Client Client (pure HTML 5 & JS) Client (JSF based HTML 5 & Java Script) Web Tier JavaServer Faces RESTful Services POJO Domain Model RDBMS Table Y Business Tier RDBMS Table X EJB JCA JMS ERP
  • 57 Distributed Constraints • Not more than three attendees (resource A) from the same company may attend a session (resource B) – Insert/Update Attendance requires validation – as does update of Attendee.company Client Client Web Tier Java EE Business Tier Client Web Tier MAX_3_COMP_ATT Java EE Business Tier Distributed Lock Manager ATTENDEES ATTENDANCES
  • 58 Distributed Constraints • Not more than three attendees (resource A) from the same company may attend a session (resource B) – Insert/Update Attendance requires validation – as does update of Attendee.company Client Client Web Tier Java EE Business Tier Client Web Tier MAX_3_COMP_ATT Java EE Business Tier Distributed Lock Manager ATTENDEES ATTENDANCES
  • 59 Distributed Constraints • Not more than three attendees (resource A) from the same company may attend a session (resource B) – Insert/Update Attendance requires validation – as does update of Attendee.company Client ESB Client Web Tier Java EE Business Tier Client Web Tier MAX_3_COMP_ATT Java EE Business Tier Distributed Lock Manager ATTENDEES ATTENDANCES
  • 61 Java global (distributed) lock managers • Within JVM: SynchronousQueue • Across JVMs: Apache ZooKeeper, HazelCast, Oracle Coherence, … JVM JVM JVM
  • 62 Summary • Which level of integrity is required? • Change undermines integrity – Data change is trigger for constraint validation • Exclusive lock on multi-record validation – released when transaction commits • Ensure that all data access paths are covered – Not all data manipulations may come through the Java middle tier • Transactions may include multiple enterprise resources – That may not be able to participate in a distributed transaction and have to support a compensation mechanism • True integrity and real robustness are very hard to achieve – Much harder than is commonly assumed
  • 64 Handling Integrity Really Well...
  • Lucas Jellema (AMIS) Email: lucas.jellema@amis.nl Twitter: @lucasjellema Blog: http://technology.amis.nl Website: http://www.amis.nl