The document discusses several key aspects of building distributed systems including:
- Separating the UI layer from the data store to allow different data sources.
- Storing query data in a denormalized store, where data duplication is acceptable and calculations are done before the data is written.
- Validating commands before the domain model is called, and using the query store for preliminary validation to reduce command rejection.
- Differentiating between correcting data vs new information to properly handle updates.
23. • Data duplication.
• Avoid calculations on data.
• Deploy to web tier (no need to hop through firewalls; only SELECT is allowed).
• Role-based security not needed.
• Use for preliminary validation, which results in less command rejection.
26. • Validations
– Is the input potentially good?
– Structured correctly
– Ranges, lengths etc.
• Rules
– Should we do this?
– Based on the current system state
– What the user saw is irrelevant
29. • Differentiating between
– Correcting a shipping address
– User has moved, rerouting shipment location
• Sometimes users accidentally modify fields by tabbing.
32. • Checkbox based UI
• Why do users need bulk booking?
• But then concurrency happens
• Then book seats somewhere else
33. • Group reservation
• Enter number of people and preferred
seating location
• System emails back when reservation is
confirmed
34. • On submit from browser, show comment using AJAX on page.
• Why wait for a response from the server?
• Does the user call all his friends and tell them about the comment he posted?
36. • Validations:
– Commands are validated before the domain
model is called
• Queries:
– Entity relations for reading are unnecessary
37. • In addition to doing what the command
said, doing other things as well.
38. • The domain model is not responsible for
persisting all the data. Only persist what is
needed.
• The rest of the world is using the data from
the query store anyway
Editor's Notes
Try to suspend your disbelief for the next hour or so, in order to let all the pieces click into place. Think about the engineers at Apple.
Try to suspend your disbelief. Think about what the iPhone development team told their manager: we're going to build a phone without buttons.
Some history of how we started computing
Paper ruled the world
UIs for data entry were designed for a typing interface. The source of truth was paper. The computer never said no. There was no validation, because the paper was right.
Now that paper is gone, all of a sudden the machine knows best, and the user doesn't have a piece of paper to prove otherwise.
The systems we are building today are not data-entry driven. The user tries to do something with the system, and the system has a lot to say about it: you can do this with this data, you can't do that with the data. Business users said: we have this amount of data and we need to protect it, because we have worked hard for it. We need to put in validations and authorization. But we are using the same UI, the same basis of system interaction, as in the old days. OO came about 25 years ago; layering and so on are old models. What we are doing today is a shiny layer on the same old thing: C++ became F#, etc.
Distributed architecture: DB -> DAL -> BL -> service -> UI. But we are not just doing data entry; we're trying to fit it into this architecture. What doesn't appear in this diagram is users. That is the most important part, especially two users, because users are not just data entry operators.
No matter how cleverly you architect the system, there will be a situation where one user has changed the data and another user doesn't know. Then all the decisions that person makes are based on an old view of the world. That matters (making decisions on stale data).
It seems like a lot of effort to go through all these layers just to get stale data. And then there is performance, so we cache. (The interesting thing is that a cache is also stale data.) So, according to best practices, we do all this just to show stale data to users? And caching is needed to improve performance. (It's more maintainable the old way.) How do we solve this?
Let's look at queries independently of the system. We know that once we show the data, it's stale. Why don't we show the stale data to users along with information on how stale it is? At least then they know how old it is.
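A minimal sketch of showing users how stale the data is, assuming every row in the query store carries the time it was last refreshed. The names QueryRow, as_of and staleness_label are hypothetical and not from the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class QueryRow:
    """One flat row from the query store (hypothetical shape)."""
    customer_name: str
    order_total: str   # already formatted when the row was written
    as_of: datetime    # when this row was last refreshed


def staleness_label(row: QueryRow, now: datetime) -> str:
    """Describe how old the row is, so the user can judge the data."""
    minutes = int((now - row.as_of).total_seconds() // 60)
    if minutes < 1:
        return "Data as of less than a minute ago"
    if minutes == 1:
        return "Data as of 1 minute ago"
    return f"Data as of {minutes} minutes ago"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    row = QueryRow("Ann", "42.00", now - timedelta(minutes=10))
    print(row.customer_name, row.order_total, "-", staleness_label(row, now))
```

The screen prints the label next to the data instead of pretending the data is live.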
When we look at a query and the data shown to the user, it is just flat data. These are not objects; they don't have behaviour. So why ORMs?
A simple 2-tier architecture for showing data. No MVC, DTOs, etc. No CRAP!!! In Agile terms, the simplest thing that could possibly work. How we get the data in here, we'll address later. You don't do any calculations at query time; if anything is to be calculated, you do it before it is put in this table. Why calculate at read time if the data is stale anyway? Show that the calculation was done 10 minutes ago.
Don't give users generic search screens to do anything they want. Design screens for what they want to do. SQL Server full-text search doesn't work :( Also, search can and will show stale data.
Indexes are still required, but no foreign keys. Why? The data is stale. I don't need referential integrity for stale data.
Duplicate data: why do we need to store this in tables, why should we hop through firewalls, why can't we store this in the web tier itself? When we say duplication is fine, the table that stores data for the employee view is different from the one for the supervisor view. We don't need column-based security to do role-based security, because users are viewing different screens. A preliminary check is not a complete check. Before adding a user with the same email, we can verify that. It's not 100% foolproof, but it works 99% of the time. That's okay, because it is preliminary validation, e.g. Yahoo sign in (suggesting possible usernames). Now we can also think of it as a decision support system rather than a query system, one which is stable, maintainable and performance oriented. The simplest thing that can possibly work.
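A rough sketch of the preliminary email check described above, assuming the query store can answer "is this email already in use?". QueryStore and preliminary_email_check are hypothetical names, and the in-memory set only stands in for a real store. The result is a hint for the UI, not the final decision; the command is still checked on the server.

```python
from typing import Iterable, Optional


class QueryStore:
    """In-memory stand-in for the denormalized query store (hypothetical)."""

    def __init__(self, known_emails: Iterable[str]) -> None:
        self._emails = {e.strip().lower() for e in known_emails}

    def email_in_use(self, email: str) -> bool:
        # The answer may be stale: it reflects the last refresh of the store.
        return email.strip().lower() in self._emails


def preliminary_email_check(store: QueryStore, email: str) -> Optional[str]:
    """Return a warning for the UI, or None if the email looks free."""
    if store.email_in_use(email):
        return "This email appears to be registered already."
    return None


if __name__ == "__main__":
    store = QueryStore(["ann@example.com"])
    print(preliminary_email_check(store, "Ann@Example.com"))
    print(preliminary_email_check(store, "bob@example.com"))
```

Because the check runs against stale data, a "looks free" answer can still be rejected later, which is the accepted trade-off for preliminary validation.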
Commands: we've moved beyond the days when paper was the ultimate authority. Can we trust users when they are doing this kind of activity? How do we protect our data from false entries?
Validation and business rules need not be mixed. Validation is about asking a question: does the input follow the format (structure, ranges, lengths)? Treating well-behaved and ill-behaved clients: well-behaved clients will have done this check before submitting the data. Ill-behaved clients, like hackers, will not. But that's okay; we don't need to treat them all alike. We can design the system so that well-behaved clients get a better experience than ill-behaved clients. Rules: should we do this? Like authorization: can he do this? At this point we are looking at the current state of the system, not stale data. We can't allow an employee who was fired 10 milliseconds ago to do something to the data.
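The split between validation and rules can be sketched as two separate functions. This is only an illustration under assumptions: the ApproveExpense command, the amount range and the set of active employees are all made up for the example. Validation looks only at the command; the rule looks at the current system state.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class ApproveExpense:
    """Hypothetical command used only for this illustration."""
    employee_id: str
    amount: float


def validate(cmd: ApproveExpense) -> List[str]:
    """Validation: is the input potentially good? No system state needed."""
    errors = []
    if not cmd.employee_id.strip():
        errors.append("employee_id is required")
    if not (0 < cmd.amount <= 10_000):  # assumed range
        errors.append("amount is out of range")
    return errors


def allowed_by_rules(cmd: ApproveExpense, active_employees: Set[str]) -> bool:
    """Rule: should we do this, given the current state of the system?"""
    return cmd.employee_id in active_employees


if __name__ == "__main__":
    cmd = ApproveExpense("e-17", 250.0)
    print(validate(cmd))                       # structural check
    print(allowed_by_rules(cmd, {"e-17"}))     # state-based check
    print(allowed_by_rules(cmd, set()))        # employee was just fired
```

A well-behaved client can run validate before sending; only the server can answer allowed_by_rules, because only it sees the current state.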
Command processing looks similar to what we are doing currently. On the server side we don't trust our clients, so we do validation again. Ill-behaved clients will not do these validations; they will just push data to the server. Because well-behaved clients have already checked their input, when a command actually arrives it has a better success rate: good clients succeed 99.99% of the time based on the current state of the system. However, the response will not be immediate. We tell the user: we've structured our system well, we've trained you to use it, you've been using it over and over for a long time; we don't believe this is going to go wrong. Rather than designing the system under the assumption that any command can fail at any time and must get an immediate response, why don't we exploit the fact that these commands are far more likely to succeed than fail, and use our persistent view model to increase confidence in that decision? Rather than telling the user, 'What a surprise, your command has succeeded', why don't we turn around and say, 'We'll let you know if there is a problem', so that they can do other things? Once the user accepts that we'll let them know if there is a problem, we can queue up these requests in a load-balancing system and scale them across a large number of servers, because we don't need to give an immediate response.
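The "we'll let you know if there is a problem" flow can be sketched with an in-process queue. This is a simplified stand-in: a real system would use a durable message queue and separate worker machines, and the function names submit and process_all are assumptions for the example.

```python
import queue
from typing import Any, Callable


def submit(command_queue: "queue.Queue[Any]", command: Any) -> str:
    """Accept the command without waiting for it to be processed."""
    command_queue.put(command)
    return "Accepted. We'll let you know if there is a problem."


def process_all(
    command_queue: "queue.Queue[Any]",
    handle: Callable[[Any], None],
    notify_failure: Callable[[Any, str], None],
) -> int:
    """Worker loop: drain the queue, notifying only when a command fails."""
    processed = 0
    while True:
        try:
            command = command_queue.get_nowait()
        except queue.Empty:
            return processed
        try:
            handle(command)
        except Exception as exc:
            # The rare failure is reported later, e.g. by email.
            notify_failure(command, str(exc))
        processed += 1


if __name__ == "__main__":
    def handle(cmd: dict) -> None:
        if cmd["qty"] <= 0:
            raise ValueError("quantity must be positive")

    q: "queue.Queue[dict]" = queue.Queue()
    print(submit(q, {"id": 1, "qty": 2}))
    print(submit(q, {"id": 2, "qty": -1}))
    process_all(q, handle, lambda cmd, why: print("problem with", cmd["id"], ":", why))
```

Since the caller never blocks on handle, any number of workers can drain the queue in parallel.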
Anyone can correct a shipping address, but specifying that the user has moved needs special permissions. Text has to be enabled. Permissions are required. There is a business meaning behind it. Changing marital status is a separate task from changing the shipping address; they are different units of work. Developers say first one wins or last one wins. We've designed the UI in such a way that it doesn't align with the business. Users tab through 50 columns about 1000 times a day, and they mistype, because we haven't created the UI to capture the user's intent. It's built the same way the data entry screens were.
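One way to capture intent is to send a separate command for each business meaning rather than a generic "update address" payload. The sketch below is illustrative only; the command classes and the permission name "reroute_shipments" are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class CorrectShippingAddress:
    """Fixing a typo: the customer still lives at the same place."""
    customer_id: str
    corrected_address: str


@dataclass(frozen=True)
class CustomerMoved:
    """New information: the customer now lives somewhere else."""
    customer_id: str
    new_address: str


Command = Union[CorrectShippingAddress, CustomerMoved]


def required_permission(command: Command) -> Optional[str]:
    """Return the special permission a command needs, or None if anyone may send it."""
    if isinstance(command, CustomerMoved):
        return "reroute_shipments"  # hypothetical permission name
    if isinstance(command, CorrectShippingAddress):
        return None
    raise TypeError(f"unknown command: {command!r}")


if __name__ == "__main__":
    print(required_permission(CorrectShippingAddress("c-1", "12 Main St")))
    print(required_permission(CustomerMoved("c-1", "99 Oak Ave")))
```

Both commands end up changing an address, but the server can now tell which business event happened and treat each one as its own unit of work.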
Large groups can sit in several small groups. It doesn't matter how the people within the group are split into small groups; we don't want the user to specify these minute details. We want to capture the user's intent, what's important and what's not: the number of people and where you want to sit. The system will say: got it, I'm looking, I'll let you know when I find your seats. Now I have enough time to do all the background processing to find the seats. Just give users a form to submit their intent, and then do the job for them. Some people say that this is crap: users want the results right there. What is this waiting thing!!! They want to learn of the failure immediately. Think as a user: do you want to sit in front of the system all day refreshing and rebooking, or have the system get the job done? From the system's perspective this also makes our jobs easier. All the concurrency problems disappear. Users are happy. The family gets to sit together. Boyfriend and girlfriend sit together... and the system is happy too. We don't need to show the status of the reservation. Seat types help to identify areas. And this big gigantic query for getting the status of all seats, with the refreshing and so on, goes away.
Posting a comment to a blog: there is no reason this command is going to fail. But users want to see their comment appear in the blog after posting it. It can be posting a comment on Facebook or a review on Amazon. They don't want to see the 'thank you, succeeded' message. Why wait for the response from the server to show this comment? Just show that it's there. It may not show on other users' pages, but what's the problem?
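Returning to the group reservation described in these notes, here is a rough sketch of what the background processing might do once the user has submitted only a party size and a preferred area. The allocation strategy (preferred area first, then the areas with the most free seats) is an assumption made for the example, as is the function name allocate_group.

```python
from typing import Dict, Optional


def allocate_group(
    free_seats: Dict[str, int], party_size: int, preferred: str
) -> Optional[Dict[str, int]]:
    """Split a party across areas; return {area: seats} or None if it cannot fit."""
    if party_size <= 0:
        raise ValueError("party_size must be positive")
    if sum(free_seats.values()) < party_size:
        return None  # the user is told later, e.g. by email

    # Preferred area first, then the remaining areas by free seats, largest first.
    others = sorted(
        (area for area in free_seats if area != preferred),
        key=lambda area: -free_seats[area],
    )
    allocation: Dict[str, int] = {}
    remaining = party_size
    for area in [preferred] + others:
        take = min(free_seats.get(area, 0), remaining)
        if take > 0:
            allocation[area] = take
            remaining -= take
        if remaining == 0:
            break
    return allocation


if __name__ == "__main__":
    seats = {"front": 3, "middle": 5, "back": 2}
    print(allocate_group(seats, 6, "front"))
    print(allocate_group(seats, 20, "front"))
```

Because a single worker can process reservation requests one at a time, two users never compete for the same seat on screen.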
Validations are not done here. Instead they are done on commands: commands are validated before the domain model is called. The domain model is used for executing the rules. When you withdraw cash, the domain model is not going to reject it; that's the job of the command. The domain model can calculate the associated interest, but not reject. It does what you asked for, plus something extra on the data as a result of the command. If we have queries for reading, why do we need relationships in the domain model? Why do we need customers.orders.products? Most of the time we don't. That's what the domain model is about, not questions like lazy loading or eager fetching, Hibernate or EF, etc., because I'm not using it for queries. It's all about doing what you have asked for. The DM is not for validations; validations are for keeping the garbage out. The DM is for the interesting stuff.
E.g. when an order is submitted, if the customer has ordered more than x in the past, give a discount.
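The discount example could look roughly like this in a domain model. The threshold (the "x" above) and the discount rate are not given in the talk, so the defaults below are placeholders, and the Customer class is a hypothetical shape. Note that the method does not validate or reject anything; it does what the command said and something extra.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Customer:
    """Hypothetical domain object holding only what the rule needs."""
    lifetime_spend: Decimal

    def submit_order(
        self,
        order_total: Decimal,
        threshold: Decimal = Decimal("1000"),     # placeholder for "x"
        discount_rate: Decimal = Decimal("0.10"),  # placeholder rate
    ) -> Decimal:
        """Record the order and return the amount charged."""
        charged = order_total
        if self.lifetime_spend > threshold:
            # The "something extra": loyal customers get a discount.
            charged = order_total * (Decimal("1") - discount_rate)
        self.lifetime_spend += charged
        return charged


if __name__ == "__main__":
    loyal = Customer(lifetime_spend=Decimal("1500"))
    print(loyal.submit_order(Decimal("200")))
    newcomer = Customer(lifetime_spend=Decimal("0"))
    print(newcomer.submit_order(Decimal("200")))
```

The object carries no orders or products collections, since reads go through the query store.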
Do we really have to persist every bit of data in the domain model? DBAs will say that you have to have a single source of truth. That's all gone: you are working with stale data, and that's the truth. Let's simply admit it. Now that we have admitted it, let's simplify our architecture.