A plumber's guide to SaaS



  1. 1. A Plumber's Guide To SaaS (and multi-tenanting, and the Cloud) [Author : Kinshuk Adhikary - software plumber. LinkedIn : http://in.linkedin.com/in/kinshuka Blog : http://me-plumber.blogspot.com/ Email: kinshuk-in@yahoo.com ]

Disclaimer # 1 : This write-up is arbitrarily lengthy. I have added to it whenever I wanted to. It is disorganized too, because I have jumped zillions of light-years between topics. That is the way I am made, plus it is not easy to compress 5-6 years of experience into neat topics.

Disclaimer # 2 : Sometimes I go very "business" and sometimes very "techie". I can't help it. I am a great believer in mixing up those two.

Introduction : Quite recently, I asked a question on a SaaS forum: "what the hell is SaaS ?". Some said it was any web application with a log-in, same as ASP. Really ?!! Some said it had to be multi-tenanted if it was SaaS. I will go with the second one for now, it serves as a basis. About 7-8 years ago, deep into a multi-tenanted web based collaboration platform (we didn't call it SaaS back then), I had an interesting conversation with an architect (a proper architect, not the mushrooming fakes you find these days). "What is all this multi-tenanting rubbish ?" he asked. A database is a logical space, not a physical thing. It is anyway bifurcated at a "root" level, so why bifurcate it further at the level of "organization" ? Why share a single database between many subscribing organizations ? They all have the same structure. The web application is the same. Why not have 100 instances of the application, and 100 instances of the database ? After all, a single script and a single deployment engineer can "manage" all of them, there is no extra maintenance cost involved. In fact, the bigger headache is in trying to "box" all these 100 organizations into the same logical space.

Multi-tenant SaaS applications, a few pictures: (Non-tech people, please bear with me. Could be important later on).
Without multi-tenanting : [Figure: the hosting company / infrastructure provider runs, for each of Org1 and Org2, a separate DB instance (same structure) and a separate App instance (same code version). Scripts that manage 1 app and 1 logical database.]

Remember what the SaaS concept promises. It is the same application that is "shared". There is no difference in what the different organizations "see", neither in the application nor in the database structure. To the developer, all that really means is 1 script.
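The "1 script" point above can be sketched roughly as follows. This is only an illustration, not the platform described in this write-up: the `orders` table, the `MIGRATION` text and the in-memory SQLite databases standing in for per-organization instances are all assumptions made for the example.

```python
import sqlite3

# Hypothetical maintenance script: the same statements for every organization.
MIGRATION = """
CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, item TEXT);
ALTER TABLE orders ADD COLUMN status TEXT DEFAULT 'open';
"""


def run_migration(conn):
    """Apply the single maintenance script to one database instance."""
    conn.executescript(MIGRATION)
    conn.commit()


def column_names(conn, table):
    """List the column names of a table (second field of PRAGMA table_info)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# One in-memory database per organization stands in for "100 instances".
instances = {org: sqlite3.connect(":memory:") for org in ("org1", "org2")}

# Without multi-tenanting the one script is simply run once per instance.
for org, conn in instances.items():
    run_migration(conn)
```

With a single shared instance the loop collapses to one call of `run_migration`. The script itself is the same either way, which is the architect's argument.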
  2. 2. Alternative with multi-tenanting : [Figure: under the licensing company / infrastructure provider, Org1 and Org2 share a single DB instance and a single App instance. Scripts that manage 1 app and 1 logical database.]

Ok. So what really happened here ? We put two orgs into the same logical database.

1. Less maintenance costs in this kind of sharing ? : If there is absolutely no difference between the versions of the applications that the two organizations are using, then the "effort at maintenance" faced by the developer is the same in both cases. 1 script. [This is a very big if, but it is some sort of a basis. As we will see later, it never stays the same application and the same database for very long]. A single codebase, and a single database maintenance script. In the first alternative it is "run" twice over the two instances, in the second one it is run once over a single instance. That is all.

2. Clear and marketable ownership is now fuzzed up : Notice that in the first case, each organization was able to clearly point and say "this is MY database instance, and that is MY application instance". In the second case the database is now "shared", and the instance too is "shared". Responsibilities too, are a bit fuzzy here. This has major ramifications.

Enter the SaaS Licensor. This is the guy who has probably hosted the SaaS app, has built it, knows it well, and is now contracting with the many organizations who will use the instance. In many cases it will be the same entity as the "cloud provider". It may not be, but we are considering simple scenarios here. The 2 organizations now "share" the database, which means that the same "structure" still applies to them, but which also means that every piece of data is now "tagged" with the organization's name, so that the data can be cleanly separated between the two orgs when needed.
It is the Licensor's responsibility to ensure that this "tagging" really happens, and that one organization (viewing the data through the application's window) does not end up "seeing" another organization's data. That would rarely happen. I can assure you that it is rare, unless the Licensor is an amateur. Most concerns of "data security" raised in cloud computing scenarios are somewhat unfounded. The real issues around "sharing" are somewhere else, and more subtle. Solvable too.
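A minimal sketch of the "tagging" discussed above, under stated assumptions: the `orders` table, its `org` column and the `orders_for` helper are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for the shared instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Every row carries the tag of the organization that owns it.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, org TEXT NOT NULL, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders (org, item) VALUES (?, ?)",
    [("org1", "pipes"), ("org1", "valves"), ("org2", "taps")],
)


def orders_for(conn, org):
    """Return only the items tagged with the caller's organization."""
    rows = conn.execute(
        "SELECT item FROM orders WHERE org = ? ORDER BY id", (org,)
    )
    return [item for (item,) in rows]
```

The separation holds only as long as every query goes through a helper of this kind. A single query written without the `org` filter is how one organization ends up "seeing" another's data.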
  3. 3. Do not imagine I am against SaaS, or against the Cloud. Each thing has its proper use in its proper place. The right knowledge is essential, that is all.

Common data : The above data sharing picture is more correctly visualized as this. [Figure: a single DB instance containing Org1 and Org2.] There is always some common data. This could be master data, it could be configurations. When an organization decides to "subscribe" to a SaaS application, it is allowed to share this already existing common stuff, else the rest of its "own" data would be meaningless. What happens when the organization "leaves" the SaaS provider/licensor, or joins another competing one ? Is it freely allowed to carry its own data as well as the common data ? I have no idea, this exact situation has not occurred in my experience.

What else is the organization "sharing" ? Something that many people cannot see in all these pictures. And yet, that something is probably far more important than data. It is "business logic". It is the "way your business processes are run". When an organization enters the SaaS environment, this is what the picture often is : [Figure: a new org enters the SaaS "sharing" environment with its existing legacy data and its current business logic. Inside, it adapts to new business logic and gets changes as the SaaS version enhances. The org leaving 3 years later has increased data, and business logic that is enhanced, unchanged, or unwittingly shared.]

What exactly is this animal called business logic ? If you will believe me, it could be a lot of things :
- the convenient way your screens look, the way information is presented in them
- all the if-then-else conditional logic scattered all over the place, yes/nos, configurations
- the workflows your users follow as they achieve a business objective
- the way your data is structured
- the way you organize your sub-departments
- the way Jim gets permissions to travel when Tom is on leave
  4. 4. The "best practices" argument : SaaS companies often entice organizations to join up, citing "best practices". Here we have these 200 companies, and our application incorporates the best of their best practices, so if you join up you will be able to avail of them. Quite true. When a newbie company joins up, it is probably "sharing" with the best of breed companies the same screens, the same if-else logic, same capabilities. And how exactly do these best practices get incorporated into the SaaS applications ? From the best organizations that join up, most of the time. To start with, the SaaS application is probably a rough thing. But software applications are all about upgrades, new versions, new feature additions. Let's say an organization joins up which has a best practice X, that the existing SaaS version does not have. So the organization is told - we do not have this. If you want it, it will come out in our next version, and you have to pay extra. The new organization, in its eagerness to join up, says, no problems. And there you are. Right after the new version is deployed, about 200 other organizations are now able to avail of this new feature X. Everyone is happy. No one has put a price on the "business logic". And, as grizzled application builders know only too well, it is the best, or rather the only part of an application that has any value, and it comes straight from the business guys in the best run businesses. Too much sharing of best practices between everyone can make best practices look silly.

The data analytics future : (This is one of those light-year jumps I warned you about. But it is relevant. If you have difficulties on this part please ask someone). Mature organizations know what to do with 5 years worth of accumulated data. They do analytics on it. Although still a bit unproven, data analytics yields some very significant conclusions to those who can do it well.
Let's say that you can somehow get hold of a large volume of customer purchases data. When you "classify" the data properly and give it to a data analytics engine, it will probably be able to tell you things like - "any customer that has purchased more than 200 dollars from this store during January is very likely to be spending 500 dollars during Christmas shopping". Your marketing people would have guessed that anyway, but now you have some solid figures. And the bigger the volume of data you have available for the analytics, the more complete the data, the more current, the higher the chances that such analytics predictions are correct. So, now you are a 200 customer small company in the SaaS environment. Your data isn't really good enough. And now enters a 50000 customer company, and stays in the SaaS environment for 3 years.
  5. 5. And someone now does data analytics and arrives at important business conclusions (we do not specify who that will be, but it will be equally stupid to prevent people from doing data analytics when so much good data is all around). There are several really important issues here :
- who really owns "ALL of the data" ? I guess whoever has "control" over it. Since organizations are entering the SaaS environment primarily to be free of all the hassles, they are relinquishing their control.
- can the Licensor do analytics on the data and derive such conclusions, and sell such conclusions to others ?
- even the small company that had 200 customers did contribute somewhat, or maybe even significantly, to such an analytics result. It depends on how the data was classified before doing the analytics, maybe it had all customers of a certain type that were not "outliers", maybe its data was less patchy and therefore better suited to analytics.

This single term "data analytics" has the potential to force a complete re-thinking of how applications of today contribute to a data pool, and how pricing for SaaS applications is done. Both good and bad, always it is like that. One has to manage them.

A semi-technical dive, on things that make or break SaaS apps : (this one is more relevant for the developer and the SaaS vendor, but it provides some insight into the jugglery inside).

After 1, 3 and 5 years of a SaaS application : So my architect friend and I chewed the cud over "logical vs physical" sharing aspects of the databases, and we decided not to worry about these things, because the people who paid us seemed to think that a single instance meant less complexity, less costs, whatever. Only thing is, we later wished we had, because the increasing complexities fell right on top of us, after 1 year into the app, and then after 3 years.
After 5 years there was no one left to worry further, because the product had been entirely too successful and the company was sold off, techies like us asked to fade. I hear the guys who purchased it are still grappling with the mess. It is not easy, you have to know exactly what multi-tenanting implies. Let me state that "a business application is all about requirements". Smart SaaS vendors ought to select a group of requirements that are more or less common across the entire business community. Only then can it become "a service". Remember the "ideal" case we discussed ? That the database structure is the same, that the application is the same. In about 1 year's time, after the SaaS vendor has added about 10 organizations, all that idealism starts coming apart. In 3 years' time, when about 200 organizations have been added, a second phase of changes is also to be expected. These are more about performance and speed issues.
  6. 6. "Boxing" the real world into a single logical space : Consider these (few) examples :
• One organization has 2 users (a small mom-and-pop company). The other joins with about 80000 users. Various shades in between. Do you seriously imagine they will have the same business requirements, and fit into the same database structure ?
• One organization has its departments/divisions organized functionally. The other has organized them regionally. The domain requirements could be very similar in both cases (both follow the same sales tracking process, say), but the user permissioning could be completely different. For example, users in Timbuctoo are not repeat not allowed to see orders that pertain to nuclear spare parts.
• A single big company breaks up into two. In the real world, they divide up the existing contracts neatly, assets neatly, employees neatly, and so on. But remember there is a lot of historical data related to these contracts. So what does one do ? Make copies ?
• An employee leaves one organization and joins another in the same SaaS environment. Whose responsibility is it to ensure that there is no overlap during the login changes, that the same employee is not able to view/use/download both organizations' data simultaneously ?

Models break quickly : The experience is - "models break very easily". The moment an organization in the real world re-structures itself, or starts calling itself "a vendor who also happens to be a customer to other vendors", you have many technical problems overnight. No point aiming for "flexibility", "extensibility" etc., seasoned developers know all the limits. The whole "shared" concept depends very finely on "the model being invariant over time". However, organizations are different, or they change under business pressures, or grow wiser. Simple models become complex. Often there is total conflict with the logical structures and spaces that the SaaS application envisages.
In such cases, the patchy solutions often go against common sense and good software design practices. For example, storing Strings in big flat tables as "custom" fields is not a good practice. Yet custom fields are often touted as "a feature" in many SaaS applications. If you want your application to be fine tuned to the business, one of a kind, and easily amenable to drastic changes if you wish (often happens), you may not like the simplicity that SaaS may try to "box" you into. On the other hand, if the SaaS licensor is a great "pleaser of all customers", then very soon each organization's "logical space" starts looking distinctly individualistic, and very soon you may end up having the 100 separate applications and 100 separate databases anyway, or much worse, a mix and a hotch-potch which is a nightmare for your developers to maintain and update.

Fine grained authorizations and permissions : SaaS applications need extremely fine grained authentication and authorization modules. That is because you have none of the usual comfort that organizational firewalls and LANs provide. So you must positively identify and separate each resource, each piece of data, each operation, each screen. Some have their own special permissioning requirements. Individual object level ACLs are almost a given in SaaS applications.
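An object level ACL of the kind mentioned above might look roughly like this. It is a sketch under assumptions: the `Resource` class, the `is_allowed` function and the user names are hypothetical, and the check assumes no cross-organization sharing at all. A collaborative platform would need an explicit grant to relax that first test.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    org: str   # organization tag of the object
    name: str
    # Maps a user id to the set of operations that user may perform on this object.
    acl: dict = field(default_factory=dict)


def is_allowed(user_org, user_id, resource, operation):
    """Check the organization boundary first, then the per-object ACL."""
    if user_org != resource.org:
        return False
    return operation in resource.acl.get(user_id, set())


order = Resource(
    org="org1",
    name="order-17",
    acl={"jim": {"read", "update"}, "tom": {"read"}},
)
```

For example, `is_allowed("org1", "tom", order, "update")` is refused because Tom's entry only holds "read", and any user from another organization is refused before the ACL is even consulted.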
  7. 7. Inherently collaborative : The application we worked in was all about "collaboration" (remember it wasn't called SaaS back then). I still maintain that the key benefit that SaaS and multi-tenanting provide is collaboration between organizations, between people in those organizations, both intra-organization and inter-organization. Silo applications without linkages are trivial. At times, there are things one can appreciate in the multi-tenant design, the 60-70% commonality, the ability to link one organization to another easily without needing to go out of the system. For example, the same "vendor organization" can collaborate with multiple "customer organizations". So can "consultants". But usually such collaborative apps take a huge amount of design effort, and I wonder if it is worth it.

Fat content sharing : Think also of "fat content". For example, let us say huge video files available more or less to all organizations, some kind of training material, say. Are you sure each organization should maintain its own copy of such big files ? Or should you have just one copy, with some kind of "sharing" logic pointing to the fat resource ? And once again, I do not know if such permission and such content sharing is good, or bad. Simplicity is so much better always :-)

Feeding data in and out : For various reasons, a stream of data must go in and out of all applications. When an organization initially joins up, there is a fairly big task of getting all its legacy data to conform to the SaaS model. Periodically, data extracts need to be taken out to feed other things that the organization may need it for. And of course, there is the "integration" requirement, your CRM needs to talk to your bespoke accounting application. The "tagging" of data that we did ensures that you see only your organization's data, and not some other organization's. It is, however, a mixed blessing. The "tag" must be forever preserved.
I guess this hint is enough for those in the know of what ETL (extract-transform-load) is and how badly it can screw up :-)

Identity Management : For some reason I cannot quite pin-point, the issue of "object identity management" seems to hit a multi-tenant SaaS application with unusual ferocity. We often find a large number of "duplicates" created, duplicated companies, duplicated people, requiring frequent removal, and messing up the collaborative nature of the platform. Like I said, it ought to be the same intensity in a silo application too, but my experience is that it is not the same, I do not know why.

Split databases : Many organizations join, but soon start asking for "a separate database for all of OUR data". The SaaS developer may oblige, by routing each request to its own specific database, after parsing each request and finding out which organization the request was for. Preserving the illusion of "a single application instance", at least. A roundabout way of having your own application instance with own database in the first place !
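The request routing described under "Split databases" can be sketched as below. The `TenantRouter` class, the dictionary-shaped request with an `org` key, and the in-memory SQLite connections are assumptions for illustration only, not how the platform in this write-up actually did it.

```python
import sqlite3


class TenantRouter:
    """Routes each request to the database of the organization it names."""

    def __init__(self, dedicated_orgs):
        # Organizations that did not ask for their own database stay here.
        self._shared = sqlite3.connect(":memory:")
        # One separate database per organization that asked for one.
        self._dedicated = {
            org: sqlite3.connect(":memory:") for org in dedicated_orgs
        }

    def connection_for(self, request):
        # "Parsing" the request is reduced here to reading a hypothetical org field.
        org = request["org"]
        return self._dedicated.get(org, self._shared)


router = TenantRouter(dedicated_orgs=["bigcorp"])
```

The application code above the router stays the same for everyone, which is what keeps up the appearance of a single instance while the data sits in separate databases.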
  8. 8. My own reasons in the Cloud, why, what : (this is complete speculation !!!).

Adoption of cloud during budget approvals : The cloud will be adopted. Can you imagine any CTO/CIO saying "yes" to a departmental head's request for "buying one more server for the new set of applications" ? They will almost certainly say - try it out on the cloud. We will save Capex, and see if it works, and weigh carefully if "hired" infrastructure is cheaper than buying more servers. Of course, there will also be a drive to "utilize better the existing CPU cycles in the current set of servers". I mean, 20% server utilization. What a waste (as if server utilization was an accounting number, and as if people know or ever knew how to size servers against arbitrary unpredictable application loads). It is always about "saving dollars", isn't it ? For some odd reason, CTOs tend to adopt the "accounting" perspective. I have met very few who have the "investor" attitude, unlike other business heads.

A stronger reason for going to the cloud : It is the old old story - lack of available knowledge and skillsets. There is a great shortage of DBAs, of IT staff who know enough about security, database replication, load balancing, maintaining multiple applications, squeezing the maximum out of servers. Whereas the cloud provider folks would definitely know. You are safer trusting security and your data to them, than to your own in-house staff. It is the sheer vast knowledge of cloud providers that is one very good reason why folks will go to the cloud.

Who is the "best" cloud provider ? : That is a strange question. If you wanted rapidly built applications, obviously the ones who provide ready-to-cook-and-eat APIs. The taste will be a little insipid, but you can get by. If you wanted more or less your own custom application, but wouldn't mind some ready-built common services, there are many others. Watch this space. The biggies are moving, in their own ways.
One of them is moving very slowly, in spite of provocation. This one "knows the mind of the hacker". So its current offering does not allow you to do this, or do that. This one also knows that a database on the cloud is not exactly the same thing as a database elsewhere. Another has just realized that the desktop is a goner, it better have the applications made for the browser. This is a good guy too, except that in the last few years it has been living off its past glory, and now it is the past that may pull it back. Some are offering pricing as their only feature. Everyone is struggling with issues like session handling, insidious APIs that are vulnerable to leet haxors, and of course the circuit breakers. All in all, post-2010, the cloud story is gathering momentum. A good thing too, IMHO, as long as one is mindful of the basic definitions of the cloud, and does not go by jargon.