Multiplexing in Thrift: Enhancing Thrift to Meet Enterprise Expectations (Impetus White Paper)

For the Impetus white paper archive, visit http://www.impetus.com/whitepaper

This paper addresses the challenge and details the approach that Impetus has devised, to enhance the caliber of Thrift and enable it to meet enterprise expectations.


Abstract

Thrift [1] is an open source library that expedites the development and implementation of efficient and scalable back-end services. Its lightweight framework and support for cross-language communication make it more robust and efficient than other RPC frameworks such as SOA (REST/SOAP) for many operations. However, Thrift’s capabilities are challenged by emerging enterprise solutions like Big Data: because a Thrift server can host only one service per port, an enterprise hosting multiple services over the network incurs high maintenance and administrative overheads.

This paper addresses the challenge and details the approach that Impetus has devised to enhance the caliber of Thrift and enable it to meet enterprise expectations.

Impetus Technologies, Inc.
www.impetus.com
Table of Contents

Introduction
What’s So Special About Thrift?
Thrift is Powerful, Yet Lacks the Prowess
Adding Charm to the Glorious API through Multiplexing
    The approach
    Components
How to Use Thrift Multiplexing
    Creating a multiplexing server with a lookup registry
Making a Wise Investment Lucrative
Summary
Introduction

Thrift is a very lightweight framework for developing and accessing remote services that are highly reliable, scalable and efficient in communicating across languages.

The Thrift API is extensively used for creating services like search, logging, mobile, ads, and the developer platform across various enterprises. The services of various Big Data open source initiatives like HBase [6], Hive [7] and Cassandra [8] are hosted on Thrift. Its simplicity, versioning support, development efficiency, and scalability make it a strong contender in the SOA market, helping it to compete successfully against more established integration approaches and products.

Thrift can support a large number of functions for each service, communicating across languages. This capability can be further enhanced by extending Thrift to host multiple services on each server.

In this white paper, we look at how the capabilities of Thrift can be enhanced to make optimum use of enterprise resources. We also present a framework that enables the creation of a server hosting multiple services, the registration of services, and the lookup of services based on a standard context.

What’s So Special About Thrift?

There are various flavors of RPC implementations available in the open source arena, including Thrift, Avro [2], MessagePack [3], Protocol Buffers [4], BSON [5], etc. Each of these RPC implementation libraries has its own pros and cons. Ideally, the RPC library should be selected according to the specific enterprise solution requirements of the project.

Some of the features that any RPC implementation aspires to are:

1. Cross-platform communication
2. Multiple programming languages
3. Support for fast protocols (local, binary, zipped, etc.)
4. Support for multiple transports
5. Flexible server (configuration for non-blocking, multithreading, etc.)
6. Standard server and client implementations
7. Compatibility with other RPC libraries
8. Support for different data types and containers
9. Support for asynchronous communication
10. Support for dynamic typing (no schema compilation)
11. Fast serialization

Among RPC implementations, Thrift, Avro and MessagePack are the top contenders, meeting most of the requirements listed above.

In an Avro implementation, the out-of-band schema can become overkill for infrequent conversations between a server and a client. MessagePack, meanwhile, is weaker than Thrift because it offers fewer data types and containers, is inherently JSON-based, and performs no schema-based type checking.

On the other hand, support for various protocols and transports, configurable servers, a simple standardized IDL, and battle-tested integration with Big Data NoSQL data stores like Cassandra make Thrift a powerful contender and the preferred RPC implementation in enterprise solutions.

Thrift is Powerful, Yet Lacks the Prowess

Despite being a powerful and efficient cross-language communication tool, Thrift’s services are challenged by high administrative and maintenance overheads. The fact remains that every Thrift server is capable of exposing only a single service at a time. In order to host multiple functions, Thrift provides organizations with the following two options:

1) Write a monolithic, unwieldy implementation and host it as a single service
2) Host multiple small services across a series of ports

fig 1.1: Option 1 - Write a monolithic, unwieldy implementation and host it as a single service
If an enterprise opts for the first option (see fig 1.1), the monolithic and unwieldy implementation raises the development cost of the solution. Since the complexity of the solution keeps growing with the addition of every new service, return on investment (ROI) is adversely affected by high maintenance overheads.

fig 1.2: Option 2 - Host multiple small services across a series of ports

If an enterprise opts for the second option, the number of ports consumed for hosting multiple services will be high. Since ports are a limited enterprise resource that needs to be used judiciously, this poses a serious concern. This option will therefore be challenged by high administrative and maintenance overheads. Also, to avoid the overhead of setting up a connection on each call, clients have to maintain many connections (at least one to each port). With the addition of every new service, a new port has to be opened on the firewall. The advantage of Thrift’s flexible design is thus offset by high administrative overheads.

Adding Charm to the Glorious API through Multiplexing

The need of the hour is to realize and harness the potential of the Thrift API by overcoming its limitation of hosting a single service on each server. The solution presented in this white paper is an attempt to create a framework that enables Java developers to create and host multiple services on each server. The solution also presents a lookup framework that any Java client or server can use to quickly look up the services hosted on each server and to access them.
The approach

The baseline approach is to assign a symbolic name to each service, referred to in this paper as the service context. This lets us host multiple services on each server, where each service is recognized by its respective service context. A client using the lookup service should be able to fetch the appropriate service context and use it to direct the service call to the respective servant.

Components

The solution extends the Thrift API (version 0.9.0) with the new components described below (highlighted with red boundaries in fig 1.3):

Multiplexer

The multiplexer is the processor at the heart of this solution. This component acts as a server-side request broker and is responsible for identifying the service that the client has requested, based on the service context propagated by the client. It maintains a mapping between service contexts and services. While processing a request, it reads the service context from the underlying protocol and, based on the mapping, directs the request to the appropriate service.

fig 1.3: Thrift Multiplexing

Protocol

We have made our solution transport and protocol agnostic. We have created a wrapper around the underlying protocol (any protocol instance) that is capable of embedding the service context in the message on the client side and fetching it on the server side. To this end, we have added a new class, TMultiplexProtocol, as a wrapper around the existing TProtocol that overrides the behavior of the writeMessageBegin(TMessage) and readMessageBegin() methods. Any client that has to communicate with TMultiplexer needs to wrap the underlying protocol in a TMultiplexProtocol instance.

Registry and Lookup

In order to reduce the overheads associated with managing service contexts manually, we have created a registry component that is responsible for managing information about all services hosted on a particular server. This component is hosted as one of the services on the underlying multiplexer and can be queried by the client on TMultiplexerConstants.LOOKUP_CONTEXT to obtain relevant information about the hosted services.

The TRegistry interface is the basic client API for querying the lookup registry. It provides several lookup methods for querying the registry by service context, service name and regular expression. It also lets users check whether a service context exists and list all service contexts available in the registry.

TRegistryHelper is the interface for the server API, which the server uses for binding, rebinding and unbinding service contexts with the lookup registry. We have provided one basic implementation of the registry API, TRegistryBase, which manages service contexts in memory. This component can be extended to override the default behavior, based on specific needs, and can be used along with the factory class. TRegistryClientFactory is the factory class for creating the registry client that facilitates remote lookup of the registry.

Service Information

The solution uses the URIContext class to represent information about the services hosted on a particular server. This object can be transmitted across the network and hence can be accessed remotely by the client. In the present solution, the information captured by this object comprises the service context, the service name and a description.

Multiplexer-extension for lookup
fig 1.4: Thrift Multiplexing with Lookup Registry

On its own, the multiplexer is capable of hosting multiple services. However, managing service information is an overhead for the client as well as for the server administrator. To reduce this overhead, we have introduced a registry component that is capable of managing service information. In order to leverage the capabilities of the multiplexer and the registry component in a single processor, we have introduced a new processor, TLookupMultiplexer, that is capable of hosting multiple services along with an additional lookup service based on the registry. The processor creates an instance of the registry with all service information and exposes it as an additional service to clients. This enables clients to query the registry using the registry API and to access the underlying service using the service context obtained from the query.

Server

We have introduced a new abstract server, TMultiplexingServer, which is capable of hosting any server implementation on any transport and any protocol, using TLookupMultiplexer. This class abstracts the underlying complexities of object creation and exposes two abstract methods, viz. getServer and configureMultiplexer, to be implemented by any class extending it. This class enables a user to identify the server transport and protocol at the time of server object creation, thus providing an additional degree of flexibility when it comes to hosting the same server with multiple services on different transports and protocols with no additional coding effort. TMultiplexingServer internally wraps the TServer instance, allowing server startup and shutdown to be managed as required.

Source Code

We have extended the Thrift Java library (version 0.9.0) and added a new source folder named ‘ext’ that contains the underlying implementation of the multiplexing components. Also, build.xml has been amended to compile both the existing and the extended source code. Compatibility of the solution has additionally been tested with the present stable version 0.8.0 of Thrift for seamless integration. In order to use the multiplexing capability of Thrift, one has to download or pull the source code of the extended Thrift library [9] from GitHub and run the ‘ant’ command on the downloaded Thrift Java library. This generates libthrift-xxx.jar in the build folder, which developers can then use to create their enterprise solutions.

How to Use Thrift Multiplexing

Creating a multiplexing server with a lookup registry

The multiplexing server can be created by extending the org.apache.thrift.server.TMultiplexingServer class and implementing the abstract methods configureMultiplexer() and getServer(TServerTransport serverTransport, TProtocolFactory protFactory, TProcessor processor).
The sample code, with illustration, is provided below.

Step 1: Create the server class by extending the TMultiplexingServer class.

    public class Server1<T extends TServerTransport, F extends TProtocolFactory>
            extends TMultiplexingServer<T, F>

Step 2: Optionally override the default constructor to accept the server transport and protocol.

    public Server1(T serverTransport, F protFactory) {
        super(serverTransport, protFactory);
    }

Step 3: Implement the configureMultiplexer() method to configure the lookup multiplexer. As a part of this configuration, one has to create a list of MultiplexerArgs that captures the details of the services that will be hosted on the server and their respective service information. In the example illustrated below, we have hosted the HR and Finance services on Server1.

    @Override
    protected List<MultiplexerArgs<URIContext, TProcessor>> configureMultiplexer() {
        // list of multiplexer arguments
        List<MultiplexerArgs<URIContext, TProcessor>> args =
                new ArrayList<MultiplexerArgs<URIContext, TProcessor>>();

        // configuring HR service context
        TProcessor processor =
                new HRService.Processor<HRServiceImpl>(new HRServiceImpl());
        URIContext context =
                new URIContext(Constants.HR_CONTEXT, "HumanResource_Service");
        MultiplexerArgs<URIContext, TProcessor> arg =
                new MultiplexerArgs<URIContext, TProcessor>(processor, context);
        args.add(arg);

        // configuring FIN service context
        processor = new FinanceService.Processor<FinanceServiceImpl>(new FinanceServiceImpl());
        context = new URIContext(Constants.FIN_CONTEXT, "Finance_Service");
        arg = new MultiplexerArgs<URIContext, TProcessor>(processor, context);
        args.add(arg);

        return args;
    }

Step 4: Implement the getServer(…) method to create an instance of the desired server. In the example below, we create an instance of TThreadPoolServer using the arguments.

    @Override
    protected TServer getServer(TServerTransport serverTransport,
            TProtocolFactory protFactory, TProcessor processor) {
        // creating server args
        TThreadPoolServer.Args serverArgs = new TThreadPoolServer.Args(serverTransport);
        serverArgs.protocolFactory(protFactory);
        serverArgs.transportFactory(new TTransportFactory());
        serverArgs.processor(processor);
        serverArgs.minWorkerThreads = 1;
        serverArgs.maxWorkerThreads = 5;
        // creating server instance
        return new TThreadPoolServer(serverArgs);
    }

Step 5: Create an instance of the server class, using the appropriate transport and protocol, and start the server.

    // TServerSocket's constructor can throw TTransportException
    public static void main(String[] args) throws TTransportException {
        // identifying server transport
        TServerSocket SERVER1_TRANSPORT = new TServerSocket(Constants.SERVICE1_PORT);
        // identifying server protocol
        TBinaryProtocol.Factory SERVER1_FACTORY = new TBinaryProtocol.Factory();
        // creating server instance for the specific transport and protocol
        Server1<TServerSocket, TBinaryProtocol.Factory> server1 =
                new Server1<TServerSocket, TBinaryProtocol.Factory>(
                        SERVER1_TRANSPORT, SERVER1_FACTORY);
        // starting server
        server1.start();
    }

Creating a client for querying the registry and using the service context

A client for querying the multiplexing server’s registry can be obtained from the org.apache.thrift.registry.TRegistryClientFactory class. TRegistryClientFactory is a convenience class that provides multiplexing client instances. On the client side, one can use the static method getClient(..) of this factory to obtain the registry client. The client can then be used to query the registry and identify the appropriate service for processing the request. The example code below shows a client that retrieves the tax details of an employee using the finance service:

    public double getTaxDetails(int empId) throws TException {
        TTransport transport = null;
        TProtocol protocol = null;
        try {
            // transport
            transport = new TSocket(Constants.SERVICE_IP, Constants.SERVICE1_PORT, 60);
            // multiplexing protocol bound to the lookup context
            protocol = Factory.getProtocol(new TBinaryProtocol(transport),
                    TMultiplexerConstants.LOOKUP_CONTEXT);
            // procuring registry client
            TRegistry client = TRegistryClientFactory.getClient(protocol);
            // opening transport
            transport.open();
            // querying registry to get context
            Set<URIContext> contexts = client.lookupByName("Finance_Service");
            // executing the request on the appropriate service using the context
            if (contexts.size() == 1) {
                URIContext uricontext = contexts.iterator().next();
                protocol = new TMultiplexProtocol(new TBinaryProtocol(transport),
                        uricontext.getContext());
                com.service.FinanceService.Client finService =
                        new com.service.FinanceService.Client(protocol);
                return finService.getTaxDeductedTillDate(empId);
            }
            // no unique match: fail explicitly so that every path returns or throws
            throw new IllegalStateException("Expected exactly one context for Finance_Service");
        } finally {
            if (transport != null) {
                // closing transport
                transport.close();
            }
        }
    }
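To make the routing behind the server and client code above more concrete, here is a minimal plain-Java sketch of the idea. It is not the extension’s implementation and uses no Thrift classes. It assumes that the service context travels as a prefix on the message name, separated by a colon; the white paper does not state the wire format, so the separator is an assumption, and the names MultiplexSketch, Handler, Multiplexer, wrap and demoServer are hypothetical. The context values "hr" and "fin" are placeholders as well.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: none of these types belong to Thrift or to the Impetus extension.
class MultiplexSketch {

    // Assumed separator between service context and method name (not taken from the paper).
    static final String SEPARATOR = ":";

    /** Stand-in for a service processor: receives a method name and one argument. */
    interface Handler {
        String handle(String method, String arg);
    }

    /** Server side: a context-to-service mapping plus a tiny in-memory name registry. */
    static final class Multiplexer {
        private final Map<String, Handler> handlersByContext = new LinkedHashMap<>();
        private final Map<String, String> namesByContext = new LinkedHashMap<>();

        /** Registers a service under a service context and a human-readable name. */
        void bind(String context, String serviceName, Handler handler) {
            handlersByContext.put(context, handler);
            namesByContext.put(context, serviceName);
        }

        /** Returns every service context registered under the given service name. */
        Set<String> lookupByName(String serviceName) {
            Set<String> contexts = new LinkedHashSet<>();
            for (Map.Entry<String, String> e : namesByContext.entrySet()) {
                if (e.getValue().equals(serviceName)) {
                    contexts.add(e.getKey());
                }
            }
            return contexts;
        }

        /** Reads the context from the message name and routes the call to that service. */
        String process(String wireMessageName, String arg) {
            int idx = wireMessageName.indexOf(SEPARATOR);
            if (idx < 0) {
                throw new IllegalArgumentException("No service context in: " + wireMessageName);
            }
            String context = wireMessageName.substring(0, idx);
            String method = wireMessageName.substring(idx + SEPARATOR.length());
            Handler handler = handlersByContext.get(context);
            if (handler == null) {
                throw new IllegalArgumentException("Unknown service context: " + context);
            }
            return handler.handle(method, arg);
        }
    }

    /** Client side: embeds the service context in the outgoing message name. */
    static String wrap(String context, String method) {
        return context + SEPARATOR + method;
    }

    /** Builds a server hosting two toy services behind one entry point. */
    static Multiplexer demoServer() {
        Multiplexer mux = new Multiplexer();
        mux.bind("hr", "HumanResource_Service", (m, a) -> "HR." + m + "(" + a + ")");
        mux.bind("fin", "Finance_Service", (m, a) -> "FIN." + m + "(" + a + ")");
        return mux;
    }

    public static void main(String[] args) {
        Multiplexer mux = demoServer();
        // Look up the context by service name, then route a call through it.
        String context = mux.lookupByName("Finance_Service").iterator().next();
        System.out.println(mux.process(wrap(context, "getTaxDeductedTillDate"), "42"));
        // prints "FIN.getTaxDeductedTillDate(42)"
    }
}
```

Both toy services share the single process entry point, which mirrors how one port can serve several services. In this sketch a context that itself contains the separator would be split at the wrong place, so a real wire format needs a rule for that case.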
Making a Wise Investment Lucrative

Thrift is a big plus in today’s enterprise environment, as it effectively addresses the challenges imposed by Big Data solutions and presents a solution that can be exposed as a service across the network. Most enterprises have limited ports, especially in the production environment, and opening new ports involves an associated cost. Using Thrift as an RPC mechanism for a solution is therefore restrictive, on account of the limited availability of ports. Also, various Big Data solutions like Hadoop, Hive, HBase, Cassandra and other NoSQL data stores, along with other enterprise software such as web servers, application servers, and ESBs, already use up a number of ports. If an enterprise has to expose its solutions (which use the underlying Big Data) as services on the network, then opening an extra port for each service would be inefficient in terms of cost and resources. This enterprise problem can be effectively addressed by hosting all the services through Thrift multiplexing, which can reduce the number of ports to one with minimal development and administrative overheads.

An organization investing in this technology stands to benefit from quick turnaround times and low development costs. Furthermore, the multiplexing extensions make these investments lucrative by reducing maintenance and administrative overheads for enterprises. With multiplexing, multiple services can be hosted on a single Thrift server, thus cutting maintenance costs over the long run. Multiplexing also allows services to be designed modularly, which can reduce the future development cost of introducing new services or functions or amending existing ones. Hence, multiplexing, through its simple approach, not only makes an investment worthwhile but also brings added value to the business.
Summary

In recent times, Thrift has emerged as a powerful technology for communicating across programming languages in a reliable and efficient manner. Enterprises dealing with Big Data and other advanced technologies can use the Thrift solution to host multiple services on the network by efficiently utilizing enterprise resources, at low maintenance costs.

References

[1] http://thrift.apache.org/
[2] http://avro.apache.org/
[3] http://msgpack.org/
[4] http://code.google.com/p/protobuf/
[5] http://bsonspec.org/
[6] http://hbase.apache.org/
[7] http://hive.apache.org/
[8] http://cassandra.apache.org/
[9] git://github.com/impetus-opensource/thrift.git

About Impetus

Impetus Technologies is a leading provider of Big Data solutions for the Fortune 500®. We help customers effectively manage the “3-Vs” of Big Data and create new business insights across their enterprises.

Website: www.bigdata.impetus.com | Email: bigdata@impetus.com

© 2013 Impetus Technologies, Inc. All rights reserved. Product and company names mentioned herein may be trademarks of their respective companies.

May 2013