Apache Airavata is software for providing services to manage scientific applications on a wide range of remote computing resources. Airavata can be used both by individual scientists to run scientific workflows and by communities of scientists through Web browser interfaces. It is a challenge to bring all of Airavata's capabilities together in the single API layer that is our prerequisite for a 1.0 release. To support our diverse use cases, we have developed a rich data model and messaging format that we need to expose to client developers using many programming languages. We do not believe this is a good match for REST-style services. In this presentation, we describe our use and evaluation of Apache Thrift as an interface and data model definition tool, its use internally in Airavata, and its use to deliver and distribute client development kits.
2. Red Alert: FREE Stuff … FREE as in free beer*
— A 40% discount coupon on any Manning Publications title
— Coupon code: ******** (listen through the end for the decryption key)
— A free ebook: The Programmer's Guide to Apache Thrift, Manning Publications Co. (listen through the end and impress us to win)
* Image source: http://blog.law.cornell.edu/voxpop/files/2012/05/VOX.Chicken.Crossing_FreeBeer.jpg
3. Outline
— Introduction
— Apache Thrift
— Apache Airavata
— Motivation for Airavata to explore Thrift
— Airavata API
— Road to Airavata 1.0
— Airavata Thrift-based architecture
— Experience & lessons learned
— Discussion, Q & A
4. Polyglotism
— Modern distributed applications are rarely composed of modules written in a single language
— Weaving together innovations made in a range of languages is a core competency of successful enterprises
— Cross-language communications are a necessity, not a luxury
** Figure source: The Programmer's Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
5. Apache Thrift
— A high-performance, scalable cross-language serialization and RPC framework
— What?
— Full RPC implementation: Apache Thrift supplies a complete RPC solution: clients, servers, everything but your business logic
— Modularity: Apache Thrift supports plug-in serialization protocols: binary, compact, JSON, or build your own (see the sketch after this slide)
— Multiple end points: plug-in transports for network, disk, and memory end points, making Thrift easy to integrate with other communications and storage solutions like AMQP messaging and HDFS
— Performance: Apache Thrift is fast and efficient, with minimal parsing overhead and minimal size
— Reach: Apache Thrift supports a wide range of languages and platforms: Linux, OSX, Windows, embedded systems, mobile, browser; C++, Go, PHP, Erlang, Haskell, Ruby, Node.js, C#, Java, C, OCaml, Objective-C, D, Perl, Python, Smalltalk, …
— Flexibility: Apache Thrift supports interface evolution; that is, CI environments can roll out new interface features incrementally without breaking existing infrastructure.
* Slide source: Randy Abernethy.
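As a minimal sketch of the pluggable-protocol point (Forecast here is a hypothetical IDL-generated struct with a summary field, not part of Thrift itself), the same object can be serialized with any of the stock protocols just by swapping one constructor:

    import org.apache.thrift.protocol.TCompactProtocol;
    import org.apache.thrift.transport.TMemoryBuffer;

    public class ProtocolSwapDemo {
        public static void main(String[] args) throws Exception {
            TMemoryBuffer buffer = new TMemoryBuffer(256); // memory end point
            Forecast forecast = new Forecast();            // hypothetical generated type
            forecast.setSummary("sunny");
            // Swap in TBinaryProtocol or TJSONProtocol here for a different wire format:
            forecast.write(new TCompactProtocol(buffer));
            System.out.println(buffer.length() + " bytes on the wire");
        }
    }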
6. 1. Define the service interface in IDL
2. Compile the IDL to generate client/server RPC stubs in the desired languages
3. Call the remote functions as if they were local using the client stubs
4. Connect the server stubs to the desired implementation
5. Choose a prebuilt Apache Thrift server to host your service
(figure: the thrift IDL compiler toolchain)
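A minimal sketch of steps 1–3, assuming a hypothetical one-method WeatherService (not the Airavata API): the IDL is shown in the comment, and the Java client calls the remote function through the generated stub as if it were local.

    // weather.thrift (hypothetical), compiled with: thrift --gen java weather.thrift
    //   service WeatherService {
    //     string forecast(1: string city)
    //   }
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class WeatherClient {
        public static void main(String[] args) throws Exception {
            TTransport transport = new TSocket("localhost", 9090); // network end point
            transport.open();
            try {
                // Generated stub: the remote call reads like a local method call.
                WeatherService.Client client =
                    new WeatherService.Client(new TBinaryProtocol(transport));
                System.out.println(client.forecast("Bloomington"));
            } finally {
                transport.close();
            }
        }
    }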
7. Comm schemes
— Streaming: communications characterized by an ongoing flow of bytes from a server to one or more clients.
— Example: an Internet radio broadcast where the client receives bytes over time, transmitted by the server in an ongoing sequence of small packets.
— Messaging: message passing involves one-way asynchronous, often queued, communications, producing loosely coupled systems.
— Example: sending an email message where you may get a response or you may not, and if you do get a response you don't know exactly when you will get it.
— RPC: Remote Procedure Call systems allow function calls to be made between processes on different computers.
— Example: an iPhone app calling a service on the Internet which returns the weather forecast.
Apache Thrift is an efficient cross-platform serialization solution for streaming interfaces.
Apache Thrift provides a complete RPC framework.
** Figure source: The Programmer's Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
8. Architecture
— User code
— client code calls RPC methods and/or [de]serializes objects
— service handlers implement RPC service behavior
— Generated code
— RPC stubs supply client-side proxies and server-side processors
— type serialization code provides serialization for IDL-defined types
— Library code
— servers host user-defined services, managing connections and concurrency
— protocols perform serialization
— transports move bytes from here to there
** Figure source: The Programmer's Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
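To make the three layers concrete, here is a sketch of the server side of the hypothetical WeatherService from the previous example: the handler is user code, the Processor is generated, and the server, transport, and protocol classes come from the Thrift library.

    import org.apache.thrift.server.TServer;
    import org.apache.thrift.server.TSimpleServer;
    import org.apache.thrift.transport.TServerSocket;
    import org.apache.thrift.transport.TServerTransport;

    public class WeatherServer {
        // User code: the handler implements the generated service interface.
        static class WeatherHandler implements WeatherService.Iface {
            @Override
            public String forecast(String city) {
                return "Sunny in " + city;
            }
        }

        public static void main(String[] args) throws Exception {
            TServerTransport serverTransport = new TServerSocket(9090); // library: transport
            TServer server = new TSimpleServer(                         // library: server
                new TServer.Args(serverTransport)
                    .processor(new WeatherService.Processor<>(new WeatherHandler()))); // generated
            server.serve(); // blocks; handles one connection at a time
        }
    }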
9. Thrift Roadmap
— The Thrift framework was originally developed at Facebook and released as open source in 2007. The project became an Apache Software Foundation incubator project in 2008, after which four early versions were released.
— 0.9.1: released 2013-07-16
— 0.9.2: planned 2014-06-01
— 1.0.0: planned 2015-01-01
"It is difficult to make predictions, particularly about the future." -- Mark Twain
Open source, community developed, Apache License Version 2.0
** Figure source: The Programmer's Guide to Apache Thrift, Manning Publications Co.
* Slide source: Randy Abernethy.
10. Thrift Resources
— Web
— thrift.apache.org
— github.com/apache/thrift
— Mail
— Users: user-subscribe@thrift.apache.org
— Developers: dev-subscribe@thrift.apache.org
— Chat
— #thrift
— Book
— Abernethy (2014), The Programmer's Guide to Apache Thrift, Manning Publications Co. [http://www.manning.com/abernethy/] (Chapter 1 is free)
* Slide source: Randy Abernethy.
13. Airavata API
(Architecture figure: science gateways such as CIPRES, NeuroScience, UltraScan, BioVLAB, GAAMP, and ParamChem call the Airavata API for security, launching and managing jobs, saving/loading configurations and provenance data, notification of job or workflow progress, real-time monitoring, executing and managing computations, data ingest and discovery, and data movement. The API is backed by Apache Airavata's Data & Provenance, Job & Workflow, Messaging, and Information subsystems, which target academic and commercial clouds and international grids.)
18. Airavata API in a nutshell
— Airavata provides an API to science gateway developers
— Science gateways => diverse requirements
— The Airavata API needs to be
— easy to understand and simple
— generic enough to support different gateways
— smooth for existing science gateways to integrate with
— flexible for the changes arising from different requirements of science gateways
20. Road to Airavata 1.0
— First approach: SOAP / WS
— Pros
— ability to separate out the context and the payload
— already proven tools and techniques (XML, WSDL, WS-Security, etc.)
— clients can easily generate stubs using WSDLs
— Cons
— heavyweight
— data schema changes cannot be handled easily
— could not support a broad range of clients
21. Road to Airavata 1.0, contd.
— Second approach: REST
— Pros
— flexible data representation (JSON or XML)
— lightweight
— better performance
— Cons
— multiple object layers
— no standard way to describe the service to the client
22. Key issues with the earlier Airavata API
— Only Java clients were developed
— users had to write their own clients for other languages
— The API design was a hybrid model
— making it more complicated internally
— Complex data objects are overwhelming to work with in REST
— lots of marshalling / unmarshalling of data objects
— Unnecessary need for a servlet container
— An overwhelming number of methods, with lots of overloaded variants
23. Road to Airavata 1.0, contd.
— Third approach: RPC-style tools (Apache Thrift, Protocol Buffers, Apache Avro)
— ability to communicate easily across different programming languages
— easy way to generate server and client code
24. Why Apache Thrift
— An Apache project
— For Airavata users, a client library that can consume the API is more important than an open API like REST
— A light framework
— no pile of dependencies, no servlet container
— Easy learning curve to get started
— If carefully crafted, the IDLs and framework support backward and forward compatibility
25. Clean way to define IDLs with richer data structures
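For illustration only (these names are simplified stand-ins, not the actual Airavata data model), a Thrift IDL fragment showing the kind of richer structures the slide refers to: enums, containers, and optional fields with explicit field IDs.

    enum ExperimentState {
      CREATED = 0,
      LAUNCHED = 1,
      COMPLETED = 2,
      FAILED = 3
    }

    struct Experiment {
      1: required string experimentId,
      2: required string name,
      3: optional ExperimentState state,
      4: optional map<string, string> inputs
    }

    service ExperimentService {
      string createExperiment(1: Experiment experiment),
      Experiment getExperiment(1: string experimentId)
    }

The numeric field IDs are what make the backward/forward-compatibility point on the previous slide work: a new optional field can be added under a fresh ID, and older clients and servers simply skip fields they do not recognize.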
26. Apache Thrift Integration with Airavata
● The Airavata API server and the Orchestrator are Thrift services at the moment
● Other components will also become Thrift services in the future
28. Experience: what we gain
— Able to support clients written in different languages
— Clean design
— A choice of server implementations (see the sketch after this list):
— TSimpleServer: a simple single-threaded server
— TThreadPoolServer: uses Java's built-in thread pool management
— TNonblockingServer: a non-blocking TServer implementation
— THsHaServer: an extension of TNonblockingServer into a half-sync/half-async server
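Because the handler and the generated processor are untouched, switching among these servers is a one-line decision. A sketch, assuming the hypothetical WeatherService and WeatherHandler from earlier, meant as a drop-in replacement for the TSimpleServer lines in the previous server example:

    import org.apache.thrift.server.TServer;
    import org.apache.thrift.server.TThreadPoolServer;
    import org.apache.thrift.transport.TServerSocket;

    // Same processor as before; only the hosting server changes.
    TServerSocket serverTransport = new TServerSocket(9090);
    WeatherService.Processor<WeatherService.Iface> processor =
        new WeatherService.Processor<>(new WeatherHandler());

    // One worker thread per connection, drawn from a pool:
    TServer server = new TThreadPoolServer(
        new TThreadPoolServer.Args(serverTransport).processor(processor));
    server.serve();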
29. Experience, contd.: what we gain
— No marshalling / unmarshalling needed: the generated data models are used internally
— The server is very robust: able to handle a large number of concurrent requests
— Easy to modify the models, since the code is auto-generated
— A convenient way to achieve backward compatibility
30. Lessons Learned
— Modularize the data models in order to ease maintenance
— There is no Maven plugin at the moment, so you have to commit the auto-generated code.
32. Lessons Learned, contd.
— Thrift has limited support for handling null values.
— The Experiment model is a complex model containing many other structs.
— Enums cannot be passed as null over the wire, even when they are declared optional in the struct (see the sketch after this list).
— Thrift has limited documentation and samples
— Airavata can be used as a reference: https://github.com/apache/airavata
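Concretely, in the generated Java for the illustrative Experiment struct shown earlier (standard Thrift bean accessors; this is a sketch of the behavior, not Airavata code), an optional enum can only be present or absent, never an explicit null:

    // Setting an optional enum to null in Java is the same as never setting it:
    // the field is simply dropped from the wire, so a receiver cannot tell
    // "null" apart from "absent" and must probe with the generated isSet method.
    Experiment e = new Experiment();
    e.setExperimentId("exp-001");
    e.setName("demo");
    e.setState(null); // equivalent to leaving state unset

    if (e.isSetState()) {
        System.out.println(e.getState());
    } else {
        System.out.println("state absent; an explicit null could not be sent");
    }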
33. Future Directions
— March towards Airavata 1.0 with a stable API
— Exploring next-generation information models and messaging systems
— Kafka, Flume, Fluentd, Scribe, Hedwig, …
— Refactor the Registry implementation
— Airavata architecture to support a multi-tenanted PaaS
— motivated by a downstream open source project: SciGaP, Science Gateways Platform as a Service
34. How you can participate / follow
— Join the party: the Architecture mailing list
— BYOX
— X = opinions, software, ideas, code, yourself, …
35. Take Home
— Two e-books on Thrift
— Awareness of Thrift and Airavata
— Lessons learned
— How to get involved
— We are under the same umbrella for a reason; let's party together
— Join our Architecture list and help
— Other means??