Synthetic APIs Approach Improves Fragmented Data Acquisition for Thomson Reuters’ Content Sharing Platform
Transcript of a BrieﬁngsDirect podcast on how Kapow Software helps a worldwide data
company manage web acquisition in a cost-effective and consistent way.
Listen to the podcast. Find it on iTunes. Sponsor: Kapow Software
Dana Gardner: Hello, and welcome to a special BrieﬁngsDirect discussion series coming to you
from the 2013 Kapow.wow user conference in Redwood Shores, California.
We'll hear how innovative companies are dodging data complexity through the
use of Synthetic APIs. We'll see how from across many different industries and
regions of the globe, inventive companies are able to get the best information
delivered to those who can act on it with speed and at massive scale.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, and I'll be your host
throughout this series of Kapow Software-sponsored BrieﬁngsDirect use case discussions.
[Kapow Software is a sponsor of BrieﬁngsDirect podcasts.]
Our next innovator interview examines the improved data use beneﬁts at Thomson Reuters in London.
Here to explain how improved information integration and delivery can be made into business
success, we're joined by Pedro Saraiva, the product manager for Content Shared Platforms and
Rapid Sourcing at Thomson Reuters. Glad to have you with us.
Pedro Saraiva: Thank you very much. Pleased to meet you.
Gardner: Pedro, you ﬁrst launched Thomson Reuters' content-sharing platform over four years
ago, I'm told, after joining the company in 1996. And the platform
there now enables agile delivery of automated content-acquisition
solutions across a range of content areas.
Saraiva: That's right.
Gardner: Tell me what that really means. What are you delivering and to whom?
Saraiva: It's actually very simple. We're a business that requires a lot of information, a lot of
data because our business is information -- intelligence information, and we need to do that in a
cost-efﬁcient manner. Part of that requires us to have the best technology. When we started four
years ago, one of the most obvious patterns that we found was that we had a lot of fragmentation
of our content acquisition processes where they were based, who was doing them, and more
importantly, what processes they were following or not following.
The opportunity that we immediately saw was to consolidate it all, not just around the central
capability, but into an optimal capability, with real experts around it making it work and
effectively creating a platform as a service (PaaS) for our internal experts in each content area to
perform their tasks just as usual, but faster, better, more reliably, and more consistently.
Fundamentally, we are a platform for web-content acquisition. And that is part of our content-
shared platform because it's all part of a bigger picture, where we take content from so many
sources and many different kinds of sources, and not just web.
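The "Synthetic API" idea at the heart of this platform can be sketched in a few lines: wrap a web page that exposes no programmatic interface behind a function that returns structured records, so downstream systems can consume it like any other data feed. The sketch below is a hypothetical illustration, not Kapow's actual tooling; the page, field names, and parser are all invented for this example.

```python
# A "synthetic API" turns an unstructured web page into a structured,
# callable data source. Here we parse a (hypothetical) price table with
# Python's stdlib HTML parser; in production the HTML would come from a
# live HTTP fetch rather than an inline string.
from html.parser import HTMLParser

class PriceTableParser(HTMLParser):
    """Collects the text of every table cell, row by row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

def fetch_quotes(html: str) -> list:
    """The 'synthetic API': unstructured HTML in, structured records out."""
    parser = PriceTableParser()
    parser.feed(html)
    return [{"symbol": sym, "price": float(price)} for sym, price in parser.rows]

# Inlined page for clarity; a real robot would navigate and download it.
PAGE = ("<table><tr><td>TRI</td><td>38.20</td></tr>"
        "<tr><td>ABC</td><td>12.50</td></tr></table>")
print(fetch_quotes(PAGE))
```

The point is the boundary: callers see a stable, typed function, while all the fragility of navigating and scraping the source stays hidden behind it.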
Gardner: So, your customers are essentially other organizations within Thomson Reuters. Is that right?
Saraiva: That's right. I don't know the exact percentage, but I would guess that about half of
what we do is content management, rather than site technology, per se. And a
lot of those content management tasks are highly specialized because that's the
only way we're going to add value. We're going to understand the content,
where it comes from, what it means, and we are going to present it and
structure it in the best possible way for our customers.
So, the needs of our internal groups and internal content teams are huge, very
demanding, and very specialized. But they all have certain things in common.
We found many of them were using Excel macros or some other technologies to perform their tasks.
We tried to capture what was common, in spite of all that diversity, to leverage the best possible
value from the technology that we have, but also from our know-how, expertise, and best
practices around how to source content, how to be compliant with the required rules, and how to
produce consistent, high-quality data. We could claim to our customers that they could trust our
content, because we know exactly what happened to it from beginning to end.
Gardner: Just for the beneﬁt of our listeners, Thomson Reuters is a large company. Tell us how
large, and tell us some numbers around the number of different units within the company that
you are providing this data to.
Saraiva: We are a large organization. We have about 50,000 employees worldwide in the
majority of countries. For example, our news operations have reporters on the ground throughout the world.
We have all languages represented, both internally and in terms of our customers, and the content
that we provide to our customers. We're a truly diverse organization.
We have a huge number of individual groups organized around the types of customers that we
serve. Are they global? Are they regional? Are they local? Are they large organizations? Are they
small organizations? Are they hedge funds? Are they fund managers? Are they investment
banks? Are they analysts? We have a variety of customers that we serve within each of our
customer organizations around the world.
And that degree of specialty that I mentioned earlier, at some point, has to take shape. It takes
shape in the vast number of different teams we have specializing in one kind of content. It may
be, perhaps, just a language, French or Chinese. It may be fundamentals, versus real-time data.
We have to have the expertise and the centers of excellence for each of those areas, so that we
really understand the content.
Gardner: You had massive redundancy in how people would go about this task of getting
information from the web. It probably was costly. When you decided that you wanted to create a
platform and have a centralized approach to doing this, what were the decisions that you made
around technology? What were some of the hurdles that you had to overcome?
Saraiva: We were looking for a platform that we would be able to support and manage in a cost-
effective manner. We were looking for something that we could trust and rely on. We were
looking for something that our users could make sense of and actually be productive with. So,
that was relatively simple.
The biggest challenge, in my opinion, from the start, was that it's very hard to take a big
organization with an inherently fragmented set of operating units and try to change it by
introducing a single, central capability. It sounds great on paper, but when you start trying
to persuade your users that there's value to them in migrating their current processes, they'll be
concerned that the change is not in their interest.
And there is a degree of psychology at work in trying not only to work with that reluctance that
all businesses have to face, but also to inﬂuence it positively and to demonstrate that the value to
our end users was far in excess of the threat that they perceived.
Gardner: I've heard someone refer to that as having insanely good products. That's going to
change people's behavior. Is that what you've been able to accomplish?
Saraiva: Absolutely. I can think of examples that are truly amazing, in my opinion. One is about
the agility that we've gained through the introduction of technology such as this one, and not just
the use of that technology, but the optimal use of it. Some time ago, before RSA was used in
some departments, we had important customers who had an urgent, desperate need for a piece of
information that we happened not to have, for whatever reason. It happens all the time.
We tried to politely explain that it might take us a while, because it would have to go through a
development team that traditionally builds C++ components. They were a small team and they
were very busy. They had other priorities. Ultimately, that little request, for us, was a small part
of everything we were trying to do. For that customer, it was the most important thing.
The conversation to explain why it was going to take so long, and why we were not giving them the
importance that they deserved was a difﬁcult conversation to have. We wanted to be better than
that. Today, you can build a robot quickly. You can do it and plug it into the architecture that we
have so that the customer can very quickly see it appearing almost real time in their product.
That's an amazing change.
Gardner: So, how did the Kapow platform come to your attention? What was the story behind
your adoption of this?
Saraiva: We spent some time looking at the technologies available. We spoke with a number of
other customers and other people we knew. We did our own research, including a little bit of the
shotgun kind of research that you tend to do on the Internet, trying to ﬁnd what's available. Very
quickly, we had a short list of ﬁve technologies or so.
All of them promised to be great, but ultimately, they had to pass the acid test, which was an
evaluation by our technical operations experts. Is this something that we are able to run?
And also in terms of the capabilities we were expecting. They were quite demanding, because we
had a variety of users that we needed to cater to.
But ultimately, most importantly, we needed the conﬁdence that we could get our job done. If we
are going to invest in a given technology, we want to know that it can be used to solve a given
kind of problem without too much fuss, complexity, or delay, because if that doesn't happen, you
have a problem. You have only partially achieved the promise, and you will forever be chasing
alternatives to ﬁll that gap.
Kapow absolutely gives us that kind of conﬁdence. Our developers, who at ﬁrst had a little bit of
skepticism about the ability of a tool to be so amazing, tried it. After the ﬁrst robot, typically,
their reaction was "Wow." They love it, because they know they can do their job. And that's what
we all want. We want to be able to do our jobs. Our customers want to use our products to do
their jobs. We're all in the same kind of game. We just need to be very, very good at what we do.
Kapow gave us that.
Gardner: Approximately how long have you been using Kapow? Do you have any metrics that
might give an indication of what the beneﬁts are? Maybe it's a reduced number of developer
hours, or rapid turnaround in creating robots that get you the information you want. Any sense of the payoff?
Saraiva: Perhaps the most interesting examples are those about web sources that were
critically important to us and that, until we were able to leverage Kapow, we just couldn't acquire.
It was not even a matter of it taking a long time. We were not able to do it at all. With Kapow, it was a
straightforward process. We just click, follow the process that really mirrors a complex workﬂow
in the ﬂow chart that we designed, and the job is done.
In terms of the rapid development of the solutions, it was at least a reduction from several
months to weeks. And this is typical. You have cases where it's much faster. You have cases
where it's slower, because there are complex, high-risk automation processes that we need to take
some time to test. But the development process is shortened dramatically.
Gardner: We're here at the Kapow User Summit. We've been hearing about newer versions, the
Kapow platform 9.2. Is there anything in particular that you've heard here so far that has piqued
your interest? Something you might be able to apply to some of these problems right away?
Saraiva: A lot of what we've been doing and focusing on over the last four years was around a
pattern whereby we have data ﬂowing into the company, being processed and transformed. We're
adding our value, and it's ﬂowing out to our customers. There is, however, another type of web
sourcing and acquisition that we're now beginning to work with, which is more interactive. It's
more about the unpredictable, unplanned need for information on demand.
There, interestingly, we have the problem of integrating the button that produces that fetch for
data into the end-user workﬂows. That was something that was not possible with previous
versions of Kapow or not straightforward. We would have to build our own interfaces, our own
queues, and our own API to interface with the robo-server.
Now, with Kapplets, it all looks very straightforward, because we can easily see that we could
have an arbitrary, optimized workﬂow solution or tool for some of our users that happens to
embed a Kapplet. That allows a user to perform research on demand, perhaps on a customer or
on a company, for the kind of data that we wouldn't traditionally acquire on a constant, ﬁxed basis.
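The on-demand pattern described here, an end-user workflow embedding a button that triggers a fetch, can be sketched as a request queue in front of an acquisition worker. All names below (run_robot, research_on_demand) are invented for illustration; this is not Kapow's Kapplet or RoboServer API.

```python
# Hedged sketch of on-demand acquisition: user-facing code drops a request
# on a queue, a background worker runs the acquisition "robot", and the
# result comes back to the caller through a Future.
import queue
import threading
from concurrent.futures import Future

def run_robot(company: str) -> dict:
    # Stand-in for an actual web-acquisition robot run.
    return {"company": company, "profile": f"fetched profile for {company}"}

requests: queue.Queue = queue.Queue()

def worker():
    # Daemon worker: pull (request, future) pairs and fulfil them.
    while True:
        company, fut = requests.get()
        fut.set_result(run_robot(company))

threading.Thread(target=worker, daemon=True).start()

def research_on_demand(company: str) -> dict:
    """The 'button' an end-user workflow calls for an ad-hoc fetch."""
    fut: Future = Future()
    requests.put((company, fut))
    return fut.result(timeout=5)  # block until the robot finishes

print(research_on_demand("Acme Corp"))
```

The queue decouples the user's click from the robot's runtime, which is the integration work Saraiva says previously had to be built by hand.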
Gardner: Looking to the future about deployments, we heard the possibility of a cloud version
of Kapow. How would you prefer to move in the future on deployments? It sounds as if the
direction of bridging organizational boundaries continues for you, maybe delivering this to
mobile devices. Would having a cloud-based set of Kapow platform services make sense?
Saraiva: Over time, things keep changing. The main advantage of a cloud-based service running
Kapow would be in freeing us from the hassle of having to manage our own infrastructure.
Although currently, we run a relatively standard, low-scale infrastructure, it's always a cost, an
overhead, and an extra worry. You have to conﬁgure networks.
And you have to worry about security. You have to ensure that things are being monitored and
that you respond to alarms and so on. In theory, if we were able to get exactly the same service
that we now have internally based in the cloud, we could scale it much more transparently
without much planning. That would deﬁnitely give us an advantage.
So, right now, I'm beginning to think about that precise question. For the next few years, are we
going to have just hosted infrastructure at our premises, or are we going to begin leveraging the
cloud properly? Because then we can focus on what we really want, which is to get value out of the data.
Gardner: I'm afraid we're about out of time, but quickly, now that you've been doing this for
some time, do you have any advice that you might offer to others who are grappling with similar
issues around multiple data sources, not being able to use APIs, or needing a Synthetic API
approach? What lessons have you learned that you might be able to share?
Saraiva: I suppose the most important message I would want to share is about conﬁdence in
technology. When I started this, I had worked for years in technology, many of those years in
web technology, some complex web technology. And yet, when I started thinking about web
content acquisition, I didn't really think it could be done very well.
I thought this was going to be a challenge, which is partly the reason why I was interested in it.
And I've been amazed at what is possible with technologies such as Kapow. So, my message
would be don't worry that technology such as Kapow will not be able to do the job for you. Don't
fear that you will be better off using your own bespoke C++ based solution. Go for it, because it
really works. Go for it and make the most of it, because you will need it with so much data,
especially on the Internet. You have to have that.
Gardner: I’m afraid we’ll have to leave it there. We've been talking about how Thomson
Reuters in London has improved information integration and delivery using Kapow technology
and a Synthetic APIs approach to gain signiﬁcant business beneﬁts.
Please join me in thanking our guest, Pedro Saraiva, the product manager for Content Shared
Platforms and Rapid Sourcing at Thomson Reuters. Thanks for being on BrieﬁngsDirect.
And thanks to our audience for joining this special discussion, coming to you from the 2013
Kapow.wow user conference in Redwood Shores, California.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host throughout this series of
Kapow Software-sponsored BrieﬁngsDirect discussions. Thanks for listening, and come back next time.
Copyright Interarbor Solutions, LLC, 2005-2013. All rights reserved.