Fast Acquisition of Diverse Unstructured Data Sources Makes HPE Tools a Star at LogitBot

Fast Acquisition of Diverse Unstructured Data Sources
Makes HPE Tools a Star at LogitBot
Transcript of a discussion on how high-performing big-data analysis powers an innovative
artificial intelligence-based investment opportunity.
Listen to the podcast. Find it on iTunes. Get the mobile app. Sponsor: Hewlett
Packard Enterprise.
Dana Gardner: Hello, and welcome to the next edition to the Hewlett Packard Enterprise
(HPE) Voice of the Customer podcast series. I’m Dana Gardner, Principal Analyst at Interarbor
Solutions, your host and moderator for this ongoing discussion on digital
transformation. Stay with us now to learn how agile businesses are fending off
disruption in favor of innovation.
Our next case study highlights how high-performing big-data analysis powers
an innovative artificial intelligence (AI)-based investment opportunity and
evaluation tool. We'll learn how LogitBot in New York identifies, manages, and
contextually categorizes truly massive and diverse data sources.
By leveraging entity recognition APIs, LogitBot not only provides investment evaluations from
across these data sets, it delivers the analysis as natural-language information directly into
spreadsheets as the delivery endpoint. This is a prime example of how complex cloud-to core-to
edge processes and benefits can be managed and exploited using the most responsive big-data
APIs and services.
To describe how a virtual assistant for targeting investment opportunities is being supported by
cloud-based big-data services we're joined by Mutisya Ndunda, Founder and CEO of LogitBot in
New York. Welcome.
Mutisya Ndunda: Thank you so much for having us.
Gardner: We're also here with Michael Bishop, CTO of LogicBot. Welcome, Michael.
Michael Bishop: Thank you for having us. It’s good to be here.
Humanization of Machine Learning
For Big Data Success
Learn More
1
Gardner

Gardner: Let’s look at some of the trends driving your need to do what you're doing with AI and
bots, bringing together data, and then delivering it in the format that people want most. What’s
the driver in the market for doing this?
Ndunda: LogitBot is all about trying to eliminate friction between people who have very high-
value jobs and some of the more mundane things that could be automated by AI.
Today, in finance, the industry, in general, searches for investment opportunities
using techniques that have been around for over 30 years. What tends to happen
is that the people who are doing this should be spending more time on strategic
thinking, ideation, and managing risk. But without AI tools, they tend to get
bogged down in the data and in the day-to-day. So, we've decided to help them
tackle that problem.
Gardner: Let the machines do what the machines do best. But how do we decide where the
demarcation is between what the machines do well and what the people do well, Michael?
Bishop: We believe in empowering the user and not replacing the user. So, the machine is able to
go in-depth and do what a high-performing analyst or researcher would do at scale, and it does
that every day, instead of once a quarter, for instance, when research analysts would revisit an
equity or a sector. We can do that constantly, react to events as they happen, and replicate what a
high-performing analyst is able to do.
Gardner: It’s interesting to me that you're not only taking a vast amount of data and putting it
into a useful format and qualitative type, but you're delivering it in a way that’s demanded in the
market, that people want and use. Tell me about this core value and then the edge value and how
you came to decide on doing it the way you do?
Evolutionary process
Ndunda: It’s an evolutionary process that we've embarked on or are going through. The
industry is very used to doing things in a very specific way, and AI isn't something that a lot of
people are necessarily familiar within financial services. We decided to wrap it around things that
are extremely intuitive to an end user who doesn't have the time to learn technology.
So, we said that we'll try to leverage as many things as possible in the back via APIs and all
kinds of other things, but the delivery mechanism in the front needs to be as simple or as
frictionless as possible to the end user. That’s our core principle.
Bishop: Finance professionals generally don't like black boxes and mystery, and obviously, when
you're dealing with money, you don’t want to get an answer out of a machine you can’t
understand. Even though we're crunching a lot of information and making a lot of inferences, at
2
Ndunda

the end of the day, they could unwind it themselves if they wanted to verify the inferences that
we have made.
We're wrapping up an incredibly complicated amount of information, but it still
makes sense at the end of the day. It’s still intuitive to someone. There's not a
sense that this is voodoo under the covers.
Gardner: Well, let’s pause there. We'll go back to the data issues and the user-
experience issues, but tell us about LogitBot. You're a startup, you're in New
York, and you're focused on Wall Street. Tell us how you came to be and what
you do, in a more general sense.
Ndunda: Our professional background has always been in financial services. Personally, I've
spent over 15 years in financial services, and my career led me to what I'm doing today.
In the 2006-2007 timeframe, I left Merrill Lynch to join a large proprietary market-making
business called Susquehanna International Group. They're one of the largest providers of
liquidity around the world. Chances are whenever you buy or sell a stock, you're buying from or
selling to Susquehanna or one of its competitors.
What had happened in that industry was that people were embracing technology, but it was
algorithmic trading, what has become known today as high-frequency trading. At Susquehanna,
we resisted that notion, because we said machines don't necessarily make decisions well, and this
was before AI had been born.
Internally, we went through this period where we had a lot of discussions around, are we losing
out to the competition, should we really go pure bot, more or less? Then, 2008 hit and our
intuition of allowing our traders to focus on the risky things and then setting up
machines to trade riskless or small orders paid off a lot for the firm; it
was the best year the firm ever had, when everyone else was falling
apart.
That was the first piece that got me to understand or to start thinking about how you can
empower people and financial professionals to do what they really do well and then not get
bogged down in the details.
Then, I joined Bloomberg and I spent five years there as the head of strategy and business
development. The company has an amazing business, but it's built around the notion of static
data. What had happened in that business was that, over a period of time, we began to see the
marketplace valuing analytics more and more.
3
Bishop

Make a distinction
Part of the role that I was brought in to do was to help them unwind that and decouple the two
things -- to make a distinction within the company about static information versus analytical or
valuable information. The trend that we saw was that hedge funds, especially the ones that
were employing systematic investment strategies, were beginning to do two things, to embrace
AI or technology to empower your traders and then also look deeper into analytics versus static
data.
That was what brought me to LogitBot. I thought we could do it really well, because the players
themselves don't have the time to do it and some of the vendors are very stuck in their traditional
business models.
Bishop: We're seeing a kind of renaissance here, or we're at a pivotal moment, where we're
moving away from analytics in the sense of business reporting tools or understanding yesterday.
We're now able to mine data, get insightful, actionable information out of it, and then move into
predictive analytics. And it's not just statistical correlations. I don’t want to offend any quants,
but a lot of technology [to further analyze information] has come online recently, and more is
coming online every day.
For us, Google had released TensorFlow, and that made a substantial difference in our ability to
reason about natural language. Had it not been for that, it would have been very difficult one year
ago.
At the moment, technology is really taking off in a lot of areas at once. That enabled us to move
from static analysis of what's happened in the past and move to insightful and actionable
information.
Ndunda: What Michael kind of touched on there is really important. A lot of traditional ways of
looking at financial investment opportunities is to say that historically, this has happened. So,
history should repeat itself. We're in markets where nothing that's happening today has really
happened in the past. So, relying on a backward-looking mechanism of trying to interpret the
future is kind of really dangerous, versus having a more grounded approach that can actually
incorporate things that are nontraditional in many different ways.
So, unstructured data, what investors are thinking, what central bankers are saying, all of those
are really important inputs, one part of any model 10 or 20 years ago. Without machine learning
and some of the things that we are doing today, it’s very difficult to incorporate any of that and
make sense of it in a structured way.
Gardner: So, if the goal is to make outlier events your friend and not your enemy, what data do
you go to to close the gap between what's happened and what the reaction should be, and how do
4

you best get that data and make it manageable for your AI and machine-learning capabilities to
exploit?
Ndunda: Michael can probably add to this as well. We do not discriminate as far as data goes.
What we like to do is have no opinion on data ahead of time. We want to get as much
information as possible and then let a scientific process lead us to decide what data is actually
useful for the task that we want to deploy it on.
As an example, we're very opportunistic about acquiring information about who the most
important people at companies are and how they're connected to each other. Does this guy work
on a board with this or how do they know each other? It may not have any application at that
very moment, but over the course of time, you end up building models that are actually really
interesting.
We scan over 70,000 financial news sources. We capture news information across the world. We
don't necessarily use all of that information on a day-to-day basis, but at least we have it and we
can decide how to use it in the future.
We also monitor anything that companies file and what management teams talk about at investor
conferences or on phone conversations with investors.
Bishop: Conference calls, videos, interviews.
Audio to text
Ndunda: HPE has a really interesting technology that they have recently put out. You can
transcribe audio to text, and then we can apply our text processing on top of that to understand
what management is saying in a structural, machine-based way. Instead of 50 people listening to
50 conference calls you could just have a machine do it for you.
Gardner: Something we can do there that we couldn't have done before is that you can also
apply something like sentiment analysis, which you couldn’t have done if it was a document, and
that can be very valuable.
Bishop: Yes, even tonal analysis. There are a few theories on that, that may or may not pan out,
but there are studies around tone and cadence. We're looking at it and we will see if it actually
pans out.
Gardner: And so do you put this all into your own on-premises data-center warehouse or do you
take advantage of cloud in a variety of different means by which to corral and then analyze this
data? How do you take this fire hose and make it manageable?
5

Bishop: We do take advantage of the cloud quite aggressively. We're split between SoftLayer and
Google. At SoftLayer we have bare-metal hardware machines and some power machines with
high-power GPUs.
Learn More
On the Google side, we take advantage of Bigtable and BigQuery and some of their
infrastructure tools. And we have good, old PostgreSQL in there, as well as DataStax, Cassandra,
and their Graph as the graph engine. We make liberal use of HPE Haven APIs as well and
TensorFlow, as I mentioned before. So, it’s a smorgasbord of things you need to corral in order to
get the job done. We found it very hard to ﬁnd all of that wrapped in a bow with one provider.
We're big proponents of Kubernetes and Docker as well, and we leverage that to avoid lock-in
where we can. Our workload can migrate between Google and the SoftLayer Kubernetes cluster.
So, we can migrate between hardware or virtual machines (VMs), depending on the horsepower
that’s needed at the moment. That's how we handle it.
Gardner: So, maybe 10 years ago you would have been in a systems-integration capacity, but
now you're in a services-integration capacity. You're doing some very powerful things at a clip
and probably at a cost that would have been impossible before.
Bishop: I certainly remember placing an order for a server, waiting six months, and then setting
up the RAID drives. It's amazing that you can just flick a switch and you get a very high-
powered machine that would have taken six months to order previously. In Google, you spin up a
VM in seconds. Again, that's of a horsepower that would have taken six months to get.
Gardner: So, unprecedented innovation is now at our fingertips when it comes to the IT side of
things, unprecedented machine intelligence, now that the algorithms and APIs are driving the
opportunity to take advantage of that data.
Let's go back to thinking about what you're outputting and who uses that. Is the investment result
that you're generating something that goes to a retail type of investor? Is this something you're
selling to investment houses or a still undetermined market? How do you bring this to market?
Natural language interface
Ndunda: Roboto, which is the natural-language interface into our analytical tools, can be
custom tailored to respond, based on the user's level of financial sophistication.
6

At present, we're trying them out on a semiprofessional investment platform, where people are
professional traders, but not part of a major brokerage house. They obviously want to get trade
ideas, they want to do analytics, and they're a little bit more sophisticated than people who are
looking at investments for their retirement account. Rob can be tailored for that specific use
case.
He can also respond to somebody who is managing a portfolio at a hedge fund. The level of
depth that he needs to consider is the only differential between those two things.
In the back, he may do an extra five steps if the person asking the question worked at a hedge
fund, versus if the person was just asking about why is Apple up today. If you're a retail investor,
you don’t want to do a lot of in-depth analysis.
Bishop: You couldn’t take the app and do anything with it or understand it.
Ndunda: Rob is an interface, but the analytics are available via multiple venues. So, you can
access the same analytics via an API, a chat interface, the web, or a feed that streams into you. It
just depends on how your systems are set up within your organization. But, the data always will
be available to you.
Gardner: Going out to that edge equation, that user experience, we've talked about how you
deliver this to the endpoints, customary spreadsheets, cells, pivots, whatever. But it also sounds
like you are going toward more natural language, so that you could query, rather than a deep
SQL environment, like what we get with a Siri or the Amazon Echo. Is that where we're heading?
Bishop: When we started this, trying to parameterize everything that you could ask into enough
checkboxes and forums pollutes the screen. The system has access to an enormous amount of
data that you can't create a parameterized screen for. We found it was a bit of a breakthrough
when we were able to start using natural language.
TensorFlow made a huge difference here in natural language understanding, understanding the
intent of the questioner, and being able to parameterize a query from that. If our initial findings
here pan out or continue to pan out, it's going to be a very powerful interface.
I can't imagine having to go back to a SQL query if you're able to do it natural language, and it
really pans out this time, because we’ve had a few turns of the handle of alleged natural-
language querying.
Gardner: And always a moving target. Tell us specifically about SentryWatch and Precog. How
do these shake out in terms of your go-to-market strategy?
7

How everything relates
Ndunda: One of the things that we have to do to be able to answer a lot of questions that our
customers may have is to monitor financial markets and what's impacting them on a continuous
basis. SentryWatch is literally a byproduct of that process where, because we're monitoring over
70,000 financial news sources, we're analyzing the sentiment, we're doing deep text analysis on
it, we're identifying entities and how they're related to each other, in all of these news events, and
we're sticking that into a knowledge graph of how everything relates to everything else.
It ends up being a really valuable tool, not only for us, but for other people, because while we're
building models. there are also a lot of hedge funds that have proprietary models or proprietary
processes that could benefit from that very same organized relational data store of news. That's
what SentryWatch is and that's how it's evolved. It started off with something that we were doing
as an import and it's actually now a valuable output or a standalone product.
Precog is a way for us to showcase the ability of a machine to be predictive and not be backward
looking. Again, when people are making investment decisions or allocation of capital across
different investment opportunities, you really care about your forward return on your
investments. If I invested a dollar today, am I likely to make 20 cents in profit tomorrow or 30
cents in profit tomorrow?
We're using pretty sophisticated machine-learning models that can take into account unstructured
data sources as part of the modeling process. That will give you these forward expectations about
stock returns in a very easy-to-use format, where you don't need to have a PhD in physics or
mathematics.
You just ask, "What is the likely return of Apple over the next six months," taking into account
what's going on in the economy. Apple was fined $14 billion. That can be quickly added into a
model and reflect a new view in a matter of seconds versus sitting down in a spreadsheet and
trying to figure out how it all works out.
Gardner: Even for Apple, that's a chunk of change.
Bishop: It's a lot money, and you can imagine that there were quite a few analysts on Wall Street
in Excel, updating their models around this so that they could have an answer by the end of the
day, where we already had an answer.
Gardner: How do the Haven OnDemand APIs help the Precog when it comes to deciding those
sources, getting them in the right format, so that you can exploit?
Ndunda: The beauty of the platform is that it simplifies a lot of development processes that an
organization of our size would have to take on themselves.
8

The nice thing about it is that a drag-and-drop interface is really intuitive; you don't need to be
specialized in Java, Python, or whatever it is. You can set up your intent in a graphical way, and
then test it out, build it, and expand it as you go along. The Lego-block structure is really useful,
because if you want to try things out, it's drag and drop, connect the dots, and then see what you
get on the other end.
For us, that's an innovation that we haven't seen with anybody else in the marketplace and it cuts
development time for us significantly.
Gardner: Michael, anything more to add on how this makes your life a little easier?
Lowering cost
Bishop: For us, lowering the cost in time to run an experiment is very important when you're
running a lot of experiments, and the Combinations product enables us to run a lot of varied
experiments using a variety of the Haven APIs in different combinations very quickly. You're
able to get your development time down from a week, two weeks, whatever it is to wire up an
API to assist them.
In the same amount of time, you're able to wire the initial connection and then you have access to
pretty much everything in Haven. You turn it over to either a business user, a data scientist, or a
machine-learning person, and they can drag and drop the connectors themselves. It makes my
life easier and it makes the developers’ lives easier because it gets back time for us.
Gardner: So, not only have we been able to democratize the querying, moving from SQL to
natural language, for example, but we’re also democratizing the choice on sources and
combinations of sources in real time, more or less for different types of analyses, not just the
query, but the actual source of the data.
Bishop: Correct.
Ndunda: Again, the power of a lot of this stuff is in the unstructured world, because valuable
information typically tends to be hidden in documents. In the past, you'd have to have a team of
people to scour through text, extract what they thought was valuable, and summarize it for you.
You could miss out on 90 percent of the other valuable stuff that's in the document.
With this ability now to drag and drop and then go through a document in five different iterations
by just tweaking, a parameter is really useful.
Gardner: So those will be IDOL-backed APIs that you are referring to.
9

Ndunda: Exactly.
Bishop: It’s something that would be hard for an investment bank, even a few years ago, to
process. Everyone is on the same playing field here or starting from the same base, but dealing
with unstructured data has been traditionally a very difficult problem. You have a lot
technologies coming online as APIs; at the same time, they're also coming out as traditional on-
premises [software and appliance] solutions.
We're all starting from the same gate here. Some folks are little ahead, but I'd say that Facebook
is further ahead than an investment bank in their ability to reason over unstructured data. In our
world, I feel like we're starting basically at the same place that Goldman or Morgan would be.
Gardner: It's a very interesting reset that we’re going through. It's also interesting that we talked
earlier about the divide between where the machine and the individual knowledge worker begins
or ends, and that's going to be a moving target. Do you have any sense of how that changes its
characterization of what the right combination is of machine intelligence and the best of human
intelligence?
Empowering humans
Ndunda: I don’t foresee machines replacing humans, per se. I see them empowering humans,
and to the extent that your role is not completely based on a task, if it's based on something
where you actually manage a process that goes from one end to another, those particular
positions will be there, and the machines will free our people to focus on that.
But, in the case where you have somebody who is really responsible for something that can be
automated, then obviously that will go away. Machines don't eat, they don’t need to take
vacation, and if it’s a task where you don't need to reason about it, obviously you can have a
computer do it.
What we're seeing now is that if you have a machine sitting side by side with a human, and the
machine can pick up on how the human reasons with some of the new technologies, then the
machine can do a lot of the grunt work, and I think that’s the future of all of this stuff.
Bishop: What we're delivering is that we distill a lot of information, so that a knowledge worker
or decision-maker can make an informed decision, instead of watching CNBC and being a
single-source reader. We can go out and scour the best of all the information, distill it down, and
present it, and they can choose to act on it.
Our goal here is not to make the next jump and make the decision. Our job is to present the
information to a decision-maker.
10

Gardner: It certainly seems to me that the organization, big or small, retail or commercial, can
make the best use of this technology. Machine learning, in the end, will win.
Ndunda: Absolutely. It is a transformational technology, because for the first time in a really
long time, the reasoning piece of it is within grasp of machines. These machines can operate in
the gray area, which is where the world lives.
Gardner: And that gray area can almost have unlimited variables applied to it.
Ndunda: Exactly. Correct.
Learn More
Gardner: I'm afraid we'll have to leave it there. We've been exploring how high-performing big-
data analysis powers an innovative artiﬁcial intelligence-based investment opportunity in a
valuation tool, and we've learned how LogitBot in New York identiﬁes, manages, and
contextually categorizes truly massive and diverse data sources.
So please join me in thanking our guests. We've been here with Mutisya Ndunda, Founder and
CEO of LogitBot in New York. Thank you, sir.
Ndunda: It was a pleasure. Thank you so much.
Gardner: We've also been here with Michael Bishop, CTO of LogicBot. Thank you, Michael.
Bishop: Thank you, Dana.
Gardner: And a big thank you as well to our audience for joining us for this Hewlett-Packard
Enterprise, Voice of the Customer digital transformation discussion.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of
HPE sponsored interviews. Thanks again for listening, and do come back next time.
Listen to the podcast. Find it on iTunes. Get the mobile app. Sponsor: Hewlett
Packard Enterprise.
Transcript of a discussion on how high-performing big-data analysis powers an innovative
artificial intelligence-based investment opportunity. Copyright Interarbor Solutions, LLC,
2005-2016. All rights reserved.
You may also be interested in:
11

• WWT took an enterprise Tower of Babel and delivered comprehensive intelligent search
• How Software-defined Storage Translates into Just-In-Time Data Center Scaling
• Big data enables top user experiences and extreme personalization for Intuit TurboTax
• Feedback loops: The confluence of DevOps and big data
• Spirent leverages big data to keep user experience quality a winning factor for telcos
• Powerful reporting from YP's data warehouse helps SMBs deliver the best ad campaigns
• IoT brings on development demands that DevOps manages best, say experts
• Big data generates new insights into what’s happening in the world's tropical ecosystems
• DevOps and security, a match made in heaven
• How Sprint employs orchestration and automation to bring IT into DevOps readiness
• How fast analytics changes the game and expands the market for big data value
• How HTC centralizes storage management to gain visibility and IT disaster avoidance
• Big data, risk, and predictive analysis drive use of cloud-based ITSM, says panel
• Rolta AdvizeX experts on hastening big data analytics in healthcare and retail
• The future of business intelligence as a service with GoodData and HP Vertica
12

Fast Acquisition of Diverse Unstructured Data Sources Makes HPE Tools a Star at LogitBot

Recommended

Recommended

More Related Content

Viewers also liked

Viewers also liked (14)

Recently uploaded

Recently uploaded (20)

Fast Acquisition of Diverse Unstructured Data Sources Makes HPE Tools a Star at LogitBot