The Open Group Conference Plenary Speaker Sees Big-Data Analytics as a Way to Bolster Quality, Manufacturing, and Business Processes
Transcript of a BriefingsDirect podcast on how Ford Motor Company is harnessing multiple data streams as a way to improve products and operations.

Listen to the podcast. Find it on iTunes. Sponsor: The Open Group

Dana Gardner: Hello, and welcome to a special BriefingsDirect thought leadership interview series coming to you in conjunction with The Open Group Conference on January 28 in Newport Beach, California. I’m Dana Gardner, Principal Analyst at Interarbor Solutions, and I’ll be your host and moderator throughout these business transformation discussions. The conference will focus on big data and the transformation we need to embrace today.

We are here now with one of the main speakers at the conference: Michael Cavaretta, PhD, Technical Leader of Predictive Analytics for Ford Research and Advanced Engineering in Dearborn, Michigan.

We’ll see how Ford has exploited the strengths of big data analytics by directing them internally to improve business results. In doing so, they scour the metrics from the company’s best processes across myriad manufacturing efforts and through detailed outputs from in-use automobiles, all to improve and help transform their business. [Disclosure: The Open Group is a sponsor of BriefingsDirect podcasts.]

Cavaretta has led multiple data-analytic projects at Ford to break down silos inside the company to best define Ford’s most fruitful datasets. Ford has successfully aggregated customer feedback, and extracted all the internal data to predict how best new features in technologies will improve their cars.

As a lead-in to his Open Group presentation, Michael and I will now explore how big data is fostering business transformation by allowing deeper insights into more types of data efficiently, and thereby improving processes, quality control, and customer satisfaction.

With that, please join me in welcoming Michael Cavaretta.
Welcome to BriefingsDirect, Michael.

Michael Cavaretta: Thank you very much.
Gardner: Your upcoming presentation for The Open Group Conference is going to describe some of these new approaches to big data and how that offers some valuable insights into internal operations, and therefore making a better product. To start, what’s different now in being able to get at this data and do this type of analysis from, say, five years ago?

Cavaretta: The biggest difference has to do with the cheap availability of storage and processing power, where a few years ago people were very much concentrated on filtering down the datasets that were being stored for long-term analysis. There has been a big sea change with the idea that we should just store as much as we can and take advantage of that storage to improve business processes.

Gardner: That sounds right on the money, but how do we get here? How do we get to the point where we could start using these benefits from a technology perspective, as you say, better storage, networks, being able to move big datasets, that sort of thing, to wrenching out benefits? What’s the process behind the benefit?

Sea change in attitude

Cavaretta: The process behind the benefits has to do with a sea change in the attitude of organizations, particularly IT within large enterprises. There’s this idea that you don’t need to spend so much time figuring out what data you want to store and worry about the cost associated with it, and more about data as an asset. There is value in being able to store it, and being able to go back and extract different insights from it. This really comes from this really cheap storage, access to parallel processing machines, and great software.

Gardner: It seems to me that for a long time, the mindset was that data is simply the output from applications, with applications being primary and the data being almost an afterthought. It seems like we sort of flipped that. The data now is perhaps as important, even more important, than the applications.
Does that seem to hold true?

Cavaretta: Most definitely, and we’ve had a number of interesting engagements where people have thought about the data that’s being collected. When we talk to them about big data, storing everything at the lowest level of transactions, and what could be done with that, their eyes light up and they really begin to get it.

Gardner: I suppose earlier, when cost considerations and technical limitations were at work, we would just go for a tip-of-the-iceberg level. Now, as you say, we can get almost all the data. So, is this a matter of getting at more data, different types of data, bringing in unstructured data, all the above? How much are you really going after here?

Cavaretta: I like to talk to people about the possibility that big data provides and I always tell them that I have yet to have a circumstance where somebody is giving me too much data. You
can pull in all this information and then answer a variety of questions, because you don’t have to worry that something has been thrown out. You have everything.

You may have 100 questions, and each one of the questions uses a very small portion of the data. Those questions may use different portions of the data, a very small piece, but they’re all different. If you go in thinking, "We’re going to answer the top 20 questions and we’re just going to hold data for that," that leaves so much on the table, and you don’t get any value out of it.

Gardner: I suppose too that we can think about small samples or small datasets and aggregate them or join them. We have new software capabilities to do that efficiently, so that we’re able to not just look for big honking, original datasets, but to aggregate, correlate, and look for a lifecycle level of data. Is that fair as well?

Cavaretta: Definitely. We’re a big believer in mash-ups and we really believe that there is a lot of value in being able to take even datasets that are not specifically big-data sizes yet, and then not go deep, not get more detailed information, but expand the breadth. So it’s being able to augment it with other internal datasets, bridging across different business areas as well as augmenting it with external datasets.

A lot of times you can take something that is maybe a few hundred thousand records or a few million records, and then by the time you’re joining it, and appending different pieces of information onto it, you can get the big dataset sizes.

Gardner: Just to be clear, you’re unique. The conventional wisdom for big data is to look at what your customers are doing, or just the external data. You’re really looking primarily at internal data, while also availing yourself of what external data might be appropriate.
Maybe you could describe a little bit about your organization, what you do, and why this internal focus is so important for you.

Internal consultants

Cavaretta: I’m part of a larger department that is housed over in the research and advanced-engineering area at Ford Motor Company, and we’re about 30 people. We work as internal consultants, kind of like Capgemini or Ernst & Young, but only within Ford Motor Company. We’re responsible for going out and looking for different opportunities from the business perspective to bring advanced technologies. So, we’ve been focused on the area of statistical modeling and machine learning for I’d say about 15 years or so.

And in this time, we’ve had a number of engagements where we’ve talked with different business customers, and people have said, "We’d really like to do this." Then, we’d look at the datasets that they have, and say, "Wouldn’t it be great if we would have had this. So now we have to wait six months or a year."
These new technologies are really changing the game from that perspective. We can turn on the complete fire-hose, and then say that we don’t have to worry about that anymore. Everything is coming in. We can record it all. We don’t have to worry about if the data doesn’t support this analysis, because it’s all there. That’s really a big benefit of big-data technologies.

Gardner: If you’ve been doing this for 15 years, you must be demonstrating a return on investment (ROI) or a value proposition back to Ford. Has that value proposition been changing? Do you expect it to change? What might be your real value proposition two or three years from now?

Cavaretta: The real value proposition definitely is changing as things are being pushed down in the company to lower-level analysts who are really interested in looking at things from a data-driven perspective. From when I first came in to now, the biggest change has been when Alan Mulally came into the company, and really pushed the idea of data-driven decisions.

Before, we were getting a lot of interest from people who are really very focused on the data that they had internally. After that, they had a lot of questions from their management and from upper-level directors and vice presidents saying, "We’ve got all these data assets. We should be getting more out of them." This strategic perspective has really changed a lot of what we’ve done in the last few years.

Gardner: As I listen to you, Michael, it occurs to me that you are applying this data-driven mentality more deeply. As you pointed out earlier, you’re also going after all the data, all the information, whether that’s internal or external.

In the case of an automobile company, you’re looking at the factory, the dealers, what drivers are doing, what the devices within the automobile are telling you, factoring that back into design relatively quickly, and then repeating this process.
Are we getting to the point where this sort of Holy Grail notion of a total feedback loop across the lifecycle of a major product like an automobile is really within our grasp? Are we getting there, or is this still kind of theoretical? Can we pull it all together and make it a science?

Cavaretta: The theory is there. The question has more to do with the actual implementation and the practicality of it. We still are talking about a lot of data where even with new advanced technologies and techniques that’s a lot of data to store, it’s a lot of data to analyze, there’s a lot of data to make sure that we can mash-up appropriately.

And, while I think the potential is there and I think the theory is there, there is also work in being able to get the data from multiple sources. So everything which you can get back from the vehicle, fantastic. Now if you marry that up with internal data, is it survey data, is it manufacturing data, is it quality data? What are the things you want to go after first? We can’t do everything all at the same time.
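As an illustrative aside on "marrying up" vehicle data with internal datasets, the following is a minimal sketch in Python of a mash-up that widens each record rather than adding more rows. Everything in it is hypothetical: the `mash_up` helper, the `vin` join key, the field names, and the sample records are invented for illustration and do not describe Ford's systems or tooling.

```python
def mash_up(base_records, *other_sources, key="vin"):
    """Left-join each extra source onto the base records by a shared key.

    The number of records stays the same; each record gains fields
    (breadth) from every source that has a matching key. If a source
    contains duplicate keys, the last record for that key wins.
    """
    # Build one lookup table per extra source: key value -> record.
    lookups = [{rec[key]: rec for rec in source} for source in other_sources]

    merged = []
    for rec in base_records:
        row = dict(rec)  # copy so the input records are not modified
        for lookup in lookups:
            extra = lookup.get(rec[key], {})
            for field, value in extra.items():
                if field != key:
                    row[field] = value
        merged.append(row)
    return merged


# Hypothetical sample data, one small list per business area.
vehicle_data = [
    {"vin": "A1", "wiper_events": 12},
    {"vin": "B2", "wiper_events": 3},
]
manufacturing_data = [
    {"vin": "A1", "plant": "Plant X"},
    {"vin": "B2", "plant": "Plant Y"},
]
survey_data = [
    {"vin": "A1", "satisfaction": 4},
]

result = mash_up(vehicle_data, manufacturing_data, survey_data)

if __name__ == "__main__":
    for row in result:
        print(row)
```

A left join is used here so that a vehicle record survives even when a source, such as the survey list, has nothing for it. Which sources to join first is the prioritization question raised in the answer above.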
Highest value

Our perspective has been let’s make sure that we identify the highest value, the greatest ROI areas, and then begin to take some of the major datasets that we have and then push them and get more detail. Mash them up appropriately and really prove up the value for the technologists.

Gardner: Clearly, there’s a lot more to come in terms of where we can take this, but I suppose it’s useful to have a historic perspective and context as well. I was thinking about some of the early quality gurus like Deming and some of the movement towards quality like Six Sigma. Does this fall within that same lineage? Are we talking about a continuum here over the last 50 or 60 years, or is this something different?

Cavaretta: That’s a really interesting question. From the perspective of analyzing data, using data appropriately, I think there is a really good long history, and Ford has been a big follower of Deming and Six Sigma for a number of years now.

The difference though, is this idea that you don’t have to worry so much upfront about getting the data. If you’re doing this right, you have the data right there, and this has some great advantages. You won’t have to wait until you get enough history to look for patterns. Then again, it also has some disadvantage, which is you’ve got so much data that it’s easy to find things that could be spurious correlations or models that don’t make any sense.

The piece that is required is good domain knowledge, in particular when you are talking about making changes in the manufacturing plant. It’s very appropriate to look at things and be able to talk with people who have 20 years of experience to say, "This is what we found in the data. Does this match what your intuition is?" Then, take that extra step.

Gardner: Tell me a little about sort of a day in the life of your organization and your team to let us know what you do.
How do you go about making more data available and then reaching some of these higher-level benefits?

Cavaretta: We’re very much focused on interacting with the business. Most of all, we do have to deal with working on pilot projects and working with our business customers to bring advanced analytics and big data technologies to bear against these problems. So we work in kind of what we call a push-and-pull model.

We go out and investigate technologies and say these are technologies that Ford should be interested in. Then, we look internally for business customers who would be interested in that. So, we’re kind of pushing the technologies.

From the pull perspective, we’ve had so many successful engagements and such good contacts and good credibility within the organization that we’ve had people come to us and say, "We’ve got a problem. We know this has been in your domain. Give us some help. We’d love to be able to hear your opinions on this."
So we’ve pulled from the business side and then our job is to match up those two pieces. It’s best when we will be looking at a particular technology and we have somebody come to us and we say, "Oh, this is a perfect match."

Big data

Those types of opportunities have been increasing in the last few years, and we’ve been very happy with the number of internal customers that have really been very excited about the areas of big data.

Gardner: Because this is The Open Group Conference and an audience that’s familiar with the IT side of things, I’m curious as to how this relates to software and software development. Of course there are so many more millions of lines of code in automobiles these days, software being more important than just about everything. Are you applying a lot of what you are doing to the software side of the house or are the agile and the feedback loops and the performance management issues a separate domain, or is there crossover here?

Cavaretta: There’s some crossover. The biggest area that we’ve been focused on has been picking information, whether from internal business processes or from the vehicle, and then being able to bring it back in to derive value. We have very good contacts in the Ford IT group, and they have been fantastic to work with in bringing interesting tools and technology to bear, and then looking at moving those into production and what’s the best way to be able to do that.

A fantastic development has been this idea that we’re using some of the more agile techniques in this space and Ford IT has been pushing this for a while. It’s been fantastic to see them work with us and be able to bring these techniques into this new domain. So we’re pushing the envelope from two different directions.

Gardner: It sounds like you will be meeting up at some point with a complementary nature to your activities.

Cavaretta: Definitely.

Gardner: Let’s move on to this notion of the "Internet of things," a very interesting concept that a lot of people talk about.
It seems relevant to what we’ve been discussing.

We have sensors in these cars, wireless transfer of data, more and more opportunity for location information to be brought to bear, where cars are, how they’re driven, speed information, all sorts of metrics, maybe making those available through cloud providers that assimilate this data.

So let’s not go too deep, because this is a multi-hour discussion all on its own, but how is this notion of the Internet of things being brought to bear on your gathering of big data and applying it to the analytics in your organization?
Cavaretta: It is a huge area, and not only from the internal process perspective -- RFID tags within the manufacturing plants, as well as out on the plant floor, and then all of the information that’s being generated by the vehicle itself.

The Ford Energi generates about 25 gigabytes of data per hour. So you can imagine selling a couple of million vehicles in the near future with that amount of data being generated. There are huge opportunities within that, and there are also some interesting opportunities having to do with opening up some of these systems for third-party developers. OpenXC is an initiative that we have going on at Research and Advanced Engineering.

Huge number of sensors

We have a lot of data coming from the vehicle. There’s a huge number of sensors and processors that are being added to the vehicles. There’s data being generated there, as well as communication between the vehicle and your cell phone and communication between vehicles.

There’s a group over at Ann Arbor, Michigan, the University of Michigan Transportation Research Institute (UMTRI), that’s investigating that, as well as communication between the vehicle and let’s say a home system. It lets the home know that you’re on your way and it’s time to increase the temperature, if it’s winter outside, or cool it in the summertime.

The amount of data that’s been generated there is invaluable information and could be used for a lot of benefits, both from the corporate perspective, as well as just the very nature of the environment.

Gardner: Just to put a stake in the ground on this, how much data do cars typically generate? Do you have a sense of what now is the case, an average?

Cavaretta: The Energi, according to the latest information that I have, generates about 25 gigabytes per hour. Different vehicles are going to generate different amounts, depending on the number of sensors and processors on the vehicle.
But the biggest key has to do with not necessarily where we are right now but where we will be in the near future.

With the amount of information that’s being generated from the vehicles, a lot of it is just internal stuff. The question is how much information should be sent back for analysis and to find different patterns? That becomes really interesting as you look at external sensors, temperature, humidity. You can know when the windshield wipers go on, and then to be able to take that information, and mash that up with other external data sources too. It’s a very interesting domain.

Gardner: So clearly, it’s multiple gigabytes per hour per vehicle and probably going much higher.

Cavaretta: Easily.
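For a rough sense of scale, the figures quoted in this exchange can be multiplied out. The sketch below is a back-of-envelope calculation in Python, not a Ford estimate. It reads "a couple of million vehicles" as exactly two million, assumes a purely hypothetical one hour of driving per vehicle per day, and uses decimal units (1 petabyte = 1,000,000 gigabytes). Only the 25 gigabytes per hour figure comes from the interview.

```python
GB_PER_VEHICLE_HOUR = 25      # figure quoted in the interview
FLEET_SIZE = 2_000_000        # assumption: "a couple of million" read as two million
HOURS_DRIVEN_PER_DAY = 1.0    # hypothetical assumption, not from the interview
GB_PER_PB = 1_000_000         # decimal units: 1 PB = 1,000,000 GB


def fleet_gb_per_hour(gb_per_vehicle_hour, fleet_size):
    """Gigabytes generated if every vehicle in the fleet runs for one hour."""
    return gb_per_vehicle_hour * fleet_size


def fleet_pb_per_day(gb_per_vehicle_hour, fleet_size, hours_per_day):
    """Petabytes generated per day for a given daily driving time per vehicle."""
    total_gb = fleet_gb_per_hour(gb_per_vehicle_hour, fleet_size) * hours_per_day
    return total_gb / GB_PER_PB


if __name__ == "__main__":
    per_hour = fleet_gb_per_hour(GB_PER_VEHICLE_HOUR, FLEET_SIZE)
    per_day = fleet_pb_per_day(GB_PER_VEHICLE_HOUR, FLEET_SIZE, HOURS_DRIVEN_PER_DAY)
    print(f"Fleet-wide, per hour of driving: {per_hour:,} GB")
    print(f"Fleet-wide, per day at {HOURS_DRIVEN_PER_DAY} h/day: {per_day} PB")
```

Under these assumptions the fleet generates 50,000,000 gigabytes, or 50 petabytes, for every hour that all vehicles are running. That scale is why the question of how much should be sent back for analysis matters.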
Gardner: Let’s move forward now for those folks who have been listening and are interested in bringing this to bear on their organizations and their vertical industries, from the perspective of skills, mindset, and culture. Are there standards, certification, or professional organizations that you’re working with in order to find the right people?

It’s a big question. Let’s look at what skills do you target for your group, and what ways you think that you can improve on that. Then, we’ll get into some of those larger issues about culture and mindset.

Cavaretta: The skills that we have in our department, in particular on our team, are in the area of computer science, statistics, and some good old-fashioned engineering domain knowledge. We’ve really gone about this from a training perspective. Aside from a few key hires, it’s really been an internally developed group.

Targeted training

The biggest advantage that we have is that we can go out and be very targeted with the amount of training that we have. There are such big tools out there, especially in the open-source realm, that we can spin things up with relatively low cost and low risk, and do a number of experiments in the area. That’s really the way that we push the technologies forward.

Gardner: Why The Open Group? Why is that a good forum for your message, and for your research here?

Cavaretta: The biggest reason is the focus on the enterprise, where there are a lot of advantages and a lot of business cases, looking at large enterprises and where there are a lot of systems, companies that can take a relatively small improvement, and it can make a large difference on the bottom line.

Talking with The Open Group really gives me an opportunity to be able to bring people on board with the idea that you should be looking at a difference in mindset. It’s not "Here’s a way that data is being generated, look, try and conceive of some questions that we can use, and we’ll store that too."
Let’s just take everything, we’ll worry about it later, and then we’ll find the value.

Gardner: I’m sure the viewers of your presentation on January 28 will be gathering a lot of great insights. A lot of the people that attend The Open Group conferences are enterprise architects. What do you think those enterprise architects should be taking away from this? Is there something about their mindset that should shift in recognizing the potential that you’ve been demonstrating?

Cavaretta: It’s important for them to be thinking about data as an asset, rather than as a cost. You even have to spend some money, and it may be a little bit unsafe without really solid ROI at the beginning. Then, move towards pulling that information in, and being able to store it in a way
that allows not just the high-level data scientist to get access to and provide value, but people who are interested in the data overall. Those are very important pieces.

The last one is how do you take a big-data project, how do you take something where you’re not storing in the traditional business intelligence (BI) framework that an enterprise can develop, and then connect that to the BI systems and look at providing value to those mash-ups. Those are really important areas that still need some work.

Gardner: Another big constituency within The Open Group community are those business architects. Is there something about mindset and culture, getting back to that topic, that those business-level architects should consider? Do you really need to change the way you think about planning and resource allocation in a business setting, based on the fruits of things that you are doing with big data?

Cavaretta: I really think so. The digital asset that you have can be monetized to change the way the business works, and that could be done by creating new assets that then can be sold to customers, as well as improving the efficiencies of the business.

High quality data

This idea that everything is going to be very well-defined and there is a lot of work that’s being put into making sure that data has high quality, I think those things need to be changed somewhat. As you’re pulling the data in, as you are thinking about long-term storage, it’s more the access to the information, rather than the problem in just storing it.

Gardner: Interesting that you brought up that notion that the data becomes a product itself and even a profit center perhaps.

Cavaretta: Exactly. There are many companies, especially large enterprises, that are looking at their data assets and wondering what can they do to monetize this, not only to just pay for the efficiency improvement but as a new revenue stream.

Gardner: We’re almost out of time.
For those organizations that want to get started on this, are there any 20/20 hindsights or Monday morning quarterback insights you can provide? How do you get started? Do you appoint a leader? Do you need a strategic roadmap, getting this culture or mindset shifted, pilot programs? How would you recommend that people might begin the process of getting into this?

Cavaretta: We’re definitely a huge believer in pilot projects and proof of concept, and we like to develop roadmaps by doing. So get out there. Understand that it’s going to be messy. Understand that it may be a little bit more costly and the ROI isn’t going to be there at the beginning.
But get your feet wet. Start doing some experiments, and then, as those experiments turn from just experimentation into really providing real business value, that’s the time to start looking at a more formal aspect and more formal IT processes. But you’ve just got to get going at this point.

Gardner: I would think that the competitive forces are out there. If you are in a competitive industry, and those that you compete against are doing this and you are not, that could spell some trouble.

Cavaretta: Definitely.

Gardner: We’ve been talking with Michael Cavaretta, PhD, Technical Leader of Predictive Analytics at Ford Research and Advanced Engineering in Dearborn, Michigan. Michael and I have been exploring how big data is fostering business transformation by allowing deeper insights into more types of data and all very efficiently. This is improving processes, updating quality control and adding to customer satisfaction.

Our conversation today comes as a lead-in to Michael’s upcoming plenary presentation. He is going to be talking on January 28 in Newport Beach, California, as part of The Open Group Conference.

You will hear more from Michael and others, the global leaders on big data that are going to be gathering to talk about business transformation from big data at this conference. So a big thank you to Michael for joining us in this fascinating discussion. I really enjoyed it and I look forward to your presentation on the 28th.

Cavaretta: Thank you very much.

Gardner: And I would encourage our listeners and readers to attend the conference or follow more of the threads in social media from the event. Again, it’s going to be happening from January 27 to January 30 in Newport Beach, California.

This is Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator through the thought leadership interviews. Thanks again for listening, and come back next time.

Copyright The Open Group and Interarbor Solutions, LLC, 2005-2013. All rights reserved.

You may also be interested in:

• The Open Group Trusted Technology Forum is Leading the Way to Securing Global IT Supply Chains
• Corporate Data, Supply Chains Remain Vulnerable to Cyber Crime Attacks Says Open Group Conference Speaker
• Open Group Conference Speakers Discuss the Cloud: Higher Risk or Better Security?
• Capgemini’s CTO on Why Cloud Computing Exposes the Duality Between IT and Business
• San Francisco Conference observations: Enterprise transformation, enterprise architecture, SOA and a splash of cloud computing
• MIT’s Ross on how enterprise architecture and IT more than ever lead to business transformation
• Overlapping criminal and state threats pose growing cyber security threat to global Internet commerce, says Open Group speaker