Hello everybody, I’m Tom Bryans, Delivery Manager for Ford Motor Company’s Connected Vehicle Data Platform. <P> Joining me is Dan Totten from our Enterprise Architecture Group. <P> We also have Joe Niemiec, our resident architect from Hortonworks. <P><P> What I would like to give you is an overview of our Ford Connected Vehicle journey.
After I get through my presentation, we should have time for a Q&A at the end.
Who owns a Ford Mustang?
What is your name? <Simon>
Hey <Simon>, thanks for driving a Ford.
I have a Mustang as well, and it is awesome!
Ford <P> has been undergoing a transformation. <P> We are going from a car company, <P> where revenue is generated from vehicles, parts and accessories, <P>
to a mobility company. <P>
What is a mobility company? <P>
A mobility company helps to change the way the world moves. <P> Mobility is about human progress: <P> getting food to stores, <P> ambulances to emergencies. <P> It is about getting to work <P> and getting home.
There are 4 megatrends that Ford is looking at: <P>
1 - Urbanization: a growing population in urban environments. Here, the infrastructure simply can’t keep up with the number of vehicles.
2 - Global middle class growth: it is expected to double from 2 billion to 4 billion by 2030, and one of the middle class dreams is to … own a car.
3 - Air quality: we are very concerned about urban air pollution.
4 - Changing consumer attitudes: things like car sharing and car request services.
Ford has a Blueprint for Mobility. That blueprint includes:
Vehicles that talk to one another as well as to the infrastructure
People who want to share vehicles
Mobility versus gridlock
And everyone is aware of autonomous vehicles
This blueprint may mean fewer vehicles <P> but more usage.
At the center of that transformation is the connected vehicle.
So, what is a connected vehicle? <Click> A connected vehicle has the means to transmit data <Click>
as well as the means to receive data.
For receiving data, think of things like over-the-air software updates or a feature such as a remote start request. <Click>
So, how do we collect data from our connected vehicles?
There are three basic ways. <Click>
The first is Data over Voice. This is where we use the consumer’s cell phone to transmit a very small payload over a cellular voice connection; think of a 2400-baud modem. <Click>
Next, we have modems embedded in the vehicle that use 4G cellular data to transmit a more robust payload. <Click>
And finally, we have plug-in devices that plug into the vehicle’s OBD-II port and provide a very rich and frequent payload. <Click>
Where does the data go from the vehicle?
As the data leaves the vehicle, it goes to various cloud providers. <Click>
The Data over Voice packages go to one of our partners, Airbiquity. <Click>
The embedded modem data goes to partners Accenture and Telogis, or to the Ford Cloud. <Click>
The plug-in data goes to the Delphi, Azure or Ford clouds. <Click>
Where does the data go from the cloud? <Click> From the cloud, the data travels securely via various protocols, including HTTP via REST-based web services, secure FTP, and streaming using AMQP. <P> A big challenge is that this data is all in different formats. <Click>
The data then comes into our Ford Enterprise Hadoop Data Platform, <Click>
where we land and transform it. <Click> We do that using various Hadoop components, including Storm, MapReduce, Oozie and Pig, along with Java. <Click> The data is then stored <Click> in Hive and HBase. <Click>
What data is collected from our vehicles?
There are more lines of code in our vehicles than there are in airplanes. Data that we collect from the vehicle includes: <Click>
CAN signals and messages. These are things like odometer, engine RPM, location and temperatures. <Click>
Diagnostic Trouble Codes, or DTCs, which include things like cylinder misfires and emissions faults. <Click>
And warning indicator lamp status, which includes things like Service Engine Soon, Low Fuel and Low Tire Pressure. <Click>
How much data can a vehicle generate? <Click>
A single vehicle can generate 25 GB of CAN traffic in an hour.
Of course, not all of this data is transferred back; most of the traffic is computer-to-computer communication. <Click>
What does the data that we receive look like? <Click>
What we receive is an encrypted, base64-encoded blob, which then has to be decrypted, decoded, transformed and stored. <Click>
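As a rough illustration of that decode, decrypt, transform and store sequence, here is a minimal Python sketch. Everything specific in it is an assumption for the example: the payload is taken to be JSON, the field names (`vin`, `odometer_mi`) are made up, and a toy XOR function stands in for whatever real cipher is used. It is not the platform's actual code.

```python
import base64
import json
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher used only as a stand-in; NOT real encryption."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def process_blob(blob: str, key: bytes) -> dict:
    """Undo the transport encoding, decrypt, parse and transform one blob."""
    ciphertext = base64.b64decode(blob)           # undo base64 transport encoding
    plaintext = xor_cipher(ciphertext, key)       # decrypt (stand-in cipher)
    record = json.loads(plaintext.decode("utf-8"))  # decode the assumed JSON payload
    # Transform: hypothetical normalisation of miles to kilometres.
    return {
        "vin": record["vin"],
        "odometer_km": round(record["odometer_mi"] * 1.609344, 1),
    }


# Build a sample blob the same way the sender would (hypothetical data).
KEY = b"demo-key"
raw = json.dumps({"vin": "TESTVIN0000000001", "odometer_mi": 1000}).encode("utf-8")
blob = base64.b64encode(xor_cipher(raw, KEY)).decode("ascii")

store = []                       # stand-in for the Hive/HBase tables
store.append(process_blob(blob, KEY))
print(store[0])
```

In practice the decryption step would use a proper cryptographic library and managed keys; the XOR function only keeps the sketch self-contained.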
This is a use case for the Remote Start feature that is available in our SYNC Connect equipped vehicles. You have your car, the Ford Cloud and our Hadoop Data Platform. <Click>
Using your FordPass mobile app, <Click>
you send a remote start request, which travels to the Ford Cloud. <Click>
The Ford Cloud will send that request to the vehicle as well as to the data platform, where we record that a remote start request was received from the mobile device and that a remote start request was sent to the vehicle. <Click>
Upon success, the vehicle will send a response back to the cloud, which will route the confirmation to the data platform <Click> as well as to the FordPass mobile app. <Click>
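The message flow in this use case can be sketched as a small simulation. The class names, method names and event strings below are all hypothetical, chosen only to mirror the steps described; the real cloud and vehicle interfaces are not shown in this talk.

```python
from dataclasses import dataclass, field


@dataclass
class DataPlatform:
    """Stand-in for the Hadoop data platform: just keeps an event log."""
    events: list = field(default_factory=list)

    def record(self, event: str) -> None:
        self.events.append(event)


class Vehicle:
    """Stand-in vehicle that accepts only a remote start command."""

    def handle(self, command: str) -> str:
        return "SUCCESS" if command == "REMOTE_START" else "REJECTED"


@dataclass
class Cloud:
    """Stand-in for the cloud that routes requests and confirmations."""
    vehicle: Vehicle
    platform: DataPlatform

    def remote_start(self) -> str:
        # The request arrives from the mobile app and is logged.
        self.platform.record("REQUEST_RECEIVED_FROM_MOBILE")
        # The request is forwarded to the vehicle and that is logged too.
        self.platform.record("REQUEST_SENT_TO_VEHICLE")
        result = self.vehicle.handle("REMOTE_START")
        # On success, the confirmation goes to the platform and back to the app.
        if result == "SUCCESS":
            self.platform.record("VEHICLE_CONFIRMED")
        return result


platform = DataPlatform()
cloud = Cloud(vehicle=Vehicle(), platform=platform)
app_result = cloud.remote_start()   # what the mobile app would see
print(app_result, platform.events)
```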
The next use case is our Ford Credit Variable Lease. <Click>
Let’s say that Simon leases a Mustang at $350/month for a 24-month term with 24,000 miles. Simon ride-shares to work every day and just uses the Mustang on the weekends. <Click>
As you know, data such as the odometer reading is transmitted from the vehicle to the Ford Cloud and then to the Enterprise Data Platform. <Click>
Ford Credit pulls the odometer reading from our data store <Click> and with that information is able to offer Simon a variable lease of $250/month base
plus a variable usage cost based on the actual miles driven that month. <Click>
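To make the arithmetic concrete, here is a small Python sketch of a base-plus-usage charge. The $250 base comes from the example above; the per-mile rate of $0.10 is purely an assumed figure for illustration, since no rate is given, and the function name is hypothetical.

```python
BASE_CENTS = 25_000          # $250/month base, from the example
RATE_CENTS_PER_MILE = 10     # hypothetical rate; not stated in the talk


def monthly_charge(odometer_start: int, odometer_end: int,
                   base_cents: int = BASE_CENTS,
                   rate_cents: int = RATE_CENTS_PER_MILE) -> float:
    """Return the month's charge in dollars from two odometer readings."""
    if odometer_end < odometer_start:
        raise ValueError("odometer cannot run backwards")
    miles = odometer_end - odometer_start
    # Work in whole cents to avoid floating-point rounding surprises.
    return (base_cents + miles * rate_cents) / 100


charge = monthly_charge(12_000, 12_400)   # 400 weekend miles this month
print(f"${charge:.2f}")                   # prints "$290.00"
```

Under this assumed rate, a month with 1,000 miles (the 24,000-mile allowance spread over 24 months) would cost the same $350 as the fixed lease, and any lighter month would cost less.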
A little bit about our architecture patterns. As discussed earlier, we can receive data from
Azure Event Hubs, <P> secure FTP, <P> and RESTful web services.
We also collect data from our internal Ford databases.
We use Oozie to schedule our workflows, and we ingest the data using Storm spouts, Java applications and Java web services, or we Sqoop it from internal databases.
The raw data is brought directly into HDFS, where we transform it using Java, MapReduce and Pig. We apply de-dup logic, classify the data, partition the data and validate the data, and we then load it into Hive and HBase tables. <Long Pause> For security, we control access to the data with the help of Ranger.
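The de-dup, validate, classify and partition steps can be sketched in plain Python over a handful of records. The record layout, the de-dup key, the validation rule and the partition key (date plus category) are all assumptions made for this example; the real jobs run in Java, MapReduce and Pig on HDFS.

```python
from collections import defaultdict


def dedupe(records):
    """Keep the first record for each (vin, ts, signal) key."""
    seen = set()
    unique = []
    for r in records:
        key = (r["vin"], r["ts"], r["signal"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def is_valid(r):
    """Hypothetical rule: a record needs a VIN and a value."""
    return bool(r.get("vin")) and r.get("value") is not None


def classify(r):
    """Hypothetical rule: split trouble codes from ordinary CAN signals."""
    return "dtc" if r["signal"] == "dtc" else "can"


def partition(records):
    """Group records by (date, category), like a partitioned table."""
    parts = defaultdict(list)
    for r in records:
        parts[(r["ts"][:10], classify(r))].append(r)
    return dict(parts)


raw = [
    {"vin": "TESTVIN0000000001", "ts": "2017-06-01T08:00:00", "signal": "odometer", "value": 1200},
    {"vin": "TESTVIN0000000001", "ts": "2017-06-01T08:00:00", "signal": "odometer", "value": 1200},  # duplicate
    {"vin": "TESTVIN0000000001", "ts": "2017-06-01T08:00:05", "signal": "dtc", "value": "P0301"},
    {"vin": "", "ts": "2017-06-02T09:00:00", "signal": "odometer", "value": 1300},  # fails validation
]

clean = [r for r in dedupe(raw) if is_valid(r)]
tables = partition(clean)
for key in sorted(tables):
    print(key, len(tables[key]))
```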
One other thing to note: we are currently conducting POCs using NiFi as an ingestion framework.
Some of the opportunities and challenges that we have faced include:
Country-Specific Regulations
  GPS data is not allowed to leave certain countries
  Right to be forgotten
Data Privacy
  Ford strictly follows all Automotive Privacy Alliance Principles
  Customers must explicitly opt in
Integration with Partners and Suppliers
  Each has a different interface and data specification
  Integration testing and creating test data are difficult
Storage and Updates of Data
  In-place updates in Hive are challenging
  We use Hive for vehicle data and HBase for reference data
Standards and Frameworks
  Created a CVDP common framework
  Storm for real-time data
  Knox to land data in HDFS through WAS
Security
  Data must be encrypted at rest and in transit
  Country-level partitioning
A little bit about Ford’s Hadoop cluster. We have:
261 data nodes
With over 5,200 CPU cores
65 TB of RAM
5 PB of usable storage
In 10 racks
Thank you very much for your time. I would like to open the floor for Q&A. I would be happy to answer any business-oriented questions; Dan can handle any technical questions; and any really technical questions can go to Simon or Joe.