IoT solutions have found their way into many different vertical industries, as shown in the slide. As these industries have specific needs, it is best to categorize the type of IoT solution you want to build and define what type of data you will need to collect for analysis.
- Manufacturing uses IoT solutions in machine-to-machine communications in what are called “smart factories”. IoT elevates automation to a new level, providing unparalleled process control, quality, and efficiency.
- Retail uses IoT solutions in everything from learning customer trends to “real-time” pricing and inventory control with RFID tagging.
- Energy companies deploy “smart grids” and use IoT for telemetry, metering, and general data collection.
- Transportation companies use IoT for global logistics to analyze traffic patterns, increase efficiency, and save on variable costs like fuel.
- Consumers are now seeing IoT in wearable devices like fitness trackers and in collision-avoidance systems in their automobiles.
- Healthcare organizations use IoT in everything from patient monitoring to drug companies collecting remote data on clinical trials.
Time-to-analysis is bottlenecked by the need to decide on the questions (queries) before developing the schema.
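The schema-on-read alternative reverses this: raw data is stored first, and a schema is applied only at query time, so new questions can be asked of data collected before those questions existed. A minimal sketch, using hypothetical meter and camera events:

```python
import json

# Schema-on-read: store raw events as-is, decide the question later.
# The device names and fields below are hypothetical sample data.
raw_events = [
    '{"device": "meter-17", "kwh": 3.2, "ts": "2015-06-01T10:00:00"}',
    '{"device": "meter-17", "kwh": 2.9, "ts": "2015-06-01T11:00:00"}',
    '{"device": "cam-04", "vehicles": 112, "ts": "2015-06-01T10:00:00"}',
]

# A new question arrives after the data was collected: total kWh per device.
usage = {}
for line in raw_events:
    event = json.loads(line)   # schema is applied at read time
    if "kwh" in event:         # fields irrelevant to this query are ignored
        usage[event["device"]] = usage.get(event["device"], 0) + event["kwh"]

print(usage)
```

No table had to be designed up front; a second, unrelated question (say, total vehicles per camera) could be answered from the same raw store without any schema migration.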
In order to gain meaning from Big Data, you need “Data Science”
Business Intelligence is different from data science
BI reports on historical performance – retrospective reporting and on-going business monitoring
What happened last quarter? How many did we sell?
Data science is about predicting the future and understanding why things happen
What is the optimal solution? What will happen next?
Data science provides a new approach to uncovering and acting on the insights buried across the wealth of available data sources
A wide range of devices (traffic lights, parking meters, weather instruments, etc.) and video cameras (traffic, pedestrian and bike traffic flow) generating data about city operations. A citizen could combine these sensor and video-generated data with other data sources, such as social media (Facebook, Instagram, Yelp) + citizen comments (emails, phone calls) + city reports (police blotters, fire reports, emergency services, construction permits, work orders, building hours, etc.) + local events (concerts, sporting events, farmers markets, parades, festivals, etc.) to create a rich perspective on the city’s activities, problems and overall economic and social vitality.
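As a toy illustration of blending two such feeds, the sketch below joins hypothetical traffic-sensor readings with a hypothetical city events calendar; all locations, counts, and event names are invented for the example:

```python
# Hypothetical sensor readings: (intersection, hour, vehicle_count)
traffic = [
    ("5th & Main", 18, 420),
    ("5th & Main", 19, 510),
    ("Park & 1st", 18, 130),
]

# Hypothetical city events feed: (venue_intersection, hour, event_name)
events = [("5th & Main", 19, "Farmers Market")]
event_index = {(loc, hr): name for loc, hr, name in events}

# Blend the two sources: annotate each reading with any overlapping event,
# so a traffic spike can be explained by what was happening nearby.
blended = [
    {"where": loc, "hour": hr, "vehicles": n,
     "event": event_index.get((loc, hr))}
    for loc, hr, n in traffic
]

for row in blended:
    print(row)
```

The same join pattern extends to the other sources listed above (permits, work orders, social media), each keyed by place and time.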
Once you know the decisions, the next step is to brainstorm the questions stakeholders need to answer in support of key decisions. This process will help to identify variables and metrics that might be better predictors of the decisions we are trying to make. While most organizations have a good handle on the “descriptive” (What happened?) questions, business stakeholders struggle with the “predictive” (What is likely to happen?) and the “prescriptive” (What should I do?) questions (see Figure 1).
Getting smart starts by understanding the city’s key business initiative or business objective (i.e., “what” we want to accomplish). For example, let’s identify and understand the decisions that city management (our key business stakeholder in this example) needs to make to support the business initiative of “Improving traffic flow.” This could include:
Traffic flow decisions: New roads? New lanes? New turn lanes? New bike lanes? Pedestrian crossings? Railroad crossings? Bus stops?
Road repair and maintenance decisions: Fixing potholes? Repaving surfaces? Materials and equipment needed? When to fix potholes and repave streets?
Construction permits decisions: Types of permits needed? Impact on traffic flow? Length of time to complete the work? Number of employees to consider?
Events management decisions: Traffic (cars and pedestrians) attending proposed event? Impact on normal traffic flow? Date, time, location and duration of events?
Parks decisions: Location of parks? Size of parks? Hours of operation? Park equipment maintenance?
Schools decisions: Location and size of new schools? Hours of operation? Location of stoplights and stop signs?
Property of William D Schmarzo
However, enabling big data initiatives is difficult. Many enterprise IT departments were not built to support the third platform and they struggle to deploy and maintain a big data infrastructure which is based on an ever-expanding suite of big data technologies and tools.
Data all over the company in different formats with different levels of governance and security…
A never-ending list of new technologies and applications that changes every day… along with their interactions with the infrastructure needed to run optimally
Once enabled, customers see the potential, and the number of projects, questions, and objectives grows quickly. Use cases explode across the organization as the benefits of analytics are understood. The organization struggles to prioritize and to focus on the use cases with the most impact and feasibility for the business, so prioritization and rapid execution become paramount – enabling it to become an agile digital business.
Customers are very good at building a physical Hadoop cluster, but struggle to take the next steps:
They often focus on technical activities vs. business outcomes
They have little or no data governance, but need it desperately
They struggle with managing and deploying the various tools
They can’t easily expand restrictive physical infrastructures, slowing analytics to a crawl
They cannot keep up with a never-ending array of emerging technologies
Ultimately, most efforts to stand up big data initiatives stall or fail due to the complexity of the infrastructure, tools, technologies and culture changes required.
BUT EXTRACTING BUSINESS VALUE IS HARD
EXPECT DEPLOYMENT, MANAGEMENT, AND GOVERNANCE COMPLEXITY
The Workshop Objectives are to align business and IT goals around big data, identify strategic opportunities for big data analytics, prioritize key use cases by assessing feasibility and ROI, demonstrate the potential value using data science techniques, and to recommend the appropriate analytics engagement and deployment roadmap.
Technology Track
Some clients have already made progress implementing certain data and analytics use cases, and now the IT organization seeks to expand its capabilities and operationalize its processes to meet growing demands for better/faster data and analytics. But what often happens is that IT hits a technology wall, because the underlying infrastructure, tools and processes don’t support the new demands of the business. Some typical scenarios we see are gaps in the Big Data capabilities within the IT environment, and long delays in delivering incoming requests for data and analytics. Uniquely, we help you understand your technology gaps in the context of your business goals. The point is that you need to make the right recommendations about where to invest.
Big Data Technology Advisory:
For customers who need to document and/or understand the existing technical environment and its limitations with respect to big data
For customers who want a technology roadmap for specific big data capabilities
Identifies gaps in current infrastructure
Produces a plan for activating technical big data use cases
Big Data Proof of Technology:
Pilots a known technology use case on EMC equipment within the customer’s IT environment
Demonstrates how the data lake approach and functionality works with existing customer systems and data sources
Big Data Technology Implementation:
Install and configure core data lake architecture
Automate data ingest, data preparation and analytics execution around a specific technology use case
Implementation of Platforms for Big Data and Analytics Solutions:
Implement data lakes and analytics and app/dev capabilities into production environments
Integration of analytic results into management applications
Define and implement business rules and policies as they relate to data
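The automated ingest, preparation, and analytics execution described above might look, in a minimal sketch, like the following pipeline. All stage functions, the sample data, and the quality rule are hypothetical illustrations, not the actual implementation:

```python
# A minimal sketch of an automated ingest -> prepare -> analyze pipeline,
# with one business rule (a data-quality policy) applied during preparation.
RAW = ["42.0", "n/a", "37.5", "-1.0", "40.2"]  # hypothetical raw readings

def ingest(source):
    """Pull raw records from a source (here, an in-memory list)."""
    return list(source)

def prepare(records):
    """Parse records and apply a business rule: readings must be >= 0."""
    clean = []
    for r in records:
        try:
            value = float(r)
        except ValueError:
            continue              # quarantine unparseable records
        if value >= 0:            # policy: negative readings are invalid
            clean.append(value)
    return clean

def analyze(values):
    """Run the analytic step: here, a simple average."""
    return sum(values) / len(values)

result = analyze(prepare(ingest(RAW)))
print(round(result, 2))  # 39.9
```

In production, each stage would typically be a scheduled job against the data lake rather than an in-memory function, but the staged structure is the same.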
Overview of Business Track
Our point of view is that success starts with aligning IT and the business around a single strategic business initiative within a 9-12 month timeframe. This helps us identify an analytics use case that will accelerate a current business goal or solve a current problem. You need to deliver the right analytic recommendations to the data science teams – the workhorses of your Big Data ecosystem – to help them surface insights that can drive business value. We have a unique methodology to identify and prioritize a single analytics use case with the best combination of implementation feasibility and business value. It’s a 3-week engagement that applies research, interviews, data science expertise and techniques to your business – culminating in a 1-day workshop to identify and agree on the best analytics use case and path forward to solving a business problem. This approach sets us apart from the “bring in a bunch of technology and see what it can do” approach that’s pushed by many vendors. We call this a Big Data Vision Workshop.
The next step is for you to understand the ROI of your use case, so you know in advance how it will pay off, by how much, and over what timeframe. We’ll prove out your prioritized use case ahead of time, on-site, with your data, on a real analytics environment to generate the required analytic lift. We then model an app/process for you that would leverage that insight to achieve the business objective. We call this a Proof of Value. EMC’s proven end-to-end methodology includes:
Inventory, evaluation, and prioritization of data sources
Data preparations, integration and enrichment
Analytic models, visualizations, rules, scoring, etc.
Documentation of actionable insights
Future state data architecture, IT process, organization and governance recommendations
Implementation Roadmap
Confirmed analytic lift and high-level business case / ROI
For an applied analytics implementation, we can configure the architecture for production in your environment and build your application, so you can implement your solution quickly and easily – speeding time-to-value while establishing a path for future use cases. We have both the consulting and technology components under one roof to help you assess, prove, and deploy your use case.
Understand content, scope and relationships within the data
Diagrams 1 & 2: Average net income ranges from $192,728 in category 1 to $29,958 in category 7. Category 4 has no data.
Diagram 3: Cardmembers are located in the mid-to-upper categories (1-3) and in the lowest category (7). The lowest category, however, is the youngest population, with a high peak at age 21. This could be a student group.
Diagram 4: Cardmember household sizes are concentrated in the 1-to-3 range.
Converged Infrastructure – Blocks and Racks for delivering compute and storage (including local storage)
- This is the quota allocated to a workspace
- The BDL Leverages Isilon as the optimal multi-protocol Data Lake for storing raw content – and then making it accessible to the BDL via its HDFS protocol support
The Data Curator is built on a key notion of the BDL: Data Awareness
It allows for the indexing of content in the Lake as well as anywhere in the enterprise and beyond
Data Scientists can search for, and even sample, data in the lake and across the enterprise that will help them build their models. Statistic – 80% of a data scientist’s time is spent looking for and getting the right data
Once found, the Data Curator has an ingestion capability that supports the transfer, blending, and wrangling of that “right data” into a Data Scientist’s private workspace or sandbox for further preparation
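The search, sample, and ingest flow described for the Data Curator could be sketched as follows. The catalog structure, dataset names, and function names here are illustrative assumptions, not the product’s actual API:

```python
# Hypothetical sketch of a curator-style search-and-ingest flow:
# search an index of datasets, sample one, then pull it into a sandbox.
catalog = {
    "sales_2014": {"tags": ["sales", "emea"], "rows": [["a", 1], ["b", 2]]},
    "web_clicks": {"tags": ["web", "clickstream"], "rows": [["/home", 9]]},
}

def search(index, tag):
    """Find dataset names whose tags include the search term."""
    return [name for name, meta in index.items() if tag in meta["tags"]]

def sample(index, name, n=1):
    """Preview the first n rows without ingesting the full dataset."""
    return index[name]["rows"][:n]

def ingest(index, name, sandbox):
    """Copy the full dataset into the data scientist's sandbox."""
    sandbox[name] = list(index[name]["rows"])
    return sandbox

sandbox = {}
hits = search(catalog, "sales")     # locate candidate datasets
preview = sample(catalog, hits[0])  # peek before committing to a transfer
ingest(catalog, hits[0], sandbox)
print(sandbox)
```

The point of the sample step is exactly the 80% statistic above: cheap previews let the data scientist confirm a dataset is the “right data” before spending time moving and wrangling it.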
The Platform Manager provisions analytic applications and curated data sets into the User’s workspace.
The Data Governor supports the ability to have policy-driven data stewardship for Security, Lineage, and Quality
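Policy-driven stewardship of this kind can be illustrated with a small sketch in which security, quality, and lineage concerns are each expressed as a policy applied to a record; all policy names and record fields are hypothetical:

```python
# Hedged sketch: a record passes through a security policy (masking) and a
# quality policy (required fields), while lineage records what was applied.
def mask_ssn(record, lineage):
    """Security policy: redact sensitive identifiers."""
    if "ssn" in record:
        record = {**record, "ssn": "***-**-" + record["ssn"][-4:]}
        lineage.append("mask_ssn")
    return record

def require_fields(record, lineage, required=("id", "amount")):
    """Quality policy: reject records missing required fields."""
    if not all(f in record for f in required):
        raise ValueError("record failed quality policy")
    lineage.append("require_fields")
    return record

lineage = []  # lineage: an ordered audit trail of applied policies
rec = {"id": 7, "amount": 19.99, "ssn": "123-45-6789"}
for policy in (mask_ssn, require_fields):
    rec = policy(rec, lineage)

print(rec["ssn"], lineage)
```

Because the lineage log is built as a side effect of policy application, stewards can later answer both “was this record masked?” and “which rules touched it, in what order?” from the same trail.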