This presentation, given by Think Big's senior data scientist Eliano Marques at the Digital Natives conference in Berlin, Germany (November 2015), details how to take a predictive maintenance use case from experimentation to productionisation.
The benefits of Hadoop for analytics make it a popular option for many companies looking to expand their analytics suite. However, adding Hadoop as an analytics platform to an existing environment based on more traditional data structures and methods poses several key challenges. Review these slides to understand key challenges and strategies for expanding the analytics suite to use Hadoop, such as: architectural integration with existing platforms, skills and organizational readiness, and the importance of a vision and a clear path forward.
This presentation by Think Big principal Matt Cooke and Martin Oberhuber, Senior Data Scientist, discusses high frequency trading, requirements for success, and underlying architectures which may include Apache Spark.
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla... (Precisely)
Teams working on new business initiatives, whether for enhancing customer engagement, creating new value, or addressing compliance considerations, know that a successful strategy starts with the synchronization of operational and reporting data from across the organization into a centralized repository for use in advanced analytics and other projects. However, the range and complexity of data sources, as well as the lack of specialized skills needed to extract data from critical legacy systems, often cause inefficiencies and gaps in the data being used by the business.
The first part of our webcast series on Foundational Strategies for Trust in Big Data provides insight into how Syncsort Connect, with its design once, deploy anywhere approach, supports a repeatable pattern for data integration by enabling enterprise architects and developers to ensure data from ALL enterprise data sources – from mainframe to cloud – is available in downstream data lakes for use in these key business initiatives.
Foundational Strategies for Trust in Big Data Part 2: Understanding Your Data (Precisely)
Teams working on new initiatives, whether for customer engagement, advanced analytics, or regulatory and compliance requirements, need a broad range of data sources for the highest quality and most trusted results. Yet the sheer volume of data delivered, coupled with the range of data sources, including those from external third parties, increasingly undermines trust, confidence, and even understanding of the data and how, or whether, it can be used to make effective data-driven business decisions.
The second part of our webcast series on Foundational Strategies for Trust in Big Data provides insight into how Trillium Discovery for Big Data, with its natively distributed execution for data profiling, supports a foundation of data quality by enabling business analysts to gain rapid insight into data delivered to the data lake without needing technical expertise.
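Trillium Discovery's engine itself is proprietary, but the kind of per-column profile a business analyst sees can be sketched in a few lines of plain Python (a minimal illustration with invented sample data, not the product's implementation):

```python
from collections import Counter

def profile_column(values):
    """Compute basic profiling stats for one column: completeness,
    cardinality, and the most frequent values."""
    total = len(values)
    non_null = [v for v in values if v not in (None, "")]
    freq = Counter(non_null)
    return {
        "count": total,
        "completeness": len(non_null) / total if total else 0.0,
        "distinct": len(freq),
        "top_values": freq.most_common(3),
    }

# Profile a column as delivered to the data lake
ages = ["34", "41", None, "34", "", "29"]
stats = profile_column(ages)
```

A distributed profiler computes the same statistics per partition and merges them, which is what makes profiling at data-lake scale tractable.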
Accelerating Fast Data Strategy with Data Virtualization (Denodo)
"Information from the past won't support the insights of the future - businesses need real-time data," said Forrester Analyst Noel Yuhanna. In this presentation, he explains the challenges of latent data faced by business users, the need to accelerate a fast data strategy using data virtualization, and the implications of such a strategy.
This presentation is part of the Fast Data Strategy Conference, and you can watch the video here goo.gl/a2xNyZ.
Predictive Analytics - Big Data Warehousing Meetup (Caserta)
Predictive analytics has always been about the future, and the age of big data has made that future an increasingly dynamic place, filled with opportunity and risk.
The evolution of advanced analytics technologies and the continual development of new analytical methodologies can help to optimize financial results, enable systems and services based on machine learning, obviate or mitigate fraud and reduce cybersecurity risks, among many other things.
Caserta Concepts, Zementis, and guest speaker from FICO presented the strategies, technologies and use cases driving predictive analytics in a big data environment.
For more information, visit www.casertaconcepts.com or contact us at info@casertaconcepts.com
This is the third in our three-part webinar series on cloud-enabled customer insights. Learn how to scale your customer analytics operations up and out with Microsoft Azure Data Lake.
Scott Fairbanks, Senior BI Consultant at CCG, demonstrates the key differentiators between traditional warehouse architectures and new cloud technologies. Learn about the key competitors in the cloud space, and what elements separate them when it comes to linking analytics solutions.
Modern Integrated Data Environment - Whitepaper | Qubole (Vasu S)
This white paper describes how data-driven organisations can build a modern data platform by pairing a cloud data warehouse with a modern data platform architecture.
https://www.qubole.com/resources/white-papers/modern-integrated-data-environment
Introduction to Machine Learning with Azure & Databricks (CCG)
Join CCG and Microsoft for a hands-on demonstration of Azure’s machine learning capabilities. During the workshop, we will:
- Hold a Machine Learning 101 session to explain what machine learning is and how it fits in the analytics landscape
- Demonstrate Azure Databricks’ capabilities for building custom machine learning models
- Take a tour of Azure Machine Learning's capabilities for MLOps, Automated Machine Learning, and code-free Machine Learning
By the end of the workshop, you’ll have the tools you need to begin your own journey to AI.
Claudia Imhoff of the Boulder BI Brain Trust gives the lowdown on integrating real-time data to leverage modern BI practices for your business in this Information Builders Innovation Session presentation.
Your smarter data analytics strategy - Social Media Strategies Summit (SMSS) ... (Clark Boyd)
The volume and velocity of available data bring with them a huge number of new opportunities for marketers. However, without the analytics know-how to make use of this data, these opportunities are often missed. Moreover, the variety of different data sources and analytics platforms only adds to this complexity.
This presentation covers:
- How to define and communicate an analytics framework
- How to set up analytics dashboards for a range of stakeholders
- The people and skills you need for an optimal analytics team
- Practical tips for improving your campaign measurement
More and more applications are leveraging the power of NoSQL as a primary means of data storage. This session, as presented at Teradata Partners Conference 2015, by Bryce Cottam, Principal Architect at Think Big, a Teradata company, covered how to successfully model application data on NoSQL storage engines for everyday application use. The presentation explores common design patterns, techniques and tips that will help developers leverage the horizontal scalability of NoSQL stores while embracing their inherent limitations. Topics include: Denormalization, Intelligent Keys (including avoiding hot-spotting), Counters, and Data Sharding.
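One of the techniques listed, intelligent keys that avoid hot-spotting, is often implemented as key salting: prefixing a stable hash bucket so that monotonically increasing keys spread across the keyspace instead of piling onto a single region. A hypothetical sketch (bucket count and key layout are invented, not taken from the talk):

```python
import hashlib

N_BUCKETS = 16  # number of salt buckets; typically tuned to cluster size

def salted_key(user_id: str, timestamp: int) -> str:
    """Build a row key with a stable hash-bucket prefix so that
    sequential timestamps spread across N_BUCKETS key ranges
    instead of creating one 'hot' region."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % N_BUCKETS
    # Zero-pad both parts so keys sort lexicographically within a bucket
    return f"{bucket:02d}|{user_id}|{timestamp:010d}"

k1 = salted_key("alice", 1447020000)
k2 = salted_key("alice", 1447020060)
```

Because the bucket is derived from the entity id, all rows for one user stay contiguous (cheap range scans), while writes from many users fan out across buckets.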
Industrial Analytics and Predictive Maintenance 2017-2022 (Rising Media Ltd.)
In this session we will present the results of two recent, international studies on the state of data analytics in industrial settings. You will get insights from an in-depth industry survey of 151 analytics professionals and decision-makers in industrial companies, providing a deep-dive into strategies, project types, cost structures and skill-demand in IoT-based analytics. In addition we will present a survey focusing on predictive analytics covering the market potential and expected development until 2022.
Razorfish Multi-Channel Marketing: Better Customer Segmentation and Targeting (Teradata Aster)
Matt Comstock, Vice President Business Intelligence Office, Razorfish, presents at the Big Analytics 2012 Roadshow.
From search to email to social, customers are interacting with your brand across a variety of channels. But what do people do once they view an advertisement or get an email? What common behaviors are displayed once they’re on your site? By combining media exposure/behavior, site-side media, and in-store purchase data, you can understand better the impact media has on driving value to your business. Come to this session to learn how better data-driven multi-channel analysis lets you see what consumers do before they become a customer to understand what content influences which segments of users by media audience. Discover new segmentation and targeting strategies to improve engagement with your brand and increase advertising lift. See how a leader in digital marketing uses a combination of technologies including Teradata Aster, Hadoop, and Amazon Web Services to handle big data and provide big analytics to improve business value.
[Tutorial] building machine learning models for predictive maintenance applic... (PAPIs.io)
This talk introduces the landscape and challenges of predictive maintenance applications in the industry, illustrates how to formulate (data labeling and feature engineering) the problem with three machine learning models (regression, binary classification, multi-class classification) using a publicly available aircraft engine run-to-failure data set, and showcases how the models can be conveniently trained and compared with different algorithms in Azure ML.
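The labeling step described above, turning run-to-failure histories into supervised targets, can be sketched as follows (a minimal illustration with an invented cycle count and horizon, not the tutorial's actual code):

```python
def label_engine_cycles(max_cycle, horizon=30):
    """For one engine observed until failure at max_cycle, compute
    per-cycle labels: remaining useful life (regression target) and
    a binary flag for 'fails within `horizon` cycles' (classification)."""
    rows = []
    for cycle in range(1, max_cycle + 1):
        rul = max_cycle - cycle  # remaining useful life at this cycle
        rows.append({
            "cycle": cycle,
            "rul": rul,
            "fails_soon": int(rul <= horizon),
        })
    return rows

# One engine that failed at cycle 120, labeled with a 30-cycle horizon
rows = label_engine_cycles(max_cycle=120, horizon=30)
```

Multi-class labels follow the same pattern, with several horizon bands instead of one threshold. Sensor readings per cycle then become the features alongside these targets.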
How to Use Algorithms to Scale Digital BusinessTeradata
Gartner defines digital business as the creation of new business designs by blurring the digital and physical worlds. Digital business creates new business opportunities, but the amount of data generated will eclipse the human ability to process it. Further, many complex decisions will need to be made in timeframes, and at scales, that are impossible for human actors. Gartner analyst Chet Geschickter will share advice on how to leverage algorithmic business principles to drive digital business success.
Deep Learning Use Cases - Data Science Pop-up Seattle (Domino Data Lab)
Companies like Google, Microsoft, Amazon and Facebook are in fierce competition for teams that can build deep-learning applications. Because of deep learning's general usefulness in pattern recognition, those applications are surprisingly diverse, ranging from image recognition to machine translation. This talk will explore deep learning use cases for the major data types -- image, sound, text and time series -- as they're emerging in the private sector. Presented by Chris Nicholson, Co-Founder and CEO at Skymind.
significance_of_test_estimating_in_the_software_development.pptx (sarah david)
Accurate estimation helps project managers maintain a well-organized project timeline. With a clear understanding of the time required for testing activities, realistic schedules can be developed, ensuring effective coordination with development and other project tasks.
Cosmetic shop management system project report.pdf (Kamal Acharya)
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. The project includes various functions to perform the tasks mentioned above.
Data file handling has been used effectively in the program.
The automated cosmetic shop management system deals with the automation of the general workflow and administration processes of the shop. The main processes of the system focus on customer requests, where the system is able to search for the most appropriate products and deliver them to the customers. It helps employees quickly identify the cosmetic products that have reached their minimum quantity, keeps track of the expiry date of each product, and helps employees find the rack number in which a product is placed. It is also a faster and more efficient way of working.
Software organizations that want to maximize the yield of Software Testing find that choosing the right testing strategy is hard, and most testing managers are ill-prepared for this. The organization has to learn how to plan testing efforts based on the characteristics of each project and the many ways the software product is to be used. This tutorial is intended for Software professionals who are likely to be responsible for defining the strategy and planning of the testing effort and managing it through its life cycle. These roles are usually Testing Managers or Project Managers.
significance_of_test_estimating_in_the_software_development.pdf (sarah david)
Accurate estimation helps project managers maintain a well-organized project timeline. With a clear understanding of the time required for testing activities, realistic schedules can be developed, ensuring effective coordination with development and other project tasks.
Best Practices for Implementing Self-Service Analytics (MattSaxton5)
Self-service analytics is generally recognized as a valuable asset within corporate strategies, and it’s easy to see why: it provides process experts with the user-friendly tools they need to tackle their day-to-day challenges. It allows problems to be resolved faster and frees up central analytics groups to focus on other pressing issues.
In this ebook, we will share five key learnings from some of our most successful customers in order to help you drive your self-service analytics journey towards success.
Learn more about advanced industrial analytics at www.trendminer.com
Advancing Engineering with AI through the Next Generation of Strategic Projec... (OnePlan Solutions)
In the engineering sector, mastering the intricacies of project management demands innovative solutions. This webinar explores the integration of AI into project planning for engineering, tackling both immediate challenges in planning and execution while also setting the stage for unprecedented efficiency and quality. With a spotlight on practical applications, we’ll explore strategies for harnessing AI to optimize resource distribution, ensure precise time management, and elevate project quality. Discover how adopting a technology-forward approach, exemplified by platforms like OnePlan, can transform project outcomes, enhance team collaboration, and boost overall profitability without sacrificing the high standards engineering projects require.
Best Practices: Planning Data Analytics into Your Audits (FraudBusters)
These slides accompany a video training presentation from AuditNet®. The video is available to view at http://bit.ly/1eBRLiZ (registration with AuditNet.tv required)
Learning Objectives:
Gain an appreciation, based on attendees' own experiences, of common successes and pitfalls when planning data analytics.
Understand some common approaches to overcoming obstacles to planning data analytics based on case studies from companies and survey attendees themselves.
Learn how planning analytics can be integrated into top audit areas.
Outline an effective data request process to ensure complete and accurate extractions of data every time.
See how analytics can maximize the annual audit plan and better ensure focus is placed on organizational risk.
This workshop teaches business leaders how to implement AI technologies to serve customers better than anybody else.
AGENDA
Introduction to Artificial Intelligence
Extracting Value & Delivering Value
Predictive & Preventive maintenance
Marine market, Jet engines
How to prepare & implement AI Playbook
GSTi India’s mission is to provide end-to-end IT solutions for clients across the globe by aligning, creating, developing and providing efficient and cost effective services.
Doing Analytics Right - Designing and Automating Analytics (Tasktop)
There is no “one size fits all” in development analytics. It is not as simple as “here are the measures you need, go implement them.” The world of software delivery is too complex, and software organizations differ too significantly, to make it that simple. As discussed in the first webinar, the analytics you need depend on your unique business goals and environment.
That said, the design of your analytics solution will still require:
* the dashboards,
* the required data, and
* an appropriate choice of analytical techniques and statistics to apply to the data.
This webinar will describe a straightforward method for finding your analytic solution. In particular, we will explain how to adapt the Goal, Question, Metric (GQM) method to development processes. In addition, we will explain how to avoid “the light is brighter here” analytics anti-pattern: the idea that organizations tend to design metrics programs around the data they can easily get, rather than figuring out how to get the data they really need.
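The GQM derivation the webinar describes can be represented as a simple traceable structure: every metric must answer a question, and every question must serve a goal. A hedged sketch (the goal, questions, and metric names below are invented placeholders, not the webinar's content):

```python
gqm = {
    "goal": "Reduce cycle time for customer-facing features",
    "questions": [
        {
            "question": "Where does work wait longest in the delivery flow?",
            "metrics": ["queue_time_per_stage", "handoff_count"],
        },
        {
            "question": "Is cycle time improving release over release?",
            "metrics": ["median_cycle_time", "cycle_time_p90"],
        },
    ],
}

def orphan_metrics(gqm, available_data):
    """Surface the gap behind the 'light is brighter here' anti-pattern:
    metrics the plan needs but no data source currently provides."""
    needed = {m for q in gqm["questions"] for m in q["metrics"]}
    return sorted(needed - set(available_data))

missing = orphan_metrics(gqm, ["median_cycle_time", "handoff_count"])
```

Working top-down from the goal and then checking what data is missing inverts the anti-pattern: you discover which data to go get, rather than building dashboards around whatever is easy to collect.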
With so much noise and so many buzzwords floating around regarding data analytics, it can be rather difficult to separate the signal (what is worthwhile) from what is only talk. Sometimes the rhetoric even starts within your organization, confounding the issue further. During Andrew’s session, he will provide attendees with the knowledge they need to tune out the bogus information while gleaning valuable insights for developing and deploying their audit analytics program. The presentation will conclude with tangible examples of a successful Manufacturing Audit Analytics program, and recommendations for how to get yours up and running. After attending, participants will be able to articulate the steps for setting up an analytics program within their departments, and will be armed with knowledge for educating senior leadership on the fundamental changes in technology that are occurring, and what is just marketing.
Cloudbyz PPM, integrated enterprise PPM-ALM-APM on force.com (Dinesh Sheshadri)
Cloudbyz PPM is an integrated enterprise project portfolio management (PPM), application life cycle management (ALM) and application portfolio management (APM) built on Salesforce 1 platform. Cloudbyz PPM is focused on providing agility, real-time visibility and enhanced collaboration and productivity to CIO / IT organization.
Similar to Big Data Analytics: From Insights to Production (20)
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2... (pchutichetpong)
M Capital Group (“MCG”) expects demand to keep pace with an evolving supply picture, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), and through the ever-expanding need for data storage as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
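The automated validation described under point 4 can be sketched as declarative rules applied at ingestion, so bad rows are flagged at the source rather than discovered downstream (rule names, fields, and thresholds below are illustrative, not tied to any particular tool):

```python
def validate(record, rules):
    """Apply each named rule to a record; return the names of the
    rules it violates, so failing rows can be quarantined early."""
    return [name for name, check in rules.items() if not check(record)]

# Declarative quality rules for a hypothetical transactions feed
rules = {
    "id_present": lambda r: bool(r.get("id")),
    "amount_non_negative": lambda r: r.get("amount", 0) >= 0,
    "currency_known": lambda r: r.get("currency") in {"USD", "EUR", "GBP"},
}

good = {"id": "t-1", "amount": 12.5, "currency": "EUR"}
bad = {"id": "", "amount": -3, "currency": "XXX"}
```

Keeping the rules as data (rather than scattering `if` statements through pipeline code) is what makes the checks auditable and easy to extend as new sources arrive.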
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... — Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation that decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method, where all vertices are processed in every iteration. It does, however, come with a precondition: the input graph must contain no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but considerably slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
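The decomposition the abstract describes can be sketched in plain Python. This is an illustrative reconstruction, not the report's actual code: `pagerank_monolithic` is the standard power iteration, and `pagerank_levelwise` finalises one strongly connected component at a time in topological order, so rank mass flowing in from earlier components is a constant. Dead ends are assumed absent, per the stated precondition, under which both variants converge to the same ranks.

```python
from collections import defaultdict

def pagerank_monolithic(graph, d=0.85, tol=1e-10):
    """Standard power iteration over the whole graph at once.
    `graph` maps each vertex to its out-neighbour list; no dead ends."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    while True:
        new = {v: (1 - d) / n for v in graph}
        for u, outs in graph.items():
            share = d * rank[u] / len(outs)
            for v in outs:
                new[v] += share
        if max(abs(new[v] - rank[v]) for v in graph) < tol:
            return new
        rank = new

def sccs_in_topological_order(graph):
    """Kosaraju's algorithm; components come out so that every edge of
    `graph` points from an earlier component to a later (or same) one."""
    visited, order = set(), []
    def dfs(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                dfs(v)
        order.append(u)  # post-order: highest finish time last
    for u in graph:
        if u not in visited:
            dfs(u)
    reverse = defaultdict(list)
    for u, outs in graph.items():
        for v in outs:
            reverse[v].append(u)
    assigned, comps = set(), []
    for root in reversed(order):
        if root in assigned:
            continue
        comp, stack = [], [root]
        assigned.add(root)
        while stack:  # flood-fill one SCC on the reversed graph
            x = stack.pop()
            comp.append(x)
            for y in reverse[x]:
                if y not in assigned:
                    assigned.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps

def pagerank_levelwise(graph, d=0.85, tol=1e-10):
    """Finalise one SCC at a time, in topological order: incoming rank
    from already-finalised components is constant, so each component
    converges independently (no per-iteration global communication)."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    incoming = defaultdict(list)
    for u, outs in graph.items():
        for v in outs:
            incoming[v].append(u)
    for comp in sccs_in_topological_order(graph):
        members = set(comp)
        # contribution from earlier, already-finalised components
        base = {v: (1 - d) / n + sum(d * rank[u] / len(graph[u])
                                     for u in incoming[v] if u not in members)
                for v in comp}
        while True:  # iterate only over this component's internal edges
            new = {v: base[v] + sum(d * rank[u] / len(graph[u])
                                    for u in incoming[v] if u in members)
                   for v in comp}
            done = max(abs(new[v] - rank[v]) for v in comp) < tol
            rank.update(new)
            if done:
                break
    return rank
```

On a small dead-end-free graph with two components, e.g. `{0: [1], 1: [2], 2: [0], 3: [4], 4: [3, 0]}`, the two functions agree to within the tolerance.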
1. From insights to production with Big Data Analytics
Eliano Marques – Senior Data Scientist
November 2015
2.
3. Large-scale solutions are typically part of a discovery process and fully integrated with the organisation's strategy

Big Data Analytics Strategy and Ambition — the analytics lifecycle runs through four stages, all under shared Big Data Analytics governance:

1. Business analytics roadmap: capture of analytics use cases and development of analytics roadmap(s) with business areas
2. Experimentation: agile analytics discovery PoC on offline/online data to prove analytics potential prior to a decision on large-scale productionisation
3. Validation: decision on whether to promote an analytics use case for productionisation
4. Productionisation: large-scale deployment of the analytics use case based on agile scrum principles and methods
4. Use case – Predictive Maintenance
Business analytics roadmap (stage 1)

Stakeholder questions, from strategic to operational:

CFO & Director of Assets/Production — strategy:
• What is the outcome of different capital investments over the next 5 years? How do I measure the impact on maintenance?
• Which assets/parts should be targeted for replacement? How do we prioritise them over time?
• How do we plan overall costs ahead? What options are available?

Director of Operations — tactical:
• How do we predict demand for reactive maintenance? Can it be reduced? What is the optimal mix between pro-active and reactive maintenance?
• How do we predict stock levels for assets/parts? Can they be minimised?
• What capacity is needed? Do we need to sub-contract?

Field Teams Lead — operational:
• How do we increase field-force efficiency? How can we reduce engineering visits?
• How do we prioritise faults?
• How do we predict false alerts?
5. Use case – Predictive Maintenance
Experimentation (stage 2)

Key activities: business and data workshops → experiment development → experiment testing → experiment results

Key iterations:
• Initial workshops between experiment owners, data owners, data engineers and data scientists
• Weekly sessions to check experiment progress and validate initial results
• Delivery workshop with programme management to share experiment results

Who's involved: production team, experiment owner, data engineers, data scientists

Key outputs — three hypotheses:

H1: What's the impact of different capital investment strategies?
• Build target investment models linked with maintenance, volumes and workforce
• Develop a simulation tool and run scenarios on demand
• Validate/test the solution with key stakeholders

H2: Can sensor data be used to predict time-to-fail or risk-to-fail of asset parts?
• Link sensors with faults
• Prioritise sensors by criticality of failure
• Develop models and predict time/risk to fail by asset/part
• Validate/test models with key stakeholders

H3: How to minimise fault root-cause detection time and uplift efficiency?
• Segment field force by time to detect root-cause patterns
• Predict root cause of failure by type of asset/part
• Validate/test models with key stakeholders
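The "link sensors with faults" step under H2 amounts to building a supervised training target from two event streams. The sketch below is an assumed illustration (field names and the 24-hour horizon are not from the deck): each sensor reading is labelled 1 if a logged fault occurs within the next `horizon_hours`, yielding a risk-to-fail target a model can then be trained on.

```python
from datetime import datetime, timedelta

def label_risk_to_fail(readings, fault_times, horizon_hours=24):
    """readings: list of (timestamp, sensor_value) pairs.
    fault_times: list of fault timestamps for the same asset.
    Returns (timestamp, value, label) triples, where label = 1 means a
    fault occurred within `horizon_hours` after the reading."""
    horizon = timedelta(hours=horizon_hours)
    labelled = []
    for ts, value in readings:
        at_risk = any(ts <= f <= ts + horizon for f in fault_times)
        labelled.append((ts, value, int(at_risk)))
    return labelled

# Assumed toy data: one reading every 12 hours, a fault at hour 60.
t0 = datetime(2015, 11, 1)
readings = [(t0 + timedelta(hours=h), 40 + h) for h in range(0, 72, 12)]
faults = [t0 + timedelta(hours=60)]
labels = [row[2] for row in label_risk_to_fail(readings, faults)]
# Only readings within 24h before the fault (hours 36, 48, 60) get label 1.
```

In practice the sensor values would be expanded into window features (rolling means, trends, exceedance counts) before modelling, but the labelling logic is the same.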
6. Use case – Predictive Maintenance
Validation (stage 3)

Key activities: business-case assumptions → business-case development → workshop preparation → validation workshop

Key iterations:
• Meetings with the production team and business area leads to gather business-case inputs
• Meeting with the business area lead to validate the business case
• Validation workshop with the steering committee to obtain approval for moving the solution to production

Who's involved: experiment team, experiment owner, steering committee, production team

Key outputs:
H2: Can sensor data be used to predict time-to-fail or risk-to-fail of asset parts?
Post-experimentation question: is it worth moving to production?

Business-case inputs:
• Analytics: new ingestions? How many models? Prediction frequency? Rules engine?
• Technology costs and change assumptions: how will users access the output and make decisions on demand?
• Business value assumptions: what's the size of the benefit? Is it tangible?
• Business case: is the use case viable financially? What's the ROI? What's the payback period?

[Architecture diagram: information sources feed an ingest layer (evaluate source data, prepare source metadata, prepare data for ingest) into an Enterprise Data Lake (sequence, automate, apply structure, compress, protect; collect & manage metadata; perimeter, authentication and authorisation), serving a dashboard engine and downstream applications]
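The ROI and payback questions in the business case reduce to simple arithmetic once the cost and benefit assumptions are agreed. The figures below are invented for illustration only; they are not numbers from the deck.

```python
def roi(total_benefit, total_cost):
    """ROI as a fraction: (benefit - cost) / cost."""
    return (total_benefit - total_cost) / total_cost

def payback_period_months(upfront_cost, monthly_net_benefit):
    """Months until cumulative net benefit covers the upfront cost."""
    return upfront_cost / monthly_net_benefit

# Assumed example: 500k upfront build cost, 30k/month net savings from
# fewer reactive call-outs, evaluated over a 3-year horizon.
build_cost = 500_000
monthly_saving = 30_000
benefit_3y = monthly_saving * 36  # 1,080,000

print(f"ROI over 3 years: {roi(benefit_3y, build_cost):.0%}")          # 116%
print(f"Payback: {payback_period_months(build_cost, monthly_saving):.1f} months")  # 16.7
```

A fuller business case would discount future savings and add running costs per month, but the steering-committee question ("is it worth moving to production?") hinges on exactly these two numbers.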
7. Use case – Predictive Maintenance
Productionisation (stage 4)

Key activities: release planning → create project backlog → sprint cycles (Model 1, Model 2, Model 3) → production deployment → governance, maintenance & training

Key iterations:
• Regular meetings in an agile scrum format, including sprint planning, daily scrums and sprint reviews
• Bi-weekly sign-off of development progress by programme management and the business area lead

Who's involved: experiment team, experiment owner, production team, scrum master

Key outputs:
H2: Can sensor data be used to predict time-to-fail or risk-to-fail of asset parts?
Post-experimentation question: is it worth moving to production? YES ✔

With the solution running:
• Business and field engineers can now act on real-time signals based on predictions of time/risk to fail for assets and parts
• Rules can be automated to act on high-risk threats
• Pro-active maintenance decisions can now be made to optimise costs and maintenance efficiency

[Architecture diagram: the same information sources → ingest → Enterprise Data Lake → dashboard engine and downstream applications pipeline as on slide 6, now shown as the solution running]
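The automated-rules idea on this slide can be sketched very simply. This is a hypothetical illustration (the 0.8 threshold, asset IDs and work-order fields are assumptions, not from the deck): when the model's predicted risk-to-fail crosses a threshold, a pro-active maintenance order is raised, highest risk first.

```python
RISK_THRESHOLD = 0.8  # assumed cut-off for triggering pro-active maintenance

def triage(predictions, threshold=RISK_THRESHOLD):
    """predictions: list of (asset_id, risk_to_fail) with risk in [0, 1].
    Returns work orders for assets at or above the threshold, sorted by
    descending risk so the most urgent asset is handled first."""
    high_risk = [(asset, risk) for asset, risk in predictions
                 if risk >= threshold]
    high_risk.sort(key=lambda pair: pair[1], reverse=True)
    return [{"asset": asset,
             "action": "schedule pro-active maintenance",
             "risk": risk}
            for asset, risk in high_risk]

# Assumed example scores from the deployed time/risk-to-fail models:
orders = triage([("pump-17", 0.93), ("valve-02", 0.41), ("pump-09", 0.86)])
# pump-17 and pump-09 exceed the threshold; valve-02 does not.
```

In a production rules engine the output would feed the field-force scheduling system rather than a Python list, but the threshold-plus-prioritisation pattern is the core of "act on high-risk threats".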