Industrial IoT is currently transforming how businesses capitalize on their big data. Changes in how business is done, combined with multiple technology drivers, make geo-distributed data increasingly important for enterprises. These changes are causing serious disruption across a wide range of industries.
A presentation on integrating real-time data into the cloud, with significant potential in Industrial IT, real-time sensor information processing, and smart grids across various vertical industries. It is related to my blog post at www.cloudshoring.in
With IoT becoming ubiquitous, there has been great interest in the industry in developing novel techniques for anomaly detection (AD) at the Edge. Example applications include, but are not limited to, smart cities/grids of sensors, industrial process control in manufacturing, smart homes, wearables, connected vehicles, and agriculture (sensing for soil moisture and nutrients). What makes anomaly detection at the Edge different? The following constraints, whether due to the sensors or the applications, necessitate the development of new algorithms for AD.
* Very low power and low compute/memory resources
* High data volume making centralized AD infeasible owing to the communication overhead
* Need for low latency to drive fast action taking
* Guaranteeing privacy
In this talk we shall shed light on the above in detail. Subsequently, we shall walk through the algorithm design process for anomaly detection at the Edge. Specifically, we shall dive into the need to build small models/ensembles owing to limited memory on the sensors. Further, we shall discuss how to train on data in an online fashion, as long-term historical data is not available due to limited storage. Given the need for data compression to contain the communication overhead, can one carry out anomaly detection on compressed data? We shall shed light on building small models, sequential and one-shot learning algorithms, compressing the data with the models, and limiting the communication to only the data corresponding to the anomalies and the model description. We shall illustrate the above with concrete examples from the wild!
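The constraints above (tiny memory, online training, transmitting only anomalies) can be illustrated with a minimal sketch. This is not the speaker's algorithm: it assumes a simple z-score rule on top of Welford's running mean and variance, and the names `OnlineZScoreDetector`, `readings_to_transmit`, `threshold`, and `warmup` are hypothetical.

```python
import math


class OnlineZScoreDetector:
    """Streaming anomaly detector with O(1) memory (Welford's running mean/variance)."""

    def __init__(self, threshold=3.0, warmup=10):
        self.threshold = threshold  # flag readings this many std devs from the mean
        self.warmup = warmup        # readings to observe before flagging anything
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations from the mean

    def update(self, x):
        """Score x against the current model, then fold it into the model.

        Returns True when x is flagged as an anomaly.
        """
        is_anomaly = False
        # At least two samples are needed for a sample standard deviation.
        if self.n >= max(self.warmup, 2):
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Learn only from normal readings so outliers do not skew the model.
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)
        return is_anomaly


def readings_to_transmit(stream, detector):
    """Yield only the (index, value) pairs flagged as anomalous."""
    for i, x in enumerate(stream):
        if detector.update(x):
            yield i, x


if __name__ == "__main__":
    # Made-up temperature readings with one obvious spike.
    stream = [20.0, 20.5, 19.8, 20.1, 20.3, 19.9, 20.2, 20.0, 19.7, 20.4, 35.0, 20.1]
    print(list(readings_to_transmit(stream, OnlineZScoreDetector())))
```

The detector keeps three numbers regardless of how long the stream runs, which is the kind of footprint a constrained sensor can afford; only flagged readings would leave the device.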
Elastic as an advanced analytics solution for processes in the oil sector. Real-time analytics of sensor data to add value to organizations' strategic decisions.
Enabling Real-Time Business with Change Data Capture (MapR Technologies)
Machine learning (ML) and artificial intelligence (AI) enable intelligent processes that can autonomously make decisions in real-time. The real challenge for effective ML and AI is getting all relevant data to a converged data platform in real-time, where it can be processed using modern technologies and integrated into any downstream systems.
This lecture aims to give some food for thought regarding how current High Performance Computing systems (hardware and software) tend to merge with Big Data ones (Machine Learning, Analytics and Enterprise workloads) in order to meet the demands of both workloads while sharing the same clusters.
Joerg Bienert, CTO of ParStream, gave a presentation on February 25, 2014 about Big Data for Business Users. He talked about several use cases of current ParStream customers and about ParStream's technology itself.
Machine Learning Success: The Key to Easier Model Management (MapR Technologies)
Join Ellen Friedman, co-author (with Ted Dunning) of a new short O’Reilly book Machine Learning Logistics: Model Management in the Real World, to look at what you can do to have effective model management, including the role of stream-first architecture, containers, a microservices approach and a DataOps style of work. Ellen will provide a basic explanation of a new architecture that not only leverages stream transport but also makes use of canary models and decoy models for accurate model evaluation and for efficient and rapid deployment of new models in production.
Michael will discuss some of the issues and challenges around Big Data. It is all very well building Big Data friendly databases to manage the tidal wave of real-time data that the IoT inevitably creates but this must also be incorporated into legacy data to deliver actionable insight.
Big Data in IoT & Deep Learning
Challenges of IoT Big Data Analytics Applications
Challenges of Cloud-based IoT Platform
Cloud-based IoT Platform Use Case: GE Predix for Smart Building Energy Management
Fog/Edge Computing & Micro Data Centers
Deep Learning for IoT Big Data Analytics Introduction
Deep Learning for IoT Big Data Analytics Use Case
Distributed Deep Learning
Big Data + IoT + Cloud + Deep Learning Insights from Patents
Big Data + IoT + Cloud + Deep Learning Strategy Development
Designing Data-Intensive Applications
Xanadu Functionality
Xanadu Use Case
Xanadu + Deep Learning + Hadoop Integration
Delivered this talk as part of Spark & Kafka Summit 2017 organized by Unicom Learning Conference.
Big data processing is undoubtedly one of the most exciting areas in computing today, and remains an area of fast evolution and introduction of new ideas. Apache Spark is on the cusp of overtaking MapReduce to emerge as the de facto standard for big data processing. Thanks to its multi-functional capabilities (SQL, Structured Streaming, ML Pipelines and GraphX) under one unified platform, Spark is now a dominant compute technology across various industry use cases and real-time analytics applications. In the past few years, Apache Spark has seen successful production and commercial deployments across the e-commerce, healthcare and travel industries.
The session gave the audience an understanding of the latest and upcoming trends in big data analytics and the role of Spark in enabling those future use cases of advanced analytics.
The session explored the latest concepts from Apache Spark 2.x and introduced various ML/DL frameworks that can run on Spark, along with some real-life use cases and applications from the Retail and IoT verticals.
Before IoT was even a buzzword, our Heavy Industry customers were running control systems for core parts of their business. Mining, Oil & Gas and Manufacturing have relied on PLCs and embedded systems, but are now looking at liberating this data into modern, open platforms. Come and see how AWS tools and services can help accelerate this process, with a focus on Edge and time series data.
How Data-Driven Approaches are Changing Your Data Management Strategies
Introducing data-driven strategies into your business model alters the way your organization manages and provides information to your customers, partners and employees. Gone are the days of “waterfall” implementation strategies from relational data to applications within a data center. Now, data-driven business models require agile implementation of applications based on information from all across an organization (on-premises, cloud, and mobile), including information from outside corporate walls from partners, third-party vendors, and customers. Data management strategies need to be ready to meet these challenges, or your new and disruptive business models will fail at the most critical time: when your customers want to access that information.
Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a service. Though originally used within the telecommunications industry, it has become common practice for banks, ISPs, insurance firms, and other verticals. More: http://info.mapr.com/WB_PredictingChurn_Global_DG_17.06.15_RegistrationPage.html
The prediction process is data-driven and often uses advanced machine learning techniques. In this webinar, we'll look at customer data, do some preliminary analysis, and generate churn prediction models – all with Spark machine learning (ML) and a Zeppelin notebook.
The goal of Spark’s ML library is to make machine learning scalable and easy. Zeppelin with Spark provides a web-based notebook that enables interactive machine learning and visualization.
In this tutorial, we'll do the following:
Review classification and decision trees
Use Spark DataFrames with Spark ML pipelines
Predict customer churn with Apache Spark ML decision trees
Use Zeppelin to run Spark commands and visualize the results
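As background for the decision-tree review in the list above, here is a minimal sketch of how a tree chooses a split. It is plain Python, not Spark ML code, and it assumes Gini impurity as the split criterion (one common choice). The functions `gini` and `best_split` and the toy data are hypothetical and only illustrate the idea.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    Returns (None, gini(labels)) when no split improves on the unsplit data.
    """
    best = (None, gini(labels))
    # Splitting at the maximum value would leave the right side empty, so skip it.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


if __name__ == "__main__":
    # Made-up data: customer service calls per customer, and whether they churned.
    service_calls = [0, 1, 1, 2, 4, 5, 6, 7]
    churned = [0, 0, 0, 0, 1, 1, 1, 1]
    print(best_split(service_calls, churned))
```

A full decision tree repeats this search recursively on each side of the chosen split and across all features; a library implementation does the same at scale.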
Big Data LDN 2018: 7 SUCCESSFUL HABITS FOR DATA-INTENSIVE APPLICATIONS IN PRO... (Matt Stubbs)
Date: 14th November 2018
Location: Keynote Theatre
Time: 11:50 - 12:20
Speaker: Ellen Friedman
Organisation: MapR
About: We’ve seen that over 90% of our customers have large scale projects successfully in production. What are they doing right? And how can you adapt their effective habits to your own business?
Value comes from big data when you have successful production deployments of data-intensive AI and analytics applications tied to practical business goals. Doing this well can be difficult on many levels. Each business presents its own challenges, but we’ve observed a number of habits that are common to many of the organizations who are getting value from their production deployments.
This presentation will explore 7 key habits that can make a difference and use real world examples to show you why. From architecture to technology to organizational culture, you’ll learn practical approaches that can improve your likelihood of success in production.
ML Workshop 1: A New Architecture for Machine Learning Logistics (MapR Technologies)
Having heard the high-level rationale for the rendezvous architecture in the introduction to this series, we will now dig in deeper to talk about how and why the pieces fit together. In terms of components, we will cover why streams work, why they need to be persistent, performant and pervasive in a microservices design and how they provide isolation between components. From there, we will talk about some of the details of the implementation of a rendezvous architecture including discussion of when the architecture is applicable, key components of message content and how failures and upgrades are handled. We will touch on the monitoring requirements for a rendezvous system but will save the analysis of the recorded data for later. Listen to the webinar on demand: https://mapr.com/resources/webinars/machine-learning-workshop-1/
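One possible reading of the rendezvous step can be sketched in a few lines. The abstract does not specify the selection logic, so the deadline-plus-preference rule below is an assumption, and the function `rendezvous` and the model names `primary` and `fallback` are hypothetical.

```python
def rendezvous(results, preference, deadline_ms):
    """Pick the response to return for a single request.

    results:     list of (model_name, latency_ms, score), one per model that answered
    preference:  model names, most preferred first
    deadline_ms: answers slower than this are ignored
    Returns (model_name, score), or (None, None) if no model answered in time.
    """
    on_time = {name: score for name, latency, score in results if latency <= deadline_ms}
    for name in preference:
        if name in on_time:
            return name, on_time[name]
    return None, None


if __name__ == "__main__":
    # Made-up results for one request, as if read back from a results stream.
    results = [("primary", 120, 0.91), ("fallback", 15, 0.88)]
    print(rendezvous(results, ["primary", "fallback"], deadline_ms=50))
    print(rendezvous(results, ["primary", "fallback"], deadline_ms=200))
```

Because every model reads the same input and writes its own result, a slow or failed model only changes which answer is selected. This is one way the isolation between components described above could work in practice.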
Data Warehouse Modernization: Accelerating Time-To-Action (MapR Technologies)
Data warehouses have been the standard tool for analyzing data created by business operations. In recent years, increasing data volumes, new types of data formats, and emerging analytics technologies such as machine learning have given rise to modern data lakes. Connecting application databases, data warehouses, and data lakes using real-time data pipelines can significantly improve the time to action for business decisions. More: http://info.mapr.com/WB_MapR-StreamSets-Data-Warehouse-Modernization_Global_DG_17.08.16_RegistrationPage.html
The Synapse IoT Stack: Technology Trends in IOT and Big Data (InMobi Technology)
This is the presentation from Big Data November Bangalore Meetup 2014.
http://technology.inmobi.com/events/bigdata-meetup
Talk Outline:
- What does THE HIVE provide?
- Goals of Synapse Tech Stack
- THE HIVE Startups
- Demystifying IoT Market
- Synapse Stack for IoT
- Big Data Challenge
- Synapse Lambda Architecture
- Synapse Components
- Synapse Internals
- AKILI – Synapse Machine Learning
Changes in how business is done combined with multiple technology drivers make geo-distributed data increasingly important for enterprises. These changes are causing serious disruption across a wide range of industries, including healthcare, manufacturing, automotive, telecommunications, and entertainment. Technical challenges arise with these disruptions, but the good news is there are now innovative solutions to address these problems. http://info.mapr.com/WB_Geo-distributed-Big-Data-and-Analytics_Global_DG_17.05.16_RegistrationPage.html
Watch this recorded webcast and listen to Infochimps CSO and Co-Founder, Dhruv Bansal, and Think Big Analytics Principal Architect, Douglas Moore, share successful use cases and recommendations for building real-time predictive analytics in your enterprise.
Predictive Maintenance Using Recurrent Neural Networks (Justin Brandenburg)
My presentation from AnacondaCON 2018, where I discussed using Recurrent Neural Networks, Python, TensorFlow and the MapR Platform to develop and deploy a predictive maintenance model for an IoT device in the manufacturing industry.
Spark and MapR Streams: A Motivating Example (Ian Downard)
Businesses are discovering the untapped potential of large datasets and data streams through the use of technologies for big data processing and storage. By leveraging these assets, they’re creating a new generation of applications that derive value from data they used to throw away. In this presentation, Ian Downard shows how to build operational environments for these types of applications with the MapR Converged Data Platform, and he describes examples of next-generation applications that use Java APIs for MapR Streams, Apache Spark, Apache Hive, and MapR-DB. He shows how these technologies can be used to join and transform unbounded datasets to find signals and derive new data streams for a financial scenario involving real-time algorithmic trading and historical analysis using SQL. He also discusses how MapR enables you to run real-time data applications with the speed, reliability, and security you need for a production environment.
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ... (MapR Technologies)
Big data technologies are being applied to a wide variety of use cases. We will review tangible examples of machine learning, discuss an autonomous driving project and illustrate the role of MapR in next generation initiatives. More: http://info.mapr.com/WB_Machine-Learning-for-Chickens_Global_DG_17.11.02_RegistrationPage.html
Real-Time Robot Predictive Maintenance in Action (DataWorks Summit)
Industry 4.0 IoT applications promise vast gains in productivity from reduced downtime, higher product quality and higher efficiency. Modern industrial robots integrate hundreds of sensors of all kinds, generating tremendous volumes of data rich in valuable information. However, the reality is that some of the most advanced industrial makers in the world are barely getting started making use of this data, with relatively rudimentary, bespoke monitoring systems built at tremendous cost.
We believe that it is now possible, using a well-chosen selection of enterprise open source big data projects, to successfully deploy Industry 4.0 pilot use cases in a matter of months, at a small fraction of the cost of equivalent projects at leading high-tech makers. We propose to show a working prototype of just such a system, and explain in some detail how it was made.
Our presentation describes a working real-time ML-based anomaly detection system. We show a working industrial robot-analog installed with a wireless movement sensor. Our system scores the data in a cloud-based cluster. For added realism, the system we demonstrate live includes a working augmented-reality headset that can show the real-time status overlaid on the working robot.
This talk is about demonstrating a concrete example of a real-time predictive maintenance system, built as a series of microservices connected by Kafka streams and powered by the excellent H2O distributed Machine Learning tool. Our goal is for our attendees to get a feel for what can be realistically achieved by a few non-genius-level engineers in a few months of effort using the best in open source technology for real-time streams (Kafka) and Machine learning (H2O).
Where appropriate, we’ll mention how our choice of using the MapR Converged Data Platform made the development easier thanks to some of its unique features.
Speaker
Cao Yi, MapR
How Spark is Enabling the New Wave of Converged Cloud Applications (MapR Technologies)
Apache Spark has become the de-facto compute engine of choice for data engineers, developers, and data scientists because of its ability to run multiple analytic workloads with a single, general-purpose compute engine.
But is Spark alone sufficient for developing cloud-based big data applications? What are the other required components for supporting big data cloud processing? How can you accelerate the development of applications which extend across Spark and other frameworks such as Kafka, Hadoop, NoSQL databases, and more?
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon... (MapR Technologies)
IT budgets are shrinking, and the move to next-generation technologies is upon us. The cloud is an option for nearly every company, but just because it is an option doesn’t mean it is always the right solution for every problem.
Most cloud providers would prefer that every customer be tightly coupled with their proprietary services and APIs to create lock-in with that cloud provider. The savvy customer will leverage the cloud as infrastructure and stay loosely bound to a cloud provider. This creates an opportunity for the customer to execute a multicloud strategy or even a hybrid on-premises and cloud solution.
Jim Scott explores different use cases that may be best run in the cloud versus on-premises, points out opportunities to optimize cost and operational benefits, and explains how to get the data moved between locations. Along the way, Jim discusses security, backups, event streaming, databases, replication, and snapshots across a variety of use cases that run most businesses today.
Self-Service Data Science for Leveraging ML & AI on All of Your Data (MapR Technologies)
MapR has launched the MapR Data Science Refinery which leverages a scalable data science notebook with native platform access, superior out-of-the-box security, and access to global event streaming and a multi-model NoSQL database.
Designing data pipelines for analytics and machine learning in industrial set... (DataWorks Summit)
Machine learning has made it possible for technologists to do amazing things with data. Its arrival coincides with the evolution of networked manufacturing systems driven by IoT. In this presentation we’ll examine the rise of IoT and ML from a practitioner's perspective to better understand how applications of AI can be built in industrial settings. We'll walk through a case study that combines multiple IoT and ML technologies to monitor and optimize an industrial heating and cooling (HVAC) system. Through this instructive example you'll see how the following components can be put into action:
1. A StreamSets data pipeline that sources from MQTT and persists to OpenTSDB
2. A TensorFlow model that predicts anomalies in streaming sensor data
3. A Spark application that derives new event streams for real-time alerts
4. A Grafana dashboard that displays factory sensors and alerts in an interactive view
By walking through this solution step-by-step, you'll learn how to build the fundamental capabilities needed in order to handle endless streams of IoT data and derive ML insights from that data:
1. How to transport IoT data through scalable publish/subscribe event streams
2. How to process data streams with transformations and filters
3. How to persist data streams with the timeliness required for interactive dashboards
4. How to collect labeled datasets for training machine learning models
At the end of this presentation you will have learned how a variety of tools can be used together to build ML enhanced applications and data products for instrumented manufacturing systems.
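The transform, filter, and derived-alert-stream steps listed above can be sketched with plain Python generators. This is a toy stand-in rather than the StreamSets, Spark, or MQTT setup from the talk; the `sensor_id,temperature` message format, the functions `parse` and `alerts`, and the 40 °C limit are all assumptions made for illustration.

```python
def parse(messages):
    """Transform raw 'sensor_id,temperature' strings into dicts, skipping bad rows."""
    for raw in messages:
        try:
            sensor_id, temp = raw.split(",")
            yield {"sensor": sensor_id.strip(), "temp_c": float(temp)}
        except ValueError:
            continue  # malformed message: wrong field count or non-numeric value


def alerts(readings, max_temp_c):
    """Derive a new event stream containing only readings above max_temp_c."""
    for r in readings:
        if r["temp_c"] > max_temp_c:
            yield {"sensor": r["sensor"], "temp_c": r["temp_c"], "alert": "over_temperature"}


if __name__ == "__main__":
    # Made-up messages, as if received from a publish/subscribe topic.
    incoming = ["hvac-1,21.5", "garbage", "hvac-2,48.0", "hvac-3,abc"]
    for event in alerts(parse(incoming), max_temp_c=40.0):
        print(event)
```

Each stage consumes one stream and produces another without holding the full data set in memory, which is the property that lets the same shape of pipeline run over unbounded sensor data.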
Speakers
Ian Downard, Sr. Developer Evangelist, MapR
William Ochandarena, Senior Director of Product Management, MapR
An Introduction to the MapR Converged Data Platform (MapR Technologies)
Listen to the webinar on-demand: http://info.mapr.com/WB_Partner_CDP_Intro_EMEA_DG_17.05.31_RegistrationPage.html
In this 90-minute webinar, we discuss:
- The MapR Converged Data Platform and its components
- Use cases for the Converged Data Platform
- MapR Converged Partner Program
- How to get started with MapR
- Becoming a partner
In-memory computing principles by Mac Moore of GridGain (Data Con LA)
In the presentation, we will provide an overview of general in-memory computing principles and the drivers behind it. We will start with a summary of the technical drivers (abundant hardware resources) and market forces (the rise of Big Data). We will cover popular and emerging use cases for in-memory computing, from financial industry trading platforms to mobile payment processing, online advertising, online/mobile gaming back-ends and more. We will then present some foundational concepts and terminology, and discuss considerations around any in-memory solution. From there, we will illustrate how a complete in-memory computing stack like GridGain combines clustering, high performance computing, in-memory data grids, stream processing and Hadoop acceleration into one unified and easy to use platform.
7 Habits for Big Data in Production - keynote Big Data London Nov 2018Ellen Friedman
You can improve your chances for success with data intensive large scale applications (AI, machine learning and analytics) in production.
This keynote presentation from Big Data London shows you how.
To deal with devices that can create terabytes of data in short time windows, a new architecture is required. The MapR Converged Data Platform has always been a good fit for the consumer IoT model and for traditional big data use cases; with MapR Edge, the industrial IoT model now has an approach built for it as well.
MapR Edge is a small footprint edition of the MapR Converged Data Platform that addresses the need to capture, process, and analyze IoT data close to the source and is optimized to run on small-footprint, commodity hardware. MapR Edge puts computing power close to the data to enable processing in real-time. It works in conjunction with a core MapR deployment so you can perform local processing while also taking advantage of a larger, centralized cluster of aggregated data.
What does it solve? What’s the ideal environment for the edge?
It’s ideal for dealing with the characteristics of edge locations in industrial IoT environments.
Data sources that create huge volumes of data, such as mining sites, blast sites, and manufacturing plants, are ideal environments for MapR Edge.
Slow or occasionally connected sites can be managed by the bandwidth-aware replication in MapR.
Space-constrained locations that require a small computing footprint can be handled by MapR Edge, as it runs as a 3- to 5-node cluster on mini PCs such as Intel NUCs, which are typically the size of a small book and can be bought off the shelf.
Other vendors make similar PCs that are hardened for remote and even harsh locations as part of an IoT deployment.
We describe the interaction of MapR Edge clusters with a centralized core cluster as “act locally, learn globally.” This architecture gives you processing power to immediately act on events close to the data source, while also delivering the data, or subsets of it, to the central cluster to gain globally-based insights on aggregated data. These insights, in the form of machine learning models, are then deployed to the edge clusters for more real-time processing. This architecture is just another way that MapR lets you operationalize your data.
One of our oil company customers had a manual and expensive process for collecting data from their oil wells. They would drive out to the remote locations, download the data onto laptops, and drive to headquarters to upload that data. This was the only practical way for them to collect data, since the amount of data that’s created overwhelms their connection bandwidth.
With MapR, they can use edge nodes to monitor the wells and ingest data, as well as down-sample the collected data into a size that can be easily delivered over the Internet to the home cluster.
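As a rough illustration of the down-sampling step, the sketch below averages raw readings into fixed-width time buckets before they are shipped to the home cluster. It is a minimal example and not MapR’s implementation; the `downsample` function, the `(timestamp, value)` sample layout, and the pressure readings are all assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, sensor value)


def downsample(samples: List[Sample], bucket_seconds: float) -> List[Sample]:
    """Average samples into fixed-width time buckets.

    Returns one (bucket_start, mean_value) pair per non-empty bucket,
    ordered by bucket start time.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    buckets: Dict[float, List[float]] = {}
    for ts, value in samples:
        # Floor the timestamp to the start of its bucket.
        bucket_start = (ts // bucket_seconds) * bucket_seconds
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical well-pressure readings taken once per second.
readings = [(0.0, 10.0), (1.0, 12.0), (2.0, 14.0),
            (3.0, 20.0), (4.0, 22.0), (5.0, 24.0)]
reduced = downsample(readings, bucket_seconds=3.0)
print(reduced)
```

Averaging is only one possible reduction; depending on the sensor, keeping the minimum, maximum, or last value per bucket may preserve more of what matters.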
Another example for the edge is a large medical device company in Europe. They sell MRI machines and other medical equipment to hospitals.
Traditionally, data coming from the medical devices was synced manually, file by file, in batches, and sent to the central cluster, where a machine learning diagnosis app looked at the data and a diagnosis was sent back to the hospital.
With MapR Edge in the picture, they have started processing locally at the edge: they anonymize the scan data for compliance purposes, send it over the Internet to the core cluster for large-scale processing, and then return the results immediately. They not only get quick turnaround, but they can also store the private data at the medical center rather than at the home cluster.
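One way to picture the anonymization step is the sketch below, which drops direct identifiers from a scan’s metadata and replaces the patient ID with a keyed hash before the record leaves the medical center. This is an assumption-laden illustration, not the customer’s actual pipeline: the field names, the `anonymize_record` function, and the key handling are hypothetical. Keyed hashing is strictly speaking pseudonymization, and real compliance work typically needs more than this.

```python
import hashlib
import hmac
from typing import Dict, Iterable

# Hypothetical field names; a real scan header would have its own schema.
DIRECT_IDENTIFIERS = ("patient_name", "birth_date", "address")


def anonymize_record(record: Dict[str, str], site_key: bytes,
                     drop_fields: Iterable[str] = DIRECT_IDENTIFIERS) -> Dict[str, str]:
    """Return a copy of the record with identifying fields removed.

    Direct identifiers are dropped, and patient_id is replaced by an
    HMAC-SHA256 digest so the core cluster can still group scans by
    patient without seeing the real ID. The key stays at the edge site.
    """
    drop = set(drop_fields)
    cleaned = {k: v for k, v in record.items() if k not in drop}
    if "patient_id" in cleaned:
        digest = hmac.new(site_key, cleaned["patient_id"].encode("utf-8"),
                          hashlib.sha256)
        cleaned["patient_id"] = digest.hexdigest()
    return cleaned


# Made-up record for illustration only.
scan = {
    "patient_id": "P-1001",
    "patient_name": "Jane Doe",
    "birth_date": "1980-01-01",
    "modality": "MRI",
    "scan_ref": "scan-0001",
}
key = b"example-site-secret"  # placeholder; a real key would be managed securely
outbound = anonymize_record(scan, key)
print(sorted(outbound))
```

Because the same key always yields the same digest for a given patient ID, the core cluster can correlate scans from one patient while the mapping back to the real ID remains at the edge.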
MapR is also working with companies like Audi and Daimler on their connected-car and self-driving use cases.
They perform long test runs on new cars that collect a huge amount of data from sensors. After a 24-hour test run, they swap out disks from the car with a new set of disks, and then are able to analyze the data taken from the car. This forces delays of up to 24 hours since there is no reasonable connectivity from the car to the home base during test runs.
With MapR Edge in the trunk, they can monitor the car during the test run and respond immediately should an error occur. But more importantly, they can use the cluster to identify the most important data, which pertains to events surrounding a driving exception. These exceptions can be “critical interventions” for self-driving cars, in which the human driver had to take control of the car due to some unforeseeable condition.
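To make the idea of keeping only the data surrounding a driving exception concrete, here is a minimal sketch that filters a test run down to the samples within a time window around each event. The `select_event_windows` function, the window sizes, and the synthetic run are assumptions for illustration, not a description of how the car makers actually select data.

```python
from typing import List, Tuple

Sample = Tuple[float, dict]  # (timestamp in seconds, sensor payload)


def select_event_windows(samples: List[Sample], event_times: List[float],
                         before: float, after: float) -> List[Sample]:
    """Keep only samples within [t - before, t + after] of any event time t."""
    windows = [(t - before, t + after) for t in event_times]
    return [
        (ts, payload)
        for ts, payload in samples
        if any(start <= ts <= end for start, end in windows)
    ]


# Synthetic test run: one sample per second for 100 seconds, with
# hypothetical critical interventions at t=20 and t=70.
run = [(float(t), {"speed_kmh": 50 + t % 5}) for t in range(100)]
interventions = [20.0, 70.0]
kept = select_event_windows(run, interventions, before=5.0, after=2.0)
print(len(kept), "of", len(run), "samples kept")
```

Only the retained windows would need to be prioritized for transfer or analysis; the rest of the run can stay on the in-car cluster until the disks are swapped.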
That leads us to a reference architecture slide. Take mining as an example:
In mining, self-driving vehicles, including mine cars and ore trucks, are helping to streamline operations and reduce costs.
Using sensors to monitor the health of machinery in use, companies can shift to a condition-based maintenance model (maintaining equipment when there is an actual need through predictive analytics) rather than relying on a regular maintenance schedule or repairing equipment only when it breaks down.
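A very simple form of condition-based maintenance can be sketched as a rolling-average check on a sensor stream: raise an alert when the recent average crosses a limit, instead of waiting for a calendar date or a breakdown. The example below is a threshold sketch under assumed names and values (`maintenance_alerts`, the vibration readings, the limit); a production system would more likely use a trained predictive model.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def maintenance_alerts(readings: Iterable[float], window: int,
                       limit: float) -> Iterator[Tuple[int, float]]:
    """Yield (index, rolling_mean) whenever the mean of the last
    `window` readings exceeds `limit`."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent: Deque[float] = deque(maxlen=window)  # oldest value drops out automatically
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > limit:
                yield i, avg


# Hypothetical vibration amplitudes from a drill, in arbitrary units.
vibration = [2.0, 2.0, 2.0, 2.0, 4.0, 6.0, 8.0, 8.0]
alerts = list(maintenance_alerts(vibration, window=3, limit=5.0))
for index, avg in alerts:
    print(f"reading {index}: rolling mean {avg:.2f} exceeds limit")
```

Using a rolling mean rather than a single reading keeps one noisy spike from triggering a maintenance call.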
On the far left, we have a small edge cluster on a worksite or manufacturing site ingesting time series data coming from drills, valves, and trucks. Once the raw or aggregated data is transferred to the central data platform, it is typically enriched with the work orders, alarms, metadata, etc. coming from your traditional SAP systems or asset management systems. This central data lake can be used to provide various dashboards that let mining and process engineers not only visualize and explore data coming from various work sites, but also correlate it.
It also becomes a central data lake for your analysts and data science team to work on building predictive models, which can then be deployed on the edge.
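The enrichment step in this architecture can be pictured as a join between sensor readings and records from the asset management system. The sketch below tags each reading with the work order, if any, that was open on the same asset at that time. The record layouts, the `enrich` function, and the sample data are hypothetical and only meant to show the shape of the operation.

```python
from typing import Dict, List, Optional


def enrich(readings: List[dict], work_orders: List[dict]) -> List[dict]:
    """Attach the ID of the work order (if any) that was open on the
    same asset at the time of each reading."""
    # Index work orders by asset so each reading only scans its own asset's orders.
    orders_by_asset: Dict[str, List[dict]] = {}
    for order in work_orders:
        orders_by_asset.setdefault(order["asset_id"], []).append(order)

    enriched = []
    for r in readings:
        match: Optional[str] = None
        for order in orders_by_asset.get(r["asset_id"], []):
            if order["start"] <= r["ts"] <= order["end"]:
                match = order["order_id"]
                break
        enriched.append({**r, "work_order": match})
    return enriched


# Made-up readings and work orders for illustration.
sensor_rows = [
    {"asset_id": "drill-1", "ts": 100, "temp_c": 71.5},
    {"asset_id": "drill-1", "ts": 200, "temp_c": 95.0},
    {"asset_id": "truck-7", "ts": 150, "temp_c": 60.2},
]
orders = [{"order_id": "WO-1", "asset_id": "drill-1", "start": 180, "end": 260}]
for row in enrich(sensor_rows, orders):
    print(row)
```

Rows that carry a work order ID can later serve as labeled examples when analysts build predictive models on the central cluster.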