Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks

•

1 like•953 views

Spark Summit

Spark Summit 2016 talk by Jonathan Farland (DNV GL)

Data & Analytics

• IntroductionsWho is DNV GL?
• Overview
• Data Science
Energy Analytics
• Demonstration
Statistical
Computing Pilot
• Plans
Concepts in
Development
• DiscussionQ&A
Agenda

DNV GL
Policy Production
Transmission &
Distribution
Use

400
offices
100
countries
16,000
employees
150
years

Policy, Advisory and Research
Demand Side
Management
Energy
Analytics
Load Research
Services
Market Research
and Program
Evaluation

Electricity Distribution Grid
Transmission Distribution ConsumerGeneration Transmission Distribution ConsumerGeneration
Wind
Farms
Photo
Voltaic
Aggregated
Utility Scale
2-50 MW
Utility
Scale
100kW-2MW
Distributed
Scale
25kW-100kW
Residential
Commercial
& Industrial
DistributionTransmissionGeneration
Bulk
Storage
> 50 MW
Distribution
System
End User
Bulk
System
PhotovoltaicWind
Farms

Energy Data Science
Jonathan Farland
DNV GL Energy

Forecasting Approaches
– Similar Day Matching
– Statistically Adjusted Engineering
(SAE)
– Univariate Time Series (ARIMA)
– Multiple Linear Regression
– Econometric
– Machine / Statistical Learning
– Semiparametric Regression
– Artificial Neural Networks
– Fuzzy Logic
– Support Vector Machines
– Gradient Boosting

Additive Semiparametric Model
𝑦" = ℎ 𝑡𝑖𝑚𝑒 + 𝑓 𝑤𝑒𝑎𝑡ℎ𝑒𝑟 + 𝛼 𝑏𝑒ℎ𝑎𝑣𝑖𝑜𝑟 + 𝜀"
Short Term
Electricity
Demand
Time of Year
Prevailing
Atmosphere
Conditions
Recent Demand
Behavior

Emerging Technologies
Photovoltaic
Cells (e.g., Solar)
Electric Vehicles Storage Wind
Energy Efficiency Demand
Response

21
Load Shifting: Electric Vehicles
0
5
10
15
20
25
30
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Demand(kW)
Hour Ending
Standard Rate Electric Vehicle Rate

22
-
20,000
40,000
60,000
80,000
100,000
120,000
140,000
160,000
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Load(kWh)
Hour Ending
Forecasted - DR Reduction Forecasted - DR Baseline
Forecasted - DR Impacted Load Actual DR - Reduction
Load Reduction: Demand Response

Benefits of Big Data from Advanced Metering Infrastructure
ü A deeper understanding of demand and thereforehuman
behavior (think energy efficiency)
ü Cost effective operating costs
ü Real-timenotification of power outages
ü ImprovedSystem Planning and Reliability
ü Allows for integration of disruptive technologies like
Electric Vehicles
Statistical Computing Pilot

Pilot
Design
Data
Generating
Process
Analytics
Statistical Computing Pilot

Energy
Consumption
Climatic
- Temperature
- Humidity
- Wind Speed
- Solar
Demographic
Firmographic
Economic
Financial
Energy
Efficiency
Program
Tracking
Grid
Infrastructure
Statistical Computing Pilot
Data Diversity

Key Focus Areas
Performance Scalability
Granularity

Current Concepts in Development
Weather Normalization at Scale (e.g., California)
Real-time Energy Forecasting Using Statistical Learning and Spark Streaming API
Real-time Customer Sentiment Analysis
Grid Reliability Analysis
Cybercrime Protection of Electricity Grids

THANK YOU.
Jonathan Farland – DNV GL
jon.farland@dnvgl.com
Andrew Stryker – DNV GL
Andrew.stryker@dnvgl.com

This talk will cover the tools we used, the hurdles we faced and the work arounds we developed with the help from Databricks support in our attempt to build a custom machine learning model and use it to predict the TV ratings for different networks and demographics. The Apache Spark machine learning and dataframe APIs make it incredibly easy to produce a machine learning pipeline to solve an archetypal supervised learning problem. In our applications at Cadent, we face a challenge with high dimensional labels and relatively low dimensional features; at first pass such a problem is all but intractable but thanks to a large number of historical records and the tools available in Apache Spark, we were able to construct a multi-stage model capable of forecasting with sufficient accuracy to drive the business application. Over the course of our work we have come across many tools that made our lives easier, and others that forced work around. In this talk we will review our custom multi-stage methodology, review the challenges we faced and walk through the key steps that made our project successful.

Geospatial Analytics at Scale with Deep Learning and Apache Spark

Databricks

"Deep Learning is now the standard in object detection, but it is not easy to analyze large amounts of images, especially in an interactive fashion. Traditionally, there has been a gap between Deep Learning frameworks, which excel at image processing, and more traditional ETL and data science tools, which are usually not designed to handle huge batches of complex data types such as images. In this talk, we show how manipulating large corpora of images can be accomplished in a few lines of code because of recent developments in Apache Spark. Thanks to Spark’s unique ability to blend different libraries, we show how to start from satellite images and rapidly build complex queries on high level information such as houses or buildings. This is possible thanks to Magellan, a geospatial package, and Deep Learning Pipelines, a library that streamlines the integration of Deep Learning frameworks in Spark. At the end of this session, you will walk away with the confidence that you can solve your own image detection problems at any scale thanks to the power of Spark."

Improving Traffic Prediction Using Weather Data with Ramya Raghavendra

Spark Summit

As common sense would suggest, weather has a definite impact on traffic. But how much? And under what circumstances? Can we improve traffic (congestion) prediction given weather data? Predictive traffic is envisioned to significantly impact how driver’s plan their day by alerting users before they travel, find the best times to travel, and over time, learn from new IoT data such as road conditions, incidents, etc. This talk will cover the traffic prediction work conducted jointly by IBM and the traffic data provider. As a part of this work, we conducted a case study over five large metropolitans in the US, 2.58 billion traffic records and 262 million weather records, to quantify the boost in accuracy of traffic prediction using weather data. We will provide an overview of our lambda architecture with Apache Spark being used to build prediction models with weather and traffic data, and Spark Streaming used to score the model and provide real-time traffic predictions. This talk will also cover a suite of extensions to Spark to analyze geospatial and temporal patterns in traffic and weather data, as well as the suite of machine learning algorithms that were used with Spark framework. Initial results of this work were presented at the National Association of Broadcasters meeting in Las Vegas in April 2017, and there is work to scale the system to provide predictions in over a 100 cities. Audience will learn about our experience scaling using Spark in offline and streaming mode, building statistical and deep-learning pipelines with Spark, and techniques to work with geospatial and time-series data.

Spark Summit EU talk by Javier Aguedes

Spark Summit

How Spark Enables the Internet of Things: Efficient Integration of Multiple ...

sparktc

High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...

Spark Summit

As advanced sensor technologies are becoming widely deployed in the energy industry, the availability of higher-frequency data results in both analytical benefits and computational costs. To an energy forecaster or data scientist, some of these benefits might include enhanced predictive performance from forecasting models as well as improved pattern recognition in energy consumption across building types, economic sectors, and geographies. To a utility or electricity service provider, these benefits might include significantly deeper insights into their diverse customer base. However, these advantages can come with a high computational price tag. With Spark 2.0, User-Defined Functions can be applied across grouped SparkDataFrames in the SparkR API to solve the multivariate optimization and model selection problems typically required for fitting site-level models. This recently added feature of Spark 2.0 on Databricks has allowed DNV GL to efficiently fit predictive models that relate weather, electricity, water, and gas consumption across virtually any number of buildings.

Digital Attribution Modeling Using Apache Spark-(Anny Chen and William Yan, A...

Spark Summit

Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow

Databricks

The data science lifecycle consists of multiple iterative steps: data collection, data cleaning/exploration, feature engineering, model training, model deployment and scoring among others. The process is often tedious and error-prone and requires considerable human effort. Apart from these challenges, when it comes to leveraging ML in enterprise applications, especially in regulated environments, the level of scrutiny for data handling, model fairness, user privacy, and debuggability is very high. In this talk, we present the basic features of Flock, an end-to-end platform that facilitates adoption of ML in enterprise applications. We refer to this new class of applications as Enterprise Grade Machine Learning (EGML). Flock leverages MLflow to simplify and automate some of the steps involved in supporting EGML applications, allowing data scientists to spend most of their time on improving their ML models. Flock makes use of MLflow for model and experiment tracking but extends and complements it by providing automatic logging, deeper integration with relational databases that often store confidential data, model optimizations and support for the ONNX model format and the ONNX Runtime for inference. We will also present our ongoing work on automatically tracking lineage between data and ML models which is crucial in regulated environments. We will showcase Flock’s features through a demo using Microsoft’s Azure Data Studio and MLflow.

Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa

Spark Summit

Realtime streaming architecture in INFINARIO

Jozo Kovac

Dev Ops Training

Spark Summit

Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...

Databricks

In this session, you will learn how CERN easily applied end-to-end deep learning and analytics pipelines on Apache Spark at scale for High Energy Physics using BigDL and Analytics Zoo open source software running on Intel Xeon-based distributed clusters. Technical details and development learnings will be shared using an example of topology classification to improve real-time event selection at the Large Hadron Collider experiments. The classifier has demonstrated very good performance figures for efficiency, while also reducing the false positive rate compared to the existing methods. It could be used as a filter to improve the online event selection infrastructure of the LHC experiments, where one could benefit from a more flexible and inclusive selection strategy while reducing the amount of downstream resources wasted in processing false positives. This is part of CERN’s research on applying Deep Learning and Analytics using open source and industry standard technologies as an alternative to the existing customized rule based methods. We show how we could quickly build and implement distributed deep learning solutions and data pipelines at scale on Apache Spark using Analytics Zoo and BigDL, which are open source frameworks unifying Analytics and AI on Spark with easy to use APIs and development interfaces seamlessly integrated with Big Data Platforms.

Visualizing AutoTrader Traffic in Near Real-Time with Spark Streaming-(Jon Gr...

Spark Summit

Spark Summit EU talk by Sameer Agarwal

Spark Summit

Headaches and Breakthroughs in Building Continuous Applications

Databricks

At SpotX, we have built and maintained a portfolio of Spark Streaming applications -- all of which process records in the millions per minute. From pure data ingestion, to ETL, to real-time reporting, to live customer-facing products and features, continuous applications are in our DNA. Come along with us as we outline our journey from square one to present in the world of Spark Streaming. We'll detail what we've learned about efficient processing and monitoring, reliability and stability, and long term support of a streaming app. Come learn from our mistakes, and leave with some handy settings and designs you can implement in your own streaming apps.

Apache Spark At Scale in the Cloud

Databricks

Using Apache Spark to analyze large datasets in the cloud presents a range of challenges. Different stages of your pipeline may be constrained by CPU, memory, disk and/or network IO. But what if all those stages have to run on the same cluster? In the cloud, you have limited control over the hardware your cluster runs on. You may have even less control over the size and format of your raw input files. Performance tuning is an iterative and experimental process. It’s frustrating with very large datasets: what worked great with 30 billion rows may not work at all with 400 billion rows. But with strategic optimizations and compromises, 50+ TiB datasets can be no big deal. By using Spark UI and simple metrics, explore how to diagnose and remedy issues on jobs: Sizing the cluster based on your dataset (shuffle partitions) Ingestion challenges – well begun is half done (globbing S3, small files) Managing memory (sorting GC – when to go parallel, when to go G1, when offheap can help you) Shuffle (give a little to get a lot – configs for better out of box shuffle) – Spill (partitioning for the win) Scheduling (FAIR vs FIFO, is there a difference for your pipeline?) Caching and persistence (it’s the cost of doing business, so what are your options?) Fault tolerance (blacklisting, speculation, task reaping) Making the best of a bad deal (skew joins, windowing, UDFs, very large query plans) Writing to S3 (dealing with write partitions, HDFS and s3DistCp vs writing directly to S3)

Spark Autotuning: Spark Summit East talk by Lawrence Spracklen

Spark Summit

While the performance delivered by Spark has enabled data scientists to undertake sophisticated analyses on big and complex data in actionable timeframes, too often, the process of manually configuring the underlying Spark jobs (including the number and size of the executors) can be a significant and time consuming undertaking. Not only it does this configuration process typically rely heavily on repeated trial-and-error, it necessitates that data scientists have a low-level understanding of Spark and detailed cluster sizing information. At Alpine Data we have been working to eliminate this requirement, and develop algorithms that can be used to automatically tune Spark jobs with minimal user involvement, In this presentation, we discuss the algorithms we have developed and illustrate how they leverage information about the size of the data being analyzed, the analytical operations being used in the flow, the cluster size, configuration and real-time utilization, to automatically determine the optimal Spark job configuration for peak performance.

London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...

DataStax Academy

Distributed Data Processing using Spark by Panos Labropoulos_and Sarod Yataw...

Spark Summit

What's hot

Spark Streaming and IoT by Mike Freedman

Spark Summit

How Spark Enables the Internet of Things- Paula Ta-Shma

Spark Summit

Evolving Beyond the Data Lake: A Story of Wind and Rain

MapR Technologies

Apache Spark for Machine Learning with High Dimensional Labels: Spark Summit ...

Spark Summit

Geospatial Analytics at Scale with Deep Learning and Apache Spark

Databricks

Improving Traffic Prediction Using Weather Data with Ramya Raghavendra

Spark Summit

Spark Summit EU talk by Javier Aguedes

Spark Summit

How Spark Enables the Internet of Things: Efficient Integration of Multiple ...

sparktc

High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...

Spark Summit

Digital Attribution Modeling Using Apache Spark-(Anny Chen and William Yan, A...

Spark Summit

Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow

Databricks

Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa

Spark Summit

Realtime streaming architecture in INFINARIO

Jozo Kovac

Dev Ops Training

Spark Summit

Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...

Databricks

Visualizing AutoTrader Traffic in Near Real-Time with Spark Streaming-(Jon Gr...

Spark Summit

Spark Summit EU talk by Sameer Agarwal

Spark Summit

Headaches and Breakthroughs in Building Continuous Applications

Databricks

Apache Spark At Scale in the Cloud

Databricks

Spark Autotuning: Spark Summit East talk by Lawrence Spracklen

Spark Summit

What's hot (20)

Spark Streaming and IoT by Mike Freedman

How Spark Enables the Internet of Things- Paula Ta-Shma

Evolving Beyond the Data Lake: A Story of Wind and Rain

Apache Spark for Machine Learning with High Dimensional Labels: Spark Summit ...

Geospatial Analytics at Scale with Deep Learning and Apache Spark

Improving Traffic Prediction Using Weather Data with Ramya Raghavendra

Spark Summit EU talk by Javier Aguedes

How Spark Enables the Internet of Things: Efficient Integration of Multiple ...

High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...

Digital Attribution Modeling Using Apache Spark-(Anny Chen and William Yan, A...

Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow

Real-Time Anomoly Detection with Spark MLib, Akka and Cassandra by Natalino Busa

Realtime streaming architecture in INFINARIO

Dev Ops Training

Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...

Visualizing AutoTrader Traffic in Near Real-Time with Spark Streaming-(Jon Gr...

Spark Summit EU talk by Sameer Agarwal

Headaches and Breakthroughs in Building Continuous Applications

Apache Spark At Scale in the Cloud

Spark Autotuning: Spark Summit East talk by Lawrence Spracklen

Viewers also liked

London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...

DataStax Academy

Distributed Data Processing using Spark by Panos Labropoulos_and Sarod Yataw...

Spark Summit

A Scaleable Implemenation of Deep Leaning on Spark- Alexander Ulanov

Spark Summit

Spark and Cassandra: An Amazing Apache Love Story by Patrick McFadin

Spark Summit

Netflix branding stumbles

Mayur Verma

Distributed Heterogeneous Mixture Learning On Spark

Spark Summit

Natural Sparksmanship – The Art of Making an Analytics Enterprise Cross the C...

Spark Summit

Sparkling Random Ferns by P Dendek and M Fedoryszak

Spark Summit

Data Science at Scale by Sarah Guido

Spark Summit

Netflix in France

Selenia Furnari

Leveraging Docker and CoreOS to provide always available Cassandra at Instacl...

DataStax

With a growing customer base and Cassandra clusters running on-top of a number of the world’s largest cloud and bare-metal hosting providers, Instaclustr is at the forefront of always-on Cassandra hosting. Instaclustr leverages the power of Docker, a modern containerization solution for Linux, and CoreOS, a lightweight Linux distribution tailored to running software inside containers, to build a stable and adaptable Cassandra hosting platform.

Distributed Heterogeneous Mixture Learning On Spark

Spark Summit

Sparking Science up with Research Recommendations by Maya Hristakeva

Spark Summit

ベンダーロックインフリーのビジネスクラウドの世界

ミランティスジャパン株式会社

Inside Apache SystemML by Frederick Reiss

Spark Summit

Student Presentation Sample (Netflix) -- Information Security 365/765 -- UW-M...

Nicholas Davis

The final assignment in the Information Security 365/765 course I teach at UW-Madison, is for teams of students to put together company focused IT security presentations, in which they take the concepts learned in class throughout the entire semester, and apply them to a real company. Here is a sample from Team Netflix! I am proud of the students, and feel that they have gained a solid foundation in the field of information security. Another semester come and gone!

Shifting Data Science into High Gear

Spark Summit

PowerStream: Propelling Energy Innovation with Predictive Analytics

Spark Summit

Netflix - Book de Campanha

Rafael Brandani

Highlights and Challenges from Running Spark on Mesos in Production by Morri ...

Spark Summit

Viewers also liked (20)

London Cassandra Meetup 10/23: Apache Cassandra at British Gas Connected Home...

Distributed Data Processing using Spark by Panos Labropoulos_and Sarod Yataw...

A Scaleable Implemenation of Deep Leaning on Spark- Alexander Ulanov

Spark and Cassandra: An Amazing Apache Love Story by Patrick McFadin

Netflix branding stumbles

Distributed Heterogeneous Mixture Learning On Spark

Natural Sparksmanship – The Art of Making an Analytics Enterprise Cross the C...

Sparkling Random Ferns by P Dendek and M Fedoryszak

Data Science at Scale by Sarah Guido

Netflix in France

Leveraging Docker and CoreOS to provide always available Cassandra at Instacl...

Distributed Heterogeneous Mixture Learning On Spark

Sparking Science up with Research Recommendations by Maya Hristakeva

ベンダーロックインフリーのビジネスクラウドの世界

Inside Apache SystemML by Frederick Reiss

Student Presentation Sample (Netflix) -- Information Security 365/765 -- UW-M...

Shifting Data Science into High Gear

PowerStream: Propelling Energy Innovation with Predictive Analytics

Netflix - Book de Campanha

Highlights and Challenges from Running Spark on Mesos in Production by Morri ...

Similar to Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks

Aia tata power_26112010 r1Ajinkya Dalvi

Increasing Role of Technology in Power Distribution: Moving towards Smarter Grid

Tata Power Delhi Distribution Limited

G.E.T. Smart - Smart Grid: IBM PresentationWashington Technology Industry Association

Smart Energy Systems of FutureSchneider Electric India

1.1_Power Systems Engineering R&D_Ton_EPRI/SNL Symposium

Sandia National Laboratories: Energy & Climate: Renewables

The Smart Power Grid

Stephen Lee

SEI Smart City Offers Catalogue

Schneider Electric India

Digital Grid Technologies for Smooth Integration of Renewable Energy Resources

Moustafa Shahin

A Vision for a Holistic and Smart Grid with High Benefits to Society

Stephen Lee

TPDDL Smart Grid Journey

Tata Power Delhi Distribution Limited

Electric Power Industry In Transition

HIMADRI BANERJI

ARC's Larry O'Brien Process Automation Presentation @ ARC Industry Forum 2010

ARC Advisory Group

ARC's Larry O'Brien Process Automation Presentation @ ARC Industry Forum 2010 in Orlando, FL. Using Process Automation to Optimize Energy Consumption 􀂍 The Cost of Energy 􀂍 How Well is Energy Managed in Today’s Plants? 􀂍 Using Your Process Automation Infrastructure with an Eye Toward Optimizing Energy Consumption 􀂍 The Business Value of Integrated Power & Automation 􀂍 Enabling Technologies 􀂍 Training Your People and Managing Knowledge 􀂍 Moving Forward

IEEE Presentation SDG&EBert Taube

Smart grid iit-jodhpur-apr10tec2

Optimizing Generation, Distribution, Renewables, and Demand Response for a Sm...John Dirkman, PE

The merits of integrating renewables with smarter grid carimet

Rick Case, PMP, P.E.

e-Research & the art of linking Astrophysics to Deforestation

David Wallom

SCE smart grid development, Paul De Martini, SCEUniversity of Southern California

Advanced utility data management and analytics for improved situational awar...

Power System Operation

Grid Software solutions smart grids

alexa842003

Similar to Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks (20)

Aia tata power_26112010 r1

Increasing Role of Technology in Power Distribution: Moving towards Smarter Grid

G.E.T. Smart - Smart Grid: IBM Presentation

Smart Energy Systems of Future

1.1_Power Systems Engineering R&D_Ton_EPRI/SNL Symposium

The Smart Power Grid

SEI Smart City Offers Catalogue

Digital Grid Technologies for Smooth Integration of Renewable Energy Resources

A Vision for a Holistic and Smart Grid with High Benefits to Society

TPDDL Smart Grid Journey

Electric Power Industry In Transition

ARC's Larry O'Brien Process Automation Presentation @ ARC Industry Forum 2010

IEEE Presentation SDG&E

Smart grid iit-jodhpur-apr10

Optimizing Generation, Distribution, Renewables, and Demand Response for a Sm...

The merits of integrating renewables with smarter grid carimet

e-Research & the art of linking Astrophysics to Deforestation

SCE smart grid development, Paul De Martini, SCE

Advanced utility data management and analytics for improved situational awar...

Grid Software solutions smart grids

More from Spark Summit

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang

Spark Summit

In this session we will present a Configurable FPGA-Based Spark SQL Acceleration Architecture. It is target to leverage FPGA highly parallel computing capability to accelerate Spark SQL Query and for FPGA’s higher power efficiency than CPU we can lower the power consumption at the same time. The Architecture consists of SQL query decomposition algorithms, fine-grained FPGA based Engine Units which perform basic computation of sub string, arithmetic and logic operations. Using SQL query decomposition algorithm, we are able to decompose a complex SQL query into basic operations and according to their patterns each is fed into an Engine Unit. SQL Engine Units are highly configurable and can be chained together to perform complex Spark SQL queries, finally one SQL query is transformed into a Hardware Pipeline. We will present the performance benchmark results comparing the queries with FGPA-Based Spark SQL Acceleration Architecture on XEON E5 and FPGA to the ones with Spark SQL Query on XEON E5 with 10X ~ 100X improvement and we will demonstrate one SQL query workload from a real customer.

VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...

Spark Summit

In this talk, we’ll present techniques for visualizing large scale machine learning systems in Spark. These are techniques that are employed by Netflix to understand and refine the machine learning models behind Netflix’s famous recommender systems that are used to personalize the Netflix experience for their 99 millions members around the world. Essential to these techniques is Vegas, a new OSS Scala library that aims to be the “missing MatPlotLib” for Spark/Scala. We’ll talk about the design of Vegas and its usage in Scala notebooks to visualize Machine Learning Models.

Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu

Spark Summit

This presentation introduces how we design and implement a real-time processing platform using latest Spark Structured Streaming framework to intelligently transform the production lines in the manufacturing industry. In the traditional production line there are a variety of isolated structured, semi-structured and unstructured data, such as sensor data, machine screen output, log output, database records etc. There are two main data scenarios: 1) Picture and video data with low frequency but a large amount; 2) Continuous data with high frequency. They are not a large amount of data per unit. However the total amount of them is very large, such as vibration data used to detect the quality of the equipment. These data have the characteristics of streaming data: real-time, volatile, burst, disorder and infinity. Making effective real-time decisions to retrieve values from these data is critical to smart manufacturing. The latest Spark Structured Streaming framework greatly lowers the bar for building highly scalable and fault-tolerant streaming applications. Thanks to the Spark we are able to build a low-latency, high-throughput and reliable operation system involving data acquisition, transmission, analysis and storage. The actual user case proved that the system meets the needs of real-time decision-making. The system greatly enhance the production process of predictive fault repair and production line material tracking efficiency, and can reduce about half of the labor force for the production lines.

A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...

Spark Summit

Graph is on the rise and it’s time to start learning about scalable graph analytics! In this session we will go over two Spark-based Graph Analytics frameworks: Tinkerpop and GraphFrames. While both frameworks can express very similar traversals, they have different performance characteristics and APIs. In this Deep-Dive by example presentation, we will demonstrate some common traversals and explain how, at a Spark level, each traversal is actually computed under the hood! Learn both the fluent Gremlin API as well as the powerful GraphFrame Motif api as we show examples of both simultaneously. No need to be familiar with Graphs or Spark for this presentation as we’ll be explaining everything from the ground up!

No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...

Spark Summit

Building accurate machine learning models has been an art of data scientists, i.e., algorithm selection, hyper parameter tuning, feature selection and so on. Recently, challenges to breakthrough this “black-arts” have got started. In cooperation with our partner, NEC Laboratories America, we have developed a Spark-based automatic predictive modeling system. The system automatically searches the best algorithm, parameters and features without any manual work. In this talk, we will share how the automation system is designed to exploit attractive advantages of Spark. The evaluation with real open data demonstrates that our system can explore hundreds of predictive models and discovers the most accurate ones in minutes on a Ultra High Density Server, which employs 272 CPU cores, 2TB memory and 17TB SSD in 3U chassis. We will also share open challenges to learn such a massive amount of models on Spark, particularly from reliability and stability standpoints. This talk will cover the presentation already shown on Spark Summit SF’17 (#SFds5) but from more technical perspective.

Apache Spark and Tensorflow as a Service with Jim Dowling

Spark Summit

In Sweden, from the Rise ICE Data Center at www.hops.site, we are providing to reseachers both Spark-as-a-Service and, more recently, Tensorflow-as-a-Service as part of the Hops platform. In this talk, we examine the different ways in which Tensorflow can be included in Spark workflows, from batch to streaming to structured streaming applications. We will analyse the different frameworks for integrating Spark with Tensorflow, from Tensorframes to TensorflowOnSpark to Databrick’s Deep Learning Pipelines. We introduce the different programming models supported and highlight the importance of cluster support for managing different versions of python libraries on behalf of users. We will also present cluster management support for sharing GPUs, including Mesos and YARN (in Hops Hadoop). Finally, we will perform a live demonstration of training and inference for a TensorflowOnSpark application written on Jupyter that can read data from either HDFS or Kafka, transform the data in Spark, and train a deep neural network on Tensorflow. We will show how to debug the application using both Spark UI and Tensorboard, and how to examine logs and monitor training.

Apache Spark and Tensorflow as a Service with Jim Dowling

Spark Summit

MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...

Spark Summit

With the rapid growth of available datasets, it is imperative to have good tools for extracting insight from big data. The Spark ML library has excellent support for performing at-scale data processing and machine learning experiments, but more often than not, Data Scientists find themselves struggling with issues such as: low level data manipulation, lack of support for image processing, text analytics and deep learning, as well as the inability to use Spark alongside other popular machine learning libraries. To address these pain points, Microsoft recently released The Microsoft Machine Learning Library for Apache Spark (MMLSpark), an open-source machine learning library built on top of SparkML that seeks to simplify the data science process and integrate SparkML Pipelines with deep learning and computer vision libraries such as the Microsoft Cognitive Toolkit (CNTK) and OpenCV. With MMLSpark, Data Scientists can build models with 1/10th of the code through Pipeline objects that compose seamlessly with other parts of the SparkML ecosystem. In this session, we explore some of the main lessons learned from building MMLSpark. Join us if you would like to know how to extend Pipelines to ensure seamless integration with SparkML, how to auto-generate Python and R wrappers from Scala Transformers and Estimators, how to integrate and use previously non-distributed libraries in a distributed manner and how to efficiently deploy a Spark library across multiple platforms.

Next CERN Accelerator Logging Service with Jakub Wozniak

Spark Summit

The Next Accelerator Logging Service (NXCALS) is a new Big Data project at CERN aiming to replace the existing Oracle-based service. The main purpose of the system is to store and present Controls/Infrastructure related data gathered from thousands of devices in the whole accelerator complex. The data is used to operate the machines, improve their performance and conduct studies for new beam types or future experiments. During this talk, Jakub will speak about NXCALS requirements and design choices that lead to the selected architecture based on Hadoop and Spark. He will present the Ingestion API, the abstractions behind the Meta-data Service and the Spark-based Extraction API where simple changes to the schema handling greatly improved the overall usability of the system. The system itself is not CERN specific and can be of interest to other companies or institutes confronted with similar Big Data problems.

Powering a Startup with Apache Spark with Kevin Kim

Spark Summit

In Between (A mobile App for couples, downloaded 20M in Global), from daily batch for extracting metrics, analysis and dashboard. Spark is widely used by engineers and data analysts in Between, thanks to the performance and expendability of Spark, data operating has become extremely efficient. Entire team including Biz Dev, Global Operation, Designers are enjoying data results so Spark is empowering entire company for data driven operation and thinking. Kevin, Co-founder and Data Team leader of Between will be presenting how things are going in Between. Listeners will know how small and agile team is living with data (how we build organization, culture and technical base) after this presentation.

Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra

Spark Summit

Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...

Spark Summit

In many cases, Big Data becomes just another buzzword because of the lack of tools that can support both the technological requirements for developing and deploying of the projects and/or the fluency of communication between the different profiles of people involved in the projects. In this talk, we will present Moriarty, a set of tools for fast prototyping of Big Data applications that can be deployed in an Apache Spark environment. These tools support the creation of Big Data workflows using the already existing functional blocks or supporting the creation of new functional blocks. The created workflow can then be deployed in a Spark infrastructure and used through a REST API. For better understanding of Moriarty, the prototyping process and the way it hides the Spark environment to the Big Data users and developers, we will present it together with a couple of examples based on a Industry 4.0 success cases and other on a logistic success case.

How Nielsen Utilized Databricks for Large-Scale Research and Development with...

Spark Summit

Large-scale testing of new data products or enhancements to existing products in a research and development environment can be a technical challenge for data scientists. In some cases, tools available to data scientists lack production-level capacity, whereas other tools do not provide the algorithms needed to run the methodology. At Nielsen, the Databricks platform provided a solution to both of these challenges. This breakout session will cover a specific Nielsen business case where two methodology enhancements were developed and tested at large-scale using the Databricks platform. Development and large-scale testing of these enhancements would not have been possible using standard database tools.

Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...

Spark Summit

Goal Based Data Production with Sim Simeonov

Spark Summit

Since the invention of SQL and relational databases, data production has been about specifying how data is transformed through queries. While Apache Spark can certainly be used as a general distributed query engine, the power and granularity of Spark’s APIs enables a revolutionary increase in data engineering productivity: goal-based data production. Goal-based data production concerns itself with specifying WHAT the desired result is, leaving the details of HOW the result is achieved to a smart data warehouse running on top of Spark. That not only substantially increases productivity, but also significantly expands the audience that can work directly with Spark: from developers and data scientists to technical business users. With specific data and architecture patterns spanning the range from ETL to machine learning data prep and with live demos, this session will demonstrate how Spark users can gain the benefits of goal-based data production.

Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...

Spark Summit

Have you imagined a simple machine learning solution able to prevent revenue leakage and monitor your distributed application? To answer this question, we offer a practical and a simple machine learning solution to create an intelligent monitoring application based on simple data analysis using Apache Spark MLlib. Our application uses linear regression models to make predictions and check if the platform is experiencing any operational problems that can impact in revenue losses. The application monitor distributed systems and provides notifications stating the problem detected, that way users can operate quickly to avoid serious problems which directly impact the company’s revenue and reduce the time for action. We will present an architecture for not only a monitoring system, but also an active actor for our outages recoveries. At the end of the presentation you will have access to our training program source code and you will be able to adapt and implement in your company. This solution already helped to prevent about US$3mi in losses last year.

Getting Ready to Use Redis with Apache Spark with Dvir Volk

Spark Summit

Getting Ready to use Redis with Apache Spark is a technical tutorial designed to address integrating Redis with an Apache Spark deployment to increase the performance of serving complex decision models. To set the context for the session, we start with a quick introduction to Redis and the capabilities Redis provides. We cover the basic data types provided by Redis and cover the module system. Using an ad serving use-case, we look at how Redis can improve the performance and reduce the cost of using complex ML-models in production. Attendees will be guided through the key steps of setting up and integrating Redis with Spark, including how to train a model using Spark then load and serve it using Redis, as well as how to work with the Spark Redis module. The capabilities of the Redis Machine Learning Module (redis-ml) will be discussed focusing primarily on decision trees and regression (linear and logistic) with code examples to demonstrate how to use these feature. At the end of the session, developers should feel confident building a prototype/proof-of-concept application using Redis and Spark. Attendees will understand how Redis complements Spark and how to use Redis to serve complex, ML-models with high performance.

Deduplication and Author-Disambiguation of Streaming Records via Supervised M...

Spark Summit

Here we present a general supervised framework for record deduplication and author-disambiguation via Spark. This work differentiates itself by – Application of Databricks and AWS makes this a scalable implementation. Compute resources are comparably lower than traditional legacy technology using big boxes 24/7. Scalability is crucial as Elsevier’s Scopus data, the biggest scientific abstract repository, covers roughly 250 million authorships from 70 million abstracts covering a few hundred years. – We create a fingerprint for each content by deep learning and/or word2vec algorithms to expedite pairwise similarity calculation. These encoders substantially reduce compute time while maintaining semantic similarity (unlike traditional TFIDF or predefined taxonomies). We will briefly discuss how to optimize word2vec training with high parallelization. Moreover, we show how these encoders can be used to derive a standard representation for all our entities namely such as documents, authors, users, journals, etc. This standard representation can simplify the recommendation problem into a pairwise similarity search and hence it can offer a basic recommender for cross-product applications where we may not have a dedicate recommender engine designed. – Traditional author-disambiguation or record deduplication algorithms are batch-processing with small to no training data. However, we have roughly 25 million authorships that are manually curated or corrected upon user feedback. Hence, it is crucial to maintain historical profiles and hence we have developed a machine learning implementation to deal with data streams and process them in mini batches or one document at a time. We will discuss how to measure the accuracy of such a system, how to tune it and how to process the raw data of pairwise similarity function into final clusters. Lessons learned from this talk can help all sort of companies where they want to integrate their data or deduplicate their user/customer/product databases.

MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...

Spark Summit

The use of large-scale machine learning and data mining methods is becoming ubiquitous in many application domains ranging from business intelligence and bioinformatics to self-driving cars. These methods heavily rely on matrix computations, and it is hence critical to make these computations scalable and efficient. These matrix computations are often complex and involve multiple steps that need to be optimized and sequenced properly for efficient execution. This work presents new efficient and scalable matrix processing and optimization techniques based on Spark. The proposed techniques estimate the sparsity of intermediate matrix-computation results and optimize communication costs. An evaluation plan generator for complex matrix computations is introduced as well as a distributed plan optimizer that exploits dynamic cost-based analysis and rule-based heuristics The result of a matrix operation will often serve as an input to another matrix operation, thus defining the matrix data dependencies within a matrix program. The matrix query plan generator produces query execution plans that minimize memory usage and communication overhead by partitioning the matrix based on the data dependencies in the execution plan. We implemented the proposed matrix techniques inside the Spark SQL, and optimize the matrix execution plan based on Spark SQL Catalyst. We conduct case studies on a series of ML models and matrix computations with special features on different datasets. These are PageRank, GNMF, BFGS, sparse matrix chain multiplications, and a biological data analysis. The open-source library ScaLAPACK and the array-based database SciDB are used for performance evaluation. Our experiments are performed on six real-world datasets are: social network data ( e.g., soc-pokec, cit-Patents, LiveJournal), Twitter2010, Netflix recommendation data, and 1000 Genomes Project sample. Experiments demonstrate that our proposed techniques achieve up to an order-of-magnitude performance.

Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...

Spark Summit

Kapil Malik and Arvind Heda will discuss a solution for interactive querying of large scale structured data, stored in a distributed file system (HDFS / S3), in a scalable and reliable manner using a unique combination of Spark SQL, Apache Zeppelin and Spark Job-server (SJS) on Yarn. The solution is production tested and can cater to thousands of queries processing terabytes of data every day. It contains following components – 1. Zeppelin server : A custom interpreter is deployed, which de-couples spark context from the user notebooks. It connects to the remote spark context on Spark Job-server. A rich set of APIs are exposed for the users. The user input is parsed, validated and executed remotely on SJS. 2. Spark job-server : A custom application is deployed, which implements the set of APIs exposed on Zeppelin custom interpreter, as one or more spark jobs. 3. Context router : It routes different user queries from custom interpreter to one of many Spark Job-servers / contexts. The solution has following characteristics – * Multi-tenancy There are hundreds of users, each having one or more Zeppelin notebooks. All these notebooks connect to same set of Spark contexts for running a job. * Fault tolerance The notebooks do not use Spark interpreter, but a custom interpreter, connecting to a remote context. If one spark context fails, the context router sends user queries to another context. * Load balancing Context router identifies which contexts are under heavy load / responding slowly, and selects the most optimal context for serving a user query. * Efficiency We use Alluxio for caching common datasets. * Elastic resource usage We use spark dynamic allocation for the contexts. This ensures that cluster resources are blocked by this application only when it’s doing some actual work.

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang

VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...

Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu

A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...

No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...

Apache Spark and Tensorflow as a Service with Jim Dowling

MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...

Next CERN Accelerator Logging Service with Jakub Wozniak

Powering a Startup with Apache Spark with Kevin Kim

Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra

Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...

How Nielsen Utilized Databricks for Large-Scale Research and Development with...

Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...

Goal Based Data Production with Sim Simeonov

Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...

Getting Ready to Use Redis with Apache Spark with Dvir Volk

Deduplication and Author-Disambiguation of Streaming Records via Supervised M...

MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...

Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...

Recently uploaded

Best best suvichar in gujarati english meaning of this sentence as Silk road ...

AbhimanyuSinha9

一比一原版(UofS毕业证书)萨省大学毕业证如何办理

v3tuleee

原版定制【微信:41543339】【(UofS毕业证书)萨省大学毕业证】【微信:41543339】成绩单、外壳、offer、留信学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。【我们承诺采用的是学校原版纸张（纸质、底色、纹路），我们拥有全套进口原装设备，特殊工艺都是采用不同机器制作，仿真度基本可以达到100%，所有工艺效果都可提前给客户展示，不满意可以根据客户要求进行调整，直到满意为止！】【业务选择办理准则】一、工作未确定，回国需先给父母、亲戚朋友看下文凭的情况，办理一份就读学校的毕业证【微信41543339】文凭即可二、回国进私企、外企、自己做生意的情况，这些单位是不查询毕业证真伪的，而且国内没有渠道去查询国外文凭的真假，也不需要提供真实教育部认证。鉴于此，办理一份毕业证【微信41543339】即可三、进国企，银行，事业单位，考公务员等等，这些单位是必需要提供真实教育部认证的，办理教育部认证所需资料众多且烦琐，所有材料您都必须提供原件，我们凭借丰富的经验，快捷的绿色通道帮您快速整合材料，让您少走弯路。留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才留信网服务项目： 1、留学生专业人才库服务（留信分析） 2、国（境）学习人员提供就业推荐信服务 3、留学人员区块链存储服务 → 【关于价格问题（保证一手价格）】我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。选择实体注册公司办理，更放心，更安全！我们的承诺：客户在留信官方认证查询网站查询到认证通过结果后付款，不成功不收费！

Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation

Boston Institute of Analytics

Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/

一比一原版(CBU毕业证)卡普顿大学毕业证如何办理

ahzuo

CBU毕业证offer【微信95270640】《卡普顿大学毕业证书》《QQ微信95270640》学位证书电子版：在线制作卡普顿大学毕业证成绩单GPA修改（制作CBU毕业证成绩单CBU文凭证书样本）、卡普顿大学毕业证书与成绩单样本图片、《CBU学历证书学位证书》、卡普顿大学毕业证案例毕业证书制作軟體、在线制作加拿大硕士学历证书真实可查. 如果您是以下情况，我们都能竭诚为您解决实际问题：【公司采用定金+余款的付款流程，以最大化保障您的利益，让您放心无忧】 1、在校期间，因各种原因未能顺利毕业，拿不到官方毕业证+微信95270640 2、面对父母的压力，希望尽快拿到卡普顿大学卡普顿大学毕业证成绩单； 3、不清楚流程以及材料该如何准备卡普顿大学卡普顿大学毕业证成绩单； 4、回国时间很长，忘记办理； 5、回国马上就要找工作，办给用人单位看； 6、企事业单位必须要求办理的；面向美国乔治城大学毕业留学生提供以下服务: 【★卡普顿大学卡普顿大学毕业证成绩单毕业证、成绩单等全套材料，从防伪到印刷，从水印到钢印烫金，与学校100%相同】【★真实使馆认证（留学人员回国证明），使馆存档可通过大使馆查询确认】【★真实教育部认证，教育部存档，教育部留服网站可查】【★真实留信认证，留信网入库存档，可查卡普顿大学卡普顿大学毕业证成绩单】我们从事工作十余年的有着丰富经验的业务顾问，熟悉海外各国大学的学制及教育体系，并且以挂科生解决毕业材料不全问题为基础，为客户量身定制1对1方案，未能毕业的回国留学生成功搭建回国顺利发展所需的桥梁。我们一直努力以高品质的教育为起点，以诚信、专业、高效、创新作为一切的行动宗旨，始终把“诚信为主、质量为本、客户第一”作为我们全部工作的出发点和归宿点。同时为海内外留学生提供大学毕业证购买、补办成绩单及各类分数修改等服务；归国认证方面，提供《留信网入库》申请、《国外学历学位认证》申请以及真实学籍办理等服务，帮助众多莘莘学子实现了一个又一个梦想。专业服务，请勿犹豫联系我如果您真实毕业回国，对于学历认证无从下手，请联系我，我们免费帮您递交诚招代理：本公司诚聘当地代理人员，如果你有业余时间，或者你有同学朋友需要，有兴趣就请联系我你赢我赢，共创双赢你做代理，可以帮助卡普顿大学同学朋友你做代理，可以拯救卡普顿大学失足青年你做代理，可以挽救卡普顿大学一个个人才你做代理，你将是别人人生卡普顿大学的转折点你做代理，可以改变自己，改变他人，给他人和自己一个机会道银边山娃摸索着扯了扯灯绳小屋顿时一片刺眼的亮瞅瞅床头的诺基亚山娃苦笑着摇了摇头连他自己都感到奇怪居然又睡到上午点半掐指算算随父亲进城已一个多星期了山娃几乎天天起得这么迟在乡下老家暑假五点多山娃就醒来在爷爷奶奶嘁嘁喳喳的忙碌声中一骨碌爬起把牛驱到后龙山再从莲塘里采回一蛇皮袋湿漉漉的莲蓬也才点多点半早就吃过早餐玩耍去了山娃的家在闽西山区依山傍水山清水秀门前潺潺流淌的蜿蜒小溪一直都是山娃和小伙伴们盛试

Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...

pchutichetpong

M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years. Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success. MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies. According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”

standardisation of garbhpala offhgfffghh

ArpitMalhotra16

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理

mbawufebxi

原版定制【微信:41543339】【(Bradford毕业证书)布拉德福德大学毕业证】【微信:41543339】成绩单、外壳、offer、留信学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。【我们承诺采用的是学校原版纸张（纸质、底色、纹路），我们拥有全套进口原装设备，特殊工艺都是采用不同机器制作，仿真度基本可以达到100%，所有工艺效果都可提前给客户展示，不满意可以根据客户要求进行调整，直到满意为止！】【业务选择办理准则】一、工作未确定，回国需先给父母、亲戚朋友看下文凭的情况，办理一份就读学校的毕业证【微信41543339】文凭即可二、回国进私企、外企、自己做生意的情况，这些单位是不查询毕业证真伪的，而且国内没有渠道去查询国外文凭的真假，也不需要提供真实教育部认证。鉴于此，办理一份毕业证【微信41543339】即可三、进国企，银行，事业单位，考公务员等等，这些单位是必需要提供真实教育部认证的，办理教育部认证所需资料众多且烦琐，所有材料您都必须提供原件，我们凭借丰富的经验，快捷的绿色通道帮您快速整合材料，让您少走弯路。留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才留信网服务项目： 1、留学生专业人才库服务（留信分析） 2、国（境）学习人员提供就业推荐信服务 3、留学人员区块链存储服务 → 【关于价格问题（保证一手价格）】我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。选择实体注册公司办理，更放心，更安全！我们的承诺：客户在留信官方认证查询网站查询到认证通过结果后付款，不成功不收费！

Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...

John Andrews

SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation" Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults Description: Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project. Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas

Adjusting primitives for graph : SHORT REPORT / NOTES

Subhajit Sahu

Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is Multiply with different modes (map) 1. Performance of sequential execution based vs OpenMP based vector multiply. 2. Comparing various launch configs for CUDA based vector multiply. Sum with different storage types (reduce) 1. Performance of vector element sum using float vs bfloat16 as the storage type. Sum with different modes (reduce) 1. Performance of sequential execution based vs OpenMP based vector element sum. 2. Performance of memcpy vs in-place based CUDA based vector element sum. 3. Comparing various launch configs for CUDA based vector element sum (memcpy). 4. Comparing various launch configs for CUDA based vector element sum (in-place). Sum with in-place strategies of CUDA mode (reduce) 1. Comparing various launch configs for CUDA based vector element sum (in-place).

原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样

u86oixdj

学校原件一模一样【微信：741003700 】《(Deakin毕业证书)迪肯大学毕业证学位证》【微信：741003700 】学位证，留信认证（真实可查，永久存档）原件一模一样纸张工艺/offer、雅思、外壳等材料/诚信可靠,可直接看成品样本，帮您解决无法毕业带来的各种难题！外壳，原版制作，诚信可靠，可直接看成品样本。行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备。十五年致力于帮助留学生解决难题，包您满意。本公司拥有海外各大学样板无数，能完美还原。 1:1完美还原海外各大学毕业材料上的工艺：水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠。文字图案浮雕、激光镭射、紫外荧光、温感、复印防伪等防伪工艺。材料咨询办理、认证咨询办理请加学历顾问Q/微741003700 【主营项目】一.毕业证【q微741003700】成绩单、使馆认证、教育部认证、雅思托福成绩单、学生卡等！二.真实使馆公证(即留学回国人员证明,不成功不收费) 三.真实教育部学历学位认证（教育部存档！教育部留服网站永久可查）四.办理各国各大学文凭(一对一专业服务,可全程监控跟踪进度) 如果您处于以下几种情况： ◇在校期间，因各种原因未能顺利毕业……拿不到官方毕业证【q/微741003700】 ◇面对父母的压力，希望尽快拿到； ◇不清楚认证流程以及材料该如何准备； ◇回国时间很长，忘记办理； ◇回国马上就要找工作，办给用人单位看； ◇企事业单位必须要求办理的 ◇需要报考公务员、购买免税车、落转户口 ◇申请留学生创业基金留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才

The affect of service quality and online reviews on customer loyalty in the E...

jerlynmaetalle

一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单

vcaxypu

RUG毕业证【微信95270640】办文凭{格罗宁根大学毕业证}Q微Q微信95270640RUG毕业证书成绩单/学历认证RUG Diploma未毕业、挂科怎么办？+QQ微信：Q微信95270640-大学Offer（申请大学）、成绩单（申请考研）、语言证书、在读证明、使馆公证、办真实留信网认证、真实大使馆认证、学历认证办理国外格罗宁根大学毕业证书 #成绩单改成绩 #教育部学历学位认证 #毕业证认证 #留服认证 #使馆认证（留学回国人员证明） #（证）等真实教育部认证教育部存档中国教育部留学服务中心认证（即教育部留服认证）网站100%可查. 真实使馆认证（即留学人员回国证明）使馆存档可通过大使馆查询确认. 留信网认证国家专业人才认证中心颁发入库证书留信网永久存档可查. 格罗宁根大学格罗宁根大学毕业证学历书毕业证 #成绩单等全套材料从防伪到印刷从水印到钢印烫金跟学校原版100%相同. 国际留学归国服务中心：实体公司注册经营行业标杆精益求精！国外毕业证学位证成绩单办理流程： 1客户提供办理格罗宁根大学格罗宁根大学毕业证学历书信息：姓名生日专业学位毕业时间等（如信息不确定可以咨询顾问：我们有专业老师帮你查询）； 2开始安排制作格罗宁根大学毕业证成绩单电子图； 3格罗宁根大学毕业证成绩单电子版做好以后发送给您确认； 4格罗宁根大学毕业证成绩单电子版您确认信息无误之后安排制作成品； 5格罗宁根大学成品做好拍照或者视频给您确认； 6快递给客户（国内顺丰国外DHLUPS等快递邮寄格罗宁根大学格罗宁根大学毕业证学历书）。也得开灯开风扇山娃不想浪费电总将小方桌搁在门口看书写作业有一次山娃坐在门口写作业写着写着竟伏在桌上睡着了迷迷糊糊中山娃似乎听到了父亲的脚步声当他晃晃悠悠站起来时才诧然发现一位衣衫破旧的妇女挎着一只硕大的蛇皮袋手里拎着长铁钩正站在门口朝黑色的屋内张望不好坏人小偷山娃一怔却也灵机一动立马仰起头双手拢在嘴边朝楼上大喊：“爸爸爸——有人找——那人一听朝山娃尴尬地笑笑悻悻地走了山娃立马“嘭的一声将铁门锁死受

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理

slg6lamcq

原版定制【微信:41543339】【(Adelaide毕业证书)阿德莱德大学毕业证】【微信:41543339】成绩单、外壳、offer、留信学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。【我们承诺采用的是学校原版纸张（纸质、底色、纹路），我们拥有全套进口原装设备，特殊工艺都是采用不同机器制作，仿真度基本可以达到100%，所有工艺效果都可提前给客户展示，不满意可以根据客户要求进行调整，直到满意为止！】【业务选择办理准则】一、工作未确定，回国需先给父母、亲戚朋友看下文凭的情况，办理一份就读学校的毕业证【微信41543339】文凭即可二、回国进私企、外企、自己做生意的情况，这些单位是不查询毕业证真伪的，而且国内没有渠道去查询国外文凭的真假，也不需要提供真实教育部认证。鉴于此，办理一份毕业证【微信41543339】即可三、进国企，银行，事业单位，考公务员等等，这些单位是必需要提供真实教育部认证的，办理教育部认证所需资料众多且烦琐，所有材料您都必须提供原件，我们凭借丰富的经验，快捷的绿色通道帮您快速整合材料，让您少走弯路。留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才留信网服务项目： 1、留学生专业人才库服务（留信分析） 2、国（境）学习人员提供就业推荐信服务 3、留学人员区块链存储服务 → 【关于价格问题（保证一手价格）】我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。选择实体注册公司办理，更放心，更安全！我们的承诺：客户在留信官方认证查询网站查询到认证通过结果后付款，不成功不收费！

一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单

ewymefz

IIT毕业证【微信95270640】购买（伊利诺伊理工大学毕业证成绩单硕士学历）Q微信95270640代办IIT学历认证留信网伪造伊利诺伊理工大学学位证书精仿伊利诺伊理工大学本科/硕士文凭证书补办伊利诺伊理工大学 diplomaoffer,Transcript购买伊利诺伊理工大学毕业证成绩单购买IIT假毕业证学位证书购买伪造伊利诺伊理工大学文凭证书学位证书,专业办理雅思、托福成绩单，学生ID卡，在读证明，海外各大学offer录取通知书，毕业证书，成绩单，文凭等材料:1:1完美还原毕业证、offer录取通知书、学生卡等各种在读或毕业材料的防伪工艺（包括烫金、烫银、钢印、底纹、凹凸版、水印、防伪光标、热敏防伪、文字图案浮雕，激光镭射，紫外荧光，温感光标）学校原版上有的工艺我们一样不会少，不论是老版本还是最新版本，都能保证最高程度还原，力争完美以求让所有同学都能享受到完美的品质服务。 #一整套伊利诺伊理工大学文凭证件办理#—包含伊利诺伊理工大学伊利诺伊理工大学毕业证假文凭学历认证|使馆认证|归国人员证明|教育部认证|留信网认证永远存档教育部学历学位认证查询办理国外文凭国外学历学位认证#我们提供全套办理服务。一整套留学文凭证件服务：一：伊利诺伊理工大学伊利诺伊理工大学毕业证假文凭毕业证 #成绩单等全套材料从防伪到印刷水印底纹到钢印烫金二：真实使馆认证（留学人员回国证明）使馆存档三：真实教育部认证教育部存档教育部留服网站永久可查四：留信认证留学生信息网站永久可查国外毕业证学位证成绩单办理方法： 1客户提供办理伊利诺伊理工大学伊利诺伊理工大学毕业证假文凭信息：姓名生日专业学位毕业时间等（如信息不确定可以咨询顾问：我们有专业老师帮你查询）； 2开始安排制作毕业证成绩单电子图； 3毕业证成绩单电子版做好以后发送给您确认； 4毕业证成绩单电子版您确认信息无误之后安排制作成品； 5成品做好拍照或者视频给您确认； 6快递给客户（国内顺丰国外DHLUPS等快读邮寄）。教育部文凭学历认证认证的用途：如果您计划在国内发展那么办理国内教育部认证是必不可少的。事业性用人单位如银行国企公务员在您应聘时都会需要您提供这个认证。其他私营 #外企企业无需提供！办理教育部认证所需资料众多且烦琐所有材料您都必须提供原件我们凭借丰富的经验帮您快速整合材料让您少走弯路。实体公司专业为您服务如有需要请联系我: 微信95270640声和哐咣的关门声待山娃醒来时父亲早已上班去了床头总搁着山娃最爱吃的馒头和肉包还有白花花的豆浆父亲中午留在工地吃饭和午休山娃的中饭是对面快餐店送来的不用山娃付钱父亲早跟老板谈妥了钱到时一起结父亲给山娃配了台手机二手货诺基亚的父亲说有什么事只管给他挂电话能拥有自己的手机山娃很高兴除了玩游戏发短信除了挂电话给爷爷奶奶和母亲山娃还给班主任邱老师连挂了二个电话并给同学阿强和阿昌家挂山娃兴奋地向他们诉说城市一

一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单

ewymefz

UPenn毕业证【微信95270640】办理宾夕法尼亚大学毕业证原版一模一样、UPenn毕业证制作【Q微信95270640】《宾夕法尼亚大学毕业证购买流程》《UPenn成绩单制作》宾夕法尼亚大学毕业证书UPenn毕业证文凭宾夕法尼亚大学本科毕业证书,学历学位认证如何办理【留学国外学位学历认证、毕业证、成绩单、大学Offer、雅思托福代考、语言证书、学生卡、高仿教育部认证等一切高仿或者真实可查认证服务】代办国外（海外）英国、加拿大、美国、新西兰、澳大利亚、新西兰等国外各大学毕业证、文凭学历证书、成绩单、学历学位认证真实可查。办国外宾夕法尼亚大学宾夕法尼亚大学硕士学位证成绩单教育部学历学位认证留信认证大使馆认证留学回国人员证明修改成绩单信封申请学校offer录取通知书在读证明offer letter。快速办理高仿国外毕业证成绩单： 1宾夕法尼亚大学毕业证+成绩单+留学回国人员证明+教育部学历认证（全套留学回国必备证明材料给父母及亲朋好友一份完美交代）; 2雅思成绩单托福成绩单OFFER在读证明等留学相关材料（申请学校转学甚至是申请工签都可以用到）。 3.毕业证 #成绩单等全套材料从防伪到印刷从水印到钢印烫金高精仿度跟学校原版100%相同。专业服务请勿犹豫联系我！联系人微信号：95270640诚招代理：本公司诚聘当地代理人员如果你有业余时间有兴趣就请联系我们。国外宾夕法尼亚大学宾夕法尼亚大学硕士学位证成绩单办理过程： 1客户提供办理信息：姓名生日专业学位毕业时间等（如信息不确定可以咨询顾问：我们有专业老师帮你查询）； 2开始安排制作毕业证成绩单电子图； 3毕业证成绩单电子版做好以后发送给您确认； 4毕业证成绩单电子版您确认信息无误之后安排制作成品； 5成品做好拍照或者视频给您确认； 6快递给客户（国内顺丰国外DHLUPS等快读邮寄）。我们在哪里父母对我们的爱和思念为我们的生命增加了光彩给予我们自由追求的力量生活的力量我们也不忘感恩正因为这股感恩的线牵着我们使我们在一年的结束时刻义无反顾的踏上了回家的旅途人们常说父母恩最难回报愿我能以当年爸爸妈妈对待小时候的我们那样耐心温柔地对待我将渐渐老去的父母体谅他们以反哺之心奉敬父母以感恩之心孝顺父母哪怕只为父母换洗衣服为父母喂饭送汤按摩酸痛的腰背握着父母的手扶着他们一步一步地慢慢散步.娃

社内勉強会資料_LLM Agents　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　.

NABLAS株式会社

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理

oz8q3jxlp

原版定制【微信:41543339】【(Deakin毕业证书)迪肯大学毕业证】【微信:41543339】成绩单、外壳、offer、留信学历认证（永久存档真实可查）采用学校原版纸张、特殊工艺完全按照原版一比一制作（包括：隐形水印，阴影底纹，钢印LOGO烫金烫银，LOGO烫金烫银复合重叠，文字图案浮雕，激光镭射，紫外荧光，温感，复印防伪）行业标杆！精益求精，诚心合作，真诚制作！多年品质 ,按需精细制作，24小时接单,全套进口原装设备，十五年致力于帮助留学生解决难题，业务范围有加拿大、英国、澳洲、韩国、美国、新加坡，新西兰等学历材料，包您满意。【我们承诺采用的是学校原版纸张（纸质、底色、纹路），我们拥有全套进口原装设备，特殊工艺都是采用不同机器制作，仿真度基本可以达到100%，所有工艺效果都可提前给客户展示，不满意可以根据客户要求进行调整，直到满意为止！】【业务选择办理准则】一、工作未确定，回国需先给父母、亲戚朋友看下文凭的情况，办理一份就读学校的毕业证【微信41543339】文凭即可二、回国进私企、外企、自己做生意的情况，这些单位是不查询毕业证真伪的，而且国内没有渠道去查询国外文凭的真假，也不需要提供真实教育部认证。鉴于此，办理一份毕业证【微信41543339】即可三、进国企，银行，事业单位，考公务员等等，这些单位是必需要提供真实教育部认证的，办理教育部认证所需资料众多且烦琐，所有材料您都必须提供原件，我们凭借丰富的经验，快捷的绿色通道帮您快速整合材料，让您少走弯路。留信网认证的作用: 1:该专业认证可证明留学生真实身份 2:同时对留学生所学专业登记给予评定 3:国家专业人才认证中心颁发入库证书 4:这个认证书并且可以归档倒地方 5:凡事获得留信网入网的信息将会逐步更新到个人身份内，将在公安局网内查询个人身份证信息后，同步读取人才网入库信息 6:个人职称评审加20分 7:个人信誉贷款加10分 8:在国家人才网主办的国家网络招聘大会中纳入资料，供国家高端企业选择人才留信网服务项目： 1、留学生专业人才库服务（留信分析） 2、国（境）学习人员提供就业推荐信服务 3、留学人员区块链存储服务 → 【关于价格问题（保证一手价格）】我们所定的价格是非常合理的，而且我们现在做得单子大多数都是代理和回头客户介绍的所以一般现在有新的单子我给客户的都是第一手的代理价格，因为我想坦诚对待大家不想跟大家在价格方面浪费时间对于老客户或者被老客户介绍过来的朋友，我们都会适当给一些优惠。选择实体注册公司办理，更放心，更安全！我们的承诺：客户在留信官方认证查询网站查询到认证通过结果后付款，不成功不收费！

Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf

Linda486226

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...

Subhajit Sahu

Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.

【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】

NABLAS株式会社

Recently uploaded (20)

Best best suvichar in gujarati english meaning of this sentence as Silk road ...

一比一原版(UofS毕业证书)萨省大学毕业证如何办理

Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation

一比一原版(CBU毕业证)卡普顿大学毕业证如何办理

Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...

standardisation of garbhpala offhgfffghh

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理

Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...

Adjusting primitives for graph : SHORT REPORT / NOTES

原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样

The affect of service quality and online reviews on customer loyalty in the E...

一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单

一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理

一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单

一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单

社内勉強会資料_LLM Agents　　　　　　　　　　　　　　　　　　　　　　　　　　　　　　.

一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理

Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf

Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...

【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】

Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks

1. Breaking Down Analytical and Computational Barriers in Energy Data Analytics Jonathan Farland DNV GL Energy

2. • IntroductionsWho is DNV GL? • Overview • Data Science Energy Analytics • Demonstration Statistical Computing Pilot • Plans Concepts in Development • DiscussionQ&A Agenda

3. DNV GL Policy Production Transmission & Distribution Use

4. 400 offices 100 countries 16,000 employees 150 years

5. Policy, Advisory and Research Demand Side Management Energy Analytics Load Research Services Market Research and Program Evaluation

6. Electricity Distribution Grid

7. Electricity Distribution Grid Transmission Distribution ConsumerGeneration Transmission Distribution ConsumerGeneration Wind Farms Photo Voltaic Aggregated Utility Scale 2-50 MW Utility Scale 100kW-2MW Distributed Scale 25kW-100kW Residential Commercial & Industrial DistributionTransmissionGeneration Bulk Storage > 50 MW Distribution System End User Bulk System PhotovoltaicWind Farms

8. The Rise of The Smart Grid

9. Energy Data Science Jonathan Farland DNV GL Energy

10. Terminology

11. 11

12. 12

13. 13

14. 14

15. 15

16. 16

17. Forecasting Approaches – Similar Day Matching – Statistically Adjusted Engineering (SAE) – Univariate Time Series (ARIMA) – Multiple Linear Regression – Econometric – Machine / Statistical Learning – Semiparametric Regression – Artificial Neural Networks – Fuzzy Logic – Support Vector Machines – Gradient Boosting

18. Additive Semiparametric Model 𝑦" = ℎ 𝑡𝑖𝑚𝑒 + 𝑓 𝑤𝑒𝑎𝑡ℎ𝑒𝑟 + 𝛼 𝑏𝑒ℎ𝑎𝑣𝑖𝑜𝑟 + 𝜀" Short Term Electricity Demand Time of Year Prevailing Atmosphere Conditions Recent Demand Behavior

19. Additive Semiparametric Model 𝑦" = ℎ 𝑡𝑖𝑚𝑒 + 𝑓 𝑤𝑒𝑎𝑡ℎ𝑒𝑟 + 𝛼 𝑏𝑒ℎ𝑎𝑣𝑖𝑜𝑟 + 𝜀" Short Term Electricity Demand Time of Year Prevailing Atmosphere Conditions Recent Demand Behavior

20. Emerging Technologies Photovoltaic Cells (e.g., Solar) Electric Vehicles Storage Wind Energy Efficiency Demand Response

21. 21 Load Shifting: Electric Vehicles 0 5 10 15 20 25 30 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Demand(kW) Hour Ending Standard Rate Electric Vehicle Rate

22. 22 - 20,000 40,000 60,000 80,000 100,000 120,000 140,000 160,000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Load(kWh) Hour Ending Forecasted - DR Reduction Forecasted - DR Baseline Forecasted - DR Impacted Load Actual DR - Reduction Load Reduction: Demand Response

23. Databricks + Spark Pilot

24. Benefits of Big Data from Advanced Metering Infrastructure ü A deeper understanding of demand and thereforehuman behavior (think energy efficiency) ü Cost effective operating costs ü Real-timenotification of power outages ü ImprovedSystem Planning and Reliability ü Allows for integration of disruptive technologies like Electric Vehicles Statistical Computing Pilot

25. Pilot Design Data Generating Process Analytics Statistical Computing Pilot

26. Energy Consumption Climatic - Temperature - Humidity - Wind Speed - Solar Demographic Firmographic Economic Financial Energy Efficiency Program Tracking Grid Infrastructure Statistical Computing Pilot Data Diversity

27. Key Focus Areas Performance Scalability Granularity

28. Going Further

29. Use Cases

30. DEMONSTRATION

31. VISION OF THE FUTURE

32. Current Concepts in Development Weather Normalization at Scale (e.g., California) Real-time Energy Forecasting Using Statistical Learning and Spark Streaming API Real-time Customer Sentiment Analysis Grid Reliability Analysis Cybercrime Protection of Electricity Grids

33. THANK YOU. Jonathan Farland – DNV GL jon.farland@dnvgl.com Andrew Stryker – DNV GL Andrew.stryker@dnvgl.com

Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (20)

Similar to Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks

Similar to Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks (20)

More from Spark Summit

More from Spark Summit (20)

Recently uploaded

Recently uploaded (20)

Breaking Down Analytical and Computational Barriers Across the Energy Industry Using Databricks