Asymmetry in Large-Scale Graph Analysis, Explained

Paper presentation at the 2nd International Workshop on Graph Data Management Experiences and Systems (GRADES'14)

Vasiliki Kalavri (1), Stephan Ewen (2), Kostas Tzoumas (3),
Vladimir Vlassov (4), Volker Markl (5), Seif Haridi (6)
(1, 4, 6) KTH Royal Institute of Technology, {kalavri, vladv, haridi}@kth.se
(2, 3, 5) Technical University of Berlin, {firstname.lastname}@tu-berlin.de

Motivation
● Many large-scale data processing
applications include
fixed point iterations
○ social network analysis
○ web graph analysis
○ machine learning

Asymmetrical Convergence
● Often, in fixed point
iterations, some
elements converge
faster than others
● Not all elements
require an update in
every iteration
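
To make the asymmetry concrete, here is a minimal sketch that counts how many elements actually change in each iteration. It assumes an example algorithm not named on the slide (connected components by minimum-label propagation) and a small hypothetical undirected graph; `changed_per_iteration` is an illustrative helper, not part of any system discussed here.

```python
def changed_per_iteration(neighbors):
    """Run min-label propagation and record how many labels change per iteration."""
    labels = {v: v for v in neighbors}  # every vertex starts with its own id
    history = []
    while True:
        # Recompute every vertex from the previous iteration's labels.
        new_labels = {
            v: min([labels[v]] + [labels[u] for u in neighbors[v]])
            for v in neighbors
        }
        changed = sum(1 for v in neighbors if new_labels[v] != labels[v])
        if changed == 0:
            return history
        history.append(changed)
        labels = new_labels


# Hypothetical graph: a chain 1-2-3-4-5-6 and a separate pair 7-8.
graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5],
         7: [8], 8: [7]}
history = changed_per_iteration(graph)
print(history)
```

In this example the pair 7-8 settles after the first iteration, while the far end of the chain keeps changing, so the number of elements that need work shrinks from one iteration to the next.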

Can we detect the elements that
require recomputation and
avoid redundant computations?

Contributions
● A categorization of optimizations for fixed point
iterative graph processing
● Necessary conditions under which it is safe to
apply optimizations
● A mapping of existing techniques to graph
processing abstractions
● An implementation of template execution plans
Optimized algorithms yield order-of-magnitude
gains!

Iterative Plans - Bulk
● In each iteration, all
elements are computed
● Always applicable
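
A bulk plan can be sketched roughly as follows. This is an illustration under assumptions, not the authors' template plan: the graph is a plain dictionary of adjacency lists, and the update function (minimum label, which computes connected components) is chosen only as an example. The names `bulk_iteration` and `min_label` are hypothetical.

```python
def bulk_iteration(neighbors, init, update):
    """Recompute every element in every iteration until nothing changes."""
    state = dict(init)
    iterations = 0
    while True:
        iterations += 1
        # Every vertex is recomputed from all of its neighbors' values.
        new_state = {
            v: update(state[v], [state[u] for u in neighbors[v]])
            for v in state
        }
        if new_state == state:  # fixed point reached
            return state, iterations
        state = new_state


def min_label(own, neighbor_values):
    """Example update function: keep the smallest label seen so far."""
    return min([own] + neighbor_values)


# Hypothetical undirected graph with two components: {1, 2, 3} and {4, 5}.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
components, rounds = bulk_iteration(graph, {v: v for v in graph}, min_label)
print(components, rounds)
```

The loop makes no assumption about the update function, which is why this plan is always applicable; the price is that converged vertices are recomputed in every round.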

Iterative Plans - Dependency
● In each iteration, only
elements with at least
one changed neighbor
are computed
● The state is computed
using the values of all
neighbors
● Always applicable
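
One way to picture the dependency plan is the sketch below. It assumes an undirected graph stored as adjacency lists (so the neighbors a vertex reads from are also the ones it must notify) and again uses minimum-label propagation purely as an example update function. `dependency_iteration` and the `evaluations` counter are illustrative names, not an API of any system mentioned here.

```python
def dependency_iteration(neighbors, init, update):
    """Recompute only vertices with a changed neighbor, using all neighbor values."""
    state = dict(init)
    candidates = set(state)  # first iteration: every vertex is a candidate
    evaluations = 0          # number of vertex recomputations performed
    while candidates:
        new_values = {}
        for v in candidates:
            evaluations += 1
            # The update still reads the values of ALL neighbors.
            value = update(state[v], [state[u] for u in neighbors[v]])
            if value != state[v]:
                new_values[v] = value
        state.update(new_values)
        # Only neighbors of changed vertices need recomputation next time.
        candidates = {u for v in new_values for u in neighbors[v]}
    return state, evaluations


def min_label(own, neighbor_values):
    """Example update function: keep the smallest label seen so far."""
    return min([own] + neighbor_values)


# Hypothetical undirected graph with two components: {1, 2, 3} and {4, 5}.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
state, evaluations = dependency_iteration(graph, {v: v for v in graph}, min_label)
print(state, evaluations)
```

Because each recomputation reads the full neighborhood, the result matches the bulk plan for any update function, while converged regions of the graph are skipped. On this toy graph a bulk plan that runs three rounds would recompute 15 vertices in total.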

Iterative Plans - Incremental
● In each iteration, only elements
with at least one changed
neighbor are computed
● The state is computed using only
the values of updated
neighbors
● Applicable when the update
function is idempotent and
weakly monotonic (e.g. min)
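
The incremental plan might look like the following sketch, which assumes minimum-label propagation on an undirected adjacency-list graph as the example. The workset layout (a dictionary from vertex to the candidate labels sent by its changed neighbors) and the name `incremental_min_labels` are illustrative assumptions.

```python
def incremental_min_labels(neighbors):
    """Min-label propagation that reads only the values of updated neighbors."""
    labels = {v: v for v in neighbors}
    # Initial workset: every vertex announces its own label to its neighbors.
    workset = {}
    for v in neighbors:
        for u in neighbors[v]:
            workset.setdefault(u, []).append(labels[v])

    while workset:
        next_workset = {}
        for v, candidates in workset.items():
            # Combine the old value with the updated neighbors only.
            best = min(candidates)
            if best < labels[v]:
                labels[v] = best
                for u in neighbors[v]:
                    next_workset.setdefault(u, []).append(best)
        workset = next_workset
    return labels


# Hypothetical undirected graph with two components: {1, 2, 3} and {4, 5}.
graph = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
labels = incremental_min_labels(graph)
print(labels)
```

Skipping unchanged neighbors is safe here because `min` is idempotent and monotonic: a neighbor's old value has already been folded into the current label, and seeing it again could never lower the result.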

Iterative Plans - Delta
● In each iteration, only elements
with at least one changed
neighbor are computed
● The state is computed using only
the deltas of updated
neighbors
● Applicable when the update
function is linear over the
composition operator (e.g.
sum)
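
As a sketch of the delta plan, the example below uses PageRank, whose update is a sum and therefore linear. PageRank is an assumption chosen for illustration, as are the damping factor, the tolerance, the requirement that every vertex has at least one outgoing edge, and the name `delta_pagerank`.

```python
def delta_pagerank(out_edges, damping=0.85, tolerance=1e-10):
    """PageRank that propagates only rank changes (deltas) between iterations.

    Assumes every vertex has at least one outgoing edge.
    """
    n = len(out_edges)
    rank = {v: 0.0 for v in out_edges}
    # The first delta of every vertex is the teleport term (1 - d) / N.
    deltas = {v: (1.0 - damping) / n for v in out_edges}

    while deltas:
        next_deltas = {}
        for v, delta in deltas.items():
            rank[v] += delta  # fold the received delta into the state
            share = damping * delta / len(out_edges[v])
            if abs(share) > tolerance:  # drop negligible changes
                for u in out_edges[v]:
                    next_deltas[u] = next_deltas.get(u, 0.0) + share
        deltas = next_deltas
    return rank


# Hypothetical directed graph without dangling vertices.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = delta_pagerank(graph)
print(ranks)
```

Since the update is a sum, adding a neighbor's delta to the stored rank gives the same result as recomputing the full sum from all neighbor values, so only the changes need to travel along the edges.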

Iteration Techniques Support in Graph Processing Systems
X : provided by default
X : can be easily implemented
X : possible, but non-intuitive
System Bulk Dependency Incremental Delta
Pregel X X X X
GraphLab X X X X
GraphX X X X X
PowerGraph X X X X
Stratosphere X X X X

Conclusions & Future Work
● Exploiting asymmetrical convergence can lead to
order-of-magnitude performance gains
● In the future, we plan to
○ Use cost-based optimization to automatically select the
most efficient iterative plan at runtime
○ Implement a set of representative applications and
compare performance with iterative and graph-processing
systems
