How easy (or hard) it is to monitor your GraphQL service performance

Public Produced by Luca Ferrari Version 0.9
How easy (or hard) it is to monitor
your GraphQL service performance
by Luca Ferrari
EMEA Solution Architect
at Red Hat

Agenda
graphql {
  what {
    challenges
    solutions
  }
  possible {
    demo
  }
  questions(limit: None) {
    answers(0:N)
  }
}

GraphQL boring definition
GraphQL is a query language for APIs and a runtime for fulfilling those queries with your
existing data.
GraphQL provides a complete and understandable description of the data in your API as well
as gives clients the power to ask for exactly what they need and nothing more.
It simplifies evolving APIs over time and enables powerful developer tools.
Reference: graphql.org/learn

Quote
With great power comes great responsibility
Voltaire

From ...
Reference: https://github.com/OlegIlyenko/presentation-graphql-introduction

… to
Reference: https://github.com/OlegIlyenko/presentation-graphql-introduction

GraphQL advantages
Exact data fetching / Client-specific shape of response: with GraphQL, you send a query to your API
and get exactly what you need, nothing more and nothing less. GraphQL minimizes the amount of data
that is transferred across the wire by being selective about the data depending on the client
application’s needs. Thus, a mobile client can fetch less information.
vs
Overfetching or Underfetching

GraphQL advantages
One request, many resources / network efficiency: GraphQL makes it simple to fetch all required data with
a single request. A GraphQL server exposes only a single endpoint, so clients can fetch data from it
declaratively.
vs
Multiple requests to get a composite result, treating the network as an unlimited resource /
network inefficiency

GraphQL advantages
Modern compatibility: modern applications are now often built so that a single
backend application supplies the data needed to run multiple clients.
GraphQL embraces this trend, as it can connect the backend application and fulfill
each client’s requirements (nested relationships of data, fetching only the required data, network usage
requirements, etc.) without dedicating a separate API to each client.
Schema stitching makes it possible to create a single general schema from different schemas. As a
result, each microservice can define its own GraphQL schema.
vs
Multiple APIs for omnichannel experience

GraphQL advantages
Field-level deprecation: as developers, we are used to calling different versions of an API and often
getting really weird responses. Traditionally, we version APIs when we change the
resources or their structure, hence the need to deprecate the old version and
evolve a new one.
In GraphQL, it is possible to deprecate APIs at the field level. When a particular field is deprecated, a
client receives a deprecation warning when querying the field. After a while, the deprecated field may be
removed from the schema once few clients still use it.
vs
Versioning / Deprecation / Outdated documentation

Possible issues
1. Caching: with REST you access resources with URLs, and thus you would be able to cache on a
resource level. In GraphQL, this becomes complex as each query can be different even though it
operates on the same entity.
2. Query performance: GraphQL gives clients the power to execute queries to get exactly what they
need. It could also mean that users can ask for as many fields in as many resources as they want
and build highly complex queries that slow systems down.
3. Security: OIDC scopes, granular authz or rate limiting might not be as easy to implement as with
REST services.
4. Monitoring performance: measuring response time of a GraphQL endpoint gives us almost no
insight into the health of our GraphQL API.
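
The caching issue can be illustrated with a minimal sketch. It assumes a cache keyed on the query text plus its variables; `cacheKey` is a hypothetical helper, not part of any GraphQL library. Two queries on the same entity end up under different keys, while the same query with different whitespace can at least be normalized to one key.

```typescript
// Hypothetical helper: build a cache key from the query text and its variables.
function cacheKey(query: string, variables: Record<string, unknown> = {}): string {
  // Collapse whitespace so formatting differences do not change the key.
  const normalized = query.replace(/\s+/g, " ").trim();
  return normalized + "|" + JSON.stringify(variables);
}

// Both queries read the same `viewer` entity but select different fields.
const nameOnly = cacheKey("query { viewer { name } }");
const nameAndFriend = cacheKey(`
  query {
    viewer { name bestFriend { name } }
  }
`);

// Same query as `nameOnly`, only formatted differently.
const nameOnlyReformatted = cacheKey("query {\n  viewer { name }\n}");
```

Here `nameOnly` and `nameOnlyReformatted` are equal, but `nameAndFriend` is a separate cache entry even though it targets the same entity, which is what makes resource-level caching harder than with REST URLs.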

Performance Monitoring
Imagine a simple query:
query {
  viewer {
    name
    bestFriend {
      name
    }
  }
}
Now a new API client starts using our GraphQL API:
query {
  viewer {
    friends(first: 1000) {
      bestFriend {
        name
      }
    }
  }
}
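
One way to see how different these two queries are is to estimate how many field resolutions each one triggers. The sketch below is an illustration only: `QueryField` and `estimateCost` are hypothetical names, not part of any GraphQL library, and the model assumes that a `first: N` argument makes the server resolve the nested selection N times.

```typescript
// Hypothetical, simplified model of a query selection (not a real GraphQL AST).
type QueryField = {
  name: string;
  first?: number;           // list size requested via `first:`, if any
  children?: QueryField[];  // nested selection set
};

// Each field costs 1; a list field multiplies the cost of its nested
// selection by the requested list size.
function estimateCost(field: QueryField): number {
  const multiplier = field.first ?? 1;
  const nested = (field.children ?? []).reduce(
    (sum, child) => sum + estimateCost(child),
    0,
  );
  return 1 + multiplier * nested;
}

const simpleQuery: QueryField = {
  name: "viewer",
  children: [
    { name: "name" },
    { name: "bestFriend", children: [{ name: "name" }] },
  ],
};

const heavyQuery: QueryField = {
  name: "viewer",
  children: [
    {
      name: "friends",
      first: 1000,
      children: [{ name: "bestFriend", children: [{ name: "name" }] }],
    },
  ],
};

const simpleCost = estimateCost(simpleQuery); // 4
const heavyCost = estimateCost(heavyQuery);   // 2002
```

Under this simplified model the second query costs roughly 500 times more than the first, even though both hit the same endpoint.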

Performance Monitoring
● The queries are not so different, but the second one will see a much higher response time.
● Responses get slower as we serve more and more complex queries.
● But what we would really like to know is the general behaviour of our backend given a comparable
workload.
● We are not interested in monitoring the endpoint, but the queries.
● If we are running a private GraphQL API with known clients we can control the situation, but
otherwise ...
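
Monitoring the queries rather than the endpoint can be sketched as grouping latency samples by operation instead of by URL. This is a minimal sketch under stated assumptions: `recordLatency` and `averageByOperation` are hypothetical helpers, the operation names are invented, and the numbers are illustrative, not measurements.

```typescript
// Latency samples in milliseconds, keyed by a (hypothetical) operation name.
const latencySamples: Record<string, number[]> = {};

function recordLatency(operation: string, millis: number): void {
  const list = latencySamples[operation] ?? [];
  list.push(millis);
  latencySamples[operation] = list;
}

// Average latency per operation, so comparable workloads are compared together.
function averageByOperation(): Record<string, number> {
  const result: Record<string, number> = {};
  for (const operation of Object.keys(latencySamples)) {
    const list = latencySamples[operation] ?? [];
    const total = list.reduce((sum, value) => sum + value, 0);
    result[operation] = total / list.length;
  }
  return result;
}

// Illustrative numbers only; all three requests hit the same endpoint.
recordLatency("ViewerWithBestFriend", 20);
recordLatency("ViewerWithBestFriend", 30);
recordLatency("FriendsOfFriends", 900);

const averages = averageByOperation();
```

A single endpoint-wide average over these three samples would be dominated by the one heavy request, while the per-operation averages keep the light and heavy queries apart.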

Observability
Observability is the ability to determine the
internal states of a system from its external
outputs.
Observability consists of three pillars: metrics,
traces, and logs.
Drawing conclusions from any one of these pillars alone
is difficult.
Observability means bringing together the information
from all in a coordinated way toward finding bugs and
bottlenecks.

OpenTracing
Distributed tracing is a method used to profile and
monitor applications, especially those built using a
microservices architecture. Distributed tracing helps
pinpoint where failures occur and what causes poor
performance.
OpenTracing is comprised of an API specification,
frameworks and libraries that have implemented the
specification, and documentation for the project.
OpenTracing allows developers to add
instrumentation to their application code using APIs
that do not lock them into any one particular
product or vendor.
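
The idea of instrumenting against a vendor-neutral API can be sketched as follows. This is not the OpenTracing API itself: the `TraceSpan` and `SpanTracer` interfaces, `InMemoryTracer`, and `loadViewer` are hypothetical names used for illustration, and a real tracer would export spans to a backend rather than keep them in memory.

```typescript
// Hypothetical minimal tracing API; application code depends only on this.
interface TraceSpan {
  finish(): void;
}
interface SpanTracer {
  startSpan(name: string): TraceSpan;
}

type FinishedSpan = { name: string; durationMs: number };

// One possible implementation that keeps finished spans in memory.
class InMemoryTracer implements SpanTracer {
  readonly finished: FinishedSpan[] = [];

  startSpan(name: string): TraceSpan {
    const startedAt = Date.now();
    return {
      finish: () => {
        this.finished.push({ name, durationMs: Date.now() - startedAt });
      },
    };
  }
}

// Instrumented application code: it never references a concrete tracer,
// so the implementation can be swapped without touching this function.
function loadViewer(tracer: SpanTracer): { name: string } {
  const span = tracer.startSpan("loadViewer");
  try {
    return { name: "Ada" }; // placeholder data
  } finally {
    span.finish();
  }
}

const memoryTracer = new InMemoryTracer();
const loadedViewer = loadViewer(memoryTracer);
```

Swapping `InMemoryTracer` for another implementation of the same interface would not require any change to `loadViewer`, which is the property the specification aims for.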

OpenTracing and OpenCensus have merged to form
OpenTelemetry!
OpenTelemetry is a collection of tools, APIs, and
SDKs.
You use it to instrument, generate, collect, and
export telemetry data (metrics, logs, and traces) for
analysis in order to understand your software's
performance and behavior.

Jaeger
● A product built at Uber
● Inspired by Dapper from Google
● Donated to CNCF
● Supported libraries in Go, Java, Node.js, Python, C++, C#
● Accepts spans in Zipkin format for backward compatibility
● Emits Prometheus metrics

Apollo Tracing extension
Apollo Tracing is a GraphQL extension for performance
monitoring that works with most popular GraphQL server
libraries, including Node, Ruby, Scala, Java, and Elixir, and it
enables you to easily get resolver-level performance
information as part of a GraphQL response.
Apollo Tracing works by including data in the extensions field
of the GraphQL response, which is reserved by the GraphQL
spec for extra information that a server wants to return. That
way, you have access to performance traces alongside the
data returned by your query.
Reference: https://www.apollographql.com/blog/exposing-trace-data-for-your-graphql-server-with-apollo-tracing-97c5dd391385/
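
As a sketch of how a client could use that data, the example below picks the slowest resolver out of a response. The response object is hand-written for illustration; its shape (`extensions.tracing.execution.resolvers`, each with a `path` and a `duration` in nanoseconds) is an assumption based on the Apollo Tracing format and should be checked against the server version in use. `slowestResolver` is a hypothetical helper.

```typescript
// Assumed shape of the tracing data inside a GraphQL response.
type ResolverTrace = {
  path: (string | number)[];
  duration: number; // nanoseconds (assumed unit)
};
type TracedResponse = {
  data: unknown;
  extensions?: { tracing?: { execution: { resolvers: ResolverTrace[] } } };
};

// Return the slowest resolver trace, or undefined when no tracing data is present.
function slowestResolver(response: TracedResponse): ResolverTrace | undefined {
  const resolvers = response.extensions?.tracing?.execution.resolvers ?? [];
  let slowest: ResolverTrace | undefined;
  for (const resolver of resolvers) {
    if (slowest === undefined || resolver.duration > slowest.duration) {
      slowest = resolver;
    }
  }
  return slowest;
}

// Hand-written example response; the numbers are illustrative only.
const tracedResponse: TracedResponse = {
  data: { viewer: { name: "Ada", bestFriend: { name: "Grace" } } },
  extensions: {
    tracing: {
      execution: {
        resolvers: [
          { path: ["viewer"], duration: 1_200_000 },
          { path: ["viewer", "name"], duration: 40_000 },
          { path: ["viewer", "bestFriend"], duration: 8_500_000 },
        ],
      },
    },
  },
};

const slowestTrace = slowestResolver(tracedResponse);
```

With this sample data, `slowestTrace` points at the `viewer.bestFriend` resolver, which is the kind of resolver-level insight that the endpoint response time alone cannot give.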

Instana tracing support
Instana offers tracing for GraphQL queries, mutations and
subscriptions. GraphQL tracing is currently supported for the
Ruby, Node.js and Java runtimes.
For each operation, we capture
● the operation type (query, mutation or
subscription-update),
● the operation name (if provided),
● all involved object types,
● the arguments used for each object type, and
● the selected fields for each object type.
Each time a client receives an update due to one of its active
GraphQL subscriptions, we trace this subscription update as a
call from the GraphQL server to the client.
Reference: https://www.instana.com/docs/ecosystem/graphql/

Datadog tracer plugin
The Datadog JavaScript Tracer provides out-of-the-box
instrumentation for many popular frameworks and libraries
by using a plugin system. By default all built-in plugins are
enabled.
This library is OpenTracing compliant. Use the OpenTracing
API and the Datadog Tracer (dd-trace) library to measure
execution times for specific pieces of code.
This plugin automatically instruments the graphql module.
The graphql integration uses the operation name as the span
resource name. If no operation name is set, the resource
name will always be just query, mutation or subscription.
Reference: https://datadoghq.dev/dd-trace-js/

New Relic plugin
By using the New Relic Apollo Server plugin to instrument
your applications, you can get to the root cause of issues.
The plugin records the overall timing of the operations and
then parses the payload so you can uncover and diagnose the
cause of your slow GraphQL operations.
Distributed tracing goes further and provides the capability
to understand if the latency is coming from the application
itself or other services.
Reference: https://blog.newrelic.com/product-news/apollo-server-plugin/

Apollo OpenTracing plugin
Apollo OpenTracing allows you to integrate open source
backed performance tracing into your Apollo server, based on
industry standards for tracing.
➢ Request and field-level resolvers are traced out of the
box
➢ Queries and results are logged, to make debugging
easier
➢ Select which requests you want to trace
➢ Spans transmitted through the HTTP headers are
picked up
➢ Use the OpenTracing-compatible tracer you like
Reference: https://github.com/DanielMSchmidt/apollo-opentracing

Apollo server
Apollo Server is an open-source, spec-compliant GraphQL server that's compatible with any GraphQL client, including
Apollo Client.
You can use Apollo Server as:
● A stand-alone GraphQL server, including in a serverless environment
● An add-on to your application's existing Node.js middleware (such as Express or Fastify)
● A gateway for a federated data graph
Reference: https://www.apollographql.com/

Components
1. Apollo server with OpenTracing plugin
2. ReactJS client
3. Jaeger platform

Flow
[Diagram: ReactJS client, Apollo server, resolver, axios, GitHub REST API, Jaeger platform]

Demo time!
What just happened?
