Let's start GraphQL: structure, behavior, and architecture
May 29, 2020
In this talk, I describe the path to start with GraphQL in a company that has experience with the Python stack and REST APIs. We go from the definition of GraphQL, via behavioral aspects and data management, to the most common architectural questions.
What is GraphQL?
GraphQL was developed internally by Facebook and released in 2015.
Since 2018, it has been a project of the GraphQL Foundation, hosted by the Linux Foundation.
GraphQL is a query language for APIs
Philosophy:
Get only what you need
Get many resources in a single request
Describe data contracts
Easier API evolution without versioning
GraphQL is a runtime for fulfilling queries with data
GraphQL is a modern alternative to REST
GraphQL vs REST
GraphQL concepts
GraphQL has a clear separation of structure and behaviour
Main concepts:
Schema
Schema serves as a contract between the client and the server.
Query
In queries, clients request the information they require.
Resolvers
Resolvers retrieve data from data sources.
The Schema Definition Language (SDL)
GraphQL has evolved and continues to evolve.
At the time of this talk, the current specification for GraphQL is from June 2018.
http://spec.graphql.org/June2018/
Everything in GraphQL is a type.
Let’s define our first simple type called Article
type Article {
}
Default Types in GraphQL
Default types out of the box:
Int
Float
String
Boolean
ID
Additional types provided by particular
implementations of GraphQL:
Date
DateTime
… many others
Let’s extend our type Article with some meaningful fields:
type Article {
id: ID!
title: String!
url: String!
}
The ! following the type name means
that the field is non-nullable (required).
… and define another type Comment to illustrate other field types
type Comment {
id: ID!
text: String!
createdAt: DateTime!
moderationStatus: Status!
moderatorNote: String
}
enum Status {
APPROVED
REJECTED
PENDING
}
Let’s extend Article by creating a one-to-many-relationship to Comment
The array of comments is non-nullable, so we can always expect an array.
And with Comment! we require that
every item of the array (if any) has to be a non-nullable valid Comment object.
type Article {
id: ID!
title: String!
url: String!
comments: [Comment!]!
}
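To make the difference between the wrappers concrete, here is a minimal sketch in plain Python that checks a value against a type reference such as [Comment!]!. The function name conforms and the string-based type notation are assumptions made for this illustration; a real GraphQL server performs this check internally.

```python
def conforms(value, type_ref):
    """Check a plain Python value against a type reference such as '[Comment!]!'."""
    if type_ref.endswith("!"):
        # Non-null wrapper: None is never acceptable here.
        return value is not None and conforms(value, type_ref[:-1])
    if value is None:
        # Nullable position: None is fine.
        return True
    if type_ref.startswith("[") and type_ref.endswith("]"):
        # List wrapper: every item must conform to the inner type.
        return isinstance(value, list) and all(
            conforms(item, type_ref[1:-1]) for item in value
        )
    # Named type: this sketch only checks that an object (dict) is present.
    return isinstance(value, dict)
```

With this sketch, an empty list satisfies [Comment!]!, while None or a list containing None does not.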
Similarly, we can establish a relation
between Comment and its parent Article:
type Comment {
id: ID!
text: String!
createdAt: DateTime!
moderationStatus: Status!
moderatorNote: String
article: Article!
}
Schema in GraphQL can be expressed as a graph
Nodes are our complex types, e.g., Article
The defined primitive types, e.g., Int or String,
represent terminal nodes (leaves)
Edges are represented by the relations
https://app.graphqleditor.com/
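As a rough illustration of the graph view, the Article and Comment types from the previous slides can be written as a plain Python mapping. The SCHEMA dictionary and the helper names edges and leaves are assumptions of this sketch, not part of any GraphQL library.

```python
# Field name -> type name, with list and non-null wrappers stripped.
SCHEMA = {
    "Article": {"id": "ID", "title": "String", "url": "String", "comments": "Comment"},
    "Comment": {
        "id": "ID",
        "text": "String",
        "createdAt": "DateTime",
        "moderationStatus": "Status",
        "moderatorNote": "String",
        "article": "Article",
    },
}


def edges(schema):
    """Return (source type, field, target type) for every relation between complex types."""
    return sorted(
        (source, field, target)
        for source, fields in schema.items()
        for field, target in fields.items()
        if target in schema
    )


def leaves(schema, type_name):
    """Return the fields of a type that end in a terminal node (scalar or enum)."""
    return sorted(f for f, target in schema[type_name].items() if target not in schema)
```

Here edges(SCHEMA) yields the two relations between Article and Comment, and leaves(SCHEMA, "Article") yields the fields that end in primitive types.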
Fetching Data
Only one endpoint has to be exposed.
Difference to REST: clients in GraphQL do not need to know
specific endpoints for each resource they want to obtain.
The possible queries have to be defined before clients can use them.
GraphQL servers expose only one endpoint (e.g., /graphql) that can return
many data structures based on the actual needs of the particular client.
Let’s define our first Query allArticles that returns list of Articles
type Query {
allArticles: [Article!]!
}
The result will be an array containing zero or more instances of Article
Note that the query definition does not limit which fields are returned;
the client can request whatever it actually needs.
… and to express the required information, the client calls this query:
query {
allArticles {
id
title
}
}
In addition to the name of the query,
the client also specifies the list of fields on Article objects to be returned
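The effect of the field list can be sketched in plain Python: the server holds complete Article records, and the response contains only the requested fields. The select helper, the sample records, and the dict-based selection format are assumptions made for this illustration.

```python
def select(obj, selection):
    """Project a dict onto a selection: None marks a leaf, a dict marks a nested object."""
    result = {}
    for field, sub_selection in selection.items():
        value = obj[field]
        if sub_selection is None:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [select(item, sub_selection) for item in value]
        else:
            result[field] = select(value, sub_selection)
    return result


# Hypothetical stored records; url exists on the server but is not requested.
articles = [
    {"id": "1", "title": "First", "url": "https://example.com/1"},
    {"id": "2", "title": "Second", "url": "https://example.com/2"},
]
response = {
    "data": {"allArticles": [select(a, {"id": None, "title": None}) for a in articles]}
}
```

The url field never appears in response because the client did not ask for it.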
Queries can have arguments
type Query {
allArticles(N: Int): [Article!]!
}
We indicate that N is an optional parameter by omitting ! after its type’s name.
… and then client can query for 2 articles with
query {
allArticles(N: 2) {
id
title
}
}
Request queries can also have names
So far, we have defined only unnamed request queries,
which makes them hard to reuse. Let’s define a named one:
query getRecentArticles {
allArticles(N: 2) {
id
title
comments {
id
text
article {
id
title
}
}
}
}
Named queries can have arguments
A GraphQL query can be parameterized with variables
to maximize query reuse.
query getRecentArticles($numOfArticles: Int!) {
allArticles(N: $numOfArticles) {
id
title
comments {
id
text
article {
id
title
}
}
}
}
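When such a query is sent over HTTP, the query text and the variable values conventionally travel together in one JSON body. The sketch below builds that body with the standard library only; the helper name build_request_body is an assumption, and the shortened field list is for brevity.

```python
import json

QUERY = """
query getRecentArticles($numOfArticles: Int!) {
  allArticles(N: $numOfArticles) {
    id
    title
  }
}
"""


def build_request_body(query, variables, operation_name=None):
    """Serialize a GraphQL operation and its variables into a JSON request body."""
    body = {"query": query, "variables": variables}
    if operation_name is not None:
        body["operationName"] = operation_name
    return json.dumps(body)


body = build_request_body(QUERY, {"numOfArticles": 2}, "getRecentArticles")
```

The same QUERY string can be reused with different variable values, which is the reuse the slide refers to.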
Fragments
Fragments are the primary unit of composition in GraphQL and
allow reusing common repeated selections of fields:
fragment requiredArticleFields on Article {
id
title
}
query getRecentArticles($numOfArticles: Int!) {
allArticles(N: $numOfArticles) {
...requiredArticleFields
comments {
text
article {
...requiredArticleFields
}
}
}
}
Introspection
A GraphQL server supports introspection over its schema.
Introspection allows frontend and backend developers to work independently
and keep the documentation on the contract up-to-date.
A client can execute a special query and request __type information about
any particular type defined in the GraphQL server.
https://graphql.org/learn/introspection/
Let’s get information about Article
In this request query we express which information about the type Article has to be inspected.
query {
  __type(name: "Article") {
    name
    fields {
      name
      type {
        kind
        ofType {
          name
          kind
        }
      }
    }
  }
}
The schema can also be introspected:
query {
  __schema {
    queryType {
      name
    }
  }
}
… and the response contains the specification of the schema:
{
  "__schema": {
    "queryType": {
      "name": "Query"
    }
  }
}
Create, Update, and Delete
A mutation is a write followed by a fetch
Mutations, a special kind of query, can create, update, or delete data.
At the backend, the definition of a mutation is just another type:
type Mutation {
  ...
}
… and the frontend query has to be prefixed with the mutation keyword:
mutation {
  ...
}
Let’s define our first mutation for Article
type Mutation {
  updateArticle(id: ID!, title: String, url: String): Article
}
… and the client could simply execute:
mutation {
updateArticle(id: "cx2a", title: "New title") {
id,
title
}
}
Input types
The special input type makes it possible to define groups of input parameters.
An input type describes an object that does not exist by itself but provides
logically required information.
input someName {
...
}
Once defined, input types could be used as a regular object type to
describe input arguments for mutations.
Let’s create an input type for a moderation action
that updates the status of a comment
input ModerationInput {
  status: String!
  note: String
}
… and then use it in the mutation
type Mutation {
  ...
  moderateComment(commentID: ID!, decision: ModerationInput!): Comment
}
Subscribe for updates
Subscription is a long‐lived request
that fetches data in response to source events.
Client creates a subscription by executing a special subscription query:
subscription someName {
...
}
To support this feature, the server has to be able to re-evaluate the query
and maintain a persistent channel to the client to push results back.
Different servers have different implementations of this feature.
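The push model behind subscriptions can be sketched without any network code: subscribers register interest, and the server pushes every matching source event to them. The Channel class and the event dictionaries below are assumptions of this illustration; real servers typically keep a WebSocket or a similar persistent connection per subscriber.

```python
class Channel:
    """Keeps subscriber callbacks and pushes each published event to the matching ones."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, predicate, callback):
        # predicate decides which source events this subscriber wants to receive
        self._subscribers.append((predicate, callback))

    def publish(self, event):
        for predicate, callback in self._subscribers:
            if predicate(event):
                callback(event)


received = []
channel = Channel()
channel.subscribe(lambda event: event["operation"] == "created", received.append)
channel.publish({"operation": "created", "comment": {"id": "1", "text": "Nice article"}})
channel.publish({"operation": "deleted", "comment": {"id": "1"}})
```

Only the "created" event reaches the subscriber; the "deleted" event is filtered out by the predicate.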
GraphQL IDE
https://github.com/graphql/graphiql
GraphiQL /ˈɡrafək(ə)l/ is a graphical, interactive, in-browser GraphQL IDE
syntax highlighting, autocomplete
documentation explorer
query history using local storage
supports full GraphQL Language Specification
Queries are parsed, validated and executed.
The structure comes to life with a concrete implementation
that determines the server’s behaviour.
Query is parsed into an abstract syntax tree (AST).
AST is validated against the schema.
GraphQL does not just verify if a request is syntactically correct, but also ensures
that it is unambiguous and mistake‐free in the context of a given GraphQL schema.
Runtime walks through the AST, starting from the root of the tree,
collects data and returns the response.
Resolvers
A resolver is a function responsible for collecting the actual data.
Resolvers exist for each field and type in the schema
and provide the instructions for turning a GraphQL operation
(a query, mutation, or subscription) into data.
Resolvers can get data from any sources
(other APIs, databases, etc.) and can be asynchronous.
If a resolver is not specified, GraphQL servers use default resolvers
(they just look in the root for a property with the same name as the field)
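A simplified sketch of this behaviour in plain Python follows. The names default_resolver and execute, the (type, field) keyed resolver map, and the selection format are assumptions of the sketch; real servers work on a parsed AST and a typed schema.

```python
def default_resolver(parent, field_name):
    """Look up a dict key or attribute named like the field on the parent value."""
    if isinstance(parent, dict):
        return parent.get(field_name)
    return getattr(parent, field_name, None)


def execute(parent, selection, resolvers, type_name):
    """Resolve every selected field, falling back to the default resolver."""
    result = {}
    for field_name, (child_type, sub_selection) in selection.items():
        resolver = resolvers.get((type_name, field_name))
        value = resolver(parent) if resolver else default_resolver(parent, field_name)
        if value is None or sub_selection is None:
            # Leaf value or null: nothing further to walk.
            result[field_name] = value
        elif isinstance(value, list):
            result[field_name] = [
                execute(item, sub_selection, resolvers, child_type) for item in value
            ]
        else:
            result[field_name] = execute(value, sub_selection, resolvers, child_type)
    return result


# Hypothetical data: comments live in a separate store keyed by article id.
COMMENTS = {"1": [{"id": "10", "text": "Nice article"}]}
resolvers = {("Article", "comments"): lambda article: COMMENTS.get(article["id"], [])}
article = {"id": "1", "title": "My article"}
selection = {
    "title": (None, None),
    "comments": ("Comment", {"text": (None, None)}),
}
result = execute(article, selection, resolvers, "Article")
```

Only comments has an explicit resolver here; title and text are served by the default resolver.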
Resolvers can return objects, scalars (e.g., Strings, Numbers) or null
If an object is returned, execution continues to the next child field.
If a scalar is returned (typically at a leaf node), execution completes.
If null is returned, execution halts and does not continue.
While implementing resolvers it’s easy to fall into the trap of overfetching and
the n+1 problem, so it is advisable to use dedicated libraries that optimize this process.
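The n+1 problem and the batching idea that such libraries rely on can be sketched with an in-memory data source. Everything below (the COMMENTS list, query_log, and both fetch functions) is hypothetical and only simulates round trips to a database.

```python
from collections import defaultdict

COMMENTS = [
    {"id": "10", "article_id": "1", "text": "Nice article"},
    {"id": "11", "article_id": "1", "text": "Thanks"},
    {"id": "12", "article_id": "2", "text": "Interesting"},
]
query_log = []  # records every simulated round trip to the data source


def fetch_comments_for(article_id):
    """Naive resolver: one round trip per article."""
    query_log.append(("one", article_id))
    return [c for c in COMMENTS if c["article_id"] == article_id]


def fetch_comments_for_many(article_ids):
    """Batched variant: one round trip for all articles, results in input order."""
    query_log.append(("batch", tuple(article_ids)))
    grouped = defaultdict(list)
    for comment in COMMENTS:
        if comment["article_id"] in article_ids:
            grouped[comment["article_id"]].append(comment)
    return [grouped[article_id] for article_id in article_ids]


article_ids = ["1", "2", "3"]
naive = [fetch_comments_for(a) for a in article_ids]
naive_queries = len(query_log)

query_log.clear()
batched = fetch_comments_for_many(article_ids)
batched_queries = len(query_log)
```

Both approaches return the same data, but the naive one issues one lookup per article while the batched one issues a single lookup for the whole list.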
Graphene
Graphene-Python is a library for building GraphQL APIs in Python
Integrations with different frameworks:
Django: Graphene-Django
SQLAlchemy: Graphene-SQLAlchemy
Google App Engine: Graphene-GAE
https://graphene-python.org/
It uses a code-first approach to building GraphQL APIs, so instead of
writing GraphQL Schema Definition Language (SDL), developers write
Python code to describe the data provided by the server.
Let’s convert our schema into Python with graphene
type Article {
  id: ID!
  title: String!
  url: String!
  comments: [Comment!]!
}
import graphene

class Article(graphene.ObjectType):
    id = graphene.ID(required=True)
    title = graphene.String(required=True)
    url = graphene.String(required=True)
    # Comment is defined later, so it is referenced lazily via a lambda
    comments = graphene.NonNull(graphene.List(graphene.NonNull(lambda: Comment)))
Let’s convert our schema into Python with graphene
type Comment {
  id: ID!
  text: String!
  createdAt: DateTime!
  moderationStatus: Status!
  moderatorNote: String
  article: Article!
}
enum Status {
  APPROVED
  REJECTED
  PENDING
}
# The enum is a type of its own; the field below refers to it
Status = graphene.Enum(
    "Status", [("APPROVED", "approved"), ("REJECTED", "rejected"), ("PENDING", "pending")]
)

class Comment(graphene.ObjectType):
    id = graphene.ID(required=True)
    text = graphene.String(required=True)
    createdAt = graphene.DateTime(required=True)
    moderationStatus = Status(required=True)
    moderatorNote = graphene.String(required=False)
    article = graphene.Field(Article, required=True)
… and the query
type Query {
  allArticles(N: Int): [Article!]!
}
class Query(graphene.ObjectType):
    # The inner NonNull makes the items non-nullable, matching [Article!]!
    allArticles = graphene.NonNull(graphene.List(graphene.NonNull(Article)), N=graphene.Int())
At the end, everything is wrapped in a Schema:
schema = graphene.Schema(query=Query)
Fetching Data
Only one endpoint has to be exposed.
For instance, using Django we can simply register
our /graphql endpoint with Graphene-Django:
from django.urls import re_path  # django.conf.urls.url was removed in Django 4.0
from graphene_django.views import GraphQLView
from myapp.schema import schema

urlpatterns = [
    re_path(r'^graphql$', GraphQLView.as_view(graphiql=True, schema=schema)),
]
The parameter graphiql=True enables the built-in GraphiQL interface
If Articles and Comments are modeled in Django
import graphene
from myapp.models import Article as DjangoArticle
from myapp.models import Comment as DjangoComment

class Article(graphene.ObjectType):
    id = graphene.ID(required=True)
    title = graphene.String(required=True)
    url = graphene.String(required=True)
    comments = graphene.NonNull(graphene.List(graphene.NonNull(Comment)))

    def resolve_comments(article, info):
        # the first argument is the parent object, here the Article being resolved
        return DjangoComment.objects.filter(article=article)
In graphene, a resolver is named resolve_ followed by the field’s name
… and the query
class Query(graphene.ObjectType):
    allArticles = graphene.NonNull(graphene.List(graphene.NonNull(Article)), N=graphene.Int())

    def resolve_allArticles(self, info, N=None):
        # N is optional in the schema, so it needs a default here
        articles = DjangoArticle.objects.all()
        if N is not None:
            articles = articles[:N]
        return articles
With Graphene-Django it is even easier:
from graphene_django import DjangoObjectType

class Article(DjangoObjectType):
    class Meta:
        model = DjangoArticle

class Query(graphene.ObjectType):
    allArticles = graphene.NonNull(graphene.List(graphene.NonNull(Article)), N=graphene.Int())

    def resolve_allArticles(self, info, N=None):
        articles = DjangoArticle.objects.all()
        if N is not None:
            articles = articles[:N]
        return articles
We do not need to write resolvers for each field in the model.
Resolvers are very flexible
Let's imagine now that
Comments are stored in Elasticsearch, while Articles are in PostgreSQL,
and we operate the GraphQL server with Django
import graphene
from graphene_django import DjangoObjectType
from myapp.models import Article as DjangoArticle
from myapp.es import es_client

class Article(DjangoObjectType):
    class Meta:
        model = DjangoArticle

    comments = graphene.NonNull(graphene.List(graphene.NonNull(Comment)))

    def resolve_comments(article, info):
        return es_client.search(article_id=article.id)  # Query Elasticsearch
Mutations are defined in a very similar way
class CreateComment(graphene.Mutation):
    class Arguments:
        articleId = graphene.ID(required=True)
        text = graphene.String(required=True)
        moderationStatus = graphene.String(required=True)

    comment = graphene.Field(lambda: Comment)

    @staticmethod
    def mutate(root, info, **kwargs):
        article = DjangoArticle.objects.get(pk=kwargs.pop("articleId"))
        # objects.create() builds the row and saves it in one step
        comment = DjangoComment.objects.create(article=article, **kwargs)
        return CreateComment(comment=comment)
… and then easily merged into the Schema
class Mutation(graphene.ObjectType):
    createComment = CreateComment.Field()

schema = graphene.Schema(
    query=Query,
    mutation=Mutation,
)
Input types
Read more about how to use Input types in mutations:
http://docs.graphene-python.org/en/latest/types/mutations/
# The enum is a type of its own; the input field below refers to it
Status = graphene.Enum(
    "Status", [("APPROVED", "approved"), ("REJECTED", "rejected"), ("PENDING", "pending")]
)

class ModerationInput(graphene.InputObjectType):
    moderationStatus = Status(required=True)
    moderationNote = graphene.String(required=False)

class ModerateComment(graphene.Mutation):
    class Arguments:
        decision = ModerationInput(required=True)
    ...
Output types
To return an existing ObjectType instead of a mutation-specific type,
simply set the Output attribute to the desired ObjectType:
class CreateComment(graphene.Mutation):
    class Arguments:
        articleId = graphene.ID(required=True)  # needed by mutate() below
        text = graphene.String(required=True)
        moderationStatus = graphene.String(required=True)

    Output = Comment

    @staticmethod
    def mutate(root, info, **kwargs):
        article = DjangoArticle.objects.get(pk=kwargs.pop("articleId"))
        # objects.create() builds the row and saves it in one step
        comment = DjangoComment.objects.create(article=article, **kwargs)
        return comment
Subscriptions in Python
The official library: https://github.com/graphql-python/graphql-ws
The alternative for Django is Graphene Subscriptions
https://github.com/jaydenwindle/graphene-subscriptions
"Graphene Subscriptions" uses Django Channels that could handle
subscriptions by broadcasting to connected sockets from signals.
https://blog.apollographql.com/how-to-use-subscriptions-in-graphiql-1d6ab8dbd74b
First, we connect signals for the model we want to create subscriptions for
from django.db.models.signals import post_save, post_delete
from graphene_subscriptions.signals import post_save_subscription
from graphene_subscriptions.signals import post_delete_subscription
from myapp.models import Comment
post_save.connect(post_save_subscription, sender=Comment, dispatch_uid="x_post_save")
post_delete.connect(post_delete_subscription, sender=Comment, dispatch_uid="x_post_delete")
It is also possible to listen to non-standard events that are defined within the application.
Then create a subscription. Let’s create it for a new comment
import graphene
from graphene_subscriptions.events import CREATED

class Subscription(graphene.ObjectType):
    # exposed as newComment, since graphene camel-cases field names by default;
    # the resolver name has to match the Python attribute name
    new_comment = graphene.Field(Comment)

    def resolve_new_comment(root, info):
        return root.filter(
            lambda event:
                event.operation == CREATED and
                isinstance(event.instance, DjangoComment)
        ).map(lambda event: event.instance)

schema = graphene.Schema(query=Query, subscription=Subscription)
Each resolver receives an `Observable` of `SubscriptionEvent`s (`root`), which emits a new
`SubscriptionEvent` each time one of the connected signals is fired.
Monolith Architecture
[Diagram: Monolith application with a /graphql endpoint, Data storage 1, Data storage 2]
Drawbacks:
lack of modularity
hard to scale
dangerous to change
Advantages:
easy to debug
simple to deploy
faster to develop
Microservice Architecture
Option 1. Resolvers access the data storage of other services directly
Drawbacks:
loss of data ownership
maintenance hell
permission issues
[Diagram: Service 1 with a /graphql endpoint, Service 2, Data storage 1, Data storage 2]
Option 2. Each service exposes its own GraphQL endpoint, and clients build
the response themselves by accessing these endpoints directly
Drawbacks:
hard to protect endpoints
client incorporates too much logic
Advantages:
correct ownership of data
[Diagram: Service 1 and Service 2, each with a /graphql endpoint, Data storage 1, Data storage 2]
Option 3. Each service exposes its own GraphQL endpoint, but the client communicates
with its own GraphQL backend, which is responsible for accessing those endpoints and
joining all sub-schemas into one schema available to the client
Advantages:
the client does not need to know about the various services
the client works with a single GraphQL schema
easy to protect endpoints
correct ownership of data
[Diagram: a client-facing /graphql backend, Service 1 and Service 2, each with a /graphql endpoint, Data storage 1, Data storage 2]
Federation
Federated architecture is a pattern
that allows interoperability and information sharing
between semi-autonomous decentralised services and applications
each service has its own lifecycle
each service can be scaled independently
services could use different technologies
easy to build a fault-tolerant system
Apollo Federation
A specification of federation for GraphQL services
https://www.apollographql.com/docs/apollo-server/federation/introduction/
Apollo Federation makes it possible to expose a single data graph that provides a
unified interface for querying all available data sources (services)
It allows a client to fetch data from all registered services
without knowledge of particular data ownership
At its core, Apollo Federation provides a way
to declaratively reference types that live in different schemas.
GraphQL types are likely to be interrelated, but not all available in each
particular service, thus we have to specify how to resolve them.
type SomeType @key(fields: "id") {
...
}
The service that owns the type has to provide the primary key
to convert the type into an entity that can be returned to other services.
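The role of the primary key can be sketched in plain Python: another service only holds a key-only reference, and the owning service turns it back into the full entity. The USERS store, the REFERENCE_RESOLVERS registry, and the function resolve_entities are assumptions of this sketch that loosely mirror how a federated gateway hands entity references to the owning service.

```python
# Hypothetical data owned by the service that defines User.
USERS = {
    "1": {"id": "1", "username": "alice"},
    "2": {"id": "2", "username": "bob"},
}

# One reference resolver per entity type: key fields in, full object out.
REFERENCE_RESOLVERS = {
    "User": lambda representation: USERS.get(representation["id"]),
}


def resolve_entities(representations):
    """Turn key-only references such as {'__typename': 'User', 'id': '2'} into entities."""
    results = []
    for representation in representations:
        resolver = REFERENCE_RESOLVERS[representation["__typename"]]
        entity = resolver(representation)
        if entity is not None:
            entity = {"__typename": representation["__typename"], **entity}
        results.append(entity)
    return results
```

A reference whose key is unknown to the owning service resolves to None, so the caller can still return a partial result.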
With federation, let’s split our monolith application from the previous examples,
assuming that we have two separate services: Articles and Comments.
In the Articles service we define Article as
type Article @key(fields: "id") {
  ...
}
… and in the Comments service we define Comment:
type Comment @key(fields: "id") {
  ...
}
Type Extension
It is possible to extend types from other services
As an example, consider a User service that has defined the User type:
type User @key(fields: "id") {
  id: ID!
  username: String!
}
The Comments service assigns users to each comment,
but the User service does not need to know about Comments:
extend type User @key(fields: "id") {
  id: ID! @external
  comments: [Comment!]
}
type Comment @key(fields: "id") {
  user: User!
  ...
}
Apollo Gateway
const { ApolloGateway } = require("@apollo/gateway");
const gateway = new ApolloGateway({
serviceList: [
{ name: "articles", url: "https://articles.example.com/graphql" },
{ name: "users", url: "https://users.example.com/graphql" },
{ name: "comments", url: "https://comments.example.com/graphql" }
]
});
Within the gateway we register all federated services.
Such a gateway composes the complete graph and, once requested,
executes the federated queries.
Apollo Server
const { ApolloServer } = require("apollo-server");
const server = new ApolloServer({
  gateway,
  subscriptions: false  // subscriptions are not supported with ApolloGateway
});
server.listen({ port: 3000 }).then(({ url }) => {
  console.log(`server is ready at ${url}`);
});
Apollo Server provides its own in-browser IDE that exposes
details of Apollo Federation
Graphene Federation by Preply
https://github.com/preply/graphene-federation
Graphene Federation provides an implementation of the Apollo Federation
specification on top of Python Graphene.
Supports the @key decorator to perform queries across service boundaries
@key(fields="id")
class SomeType(graphene.ObjectType):
    id = graphene.ID(required=True)

    def __resolve_reference(self, info, **kwargs):
        # find instance by self.id
        return instance
Allows extending remote types and referencing external fields
@extend(fields="id")
class SomeType(graphene.ObjectType):
    id = external(graphene.ID(required=True))
    name = graphene.String(required=True)
Let’s develop the previous example and design a User service in Python
import graphene
from graphene_federation import build_schema, key
from myapp.models import User as DjangoUser

@key(fields="id")
class User(graphene.ObjectType):
    id = graphene.ID(required=True)
    username = graphene.String(required=True)

    def __resolve_reference(self, info, **kwargs):
        return DjangoUser.objects.get(id=self.id)

schema = build_schema(types=[User])
… and in the Comment service we define User as external
import graphene
from graphene_federation import build_schema, key, external, extend

@extend(fields="id")
class User(graphene.ObjectType):
    id = external(graphene.ID(required=True))

class Comment(graphene.ObjectType):
    ...
    user = graphene.NonNull(User)

    def resolve_user(self, info):
        # placeholder id; a real resolver would use the id of the comment's author
        return User(id=42)
Authentication
Do I know this client?
Before checking the client’s permissions,
we have to figure out who the client is.
Use the standard HTTP authentication methods to put the current
client into the context for GraphQL endpoint.
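A minimal sketch of this step in plain Python, assuming bearer tokens and an in-memory token store: the HTTP layer identifies the client once and hands the result to all resolvers via the context. The TOKENS mapping and the function build_context are hypothetical names.

```python
# Hypothetical token store; a real service would verify tokens against its auth backend.
TOKENS = {"secret-token-1": {"id": "1", "username": "alice"}}


def build_context(headers):
    """Return the per-request context that is handed to every resolver."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    user = TOKENS.get(token) if scheme == "Bearer" else None
    return {"user": user}
```

Resolvers can then read context["user"] and treat None as an anonymous client.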
Authorization
REST uses "all or nothing" access control of the data, but GraphQL does not
https://blog.apollographql.com/authorization-in-graphql-452b1c402a9
Permissions on queries
Check if a particular query is allowed to be executed by the client
Permissions on edges
Check if a particular field is allowed to be returned to the client
Permissions on nodes
Check if a particular instance is allowed to be returned to the client
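The three levels can be sketched as three small checks in plain Python. The rules themselves (anonymous clients may only list articles, only moderators see moderatorNote and non-approved comments) and the is_moderator flag are assumptions invented for this illustration.

```python
def can_run_query(user, query_name):
    # Permission on the query: anonymous clients may only list articles.
    return user is not None or query_name == "allArticles"


def visible_fields(user, comment):
    # Permission on the edge: moderatorNote is only returned to moderators.
    hidden = set() if user and user.get("is_moderator") else {"moderatorNote"}
    return {name: value for name, value in comment.items() if name not in hidden}


def visible_comments(user, comments):
    # Permission on the node: non-moderators only see approved comments.
    if user and user.get("is_moderator"):
        return list(comments)
    return [c for c in comments if c["moderationStatus"] == "APPROVED"]
```

In a resolver, these checks would typically read the current client from the request context.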
What is the client allowed to see?
Access control
class Credentials(graphene.ObjectType):
    access_token = graphene.String(required=True)

@extend(fields="id")
class User(graphene.ObjectType):
    id = external(graphene.ID(required=True))
    credentials = graphene.Field(Credentials)

    def resolve_credentials(self, info):
        # check whether the client in the context has the required permissions
        # to see credentials. If not, return None
        return Credentials(access_token="secure token")
Do not expose endpoints from the federation directly,
since it will create a vulnerability.
Fault tolerance notes
In a microservice architecture, some services may be
unavailable due to failures or maintenance. When the API gateway cannot resolve
data from an external service because that service is down, it returns null (None in Python).
If we declare the external fields as "required",
the whole entity will not be resolved.
Instead, consider designing all external fields as optional
(required=False), which allows a partial object to be returned.
Write the frontend according to the concept of a fault-tolerant system.
Service Comment is up:
{
  "data": {
    "allArticles": [
      {
        "id": "1",
        "title": "My article",
        "comments": [
          {
            "id": "1",
            "text": "Nice article"
          }
        ]
      }
    ]
  }
}
Service Comment is down:
{
  "data": {
    "allArticles": [
      {
        "id": "1",
        "title": "My article",
        "comments": null
      }
    ]
  }
}
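On the consuming side, handling both responses could look like the following sketch. It is written in Python for consistency with the other examples, although a real frontend would more likely be JavaScript; the function render_article and the fallback message are assumptions.

```python
def render_article(article):
    """Build display lines for one article, tolerating a missing comments list."""
    lines = [article["title"]]
    comments = article.get("comments")
    if comments is None:
        # The Comment service was unavailable: degrade gracefully instead of failing.
        lines.append("Comments are temporarily unavailable")
    else:
        lines.extend("- " + comment["text"] for comment in comments)
    return lines
```

The article itself is still rendered when comments is null, which is the partial-object behaviour described above.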
Final notes
GraphQL is a good fit for complex systems
Solves most of the over- and underfetching issues of REST
Speeds up development
Always well documented