GraphQL API Vulnerabilities, Common Attacks & Security Tips

Developed in 2012 and made open source in 2015 by Facebook, GraphQL (Graph Query Language) has been under the umbrella of the GraphQL Foundation since 2019.

GraphQL is a query language, i.e. a language used to access data in a database or any other information system, in the same way as SQL (Structured Query Language).

GraphQL also comes with an SDL (Schema Definition Language), which is used to describe the schema of an API. GraphQL itself is a specification rather than a single piece of software: the various existing implementations (Apollo Server, Express GraphQL, graphql-yoga, etc.) follow that specification.

Like all APIs, GraphQL enables data to be transferred between a client and a server. It is an alternative to REST (Representational State Transfer) APIs. The main advantage of GraphQL is that it can provide all the data an application needs in a single request, delegating the task of structuring the data to the server.

In 2023, a survey conducted by Postman showed that GraphQL was the third most widely used API architecture. A pentester will therefore regularly be confronted with this type of API during pentests.

In this article, we will look at how GraphQL APIs work, the vulnerabilities and attacks common to this type of system, and the best practices and measures to implement to secure your systems. We will also look at the methodology and tools used during a GraphQL API pentest.

Comprehensive Guide to GraphQL APIs

Understanding the Structure of GraphQL: Schemas and Types
GraphQL API Testing Methodology
Discovering GraphQL Endpoints
What tools are used during a GraphQL API pentest?
What are the Most Common Vulnerabilities and Attacks on GraphQL APIs?
Common API Vulnerabilities and Attacks
How to Secure a GraphQL API?

Understanding the Structure of GraphQL: Schemas and Types

Before diving into exploiting the various vulnerabilities associated with this type of API, it is important to take the time to understand it.

From an auditor’s point of view, a thorough understanding of how GraphQL works is crucial.

We will therefore begin by detailing the structure and main concepts of GraphQL, before looking at the various possible attack vectors and exploitable vulnerabilities.

GraphQL schema

The schema is the central element of a GraphQL API. It is the fundamental structure, defining all possible interactions between the server and the client. For a given API, there is only one schema which acts as a reference.

The schema specifies the types of requests that a client can submit, the types of data that can be retrieved from the server, and the relationships between these different types of data.

All the data handled is organised by “type”. We are now going to look at the most commonly encountered types and those that will be the focus of our attention during a pentest.

Scalar types

The scalar type is the most elementary data type in GraphQL. It represents the primitive data assigned to the field:

String: a character string
Int: an integer
Float: a number with a decimal point
Boolean: a true/false value
ID: a unique identifier

The following example illustrates the use of scalar types to define the Book object, which has three fields of types String, String and Int respectively:

These are the building blocks for the more complex types we will see later.

Object types

Object types are the central construct for modelling complex data structures with GraphQL. The vast majority of types defined in a GraphQL API are objects.

An object is made up of fields, each with its own associated data type. This type can be a scalar (String, Int, etc.) for primitive data, or another object to represent nested data.

Other more advanced types such as enumerations, unions or interfaces can also be used, but we won’t go into them here.

Taking the previous example, Book is an object made up of scalar fields only.

We can imagine a more complex object, Library. This will allow us to introduce lists and non-nulls, and to discover a slightly more complex implementation of an object.

The notation [Type] indicates a list whose elements are of type Type. An exclamation mark marks a value as non-null: [Type!] is a list whose elements cannot be null, while [Type]! is a list that cannot itself be null (be careful not to confuse an empty list with a null list).
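To make the list and non-null notation concrete, here is a minimal Python sketch that renders a nested type reference in schema notation. It assumes the kind / name / ofType structure used by GraphQL introspection responses; the Book type comes from the example above, and the helper name render_type is purely illustrative.

```python
def render_type(type_ref):
    """Render an introspection-style type reference in schema notation."""
    kind = type_ref["kind"]
    if kind == "NON_NULL":
        # The exclamation mark applies to whatever is wrapped.
        return render_type(type_ref["ofType"]) + "!"
    if kind == "LIST":
        return "[" + render_type(type_ref["ofType"]) + "]"
    return type_ref["name"]


book = {"kind": "OBJECT", "name": "Book", "ofType": None}
non_null_book = {"kind": "NON_NULL", "name": None, "ofType": book}
list_of_books = {"kind": "LIST", "name": None, "ofType": non_null_book}
non_null_list = {"kind": "NON_NULL", "name": None, "ofType": list_of_books}

print(render_type(list_of_books))  # [Book!]  : elements cannot be null
print(render_type(non_null_list))  # [Book!]! : neither the list nor its elements can be null
```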

Objects are essential building blocks in GraphQL, enabling data models to be structured.

Special types: queries and mutations

While most of the types defined in a GraphQL schema are objects, there are also special root types. We will focus on two of them: queries and mutations (a third, subscriptions, is not covered here).

These are the types we will use to retrieve or modify data.

The Query type

A Query is used to retrieve data from the GraphQL server. It follows the founding principle of GraphQL by returning only the fields explicitly requested in the query.

Here is an example of a basic query to retrieve certain fields from the Book object:

Here we have a very basic query, called GetLibrary, where we explicitly request certain fields. In response, we’ll get only the fields requested.

Queries can be much more complex by adding arguments, variables, fragments, directives and so on. We won’t go into these details here.

The Mutation type

Mutations, on the other hand, can be used to modify data on the server side (create, update, delete). A Mutation can also return data in its response.

Let’s say we wanted to add a new book to our library. We could use a mutation like this:

Example of an “AddBookInLibrary” mutation

This mutation allows us to add a book to a library. AddBookInLibrary specifies the expected fields for this mutation, and addBook is the field that will be executed.

In return, we request the title field, which will be returned in the response if the request is successful.

Like Queries, Mutations can be very complex and nested as required. But this simple example illustrates their main role of modifying data.
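As an illustration, the sketch below shows how a query and a mutation are typically packaged into the JSON body of an HTTP POST request. The operation names GetLibrary and AddBookInLibrary and the fields addBook and title come from the text; the other field names (library, name, books, author, pages) and the variable names are assumptions made for the example, not the schema used in the original figures.

```python
import json


def build_body(query, variables=None, operation_name=None):
    """Build the JSON body of a GraphQL request sent over HTTP POST."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    if operation_name is not None:
        body["operationName"] = operation_name
    return json.dumps(body)


# Hypothetical query: only the fields listed here are returned.
GET_LIBRARY = """
query GetLibrary {
  library {
    name
    books { title author }
  }
}
"""

# Hypothetical mutation: AddBookInLibrary declares the expected variables,
# addBook is the field that is executed, title is requested in return.
ADD_BOOK = """
mutation AddBookInLibrary($title: String!, $author: String!, $pages: Int) {
  addBook(title: $title, author: $author, pages: $pages) {
    title
  }
}
"""

query_body = build_body(GET_LIBRARY, operation_name="GetLibrary")
mutation_body = build_body(
    ADD_BOOK,
    variables={"title": "Dune", "author": "Frank Herbert", "pages": 612},
    operation_name="AddBookInLibrary",
)
```

Passing values through variables rather than concatenating them into the query string keeps the document itself constant, which also makes requests easier to replay during a test.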

We have now covered the basics of GraphQL. A thorough understanding of these concepts is essential to mastering this language and optimising penetration tests on APIs using it.

GraphQL API Testing Methodology

Before getting to the heart of the matter, it is important to remember that the pentesting of a GraphQL API generally follows the same rules as for other types of API.

The testing methodology will therefore be similar. However, as GraphQL has its own specificities, certain stages and attack vectors are unique.

A reconnaissance phase is necessary to assess the attack surface, unless the client provides full documentation of the API. We will describe the tools and techniques available to us below.

Once we have this information, we’ll move on to the vulnerability identification part. Here we will test vulnerabilities common to all APIs (injections, access control flaws, exposure of accessible data, etc.) as well as vulnerabilities specific to GraphQL.

To find out more about the objectives and testing methodology of an API pentest, we refer you to our dedicated article: API Penetration Testing: Objective, Methodology, Black Box, Grey Box and White Box Tests.

Discovering GraphQL Endpoints

The first crucial step in pentesting a GraphQL API is to discover its endpoint. This is not always obvious, particularly if the API is only used for certain functions or for particular roles.

So we’re going to fuzz our target to find the endpoint, using a wordlist such as the one provided by SecLists, and sending the query body “query{__typename}” to each candidate path.

If we get a response containing {"data":{"__typename":"Query"}}, then this will confirm the presence of a GraphQL API on the URL under test.
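The probing step described above can be sketched in a few lines of Python using only the standard library. The candidate paths below are a hypothetical short list standing in for a real wordlist, and the function names are illustrative.

```python
import json
import urllib.error
import urllib.request

PROBE = json.dumps({"query": "query{__typename}"}).encode()

# Hypothetical candidates; a real test would load a full wordlist.
CANDIDATE_PATHS = ["/graphql", "/api/graphql", "/v1/graphql", "/graphiql"]


def looks_like_graphql(response_text):
    """Return True if the body is a JSON object with data.__typename."""
    try:
        payload = json.loads(response_text)
    except ValueError:
        return False
    if not isinstance(payload, dict):
        return False
    data = payload.get("data")
    return isinstance(data, dict) and "__typename" in data


def probe(url, timeout=5.0):
    """POST the probe query to one URL and inspect the response."""
    request = urllib.request.Request(
        url,
        data=PROBE,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError, ValueError):
        return False
    return looks_like_graphql(body)


def find_endpoints(base_url, paths=CANDIDATE_PATHS):
    """Return the candidate paths that answer like a GraphQL endpoint."""
    return [path for path in paths if probe(base_url.rstrip("/") + path)]
```

Calling find_endpoints with the base URL of the target returns the paths that answered the probe. In this sketch, endpoints that reply with an HTTP error status are simply treated as misses.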

Alternatively, simply using the application as a legitimate user and observing the requests it sends will often reveal the GraphQL endpoint.

Once the endpoint has been discovered, we can begin the schema discovery and enumeration.

What tools are used during a GraphQL API pentest?

During a GraphQL API pentest, auditors can rely on various tools to facilitate their task at different stages.

Here is an overview of the main tools you need to know about:

Introspection

Introspection is not a tool, but a GraphQL feature. It is used to retrieve the complete schema of an API, which defines its data structure.

We will use this introspection request:

{__schema{queryType{name}mutationType{name}subscriptionType{name}types{...FullType}directives{name description locations args{...InputValue}}}}fragment FullType on __Type{kind name description fields(includeDeprecated:true){name description args{...InputValue}type{...TypeRef}isDeprecated deprecationReason}inputFields{...InputValue}interfaces{...TypeRef}enumValues(includeDeprecated:true){name description isDeprecated deprecationReason}possibleTypes{...TypeRef}}fragment InputValue on __InputValue{name description type{...TypeRef}defaultValue}fragment TypeRef on __Type{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name ofType{kind name}}}}}}}}

If enabled, this request will return a JSON response detailing all the types, fields, arguments, etc. defined in the schema.

Response containing the introspection schema

Having access to the complete schema is extremely useful, and enables the enumeration phase to be completed.

Once in our possession, it is possible to identify potentially sensitive areas or data that should not be exposed publicly.
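As a rough illustration of how an introspection response can be skimmed for areas of interest, the following Python sketch lists the fields of each user-defined type. The sample response is a hand-written, heavily truncated assumption that only mimics the shape of a real introspection result.

```python
def summarise_schema(introspection):
    """Map each user-defined type name to the list of its field names."""
    schema = introspection["data"]["__schema"]
    summary = {}
    for type_def in schema["types"]:
        name = type_def["name"]
        if name.startswith("__"):
            continue  # skip the built-in introspection types
        fields = type_def.get("fields") or []
        if fields:
            summary[name] = [field["name"] for field in fields]
    return summary


# Truncated, hypothetical response used only to exercise the function.
sample = {
    "data": {
        "__schema": {
            "queryType": {"name": "Query"},
            "types": [
                {"name": "Query", "fields": [{"name": "pastes"}, {"name": "systemHealth"}]},
                {"name": "String", "fields": None},
                {"name": "__Type", "fields": [{"name": "kind"}]},
            ],
        }
    }
}

for type_name, field_names in summarise_schema(sample).items():
    print(type_name, field_names)
```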

However, for security reasons, introspection is generally disabled on GraphQL APIs in production (in theory).

Clairvoyance

When schema introspection is disabled on a GraphQL API, the Clairvoyance tool can be used as an alternative to attempt to reconstruct the schema.

It works by sending a large number of requests built from a wordlist and relying on GraphQL’s suggestions feature: when a requested field name is close to an existing one, the error message may suggest the valid name. It should be noted that if the server does not return suggestions, this tool becomes ineffective.

By analysing these responses, Clairvoyance is able to reconstruct part of the API schema in the form of JSON.

Although less complete than introspection, this technique provides a good overview of the schema.
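The principle can be illustrated with a short Python sketch that extracts suggested names from an error message. The message format used here is an assumption modelled on the wording produced by some JavaScript implementations; other engines phrase their suggestions differently, and this is not how Clairvoyance itself is implemented.

```python
import re

# Assumed message shape: ... Did you mean "a" or "b"?
SUGGESTION = re.compile(r"Did you mean (.+?)\?")
NAME = re.compile(r'"([_A-Za-z][_0-9A-Za-z]*)"')


def extract_suggestions(error_message):
    """Return the names suggested by the server, if any."""
    match = SUGGESTION.search(error_message)
    if not match:
        return []
    return NAME.findall(match.group(1))


message = 'Cannot query field "titel" on type "Book". Did you mean "title" or "titles"?'
print(extract_suggestions(message))  # ['title', 'titles']
```

Repeating this for every word of a wordlist, and then for every discovered type, is what allows part of the schema to be rebuilt without introspection.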

GraphQL Voyager

GraphQL Voyager is a valuable tool that can be used to visualise the schema of a GraphQL API, based on the schema that we were able to retrieve using introspection or Clairvoyance.

It is possible to retrieve an introspection query from its interface.

Once the schema has been loaded, the tool will generate a graphical representation of the schema structure. All types, fields and relationships are displayed, making it much easier to understand the API architecture.

Graphic representation of the introspection schema

A panel at the bottom left can be used to explore the list of available queries, mutations and subscriptions in greater detail.

In short, GraphQL Voyager is very useful for quickly identifying areas of interest to investigate.

Postman

Postman is a tool initially designed for developing and testing APIs. Its use during an audit can prove to be a valuable asset, facilitating the repetition of requests with its intuitive interface.

Once the GraphQL endpoint has been configured in Postman, it will automatically perform an introspection request and generate interactive documentation of the API. All types, queries, mutations and subscriptions are listed.

The dedicated interface lets you build and replay queries by selecting the required fields and arguments. Variables can also be easily defined.

Interface listing all available operations

The work of an auditor can therefore be greatly simplified by using this tool for an API pentest.

InQL

InQL is an extension to the Burp Suite proxy designed for testing GraphQL APIs. In terms of features, it is similar to Postman. It can be installed from the Burp BApp Store, in the Extensions section.

It is possible to perform an introspection request to retrieve the schema in two different ways.

The first, directly from a GraphQL request in the Proxy:

Generation of the introspection query from the Proxy

And the second, from the InQL tab:

Generation of the introspection query from the extension interface

Whichever method we use, we can then access the list of queries and mutations from the extension interface. It lists the parameters required and the fields available.

We can then send these queries directly to the Repeater panel, in order to run our tests on the chosen routes.

graphql-cop

graphql-cop checks for the main vulnerabilities likely to be present in a GraphQL API. It is an open source tool available on GitHub. However, it has not been maintained since 2022.

It is very easy to use:

By simply specifying the URL of our target, the tool will attempt to find the default graphql paths. Note that the list of paths is very limited:

It is therefore preferable not to rely on this tool to find the graphql endpoint.

A series of tests will then be carried out, including:

Introspection
Suggested fields
Batched queries
Alias overload
And many more.

Once this has been done, we can look at the list of vulnerabilities affecting our target.

graphw00f

The last tool we will discuss here is graphw00f. As mentioned previously, GraphQL has no implementation of its own and requires an execution engine to work.

This is where the graphw00f tool comes in. It works by sending specially designed GraphQL requests to the API and will be able to identify the engine thanks to unique signatures present in the errors returned or in the metadata.

From there, we can refer to this table, which identifies potential vulnerabilities depending on the implementation:

List of potential vulnerabilities linked to the GraphQL implementation

As mentioned with graphql-cop, the list of endpoints to find the path is limited:

It is therefore preferable to identify the endpoint before running the tool, or to specify a custom wordlist using the -w flag. We’ll use -t to specify the URL of our target, and we’ll get output like this on successful identification:

In this example, the Graphene engine has been identified.

What are the Most Common Vulnerabilities and Attacks on GraphQL APIs?

At this point, the enumeration of our target is complete, and we now need to move on to the phase of testing the available routes.

Many tests are carried out during our audits, and we won’t present them all here, but we’ll look at some of them to better understand how they work.

Some of these tests will be directly linked to the architecture and operation of GraphQL, while others will concern the APIs in general.

Denial of Service (DoS) attacks

DoS attacks aim to overload the targeted server in order to make it much slower or even inaccessible for the duration of the attack.

This has a major impact on the experience of other users, preventing them from using it, and also damaging the company’s image.

GraphQL APIs can be particularly vulnerable to this type of attack if there is no limit to the depth of the queries.

To illustrate this, consider the following image:

When we make a getPaste request referencing an owner, itself referencing pastes, and so on, this request can become extremely heavy for the server. Such a structure risks compromising the smooth running of the application.

The number of objects that the server has to return grows exponentially with the depth of the query, which causes the overload.
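To give an idea of the growth, here is a small Python sketch that builds a circular query of a given depth and estimates the number of objects returned. The field names pastes, owner and id are assumptions based on the description above, and the estimate assumes that every owner has the same number of pastes and every paste has exactly one owner.

```python
def build_nested_query(depth):
    """Build a query alternating pastes and owner down to the given depth."""
    fields = ["pastes", "owner"]
    opening = "".join(fields[level % 2] + "{" for level in range(depth))
    return "query{" + opening + "id" + "}" * depth + "}"


def estimate_objects(depth, pastes_per_owner):
    """Estimate how many objects the server must resolve for that query."""
    total = 0
    current = 1
    for level in range(depth):
        if level % 2 == 0:
            # A pastes level fans out: every owner returns all of its pastes.
            current *= pastes_per_owner
        total += current
    return total


print(build_nested_query(3))      # query{pastes{owner{pastes{id}}}}
print(estimate_objects(4, 10))    # 220
print(estimate_objects(10, 10))   # 222220
```

With ten pastes per owner, every two additional levels multiply the work by ten, which is why a depth limit is one of the first protections to put in place.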

We can see that the server was momentarily overloaded, which led to an extremely long response (almost 12 seconds). During this time, the server was unavailable.

This was because the query was particularly demanding in terms of resources. An even deeper query would have rendered the server unavailable for a longer period of time, and could even have caused a complete shutdown.

It should be borne in mind that these tests were carried out locally on our own machine, which explains the significant impact observed. As a general rule, corporate servers are more robust and can handle several simultaneous requests (in theory).

However, let’s consider a scenario where this attack is launched from several machines at the same time; this could lead to the same consequences.

Batched queries and aliases

Batched queries and aliases are another aspect that can create vulnerabilities in GraphQL. Although they can be considered from the point of view of denial of service, in this example we are going to use them in a brute force attack scenario.

Let’s imagine that a client has set up an HTTP request limit on a login form, allowing only 10 requests per second to be sent from the same IP address. This measure is designed to counter brute force attacks on the login form.

If batched queries or aliases are authorised, this will not be enough to stop the attacker. Attackers could send a multitude of requests in a single HTTP request, thus circumventing the limitation and succeeding in their attack.

Batched queries, sending 100 login attempts with a single HTTP request

Mutation using aliases, sending 100 login requests with a single HTTP request

This method considerably speeds up a brute force attack. By sending 100 requests, each containing 100 aliases or batched queries, the attacker already makes 10,000 authentication attempts.

Packing N operations into a single HTTP request multiplies the number of attempts permitted by the rate limit by N, which makes a limit counted in HTTP requests largely ineffective.
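The two techniques can be sketched in Python as follows. The login mutation, its username and password arguments and the token field are hypothetical names chosen for the example; a real target will expose its own.

```python
import json


def build_aliased_mutation(username, passwords):
    """One mutation document containing one aliased login per password."""
    attempts = []
    for index, password in enumerate(passwords):
        # json.dumps is used as an approximation of GraphQL string escaping.
        attempts.append(
            "attempt%d: login(username: %s, password: %s) { token }"
            % (index, json.dumps(username), json.dumps(password))
        )
    return "mutation {\n  " + "\n  ".join(attempts) + "\n}"


def build_batch(username, passwords):
    """One JSON array containing one complete operation per password."""
    query = (
        "mutation Login($u: String!, $p: String!) "
        "{ login(username: $u, password: $p) { token } }"
    )
    return json.dumps(
        [{"query": query, "variables": {"u": username, "p": p}} for p in passwords]
    )


passwords = ["password%d" % i for i in range(100)]
aliased = build_aliased_mutation("admin", passwords)
batched = build_batch("admin", passwords)
```

Either value fits in the body of a single HTTP request while carrying 100 login attempts, so a limit of 10 HTTP requests per second still lets through 1,000 attempts per second.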

Broken Authentication & Authorization

It is common practice to use whitelists or blacklists to authorise or prohibit users from making GraphQL requests. However, this approach can be vulnerable if it is not implemented correctly.

Consider a GraphQL API that exposes a systemHealth request that checks the state of the server. This request requires authentication as an administrator. Without a valid authentication token, the request is rejected, which is expected behaviour.

SystemHealth request rejected without authentication

However, it is possible to bypass this restriction by using an authorised operation name without an authentication token, while querying the systemHealth field which should be protected.

For example, suppose an unauthenticated user is authorised to execute a getPastes request. An attacker could then send the following request:

In this request, the operation name getPastes is authorised for an unauthenticated user. However, the attacker has also included the systemHealth field, which they should not have access to.

The problem is that the authorisation check is performed only on the operation name, which is an arbitrary label chosen by the client, while the fields actually requested are not checked individually.
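The difference between the two checks can be modelled with a short Python sketch. The names getPastes and systemHealth come from the example above; the functions and the simplified request representation are assumptions, and a real server would take the list of requested fields from the parsed query rather than from the client.

```python
PUBLIC_OPERATIONS = {"getPastes"}     # operation names allowed without a token
PROTECTED_FIELDS = {"systemHealth"}   # fields reserved for administrators


def flawed_check(operation_name, is_admin):
    """Flawed: trusts the operation name, a label chosen by the client."""
    return is_admin or operation_name in PUBLIC_OPERATIONS


def field_level_check(requested_fields, is_admin):
    """Safer: authorises every field that is actually requested."""
    return is_admin or not (set(requested_fields) & PROTECTED_FIELDS)


# A request of the kind described above: an allowed name, a protected field.
attack = {
    "operationName": "getPastes",
    "query": "query getPastes { systemHealth }",
}
requested_fields = ["systemHealth"]  # what the document really asks for

print(flawed_check(attack["operationName"], is_admin=False))  # True: bypass
print(field_level_check(requested_fields, is_admin=False))    # False: rejected
```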

Common API Vulnerabilities and Attacks

Although GraphQL introduces specific vulnerabilities, it should not be forgotten that a GraphQL API is fundamentally a web API.

As such, it can be exposed to the same types of vulnerabilities as other types of APIs. We are going to review 3 vulnerabilities likely to be encountered during an audit.

Stored XSS (Cross-Site Scripting)

A stored Cross-Site Scripting (XSS) vulnerability can have serious consequences. It allows an attacker to inject malicious JavaScript code that will be stored server-side, usually in a database. This code will then be executed in the browser of any user who visits a page where the stored value is displayed.

The consequences can range from web page defacement to user account theft.

To find out more about XSS vulnerabilities, please see our article: XSS (Cross-Site Scripting) vulnerabilities: principles, types of attacks, exploitations and security best practices.

Let’s take the example of a forum platform where users can publish messages. These messages are then retrieved and displayed on the web application’s public pages.

The expected use of this feature would look like this:

Now let’s add a malicious user, who decides to test this application and posts a malicious message. We can see that the comment is sent using the API, but we can also see that it is stored without having validated the user’s input.

The malicious content is then retrieved and displayed publicly without encoding the special characters, so the JavaScript code is interpreted and executed, leaving legitimate users of the site vulnerable:

Our injection is triggered by visiting the publicly exposed page
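The missing step is output encoding at display time. A minimal Python sketch, assuming the page is assembled server-side and that render_comment is a hypothetical rendering helper:

```python
import html


def render_comment(comment):
    """Encode special characters before inserting the comment into HTML."""
    return '<p class="comment">' + html.escape(comment, quote=True) + "</p>"


payload = "<script>alert(1)</script>"
print(render_comment(payload))
# <p class="comment">&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Once encoded, the payload is displayed as text instead of being interpreted by the browser. Template engines and front-end frameworks usually perform this encoding by default, provided it is not explicitly bypassed.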

Arbitrary File Upload & Path Traversal

Another type of vulnerability that can be encountered during an API pentest is an arbitrary file upload combined with a path traversal.

If the right conditions are met, this vulnerability can have disastrous consequences, allowing malicious files to be written to arbitrary locations on the server.

During our exploration we discovered a mutation that allows files to be uploaded to the server. This mutation expects 2 input fields, the name of the file and its content, and will return the result.

UploadPaste mutation, visualised with GraphQL Voyager

Uploading the MyFile file with a mutation

A classic upload works perfectly normally, and a response is sent back with the contents of the file that has been uploaded.

If you take a look at what’s happening on the server side, you’ll see that everything went well, and you’ll find this new file at /opt/dvga/pastes:

By changing the filename to “../../../tmp/pwn”, the query still works but we see that the file is not present in the expected path. Instead, we can find it in the /tmp folder.

If we look at the source code responsible for saving our file, we can see that there is no check on the file name, which explains its presence in the other folder.
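A check of the kind that is missing could look like the following Python sketch. The directory /opt/dvga/pastes comes from the example above; the function name is hypothetical and this is not the code of the audited application.

```python
import os

PASTE_DIR = "/opt/dvga/pastes"


def safe_paste_path(filename, base_dir=PASTE_DIR):
    """Resolve the target path and refuse anything outside base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # After resolution, the file must still be located under the base directory.
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError("invalid file name")
    return candidate


print(safe_paste_path("MyFile"))
try:
    safe_paste_path("../../../tmp/pwn")
except ValueError as error:
    print("rejected:", error)
```

Resolving the path before comparing it is what defeats sequences such as ../, absolute file names and symbolic links pointing outside the upload folder.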

Remote Command Execution (RCE)

One of the most critical vulnerabilities that can be found during a pentest is the execution of a command on a remote server or RCE. This can lead to a number of consequences, including data theft, privilege escalation, setting up a backdoor, etc.

We are going to use the importPaste mutation to exploit this vulnerability.

Originally, this functionality was designed to import data from a URL entered by the user, in order to save it on the server.

Such a feature must be implemented with particular care from a security point of view, and this is the kind of feature that auditors will be particularly interested in.

Here’s the implementation:

The corresponding GraphQL mutation expects several parameters: host, port, path and scheme. So a ‘well-formed’ query might look like this:

ImportPaste mutation requesting http://example.com:80

Now consider that a malicious user decides not to comply with what is expected, and replaces ‘/’ in the path parameter with ‘/; sleep 10’.

In this way, the final command executed by the server would become: “curl http://example.com:80/; sleep 10”, and both commands would be executed.

If we look at the server side, we can confirm this theory.

An attacker could thus attempt to escalate their privileges, pivot in the internal network, interrupt the smooth running of the application, etc.
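The root cause and one possible fix can be sketched in Python. This is an illustration under assumptions, not the code of the audited application: vulnerable_command models a command line built by string concatenation and handed to a shell, while safe_argv validates each parameter and builds an argument list that never goes through a shell. The accepted character sets are deliberately strict and hypothetical.

```python
import re
import subprocess

HOST_RE = re.compile(r"[A-Za-z0-9.-]+")
PATH_RE = re.compile(r"/[A-Za-z0-9._~/-]*")


def vulnerable_command(scheme, host, port, path):
    """String intended for a shell: a ';' in any parameter starts a new command."""
    return "curl %s://%s:%s%s" % (scheme, host, port, path)


def safe_argv(scheme, host, port, path):
    """Validate every parameter, then return an argument list (no shell)."""
    if scheme not in ("http", "https"):
        raise ValueError("invalid input")
    if not HOST_RE.fullmatch(host) or not PATH_RE.fullmatch(path):
        raise ValueError("invalid input")
    if not 0 < int(port) < 65536:
        raise ValueError("invalid input")
    return ["curl", "%s://%s:%d%s" % (scheme, host, int(port), path)]


def fetch(scheme, host, port, path):
    """Run curl without a shell; the URL is passed as a single argument."""
    argv = safe_argv(scheme, host, port, path)
    return subprocess.run(argv, capture_output=True, timeout=10, check=False).stdout


print(vulnerable_command("http", "example.com", 80, "/; sleep 10"))
# curl http://example.com:80/; sleep 10
```

Using an HTTP client library instead of an external command would remove the command injection risk altogether, although the destination would still need to be restricted.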

How to Secure a GraphQL API?

We have now taken a look at GraphQL, how it works and the main vulnerabilities that can be encountered during an audit.

As we have seen, GraphQL is not immune to the attack vectors impacting traditional web APIs, but also includes vulnerabilities specific to its implementation. Rigorous security measures must therefore be put in place.

General recommendations

Several general protections can be put in place on a GraphQL API.

Most GraphQL implementations ship with default settings that should be changed:

Disable introspection. This option is enabled by default in most implementations, but should be disabled unless it is considered necessary.
Disable GraphiQL. This is an in-browser IDE for exploring a GraphQL API, which is useful during the application development phase, but which should not be present in a production application unless it is considered necessary.
Disable field suggestions in error messages, to prevent an attacker from widening their attack surface. This is not possible on every implementation and may require a custom solution.

Preventing denial of service

GraphQL is particularly vulnerable to denial of service attacks if it is incorrectly configured. This type of attack impacts the availability and stability of the API, making it slower or even unavailable.

Here are a few recommendations that can be followed to protect against this type of attack:

Define a maximum depth for queries
Define a maximum limit on the amount of data that can be returned in a response
Apply a rate limit to incoming requests, by IP, by user or even both
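As an illustration of the first recommendation, the Python sketch below measures the nesting depth of a query by counting braces and rejects documents that exceed a limit. It is a simplification: it also counts braces used in input object arguments and does not expand fragments, so a production check should work on the parsed query, which most GraphQL servers or their plugins can do. The limit of 10 is an arbitrary assumption.

```python
MAX_DEPTH = 10  # arbitrary limit chosen for the example


def query_depth(query):
    """Return the deepest brace nesting, ignoring braces inside strings."""
    depth = deepest = 0
    in_string = False
    previous = ""
    for char in query:
        if char == '"' and previous != "\\":
            in_string = not in_string
        elif not in_string:
            if char == "{":
                depth += 1
                deepest = max(deepest, depth)
            elif char == "}":
                depth -= 1
        previous = char
    return deepest


def enforce_depth(query, max_depth=MAX_DEPTH):
    """Reject the query before execution if it is nested too deeply."""
    if query_depth(query) > max_depth:
        raise ValueError("query rejected")
    return query


print(query_depth("query{pastes{owner{pastes{id}}}}"))  # 4
```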

Preventing batching attacks

As we saw earlier, batching attacks can be used to carry out brute force attacks. To defend against this type of attack, it is necessary to impose limits on incoming requests:

Impose a limit on the maximum number of objects in the same request
Limit the number of queries that can be executed at the same time
Prevent object batching on sensitive queries/mutations

One solution may be to enforce, at code level, a limit on the number of objects a user can request at once. This way, if the limit is reached, the user’s other requests would be blocked, even if they were contained in the same HTTP request.
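A minimal Python sketch of such limits, applied to the raw HTTP body before execution, is shown below. The limits and the login field name are assumptions, and counting occurrences with a regular expression is only an approximation of what should be done on the parsed query.

```python
import json
import re

MAX_OPERATIONS_PER_REQUEST = 5   # batched operations allowed per HTTP request
MAX_SENSITIVE_CALLS = 1          # calls to a sensitive field per HTTP request
SENSITIVE_FIELD = re.compile(r"\blogin\s*\(")  # hypothetical sensitive mutation


def check_http_body(raw_body):
    """Raise ValueError if the body batches too much work in one request."""
    payload = json.loads(raw_body)
    operations = payload if isinstance(payload, list) else [payload]
    if len(operations) > MAX_OPERATIONS_PER_REQUEST:
        raise ValueError("request rejected")
    sensitive_calls = 0
    for operation in operations:
        if not isinstance(operation, dict):
            raise ValueError("request rejected")
        query = operation.get("query") or ""
        sensitive_calls += len(SENSITIVE_FIELD.findall(query))
    if sensitive_calls > MAX_SENSITIVE_CALLS:
        raise ValueError("request rejected")
```

Because the count covers the whole HTTP body, both aliases inside one operation and several operations in one array are caught by the same check.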

User input validation

The user input sent to the GraphQL API must be strictly validated. This input is often reused in multiple contexts, whether HTTP, SQL or other requests. If it is incorrectly validated, this could lead to injection vulnerabilities.

To validate user input, you can follow these recommendations:

Use GraphQL-specific types, such as scalars or enums. Add custom validators if you need to validate more complex data.
Use a whitelist of authorised characters according to the expected data
Reject invalid entries with non-verbose responses, so as to avoid revealing too much information about the operation of the API and its validation.
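These recommendations can be combined in a small validator such as the Python sketch below. The allowed character set, the length limit and the validate_title name are assumptions for a hypothetical book title argument; each field needs its own rule depending on the expected data.

```python
import re

# Whitelist for a hypothetical title argument: letters, digits and basic punctuation.
TITLE_RE = re.compile(r"[A-Za-z0-9 .,'-]{1,100}")


class InvalidInput(ValueError):
    """Raised with a deliberately non-verbose message."""


def validate_title(value):
    """Return the value unchanged if it matches the whitelist, else raise."""
    if not isinstance(value, str) or not TITLE_RE.fullmatch(value):
        # The message does not reveal which rule failed.
        raise InvalidInput("Invalid input")
    return value


print(validate_title("Dune"))
try:
    validate_title("<script>alert(1)</script>")
except InvalidInput as error:
    print(error)  # Invalid input
```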


Author : Théo ARCHIMBAUD – Pentester @Vaadata
