Hiding your AppSync GraphQL Introspection endpoint using AWS Web Application Firewall (WAF) rules

AWS AppSync is, in my opinion, one of the most underrated AWS services. I have found GraphQL APIs easier to manage at scale (hundreds of queries, mutations, and subscriptions), with less code, cognitive load, complexity, and toil, and fewer points of failure, than I did back in the day with REST APIs. The relative simplicity of a GraphQL API, especially when backed by a NoSQL data store, combined with the choice of resolvers in VTL (fast) or Lambda (more expressive) that interact with a variety of data sources, makes for a more streamlined developer experience.

Throw in the serious performance that AppSync can achieve, along with the tight integration with other AWS services (for example, directive-based access control on certain schema fields based on the user's Cognito groups), and it's clear AppSync is a huge step forward for developing and evolving web and app APIs.

The Introspection Endpoint

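The GraphQL specification defines an introspection system, and most GraphQL servers enable it by default. Strictly speaking it is not a separate endpoint: introspection queries are sent to the same /graphql URL as any other query, and they return metadata about the schema (its types, fields, and arguments) rather than your data. For brevity I'll keep calling it the introspection endpoint. AppSync allows these queries by default.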
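To make this concrete, here is a minimal sketch of what an introspection request body looks like. The helper name buildIntrospectionBody is hypothetical and only asks for type names, which is about the smallest useful probe; the __schema meta-field itself comes from the GraphQL specification.

```typescript
// Minimal introspection query: asks only for the names of all types.
// `__schema` is the meta-field defined by the GraphQL specification.
const INTROSPECTION_QUERY = `query IntrospectTypes {
  __schema {
    types {
      name
    }
  }
}`;

// Hypothetical helper: builds the JSON body of an introspection request.
function buildIntrospectionBody(): string {
  // GraphQL servers conventionally accept a JSON object with a "query" key.
  return JSON.stringify({ query: INTROSPECTION_QUERY, variables: {} });
}

console.log(buildIntrospectionBody());
```

The body is an ordinary GraphQL request; the only thing that marks it as introspection is the __schema field inside the query text.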

From the Apollo blog:

We believe that introspection should primarily be used as a discovery and diagnostic tool when we’re in the development phase of building out GraphQL APIs. While we don’t often use introspection directly, it’s important for tooling and GraphQL IDEs like Apollo Studio, GraphiQL, and Postman. Behind the scenes, GraphQL IDEs use introspection queries to power the clean user experience helpful for testing and diagnosing your graph during development.

In a recent penetration test for a client, the testers flagged that our introspection endpoint was queryable. I would assume most penetration testers would raise this as an issue, since introspection exposes the entire surface area of your GraphQL API in one place, which could help bad actors construct a more advanced, multi-faceted attack. Without it, attackers would have to laboriously work through your front-end apps and monitor network traffic, making it much harder to piece together the full picture of your GraphQL API.

So it makes sense to leave introspection available only in your development environments, and to turn it off in production (and any other public environments). This is also the general guidance from the Apollo developer team.
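As a rough sketch of that policy, the decision can be reduced to a single flag per environment. The stage names and the shouldBlockIntrospection helper below are hypothetical; substitute however your own pipeline labels its environments.

```typescript
// Hypothetical stage names; adjust to match your own environments.
type Stage = "dev" | "staging" | "prod";

// Introspection stays available only where developers and tooling need it.
function shouldBlockIntrospection(stage: Stage): boolean {
  return stage !== "dev";
}

const stages: Stage[] = ["dev", "staging", "prod"];
for (const stage of stages) {
  console.log(`${stage}: block introspection = ${shouldBlockIntrospection(stage)}`);
}
```

In a CDK app, a flag like this could decide whether the blocking rule is provisioned for a given stack.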

Luckily, one of the many AWS services that integrates well with AppSync is AWS WAF (Web Application Firewall). With only a little more CDK code on top of the provisioning of our API, we can construct a firewall rule that blocks introspection queries.

Let's first provision our basic AppSync API:

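// Imports assumed at the top of the stack file (aws-cdk-lib v2).
import * as path from "path";
import { CfnOutput } from "aws-cdk-lib";
import * as appsync from "aws-cdk-lib/aws-appsync";

// Inside the stack's constructor:
const api = new appsync.GraphqlApi(this, "Api", {
  name: "UnintrospectableAppSyncApi",
  schema: appsync.SchemaFile.fromAsset(
    path.join(__dirname, "schema.graphql")
  ),
  authorizationConfig: {
    defaultAuthorization: {
      authorizationType: appsync.AuthorizationType.API_KEY,
    },
  },
});

// Print the API URL once the deployment finishes.
new CfnOutput(this, "GraphQLAPIURL", {
  value: api.graphqlUrl,
});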

And with a simple cdk deploy we are up and running.

 ✅  AppsyncIntrospectionWafStack 

✨  Deployment time: 37.23s

Outputs:
AppsyncIntrospectionWafStack.GraphQLAPIURL = https://i33yq5nw3fgkxixzek3ixynz4u.appsync-api.us-east-1.amazonaws.com/graphql
Stack ARN:
arn:aws:cloudformation:us-east-1:XXX:stack/AppsyncIntrospectionWafStack/a4704550-c76c-11ed-80ea-0a701dd151a5

It is worth noting that introspection sits behind the same authentication as the rest of your AppSync API. In this case, we're using an API key for simplicity. That offers some protection, but if your API is public, or if callers need a Cognito account that anybody can sign up for, it is still very easy for nefarious actors to get at the contents of your introspection endpoint.

With our API up we can now query the introspection endpoint:

curl --location --request POST 'https://i33yq5nw3fgkxixzek3ixynz4u.appsync-api.us-east-1.amazonaws.com/graphql' \
--header 'x-api-key: da2-coib25utiffj3gzioeskpkl3ka' \
--header 'Content-Type: application/json' \
--data-raw '{"query":"query MyQuery {\n  __schema {\n    types {\n      name\n    }\n  }\n}","variables":{}}'

Note the __schema token in the body of the request.

And in return we see the name of every type in our GraphQL schema (a fuller introspection query would return the fields and arguments too):

{"data":{"__schema":{"types":[{"name":"Query"},{"name":"Blog"},{"name":"ID"},{"name":"String"},{"name":"BlogInput"},{"name":"__Schema"},{"name":"__Type"},{"name":"__TypeKind"},{"name":"__Field"},{"name":"__InputValue"},{"name":"Boolean"},{"name":"__EnumValue"},{"name":"__Directive"},{"name":"__DirectiveLocation"}]}}}
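To see what an attacker actually learns from this, the response can be filtered down to the types the API author defined. This is only a sketch: the userDefinedTypes helper is hypothetical, and it simply drops the specification's built-in scalars and the double-underscore introspection types.

```typescript
// Response body copied from the curl output above.
const responseBody =
  '{"data":{"__schema":{"types":[{"name":"Query"},{"name":"Blog"},{"name":"ID"},{"name":"String"},{"name":"BlogInput"},{"name":"__Schema"},{"name":"__Type"},{"name":"__TypeKind"},{"name":"__Field"},{"name":"__InputValue"},{"name":"Boolean"},{"name":"__EnumValue"},{"name":"__Directive"},{"name":"__DirectiveLocation"}]}}}';

// Built-in scalars defined by the GraphQL specification.
const BUILT_IN_SCALARS = ["ID", "String", "Boolean", "Int", "Float"];

interface IntrospectionResponse {
  data: { __schema: { types: { name: string }[] } };
}

// Hypothetical helper: keeps only the types the API author defined.
function userDefinedTypes(body: string): string[] {
  const parsed = JSON.parse(body) as IntrospectionResponse;
  return parsed.data.__schema.types
    .map((t) => t.name)
    .filter((name) => name.indexOf("__") !== 0 && BUILT_IN_SCALARS.indexOf(name) === -1);
}

console.log(userDefinedTypes(responseBody)); // [ 'Query', 'Blog', 'BlogInput' ]
```

Even this tiny query reveals that the API has a Blog type and a BlogInput input, which is exactly the kind of map we would rather not hand out in production.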

Now let's provision our WAF layer to hide the introspection endpoint.

// Import assumed at the top of the stack file (aws-cdk-lib v2).
import * as waf from "aws-cdk-lib/aws-wafv2";

// Web ACL that allows everything except requests matched by the rule below.
const firewall = new waf.CfnWebACL(this, "waf-firewall", {
  defaultAction: {
    allow: {},
  },
  description: "Block GraphQL introspection queries",
  scope: "REGIONAL",
  visibilityConfig: {
    cloudWatchMetricsEnabled: true,
    metricName: "BlockIntrospectionMetric",
    sampledRequestsEnabled: true,
  },
  rules: [
    {
      name: "BlockIntrospectionQueries",
      priority: 0,
      action: {
        block: {},
      },
      visibilityConfig: {
        sampledRequestsEnabled: true,
        cloudWatchMetricsEnabled: true,
        metricName: "BlockedIntrospection",
      },
      statement: {
        // Block any request whose lowercased body contains "__schema".
        byteMatchStatement: {
          fieldToMatch: {
            body: {},
          },
          positionalConstraint: "CONTAINS",
          searchString: "__schema",
          textTransformations: [
            {
              type: "LOWERCASE",
              priority: 0,
            },
          ],
        },
      },
    },
  ],
});

// Attach the web ACL to the AppSync API.
new waf.CfnWebACLAssociation(this, "web-acl-association", {
  webAclArn: firewall.attrArn,
  resourceArn: api.arn,
});

Fairly straightforward stuff. We provision a CfnWebACL with the relevant config, and then a CfnWebACLAssociation to associate that web ACL with our AppSync API.
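If you end up with more than one rule of this kind, the rule object can be produced by a small helper. The sketch below is an assumption on my part rather than anything CDK ships: the ByteMatchRule interface and the blockBodyContaining function are hypothetical, and the interface only mirrors the fields used in this article so the snippet runs without aws-cdk-lib installed.

```typescript
// Local type mirroring only the rule fields used in this article,
// declared here so the sketch has no dependency on aws-cdk-lib.
interface ByteMatchRule {
  name: string;
  priority: number;
  action: { block: Record<string, never> };
  visibilityConfig: {
    sampledRequestsEnabled: boolean;
    cloudWatchMetricsEnabled: boolean;
    metricName: string;
  };
  statement: {
    byteMatchStatement: {
      fieldToMatch: { body: Record<string, never> };
      positionalConstraint: "CONTAINS";
      searchString: string;
      textTransformations: { type: "LOWERCASE"; priority: number }[];
    };
  };
}

// Hypothetical helper: builds a rule that blocks requests whose
// lowercased body contains `searchString`.
function blockBodyContaining(name: string, priority: number, searchString: string): ByteMatchRule {
  return {
    name,
    priority,
    action: { block: {} },
    visibilityConfig: {
      sampledRequestsEnabled: true,
      cloudWatchMetricsEnabled: true,
      metricName: name,
    },
    statement: {
      byteMatchStatement: {
        fieldToMatch: { body: {} },
        positionalConstraint: "CONTAINS",
        // The body is lowercased before matching, so the search string must be lowercase too.
        searchString: searchString.toLowerCase(),
        textTransformations: [{ type: "LOWERCASE", priority: 0 }],
      },
    },
  };
}

const rule = blockBodyContaining("BlockIntrospectionQueries", 0, "__schema");
console.log(JSON.stringify(rule, null, 2));
```

The returned object has the same shape as the inline rule above, so it should slot into the rules array of the CfnWebACL; treat that as something to verify against your CDK version rather than a guarantee.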

Note though that we apply a LOWERCASE text transformation, so the request body is lowercased before it is compared against the "__schema" string; the byte match itself is case-sensitive. See this unrelated but interesting AppSync vulnerability discovered by the Datadog Security team as a reminder of what can happen when you don't take casing into account!
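The effect of the transformation is easy to model locally. The function below is not how AWS WAF is implemented, only an approximation of the configured behaviour (lowercase the body, then do a substring test); the getBlog query and the createBlog mutation are hypothetical examples.

```typescript
// Local model of the rule: lowercase the body, then look for "__schema".
// An approximation of the configured behaviour, not AWS WAF's implementation.
function wouldBeBlocked(body: string): boolean {
  return body.toLowerCase().indexOf("__schema") !== -1;
}

const introspection = JSON.stringify({ query: "query { __schema { types { name } } }" });
const mixedCase = JSON.stringify({ query: "query { __SCHEMA { types { name } } }" });
// Hypothetical ordinary query against the Blog type.
const ordinary = JSON.stringify({ query: 'query { getBlog(id: "1") { title } }' });
// Hypothetical mutation whose payload merely mentions the token.
const mentionsToken = JSON.stringify({
  query: 'mutation { createBlog(input: { title: "What is __schema?" }) { id } }',
});

console.log(wouldBeBlocked(introspection)); // true
console.log(wouldBeBlocked(mixedCase)); // true
console.log(wouldBeBlocked(ordinary)); // false
console.log(wouldBeBlocked(mentionsToken)); // true
```

The last case shows the trade-off of a plain substring match on the body: a legitimate request that happens to contain the text __schema is blocked as well.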

With another cdk deploy we have our firewall up and running.

Now when we run the same curl command as before, we get a 403 Forbidden error.

curl --location --request POST 'https://i33yq5nw3fgkxixzek3ixynz4u.appsync-api.us-east-1.amazonaws.com/graphql' \
--header 'x-api-key: da2-coib25utiffj3gzioeskpkl3ka' \
--header 'Content-Type: application/json' \
--data-raw '{"query":"query MyQuery {\n  __schema {\n    types {\n      name\n    }\n  }\n}","variables":{}}'

{
  "errors" : [ {
    "errorType" : "WAFForbiddenException",
    "message" : "403 Forbidden"
  } ]
}
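If a client or smoke test needs to tell this rejection apart from an ordinary GraphQL error, it can check the errorType field. A minimal sketch, assuming the response shape shown above; the isWafBlocked helper is hypothetical.

```typescript
interface GraphQLErrorBody {
  errors?: { errorType?: string; message?: string }[];
}

// Hypothetical helper: true when the body is the WAF rejection shown above.
function isWafBlocked(body: string): boolean {
  const parsed = JSON.parse(body) as GraphQLErrorBody;
  return (parsed.errors ?? []).some((e) => e.errorType === "WAFForbiddenException");
}

// Response body copied from the output above.
const wafResponse = `{
  "errors" : [ {
    "errorType" : "WAFForbiddenException",
    "message" : "403 Forbidden"
  } ]
}`;

console.log(isWafBlocked(wafResponse)); // true
```

A check like this could sit in a post-deployment test to confirm the rule is still in place after each cdk deploy.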

And that's it! One less tool at the attacker's disposal. And one less thing for your penetration testers to eagerly point out. 😎

As always, the full code is available on GitHub.