Securing a GraphQL endpoint with Azure API Management

Today I'll explore the different Azure API Management (APIM) networking options and discuss the best option for securing a GraphQL endpoint. As I walk through the approach taken, I'll also explain why securing GraphQL with API Management can add a lot of overhead, work, and cost, which needs to be balanced appropriately for your needs.

In this post, I'll also share how to make an endpoint available for public access using APIM. This will include using an Azure App Service (running GraphQL) on a private endpoint within a virtual network (vNet), configured fully, end-to-end with Azure Bicep.

Networking with Azure API Management

Let's get started first with understanding Azure API Management networking types. This is important to grasp, and it took me a few reads to understand it properly.

  1. External Integration: This option integrates your API Management instance with an Azure Virtual Network. The API Management endpoints are accessible from the public internet via an external load balancer and public IP(s), and the gateway can access resources within the vNet. This is typically the best practice when public access is needed.
  2. Internal Integration: In this mode, the API Management endpoints are accessible only from within the vNet via an internal load balancer and dynamic IP. The gateway can still access resources within the vNet, but no public access is possible. This is useful for scenarios that need private access to the APIM Developer Portal. Note that vNet integration of this kind is available on the Developer and Premium SKUs only.
  3. Private Endpoint: This enables secure, private inbound connectivity to the API Management gateway (but not the Developer Portal) using a private endpoint. Inbound traffic is only allowed from peered virtual networks, ExpressRoute, and S2S VPN connections. In theory, you could deploy something like an Application Gateway to restore public access and route the traffic to the Private Endpoint. However, that is counterproductive when External integration provides public access out of the box. As such, this is the best bet if you do not need public access or cannot afford the Premium SKU.
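
To make the third option a little more concrete, here is a minimal sketch of a private endpoint that targets an APIM gateway. It is an illustration only and not part of the templates later in this post. The parameter names (apimResourceId, privateEndpointSubnetId) and the resource names are hypothetical, and it assumes the APIM instance and the subnet already exist.

```bicep
@description('Hypothetical: resource Id of an existing API Management instance')
param apimResourceId string

@description('Hypothetical: resource Id of the subnet that will host the private endpoint')
param privateEndpointSubnetId string

@description('Location for the private endpoint')
param location string = resourceGroup().location

// Private endpoint for inbound traffic to the APIM gateway.
// 'Gateway' is the private link sub-resource exposed by API Management.
resource apimPrivateEndpoint 'Microsoft.Network/privateEndpoints@2022-09-01' = {
  name: 'pe-apim-gateway' // hypothetical name
  location: location
  properties: {
    subnet: {
      id: privateEndpointSubnetId
    }
    privateLinkServiceConnections: [
      {
        name: 'apim-gateway-connection' // hypothetical name
        properties: {
          privateLinkServiceId: apimResourceId
          groupIds: [
            'Gateway'
          ]
        }
      }
    ]
  }
}
```

Name resolution for the private endpoint (for example, a private DNS zone) is deliberately left out of this sketch.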

On selecting the right networking option

It's important to remember three things when it comes to these options:

  • Select a SKU carefully by balancing your cost, performance, security requirements, and constraints. A Premium SKU APIM instance costs roughly $5k AUD a month to run, which can be a big cost for prototyping a new API like GraphQL.
  • Not all SKUs support vNet integration. For example, if you select the Consumption SKU, you'll need to handle network security by other means, like creating inbound firewall rules on resources such as Azure App Services so that they only allow traffic from the outbound IP addresses of your APIM resource.
  • These options are mutually exclusive (you cannot select a blend of them), so you must choose the one that best fits your needs. This was confusing, as Internal integration and Private Endpoint sounded like they could be blended, but that is not the case. For more on this, check out Azure API Management with an Azure virtual network | Microsoft Learn
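
As a rough illustration of the firewall approach mentioned in the second point, the sketch below adds inbound access restrictions to an existing App Service. The parameter names (webAppName, apimOutboundIps) are hypothetical, and how you obtain the APIM outbound IP addresses depends on your SKU, so treat the list as an assumption you supply yourself.

```bicep
@description('Hypothetical: name of the existing App Service that hosts the GraphQL API')
param webAppName string

@description('Hypothetical: outbound IPv4 addresses of the APIM resource, for example [ "203.0.113.10" ]')
param apimOutboundIps array

resource webApp 'Microsoft.Web/sites@2022-09-01' existing = {
  name: webAppName
}

// One Allow rule per APIM outbound IP address.
// Once at least one rule exists, App Service applies an implicit deny-all
// to any traffic that does not match a rule.
resource webAppAccessRestrictions 'Microsoft.Web/sites/config@2022-09-01' = {
  parent: webApp
  name: 'web'
  properties: {
    ipSecurityRestrictions: [for (ip, i) in apimOutboundIps: {
      name: 'allow-apim-${i}'
      action: 'Allow'
      priority: 100 + i
      ipAddress: '${ip}/32'
    }]
  }
}
```

This is coarse-grained protection only; it does not replace vNet integration where the SKU supports it.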

For the example Bicep templates later in this blog post, I selected "External" because our real-world use case needed public access for APIM. For development, I also used the Developer SKU to keep costs low.

One thing to note is that I did a test using an Azure Application Gateway with the Private Endpoint option. Sadly though, the performance was very poor, with latency being upwards of 150ms! Not great for an API.

GraphQL deployed in Azure

Typically, when a development team builds a GraphQL API in the context of Microsoft Azure, they'll deploy it in either an Azure Function or an Azure App Service. By default, these services are publicly accessible, which is fine for many solutions, but for GraphQL, this can be quite dangerous.

The danger is in the fundamental intent of GraphQL design; it's meant to be one API to access (and even mutate) all your data. If your development team has done that, a GraphQL endpoint that is publicly available without any coarse-grained protection could be an easy entry point for any hacker. Worse still, you don't even need to be a hacker to cause harm. It could just be a poorly implemented application consuming the GraphQL API that runs up costs and drives the performance of your backend databases and other APIs into the ground.

This is where APIM makes a lot of sense. APIM can rate-limit requests and protect against a lot of malicious attacks. It can even be set up to support something like query cost analysis with custom rules.
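
As a hedged example of that kind of coarse-grained protection, the sketch below holds an inbound policy in a Bicep variable that uses APIM's validate-graphql-request policy to cap the request size and query depth and to reject introspection queries. The limits shown are arbitrary values picked for illustration. As far as I understand it, this policy relies on a GraphQL schema being available to APIM, so treat that as an assumption. It is not part of the templates in this post.

```bicep
// Illustrative inbound policy only; the size and depth limits are arbitrary examples.
var graphqlGuardPolicyXml = '''
<policies>
    <inbound>
        <base />
        <validate-graphql-request error-variable-name="graphqlValidationError" max-size="102400" max-depth="4">
            <authorize>
                <rule path="/__*" action="reject" />
            </authorize>
        </validate-graphql-request>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
'''

// Exposed as an output so the sketch compiles on its own;
// in practice the XML would be passed to a policy resource.
output graphqlGuardPolicy string = graphqlGuardPolicyXml
```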

Deploying APIM

A word of warning: APIM is a beast of a 'resource'. Much like many PaaS resources in Microsoft Azure, APIM is just a collection of virtual machines that Microsoft manages, and they take some time to spin up. For example, when testing many of the Bicep templates early on, APIM would take up to 1.5 hours to provision on the first run, and sometimes the resource wouldn't deploy successfully at all. This resulted in having to delete the entire resource group to remove the 'broken' resource!

Example Deployment

Now that we have explained the nuts and bolts of APIM and how it can work with a GraphQL API endpoint, we'll share three (3) Bicep templates to configure APIM with a GraphQL endpoint.

We're assuming you have deployed an Azure App Service with a Private Endpoint configured with the GraphQL API deployed and listening on the URL path of /graphql. Microsoft's official documentation is very comprehensive if you need guidance on how to do that. See Connect privately to an App Service apps using private endpoint | Microsoft Learn.

First, here is the apiManagement.bicep file. This template:

  • Deploys APIM
  • Configures App Insights for logging
  • Disables old ciphers to uplift security

// API Management
@description('The name of the API Management service instance')
param apiManagementServiceName string = 'apiservice${uniqueString(resourceGroup().id)}'

@description('The email address of the owner of the service')
@minLength(1)
param publisherEmail string

@description('The name of the owner of the service')
@minLength(1)
param publisherName string

@description('The pricing tier of this API Management service')
@allowed([
  'Basic'
  'Consumption'
  'Developer'
  'Premium'
  'Standard'
])
param sku string = 'Developer'

@description('The instance size of this API Management service.')
@allowed([
  1
  2
])
param skuCount int = 1

@description('The way in which the API Management service will listen on the Virtual Network')
@allowed([
  'Internal'
  'External'
])
param virtualNetworkType string = 'External'

@description('The Public IP resource Id to allow APIM to receive internet traffic')
param publicIpId string = ''

@description('Location for all resources.')
param location string = 'australiaeast'

@description('The Subnet resource Id to allow APIM to be on an internal virtual network to access resources with Private Endpoints')
param subnetId string = ''

@description('The Application Insights instance name for API Management logging.')
param appInsightsName string = ''

@description('The Application Insights resource Id for API Management logging.')
param appInsightsResourceId string = ''

@description('The Application Insights instrumentation key for API Management logging.')
param appInsightsInstrumentationKey string = ''

// Custom properties to disable all the old ciphers. Note that the Consumption SKU does not allow this to be controlled.
var customProperties = (sku != 'Consumption') ? {
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Protocols.Tls11': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Protocols.Tls10': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Backend.Protocols.Tls11': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Backend.Protocols.Tls10': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Backend.Protocols.Ssl30': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Protocols.Server.Http2': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Ciphers.TripleDes168': 'false'
  'Microsoft.WindowsAzure.ApiManagement.Gateway.Security.Protocols.Ssl30': 'false'
} : {}

@description('Tags to apply to the API Management service')
param tags object = {}

resource apiManagementService 'Microsoft.ApiManagement/service@2022-08-01' = {
  name: apiManagementServiceName
  location: location
  tags: tags
  sku: {
    name: sku
    capacity: (sku != 'Consumption') ? skuCount : 0
  }
  properties: {
    publisherEmail: publisherEmail
    publisherName: publisherName
    virtualNetworkType: (sku != 'Consumption') ? virtualNetworkType : null
    publicIpAddressId: (sku != 'Consumption') ? (!empty(publicIpId) ? publicIpId : null) : null
    publicNetworkAccess: (sku != 'Consumption') ? (!empty(publicIpId) ? 'Enabled' : 'Disabled') : 'Enabled'
    notificationSenderEmail: 'apimgmt-noreply@mail.windowsazure.com'
    virtualNetworkConfiguration: (sku != 'Consumption') ? (!empty(subnetId) ? { subnetResourceId: subnetId } : null) : null
    customProperties: customProperties
    disableGateway: false
    apiVersionConstraint: {
      minApiVersion: '2019-12-01'
    }
  }
}

resource apiManagementServiceLogger 'Microsoft.ApiManagement/service/loggers@2022-08-01' = if (!empty(appInsightsName)) {
  parent: apiManagementService
  name: appInsightsName
  properties: {
    loggerType: 'applicationInsights'
    credentials: {
      instrumentationKey: appInsightsInstrumentationKey
    }
    isBuffered: true
    resourceId: appInsightsResourceId
  }
}

output apiManagementResourceId string = apiManagementService.id
output apiManagementLocation string = apiManagementService.location
output apiManagementName string = apiManagementService.name
output apiManagementGatewayUrl string = apiManagementService.properties.gatewayUrl

Note: I omitted Log Analytics and Storage diagnostics from this template to shorten the code snippet, but I'd highly encourage configuring them.
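
For context, here is a rough sketch of how a parent template might supply the networking inputs for "External" mode and call apiManagement.bicep as a module. The public IP name, the DNS label parameter, the subnet parameter, and the publisher details are all placeholders I've made up for illustration. APIM also has further subnet requirements (such as network security group rules) that are not shown here.

```bicep
@description('Location for all resources.')
param location string = 'australiaeast'

@description('Hypothetical: resource Id of an existing subnet dedicated to APIM')
param apimSubnetId string

@description('Hypothetical: DNS label for the public IP address')
param apimDnsLabel string

// Standard SKU, static public IP address for APIM to receive internet traffic.
resource apimPublicIp 'Microsoft.Network/publicIPAddresses@2022-09-01' = {
  name: 'pip-apim' // hypothetical name
  location: location
  sku: {
    name: 'Standard'
  }
  properties: {
    publicIPAllocationMethod: 'Static'
    publicIPAddressVersion: 'IPv4'
    dnsSettings: {
      domainNameLabel: apimDnsLabel
    }
  }
}

// Calls the apiManagement.bicep template from this post.
module apim 'apiManagement.bicep' = {
  name: 'apim-deployment'
  params: {
    publisherEmail: 'owner@example.com' // placeholder
    publisherName: 'Example Owner' // placeholder
    sku: 'Developer'
    virtualNetworkType: 'External'
    location: location
    publicIpId: apimPublicIp.id
    subnetId: apimSubnetId
  }
}

output gatewayUrl string = apim.outputs.apiManagementGatewayUrl
```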

Now, here is the second bicep file to deploy the API itself. We've called this apiManagement.API.bicep. This template:

  • Takes the URL of an Azure App Service and appends the /graphql path.
  • Configures the API with the right settings, like enforcing the use of HTTPS
  • Configures the App Service to be registered with APIM for end-to-end coupling.

// API Management
@description('The name of the API Management service instance')
param apiManagementServiceName string = 'apiservice${uniqueString(resourceGroup().id)}'

@description('Location for all resources.')
param location string = 'australiaeast'

@description('The email address of the owner of the service')
@minLength(1)
param publisherEmail string

@description('The name of the owner of the service')
@minLength(1)
param publisherName string

// API Management exposed internal API
@description('The internal App Service or Function App name that exposes the internal API to be used by APIM')
param webAppName string = ''

@description('The internal App Service or Function App URL that exposes the internal API to be used by APIM')
param webAppURL string = ''

@description('The internal API revision number, incremented by 1, as a string')
param api_revision string = '1'

@description('The internal API description, as seen in the APIM Developer Portal')
param api_description string = ''

@description('The internal API type hosted on the App Service or Function App')
@allowed([
  'graphql'
  'http'
  'soap'
  'websocket'
])
param api_type string = 'graphql'

@description('The internal API protocol hosted on the App Service or Function App')
@allowed([
  'http'
  'https'
  'ws'
  'wss'
])
param api_protocols array = [ 'https' ]

@description('Determines if a subscription key is required to use this API')
param api_subscriptionRequired bool = true

@description('Register the API App Service to only receive requests from API Management. Note: this is not supported on the Consumption SKU; use api_FirewallsIPs instead')
param api_registerAPIM bool = true

@description('The path in the WebApp for the API endpoint.')
param webAppPath string = 'graphql'

@description('The path in APIM for the API endpoint.')
param api_path string = 'graphql'

@description('The Application Insights instance name for API Management logging.')
param appInsightsName string = ''

resource apiManagementService_API 'Microsoft.ApiManagement/service/apis@2022-08-01' = {
  name: '${apiManagementServiceName}/${webAppName}'
  properties: {
    displayName: '${webAppName} API connection'
    apiRevision: api_revision
    subscriptionRequired: api_subscriptionRequired
    description: api_description
    contact: {
      name: publisherName
      email: publisherEmail
    }
    serviceUrl: '${webAppURL}/${webAppPath}'
    path: api_path
    protocols: api_protocols
    authenticationSettings: {
      oAuth2AuthenticationSettings: []
      openidAuthenticationSettings: []
    }
    subscriptionKeyParameterNames: api_subscriptionRequired ? {
      header: 'Ocp-Apim-Subscription-Key'
      query: 'subscription-key'
    } : null
    type: api_type
    isCurrent: true
  }
}

resource apiManagementServiceLogger 'Microsoft.ApiManagement/service/apis/diagnostics@2022-08-01' = if (!empty(appInsightsName)) {
  parent: apiManagementService_API
  name: 'applicationinsights'
  properties: {
    alwaysLog: 'allErrors'
    httpCorrelationProtocol: 'Legacy'
    verbosity: 'information'
    logClientIp: true
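    // The logger is a child of the APIM service, so its Id is built from the service name
    loggerId: resourceId('Microsoft.ApiManagement/service/loggers', apiManagementServiceName, appInsightsName)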
    frontend: {
      request: {
        headers: []
        body: {
          bytes: 0
        }
      }
      response: {
        headers: []
        body: {
          bytes: 0
        }
      }
    }
    backend: {
      request: {
        headers: []
        body: {
          bytes: 0
        }
      }
      response: {
        headers: []
        body: {
          bytes: 0
        }
      }
    }
  }
}

resource webAppExisting 'Microsoft.Web/sites@2021-02-01' existing = {
  name: webAppName
}

resource webAppApiManagement 'Microsoft.Web/sites/config@2022-09-01' = if (api_registerAPIM) {
  name: 'web'
  parent: webAppExisting
  properties: {
    apiManagementConfig: {
      // resourceId() expects the resource type without an API version suffix
      id: resourceId('Microsoft.ApiManagement/service', apiManagementServiceName)
    }
  }
}

output apiManagementApiPath string = apiManagementService_API.properties.path

Finally, here is a bicep template called apiManagement.Policy.bicep. This template:

  • Sets the rate limit to ten (10) calls per minute from the same source IP address (the renewal-period in the policy is expressed in seconds).
  • Can support other custom policies and settings (like CORS) if we want it to.

// API Management
@description('The name of the API Management service instance')
param apiManagementServiceName string = 'apiservice${uniqueString(resourceGroup().id)}'

@description('The internal App Service or Function App name that exposes the internal API to be used by APIM')
param webAppName string = ''

@description('The APIM policy name. For API-level policies, APIM expects this to be policy.')
param apiManagementPolicyName string

@description('The APIM policy in XML')
// '''
// <policies>
//     <inbound>
//         <rate-limit calls="10" renewal-period="60" increment-condition="(context.Request.IpAddress != null)" >
//             <counter-key>@(context.Request.IpAddress)</counter-key>
//         </rate-limit>
//         <cors allow-credentials="true">
//             <allowed-origins>
//                 <origin>https://studio.apollographql.com</origin>
//             </allowed-origins>
//             <allowed-methods>
//                 <method>GET</method>
//                 <method>POST</method>
//             </allowed-methods>
//             <allowed-headers>
//                 <header>*</header>
//             </allowed-headers>
//             <expose-headers>
//                 <header>*</header>
//              </expose-headers>
//         </cors>
//         <base />
//     </inbound>
//     <backend>
//         <base />
//     </backend>
//     <outbound>
//         <base />
//     </outbound>
//     <on-error>
//         <base />
//     </on-error>
// </policies>
// '''
param apiManagementPolicyXML string = '''
<policies>
    <inbound>
        <base />
        <rate-limit-by-key 
            calls="10"
            renewal-period="60"
            counter-key="@(context.Request.IpAddress)" />
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
'''

resource apimPolicy 'Microsoft.ApiManagement/service/apis/policies@2022-08-01' = {
    name: '${apiManagementServiceName}/${webAppName}/${apiManagementPolicyName}'
    properties: {
        value: apiManagementPolicyXML
        format: 'xml'
    }
}

output policyId string = apimPolicy.id
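
As a usage sketch, the policy template could be called as a module with the policy XML kept in its own file and read with Bicep's loadTextContent function. The file path policies/graphql.xml and the deployment name are hypothetical, and the policy name of 'policy' is my assumption about what APIM expects for an API-level policy.

```bicep
@description('The name of the API Management service instance')
param apiManagementServiceName string

@description('The App Service name, which the API template also uses as the API name')
param webAppName string

// Calls the apiManagement.Policy.bicep template from this post.
module graphqlPolicy 'apiManagement.Policy.bicep' = {
  name: 'graphql-policy-deployment' // hypothetical deployment name
  params: {
    apiManagementServiceName: apiManagementServiceName
    webAppName: webAppName
    apiManagementPolicyName: 'policy' // assumption: the name APIM expects for API-level policies
    apiManagementPolicyXML: loadTextContent('policies/graphql.xml') // hypothetical file path
  }
}

output policyId string = graphqlPolicy.outputs.policyId
```

Keeping the XML in a separate file makes it easier to review policy changes independently of the template.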

Some important notes about APIM bicep deployment

<choose>
    <when condition="@(context.Request.Body.As<JObject>(preserveContent: true).SelectToken("$.operationName")?.ToString() == "agenda" || context.Request.Body.As<string>(preserveContent: true).Contains("agenda"))">
        <cache-lookup vary-by-developer="false" vary-by-developer-groups="false" downstream-caching-type="none">
            <vary-by-header>Accept</vary-by-header>
            <vary-by-header>Accept-Charset</vary-by-header>
            <vary-by-query-parameter>query</vary-by-query-parameter>
        </cache-lookup>
    </when>
</choose>
Example code snippet of caching by introspecting the JSON request body

  • minApiVersion is very important for diagnostics support. Currently, APIM diagnostic configuration fails when the version is higher than 2019-12-01. See Tutorial - Monitor published APIs in Azure API Management | Microsoft Learn
  • Ironically, minApiVersion has an upcoming breaking change in October of this year. I'm calling it out here because these templates may break in the near future because of that change. See Azure.APIM.MinAPIVersion - PSRule for Azure
  • In the Azure Portal, the screen that tries to load the schema from the GraphQL endpoint fails. This is because the portal tries to access the endpoint directly, which fails because the App Service is on a private endpoint. The deployed solution still works fine; you just won't be able to validate the schema on this screen.
Error message as shown in the Azure Portal for the GraphQL API endpoint (schema page)
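
One possible workaround, which I have not covered in the templates above, is to upload the GraphQL schema to APIM explicitly so that it does not need to be fetched from the private backend. The sketch below assumes a local schema.graphql file (a hypothetical path) and that the API is named after the App Service, as in the API template.

```bicep
@description('The name of the API Management service instance')
param apiManagementServiceName string

@description('The App Service name, which the API template also uses as the API name')
param webAppName string

resource apim 'Microsoft.ApiManagement/service@2022-08-01' existing = {
  name: apiManagementServiceName
}

resource graphqlApi 'Microsoft.ApiManagement/service/apis@2022-08-01' existing = {
  parent: apim
  name: webAppName
}

// Attaches the schema (SDL) to the GraphQL API.
resource graphqlSchema 'Microsoft.ApiManagement/service/apis/schemas@2022-08-01' = {
  parent: graphqlApi
  name: 'graphql'
  properties: {
    contentType: 'application/vnd.ms-azure-apim.graphql.schema'
    document: {
      value: loadTextContent('schema.graphql') // hypothetical file path
    }
  }
}
```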

Conclusion

In wrapping up, deploying and securing a GraphQL API with Azure API Management can be complex, but it can offer valuable pay-offs. The choice of networking configuration is paramount, serving as a crucial balancing act between cost, performance, and security.

Whilst APIM might add a bit of overhead and cost, its benefits in terms of security and control can make it well worth your while, especially when dealing with GraphQL APIs that grant access to sensitive or critical data. Further enhancements, such as caching, could substantially improve the performance of a GraphQL API in a production environment.

Happy deploying!