Cloudflare Integration
Summary
Runway services with external load balancers currently use Google’s Cloud Armor for DDoS protection when exposed to the internet.
Service owners who prefer Cloudflare protection currently need manual configuration by the infrastructure team, as seen with docs.runway.gitlab.com in config-mgmt.
This blueprint outlines the implementation of a self-service option that will allow service owners to enable Cloudflare protection for their services directly through Runway.
Motivation
Runway currently lacks self-service Cloudflare integration and doesn’t meet our production readiness criteria for WAF.
Since Cloudflare serves as the CDN for most of our production infrastructure, we aim to enhance Runway by providing service owners with a simple, intuitive way to protect their services using the same Cloudflare technology that safeguards our core production environment.
Example Runway workload interested in this feature: GitLab Secrets Manager
Goals
- To enable Runway workloads with an external load balancer to be fronted by Cloudflare by default. Workloads can disable this functionality if desired.
- To protect Runway endpoints with a standard set of WAF rules.
- To establish the foundation needed to support rate limiting in future iterations. The initial implementation will not include rate limiting itself.
Non-goals
- Ability to customize WAF rules.
  Rationale: we’re starting with a standardized set of WAF rules for all Runway workloads to ensure consistent protection. As we learn from implementation, we may explore options for customization in future iterations. Workloads requiring highly specialized WAF configurations may benefit from disabling the automated Cloudflare zone functionality described in this blueprint and instead deploying a dedicated Cloudflare zone managed outside of Runway.
- Ability to set rate limits.
  Rationale: while we recognize the value of customizable rate limiting, we’ve prioritized core functionality for this initial iteration. We plan to incorporate rate limiting options in a future update after gathering more detailed requirements and user feedback.
Design
Dedicated Cloudflare Account
Cloudflare API tokens offer limited granularity over which zones they can manage, which is undesirable from a security perspective. Since our GitLab account is home to several mission-critical zones, we will use a separate Runway account and grant the API token access only to zones in that account. Should the API token become compromised, only zones in the Runway account will be affected.
Cloudflare Zones
Root Zone (svc.gitlab.net)
Provisioner will create and manage a Cloudflare zone called svc.gitlab.net in the Runway account. This zone will mainly be used for managing NS records pointing to the .svc.gitlab.net zones.
Additionally, in the config-mgmt repository, we will add NS records to gitlab.net to point at the nameservers for svc.gitlab.net.
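As a rough illustration of the two-level delegation described above, the sketch below builds the NS records that gitlab.net would hold for svc.gitlab.net, and that svc.gitlab.net would hold for a workload zone. The helper name, the record shape, and the nameserver hostnames are hypothetical placeholders. The real nameservers are assigned by Cloudflare when each zone is created.

```python
def delegation_records(child_zone, nameservers, ttl=3600):
    """Build the NS records a parent zone needs in order to delegate child_zone.

    Hypothetical helper for illustration; not part of Provisioner or Reconciler.
    """
    return [
        {"type": "NS", "name": child_zone, "content": ns, "ttl": ttl}
        for ns in sorted(nameservers)
    ]


# Placeholder nameservers (assumption): Cloudflare assigns the real ones per zone.
PLACEHOLDER_NAMESERVERS = ["ns1.example.com", "ns2.example.com"]

# Records added to gitlab.net (managed in config-mgmt) to delegate the root zone.
root_delegation = delegation_records("svc.gitlab.net", PLACEHOLDER_NAMESERVERS)

# Records added to svc.gitlab.net to delegate one workload zone.
workload_delegation = delegation_records(
    "example-service.svc.gitlab.net", PLACEHOLDER_NAMESERVERS
)

for record in root_delegation + workload_delegation:
    print(record["name"], record["type"], record["content"], record["ttl"])
```

Each level only needs to know the nameservers of the zone directly beneath it, which is why the gitlab.net records can be managed out-of-band while the workload delegations live in the Runway account.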
Runway Workloads
Runway workloads with an external load balancer will, by default, have two Cloudflare zones created for their workload by the Reconciler:
- Production: <workload name>.svc.gitlab.net
- Staging: <workload name>.staging.svc.gitlab.net
The origin for these endpoints will be their <workload name>[.staging].runway.gitlab.net GCP load balancer endpoints.
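The naming scheme above can be sketched as a small mapping from a workload name to its zones and origins. This is a minimal illustration only: the WorkloadZone type and the workload_zones helper are hypothetical names, and the workload name used in the example is made up.

```python
from dataclasses import dataclass

ROOT_ZONE = "svc.gitlab.net"         # Cloudflare root zone from this blueprint
ORIGIN_DOMAIN = "runway.gitlab.net"  # domain of the GCP load balancer endpoints


@dataclass(frozen=True)
class WorkloadZone:
    environment: str  # "production" or "staging"
    zone: str         # Cloudflare zone created for the workload
    origin: str       # GCP load balancer hostname the zone proxies to


def workload_zones(workload):
    """Return the production and staging zones for a workload (hypothetical helper)."""
    return [
        WorkloadZone(
            "production",
            f"{workload}.{ROOT_ZONE}",
            f"{workload}.{ORIGIN_DOMAIN}",
        ),
        WorkloadZone(
            "staging",
            f"{workload}.staging.{ROOT_ZONE}",
            f"{workload}.staging.{ORIGIN_DOMAIN}",
        ),
    ]


for z in workload_zones("example-service"):
    print(f"{z.environment}: {z.zone} -> {z.origin}")
```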
Workloads will be able to disable the automatic Cloudflare zone creation if desired. For example, AI Gateway does not need this functionality because it is currently routed via Cloud Connector.
Reconciler will provision a standard set of WAF rules (TBD) for each zone. We will work closely with the Foundations team to define the standard set of WAF rules.
Advanced Certificate Manager (paid add-on) will be enabled for svc.gitlab.net and we will use Total TLS to issue certificates for each proxied hostname.
Restricting inbound traffic
Runway workloads accept unrestricted inbound traffic by default. For enhanced security, you can configure your workload to accept traffic exclusively from Cloudflare by adding spec.network_policies.cloudflare: true to your runway.yml file:
spec:
  network_policies:
    cloudflare: true
Any requests directed at the load balancer for your workload (bypassing Cloudflare) will receive a 403 response.
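Conceptually, the policy amounts to a source-address check at the load balancer. The sketch below models that decision only. It is not how Runway enforces the policy, the function name is hypothetical, and the CIDR list is a small illustrative subset of the ranges Cloudflare publishes, which should always be taken from Cloudflare's own list.

```python
import ipaddress

# Illustrative subset only (assumption): Cloudflare publishes the authoritative list.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "103.21.244.0/22", "2400:cb00::/32")
]


def is_allowed(source_ip, cloudflare_only):
    """Return True if a request from source_ip would be accepted.

    cloudflare_only mirrors spec.network_policies.cloudflare in runway.yml.
    """
    if not cloudflare_only:
        return True  # default: unrestricted inbound traffic
    addr = ipaddress.ip_address(source_ip)
    # Membership is False when the address and network IP versions differ.
    return any(addr in network for network in CLOUDFLARE_RANGES)


for ip in ("173.245.48.10", "203.0.113.7"):
    outcome = "forwarded to workload" if is_allowed(ip, cloudflare_only=True) else "403"
    print(ip, "->", outcome)
```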
Architecture
Diagram notes:
- NS records for svc.gitlab.net that live in gitlab.net will be managed out-of-band in the config-mgmt repository.
- Provisioner will create/manage the svc.gitlab.net zone, which is primarily used for serving NS records for each .svc.gitlab.net subdomain.
- Reconciler will, by default, create two zones for each workload (<workload>.svc.gitlab.net and <workload>.staging.svc.gitlab.net). It will also create proxied DNS records, a standard set of WAF rules, etc.
Observability
We will add a new instance of cloudflare-exporter to scrape all zones in the GitLab Runway account. Similar to Cloud Connector, we can leverage cloudflare_zone_firewall_events_count to alert on anomalies.
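What counts as an anomaly is not defined in this blueprint. As one possible reading, the sketch below flags a spike when the latest per-interval increase in cloudflare_zone_firewall_events_count exceeds a baseline mean by a multiple of its standard deviation. The function name, the threshold multiplier, and the sample values are all assumptions, and a production alert would more likely be expressed as a rule in the monitoring stack than in Python.

```python
from statistics import mean, pstdev


def is_anomalous(baseline, latest, k=3.0, min_sigma=1.0):
    """Flag latest if it exceeds the baseline mean by more than k standard deviations.

    baseline: recent per-interval firewall event counts for one zone
    latest:   the newest per-interval count
    min_sigma floors the deviation so a flat baseline does not alert on tiny changes.
    """
    if len(baseline) < 2:
        return False  # not enough history to judge
    threshold = mean(baseline) + k * max(pstdev(baseline), min_sigma)
    return latest > threshold


# Made-up sample values for illustration.
recent_counts = [10, 12, 11, 9, 10]
print(is_anomalous(recent_counts, 12))
print(is_anomalous(recent_counts, 50))
```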