Image by: Mason @ Loc Cyber
Stop black-box testing your cloud
4/29/2026
Google Cloud Platform (GCP) is one of the largest cloud providers for hosting infrastructure. Its permission model can quickly become complex, and it is a privileged entry point into a company's infrastructure. Commissioning a penetration test from a no-knowledge, no-access perspective (i.e. black-box testing) has some perceived value, but it doesn't match how attackers actually compromise Google Cloud. The most common realistic entry points are:
- Third-party compromise (e.g. Vercel's recent breach).
- Developer compromise (e.g. accidentally publishing access keys online).
- Weak credentials.
- Misconfigured open services. (Google's default policy constraints on organizations make this increasingly hard to misconfigure.)
- Compromise of CI/CD infrastructure or pipelines.
- Lateral movement after application compromise. (Why not put SSH keys in compute metadata, and allow anyone access to compute SSH by default?)
Effective testing of an organisation's GCP configuration requires consultants to use the same no-knowledge methodologies, built around creative exploration and enumeration, but with privileged access and internal documentation to avoid bottlenecks. This approach, which combines a black-box methodology with white-box access, is known as gray-box testing.
What Gray-Box Testing Actually Looks Like
Gray-box testing starts with a defined level of initial access meant to mirror a plausible compromise scenario. Depending on the threat model, that might mean:
- A set of leaked service account credentials with an unknown permission scope.
- A developer identity with access typical for engineers on the team.
- A compromised CI/CD pipeline runner with Workload Identity Federation configured.
From there, the methodology looks similar to a black-box approach: enumeration, privilege escalation attempts, lateral movement across projects, and assessment of data exposure.
The difference is that consultants spend their time on what actually matters: understanding what an attacker can do once they've obtained access. It also allows for more meaningful testing of Identity and Access Management (IAM), which tends to be one of the more complex and easy-to-misconfigure areas in GCP environments. The interaction between basic (formerly primitive) roles, predefined roles, custom roles, organisation policies, and resource-level bindings makes "least privilege" easy to claim and hard to actually verify.
Gray-box testing can uncover real-world misconfigurations like iam.serviceAccounts.actAs on a high-privilege service account, storage.buckets.setIamPolicy on a sensitive bucket, or compute.instances.setMetadata somewhere it shouldn't be.
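As a rough illustration of how a tester might triage this with read access, the sketch below walks an IAM policy in the JSON shape returned by commands such as gcloud projects get-iam-policy --format=json and flags members whose roles contain one of the three permissions named above. The function name, the example policy, and the role-to-permission mapping are all hypothetical; in practice the mapping would have to be built from the role definitions in the environment, and the example lists only a hand-written subset of each role's permissions.

```python
from typing import Dict, List, Set

# The three permissions named in the text. A real engagement would use a
# longer list; this set is an illustrative assumption.
RISKY_PERMISSIONS = {
    "iam.serviceAccounts.actAs",
    "storage.buckets.setIamPolicy",
    "compute.instances.setMetadata",
}


def find_risky_grants(policy: dict, role_permissions: Dict[str, Set[str]]) -> List[dict]:
    """Return one finding per (member, role) where the role holds a risky permission.

    `policy` is an IAM policy dict with a "bindings" list of
    {"role": ..., "members": [...]} entries.
    `role_permissions` maps role names to the permissions they contain
    (hypothetical input, assumed to be collected separately).
    """
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        risky = sorted(RISKY_PERMISSIONS & set(role_permissions.get(role, ())))
        if not risky:
            continue
        for member in binding.get("members", []):
            findings.append({"member": member, "role": role, "permissions": risky})
    return findings


# Hypothetical example data, not taken from any real environment.
example_policy = {
    "bindings": [
        {"role": "roles/iam.serviceAccountUser", "members": ["user:dev@example.com"]},
        {"role": "roles/viewer", "members": ["group:eng@example.com"]},
    ]
}
example_role_permissions = {
    # Partial, hand-written subsets of each role's permissions.
    "roles/iam.serviceAccountUser": {
        "iam.serviceAccounts.actAs",
        "iam.serviceAccounts.get",
        "iam.serviceAccounts.list",
    },
    "roles/viewer": {"compute.instances.get"},
}

for finding in find_risky_grants(example_policy, example_role_permissions):
    print(finding["member"], finding["role"], finding["permissions"])
```

A check like this only looks at one policy at one level of the hierarchy. It does not account for inherited bindings, IAM conditions, or deny policies, which is part of why this kind of review still needs a person reasoning about the whole environment.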
What Good Testing Tends to Surface
A well-scoped gray-box engagement usually turns up findings across a few areas:
- Identity and access: Over-permissioned identities, unused service accounts with active keys, escalation paths through iam.serviceAccounts.actAs or resource-level bindings, default service accounts with broad roles still attached.
- Secrets and credentials: Service account keys that shouldn't exist, keys in Compute metadata, secrets in Cloud Build substitutions instead of Secret Manager, application secrets in container images or environment variables.
- Data exposure: Overly permissive Storage buckets, Cloud SQL instances with public IPs and no authorised networks, BigQuery datasets with organisation-wide access, Pub/Sub topics accessible to unexpected identities.
- Network configuration: Firewall rules with broad ingress on administrative ports, Compute instances without OS Login enforced, VPC Service Controls absent or misconfigured.
- Audit and detection gaps: Cloud Audit Logs disabled for certain services, no alerting on sensitive IAM changes, key usage not being monitored.
Finding these misconfigurations requires testers to be given privileged access.
Conclusion
The value of gray-box testing lies in modelling the threat more honestly. An organisation that comes out of a black-box engagement knowing its external perimeter looks reasonable may not have learned much about whether someone who compromised a developer laptop last Tuesday could exfiltrate customer data by the weekend.