@angrycub
Last active January 17, 2022 12:42
Consul Connect - Consul ACL Token Handling Part 1

Consul Connect / ACL Enforcement Flow

Nomad starts the JobEndpoint

In the beginning of time, the Nomad server makes a JobEndpoint using the NewJobEndpoints function.

Job is submitted

A job is submitted to the API (either directly or via the CLI).

The API sends a JobRegisterRequest to Register(). The Register() call is forwarded to the leader where execution continues.

Determine Kind

At line 97, Register calls the admissionControllers function on the Job.

At line 45, the admissionControllers function calls the admissionMutators function.

The mutators specified on the JobEndpoint created at Server start are:

		mutators: []jobMutator{
			jobCanonicalizer{},
			jobConnectHook{},
			jobExposeCheckHook{},
			jobImpliedConstraints{},
		},

Of specific note is the jobConnectHook here. The admissionMutators function calls the jobConnectHook's Mutate function, which in turn calls its groupConnectHook function.

The groupConnectHook function iterates over all of the services in the group and checks whether they are Connect-enabled services.

For each enabled type it applies a Kind which is used in later validation steps.
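A minimal sketch of how such a Kind could be represented, assuming a "prefix:service" string layout. The TaskKind type and its Value method are simplified stand-ins modeled on the kind.Value() call that appears in CheckPermissions further down; newTaskKind and connectProxyPrefix are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// TaskKind is a simplified stand-in for the Kind applied by the hook.
// The "prefix:service" layout is an assumption made for this sketch.
type TaskKind string

// connectProxyPrefix is a hypothetical prefix for sidecar proxy tasks.
const connectProxyPrefix = "connect-proxy"

// newTaskKind joins a prefix and a service name into a Kind.
func newTaskKind(prefix, service string) TaskKind {
	return TaskKind(prefix + ":" + service)
}

// Value returns the service name that follows the first colon,
// or the empty string if the Kind has no colon.
func (k TaskKind) Value() string {
	parts := strings.SplitN(string(k), ":", 2)
	if len(parts) < 2 {
		return ""
	}
	return parts[1]
}

func main() {
	kind := newTaskKind(connectProxyPrefix, "count-api")
	fmt.Println(kind, kind.Value())
}
```

Storing the service name inside the Kind is what lets a later validation step recover the service to check without re-walking the group.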

Once all of the mutation hooks have run, control returns to the admissionControllers function, which then calls the admissionValidators function to range over the admission validators.

		validators: []jobValidator{
			jobConnectHook{},
			jobExposeCheckHook{},
			jobValidate{},
			&memoryOversubscriptionValidate{srv: s},
		},

Again, jobConnectHook will be of note. The admissionValidators function calls Validate. This in turn calls the groupConnectValidate function.

groupConnectValidate runs a series of validations specific to Connect. Upon completion, control returns to admissionControllers. If a validation failed, admissionControllers returns a nil Job, nil warnings, and the error encountered; otherwise, it returns the validated Job, an []error holding any warnings, and a nil error.

admissionControllers returns to the Register function. Any error causes Register to return an error. Otherwise, the mutated and validated Job replaces args.Job.
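The mutate-then-validate flow described above can be sketched roughly as follows. This is not Nomad's code: Job, connectHook, and the interface signatures are simplified assumptions, and the single admissionControllers function here folds in the work of admissionMutators and admissionValidators.

```go
package main

import (
	"errors"
	"fmt"
)

// Job is a stand-in for Nomad's job struct with only the fields used here.
type Job struct {
	Name  string
	Kinds []string
}

// Simplified hook interfaces; the real signatures may differ.
type jobMutator interface {
	Mutate(*Job) (*Job, []error, error)
}

type jobValidator interface {
	Validate(*Job) ([]error, error)
}

// connectHook is a hypothetical hook that is both a mutator and a validator,
// the way jobConnectHook appears in both lists above.
type connectHook struct{}

func (connectHook) Mutate(j *Job) (*Job, []error, error) {
	j.Kinds = append(j.Kinds, "connect-proxy:"+j.Name)
	return j, nil, nil
}

func (connectHook) Validate(j *Job) ([]error, error) {
	if j.Name == "" {
		return nil, errors.New("job name is required")
	}
	return nil, nil
}

// admissionControllers runs every mutator, then every validator.
// On any error it returns a nil job, nil warnings, and the error.
func admissionControllers(j *Job, ms []jobMutator, vs []jobValidator) (*Job, []error, error) {
	var warnings []error
	for _, m := range ms {
		out, w, err := m.Mutate(j)
		if err != nil {
			return nil, nil, err
		}
		j = out
		warnings = append(warnings, w...)
	}
	for _, v := range vs {
		w, err := v.Validate(j)
		if err != nil {
			return nil, nil, err
		}
		warnings = append(warnings, w...)
	}
	return j, warnings, nil
}

func main() {
	hook := connectHook{}
	job, warnings, err := admissionControllers(&Job{Name: "web"}, []jobMutator{hook}, []jobValidator{hook})
	fmt.Println(job.Kinds, len(warnings), err)
}
```

Running all mutators before any validator means the validators always see the fully mutated job, including any Kinds applied along the way.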

Validate the provided token

At line 279, Register calls checkConsulToken with the result of args.Job.ConsulUsages() as its argument.

Build the ConsulUsages struct

ConsulUsages returns a map from a Consul namespace to job components that require Consul, including ConsulConnect, Task.Kinds, Services from groups and tasks, and a bool indicating if Consul KV is in use (determined by the presence of template stanzas).

In the case of OSS, the only map key will be the empty string "".

This key refers to a new ConsulUsage struct that contains the services collected from the group and task levels.

While scanning the tasks, if a Template is found, then KV is set to true. (code)

https://github.com/hashicorp/nomad/blob/9a8c68f6cdad6fa091cef68a6f4020953da6c3e5/nomad/job_endpoint.go#L279-L282

	// Enforce the job-submitter has a Consul token with necessary ACL permissions.
	if err := checkConsulToken(args.Job.ConsulUsages()); err != nil {
		return err
	}
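As a rough illustration of the map that ConsulUsages produces, here is a sketch under simplified assumptions. The ConsulUsage fields mirror the ones used by CheckPermissions (Services, Kinds, KV, Used), while Group, Task, and the consulUsages helper are hypothetical types and names invented for this example.

```go
package main

import "fmt"

// ConsulUsage is a simplified stand-in for the per-namespace usage struct.
type ConsulUsage struct {
	Services []string
	Kinds    []string
	KV       bool
}

// Used reports whether the job needs Consul at all in this namespace.
func (u *ConsulUsage) Used() bool {
	return len(u.Services) > 0 || len(u.Kinds) > 0 || u.KV
}

// Task and Group are hypothetical, trimmed-down job components.
type Task struct {
	Services    []string
	Kind        string
	HasTemplate bool
}

type Group struct {
	Namespace string // Consul namespace; always "" in OSS
	Services  []string
	Tasks     []Task
}

// consulUsages collects services, kinds, and KV use per Consul namespace.
func consulUsages(groups []Group) map[string]*ConsulUsage {
	usages := make(map[string]*ConsulUsage)
	for _, g := range groups {
		u, ok := usages[g.Namespace]
		if !ok {
			u = &ConsulUsage{}
			usages[g.Namespace] = u
		}
		u.Services = append(u.Services, g.Services...)
		for _, t := range g.Tasks {
			u.Services = append(u.Services, t.Services...)
			if t.Kind != "" {
				u.Kinds = append(u.Kinds, t.Kind)
			}
			if t.HasTemplate {
				u.KV = true // a template implies Consul KV access
			}
		}
	}
	return usages
}

func main() {
	groups := []Group{{
		Services: []string{"web"},
		Tasks:    []Task{{Services: []string{"metrics"}, Kind: "connect-proxy:web", HasTemplate: true}},
	}}
	for ns, u := range consulUsages(groups) {
		fmt.Printf("%q: %+v used=%v\n", ns, *u, u.Used())
	}
}
```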

Validate ConsulUsages against token

The checkConsulToken function takes the returned map[string]*ConsulUsage and validates it against the Consul token supplied in the Job (j).

Does the server require token?

If the servers are not configured to require authentication for Consul (that is, allow_unauthenticated is not set to false), Nomad considers the request authorized. (code)

Did the user present a sufficient token?

Next, Nomad ranges over the keys in the ConsulUsages map generated earlier. These correspond to the Consul namespaces that the bearer token must be able to access to create all of the objects specified in the Job. Again, for OSS, this will always be a single key: the empty string. (code for iteration)

For each usage in the map, Nomad calls the consulACLs.CheckPermissions function, passing the namespace, the usage, and the Job's ConsulToken.

(link to code for CheckPermissions function)

func (c *consulACLsAPI) CheckPermissions(ctx context.Context, namespace string, usage *structs.ConsulUsage, secretID string) error {
	// consul not used, nothing to check
	if !usage.Used() {
		return nil
	}

	// If namespace is not declared on nomad jobs, assume default consul namespace
	// when comparing with the consul ACL token. This maintains backwards compatibility
	// with existing connect jobs, which may already be authorized with Consul tokens.
	if namespace == "" {
		namespace = "default"
	}

	// lookup the token from consul
	token, readErr := c.readToken(ctx, secretID)
	if readErr != nil {
		return readErr
	}

	// if the token is a global-management token, it has unrestricted privileges
	if c.isManagementToken(token) {
		return nil
	}

	// if the token cannot possibly be used to act on objects in the desired
	// namespace, reject it immediately
	if err := namespaceCheck(namespace, token); err != nil {
		return err
	}

	// verify token has keystore read permission, if using template
	if usage.KV {
		allowable, err := c.canReadKeystore(namespace, token)
		if err != nil {
			return err
		} else if !allowable {
			return errors.New("insufficient Consul ACL permissions to use template")
		}
	}

	// verify token has service write permission for group+task services
	for _, service := range usage.Services {
		allowable, err := c.canWriteService(namespace, service, token)
		if err != nil {
			return err
		} else if !allowable {
			return errors.Errorf("insufficient Consul ACL permissions to write service %q", service)
		}
	}

	// verify token has service identity permission for connect services
	for _, kind := range usage.Kinds {
		service := kind.Value()
		allowable, err := c.canWriteService(namespace, service, token)
		if err != nil {
			return err
		} else if !allowable {
			return errors.Errorf("insufficient Consul ACL permissions to write Connect service %q", service)
		}
	}

	return nil
}

At this point, if everything is cool, the JobEndpoint is okay with the Consul information in the job and the provided token.

At line 324, the Register function clears the ConsulToken from the Job, since Nomad will use its own token to register services, access KV, and derive SI tokens for Consul Connect sidecars and gateways.

Then the job is upserted so that it can be scheduled.

From @apollo13:

@angrycub Ah sorry to hear; https://github.com/hashicorp/nomad/blob/main/nomad/structs/consul.go#L56-L88 should imo publish https://github.com/hashicorp/nomad/blob/77f6ecbbbffd9306abc05afe3e273c061145411f/nomad/structs/consul.go#L37 and not just Services/KV.

The code that should then verify the Kinds is at https://github.com/hashicorp/nomad/blob/77f6ecbbbffd9306abc05afe3e273c061145411f/nomad/consul.go#L254-L264

From shoenig:

https://github.com/hashicorp/nomad/blob/v1.1.0/command/agent/consul/service_client.go#L954 is where we apply the kind to the job based on jobspec characteristics. Nomad's currently configured to allow you to build a gateway for services that you can write to.
