apparentlymart/TFEphemeralResources.md Secret

# Ephemeral Resources

## Background

Terraform currently has two resource "modes":


- Managed resources are those where Terraform considers itself to "own" a
  corresponding external object indefinitely. Terraform is responsible for its
  full lifecycle, including creating it, applying updates to it, and possibly
  eventually destroying it.

  "Managed" is the primary resource mode, and so it's given the honor of
  being declared with blocks simply named `resource`. Managed resource objects
  are persisted between runs using information saved in the state.

- Data resources represent some existing object that Terraform will read
  data from without taking ownership of it. Data resources are declared using
  `data` blocks, and their only lifecycle action is "read".

  Data resources are retained in the state so that Terraform can determine when
  they have changed, but their data is always re-read on each operation. Until
  Terraform 0.13, Terraform did nothing useful with its ability to check for
  data resource changes, but in Terraform 0.13 we finally started showing
  changes since the last operation in the rendered diff to help users
  understand how data resource changes affected the configurations of managed
  resources.


What both of these resource modes have in common is that their results are
saved in a state snapshot after each terraform apply. This allows Terraform
to compare new configuration/state with prior state.

Because of how data resources were initially implemented, they ended up
acquiring another, unintended use-case: generating temporary values, such as
credentials, whose scope is intended to include only a single Terraform
operation. Because Terraform prior to 0.13 did not explicitly show changes to
data resources unless their reads were deferred until the apply step, this
other use-case appeared to mostly work, aside from there being no opportunity
to explicitly delete or otherwise release a temporary object created by a
data resource.

A specific example is the Vault provider: it initially used managed
resources to represent the creation of leases for secrets, which was a poor
fit because those values would then not be available until the apply step, and
so Vault-issued credentials could not be used to configure providers.

Data resources gave an opportunity to acquire a lease during Terraform's
refresh phase, but at the expense of violating one of the key assumptions about
data resources: that reading them does not have externally-visible side effects.
Each time a `vault_generic_secret` data resource is read, a new lease object
is created inside the target Vault cluster, and it is cleaned up only because
leases have an expiration process managed by the Vault server. To mitigate this,
the Vault provider intentionally issues itself a short-lived child
token to limit the effective lifetime of its leases. However, because of how
data resources participate in the plan/apply cycle, this in turn means that
the apply must happen very shortly after the plan was created in order to still
be working with a valid lease.

This issuing of credentials is one example of a use-case involving an ephemeral
object that ought to exist only for the duration of a single Terraform
operation, and should ideally be closed or deleted automatically at the end
of that operation. Another similar use-case that has been discussed is
temporary networking tunnels,
allowing Terraform to temporarily gain connectivity to a network it would not
naturally have access to. In this case, the temporary object is naturally
scoped to the system where Terraform is running (it's not a "remote object"
in the usual sense) but it does still have an "open"/"close" lifecycle where
ideally it should remain open for as short a time as possible.

Most of the existing use-cases for this sort of ephemeral object tend to
naturally relate to other ephemeral state inherent to Terraform. For example,
provider configurations are active only for a single operation, and tend to
consume credentials and network tunnels. Provisioners are active only briefly
during the creation of a managed resource, and similarly can consume credentials
and network tunnels. Due to the ephemeral nature of a time-limited credential
or a network tunnel, it doesn't make sense to use it as part of the
configuration of managed and data resources: their data outlives a single
operation, and so it would tend to be useless for them to refer to an ephemeral
object.

This document observes that "ephemeral objects" seem to be a real use-case,
distinct from the use-cases of managed and data resources, and proposes an
explicit representation of them in the Terraform language as a third resource
mode with its own lifecycle.

## Proposal

Our usual representation of "remote" objects (some of which are more remote than
others) is resources, and resource modes are how we recognize that not all
resources have the same core lifecycle. Therefore this document proposes to
add a new "ephemeral" resource mode, and an associated ephemeral block for
declaring such a resource:

```hcl
ephemeral "vault_generic_secret" "example" {
  path = "secret/rundeck_auth"
}

provider "rundeck" {
  url        = "http://rundeck.example.com/"
  auth_token = ephemeral.vault_generic_secret.example.data["auth_token"]
}
```

```hcl
ephemeral "ssh_tunnel" "example" {
  user        = "terraform"
  tunnel_host = "bastion.prod.example.com"
  remote_host = "consul.prod.example.com"
  remote_port = 443
}

provider "consul" {
  # local_address would be something like 127.0.0.1:45623, reflecting a
  # local port dynamically allocated for the SSH tunnel.
  address = ephemeral.ssh_tunnel.example.local_address
}
```

An ephemeral resource would have a lifecycle consisting of two actions, which
might be expressed as provider protocol operations as follows:


- `OpenEphemeral`: given an object representing the content of the resource's
  configuration block, create the ephemeral remote object and return an
  extended object that adds information about that ephemeral object, such as
  the SSH tunnel local address or credential information in the above examples.

  This is roughly analogous to "creating" a managed resource, but with a
  different verb to help distinguish it from the idea of creating some
  long-lived persistent object, by analogy with opening a local file, a
  socket, etc.

- `CloseEphemeral`: given an object returned by a previous call to
  `OpenEphemeral`, clean up any externally-visible state associated with the
  ephemeral object (e.g. explicitly end a Vault lease, or explicitly close
  an SSH tunnel listen socket). This operation returns nothing except a
  possible set of error or warning diagnostics.

  This is roughly analogous to "destroying" a managed resource, but with
  a different verb to help distinguish it from the idea of destroying some
  long-lived persistent object, and as the opposite of "open" above.


A significant difference for the ephemeral resource lifecycle compared to the
managed resource lifecycle is that the open and close operations will always
appear together in a graph: any graph walk that opens an ephemeral must also
close it, to limit the scope to that single walk. We'll explore more details
about the graph representation of ephemeral resources in a later section.
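
As a rough illustration of the two-action lifecycle described above, the following Python sketch models one ephemeral resource type. Python is used here only to keep the sketch short. The names `EphemeralResourceType`, `FakeSecretLease`, `open_ephemeral` and `close_ephemeral` are hypothetical and are not part of any real provider protocol or SDK, and the "lease" is an in-memory stand-in rather than anything Vault-specific.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Any, Protocol


class EphemeralResourceType(Protocol):
    """Hypothetical provider-side contract for one ephemeral resource type."""

    def open_ephemeral(self, config: dict[str, Any]) -> dict[str, Any]: ...

    def close_ephemeral(self, obj: dict[str, Any]) -> list[str]: ...


@dataclass
class FakeSecretLease:
    """Toy lease-issuing type used only for illustration (not real Vault)."""

    active_leases: set[str] = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1))

    def open_ephemeral(self, config: dict[str, Any]) -> dict[str, Any]:
        lease_id = f"lease-{next(self._ids)}"
        self.active_leases.add(lease_id)
        # Return the configuration plus additional "computed" attributes.
        return {
            **config,
            "lease_id": lease_id,
            "data": {"auth_token": f"token-for-{lease_id}"},
        }

    def close_ephemeral(self, obj: dict[str, Any]) -> list[str]:
        # Returns only diagnostics; an empty list means a clean close.
        if obj["lease_id"] not in self.active_leases:
            return [f"lease {obj['lease_id']} is not active"]
        self.active_leases.remove(obj["lease_id"])
        return []
```

In this sketch the object returned by `open_ephemeral` is the only thing passed back to `close_ephemeral`, which mirrors the idea that nothing about the ephemeral object needs to be stored anywhere else between the two calls.
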

## Restrictions for Ephemeral Resource Configuration

The lifecycle of an ephemeral resource is similar to that of a provider
configuration: each one is re-created for each walk and then destroyed before
the end of that walk. We've not historically imposed any explicit restrictions
on what objects can be referred to in provider configurations, but in
retrospect we've seen that we should have: provider configurations cannot
feasibly make use of values that are determined only after apply, because we
need to configure providers even for planning.

Learning from that historical error, I propose that an initial implementation
of ephemeral resources impose the following restrictions, checked during
or after graph construction and before graph walking:


- Ephemeral resources may derive their values only from other ephemeral
  resources, either directly or indirectly. That is, an ephemeral resource
  could refer to an instance of another ephemeral resource, or it could refer
  to a named value that is derived only from other ephemeral resources, but it
  may not refer to a managed or data resource, nor may it refer to a named
  value derived from one.

- Outputs from ephemeral resources may not be used in either managed or data
  resource configurations, because the results of managed and data resources
  outlive a single walk, and so any ephemeral value they captured would become
  invalid as soon as that walk ends. Again, this rule applies indirectly
  too: a named value derived from an ephemeral resource may not be used in
  a managed or data resource.

- Provider configurations and provisioner configurations (including their
  associated `connection` blocks) may refer to ephemeral resources, either
  directly or indirectly.
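
The first two rules above amount to a reachability check over references. The following Python sketch shows one way such a check might look, under simplifying assumptions: the `Node` type, the kind constants and both functions are hypothetical, provisioners are not modelled, and provider configurations are left unrestricted as they are today.

```python
from dataclasses import dataclass, field

# Hypothetical node kinds, for this sketch only.
EPHEMERAL, MANAGED, DATA, NAMED_VALUE, PROVIDER = (
    "ephemeral", "managed", "data", "named_value", "provider",
)


@dataclass
class Node:
    addr: str
    kind: str
    refs: list = field(default_factory=list)  # addresses this node refers to


def source_kinds(addr, nodes, seen=None):
    """Return the resource kinds a value derives from, looking through named values."""
    seen = set() if seen is None else seen
    if addr in seen:
        return set()
    seen.add(addr)
    node = nodes[addr]
    if node.kind != NAMED_VALUE:
        return {node.kind}
    kinds = set()
    for ref in node.refs:
        kinds |= source_kinds(ref, nodes, seen)
    return kinds


def check_references(nodes):
    """Return one error message per reference that breaks the proposed rules."""
    errors = []
    for node in nodes.values():
        for ref in node.refs:
            derived = source_kinds(ref, nodes)
            if node.kind == EPHEMERAL and derived & {MANAGED, DATA}:
                errors.append(f"{node.addr}: ephemeral resource may not depend on {ref}")
            elif node.kind in (MANAGED, DATA) and EPHEMERAL in derived:
                errors.append(f"{node.addr}: may not use ephemeral value {ref}")
    return errors
```

Because `source_kinds` follows named values transitively, a managed resource that reaches an ephemeral resource only through a local value is still rejected, which is the "indirectly" part of the rules.
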


We may be able to relax some of these restrictions if we later implement
something like
the Partial Apply proposal,
e.g. by allowing managed resources to be used as part of the configuration of
an ephemeral resource but deferring it and anything that depends on it until
a subsequent plan/apply if the managed resource is not yet created. Being
restrictive in the initial implementation will give the greatest freedom to
selectively loosen those restrictions as Terraform's other capabilities change.

## Interaction with Terraform State

Because the full lifecycle of an ephemeral resource is completed separately
during each walk, there is no need to persist any record of it in saved
state snapshots. Instead, the ephemeral resource state will exist only briefly
in memory during its open window.

For ephemeral resources that issue credentials, this creates a significant
advantage over the existing "abuse" of data resources: the temporary
credentials will exist only in memory within the Terraform Core and provider
processes, and never be written out in a state snapshot.

This does not fully address the "sensitive values" class of problems -- there
are still use-cases around resources that generate persistent secrets like
private keys associated with TLS certificates -- but implementing ephemeral
resources would likely take some of the heat out of user feedback about
sensitive values by addressing a common subset of that problem space.

## Graph Construction with Ephemeral Resources

Since the primary use-cases for ephemeral resources are in management of
objects that are in some sense sensitive -- credentials directly, or privileged
access to a remote network derived from some credentials -- our aim would be
to avoid opening them at all when possible and, when we do need to open them,
to keep the window of time they are open as short as possible.

With that in mind, and considering the restrictions on references described
earlier, the additional graph construction behaviors for ephemeral
resources would be:


- For each ephemeral resource, check to see if there is at least one valid
  reference to it from a provider configuration that will be opened in this
  operation (i.e. that has at least one associated resource in the graph) or,
  during the apply phase only, from a provisioner associated with a managed
  resource that is planned for creation or destruction. If not, create no
  additional objects and halt further processing for that ephemeral resource.

- For each provider configuration that makes use of a given ephemeral resource,
  locate the provider configuration's open and close graph nodes. The open node
  for the provider configuration depends on the open node for the ephemeral
  resource. The close node for the ephemeral resource depends on the close node
  for the provider. Or, diagrammatically:

  ```
     ephemeral.vault_generic_secret.example (open)
                           ⇧
               provider.rundeck (open)
                           ⇧
            rundeck_job.example (any action)
                           ⇧
              provider.rundeck (close)
                           ⇧
     ephemeral.vault_generic_secret.example (close)
  ```

- During the apply phase only, for each provisioner associated with a managed
  resource planned for creation or destruction whose configuration
  refers to an ephemeral resource, locate the create and/or destroy node for
  the managed resource and mark it as dependent on the open node for
  the ephemeral resource, and mark the close node for the ephemeral resource
  as dependent on the managed resource node. Or, diagrammatically:

  ```
     ephemeral.vault_generic_secret.example (open)
                           ⇧
          rundeck_job.example (create/destroy)
                           ⇧
     ephemeral.vault_generic_secret.example (close)
  ```

  (Only create-time provisioners need to be considered for managed resources
  planned for creation, and only destroy-time provisioners for those planned
  for destruction.)
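
The first two behaviors above can be sketched with Python's standard `graphlib` module. This is only an illustration of the dependency edges, not of how Terraform Core builds its graphs: the function name and the string node labels are hypothetical, and the provisioner case is omitted for brevity.

```python
from graphlib import TopologicalSorter


def build_walk_order(ephemeral_addr, provider_addr, resource_actions):
    """Return an execution order for one provider that consumes one ephemeral resource.

    resource_actions lists the resource nodes handled by the provider in this
    walk. If it is empty the provider is never opened, so the ephemeral
    resource is left out of the graph entirely.
    """
    if not resource_actions:
        return []

    eph_open = f"{ephemeral_addr} (open)"
    eph_close = f"{ephemeral_addr} (close)"
    prov_open = f"{provider_addr} (open)"
    prov_close = f"{provider_addr} (close)"

    graph = TopologicalSorter()
    graph.add(eph_open)
    graph.add(prov_open, eph_open)     # provider open depends on ephemeral open
    for action in resource_actions:
        graph.add(action, prov_open)   # resource actions need the open provider
        graph.add(prov_close, action)  # provider closes after all of its resources
    graph.add(eph_close, prov_close)   # ephemeral close depends on provider close
    return list(graph.static_order())
```

With a single resource action the resulting order is the same chain as in the first diagram, read from top to bottom. With several resource actions their relative order is unconstrained, but all of them still fall between the provider's open and close nodes.
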


In addition to the above behaviors, ephemeral resources must also behave like
provider configurations in that they must be forcefully closed even if an
error occurs before their "close" node is reached during graph traversal. The
only cases where an ephemeral remote object should persist after a graph walk
is completed are if the `CloseEphemeral` operation itself fails (the
provider's own responsibility) or if Terraform encounters a panic condition.
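
A minimal Python sketch of this "always close" guarantee follows, using `contextlib.ExitStack` so that the close is registered as soon as the open succeeds. `RecordingTunnel`, `failing_step` and `run_walk` are hypothetical names for illustration, and the hard-coded local address is a placeholder rather than a real allocated port.

```python
from contextlib import ExitStack


class RecordingTunnel:
    """Toy ephemeral type that records its lifecycle events."""

    def __init__(self):
        self.events = []

    def open_ephemeral(self, config):
        self.events.append("open")
        return {**config, "local_address": "127.0.0.1:45623"}

    def close_ephemeral(self, obj):
        self.events.append("close")
        return []  # no diagnostics


def failing_step(obj):
    """Stands in for any graph node that fails while the tunnel is open."""
    raise RuntimeError("provider request failed")


def run_walk(ephemeral_type, config, steps):
    """Open an ephemeral object, run the steps, and always close it afterwards."""
    diagnostics = []
    with ExitStack() as stack:
        obj = ephemeral_type.open_ephemeral(config)
        # Registered immediately, so it runs even if a later step raises.
        stack.callback(
            lambda: diagnostics.extend(ephemeral_type.close_ephemeral(obj))
        )
        for step in steps:
            step(obj)
    return diagnostics
```

If several ephemeral objects were registered on the same stack they would be closed in reverse order of opening, which matches the nesting shown in the diagrams above.
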

## Interactions with the Plan/Apply flow

The key distinguishing factor for ephemeral resources is that they are
processed in exactly the same way for all walk types. The only differences are
a result of the interactions with other objects in the graph: ephemeral
resources would never appear in a validate graph, for example, because in
practice such a graph doesn't contain any provider configurations nor managed
resource create/destroy actions.

As a follow-on consequence of that, ephemeral resources do not explicitly
participate in the plan/apply flow: there will never be an entry in a generated
plan representing opening or closing an ephemeral resource. Instead, the
ephemeral resource behaviors are an implied side-effect of all other operations,
and no information about an ephemeral resource opened and closed during plan
is available during a subsequent apply. The apply walk will open and then later
close any necessary ephemeral resources itself.

This addresses the problem of ephemeral credentials generated during plan
becoming unavailable before the plan is applied: the apply phase will instead
issue its own credentials, entirely separate from those issued during the
plan phase.
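
The separation between the two walks can be illustrated with a small Python sketch. All names here are hypothetical, the JSON list is only a stand-in for a saved plan, and each walk returns the lease it used purely so that the behavior can be inspected.

```python
import json
from itertools import count

_lease_ids = count(1)


def open_credentials():
    """Stand-in for OpenEphemeral: every call issues a fresh, distinct lease."""
    return f"lease-{next(_lease_ids)}"


def plan_walk(resource_addrs):
    """Return (serialized plan, lease used); the plan records only resource changes."""
    lease = open_credentials()  # opened for this walk only
    plan = [{"addr": addr, "action": "create"} for addr in resource_addrs]
    return json.dumps(plan), lease


def apply_walk(serialized_plan):
    """Return (applied addresses, lease used); opens its own credentials."""
    lease = open_credentials()  # nothing is inherited from the plan walk
    applied = [change["addr"] for change in json.loads(serialized_plan)]
    return applied, lease
```

The serialized plan never mentions the lease, so the apply walk has no choice but to obtain its own, however much time has passed since planning.
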

## Provider SDK Representation

I'll leave the finer details of Provider SDK Representation for the SDK team
to define, but I want to note a few things related to it.

Firstly, from a provider protocol standpoint Terraform Core will consider an
ephemeral resource type to be totally distinct from a managed or data resource
type of the same name. This continues the precedent that e.g. a managed
resource type `aws_vpc` is not connected in any technical way to a data
resource type `aws_vpc`, and instead the relationship between them is a UX
concern managed by provider developers.

The SDK may in practice be designed to allow sharing implementation between
resource types with the same name but of different modes. From Terraform Core's
perspective, that would be an implementation detail of the SDK. I'd recommend
caution about making such sharing of implementation the default behavior,
because e.g. it would be confusing if a `data "ssh_tunnel" "example"` block were
to be treated as valid, create an SSH tunnel process during its read, and then
totally lose track of that process and not formally clean it up.

The design of ephemeral resources does intentionally have various things in
common with other resource modes, though. For example, configuration is
represented as an object, and the open action augments that object with
additional "computed" attribute values, much as we see for both
managed and data resources. In principle then, the same abstractions used to
represent config-in-state-out transformations for managed and data resource
types should be adaptable to ephemeral resource types too.