CephFS-CSI Integration: Supported features and gaps

CephFS integration with CSI

This document aims to list, at a high level, CSI [1] requirements and how these are met by existing CephFS features, or are features that need to be developed or improved in CephFS. It also covers features that are needed in ceph-csi, to close the gap between RBD and CephFS integrations.

NOTE: This is intended to serve as a running document, and will be modified as the CSI specification and its implementation in ceph-csi [2] evolve over time, or as more insight is gleaned into the existing requirements.

Summary

This section captures the gaps and potential future requirements (and associated trackers) based on the analysis in the subsequent sections. For non-CSI implementors, reading through this section should be sufficient to understand the features and gaps.

The subsequent sections, starting from Feature classification, should be of interest to CSI implementors, and provide more detailed background for the requirements summarized here.

  1. Snapshots and clones

    • MANDATORY CephFS snapshot should be independent of the subvolume being snapshotted, and support parent volume deletes
      • The desired independence is only from the perspective of volume deletes when a snapshot of the volume exists, and the ability to clone a snapshot when the parent volume is deleted.
      • [TODO] Investigate if this is already the case, and if the parent volume is moved to "trash" and garbage collected once dependent snapshots are deleted.
        • There is a caveat noted in the subvolume operations when deleting (subvolume rm) a volume as follows, "The removal of a subvolume fails if it has snapshots, or is non-existent."
      • RBD provides the said solution by always creating a clone for a snapshot request, rather than using RBD snapshots as the representation for the clone
      • Tracker:
        • None
    • MANDATORY CephFS should support an interface that can fetch metadata regarding created snapshots
      • Snapshot requests may be retried/replayed; to respond with the right metadata about snapshots that are already created, an interface like ceph fs subvolume info for snapshots is required
      • Support for the same already exists and is integrated into ceph-csi for RBD
      • Tracker:
        • None
    • DESIRABLE CephFS snapshot protection prior to cloning should be handled by Ceph
      • CephFS snapshot requires protection prior to cloning. This workflow has been revised with RBD, where the snapshot being cloned is protected internally and further can be deleted (ends up in "trash") and its subsequent garbage collection is deferred.
      • The requirement is to make it simpler for ceph-csi to use a similar workflow to RBD for the purpose of cloning.
      • NOTE: This requirement is not a prerequisite for clone operation to be implemented in ceph-csi. Ceph-csi can leverage the existing workflow, and further a CSI snapshot undergoing cloning may not be deleted as per the CSI protocol, as it is in use.
      • Tracker:
    • REJECTED CephFS should provide an interface to clone a volume from a volume, rather than ceph-csi having to snapshot and clone the volume as a 2 step operation.
      • This is a requirement that simplifies ceph-csi implementation
      • Initial doubts were about whether the source volume should be snapshotted by CephFS prior to cloning, or whether this is not required. Further investigation and discussion on enforcing that the source volume is not in use has been clarified by the k8s community: it is the plugin's and user's responsibility to ensure data consistency of the source volume.
      • The workflow hence is to snapshot and then clone, in line with the workflow for an explicit snapshot followed by a clone of the snapshot
      • RBD also had a similar requirement and it stands rejected currently
      • Tracker:
    • FUTURE CephFS clones are full copies; hence, to back up a volume, any backup operation would copy the volume content as part of a clone operation and subsequently copy the created volume contents to a backup store. This makes it a double copy operation, and would be inefficient.
      • CSI protocol also does not support any means to use a snapshot of a volume more directly, say as a read only mount for such purposes
      • [TODO] Investigate options where a clone request can carry annotations that enable ceph-csi to create a lightweight clone of the snapshot, rather than a full clone
      • Tracker:
        • None
    • FUTURE CephFS should provide a mechanism to freeze/unfreeze a mounted volume, like fsfreeze
      • CSI protocol may be enhanced to support better data consistency while taking snapshots. A newer request being discussed is the Freeze/Unfreeze request, that needs support by CephFS
      • Tracker:
        • None
    • FUTURE CephFS should have the ability to generate a snapshot delta between 2 given snapshots, to enable backup vendors or data transfer agents to optimize local filesystem inspection for changed data and to lower data transfer across networks
      • Backup and data protection vendors desire the ability to take incremental or differential backups
      • The typical means to achieve this with a filesystem would be to inspect the last modified time stamps of all inodes and take appropriate actions
      • Instead of the typical mechanism of crawling the file system and backing up the full contents of every file, it would be desirable to have a delta list between 2 snapshots that can help optimize this operation
      • NOTE: This requirement should be evaluated based on interest and also based on how this would be exposed in the future
      • Tracker:
        • None
  2. Choice of mounter

    • FUTURE If CephFS subvolumes are to be mounted via FUSE, rather than using the kernel mounter, ceph-csi needs to solve the problem of maintaining existing mounts across node service restarts
      • NOTE: This is a ceph-csi requirement and not a CephFS requirement, but it may be driven by a CephFS need to use the FUSE mounter
      • NOTE: RBD also requires similar ceph-csi handling if the direction is to use the rbd-nbd driver in the future
      • Tracker:
        • None
  3. Topology based provisioning support

    • MANDATORY ceph-csi should integrate with ceph fs subvolume info to support topology based provisioning
      • This is a ceph-csi gap, but noted here as it was dependent on the CephFS tracker which is now complete
      • Topology based provisioning for RBD is already supported in ceph-csi
      • Tracker:
    • FUTURE Enhance ceph-csi topology support leveraging multi-mds and subtree pinning features of CephFS
      • This is a ceph-csi gap, but noted here to clarify if this is feasible before creating the required trackers
      • Currently, topology based support chooses a datapool for a subvolume closer to where the workload would be scheduled. This makes all MDS operations potentially cross topology boundaries. To be strictly topology constrained, if multi-MDS and subtree pinning could be leveraged, an MDS closer to the node where the workload is scheduled can be pinned to the subvolume.
      • RBD does not have a separate MDS as such, and hence is fully topology constrained as it stands
      • Tracker:
        • None
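    • A minimal sketch of what topology constrained provisioning could look like at the CLI level, assuming a filesystem named cephfs, a subvolume group csi and a zone local data pool cephfs-data-zone-a (all names hypothetical):

        # create the subvolume on the data pool local to the chosen topology segment
        ceph fs subvolume create cephfs csi-vol-0001 --size 1073741824 \
            --group_name csi --pool_layout cephfs-data-zone-a

        # the info output reports, among other fields, the data_pool backing the
        # subvolume, which ceph-csi can map back to a topology segment
        ceph fs subvolume info cephfs csi-vol-0001 --group_name csi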
  4. Mirroring for DR or related use cases

    • FUTURE CephFS should provide a mirroring solution at a subvolume granularity, such that a standby DR site can take over operations in case of disaster
      • This requirement is outside the scope of ceph-csi, although CSI should be able to respond back with the same CephFS mirrored volume on the DR site when provisioning and life cycle requests are made
      • RBD has a mirroring feature available, but it is yet to be integrated into ceph-csi
      • Tracker:
        • None
  5. Encryption

    • FUTURE CephFS should provide for a client side per subvolume encryption feature
      • This feature can then be leveraged by ceph-csi and any KMS on the client side, to provide per volume encryption
      • RBD has LUKS based encryption support that is already leveraged by ceph-csi
      • Tracker:
        • None
  6. Compression

    • FUTURE CephFS should support IO hints to the OSD on data compression
      • IO hints are supported in Ceph that help avoid or force compression of specific IO requests. These hints seem absent in CephFS clients, and may need consideration
      • This does not impact ceph-csi directly, but is a requirement that is satisfied (but not integrated into ceph-csi) by RBD and hence may need consideration
      • Tracker:
        • None
    • FUTURE CephFS should support in flight compression to reduce network bandwidth usage
      • This is useful in cloud environments where cross topology traffic is charged, hence any optimization at the cost of relative performance is desired
      • The assumption is that the messenger protocol would be enhanced to compress data, hence this is not strictly a CephFS concern
      • Tracker:
  7. go-ceph integration for all CephFS CLIs

    • MANDATORY All CephFS CLIs and interfaces used by ceph-csi should have an equivalent go-ceph API binding
      • go-ceph based API invocations are resource friendly, and are the forward-looking path for ceph-csi
      • This requirement is more of a go-ceph repository requirement and is being handled by its maintainers, but is noted here for completeness
      • Tracker:
        • None
  8. Miscellaneous

    • FUTURE CephFS should be able to return a list of nodes where a volume is mounted
      • Certain CSI requests, like ListVolumes and ControllerGetVolume (alpha), require an OPTIONAL response regarding where the volume is mounted
      • Currently both requests are not supported by ceph-csi, owing to the latter being in alpha state, and the former (ListVolumes) not providing required secrets for operating against a ceph cluster
      • Tracker:
        • None
    • FUTURE CephFS subvolume list should provide info for listed subvolumes as an optimization, as otherwise a list needs to be followed up with an info call for each listed item
      • Ideally both ListSnapshots and ListVolumes also need extra information regarding each volume or snapshot that is covered in ceph fs subvolume info; as a future optimization it may help to have an ls -l equivalent that returns the extra metadata for each item listed.
      • Tracker:
        • None
    • FUTURE CephFS client mounts need a mechanism to detect if a mount is healthy
      • Certain CSI requests, ListVolumes, NodeGetVolumeStats and ControllerGetVolume (alpha), have an OPTIONAL response field indicating the health of the volume
      • Currently only NodeGetVolumeStats is supported by ceph-csi, and it does not return this field as it is OPTIONAL
      • NOTE: Determining the health of the mount may be a ceph-csi concern and not a CephFS concern, but it is noted here in case additional CephFS support is required
      • Tracker:
        • None
    • FUTURE Multi-tenant noisy neighbor prevention [TODO]
      • There could be additional requirements in this regard, dealing with fairness across different workloads (tenants?) using the same storage provider.
    • UNKNOWN CephFS should have the ability to fence stale clients
      • CephFS would need the ability to fence or disregard stale mounts and possibly blacklist them, to prevent inadvertent modifications from the stale client
      • Stale clients are typically a concern when a volume is meant to be used read-write by a single node only
      • This is more of a kubernetes and CSI environment requirement, and needs clarification in the specifications, but is noted here for completeness and for any CephFS constructs that may need to be provided
      • Tracker:

Feature classification

Features are classified into,

  • Eco-system considerations for Container Orchestrator (CO, typically kubernetes) and CSI deployments
  • CSI requests (gRPCs)
  • Storage backend features that can be exposed to CO environments

The first set covers features needed due to various environmental factors of CSI and COs. The second set covers the various CSI calls and their resulting requirements, and the last set covers storage backend features that can be exposed to the CO (e.g. data compression, encryption).

Eco-system considerations for CO and CSI deployments

  1. Ability for a CO to perform storage volume life cycle management on CephFS

    • Storage lifecycle would include create/mount/snapshot/clone/delete/resize and related operations
    • This is primarily supported by the ceph fs subvolume interface provided by CephFS [3]
    • All current lifecycle operations are covered in the CSI requests section
  2. Ability for multiple CO instances to use the same instance of CephFS

    • This is provided by CephFS using the ceph fs subvolumegroup interface, and helps isolate the various instances based on the subvolumegroup name
    • Fine grained cephx IDs and keys that restrict access to the created subvolumegroup provide the required authentication/access isolation
    • GAP: This is already supported in ceph-csi for CephFS; there is a reverse gap in RBD which is being addressed here
  3. Ability for a single CO instance to achieve logical isolation of volumes created using the same CephFS instance

    • This can be looked at as a sub-part of the previous requirement, except this is from the same CO instance
    • The solution remains the same, i.e to use the ceph fs subvolumegroup interface
    • The caveat is to ensure that the volume ID/name per volume is unique across the entire CO instance. This is not a CephFS concern, but something to be aware of, that the CSI plugin will ensure
    • FUTURE: There could be additional requirements in this regard, dealing with fairness across different workloads (tenants?) using the same storage provider.
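    • A minimal sketch of the subvolumegroup based isolation described above, assuming a filesystem named cephfs and a per CO instance (or tenant) group and cephx identity (names hypothetical):

        # one subvolume group per CO instance or tenant
        ceph fs subvolumegroup create cephfs csi-cluster-a

        # a cephx identity restricted to that group's directory tree
        # (subvolumes are created under /volumes/<group_name> by default)
        ceph fs authorize cephfs client.csi-cluster-a /volumes/csi-cluster-a rw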
  4. CSI nodeplugin restarts

    • CSI nodeplugin operates on every node that requires a mount of a volume
    • For upgrade or related reasons, the nodeplugin service may be restarted, and existing mounts can become stale if FUSE is used as the mounter
    • Hence the default or suggested mounter to use is the kernel CephFS mounter
    • NOTE: If there are strong reasons to use the FUSE mounter, the stated use case of node plugin restarts would need to be handled
  5. Fencing

    • Mostly to deal with volumes that are expected to be mounted on a single node for read/write cases
    • GAP: CephFS would need the ability to fence or disregard stale mounts and possibly blacklist them, to prevent inadvertent modifications from the stale client
    • This is currently being debated in this issue at the CO/CSI layers
  6. Scale [TODO]

    • To note down typical scaling parameters, number of volumes/snapshots, sizes, active/passive IO rates
    • Most of this may just end up as guesses that will change over time and across users
  7. CSI secrets

    • CSI requests are made using secrets that pertain to the operation context. These secrets, for example, are mentioned in the StorageClass, a k8s construct that specifies parameters for the request and also which driver would handle requests made using the StorageClass. Dynamic volume requests are made referencing a StorageClass, which is then used to pass the secrets and parameters on to the respective storage plugin CSI request
    • In the case of ceph, secrets contain cephx ID and Key
    • Not all CSI requests contain the secrets field; as a result, a subset of requests that require secrets to operate against the Ceph cluster are not supported by ceph-csi
    • FUTURE: There is an open proposal to move secrets from the CSI requests to the CSI controller and node services instead. This helps serve requests that do not pass any secrets, and is the accepted pattern by the CSI, k8s community and other storage vendors.
      • Some requests that stand unsupported due to the CSI secrets constraint hence may need to be supported if the secrets are moved into the CSI plugin instance
  8. Minimum Ceph versions across the client and storage servers

    • [TODO] Based on the subvolume group of interfaces, and newly added interfaces for CSI support, it would help to list out what minimum versions of Ceph and clients are required to support the stack.

CSI requests

CSI requests (or, gRPC calls) are separated into 2 categories,

  • Controller services (controller plugin)
  • Node services (node plugin)

The controller service is responsible for most of the CSI volume lifecycle management activities. The node service is responsible for mounting and managing "use" of the created volumes. The node service does not participate in the IO path, and is only responsible for setting up the volume for access by the workload.

Controller services

  1. CreateVolume

    • Creates a volume of a given size and capability (capabilities include access and type, as detailed further below)
    • Primarily relies on CephFS subvolume series of commands to create the required volume
    • There are 2 other variants for create volume, which are based on a VolumeContentSource field in the create request as follows,
      • Create a volume from a snapshot of another volume (VolumeContentSource is a CephFS snapshot)
      • Create a volume from another volume (clone) (VolumeContentSource is another CephFS volume)
        • The suggested workflow for ceph-csi is to create a snapshot and clone the volume from the same, thereby reusing existing ceph fs subvolume snapshot and clone operations (a sketch follows at the end of this item)
        • From a user perspective, it may be prudent to instead create a non-ephemeral snapshot of the volume to clone from, and reuse the clone-from-snapshot version of CreateVolume instead, for efficiency reasons
        • NOTE: There is a ticket open to support clone inherently via the Ceph CLI, but this is not a blocker for the feature implementation in ceph-csi
    • NOTE: It is feasible that other forms of VolumeContentSource requests may be standardized in the future
    • Volumes have an attribute of access mode that defines how the volume is intended to be accessed, these being,
      • Single node reader
      • Single node reader/writer
      • Multi node reader
      • Multi node reader/writer
      • Multi node reader, single node writer
      • The one unknown is how "multi node reader, single node writer" is required to be implemented, and who owns the restriction that there is only ever a single writer
        • kubernetes does not support this mode as of this writing, and as a result for now this is not a requirement that needs addressing in this environment, nor can further details be elaborated on whose responsibility this would be eventually
      • The rest of the access modes can be supported by CephFS, as actually ensuring that the volume is read only or writeable is left to the CO and CSI, based on how it needs to be mounted (e.g ro/rw)
    • Volumes are also classified as Block or Mount volume types, and with CephFS the type is always going to be "Mount". This is again controlled and filtered by CSI and not a CephFS concern
    • [TODO] Topology based provisioning constraints
      • Short note, CephFS has everything in place to use a single non-topology constrained MDS and different topology constrained data pools to provision topology constrained volumes
      • FUTURE: With the MDS subtree pinning feature, and multi-MDS support on the horizon(?), the topology constraints can be extended to the MDS as well
    • GAP: There is no way to use created CSI snapshots of a volume directly as per CSI protocol. IOW, a snapshot cannot be mounted read only for activities such as backing up the snapshot contents or replicating the same across storage clusters
      • The usage hence is to create (clone) a volume from a snapshot as its VolumeContentSource before use
      • This makes the current CephFS clone operation, used to gain access to data in the snapshot, more heavyweight for use cases such as backup
    • GAP: Currently a clone from a snapshot for CephFS is a full copy, hence the time to create such volumes is indeterminate (depends on the amount of data and metadata to copy)
      • This can hence make the CreateVolume CSI call unresponsive; alternatives to avoid this hang are discussed and noted here
    • GAP: RBD snapshots no longer need to be protected during cloning, and further are automatically placed in the trash when such snapshots that are being cloned are deleted. This makes it a desirable feature in CephFS as well, to avoid the explicit protection requirements. The gap is not a MUST address gap, but more to align the workflow across RBD and CephFS
    • GAP: To support request retries, if a volume being created already exists, the CSI plugin would need to read its attributes and respond with the corresponding size, time and relevant metadata for the request. This is satisfied using the ceph fs subvolume info interface that is provided. ceph-csi is yet to integrate with this to provide the required correctness in such scenarios.
    • FUTURE Ability to restore/clone a snapshot or volume to a pool different from the source pool. This already stands supported with CephFS subvolume clones, as the full copy of the filesystem can be cloned to a different data pool layout as desired.
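    • A rough sketch of the snapshot-then-clone workflow suggested above, using the existing CLI and assuming a filesystem named cephfs, a group csi and hypothetical volume/snapshot names:

        # snapshot the source subvolume and protect it prior to cloning
        ceph fs subvolume snapshot create cephfs csi-vol-src snap-0001 --group_name csi
        ceph fs subvolume snapshot protect cephfs csi-vol-src snap-0001 --group_name csi

        # start the (full copy) clone into a new subvolume
        ceph fs subvolume snapshot clone cephfs csi-vol-src snap-0001 csi-vol-clone \
            --group_name csi --target_group_name csi

        # the clone is asynchronous; poll until the reported state is "complete"
        ceph fs clone status cephfs csi-vol-clone --group_name csi

        # once cloned, an ephemeral snapshot can be unprotected and removed
        ceph fs subvolume snapshot unprotect cephfs csi-vol-src snap-0001 --group_name csi
        ceph fs subvolume snapshot rm cephfs csi-vol-src snap-0001 --group_name csi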
  2. DeleteVolume

    • Deletes a volume that was created using the CreateVolume request
    • Deletion should be independent of existing snapshots for the volume, IOW it should be possible to delete a volume that has existing snapshots, and further these snapshots could still be used, in the future, to create a new volume by cloning the same
    • GAP: There is a caveat noted in the subvolume operations when deleting (subvolume rm) a volume as follows, "The removal of a subvolume fails if it has snapshots, or is non-existent.".
      • Need to understand if the subvolume would remain in trash, and hence not a concern from the CSI request perspective (i.e DeleteVolume will be a success, but volume is not deleted in CephFS and remains in trash), or the CSI deletion request itself would fail till all snapshots are deleted.
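    • The caveat above is visible directly at the CLI level; with the current behaviour the snapshots have to be removed before the subvolume itself (hypothetical names):

        # fails while snapshots of the subvolume exist
        ceph fs subvolume rm cephfs csi-vol-0001 --group_name csi

        # snapshots need to be listed and removed first
        ceph fs subvolume snapshot ls cephfs csi-vol-0001 --group_name csi
        ceph fs subvolume snapshot rm cephfs csi-vol-0001 snap-0001 --group_name csi
        ceph fs subvolume rm cephfs csi-vol-0001 --group_name csi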
  3. Controller[Publish/Unpublish]Volume

    • These are not implemented by ceph-csi
    • These are to control which node a volume can be published to (i.e mounted and used), and serves as a check with the controller service before requesting the node service to mount the same
    • NOTE: There were inclinations to use this where fencing is required, but the way forward is not yet designed
  4. ValidateVolumeCapabilities

    • Given a CSI volume, returns various volume access modes and access types
    • Involves ability to inspect CephFS subvolume for attributes of interest, typically size (quota), time stamps, data pool parameter
    • Currently ceph-csi does not call into CephFS to validate any fields, and only checks if the newly requested capabilities do not include "Block"
    • FUTURE: If in the future ceph-csi needs to call into CephFS for the subvolume info, the ceph fs subvolume info interface would address the requirement
  5. ListVolumes

    • It has been deliberated and decided not to support this RPC via ceph-csi
    • This request also has the issue of not carrying the CSI secrets in its request
    • Further this request does not carry the provisioning parameters of interest, that help narrow down where to list volumes from (e.g pool, subvolume groupname)
    • In the future if this is required, CephFS subvolume group of commands have the subvolume ls CLI that ceph-csi can leverage to provide the required data
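    • If support is added in the future, a listing scoped to the group used by a ceph-csi instance could look as follows (hypothetical names):

        # list subvolumes in the group used by this ceph-csi instance
        ceph fs subvolume ls cephfs --group_name csi

        # each listed item currently needs a follow-up info call for its metadata
        ceph fs subvolume info cephfs csi-vol-0001 --group_name csi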
  6. ControllerGetVolume (Alpha)

    • This is a CSI alpha feature, meaning it is completely experimental and may not be made part of the specification
    • Intention is to return if a volume is healthy (VolumeCondition field) and also list of nodes the volume is published on (VolumeStatus field)
    • This is currently not implemented in ceph-csi
    • This request also does not carry the CSI secrets
    • [TODO] Need to list out how this can be achieved with CephFS, when it requires support in ceph-csi
    • NOTE: VolumeStatus is also returned in the ListVolumes request, which is unsupported in ceph-csi
      • For RBD, a to-be-investigated option could be to leverage the watcher output to return a list of nodes currently using the volume
      • Unsure if CephFS also maintains watchers, and if this is a fairly reliable source of truth
    • NOTE: VolumeCondition is also returned in ListVolumes (unsupported), and NodeGetVolumeStats requests. The latter, in ceph-csi, currently does not return a VolumeCondition as the field is optional in the response
  7. GetCapacity

    • The intention of this request is to return the available capacity of the storage backend, given the parameters of provisioning (e.g subvolumegroup, data pool)
    • The provisioning parameters are the same as those sent in a CreateVolume request, which in turn identify the pool, data pool, ceph cluster of choice etc. IOW this enables us to zero in on a subvolume group and return available bytes left within the group (if such restrictions are possible)
    • This is currently unsupported by ceph-csi as this request does not come with the required CSI secrets. As a result, querying Ceph/CephFS for available storage capacity is not feasible.
  8. ControllerGetCapabilities

    • Purely a CSI instance to CO communication on various features flags that the CSI controller plugin supports
  9. CreateSnapshot

    • Create a snapshot of a volume
    • GAP: Currently this is not implemented in ceph-csi in any form for CephFS, an initial alpha version of the same exists for RBD but is being revamped to not have any dependency between a CSI snapshot and its source volume
    • ceph fs subvolume snapshot group of commands would satisfy the integration point requirements for ceph-csi
    • GAP: Like create requests that may be retried, snapshot requests may also be replayed. To respond back with the right metadata about snapshots that are already created, an interface like ceph fs subvolume info for snapshots is desired.
    • CSI snapshots will have a separate lifecycle independent of the originating volume. For example, it is possible that the original volume needs to be restored from a snapshot, in which case it would be deleted and recreated from the CSI snapshot as the VolumeContentSource. As a result the storage backend also needs to be able to support such isolation, even if it is synthetic.
      • NOTE: It is feasible that a clone from a snapshot can be created as a newly named volume, and then the older volume deleted or garbage collected as a workflow by the users, but having the above independence eases the workflow substantially
      • QUESTION: Can CephFS volumes containing snapshots be deleted (even if they still live on in trash) and then subsequently the snapshots be accessed for clone operations?
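    • Until an info-like interface exists for snapshots, a replayed CreateSnapshot can only verify existence by name against the snapshot listing; a sketch with hypothetical names:

        # create the snapshot for the subvolume
        ceph fs subvolume snapshot create cephfs csi-vol-0001 snap-0001 --group_name csi

        # a retried request has to fall back to checking the listing for the name,
        # as no per-snapshot info (size, creation time) is returned today
        ceph fs subvolume snapshot ls cephfs csi-vol-0001 --group_name csi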
  10. DeleteSnapshot

    • Deletes a snapshot
    • No relevant discussion required here, covered by the ceph fs subvolume snapshot interface
  11. ListSnapshots

    • Currently not implemented or planned to be implemented for ceph-csi. Primarily owing to the CSI secrets issue, and also the request not carrying the CreateVolume parameters to zero in on which pool/group the listing should be from
    • CephFS already has the interface ceph fs subvolume snapshot ls that returns said data
      • FUTURE: Ideally both ListSnapshots and ListVolumes also need extra information regarding each volume or snapshot that is covered in ceph fs subvolume info; as a future optimization it may help to have an ls -l equivalent that returns the extra metadata for each item listed.
  12. ControllerExpandVolume

    • Expand an existing volume, supported via ceph fs subvolume resize in CephFS
    • Expansion can be online/offline, which is fine w.r.t CephFS as it changes the quota which is supported when the volume is in use
    • This is an expansion request and not a shrink, hence only expansion requires support (although with quotas this is immaterial)
    • GAP: ceph-csi should ideally inspect the current size and not resize the volume to the new size if the new size is smaller, as per the CSI specification. The specification states that the volume should be at least as large as the request, and if already bigger the plugin can just respond back with success. For inspection of the current size the ceph fs subvolume info interface is available, and hence it is feasible for ceph-csi to update its checks.
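    • A sketch of the underlying resize (hypothetical names); the no_shrink guard matches the CSI expectation that the volume only ever grows:

        # expand the subvolume quota to 2 GiB, refusing to shrink below the current size
        ceph fs subvolume resize cephfs csi-vol-0001 2147483648 --group_name csi --no_shrink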

Node services

  1. Node[Stage/Unstage]Volume

    • Request for the initial global mount of a CSI volume on a node
    • For CephFS this would mount the subvolume path using the mounter of choice (FUSE/kernel), and further needs an interface to convert the subvolume to a path in the CephFS instance for the mount, which is provided by the ceph fs subvolume getpath interface
    • NOTE: As noted in CSI nodeplugin restarts section, if using FUSE as the mounter, and the nodeplugin is restarted, all mounts would become stale.
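    • A sketch of the staging step, assuming hypothetical names, monitor addresses and a cephx user csi-node:

        # resolve the subvolume to its path within the CephFS instance
        ceph fs subvolume getpath cephfs csi-vol-0001 --group_name csi
        # e.g. /volumes/csi/csi-vol-0001/<uuid>

        # kernel mounter (default/suggested)
        mount -t ceph mon1:6789:/volumes/csi/csi-vol-0001/<uuid> /staging/path \
            -o name=csi-node,secretfile=/etc/ceph/csi-node.secret

        # FUSE mounter alternative (subject to the nodeplugin restart caveat noted earlier)
        ceph-fuse -n client.csi-node -r /volumes/csi/csi-vol-0001/<uuid> /staging/path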
  2. Node[Publish/Unpublish]Volume

    • Request for a subsequent bind mount of the global mount to a specific workload path in the node
    • This is a bind mount that is executed on the node and hence there is no interaction with CephFS required at this stage
    • NOTE: A bind mount may add the read only flag to the bind mount, when the global mount is a rw mount, to support various volume access types as detailed in CreateVolume
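    • For example (hypothetical paths), the publish step is a plain bind mount, with a read-only remount of the bind when a read-only publish is requested:

        # bind the staged mount into the workload's target path
        mount --bind /staging/path /publish/path

        # for a read-only publish, remount the bind read-only
        mount -o remount,bind,ro /publish/path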
  3. NodeGetVolumeStats

    • Used to get statfs information about mounted volume on a node
    • statfs output for a CephFS subvolume should reflect maximum and free inodes and block information at the subvolume granularity. This is already the case and hence supported.
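    • For example, a statfs against the mounted subvolume path reports block and inode usage bounded by the subvolume quota (hypothetical path):

        # values reported here reflect the subvolume quota, not the whole filesystem
        stat -f /staging/path
        df -h /staging/path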
  4. NodeGetCapabilities

    • Purely a CSI instance to CO communication on various features flags that the CSI node plugin supports
  5. NodeGetInfo

    • Mostly CSI node specific data exchange between CSI and the CO
    • There is one option in the response that states how many volumes this node can support. If in the future it is required to control the number of volumes per node, given node characteristics, this may be leveraged to control the maximum mounted instances per node.
  6. NodeExpandVolume

    • Post expansion of the volume on the controller, a live mount may require a resize on published nodes. This request is to achieve the same
    • This request is a NOP for CephFS as once the quota is reset on the subvolume, the mount would get refreshed with the updated values subsequently
  7. Node[Freeze/Unfreeze] FUTURE

    • Upcoming proposal to add a node level volume Freeze/Unfreeze operation
    • Intention of freeze is to pause changes to the volume, till it is unfrozen
    • FUTURE: Explore (or elaborate) ways in which to freeze CephFS subvolume mounts. Would fsfreeze be an alternative here?
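    • If fsfreeze turns out to be the answer here, the node level operation could reduce to the following, assuming the mounted filesystem supports freezing (hypothetical path):

        # pause writes to the mounted subvolume, e.g. while a snapshot is taken
        fsfreeze --freeze /staging/path

        # resume IO once the snapshot completes
        fsfreeze --unfreeze /staging/path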

Additional storage features

  1. Encryption

    • GAP: RBD client side per-volume encryption is supported in ceph-csi (using LUKS and integrated to VAULT as the KMS). There is currently no solution available for CephFS
  2. Compression

    • Pool level compression settings are configurable for all pools that back either RBD or CephFS. This is done via Rook and addresses at-rest compression
    • GAP: IO hints are supported in Ceph that help avoid or force compression of specific IO blocks. These hints seem absent in CephFS clients, and may need consideration
    • FUTURE: In flight compression is possibly at the transport layer (ceph messenger v2), to keep it generic across all protocols using the same
  3. DR and Backup/Restore

    • The bulk of Backup/Restore is to take snapshots periodically and back them up to a backup vendor controlled data store
    • The ability to clone snapshots is required to access the snapshot data as per the CSI protocol
    • CONCERN: As current CephFS clones are full logical filesystem copies, when used for backup purposes this would result in a double copy of the data
    • Mirroring is the other sought after solution, more in the disaster recovery space than for long term data retention and backup
      • RBD supports mirroring, and a prototype was created with ceph-csi to demonstrate its DR capabilities
      • GAP: CephFS does not have a mirroring solution yet, and a proposal is in the works for the same
    • FUTURE: Ability to restore a snapshot to a different pool than the originating volume is a desirable feature. This already stands supported with CephFS subvolume clones, as the full copy of the filesystem can be cloned to a different data pool layout as desired.
    • FUTURE: Ability to generate a snapshot delta between 2 given snapshots; this enables backup vendors or data transfer agents to optimize local filesystem inspection for changed data and to lower data transfer across networks
  4. go-ceph [4] API bindings for all interfaces

    • GAP: Not all interfaces that are used by ceph-csi have an equivalent mapping in go-ceph, especially some of the extended manager commands. This is required to ensure performance, scale and resource consumption optimization of the controller service and node service, as using multiple CLI invocations is both costly in terms of time to completion of the request, and also resource intensive when multiple CLIs are executed in parallel (for parallel requests).
  5. UID/GID mapping [TODO]

References

[1] Container Storage Interface (CSI) specification

[2] Ceph-csi integration

[3] CephFS subvolumes

[4] Go bindings for Ceph

Eco-system projects and groups of interest

  1. CSI specification
  2. WG notes/meetings:
    • Storage
    • Data Protection
    • NOTE: It is probably best to keep track of the meeting minutes and notes from the WG meetings
  3. k8s sidecar repositories:
  4. KEPs (Kubernetes Enhancement Proposals)
    • sig-storage
    • NOTE: There are other SIGs within the KEPs that may have related enhancements to storage, and are possibly best kept track of using the label sig/storage