
Data mover micro service design #7576

Open
wants to merge 1 commit into main from data-mover-micro-service-design

Conversation

@Lyndon-Li (Contributor) commented Mar 28, 2024

Fix issue #7198. Add the design for the data mover micro service.

codecov bot commented Mar 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 58.80%. Comparing base (24941b4) to head (470484d).
Report is 245 commits behind head on main.

Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main    #7576      +/-   ##
==========================================
- Coverage   61.70%   58.80%   -2.91%
==========================================
  Files         263      345      +82
  Lines       28861    28759     -102
==========================================
- Hits        17808    16911     -897
- Misses       9793    10420     +627
- Partials     1260     1428     +168
```


@Lyndon-Li Lyndon-Li force-pushed the data-mover-micro-service-design branch from adc1979 to 2264d56 on March 28, 2024 08:37
- This enables users to control resource (i.e., CPU, memory) requests/limits in a granular manner, e.g., control them per backup/restore of a volume
- This increases resilience: a crash of one VGDP activity won't affect others
- In cases where the backup storage must be represented by Kubernetes persistent volumes (e.g., NFS storage, [COSI][3]), this avoids dynamically mounting the persistent volumes to node-agent pods and causing node-agent pods to restart (which is not acceptable since node-agent loses its current state after its pods restart)
- This prevents unnecessary full backups. Velero's fs uploaders support file-level incremental backup by comparing file names and metadata. However, at present files are visited by host path, and the pod's and PVC's IDs are part of the host path, so once the pod is recreated, the same file is regarded as a different file since the pod's ID has changed. If the fs uploader runs in a dedicated pod and files are visited by the pod's volume path, files' full paths do not change after a pod restart, so incremental backups can continue.
Contributor

> ... once the pod is recreated, the same file is regarded as a different file....

Here, is the pod the backupPod created by VBDM? Have you verified the incremental backup scenario?

Contributor Author

The pod refers to the source pod to be backed up.
The incremental backup compares the full path of each file, and the source pod's ID is part of the full path. Once the pod is recreated, the full path changes, so the same file is treated as a new file.
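
For illustration, a hedged sketch of the two path shapes (the layout below is hypothetical and varies by kubelet/CSI configuration):

```go
// Host-path mode: the source pod's UID is part of the path, so the same file
// gets a new full path whenever the source pod is recreated.
const hostPath = "/host_pods/<pod-uid>/volumes/kubernetes.io~csi/<pv-name>/mount/data/file1"

// Pod-volume mode: the mount path inside the dedicated backupPod is stable
// across re-creations of the source pod, so incremental backup keeps matching file1.
const podVolumePath = "/<volume-mount-path>/data/file1"
```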


## Goals
- Create a solution to run VGDP instances as micro services
- Modify the VBDM to launch VGDP micro services instead of running the VGDP instances in place
Contributor

I suggest modifying this statement or adding one more statement to specify what we want to implement.

For example: offload the work of data movement from the node-agent to the backupPod.

Contributor Author

Done

## Detailed Design
### Exposer
At present, the exposer sets ```velero-helper pause``` as the command run by the backupPod/restorePod. Now the VGDP-MS command will be used instead; the command looks like below:
```velero vgdp-ms backup --this-pod xxx --this-container xxx --this-volume xxx --resource-timeout xxx --log-format xxx --log-level xxx```
Contributor

This is subjective, but isn't it easier to understand if we call it velero data-move? Or compile it into another binary like velero-data-mover?

Contributor

To avoid the overhead of building an extra binary, we should keep it in velero but rename the sub-command.

Contributor Author

Done, changed the command name to velero data-mover

Below are the parameters of the command:
**this-pod**: Delivers the backupPod/restorePod's name to VGDP-MS. Even though VGDP-MS runs inside the backupPod/restorePod, it is not easy for it to get the name of the pod it is running in, so the name is simply delivered through this parameter.
**this-container**: When the backupPod/restorePod is started, the cluster may insert other containers into the pod, so we cannot assume that the container running VGDP-MS is always the first container in the pod. The container name is therefore delivered through this parameter.
**this-volume**: The name of the snapshot/target volume. VGDP-MS accesses the snapshot/target volume by matching this-volume to a volumeMount/volumeDevice in the backupPod/restorePod's spec, so that it can access the snapshot/target volume locally.
Contributor

Why not pass the path directly to the command?

I have the same concern about passing this-pod and this-container to the command.
The workload should not need to know the pod or container name of where it runs.

Contributor Author

@Lyndon-Li Lyndon-Li commented May 29, 2024

Besides the full path, we also need the volume mode (fs or block) to initialize the backup. So instead of passing more detailed parameters (path, mode, etc.), it is better to simply pass these 3 parameters and let the VGDP get the detailed info from them.

Moreover, the fileSystemBR should work for both host-path mode (for PVB/PVR) and pod-path mode (data mover); passing 3 simple parameters and deducing the mode-specific info from them helps reduce code redundancy.

In conclusion, we need these 3 parameters for the purposes below (see the sketch after this list):

  1. Find the path of the volume to be backed up
  2. Find the DUCR/DDCR this pod is for
  3. Create events on the backupPod/restorePod
  4. Retrieve the volume mode (fs or block)
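
For purposes 1 and 4, a minimal sketch (the helper name is hypothetical; only standard corev1 pod types are assumed) of how VGDP-MS could deduce the volume path and mode from the three parameters:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// resolveVolume matches this-volume against the container's volumeDevices and
// volumeMounts to find the local path and the volume mode (block vs. fs).
func resolveVolume(pod *corev1.Pod, containerName, volumeName string) (path string, block bool, err error) {
	for _, c := range pod.Spec.Containers {
		if c.Name != containerName {
			continue
		}
		for _, d := range c.VolumeDevices { // block mode
			if d.Name == volumeName {
				return d.DevicePath, true, nil
			}
		}
		for _, m := range c.VolumeMounts { // filesystem mode
			if m.Name == volumeName {
				return m.MountPath, false, nil
			}
		}
	}
	return "", false, fmt.Errorf("volume %s not found in container %s", volumeName, containerName)
}
```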

Contributor

Is this-pod a workload created by velero, or a user workload mounting the volume prior to backup?

Contributor Author

this-pod runs a subcommand of velero

Contributor Author

After further discussion, we will not use this-pod, this-container, and this-volume; instead, we will pass the specific info required by the vgdp-ms --- volume path, volume mode, and DUCR/DDCR.
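
A hedged sketch of what the revised invocation might look like (the flag names here are illustrative, not final):
```velero data-mover backup --volume-path /backup-volume --volume-mode fs --data-upload xxx --resource-timeout xxx --log-format xxx --log-level xxx```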

Contributor Author

Done with the changes.


In order to have the same capabilities and permissions as node-agent, the pod configurations below are inherited from node-agent and set in the backupPod/restorePod's spec (see the sketch after this list):
- Volumes: Some configMaps are mapped as volumes into node-agent, so we add the same node-agent volumes to the backupPod/restorePod
- Environment Variables
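
A minimal sketch of this inheritance, assuming the exposer has both pod specs at hand (the function name is hypothetical):

```go
package main

import corev1 "k8s.io/api/core/v1"

// inheritFromNodeAgent copies node-agent's volumes and the first container's
// environment variables into the backupPod/restorePod spec.
func inheritFromNodeAgent(nodeAgent, backupPod *corev1.Pod) {
	// Reuse the configMap-backed volumes that are mounted into node-agent.
	backupPod.Spec.Volumes = append(backupPod.Spec.Volumes, nodeAgent.Spec.Volumes...)
	if len(nodeAgent.Spec.Containers) > 0 && len(backupPod.Spec.Containers) > 0 {
		backupPod.Spec.Containers[0].Env = append(backupPod.Spec.Containers[0].Env,
			nodeAgent.Spec.Containers[0].Env...)
	}
}
```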
Contributor

If time permits, we should follow the best practice to limit/minimize the scope and capability of the backupPod/restorePod.

Contributor Author

Yes, theoretically the backupPod/restorePod only needs to host the VGDP instance. Besides that, we also need a kube client to fulfill the communication and control purposes below:

  • Watch some status of the DUCR/DDCR for synchronization with the controller, e.g., ready to start VGDP; VGDP cancellation
  • Deliver information to the controller, e.g., VGDP progress

There are no other functionalities in the backupPod/restorePod.

Contributor

We may open an issue to limit the scope of the roles of the backupPod/restorePod, but it can be optional for the phase 1 implementation.

Contributor Author

Added a clarification for the roles/privileges/capabilities of the backupPod/restorePod.
We can open an issue for further changes after this micro service feature is merged.


We use two mechanisms to make the watch:
**Pod Phases**: VGDP Watcher watches the backupPod/restorePod's phases updated by Kubernetes. That is, VGDP Watcher creates an informer to watch the pod resource for the backupPod/restorePod and detect when the pod reaches one of the terminal phases (i.e., PodSucceeded, PodFailed). We also check the availability & status of the backupPod/restorePod at the beginning of the watch so as to detect the start of the backupPod/restorePod.
**Custom Kubernetes Events**: VGDP-MS generates Kubernetes events and associates them with the backupPod/restorePod when the VGDP instance starts/stops and on progress updates; VGDP Watcher then creates another informer to watch the Event resources associated with the backupPod/restorePod.
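
A minimal sketch of the Pod Phases mechanism, assuming client-go (the function name and resync interval are illustrative):

```go
package main

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// watchPodPhase signals on the returned channel once the named backupPod/restorePod
// reaches a terminal phase (PodSucceeded or PodFailed).
func watchPodPhase(client kubernetes.Interface, ns, podName string, stop <-chan struct{}) <-chan corev1.PodPhase {
	terminated := make(chan corev1.PodPhase, 1)
	factory := informers.NewSharedInformerFactoryWithOptions(client, 30*time.Second,
		informers.WithNamespace(ns),
		informers.WithTweakListOptions(func(o *metav1.ListOptions) {
			// Watch only the backupPod/restorePod itself.
			o.FieldSelector = "metadata.name=" + podName
		}))
	factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		UpdateFunc: func(_, obj interface{}) {
			pod := obj.(*corev1.Pod)
			if pod.Status.Phase == corev1.PodSucceeded || pod.Status.Phase == corev1.PodFailed {
				select {
				case terminated <- pod.Status.Phase:
				default: // already signaled
				}
			}
		},
	})
	factory.Start(stop)
	return terminated
}
```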
Contributor

What's the specific use case for this? Reporting the transfer status?
IMHO, we want to try to make the code that runs in the backupPod less k8s-aware or k8s-dependent, so we can test it or verify certain cases more easily. But this is an immature thought, and we can have further discussions on this topic.

Contributor Author

Custom Kubernetes Events doesn't mean a mechanism outside the k8s scope, so it is still k8s-aware/dependent.
Here the Events refer to the Kubernetes Event resource; that is, we generate events against the backupPod/restorePod during the data movement.
We use the Events to transfer messages to the controller during the data movement, since the Pod Phase and Pod Termination Message mechanisms can only deliver info at the time the pod terminates.

Contributor

After offline discussion, we agreed to choose k8s events as the approach for the backup/restore pod to communicate with the node-agent, but maybe pointing the involvedObject to the DUCR is a better choice.

Contributor Author

After some further checks, I can see the negative points below if we make the involvedObject the DUCR/DDCR:

  1. Kubernetes' etcd is not good at storing a huge number of objects, which is why event objects are automatically removed after 1 hour (by default). Since DUCRs/DDCRs are kept permanently, this will increase the burden on etcd when there are lots of them. On the other hand, since the events are deleted after 1 hour, we have no way to keep them for further use, so there is no gain in keeping the events with the DUCR/DDCR.
  2. The micro-service solution introduced in this design can be reused by other requirements in the future, and Event is a core part of the solution. Adding the DUCR/DDCR as the involvedObject requires one more input parameter for the micro service; as a comparison, originally we only required this-pod, this-container, and this-volume as the input.

Contributor Author

After further discussion, we decided to associate the events with the DUCR/DDCR.
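
A minimal sketch of emitting such an event with the DUCR as the involvedObject, assuming client-go's event recorder (the component and reason names are hypothetical, and the Velero API types must be registered in the scheme):

```go
package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/kubernetes/scheme"
	typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
	"k8s.io/client-go/tools/record"
)

// newRecorder wires an EventRecorder that writes Events into the given namespace.
func newRecorder(client kubernetes.Interface, ns string) record.EventRecorder {
	broadcaster := record.NewBroadcaster()
	broadcaster.StartRecordingToSink(&typedcorev1.EventSinkImpl{
		Interface: client.CoreV1().Events(ns),
	})
	return broadcaster.NewRecorder(scheme.Scheme, corev1.EventSource{Component: "velero-data-mover"})
}

// reportProgress emits a progress event whose involvedObject is the DUCR/DDCR,
// so the node-agent's Event informer can pick it up during the data movement.
// The DUCR's type must be added to the scheme for reference resolution to work.
func reportProgress(recorder record.EventRecorder, ducr runtime.Object, msg string) {
	recorder.Event(ducr, corev1.EventTypeNormal, "DataMoverProgress", msg)
}
```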

Contributor Author

Done with the changes.

Contributor

@shawn-hurley shawn-hurley left a comment

I don't see this, but it feels like a perfect thing to at least think through how a third-party data mover could now plug into this system.

@Lyndon-Li
Contributor Author

> I don't see this, but it feels like a perfect thing to at least think through how a third-party data mover could now plug into this system.

The scope of this design is VGDP; VGDP is part of VBDM (Velero Built-in Data Mover), so this design doesn't cover 3rd-party data movers.
3rd-party data movers still follow this section in the Volume Snapshot Data Movement design. Specifically, 3rd-party data movers handle the DUCR/DDCR, and they are free to make their data movers micro-service style or monolithic style.

I can add a clarification for this in the design.

@Lyndon-Li Lyndon-Li force-pushed the data-mover-micro-service-design branch from 2264d56 to 3527da1 on June 4, 2024 10:22
@shawn-hurley
Contributor

So my thought process was:

We could use a generic controller to handle the DUCR/DDCR and then move all the data movers to micro services and have the generic controller be able to control what data mover is called when.

The biggest problem with the design at the moment is that I can only have a single Data Mover. I have to copy the full VBDM code and extend it for my use case (a particular volume for a particular driver).

Maybe this isn't how you want to solve that problem, and that is ok, but it is an option.

I would like to see us think through a solution to this problem though.

@Lyndon-Li
Contributor Author

> So my thought process was:
>
> We could use a generic controller to handle the DUCR/DDCR and then move all the data movers to micro services and have the generic controller be able to control what data mover is called when.
>
> The biggest problem with the design at the moment is that I can only have a single Data Mover. I have to copy the full VBDM code and extend it for my use case (a particular volume for a particular driver).
>
> Maybe this isn't how you want to solve that problem, and that is ok, but it is an option.
>
> I would like to see us think through a solution to this problem though.

I think this is possible theoretically, though we still need to cover lots of details, e.g., how we publish the communication mechanism so that 3rd-party data movers could implement it.
However, the biggest problem is that this conflicts with the existing 3rd-party data mover integration mechanism already defined in the Volume Snapshot Data Movement design --- the 3rd-party data mover is integrated through DUCR/DDCR, which is simple and clear.
Therefore, without enough motivation, we don't want to modify the original design. The benefit I can see in your proposal is that 3rd-party data mover authors don't need to implement the DUCR/DDCR controllers, but I don't think that is motivation enough. Moreover, I am not sure all 3rd-party data mover authors want this.

Anyway, this is out of the scope of the current design. I suggest opening another issue after this design is merged, and we can continue this discussion there.

Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
@Lyndon-Li Lyndon-Li force-pushed the data-mover-micro-service-design branch from 3527da1 to 470484d on June 5, 2024 02:50