Introduction

The MTC codebase has grown over the years to support multiple methods of PV migration. As we added each method, the code grew in complexity. We recently addressed this complexity by refactoring around Go interfaces to obtain interchangeable components. This pattern has worked well for the development team, and in this post we'd like to describe the thinking process and steps that led us to this implementation.

The problems

Restic requires first backing up data to object storage such as S3. This incurs additional cost, since everything must first be uploaded from the source and then downloaded to the destination. It can also have benefits, especially in environments with restricted networking where the clusters cannot communicate with each other. Conversely, downtime windows are often limited, and lengthy migrations can be a huge burden. A direct migration with rsync can provide a large performance improvement.

We also witnessed at least one case where data had become damaged on the backing store, and rsync handled it more gracefully than restic, which simply froze and stopped transferring data. While direct volume migration was added to MTC, it resulted in quite a bit of additional non-reusable code for this one use case.

Furthermore, as of right now we only support transfers wrapped in stunnel and exposed over routes. Using routes eliminates the ability to work with plain Kubernetes clusters, another capability we would like to add in the future. Mixing and matching transfer protocols, transports, and endpoint types would quickly become a convoluted tangle of code if we didn't do something different.

The Solution

First, to simplify the controller code, we decided to move most of the logic for data migrations into a new library, crane-lib.

Next, when sitting down to think about the problem, we broke out the components we use for direct volume migrations today: rsync, stunnel, and routes. Each of these has served us well, but each also has potential limitations. It would be nice if we could swap any of them out for another option as needed. To do this we labeled each of these layers: rsync is the transfer program, stunnel is the transport wrapper, and routes are the endpoint. We then created an interface for each of these layers.

Transport

type Transport interface {
	CA() *bytes.Buffer
	Crt() *bytes.Buffer
	Key() *bytes.Buffer
	Port() int32
	ClientContainers() []v1.Container
	ClientVolumes() []v1.Volume
	ServerContainers() []v1.Container
	ServerVolumes() []v1.Volume
	Direct() bool
	CreateServer(client.Client, endpoint.Endpoint) error
	CreateClient(client.Client, endpoint.Endpoint) error
}

Transfer

type Transfer interface {
	Source() *rest.Config
	Destination() *rest.Config
	Endpoint() endpoint.Endpoint
	Transport() transport.Transport
	// TODO: define a more generic type for auth
	// for example some transfers could be authenticated by certificates instead of username/password
	Username() string
	Password() string
	CreateServer(client.Client) error
	CreateClient(client.Client) error
	PVCs() PVCPairList
}

Endpoint

type Endpoint interface {
	Create(client.Client) error
	Hostname() string
	Port() int32
	NamespacedName() types.NamespacedName
	Labels() map[string]string
	IsHealthy(c client.Client) (bool, error)
}

We then set off to create at least two implementations at each layer to prove that they were easily interchangeable and to help us refine the interfaces. To do this we needed to ensure that each implementation satisfied its interface.
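Go makes this kind of check cheap: a blank-identifier assignment fails the build if a type ever stops satisfying an interface. The sketch below uses a trimmed-down, hypothetical two-method version of the Transport interface and a no-op implementation; the names are ours for illustration, not crane-lib's actual types.

```go
package main

import "fmt"

// Transport is a pared-down stand-in for the interface shown above,
// reduced to two methods for illustration.
type Transport interface {
	Port() int32
	Direct() bool
}

// nullTransport is a hypothetical no-op transport, in the spirit of the
// "null" option for applications that provide their own encryption.
type nullTransport struct{ port int32 }

func (n *nullTransport) Port() int32  { return n.port }
func (n *nullTransport) Direct() bool { return true }

// Compile-time assertion: the build breaks if *nullTransport ever
// stops implementing Transport, so drift is caught immediately.
var _ Transport = (*nullTransport)(nil)

func main() {
	var t Transport = &nullTransport{port: 443}
	fmt.Println(t.Port(), t.Direct())
}
```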

For transfer, rsync and rclone were implemented. For transport, we implemented stunnel and null (there is no sense in wrapping applications that provide their own encryption), and for endpoints we added route and load balancer.

Through testing we developed a Compatibility Matrix that verified our expectations while allowing us to resolve issues and tweak the interfaces.

Now, when setting up each layer of our transfer, we only need to be concerned with choosing the appropriate option, which is as simple as picking, for example, transfer, err := rclone.NewTransfer(s, r, srcCfg, destCfg, pvcList) or transfer, err := rsync.NewTransfer(s, r, srcCfg, destCfg, pvcList).

Refining the Interfaces

Once we started testing more deeply, we realized we neither needed nor wanted setters for every parameter, so we eliminated them, leaving only getters. Nearly all data is entered at creation and becomes immutable in this way. There is a burden to creating each function defined in the interface, even the relatively simple ones, so eliminating the setters eases development of future options. Ideally, one should be able to copy an existing implementation, adjust its configuration for the new component at a given layer, and be ready to go within minutes.
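A minimal sketch of that getter-only style, using illustrative names rather than crane-lib's actual types: all state is assigned once in the constructor, the fields are unexported, and the only exported surface is read-only getters.

```go
package main

import "fmt"

// endpoint keeps its fields unexported so callers outside the package
// cannot mutate them after construction.
type endpoint struct {
	hostname string
	port     int32
}

// newEndpoint is the single place where state is assigned; after this,
// the value is effectively immutable from the caller's perspective.
func newEndpoint(hostname string, port int32) *endpoint {
	return &endpoint{hostname: hostname, port: port}
}

// Getters only, no setters: the read-only view mirrors the Endpoint
// interface methods shown earlier.
func (e *endpoint) Hostname() string { return e.hostname }
func (e *endpoint) Port() int32      { return e.port }

func main() {
	e := newEndpoint("example.apps.cluster.local", 443)
	fmt.Println(e.Hostname(), e.Port())
}
```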

Next Steps

At the moment we are focused on bringing crane-lib up to parity with the current direct volume migration code so we can use it in 1.6.0. In early testing, the MTC controller code has become simpler and easier to follow.

As we look forward to crane 2.0, more endpoint options such as nodeport, and even services, for intercluster transfers, along with additional options for Kubernetes to OpenShift transfers, seem like good candidates for enabling transfers within the requirements of whatever network(s) the clusters are running in.

We would also welcome contributions improving or adding options to facilitate transfer within or between clusters.

Konveyor Crane
crane-lib
crane-lib example test
crane 2.0
Go by Example: Interfaces