Skip to content

Conversation

gnufied
Copy link
Contributor

@gnufied gnufied commented Nov 14, 2018

Version of #222

The original PR was timing out while loading.

@jdef jdef requested review from saad-ali and jdef and removed request for saad-ali November 15, 2018 14:42
Copy link
Member

@jdef jdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM mod CI complaints

@jdef jdef requested review from saad-ali and jieyu November 15, 2018 14:44
@jdef
Copy link
Member

jdef commented Nov 15, 2018

@saad-ali @jieyu @julian-hj @ddebroy PTAL

@gnufied
Copy link
Contributor Author

gnufied commented Nov 15, 2018

@jdef I am going to squash these commits while rebasing if everything looks okay. cheers.

@gnufied gnufied force-pushed the resize-proposal-updated branch from 917322d to f61d06d Compare November 15, 2018 15:22
Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Approve

@saad-ali
Copy link
Member

@julian-hj @ddebroy Will merge after your LGTM

`NodeExpandVolume` ONLY supports expansion of already node-published or node-staged volumes on the given `volume_path`.

If plugin has `STAGE_UNSTAGE_VOLUME` node capability then:
* `NodeExpandVolume` MUST be called after successful `NodeStageVolume`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would seem to contradict line 1776 in the offline ControllerExpandVolume OFFLINE instructions. Those indicate that a volume must be unstaged before it can be expanded, but this line says it must be staged. Maybe this section seems to be about the online flow. Does it need to be updated to reflect that NodeExpandVolume also sometimes participates in offline flow? Or am I just confused?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those indicate that a volume must be unstaged before it can be expanded...

The way that I'm reading the ControllerExpandVolume OFFLINE instructions is that the volume must be unstaged before it can be controller-expanded. Controller-expansion and node-expansion are different events.

This section talks about node-expansion.

Is there a way that we could make that more clear?
Did I misunderstand the concern?
Am I reading the proposal wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdef no you are right. ONLINE and OFFLINE capabilities are entirely about whether you can call ControllerExpandVolume while volume is published/available or not. I thought language of OFFLINE clarified this:

// OFFLINE indicates that volumes currently published and
// available on a node SHALL NOT be expanded via
// ControllerExpandVolume

NodeExpandVolume ALWAYS requires that volume is either published or staged. It does not support any kind of OFFLINE expansion on the node.

@gnufied gnufied force-pushed the resize-proposal-updated branch from f61d06d to df16f23 Compare November 17, 2018 10:50
@gnufied gnufied force-pushed the resize-proposal-updated branch from df16f23 to fe26ef3 Compare November 26, 2018 15:31
@gnufied
Copy link
Contributor Author

gnufied commented Nov 26, 2018

@julian-hj can you take another look?

add support for volume expansion RPC calls.
also add plugin capabilities that allow CO to query
whether volume expansion is supported online or offline.
@gnufied gnufied force-pushed the resize-proposal-updated branch from fe26ef3 to ec6a414 Compare November 26, 2018 16:29
@julian-hj
Copy link
Contributor

@gnufied hmm. To me, it would be natural to assume that OFFLINE as a plugin capability means that the volume must be offline in order to be expanded, particularly since OFFLINE is a plugin capability, not a controller capability. So I think the proposed spec is still fairly confusing.

If it is acceptable for a CO to call NodeExpandVolume when a plugin reports OFFLINE capability, then I think we need to say that explicitly in the description of OFFLINE. Or if ONLINE/OFFLINE is really only about ControllerExpandVolume behavior, then probably it should be moved to ControllerGetCapabilities rather than GetPluginCapabilities.

@gnufied
Copy link
Contributor Author

gnufied commented Nov 26, 2018

A OFFLINE plugin capability can't mean that - volume has to be offline on the node for NodeExpandVolume too. That was ruled out because of say - supporting XFS file system on Azure, which requires volume to be mounted for file system expansion but volume must be OFFLINE for controller expansion. So in my view - applying one value of online and offline to both controller and node operation isn't going to work and hence this proposal simplifies the design by making node side operation always online.

It is true to some extent that, may be then we should not use plugin capabilities to define this property and stick this with controller capabilities. But - there are two problems with that:

  1. Controller capabilities currently define which RPC call the volume plugin supports. Not which RPC call can be made depending on the state of the volume. That is not what controller capabilities are used for (but may be I am wrong?)
  2. There is another use case that this proposal covers is, online node only expansion. A plugin may not implement any control plane call at all and as long as it declares ONLINE capability, volume expansion operation could be done in node side. The language of ONLINE capability allows that.
    This case will not be adequately covered by moving online and offline capabilities to controller capabilities or we will have to come up with another way of finding out volume expansion capabilities.

@julian-hj
Copy link
Contributor

@gnufied @jdef OK, so if I am understanding things, then what you've said isn't consistent. Earlier, you both agreed that OFFLINE and ONLINE only concerns ControllerExpandVolume but just now, you're saying that OFFLINE applies to NodeExpandVolume too. But that isn't called out in the spec, which is what I was trying to say in my first comment.

So...would it be accurate to say that NodeExpandVolume cannot be called unless the plugin advertises ONLINE capability? If so, then I think we should say that somewhere. If not, then I think the relationship between OFFLINE/ONLINE and the node RPC still needs to be clarified.

@gnufied
Copy link
Contributor Author

gnufied commented Nov 28, 2018

Earlier, you both agreed that OFFLINE and ONLINE only concerns ControllerExpandVolume but just now, you're saying that OFFLINE applies to NodeExpandVolume too.

What you mean by "applies"? I think you meant - even if plugin has only OFFLINE capability it still may require a call of NodeExpandVolume on the node? If yes - that is true. But what I meant earlier is - ONLINE and OFFLINE capability only control whether ControllerExpandVolume call can be made with published volume or not. These capabilities as currently defined do not prescribe whether NodeExpandVolume call can be made when volume is online/offline, because NodeExpandVolume always requires the volume to be published(and hence online) to a node.

Or to rephrase - the capabilities don't say anything at all about when NodeExpandVolume call can be made. So these capabilities and NodeExpandVolume call are not related at all.

So...would it be accurate to say that NodeExpandVolume cannot be called unless the plugin advertises ONLINE capability? If so, then I think we should say that somewhere. I

I think that won't be accurate to say. Using example I gave previously. Using XFS file system on a Azure Disk volume. The plugin will declare expansion capabilities as OFFLINE. Which means - ControllerExpandVolume can only be called when volume is not published to a node. But NodeExpandVolume MUST be called when volume is published to the node. Because a XFS file system can only be expanded when mounted on a node. So - OFFLINE capability only applies to RPC ControllerExpandVolume. NodeExpandVolume does not support offline mode of file system expansion at all. Volume must be published/mounted for file system expansion on the node.

Copy link
Contributor

@julian-hj julian-hj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion in the Community Meeting, I am OK with this PR, although I still think we should make additional changes after to clarify the way that NodeExpandVolume gets invoked in the OFFLINE flow.

I think we can merge this PR as-is, and then work on clarifying the doc in a separate PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants