[maskedtensor] Add missing nan ops tutorial #2046

george-qi wants to merge 1 commit into gh/george-qi/5/base from
Conversation
[ghstack-poisoned]
jisaacso left a comment:
Please address the comments on masked_tensor vs MaskedTensor. I'm also not sure this tutorial is all that helpful as a standalone. Would it make more sense to merge it into the overall tutorial, or to have a second tutorial that aggregates advanced features? This feels like a pretty minor demonstration of nanmean on Tensors vs. mean on MaskedTensors.
>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
        28., 45.])
>>> y.nanmean()
It might be useful to inline some comments on what you're trying to show here.
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Is the goal to have sequential tutorials, or to keep each one self-contained? If the latter, can you add the relevant imports up top?
This tutorial will be merged with the overview!
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Did you replace masked_tensor with MaskedTensor in pytorch/pytorch@5e9c26c? If so, can you update the tutorial here?
masked_tensor is the preferred function to use!
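For readers following along, here is a minimal sketch of the factory usage under discussion. It assumes a PyTorch build that ships the prototype `torch.masked` module; the exact import location may vary between versions.

```python
import torch
from torch.masked import masked_tensor  # prototype API; may emit a UserWarning

# Build a MaskedTensor via the masked_tensor() factory (the preferred entry
# point, per this thread) rather than constructing MaskedTensor directly.
y = torch.tensor([float("nan"), 1.0, 4.0, 9.0])
mt = masked_tensor(y, ~torch.isnan(y))

# mean() over a MaskedTensor ignores masked-out slots: (1 + 4 + 9) / 3
print(torch.mean(mt))
```
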
tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
>>> torch.nanmean(x)
tensor(nan)
>>> torch.mean(masked_tensor(x, ~torch.isnan(x)))
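The fully-NaN input is the interesting case here: nanmean still yields nan, while a masked mean can report "no valid data" explicitly. The same contrast can be reproduced with NumPy's masked arrays, used below purely as an illustration of the semantics, not as the MaskedTensor API.

```python
import warnings
import numpy as np

x = np.full(4, np.nan)

# nanmean of an all-NaN input still yields nan (plus a RuntimeWarning)...
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    print(np.nanmean(x))  # nan

# ...while a masked mean returns the masked singleton (printed as "--"),
# making the "nothing to reduce over" case explicit:
m = np.ma.masked_invalid(x)
print(m.mean())
print(np.ma.is_masked(m.mean()))  # True
```
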
>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
        28., 45.])
>>> y.nanmean()
Probably outside the scope of this review, but why do we have nanmean() as an API instead of the pandas-style mean(..., skipna=True)?
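For reference, the pandas convention mentioned above folds NaN-skipping into the regular reduction via a `skipna` flag (default True) instead of a separate nanmean() entry point. A short illustration, assuming pandas is available:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1.0, 4.0, 9.0])

# skipna=True is the default, so mean() ignores NaNs: (1 + 4 + 9) / 3
print(s.mean())
# Opting out propagates the NaN instead:
print(s.mean(skipna=False))  # nan
```
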
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Have we considered API sugar:

(1) instantiating an MT from a Tensor, assuming nan is the mask:

>>> MaskedTensor(y)
MaskedTensor(
  [ --, 1.0000, 4.0000, 9.0000, --, 5.0000, 12.0000, 21.0000, --, 9.0000, 20.0000, 33.0000, --, 13.0000, 28.0000, 45.0000]
)

(2) instantiating an MT where the user just states the mask value instead of passing the mask:

y = MaskedTensor(y, mask_value=float(1))
Not yet! I think an unspecified mask could also be an indication that they would like all True values for the mask, so that could be a third option as well.
Another one would be to allow for just MaskedTensor(y) if y is a sparse tensor because then the mask is "implied".
All have been discussed, and I will take note to add them in :)
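The proposed sugar could be prototyped today as a thin wrapper over the factory. The helper below, `nan_masked_tensor`, is hypothetical (it is not an existing torch API) and assumes the prototype `torch.masked` module is available:

```python
import torch
from torch.masked import masked_tensor  # prototype API

def nan_masked_tensor(data, mask_value=None):
    """Hypothetical sugar sketched from this thread, not a real torch API:
    with no mask_value, NaN slots are treated as masked out (proposal 1);
    with a mask_value, slots equal to that value are masked out (proposal 2)."""
    if mask_value is None:
        mask = ~torch.isnan(data)
    else:
        mask = data != mask_value
    return masked_tensor(data, mask)

y = torch.tensor([float("nan"), 1.0, 4.0, 9.0])
print(nan_masked_tensor(y))       # mask derived from NaN positions
print(nan_masked_tensor(y, 1.0))  # mask derived from the value 1.0
```

The third option mentioned above (an unspecified mask meaning all-True) would be one more branch in the same helper.
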
Where are you tracking feature requests?
Stack from ghstack (oldest at bottom):