Skip to content

[maskedtensor] Add missing nan ops tutorial#2046

Closed
george-qi wants to merge 1 commit intogh/george-qi/5/basefrom
gh/george-qi/5/head
Closed

[maskedtensor] Add missing nan ops tutorial#2046
george-qi wants to merge 1 commit intogh/george-qi/5/basefrom
gh/george-qi/5/head

Conversation

Copy link
Copy Markdown

@jisaacso jisaacso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address comments on masked_tensor vs MaskedTensor. I'm also not sure that this tutorial is all that helpful as a standalone? Would it make more sense to just merge this into the overall tutorial? Or have a second tutorial that aggregate advanced features? This feels like a pretty minor demonstration of nanmean with Tensors vs mean with MT.

Comment thread beginner_source/maskedtensor_missing_nan_ops.rst
>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
28., 45.])
>>> y.nanmean()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be useful to inline some comments on what you're trying to show here

28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the goal to have sequential tutorials or keep each self contained? If the latter, can you add the relevant imports up top.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tutorial will be merged with overview!

28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you replace masked_tensor with MaskedTensor in pytorch/pytorch@5e9c26c ? If so, can you update the tutorial here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

masked_tensor is the preferred function to use!

tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
>>> torch.nanmean(x)
tensor(nan)
>>> torch.mean(masked_tensor(x, ~torch.isnan(x)))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment above on MaskedTensor

>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
28., 45.])
>>> y.nanmean()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably outside the scope of this review, but why do we have nanmean() as an API instead of the pandas-style mean(..., skipna=True)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure..

28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we considered API sugar:

(1) instantiating a MT from a Tensor assuming na is the mask

>>> MaskedTensor(y)
MaskedTensor(
  [      --,   1.0000,   4.0000,   9.0000,       --,   5.0000,  12.0000,  21.0000,       --,   9.0000,  20.0000,  33.0000,       --,  13.0000,  28.0000,  45.0000]
)

(2) instantiating a MT where user just states the mask value instead of passing the mask

y = MaskedTensor(y, mask_value=float(1))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet! I think an unspecified mask could also be an indication that they would like all True values for the mask, so that could be a third option as well.

Another one would be to allow for just MaskedTensor(y) if y is a sparse tensor because then the mask is "implied".

All been discussed and will take note to add in :)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are you tracking feature requests?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants