feat: support spark compatible floor function#21933
feat: support spark compatible floor function#21933athlcode wants to merge 4 commits intoapache:mainfrom
Conversation
coderfender
left a comment
There was a problem hiding this comment.
Looks good @athlcode . Added a couple of comments to improve test cases
|
|
||
| impl ScalarUDFImpl for SparkFloor { | ||
| fn name(&self) -> &str { | ||
| "floor" |
There was a problem hiding this comment.
Does spark also support 'flooring' like it does with 'ceil' and 'ceiling' ?
There was a problem hiding this comment.
No. Spark's floor function has no aliases, it's just floor.
| let new_p = (*p - *s as u8 + 1).clamp(1, 38); | ||
| DataType::Decimal128(new_p, 0) | ||
| } | ||
| DataType::Decimal128(p, s) => DataType::Decimal128(*p, *s), |
There was a problem hiding this comment.
So would have same behavior for 0 scale as well?
There was a problem hiding this comment.
When the scale is 0, the value is already a whole number, so floor doesn't change anything, we just keep the type as is. This is the same approach Spark Ceil takes, and it matches Spark's behaviour: floor(DECIMAL(10, 0)) stays DECIMAL(10, 0).
| Ok(ColumnarValue::Scalar(ScalarValue::Int64(v.map($f)))) | ||
| } | ||
| other => internal_err!( | ||
| "floor: expected {} but got {:?}", |
There was a problem hiding this comment.
Could we make error message a bit more robust to suggest datatype mismatch ?
There was a problem hiding this comment.
changed the error message, let me know if its okay, thank you
| query IT | ||
| SELECT floor(arrow_cast(1.5, 'Float32')), arrow_typeof(floor(arrow_cast(1.5, 'Float32'))); | ||
| ---- | ||
| 1 Int64 |
There was a problem hiding this comment.
Would love to have tests for Nan/ Infinity etc
| 1000 Int64 | ||
|
|
||
| query IT | ||
| SELECT floor(arrow_cast(-1000, 'Int64')), arrow_typeof(floor(arrow_cast(-1000, 'Int64'))); |
There was a problem hiding this comment.
Could we also add some tests for boundary cases?
There was a problem hiding this comment.
added tests for boundary cases as well
Which issue does this PR close?
Part of #15914
Rationale for this change
Spark compatible semantics for
floorfunctionWhat changes are included in this PR?
Implements SparkFloor UDF with Spark-compatible return types:
Are these changes tested?
Yes . Added unit tests and SLT tests
Are there any user-facing changes?