[feat](skew & kurt) New aggregate function skew & kurt #40945 #41277

Open
wants to merge 1 commit into
base: branch-3.0

zhiqiang-hhhh
Contributor

cherry pick from #40945

`skew`, `skew_pop` and `skewness` are used to calculate the
[skewness](https://en.wikipedia.org/wiki/Skewness#Pearson.27s_moment_coefficient_of_skewness)
of a data distribution.
`kurt`, `kurt_pop` and `kurtosis` are used to calculate the
[kurtosis](https://en.wikipedia.org/wiki/Kurtosis) of a data
distribution.

The implementation is based on ClickHouse/ClickHouse#5200, with the result
type changed to AlwaysNullable because Doris does not support NaN.

The formula used to calculate skew is `3rd central moment / variance^{1.5}`.
The formula used to calculate kurt is `4th central moment / variance^{2} - 3`.

When the value of a result is NaN, Doris returns NULL.
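
To make the two formulas concrete, here is a minimal C++ sketch, not the code in this PR. It assumes the state holds the power sums `m[k] = sum of x^k` (as the review snippets on this page suggest) and models a NULL result with `std::optional`. The names `MomentSketch`, `central2`, `central3`, `central4`, `null_if_nan`, `skew` and `kurt` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical helper: map NaN to "no value", standing in for SQL NULL.
static std::optional<double> null_if_nan(double v) {
    if (std::isnan(v)) { return std::nullopt; }
    return v;
}

// Hypothetical stand-in for the aggregate state, not the PR's class.
struct MomentSketch {
    double m[5] = {0, 0, 0, 0, 0};  // m[k] = sum of x^k, m[0] = row count

    void add(double x) {
        ++m[0];
        m[1] += x;
        m[2] += x * x;
        m[3] += x * x * x;
        m[4] += x * x * x * x;
    }

    // Population central moments derived from the power sums.
    // Precondition for all three: at least one value was added.
    double central2() const {  // variance
        assert(m[0] > 0);
        return (m[2] - m[1] * m[1] / m[0]) / m[0];
    }
    double central3() const {
        assert(m[0] > 0);
        return (m[3] - (3 * m[2] - 2 * m[1] * m[1] / m[0]) * m[1] / m[0]) / m[0];
    }
    double central4() const {
        assert(m[0] > 0);
        return (m[4] - (4 * m[3] - (6 * m[2] - 3 * m[1] * m[1] / m[0]) * m[1] / m[0])
                               * m[1] / m[0]) / m[0];
    }

    // skew = 3rd central moment / variance^1.5; NaN (e.g. zero variance) becomes NULL.
    std::optional<double> skew() const {
        if (m[0] == 0) { return std::nullopt; }
        return null_if_nan(central3() / std::pow(central2(), 1.5));
    }

    // kurt = 4th central moment / variance^2 - 3; NaN becomes NULL.
    std::optional<double> kurt() const {
        if (m[0] == 0) { return std::nullopt; }
        const double v = central2();
        return null_if_nan(central4() / (v * v) - 3);
    }
};
```

For the input 1, 2, 3, 4, 5 this sketch gives a skew of 0 and a kurt of about -1.3. An empty input or a constant column yields no value, which corresponds to the NULL result described above.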

doc: apache/doris-website#1127
@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@github-actions github-actions bot left a comment

clang-tidy made some suggestions


#pragma once

#include <stddef.h>

warning: inclusion of deprecated C++ header 'stddef.h'; consider using 'cstddef' instead [modernize-deprecated-headers]

Suggested change
#include <stddef.h>
#include <cstddef>

++m[0];
m[1] += x;
m[2] += x * x;
if constexpr (_level >= 3) m[3] += x * x * x;

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if constexpr (_level >= 3) m[3] += x * x * x;
if constexpr (_level >= 3) { m[3] += x * x * x;
}

m[1] += x;
m[2] += x * x;
if constexpr (_level >= 3) m[3] += x * x * x;
if constexpr (_level >= 4) m[4] += x * x * x * x;

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if constexpr (_level >= 4) m[4] += x * x * x * x;
if constexpr (_level >= 4) { m[4] += x * x * x * x;
}

m[0] += rhs.m[0];
m[1] += rhs.m[1];
m[2] += rhs.m[2];
if constexpr (_level >= 3) m[3] += rhs.m[3];

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if constexpr (_level >= 3) m[3] += rhs.m[3];
if constexpr (_level >= 3) { m[3] += rhs.m[3];
}

m[1] += rhs.m[1];
m[2] += rhs.m[2];
if constexpr (_level >= 3) m[3] += rhs.m[3];
if constexpr (_level >= 4) m[4] += rhs.m[4];

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if constexpr (_level >= 4) m[4] += rhs.m[4];
if constexpr (_level >= 4) { m[4] += rhs.m[4];
}
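
The `rhs` additions in the snippets above suggest that partial states are combined by adding their power sums element by element. A minimal sketch of that idea, with the braces clang-tidy asks for, might look like the following. `PowerSums` and `Level` are hypothetical names standing in for the PR's state class and its `_level` parameter.

```cpp
#include <cstddef>

// Hypothetical stand-in for the PR's moments state; `Level` mirrors `_level`.
template <std::size_t Level>
struct PowerSums {
    static_assert(Level >= 2 && Level <= 4, "Level must be 2, 3 or 4");

    double m[Level + 1] = {};  // m[k] = sum of x^k, m[0] = row count

    // Fold one value into the running sums.
    void add(double x) {
        ++m[0];
        m[1] += x;
        m[2] += x * x;
        if constexpr (Level >= 3) {
            m[3] += x * x * x;
        }
        if constexpr (Level >= 4) {
            m[4] += x * x * x * x;
        }
    }

    // Combine with another partial state: sums of powers simply add up.
    void merge(const PowerSums& rhs) {
        m[0] += rhs.m[0];
        m[1] += rhs.m[1];
        m[2] += rhs.m[2];
        if constexpr (Level >= 3) {
            m[3] += rhs.m[3];
        }
        if constexpr (Level >= 4) {
            m[4] += rhs.m[4];
        }
    }
};
```

Because each entry is a plain sum, merging two partial states gives the same sums as accumulating all rows in one pass, up to floating-point rounding. The `if constexpr` branches keep a level-3 state from touching `m[4]`, which does not exist in that instantiation.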

ErrorCode::INTERNAL_ERROR,
"Variation moments should be obtained by 'get_population' method");
} else {
if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN();

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN();
if (m[0] == 0) { return std::numeric_limits<T>::quiet_NaN();
}

} else {
if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN();
// to avoid accuracy problem
if (m[0] == 1) return 0;

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if (m[0] == 1) return 0;
if (m[0] == 1) { return 0;
}

ErrorCode::INTERNAL_ERROR,
"Variation moments should be obtained by 'get_population' method");
} else {
if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN();

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN();
if (m[0] == 0) { return std::numeric_limits<T>::quiet_NaN();
}

} else {
if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN();
// to avoid accuracy problem
if (m[0] == 1) return 0;

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
if (m[0] == 1) return 0;
if (m[0] == 1) { return 0;
}

Comment on lines +110 to +111
return;
}

warning: redundant return statement at the end of a function with a void return type [readability-redundant-control-flow]

Suggested change
return;
}
}
