sum Function Template - pytorch Architecture
Architecture documentation for the sum function template in BlasKernel.cpp from the PyTorch codebase.
Source Code
aten/src/ATen/native/cpu/BlasKernel.cpp lines 79–100
template <typename Func>
auto sum(int64_t N, Func f) {
  constexpr int ilp_factor = 4;
  using acc_t = decltype(f(0));

  // Calculate independent partial sums then add together at the end
  std::array<acc_t, ilp_factor> partial_sums{};

  int64_t i = 0;
  for (; i + ilp_factor <= N; i += ilp_factor) {
    c10::ForcedUnroll<ilp_factor>{}([&](int k) {
      partial_sums[k] += f(i + k);
    });
  }
  for (; i < N; ++i) {
    partial_sums[0] += f(i);
  }
  for (int k = 1; k < ilp_factor; ++k) {
    partial_sums[0] += partial_sums[k];
  }
  return partial_sums[0];
}
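The listing above keeps four independent accumulators so that consecutive additions do not depend on each other, then folds them into one result at the end. The following is a minimal standalone sketch of the same pattern that compiles without PyTorch headers (C++17). The names `forced_unroll`, `sum_ilp`, and `dot` are hypothetical and exist only for illustration; in particular, `forced_unroll` is an assumed stand-in for `c10::ForcedUnroll`, not its actual implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for c10::ForcedUnroll: invokes f(0), f(1), ..., f(N-1)
// through a fold expression, so the calls are expanded at compile time.
template <int... Ks, typename F>
void forced_unroll_impl(std::integer_sequence<int, Ks...>, F& f) {
  (f(Ks), ...);
}

template <int N, typename F>
void forced_unroll(F f) {
  forced_unroll_impl(std::make_integer_sequence<int, N>{}, f);
}

// Same structure as the listing above: sums f(0) + ... + f(n-1)
// using ilp_factor independent partial sums.
template <typename Func>
auto sum_ilp(std::int64_t n, Func f) {
  constexpr int ilp_factor = 4;
  using acc_t = decltype(f(0));

  std::array<acc_t, ilp_factor> partial_sums{};  // value-initialized to zero

  std::int64_t i = 0;
  // Main loop: each accumulator takes every ilp_factor-th term.
  for (; i + ilp_factor <= n; i += ilp_factor) {
    forced_unroll<ilp_factor>([&](int k) { partial_sums[k] += f(i + k); });
  }
  // Tail: the remaining n % ilp_factor terms go into the first accumulator.
  for (; i < n; ++i) {
    partial_sums[0] += f(i);
  }
  // Combine the partial sums.
  for (int k = 1; k < ilp_factor; ++k) {
    partial_sums[0] += partial_sums[k];
  }
  return partial_sums[0];
}

// Hypothetical usage: a dot product expressed through sum_ilp.
inline float dot(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  return sum_ilp(static_cast<std::int64_t>(a.size()), [&](std::int64_t i) {
    const auto idx = static_cast<std::size_t>(i);
    return a[idx] * b[idx];
  });
}
```

Because the terms are added in a different order than in a simple sequential loop, floating-point results may differ slightly from a left-to-right sum. Integer results are unaffected as long as no overflow occurs.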