apply_geqrf_batched Function — PyTorch Architecture
Architecture documentation for the apply_geqrf_batched function template in BatchLinearAlgebraLibBlas.cpp from the PyTorch codebase. It dispatches a batched QR factorization (LAPACK-style geqrf, Householder reflections) to cuBLAS's geqrfBatched routine.
Entity Profile
Source Code
aten/src/ATen/native/cuda/linalg/BatchLinearAlgebraLibBlas.cpp lines 79–99
template <typename scalar_t>
void apply_geqrf_batched(const Tensor& input, const Tensor& tau) {
  auto batch_size = cuda_int_cast(batchCount(input), "batch_size");
  auto m = cuda_int_cast(input.size(-2), "m");
  auto n = cuda_int_cast(input.size(-1), "n");
  auto lda = std::max<int>(1, m);

  // cuBLAS batched geqrf requires input to be the device array of pointers to device single matrices
  Tensor input_ptr_array = get_device_pointers<scalar_t>(input);
  Tensor tau_ptr_array = get_device_pointers<scalar_t>(tau.unsqueeze(-1));
  auto input_ptr_array_data = reinterpret_cast<scalar_t**>(input_ptr_array.data_ptr());
  auto tau_ptr_array_data = reinterpret_cast<scalar_t**>(tau_ptr_array.data_ptr());

  int info;
  auto handle = at::cuda::getCurrentCUDABlasHandle();
  at::cuda::blas::geqrfBatched(handle, m, n, input_ptr_array_data, lda, tau_ptr_array_data, &info, batch_size);

  // info only indicates wrong arguments to geqrfBatched call
  // info is a host variable, we can check it without device synchronization
  TORCH_INTERNAL_ASSERT(info == 0);
}