getMaxWorkspaceSize Function — PyTorch Architecture
Architecture documentation for the getMaxWorkspaceSize template function in Conv_v7.cpp from the PyTorch codebase.
Entity Profile
Source Code
aten/src/ATen/native/cudnn/Conv_v7.cpp lines 214–238
template <typename algo_t>
size_t getMaxWorkspaceSize(
    const ConvolutionArgs& args,
    const algo_t* algo,
    int n_algo) {
  size_t max_ws_size = 0;
  size_t max_block_size = 0;
  const auto device = c10::cuda::current_device();
  // For the native allocator, retrieves the size of the largest unused block.
  // For cudaMallocAsync, see c10/cuda/CUDAMallocAsync.cpp:cacheInfo for
  // details.
  c10::cuda::CUDACachingAllocator::cacheInfo(device, &max_block_size);
  for (const auto i : c10::irange(n_algo)) {
    cudnnStatus_t err;
    size_t sz;
    err = getWorkspaceSize(args, algo[i], &sz);
    if (CUDNN_STATUS_SUCCESS != err || sz == 0 || sz < max_ws_size ||
        sz > max_block_size)
      continue;
    max_ws_size = sz;
  }
  return max_ws_size;
}