getMaxWorkspaceSize Function — PyTorch Architecture
Architecture documentation for the getMaxWorkspaceSize template function in Conv_v7.cpp from the PyTorch codebase.
Entity Profile
Source Code
aten/src/ATen/native/cudnn/Conv_v7.cpp lines 214–238
template <typename algo_t>
size_t getMaxWorkspaceSize(
    const ConvolutionArgs& args,
    const algo_t* algo,
    int n_algo) {
  size_t max_ws_size = 0;
  size_t max_block_size = 0;
  const auto device = c10::cuda::current_device();
  // For the native allocator, retrieves the size of the largest unused block.
  // For cudaMallocAsync, see c10/cuda/CUDAMallocAsync.cpp:cacheInfo for
  // details.
  c10::cuda::CUDACachingAllocator::cacheInfo(device, &max_block_size);
  for (const auto i : c10::irange(n_algo)) {
    cudnnStatus_t err;
    size_t sz;
    err = getWorkspaceSize(args, algo[i], &sz);
    if (CUDNN_STATUS_SUCCESS != err || sz == 0 || sz < max_ws_size ||
        sz > max_block_size)
      continue;
    max_ws_size = sz;
  }
  return max_ws_size;
}