

The following tool collects the relevant logs from the system. Run it with -h to display its usage:
$ sudo /usr/local/cuda/gds/tools/gdstools/gds_log_collection.py -h
This section describes how to resolve a kernel panic with stack traces using NVSM, for DGX BaseOS with the preview network repo. For more details on running NVSM commands, refer to the NVIDIA System Management User Guide.

GDS Environment Variables

CUFILE_ENV_PATH_JSON=/home/user/cufile.json
Controls the path where the cuFile library reads the configuration variables from. This can be used for container environments and applications that require different configuration settings from the system default configuration.

CUFILE_LOGFILE_PATH=/etc/log/cufile_$$.log
Controls the path for cuFile log information. Overrides the default log path, which is the current working directory of the application.

Other GDS environment variables do the following:
- Control whether cuFile checks for supporting filesystems. When set to 1, the variable allows testing with new filesystems.
- Control the tracing level. This can override the trace level for a specific application without requiring a new configuration file.
- Set the QoS level on the RoCEv2 device QP for userspace RDMA targets.
- Set the QoS level on the IB device QP for userspace RDMA targets.
- Enable NVTX tracing for use with Nsight Systems.
- Control the DC_KEY for userspace RDMA DC targets for WekaFS.
- Set the completion queue depth for the DC target.
- Set the maximum number of hops before the packet is discarded on the network, which prevents indefinite looping of the packet.
- Set the minimum RNR value for the QP, after which the QP errors out with an RNR timeout if no Work Request is posted on the remote end.
- Set the maximum number of Work Requests supported by the shared queue.
- Set the maximum number of Scatter Gather Entries supported per Work Request.
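As a rough illustration of the per-application override described for CUFILE_ENV_PATH_JSON and CUFILE_LOGFILE_PATH, the sketch below sets both variables only for one child process instead of system-wide. The two variable names come from the text; the helper names (build_gds_env, run_with_gds_env), the log path /tmp/cufile_app.log, and the stand-in child command are assumptions made for the example, not part of GDS.

```python
import os
import subprocess
import sys


def build_gds_env(config_json, log_file, base_env=None):
    """Return a copy of the environment with the two cuFile variables set.

    Only CUFILE_ENV_PATH_JSON and CUFILE_LOGFILE_PATH come from the text;
    the function name and its arguments are illustrative.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUFILE_ENV_PATH_JSON"] = config_json
    env["CUFILE_LOGFILE_PATH"] = log_file
    return env


def run_with_gds_env(command, config_json, log_file):
    """Launch command (a list of arguments) with the per-process settings."""
    env = build_gds_env(config_json, log_file)
    return subprocess.run(command, env=env, check=False)


if __name__ == "__main__":
    # Stand-in child process that only echoes the variable it inherited.
    # A real GDS application would be launched here instead.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['CUFILE_ENV_PATH_JSON'])",
    ]
    run_with_gds_env(child, "/home/user/cufile.json", "/tmp/cufile_app.log")
```

Because the variables are placed in a copy of the environment, the parent process and other applications keep the system default configuration.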

Note: For best GDS performance, disable PCIe ACS.

Sample output:
GDS release version: 1.0.0.80
nvidia_fs version: 2.7 libcufile version: 2.4
rdma library : Not Loaded (libcufile_rdma.so)
properties.max_batch_io_timeout_msecs : 5
properties.max_device_cache_size_kb : 131072
properties.max_device_pinned_mem_size_kb : 33554432
properties.posix_pool_slab_count : 128 64 32
fs.generic.posix_unaligned_writes : false
properties.rdma_peer_affinity_policy : RoundRobin
miscellaneous.api_check_aggressive : false
GPU index 0 A100-PCIE-40GB bar:1 bar size (MiB):65536 supports GDS
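When comparing configurations across machines, it can help to turn the "key : value" lines of the sample output into a dictionary. The following is a minimal sketch that assumes the output has already been captured as text, one setting per line; the function name parse_properties and the choice of which key prefixes to keep are assumptions for illustration.

```python
def parse_properties(output):
    """Collect 'key : value' settings from the captured text into a dict.

    Only lines whose key starts with 'properties.', 'fs.' or
    'miscellaneous.' are kept, matching the prefixes in the sample output.
    Keys are lowercased so differently capitalized copies compare equal.
    """
    prefixes = ("properties.", "fs.", "miscellaneous.")
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip().lower()
        if sep and key.startswith(prefixes):
            settings[key] = value.strip()
    return settings


sample = """\
properties.max_batch_io_timeout_msecs : 5
properties.max_device_cache_size_kb : 131072
properties.posix_pool_slab_count : 128 64 32
fs.generic.posix_unaligned_writes : false
GPU index 0 A100-PCIE-40GB bar:1 bar size (MiB):65536 supports GDS
"""

settings = parse_properties(sample)
print(settings["properties.max_device_cache_size_kb"])
```

Lines that do not start with one of the listed prefixes, such as the GPU summary line, are skipped.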
