Questions tagged [cuda]
CUDA (Compute Unified Device Architecture) is a parallel computing platform and API created by Nvidia to perform GPGPU (General-Purpose computing on GPU).
28
questions
8
votes
2
answers
8k
views
How to run GPGPU Memory Testing
We use a lot of GPGPU computing (mostly with CUDA, but some OpenCL). Often when users are running code, the code errors out with a memory error on only one of our hosts. I suspect one of the cards ...
6
votes
2
answers
3k
views
Force a headless server to load video drivers for the GPU?
I am running a headless server on Ubuntu, with the objective of using GPUs for non-graphics computation. However, I have found that without the monitor plugged in, the kernel fails to load the ...
5
votes
4
answers
12k
views
How to set up SGE for CUDA devices?
I'm currently facing the problem of integrating GPU servers into an existing SGE environment. Using Google, I found some examples of clusters where this has been set up, but no information on how this ...
5
votes
1
answer
9k
views
Why is my CUDA GPU-Util ~70% when there are "No running processes found"?
After configuring a system with 2 Tesla K80 cards, I noticed when running nvidia-smi that one of the 4 GPUs was under heavy load despite there being "No running processes found". Why is this happening ...
5
votes
2
answers
5k
views
Can ESXi pass video card to VM to do CUDA?
I have an ESXi 4.1 running on hardware that can run 4 16-lane PCI-e cards. I would like to have access to the underlying hardware from a Linux VM, to run some CUDA programs.
So far all I can see ...
4
votes
2
answers
2k
views
8 GPU machine freezes
We have a SuperMicro GPU server with:
2x Intel(R) Xeon(R) CPU E5-2660 v4 @ 2.00GHz
512GB memory
more than enough disk space
X10DRG-O+-CPU (BIOS Version : 2.0a [current])
X9DRG-O-PCIE PCI-E expander ...
2
votes
1
answer
1k
views
Executing Cuda script in LXC container results in "cuda error: no CUDA-capable device is detected"
I followed these instructions in order to set up CUDA inside an LXC container.
When I try to execute the sample ./deviceQuery script inside the container, the following error is returned:
$ ./...
2
votes
1
answer
370
views
How important is the CPU when building a CUDA system?
I'm just a clueless sysadmin and we need to put together a couple of machines specifically for users to use CUDA. We're looking at the Dell PowerEdge T620 and jamming four CUDA cards into the sucker. ...
2
votes
1
answer
1k
views
How does AWS do GPU virtualization? [closed]
What kind of technology is Amazon using for GPU virtualization? Can multiple VMs on an AWS GPU instance concurrently share a GPU and have acceleration for their CUDA/OpenCL programs? I know following ...
2
votes
1
answer
2k
views
Hosting with CUDA support
One web application that I'm planning relies on CUDA (http://www.nvidia.com/object/cuda_home_new.html) for doing heavy math processing. I developed the software at home, but now I'm looking for ...
2
votes
0
answers
531
views
Multi-Tenancy (Multi-user) GPU Container Infrastructure Solution [closed]
What we need: Several teams from different companies want to share our GPUs for deep learning tasks (three computers with several GPUs each), so we need to manage multiple GPUs for multiple users.
Different ...
2
votes
0
answers
295
views
ESXi PCIe GPU Passthrough does not allow for CUDA
I am trying to do CUDA development in an ESXi environment, so I installed a Quadro 5800 in my ESXi machine (Dell T7500). I did passthrough to the Windows 7 VM that I will be doing development in, but ...
2
votes
0
answers
229
views
CUDA 5.0 does not see the Tesla C2050
I have upgraded our development machine which is equipped with two Tesla C1060 cards and one Tesla C2050 card to CUDA 5.0.
The machine runs Windows Server 2008R2 (x64). All three cards are visible in ...
1
vote
2
answers
1k
views
Task spooler for computing server on Debian
Recently our university bought a computing server with one multi-core Xeon and four powerful GeForce video cards for lessons in the discipline "High performance computing with CUDA".
There is Debian ...
1
vote
1
answer
1k
views
Ubuntu server 20.04 LTS - Installing nvidia & cuda installs gnome as well
I have a GPU server which requires CUDA, for example for machine learning tasks.
Unfortunately, as soon as I install the NVIDIA drivers and CUDA, apparently a variant of GNOME is installed as well. ...
1
vote
1
answer
5k
views
Where to get a CUDA/GPU-enabled version of the HPL benchmark?
After setting up a new compute server for my research group, I need to evaluate the overall performance of this machine, including both Tesla cards. I found some information about a CUDA-enabled ...
1
vote
2
answers
1k
views
Using CUDA_VISIBLE_DEVICES with sge
I am using SGE with a resource complex called 'gpu.q' that allows resource management of GPU devices (these are all NVIDIA devices). However, on the systems there are multiple GPU devices (in exclusive mode) ...
1
vote
1
answer
385
views
Nvidia Tesla XEN PCIpassthrough
Has anyone managed (or tried) to PCI-passthrough an NVIDIA Tesla card to a Xen domU?
1
vote
1
answer
322
views
nvidia driver not present on debian bullseye after installing cuda
I'm trying to get NVIDIA GPU drivers and related software installed/upgraded on a Debian bullseye system and am having trouble. I tried following the instructions for installing CUDA, but when I get ...
1
vote
0
answers
110
views
Linux: cuda (pytorch) does not allocate available vram
I am trying out pixray/clipit, but CUDA fails to allocate the remaining 1 GiB of my graphics card's memory.
My graphics card is "Nvidia GTX 1660 super" which has the same amount of RAM as the "...
1
vote
1
answer
2k
views
Xorg not starting in GKE with GPU : (EE) no screens found(EE)
I am trying to run an Xorg server that uses the GPU inside Google Kubernetes Engine.
I followed this guide (https://cloud.google.com/kubernetes-engine/docs/how-to/gpus#ubuntu) to set up a GKE cluster with ...
1
vote
0
answers
656
views
Trouble installing GTX 480 / Tesla 2050 Dual-GPU for CUDA
I recently installed a few dual-GPU CUDA workstations (two dual GTX 480s and a dual GTX 470) with no apparent trouble. I just tried a GTX 480/Tesla C2050 combination, and not only does deviceQuery fail with a weird ...
0
votes
1
answer
3k
views
What does CUDA compute capability indicate? How to identify device CUDA C version computability?
I am attempting to identify whether my Nvidia GPU device is compatible with the latest version of CUDA. Searching the online documentation within CUDA Zone, along with the Wikipedia page, I am able to ...
0
votes
1
answer
744
views
CUDA: Is it possible to dynamically restrict the number of cores / threads / clock freq. while a process is running on a GPU?
I'm running multiple NVIDIA GTX 680 cards under Ubuntu 10.04 in a pretty hot environment (troubles with rack cooling), and sometimes they get over 95 °C. When I detect the overheating, can I somehow tell ...
0
votes
0
answers
499
views
"none of the providers can be installed" when I ran "sudo yum -y install nvidia-driver-latest-dkms"
I want to install NVIDIA Driver. I ran sudo yum -y install nvidia-driver-latest-dkms, but I got the following problem:
Error: Problem: package
nvidia-driver-latest-dkms-3:545.23.08-1.el7.x86_64 ...
0
votes
1
answer
2k
views
Cannot detect CUDA device after restarting Google Cloud Notebook
This issue happened when I restarted my cloud notebook server today.
Can be reproduced using the steps below:
Create a Google Cloud Notebook server with TensorFlow or PyTorch and a GPU
After starting the ...
0
votes
1
answer
21k
views
Difference between pip install cudatoolkit=9.0 and download install from CUDA website
I have a simple point of confusion: what is the difference between running pip install cudatoolkit=9.0 and downloading the run file from https://developer.nvidia.com/cuda-90-download-archive to install CUDA 9.0?
Anyone can ...
0
votes
1
answer
1k
views
Configure Singularity to do headless rendering / use OpenGL / glxgears / glxinfo
I want to do headless rendering on a server where I do not have root permissions. Therefore, I created a Singularity container like this:
Bootstrap: docker
From: nvidia/cuda:9.0-runtime-ubuntu16.04
%...