Setup Guide

Deploy a cloud GPU pod and start running in minutes.

⏱ ~5 minutes to deploy
Step 1

Create Your Account

Sign up at gpuhog.com.

After signing up, you land on your Dashboard — the control center for all your pods.

Step 2

Choose a GPU

Select the GPU that matches your workload. GPU Hog offers three tiers:

Tier    GPU         VRAM / Architecture          Price       Typical workloads
Value   RTX 4090    24 GB VRAM · Ada Lovelace    $0.49/hr    Fine-tuning · Inference · LoRA
Pro     A100 80GB   80 GB VRAM · Ampere          $1.69/hr    Full training · Large models · Multi-GPU
Max     H100 SXM    80 GB VRAM · Hopper          $3.49/hr    Production · Distributed training
💡
Which GPU do I need? If your model fits in 24GB of VRAM (most 7B models, LoRA fine-tunes, inference), the RTX 4090 is the best value. For full fine-tuning of 13B+ parameter models or multi-GPU setups, choose the A100 or H100.
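As a rough rule of thumb (an illustrative assumption, not GPU Hog's official sizing guidance), model weights alone take parameter count × bytes per parameter — 2 bytes in fp16/bf16 — with extra headroom needed for activations, and for full fine-tuning, gradients and optimizer state. A minimal sketch:

```python
def estimate_weight_vram_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Rough VRAM needed just to hold model weights (fp16/bf16 = 2 bytes/param).

    Inference and LoRA add some overhead on top of this; full fine-tuning with
    Adam can need roughly 4x more (gradients + optimizer state).
    """
    return params_billions * 1e9 * bytes_per_param / 1e9  # result in GB

# A 7B model in fp16: ~14 GB of weights -> fits a 24 GB RTX 4090
print(estimate_weight_vram_gb(7))    # 14.0
# A 13B model in fp16: ~26 GB of weights -> needs an 80 GB A100/H100
print(estimate_weight_vram_gb(13))   # 26.0
```

This is why the tip above draws the line at 24 GB: once weights alone approach your card's VRAM, step up a tier.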
Step 3

Configure Your Pod

A pod is your cloud GPU instance — a personal cloud machine with GPU(s) attached. It runs a Docker container pre-loaded with everything you need: Python, PyTorch, CUDA drivers, and common ML libraries. Think of it as renting a powerful workstation by the minute. Set up the following:

⚙️ Pod Configuration

Pod Name: my-training-pod
GPU: RTX 4090 × 1
Template: PyTorch 2.4 / CUDA 12.4 / Python 3.11
Container Disk: 50 GB
Volume Disk: 100 GB — mounted at /workspace
Estimated Cost: $0.49/hr (~$11.76/day)
Step 4

Deploy

Click Deploy Pod. Your pod provisions in 30–90 seconds. You will see the status change in your dashboard:

Your Pods

Name             GPU       Status    Uptime   Cost
my-training-pod  RTX 4090  Running   2m 14s   $0.49/hr
Billing starts when the pod is running and stops when you stop the pod. You are billed by the minute. Volume storage has a small ongoing charge even when stopped.
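Per-minute billing is easy to reason about. A small sketch of the arithmetic (rates mirror the tier table above; treat them as illustrative, not a billing contract):

```python
def pod_cost(rate_per_hour: float, minutes: int) -> float:
    """Cost of GPU time billed by the minute, rounded to cents."""
    return round(rate_per_hour * minutes / 60, 2)

# 3 hours on an RTX 4090 at $0.49/hr
print(pod_cost(0.49, 180))       # 1.47
# A full day on an A100 at $1.69/hr
print(pod_cost(1.69, 24 * 60))   # 40.56
```

Note this covers GPU time only; volume storage carries its own small charge even while the pod is stopped.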
Step 5

Connect to Your Pod

Once your pod shows Running, click Connect to see your connection options:

🌐

Web Terminal

Browser-based terminal. No setup needed. Great for quick tasks.

🖥️

SSH

Full terminal access from your machine. Best for long-running work.

📓

JupyterLab

Notebook environment for data science. Pre-installed on PyTorch templates.

📝

VS Code / Cursor

Connect your local IDE directly to the pod for full development.

Example SSH connection details:

Command: ssh root@205.185.xx.xx -p 22110 -i ~/.ssh/id_ed25519
IP Address: 205.185.xx.xx
SSH Port: 22110
🔑
SSH Keys: Add your public SSH key in Settings → SSH Keys on your dashboard before deploying. This lets you connect without a password. We will cover SSH key setup in a future guide.
Step 6

Using the Console

Once connected, you are inside a full Linux environment. Your GPU is ready to use. Here is what you will see:

root@gpuhog — my-training-pod
# Check your GPU
root@gpuhog:~# nvidia-smi
+-----------------------------------------+------------------------+----------------------+
| NVIDIA-SMI 550.54.15          Driver Version: 550.54.15          CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        On  |   00000000:01:00.0 Off |                  Off |
| 30%   32C    P8             25W /  450W |       0MiB /  24564MiB |      0%      Default |
+-----------------------------------------+------------------------+----------------------+

# Verify PyTorch sees the GPU
root@gpuhog:~# python3 -c "import torch; print(torch.cuda.get_device_name(0))"
NVIDIA GeForce RTX 4090

# Check available disk space
root@gpuhog:~# df -h /workspace
Filesystem Size Used Avail Use% Mounted on
/dev/vdb 100G 1.2G 99G 2% /workspace

root@gpuhog:~# _

Pre-installed on your pod (per the template selected in Step 3): Python 3.11, PyTorch 2.4, the CUDA 12.4 toolkit, and common ML libraries.

💡
Live GPU monitoring: Run watch -n1 nvidia-smi in a separate terminal or tmux pane to watch GPU utilization, memory, and temperature in real-time while training.
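Beyond watching the table, nvidia-smi can emit machine-readable CSV (`--query-gpu=... --format=csv,noheader,nounits`), which is handy for logging utilization during a run. A sketch of parsing that output — the sample line is made up, and the field order follows the query options chosen here:

```python
def parse_gpu_csv(line: str) -> dict:
    """Parse one line of:
    nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total,temperature.gpu \
               --format=csv,noheader,nounits
    """
    util, mem_used, mem_total, temp = (int(x.strip()) for x in line.split(","))
    return {"util_pct": util, "mem_used_mib": mem_used,
            "mem_total_mib": mem_total, "temp_c": temp}

# Illustrative sample of what nvidia-smi would print for a busy RTX 4090
sample = "97, 21345, 24564, 68"
stats = parse_gpu_csv(sample)
print(stats["util_pct"], stats["mem_used_mib"])  # 97 21345
```

On the pod you would feed real lines in via `subprocess.run(...)` and append the parsed stats to your training logs.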
Step 7

Upload Your Data

Get your training data, scripts, and models onto the pod. Multiple methods available:

Uploading Data
# Method 1: Download from a URL
root@gpuhog:~# cd /workspace
root@gpuhog:/workspace# wget https://your-site.com/training-data.json
training-data.json 100%[=================>] 42.5M 28.3MB/s in 1.5s

# Method 2: Clone a git repo
root@gpuhog:/workspace# git clone https://github.com/you/your-project.git
Cloning into 'your-project'... done.

# Method 3: SCP from your local machine (run this on YOUR computer)
local:~$ scp -P 22110 ./my-data.json root@205.185.xx.xx:/workspace/
my-data.json 100% 42MB 18.2MB/s 00:02

# Method 4: Hugging Face datasets
root@gpuhog:/workspace# python3 -c "from datasets import load_dataset; ds = load_dataset('your-dataset'); print(f'Loaded {len(ds[\"train\"])} examples')"
Loaded 5,000 examples
⚠️
Always save to /workspace! This is your persistent volume. Files stored anywhere else (like /root or /tmp) are wiped when you stop the pod. Your volume data persists across restarts.
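One way to guard against losing work is to check, before a long run starts, that your output path actually lives on the persistent volume. A tiny sketch — the /workspace convention comes from this guide, but the helper name is ours:

```python
from pathlib import Path

PERSISTENT_ROOT = Path("/workspace")  # the pod's persistent volume mount

def ensure_persistent(path: str) -> Path:
    """Fail fast if an output path would be wiped when the pod stops."""
    p = Path(path).resolve()
    if p != PERSISTENT_ROOT and PERSISTENT_ROOT not in p.parents:
        raise ValueError(f"{p} is outside {PERSISTENT_ROOT}; it will be wiped on stop")
    return p

print(ensure_persistent("/workspace/output"))  # /workspace/output
# ensure_persistent("/tmp/output") would raise ValueError
```

Calling this at the top of a training script turns a silent data loss into an immediate, obvious error.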
Step 8

Run Your Job

Launch your training, fine-tuning, or inference workload. Always use tmux for long-running jobs: it keeps your session alive even if your SSH connection drops, so training continues in the background. Start a session with tmux new -s train, detach with Ctrl+B then D, and reattach with tmux attach -t train.

Training Session
# Start a tmux session (keeps running if you disconnect)
root@gpuhog:/workspace# tmux new -s training

# Install any additional packages
root@gpuhog:/workspace# pip install trl peft wandb --quiet
Successfully installed trl-0.9.6 peft-0.12.0 wandb-0.17.5

# Run your training script
root@gpuhog:/workspace# python3 train.py \
--model_name meta-llama/Llama-3-8B \
--dataset /workspace/training-data.json \
--output_dir /workspace/output \
--num_epochs 3 \
--batch_size 4 \
--learning_rate 2e-4

Loading base model.............. done
Loading dataset (1,250 examples). done
Starting training on 1x RTX 4090

Epoch 1/3 ████████████████████ 100% Loss: 1.4231 12:34
Epoch 2/3 ████████████████████ 100% Loss: 0.8704 12:28
Epoch 3/3 ████████████████████ 100% Loss: 0.5102 12:31

✔ Training complete!
Model saved to: /workspace/output/
Total time: 37m 33s | Peak GPU memory: 21.3 GB / 24.0 GB
💡
Disconnect safely: Press Ctrl+B then D to detach from tmux. Your training keeps running. Reconnect later with tmux attach -t training. You can even close your laptop.
Step 9

Download Your Results

When your job finishes, get your results off the pod:

Downloading Results
# Option 1: SCP to your local machine (run on YOUR computer)
local:~$ scp -r -P 22110 root@205.185.xx.xx:/workspace/output ./my-trained-model
adapter_model.safetensors 100% 2.4GB 45.2MB/s 00:53
adapter_config.json 100% 1.2KB 0.0s
✔ Download complete

# Option 2: Push to Hugging Face Hub (run on the pod)
root@gpuhog:/workspace# huggingface-cli upload your-name/your-model ./output
✔ https://huggingface.co/your-name/your-model

# Option 3: Upload to cloud storage
root@gpuhog:/workspace# aws s3 cp ./output s3://your-bucket/models/ --recursive
Step 10

Stop & Manage Your Pod

When you are done, stop your pod to stop GPU billing:

Your Pods

Name             GPU       Status    Total Cost
my-training-pod  RTX 4090  Stopped   $1.47 (3h used)
🚨
Do not forget to stop your pod! Running pods bill by the minute, 24/7. If you are done working, stop the pod. You can always restart it later with your files intact.
Reference

Quick Reference Commands

Cheat Sheet
# GPU monitoring (live refresh every second)
$ watch -n1 nvidia-smi

# Check disk space
$ df -h /workspace

# Check running processes on GPU
$ nvidia-smi --query-compute-apps=pid,name,used_memory --format=csv

# Start tmux session
$ tmux new -s mysession

# Detach from tmux (keeps running)
Press Ctrl+B, then D

# Reattach to tmux
$ tmux attach -t mysession

# System resources
$ htop

# Python + PyTorch quick check
$ python3 -c "import torch; print(f'GPU: {torch.cuda.get_device_name(0)}, VRAM: {torch.cuda.get_device_properties(0).total_memory/1e9:.1f}GB')"
💬
Need help? Check the setup guide or reach out to our team through your dashboard.