For Sale: A 12 GB GPU That Runs Local LLMs at ~60 tok/s

GIGABYTE RTX 3060 EAGLE OC 12 GB — $350 shipped anywhere in CONUS.

GIGABYTE RTX 3060 EAGLE OC 12 GB specifications

Specs via TechPowerUp’s GPU database.

I’m selling a graphics card. The quick pitch: it’s a GIGABYTE RTX 3060 EAGLE OC with 12 GB of VRAM — and that VRAM is the whole story for local AI.

Twelve gigs is the sweet spot for running modern small models entirely on your own hardware — no API keys, no per-token billing, nothing leaving the box:

  • 8B-class LLMs at ~60 tok/s. Qwen3-8B fits comfortably and generates at around 60 tokens/second — fast enough to feel interactive.
  • Runs Google’s new Gemma 4 with function calling, so you can wire it into tool-using agents locally.
  • All in vLLM — batched, production-grade serving on a single card.

For a homelab inference box, a quiet always-on coding assistant, or just learning how local models actually behave, 12 GB of VRAM at this price is hard to beat.

Specs at a glance

   
GPU NVIDIA GeForce RTX 3060 (GA106)
Memory 12 GB GDDR6
Card GIGABYTE EAGLE OC, dual-fan
Price $350 shipped — CONUS only

Interested? Reply to the original post on Mastodon and we’ll sort out the details.

Written on June 5, 2026