Running MiniMax-M2 locally - Existing Hardware Advice

Hi guys, I really want to run this model on the Q6_K_XL quant (194 GB) from Unsloth, or perhaps one of the AWQ / FP8 quants.
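For context, here's the rough fit math I'm working from (my own back-of-the-envelope numbers, assuming ~24 GB usable per 3090 and ignoring KV cache / context overhead):

```python
# Rough fit check for the Q6_K_XL quant across the GPUs (illustrative only;
# assumes ~24 GB usable per RTX 3090 and ignores KV cache / activation overhead).
quant_size_gb = 194          # Unsloth Q6_K_XL size
gpus = 6
vram_per_gpu_gb = 24         # RTX 3090

total_vram_gb = gpus * vram_per_gpu_gb                    # 144 GB
spill_to_ram_gb = max(0, quant_size_gb - total_vram_gb)   # ~50 GB offloaded to system RAM

print(f"Total VRAM: {total_vram_gb} GB")
print(f"Weights that must live in CPU RAM: ~{spill_to_ram_gb} GB")
```

So even with all six GPUs pooled, a meaningful chunk of the weights ends up on the CPU side.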

My setup is complex, though; I have two servers:

Server A - 4x RTX 3090, Threadripper 1900X, 64 GB of DDR4 RAM (2133 MT/s), quad-channel.

Server B - 2x RTX 3090, 2x Xeon E5-2695 v4 CPUs, 512 GB of DDR4 ECC RAM (2133 MT/s), quad-channel per CPU (8 channels total when using both NUMA nodes, 4 channels when using one).
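Since the CPU-offloaded layers will be bound by memory bandwidth, here's my rough estimate of what that DDR4-2133 gives per socket (my own arithmetic, assuming 8 bytes per channel per transfer):

```python
# Approximate peak DRAM bandwidth per Xeon socket (illustrative estimate only).
mt_per_s = 2133              # DDR4-2133 transfers per second (millions)
bytes_per_transfer = 8       # 64-bit channel width
channels = 4                 # quad-channel per CPU

gb_per_s_per_socket = mt_per_s * bytes_per_transfer * channels / 1000  # ~68 GB/s
print(f"~{gb_per_s_per_socket:.0f} GB/s per socket, "
      f"~{2 * gb_per_s_per_socket:.0f} GB/s across both NUMA nodes (if usable)")
```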

I have a seventh 3090 in my main work PC that I could throw in somewhere if it made a difference, but I'd prefer to get this done with six.

I can't put all six GPUs in Server B: its motherboard doesn't support PCIe bifurcation and doesn't have enough PCIe lanes for all six GPUs alongside the other PCIe cards (NVMe storage over PCIe and a NIC).

I CA…
