Hi guys, I really want to run this model using Unsloth's Q6_K_XL quant (194 GB), or perhaps one of the AWQ / FP8 quants.

My setup is complicated, though. I have two servers:

Server A - 4 x RTX 3090, Threadripper 1900X, 64 GB of DDR4 RAM (2133 MT/s), quad channel

Server B - 2 x RTX 3090, 2 x Xeon E5-2695 v4, 512 GB of DDR4 ECC RAM (2133 MT/s), quad channel per CPU (8 channels in total when using both NUMA nodes, or 4 channels when using one)

I have another, 7th 3090 in my main work PC. I could throw it in somewhere if it made a difference, but I would prefer to get it done with 6.

I can’t place all 6 GPUs in Server B, as its motherboard does not support PCIe bifurcation and it does not have enough PCIe lanes for all 6 GPUs alongside the other PCIe cards (NVMe storage over PCIe and a NIC).
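
For reference, here is a rough back-of-the-envelope sketch of the numbers above. It assumes 24 GB of VRAM per RTX 3090 and 8 bytes per transfer per DDR4 channel (64-bit channels), and it treats the 194 GB figure as weights only, so KV cache and activations would come on top. The bandwidth values are theoretical peaks, and the function names are just illustrative.

```python
GPU_VRAM_GB = 24   # VRAM per RTX 3090
MODEL_GB = 194     # Q6_K_XL size quoted above (weights only)


def total_vram_gb(num_gpus: int) -> int:
    """Combined VRAM across identical 24 GB cards."""
    return num_gpus * GPU_VRAM_GB


def spill_to_ram_gb(num_gpus: int) -> int:
    """Weights that do not fit in VRAM and would have to live in system RAM."""
    return max(0, MODEL_GB - total_vram_gb(num_gpus))


def ddr4_peak_gb_per_s(mt_per_s: int, channels: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DDR4 bandwidth in GB/s (real throughput is lower)."""
    return mt_per_s * bytes_per_transfer * channels / 1000


for gpus in (6, 7):
    print(f"{gpus} GPUs: {total_vram_gb(gpus)} GB VRAM, "
          f"{spill_to_ram_gb(gpus)} GB of weights left for system RAM")

# Server A, or Server B on one NUMA node: 4 channels. Server B on both nodes: 8.
for channels in (4, 8):
    print(f"{channels} channels @ 2133 MT/s: "
          f"{ddr4_peak_gb_per_s(2133, channels):.1f} GB/s peak")
```

Under these assumptions, 6 GPUs give 144 GB of VRAM, so about 50 GB of the weights would not fit, and a 7th card only brings that down to about 26 GB. Peak RAM bandwidth works out to roughly 68 GB/s on four channels and roughly 137 GB/s on eight.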

I CA…
