ggml: backend-agnostic tensor parallelism by JohannesGaessler · Pull Request #19378
This PR adds support for backend-agnostic tensor parallelism, enabled by specifying --split-mode tensor. It does so by adding a new "meta" backend that internally wraps multiple "...
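
For readers unfamiliar with the term, here is a minimal sketch of the general idea behind tensor parallelism. It does not reflect the PR's actual implementation or the ggml API. A weight matrix is split column-wise into shards, each shard is multiplied independently (on a real system, each on its own device), and the partial outputs are concatenated. All function names here (`matmul`, `split_columns`, `tensor_parallel_matmul`) are hypothetical and chosen for illustration; the "devices" are simulated by a plain loop.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain row-by-column matrix product."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split_columns(w: Matrix, n_parts: int) -> List[Matrix]:
    """Split w column-wise into n_parts contiguous shards.

    If the column count is not divisible by n_parts, the earlier shards
    each receive one extra column.
    """
    n_cols = len(w[0])
    base, extra = divmod(n_cols, n_parts)
    shards, start = [], 0
    for i in range(n_parts):
        width = base + (1 if i < extra else 0)
        shards.append([row[start:start + width] for row in w])
        start += width
    return shards


def tensor_parallel_matmul(x: Matrix, w: Matrix, n_devices: int) -> Matrix:
    """Compute x @ w by sharding w across n_devices simulated devices."""
    shards = split_columns(w, n_devices)
    # Each partial product is independent; on real hardware every shard
    # would live on, and be computed by, a different device.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenate the per-device output columns back into full rows.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
result = tensor_parallel_matmul(x, w, n_devices=2)
```

The sharded result matches the unsharded product `matmul(x, w)`. A column-wise split needs only a concatenation at the end, whereas a row-wise split of the weights would instead require summing partial results across devices.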