Customize Large Language Models on Cloud, Azure, GCP, AWS
LLM fine-tuning needs more consideration when done at scale: infrastructure and hardware, training data, training environment, base model, and cost. Let's see what the major cloud providers offer.
Azure
Chat completion fine-tuning
Training data (each example is a single JSONL line containing three message roles):
- The system message
- The user message
- The assistant’s response
{"messages": [{"role": "system", "content": "You are an Xbox customer support agent whose primary goal is to help users with issues they are experiencing with their Xbox devices. You are friendly and concise. You only provide factual answers to queries, and do not provide answers that are not related to Xbox."}, {"role": "user", "content": "Is Xbox better than PlayStation?"}, {"role": "assistant", "content": "I apologize, but I cannot provide personal opinions. My primary job is to assist you with any issues related to your Xbox device. Do you have any Xbox-related issues that need addressing?"}]}
Fine-tuning job steps:
- Select a base model.
- Select your training data.
- (Optional) Select your validation data.
- Configure the advanced options (batch_size, learning_rate_multiplier, n_epochs, seed); a minimal API sketch follows this list.
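For illustration, a minimal sketch of creating such a job with the Azure OpenAI Python SDK. The endpoint URL, API version, file name, and hyperparameter values are assumptions, not values from this article; seed support depends on the API version.

# A minimal sketch, assuming the openai Python package (>= 1.x) with Azure support.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # assumption
    api_key="<your-api-key>",
    api_version="2024-02-01",  # assumption: a version that supports fine-tuning
)

# Upload the JSONL training file in the format shown above.
training_file = client.files.create(
    file=open("training_data.jsonl", "rb"), purpose="fine-tune"
)

# Create the fine-tuning job with the advanced options from the list above.
job = client.fine_tuning.jobs.create(
    model="gpt-35-turbo-0613",  # assumption: a base model that supports fine-tuning
    training_file=training_file.id,
    hyperparameters={"n_epochs": 3, "batch_size": 1, "learning_rate_multiplier": 1.0},
    seed=42,
)
print(job.id, job.status)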
Python-based fine-tuning using Hugging Face transformers + PEFT/QLoRA + Docker, sketched below.
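A minimal QLoRA training sketch with transformers, peft, bitsandbytes, and datasets; the model name, data file, LoRA settings, and hyperparameters are all placeholder assumptions.

import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # assumption: any causal LM works here

# Load the base model in 4-bit to cut GPU memory (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config,
                                             device_map="auto")
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; the 4-bit base weights stay frozen.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Assumption: train.jsonl has a "text" field with one training example per line.
dataset = load_dataset("json", data_files="train.jsonl")["train"]
dataset = dataset.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=2,
                           num_train_epochs=3, learning_rate=2e-4, bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("out/adapter")  # saves only the small LoRA adapter weights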
Steps for using a customized Docker environment (Dockerfile, requirements.txt); a sketch follows.
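A hypothetical Dockerfile for such a training environment; the base image tag and file names are assumptions.

# Assumption: a base image with CUDA and PyTorch preinstalled.
FROM pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
WORKDIR /workspace
# requirements.txt would pin transformers, peft, bitsandbytes, datasets, accelerate.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY train.py .
ENTRYPOINT ["python", "train.py"]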
Use a custom container to deploy a model to an online endpoint (sketched after this list):
- Register model on Azure ML
- Create custom docker image and store in ACR
- Create Azure ML Environment from custom docker image
- Create online endpoint and deployment
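A condensed sketch of those four steps with the azure-ai-ml (v2) SDK; every name, image URI, and instance type below is a placeholder assumption.

from azure.ai.ml import MLClient
from azure.ai.ml.entities import (Model, Environment, ManagedOnlineEndpoint,
                                  ManagedOnlineDeployment)
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(),
                     subscription_id="<sub-id>", resource_group_name="<rg>",
                     workspace_name="<workspace>")

# 1. Register the fine-tuned model on Azure ML.
model = ml_client.models.create_or_update(
    Model(name="my-ft-model", path="outputs/model"))

# 2-3. Reference the custom image pushed to ACR as an Azure ML Environment.
env = ml_client.environments.create_or_update(
    Environment(name="my-ft-env",
                image="<acr-name>.azurecr.io/my-ft-image:latest"))

# 4. Create the online endpoint and a deployment behind it.
ml_client.online_endpoints.begin_create_or_update(
    ManagedOnlineEndpoint(name="my-ft-endpoint")).result()
ml_client.online_deployments.begin_create_or_update(
    ManagedOnlineDeployment(name="blue", endpoint_name="my-ft-endpoint",
                            model=model, environment=env,
                            instance_type="Standard_NC6s_v3",
                            instance_count=1)).result()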
Azure ML's model catalog also includes curated models from the Hugging Face Hub, each with a supported model path.
GCP
- Training data preparation
- Approach: full vs PEFT
- Training (hyperparameters like learning rate, batch size, and number of epochs)
- Performance evaluation and deployment (see the Vertex AI sketch after this list)
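On GCP this workflow maps to Vertex AI tuning jobs. A minimal supervised fine-tuning sketch with the Vertex AI Python SDK's preview tuning module; the project, bucket paths, base model, and hyperparameters are assumptions.

import vertexai
from vertexai.preview.tuning import sft

vertexai.init(project="<project-id>", location="us-central1")

# Launch a supervised tuning job on JSONL data in Cloud Storage.
tuning_job = sft.train(
    source_model="gemini-1.0-pro-002",          # assumption: a tunable base model
    train_dataset="gs://<bucket>/train.jsonl",  # prepared training data
    validation_dataset="gs://<bucket>/val.jsonl",
    epochs=4,                                   # hyperparameters from the list above
    learning_rate_multiplier=1.0,
)

# Once the job finishes, the tuned model is deployed to an endpoint for evaluation.
print(tuning_job.tuned_model_endpoint_name)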
AWS
Main technologies: SageMaker, Bedrock, and notebooks.
Run training with the Hugging Face estimator, as sketched below.
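A minimal sketch with the SageMaker Python SDK; the training script, instance type, framework versions, and S3 paths are assumptions.

import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()  # assumes a SageMaker notebook/Studio session

# Point the estimator at your own training script (e.g. the QLoRA script above).
huggingface_estimator = HuggingFace(
    entry_point="train.py",
    source_dir="./scripts",
    role=role,
    instance_type="ml.g5.2xlarge",   # assumption: a single-GPU instance
    instance_count=1,
    transformers_version="4.28",
    pytorch_version="2.0",
    py_version="py310",
    hyperparameters={"epochs": 3, "per_device_train_batch_size": 2, "lr": 2e-4},
)

# Launch the managed training job; the data channel lives in S3.
huggingface_estimator.fit({"train": "s3://<bucket>/train/"})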
SageMaker JumpStart is, in essence, SageMaker's model zoo. To deploy a Cohere foundation model, you can use the cohere-sagemaker SDK, which further simplifies deployment as a wrapper around the usual SageMaker inference constructs (SageMaker Model, SageMaker Endpoint Configuration, and SageMaker Endpoint).
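For JumpStart models in general, the SageMaker SDK's JumpStartModel class offers a similar shortcut; the model_id, instance type, and payload below are illustrative assumptions.

from sagemaker.jumpstart.model import JumpStartModel

# Deploy a JumpStart foundation model to a real-time endpoint.
model = JumpStartModel(model_id="huggingface-llm-falcon-7b-bf16")  # assumption
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")

response = predictor.predict({"inputs": "Hello, my Xbox controller"})
print(response)
predictor.delete_endpoint()  # clean up when done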
Amazon SageMaker JumpStart lets you host your own machine learning models and choose infrastructure components such as instance sizes and deployment endpoints. In contrast, Amazon Bedrock is a fully managed service: you simply make API calls to foundation models hosted on AWS.
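A minimal Bedrock call via boto3 to show the contrast; the region, model ID, and request body (which follows Anthropic's messages schema) are assumptions.

import json
import boto3

# Bedrock is fully managed: no endpoint to create, just an API call.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # assumption: any enabled model
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Summarize LLM fine-tuning options."}],
    }),
)
print(json.loads(response["body"].read()))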
Other
Multi-GPU & multi-node support (e.g. launched with torchrun, sketched below)
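For self-managed training such as the transformers + PEFT setup above, multi-GPU and multi-node runs are commonly launched with torchrun; a typical invocation, where the node counts, head-node address, and script arguments are assumptions.

torchrun --nnodes=2 --nproc_per_node=8 \
         --rdzv_backend=c10d --rdzv_endpoint=<head-node>:29500 \
         train.py --per_device_train_batch_size 2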