Note: This workflow is specifically designed for on-device training. For cloud-based training, refer to the Cloud Training Workflow section.
On-Device Training with train_engine
This workflow is optimized for training on your own infrastructure.
Disable WandB Logging (Optional)
WandB logging is enabled by default. It is controlled by the report_to argument in the training arguments object passed to the trainer, and can be fully deactivated as follows:
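A minimal sketch, assuming the training arguments object is a standard Hugging Face TrainingArguments (the output directory below is illustrative):

```python
from transformers import TrainingArguments

# report_to="none" disables WandB along with every other logging
# integration; the default ("all") is what turns WandB on.
training_args = TrainingArguments(
    output_dir="./outputs",  # illustrative path
    report_to="none",        # fully deactivates WandB logging
)
```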
Dataset Configuration
Using a Hugging Face Dataset
You can provide the name of a Hugging Face (HF) dataset if you’d like to use a pre-existing dataset. Ensure that you adjust the keys, response_template, and template to match your dataset’s structure.
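For example, for a dataset whose rows contain question and answer fields, the three settings might look like the sketch below; the placeholder syntax of template is an assumption, so check it against your train_engine version:

```python
# Illustrative values for a QA-style dataset; adjust the field names
# to match your dataset's schema.
keys = ["question", "answer"]                                # columns read from each example
template = "### Question: {question}\n### Answer: {answer}"  # prompt layout (placeholder syntax assumed)
response_template = "### Answer:"                            # marks where the model's response begins
```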
Using a Local Dataset
If you don’t want to use a Hugging Face dataset, set from_hf to False and provide a local dataset. The example below shows how to manually create a dataset:
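A minimal sketch of a hand-built dataset, assuming train_engine accepts a dict of column lists matching the configured keys (the expected structure may differ in your version):

```python
# A tiny in-memory dataset with one list per column; the column names
# line up with keys = ["question", "answer"] from the configuration above.
dataset = {
    "question": [
        "What is the capital of France?",
        "Who wrote Hamlet?",
    ],
    "answer": [
        "Paris.",
        "William Shakespeare.",
    ],
}
```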
Model Selection
Choose a model for training. Large models may cause out-of-memory (OOM) errors, especially in environments with limited GPU memory. Consider using smaller models or utilizing Simplifine’s other GPU offerings.
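For instance, a compact model such as GPT-2 (~124M parameters) is a reasonable starting point on a single consumer GPU; the identifier below is illustrative, and any Hugging Face model name should work:

```python
# Illustrative choice: a small model that fits comfortably in limited GPU memory.
model_name = "gpt2"  # ~124M parameters; swap in a larger model if your hardware allows
```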
Initiating On-Device Training
Finally, use the train_engine.hf_sft function to start training on your own hardware. Its parameters let you control the training process, including whether to use mixed-precision training (fp16), distributed data parallel (ddp), and more, as in the sketch below.
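The sketch below pulls the pieces together. The exact signature of train_engine.hf_sft is not shown in this section, so the import path and the dataset_name parameter are assumptions; verify them against your installed version:

```python
from simplifine_alpha import train_engine  # import path is an assumption

# End-to-end on-device SFT run using the parameters discussed above;
# all values are illustrative.
train_engine.hf_sft(
    model_name="gpt2",                 # small model to avoid OOM errors
    from_hf=True,                      # pull the dataset from the HF Hub
    dataset_name="tatsu-lab/alpaca",   # assumed parameter name; example dataset
    keys=["instruction", "output"],    # fields read from each example
    template="### Instruction: {instruction}\n### Response: {output}",
    response_template="### Response:", # where the model's response begins
    fp16=True,                         # mixed-precision training
    ddp=False,                         # set True for multi-GPU data parallel
    gradient_accumulation_steps=4,     # lowers peak memory per step
)
```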
To reduce memory usage, consider adjusting gradient_accumulation_steps and enabling mixed precision (fp16=True), as in the sketch above.