# Hugging Face Classification Training

> Reference for the `hf_clf_train` function in Simplifine's Train Engine, which trains a pre-trained Hugging Face model for classification

```python
hf_clf_train(
    model_name:str, 
    dataset_name:str='', 
    hf_data_column:str='', 
    hf_label_column:str='',
    num_epochs:int=3, 
    batch_size:int=8, 
    lr:float=5e-5, 
    from_hf:bool=True, 
    hf_token:str='',
    inputs:list=[], 
    labels:list=[], 
    output_dir:str='clf_output',
    use_peft:bool=False, 
    peft_config=None, 
    report_to='none', 
    wandb_api_key:str='',
    ddp:bool=False, 
    zero:bool=False, 
    fp16:bool=False, 
    bf16:bool=False,
    gradient_accumulation_steps:int=1, 
    gradient_checkpointing:bool=False
):
```

<ParamField path="model_name" type="string" required="true">
  The name or path of the pre-trained model to use.
</ParamField>

<ParamField path="dataset_name" type="string">
  The name of the dataset to be used for training. *Defaults to an empty string.*
</ParamField>

<ParamField path="hf_data_column" type="string">
  The name of the column in the dataset containing the input data. *Defaults to an empty string.*
</ParamField>

<ParamField path="hf_label_column" type="string">
  The name of the column in the dataset containing the labels. *Defaults to an empty string.*
</ParamField>
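
As a rough illustration of how the dataset parameters above fit together, the sketch below collects them into keyword arguments and forwards them to `hf_clf_train`. The import path `simplifine_alpha.train_engine`, the model `distilbert-base-uncased`, and the `imdb` dataset with its `text` and `label` columns are assumptions chosen for illustration; they are not taken from this page. The helper names are hypothetical.

```python
def build_hf_dataset_kwargs(dataset_name: str, data_column: str, label_column: str) -> dict:
    """Collect the dataset-related arguments for a run that loads data from Hugging Face."""
    if not dataset_name or not data_column or not label_column:
        raise ValueError("dataset_name, data_column and label_column must all be non-empty")
    return {
        "dataset_name": dataset_name,
        "hf_data_column": data_column,
        "hf_label_column": label_column,
        "from_hf": True,
    }


def run_training(model_name: str, **kwargs):
    # Assumed import path; imported lazily so the helper above works
    # even when Simplifine is not installed.
    from simplifine_alpha.train_engine import hf_clf_train

    return hf_clf_train(model_name=model_name, **kwargs)


# Example values are assumptions, not defaults of hf_clf_train.
hf_kwargs = build_hf_dataset_kwargs("imdb", "text", "label")
# run_training("distilbert-base-uncased", **hf_kwargs)  # uncomment to start training
```

Keeping the argument construction separate from the training call makes it possible to inspect or log the arguments before committing to a long run.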

<ParamField path="num_epochs" type="int">
  The number of training epochs. *Defaults to `3`.*
</ParamField>

<ParamField path="batch_size" type="int">
  The batch size for training. *Defaults to `8`.*
</ParamField>

<ParamField path="lr" type="float">
  The learning rate for optimization. *Defaults to `5e-5`.*
</ParamField>

<ParamField path="from_hf" type="boolean">
  A flag to determine whether to load the dataset from Hugging Face. *Defaults to `True`.*
</ParamField>

<ParamField path="hf_token" type="string">
  The Hugging Face token required for accessing private datasets or models. *Defaults to an empty string.*
</ParamField>

<ParamField path="inputs" type="list">
  A list of input data for training. *Defaults to an empty list.*
</ParamField>

<ParamField path="labels" type="list">
  A list of labels for training. *Defaults to an empty list.*
</ParamField>
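
When the training data lives in memory rather than on Hugging Face, `inputs` and `labels` are presumably paired element by element. The sketch below works under that assumption; the example texts, the integer label encoding, the helper name, and the import path `simplifine_alpha.train_engine` are all illustrative and not confirmed by this page.

```python
def build_in_memory_kwargs(inputs: list, labels: list) -> dict:
    """Check that texts and labels line up, then package them as keyword arguments."""
    if len(inputs) != len(labels):
        raise ValueError(f"got {len(inputs)} inputs but {len(labels)} labels")
    if not inputs:
        raise ValueError("inputs must not be empty when from_hf is False")
    # Copy the lists so later edits by the caller do not change the training data.
    return {"inputs": list(inputs), "labels": list(labels), "from_hf": False}


texts = ["great product", "arrived broken", "works as described"]
sentiments = [1, 0, 1]  # assumed encoding: 1 = positive, 0 = negative
local_kwargs = build_in_memory_kwargs(texts, sentiments)

# from simplifine_alpha.train_engine import hf_clf_train  # assumed import path
# hf_clf_train(model_name="distilbert-base-uncased", **local_kwargs)
```

Passing fresh lists explicitly also avoids relying on the mutable `[]` defaults in the signature, which Python shares across calls.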

<ParamField path="output_dir" type="string">
  The directory to save the output model and logs. *Defaults to `'clf_output'`.*
</ParamField>

<ParamField path="use_peft" type="boolean">
  A flag to enable Parameter-Efficient Fine-Tuning (PEFT). *Defaults to `False`.*
</ParamField>

<ParamField path="peft_config" type="object">
  The configuration object for PEFT. *Defaults to `None`.*
</ParamField>
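
This page does not state which type `peft_config` expects. One plausible choice is a `LoraConfig` from the `peft` library, and the sketch below assumes exactly that. The hyperparameter values and the helper name are illustrative, not recommendations from Simplifine.

```python
def build_peft_kwargs(r: int = 8, lora_alpha: int = 16, lora_dropout: float = 0.1) -> dict:
    """Return use_peft/peft_config arguments for a LoRA run (illustrative values)."""
    # Imported lazily so this module loads even when `peft` is not installed.
    from peft import LoraConfig, TaskType

    config = LoraConfig(
        task_type=TaskType.SEQ_CLS,  # sequence classification
        r=r,                         # rank of the low-rank update matrices
        lora_alpha=lora_alpha,       # scaling factor applied to the update
        lora_dropout=lora_dropout,
    )
    return {"use_peft": True, "peft_config": config}
```

If `hf_clf_train` turns out to expect a different configuration object, only the body of this helper would need to change.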

<ParamField path="report_to" type="string">
  The service to report training logs to (e.g., `wandb`). *Defaults to `'none'`.*
</ParamField>

<ParamField path="wandb_api_key" type="string">
  The API key for Weights & Biases (WandB) logging. *Defaults to an empty string.*
</ParamField>
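
The two reporting parameters are typically set together. A minimal sketch, assuming the key is stored in the `WANDB_API_KEY` environment variable (the variable WandB itself reads) so that it never appears in source code; the helper name is hypothetical:

```python
import os


def build_reporting_kwargs(env=None) -> dict:
    """Enable WandB reporting only when an API key is available."""
    env = os.environ if env is None else env
    api_key = env.get("WANDB_API_KEY", "")
    if api_key:
        return {"report_to": "wandb", "wandb_api_key": api_key}
    # Fall back to the documented defaults: no reporting, empty key.
    return {"report_to": "none", "wandb_api_key": ""}


reporting_kwargs = build_reporting_kwargs()
```

The resulting dictionary can be unpacked into the `hf_clf_train` call alongside the other arguments.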

<ParamField path="ddp" type="boolean">
  A flag to enable Distributed Data Parallel (DDP) training. *Defaults to `False`.*
</ParamField>

<ParamField path="zero" type="boolean">
  A flag to enable ZeRO (Zero Redundancy Optimizer) for memory optimization. *Defaults to `False`.*
</ParamField>

<ParamField path="fp16" type="boolean">
  A flag to enable 16-bit floating-point (FP16) training. *Defaults to `False`.*
</ParamField>

<ParamField path="bf16" type="boolean">
  A flag to enable 16-bit Brain Floating Point (BF16) training. *Defaults to `False`.*
</ParamField>
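
`fp16` and `bf16` select different 16-bit formats, so in practice at most one of them is enabled per run. Treat that as an assumption about `hf_clf_train` rather than documented behavior. The sketch below picks a setting from the available hardware; `choose_precision` and `detect_precision` are hypothetical helpers, and `detect_precision` requires PyTorch.

```python
def choose_precision(cuda_available: bool, bf16_supported: bool) -> dict:
    """Pick at most one of fp16/bf16; keep full precision when no GPU is present."""
    if not cuda_available:
        return {"fp16": False, "bf16": False}
    if bf16_supported:
        return {"fp16": False, "bf16": True}
    return {"fp16": True, "bf16": False}


def detect_precision() -> dict:
    # Imported lazily so choose_precision stays dependency-free.
    import torch

    cuda = torch.cuda.is_available()
    return choose_precision(cuda, cuda and torch.cuda.is_bf16_supported())
```

Preferring BF16 where the hardware supports it is a common choice because it keeps the exponent range of 32-bit floats, which tends to make training less prone to overflow than FP16.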

<ParamField path="gradient_accumulation_steps" type="int">
  The number of batches over which gradients are accumulated before each optimizer update. *Defaults to `1`.*
</ParamField>
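
Gradient accumulation raises the effective batch size without raising per-step memory use. The arithmetic can be sketched as follows, assuming `batch_size` is applied per device (this page does not say so explicitly); `effective_batch_size` is a hypothetical helper, not part of Simplifine.

```python
def effective_batch_size(batch_size: int, gradient_accumulation_steps: int = 1,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer update."""
    if min(batch_size, gradient_accumulation_steps, num_devices) < 1:
        raise ValueError("all arguments must be positive integers")
    return batch_size * gradient_accumulation_steps * num_devices


# The default batch size of 8 with 4 accumulation steps on a single device:
print(effective_batch_size(8, 4))  # 32
```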

<ParamField path="gradient_checkpointing" type="boolean">
  A flag to enable gradient checkpointing, which reduces memory usage at the cost of extra computation during the backward pass. *Defaults to `False`.*
</ParamField>
