Instructions for using MateusBarros/granite-company-agent with libraries and local apps.

Libraries

How to use MateusBarros/granite-company-agent with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="MateusBarros/granite-company-agent")

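# Load the tokenizer and model directly (a causal-LM head is needed for text generation)
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("MateusBarros/granite-company-agent")
model = AutoModelForCausalLM.from_pretrained("MateusBarros/granite-company-agent", dtype="auto")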

Local Apps

vLLM

How to use MateusBarros/granite-company-agent with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "MateusBarros/granite-company-agent"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MateusBarros/granite-company-agent",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
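
The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM or of this repository. It assumes the server is reachable at the default address shown above.

```python
import json
import urllib.request

MODEL_ID = "MateusBarros/granite-company-agent"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (without sending) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The OpenAI-compatible completions response lists results under "choices".
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` returns the generated text. Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 in the same way.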

Use Docker

docker model run hf.co/MateusBarros/granite-company-agent

SGLang

How to use MateusBarros/granite-company-agent with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "MateusBarros/granite-company-agent" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MateusBarros/granite-company-agent",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "MateusBarros/granite-company-agent" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "MateusBarros/granite-company-agent",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use MateusBarros/granite-company-agent with Docker Model Runner:
```
docker model run hf.co/MateusBarros/granite-company-agent
```

Granite Company Agent

Diagram illustrating the Granite Company Agent workflow.

Overview

The Granite Company Agent is a lightweight, fine-tuned causal language model built on the ibm-granite/granite-3.3-2b family listed in the model tree below.
It is designed to process structured company data (methodologies, courses, teachers, FAQs, testimonials) and provide accurate, context-aware responses to user queries.
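
To make the idea of answering from structured company data concrete, here is a minimal sketch of how such records could be flattened into a text prompt. The record fields, the sample values, and the helper names `render_context` and `build_prompt` are all hypothetical; the actual schema and prompt format used by the project are not documented in this README.

```python
# Hypothetical sample records; the real schema is an assumption.
COMPANY_DATA = {
    "courses": [{"name": "Intro to Data Analysis", "duration": "8 weeks"}],
    "faqs": [{"question": "Do you offer certificates?", "answer": "Yes, on completion."}],
}


def render_context(data):
    """Flatten structured records into plain-text lines, grouped by category."""
    lines = []
    for category, records in data.items():
        lines.append(f"[{category}]")
        for record in records:
            fields = "; ".join(f"{key}: {value}" for key, value in record.items())
            lines.append(f"- {fields}")
    return "\n".join(lines)


def build_prompt(data, question):
    """Combine the rendered company context with a user question."""
    return f"{render_context(data)}\n\nQuestion: {question}\nAnswer:"
```

Calling `build_prompt(COMPANY_DATA, "How long is the course?")` yields a single string that a causal language model can continue after `Answer:`.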

Key features:

LoRA-based parameter-efficient fine-tuning.
Easy dataset generation and preprocessing.
Simple inference script to chat with the model.
Built with Hugging Face Transformers, PEFT, and PyTorch.
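
As a rough illustration of why LoRA counts as parameter-efficient: instead of updating a full weight matrix of shape d_out x d_in, it trains two low-rank factors, A (rank x d_in) and B (d_out x rank). The sketch below only does the arithmetic; the layer size and rank are illustrative assumptions, not values taken from this project's training script.

```python
def full_params(d_in, d_out):
    """Number of weights in a dense d_out x d_in projection."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Weights in the low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


# ASSUMPTION: a square 2048 x 2048 projection and rank 8, chosen for illustration.
d_in = d_out = 2048
rank = 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
```

Under these assumed sizes the adapter trains well under 1% of the weights of the layer it wraps, which is what keeps memory and storage requirements low.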

Repository Structure

granite-company-agent/
│
├── README.md
├── LICENSE
├── requirements.txt
│
├── train_granite.py       # Script to fine-tune the model
├── data_loader.py         # Utility to load CSV, XLSX, or TXT files
├── generate_data.py       # Generate synthetic company data
├── download_model.py      # Download base model from Hugging Face
├── chat_agent.py          # Simple inference/chat script
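
For orientation, a loader that handles CSV, XLSX, and TXT files could dispatch on the file extension as sketched below. This is not the contents of data_loader.py; the function name `load_records`, the one-record-per-line treatment of TXT files, and the use of pandas for XLSX are assumptions made for illustration.

```python
import csv
from pathlib import Path


def load_records(path):
    """Load a data file into a list of dict records, choosing a parser by extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    if suffix == ".txt":
        # ASSUMPTION: one record per non-empty line.
        text = path.read_text(encoding="utf-8")
        return [{"text": line.strip()} for line in text.splitlines() if line.strip()]
    if suffix == ".xlsx":
        # ASSUMPTION: pandas (with openpyxl) is installed; imported lazily
        # so CSV and TXT loading work without it.
        import pandas as pd
        return pd.read_excel(path).to_dict(orient="records")
    raise ValueError(f"Unsupported file type: {suffix}")
```

Returning the same list-of-dicts shape for every format keeps downstream preprocessing independent of where the data came from.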

Setup

Clone the repository:

git clone https://huggingface.co/MateusBarros/granite-company-agent.git
cd granite-company-agent

Install dependencies:

pip install -r requirements.txt

Generate the synthetic company data:

python generate_data.py

Download the base model:

python download_model.py

Training

Train the model on your dataset: `python train_granite.py`

Training configuration highlights:

Epochs: 3
Batch size: 2 (gradient accumulation used)
Learning rate: 2e-4
LoRA fine-tuning (low-rank adapters)
BF16 precision

The trained model and tokenizer are saved in ./granite_company_agent.
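
The interaction between batch size and gradient accumulation can be worked out with a small calculation. In the sketch below, the epochs, batch size, and learning rate come from the list above, while the number of accumulation steps and the dataset size are assumptions, since this README does not state them. The class `TrainingSetup` is illustrative and not part of train_granite.py.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainingSetup:
    epochs: int = 3                 # from the configuration above
    per_device_batch_size: int = 2  # from the configuration above
    learning_rate: float = 2e-4     # from the configuration above
    grad_accum_steps: int = 4       # ASSUMPTION: not stated in this README

    @property
    def effective_batch_size(self):
        """Examples contributing to each optimizer update on one device."""
        return self.per_device_batch_size * self.grad_accum_steps

    def optimizer_steps(self, num_examples):
        """Approximate total optimizer updates over all epochs on one device."""
        steps_per_epoch = math.ceil(num_examples / self.effective_batch_size)
        return steps_per_epoch * self.epochs


setup = TrainingSetup()
# ASSUMPTION: a dataset of 100 examples, used only to make the arithmetic concrete.
print(setup.effective_batch_size, setup.optimizer_steps(num_examples=100))
```

Gradient accumulation lets a small per-device batch behave like a larger one for the optimizer, at the cost of more forward and backward passes per update.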

Inference / Chat

Use the agent interactively:

python chat_agent.py

Type your question and get responses from the fine-tuned model. Type exit or quit to end the session.
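
The session behavior described above can be sketched as a simple read-respond loop. This is not the contents of chat_agent.py; `chat_loop` and its parameters are hypothetical, and `generate` stands in for whatever function calls the fine-tuned model.

```python
def chat_loop(generate, read_input=input, write_output=print):
    """Answer questions until the user types exit or quit (or input ends)."""
    while True:
        try:
            question = read_input("You: ").strip()
        except EOFError:
            break
        if question.lower() in {"exit", "quit"}:
            break
        if not question:
            # Ignore empty lines instead of sending them to the model.
            continue
        write_output(f"Agent: {generate(question)}")
```

Passing the input and output functions as parameters keeps the loop testable without a terminal; in interactive use, `chat_loop(my_generate)` falls back to the built-in `input` and `print`.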

License

This project is released under the Apache-2.0 License.

References

Hugging Face Transformers
PEFT: Parameter-Efficient Fine-Tuning
PyTorch


Model tree for MateusBarros/granite-company-agent

Base model

ibm-granite/granite-3.3-2b-base

Finetuned

ibm-granite/granite-3.3-2b-instruct