title: Business Category Description Generator
emoji: π’
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
pinned: false
Business Category Description Generator
A Hugging Face Gradio application that generates CLIP-ready visual descriptions for business category keywords from CSV files.
Features
- Upload Multiple CSV Files: Process one or more CSV files at once
- Batch Processing: Automatically processes all unique categories from your files
- AI-Powered: Uses OpenAI's GPT-OSS-20B model for high-quality descriptions
- Automatic Retry Logic: 3 attempts per category with intelligent error recovery
- Validation: JSON validation and quality checks for every description
- Progress Tracking: Real-time progress updates with success/failure reporting
- Automatic Saving: Output files with Status column showing results
- Easy Download: Download all processed files directly from the interface
- Zero GPU Support: Use Zero GPU for faster, free GPU acceleration
How to Use
1. Deploy to Hugging Face Spaces
- Go to Hugging Face Spaces
- Click "Create new Space"
- Choose "Gradio" as the SDK
- Upload app.py, requirements.txt, and README.md
- Add Your HF Token as a Secret (Required):
- Go to your Space's Settings (gear icon)
- Find the "Repository secrets" or "Secrets" section
- Click "Add a secret" or "New secret"
- Enter:
- Name: HF_TOKEN
- Value: Your Hugging Face token (get it from https://huggingface.co/settings/tokens)
- Click "Save"
- Optional: Enable Zero GPU for Faster Processing:
- Zero GPU provides free GPU acceleration
- No Pro subscription required
- Space will automatically use GPU when available
- Significantly speeds up processing for large batches
- Your app will be deployed and restart automatically!
2. Prepare Your CSV Files
Your CSV files should contain a column with business category keywords. For example:
category,other_column
Car Rental For Self Driven,additional_data
Mehandi,additional_data
Photographer,additional_data
Equipment,additional_data
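As a rough illustration of how the category column might be read and deduplicated, here is a minimal sketch using only the Python standard library. The function name `load_unique_categories` is hypothetical and not taken from app.py; the actual application may read files differently (for example with pandas).

```python
import csv
import io


def load_unique_categories(csv_text, column="category"):
    """Return the unique, non-empty values of `column`, in first-seen order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        # Column matching is exact and case-sensitive.
        raise KeyError(f"Column '{column}' not found. Available columns: {fieldnames}")
    seen = {}
    for row in reader:
        value = (row[column] or "").strip()
        if value:
            seen.setdefault(value, None)  # dicts preserve insertion order
    return list(seen)


sample = """category,other_column
Car Rental For Self Driven,additional_data
Mehandi,additional_data
Photographer,additional_data
Mehandi,additional_data
"""
print(load_unique_categories(sample))
```

Raising an error that lists the available columns mirrors the "Column not found" behaviour described under Troubleshooting.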
3. Use the Application
- Upload Files: Upload one or more CSV files
- Specify Column: Enter the name of the column containing categories (default: "category")
- Adjust Settings (optional):
- Max Tokens: 64-512 (default: 256)
- Temperature: 0.1-1.0 (default: 0.7)
- Top-p: 0.1-1.0 (default: 0.9)
- Process: Click "Process Files" and wait for completion
- Download: Download the output CSV files with descriptions
Note: Authentication is handled automatically via the HF_TOKEN secret you configured in Space settings.
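Spaces expose secrets to the running app as environment variables, so the token lookup could look something like the sketch below. The helper name `get_hf_token` and the error wording are assumptions for illustration, not the code in app.py.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN value, or raise a readable error if it is missing."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN not found. Add it as a Secret in your Space settings."
        )
    return token
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment, for example `get_hf_token({"HF_TOKEN": "your_hf_token_here"})`.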
Output Format
Each output CSV file contains:
| Column | Description |
|---|---|
| Category | The original category keyword |
| Description | The generated CLIP-ready visual description (validated) |
| Raw_Response | The complete model response (for debugging) |
| Status | "Success" or "Failed" with error details |
Example Output
Category,Description,Raw_Response,Status
Car Rental For Self Driven,"a car available for self-drive rental, parked at a pickup spot without a chauffeur; looks travel-ready, clean, well-maintained, keys handed over to customer","{""Category"": ""Car Rental For Self Driven"", ""Description"": ""...""}",Success
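The doubled quotes in the Raw_Response field are standard CSV escaping for a JSON string. A minimal sketch of writing rows in this layout with the standard library follows; the helper `rows_to_csv` and the sample row are invented for illustration and are not real output from the app.

```python
import csv
import io

OUTPUT_COLUMNS = ["Category", "Description", "Raw_Response", "Status"]


def rows_to_csv(rows):
    """Serialize result rows to CSV text using the four documented columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=OUTPUT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example row; the csv module escapes the embedded JSON quotes.
rows = [{
    "Category": "Photographer",
    "Description": "a photographer holding a camera",
    "Raw_Response": '{"Category": "Photographer", "Description": "a photographer holding a camera"}',
    "Status": "Success",
}]
print(rows_to_csv(rows))
```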
Model Settings
- Max Tokens: Controls the maximum length of generated descriptions (default: 256)
- Temperature: Controls output consistency (default: 0.3)
- 0.2-0.4: Consistent, focused descriptions (recommended)
- 0.5-0.7: Balanced creativity and consistency
- 0.8-1.0: More creative variations
- Top-p: Nucleus sampling parameter, controls diversity (default: 0.9)
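One way to keep these settings within the ranges offered by the interface is to clamp them before they reach the model. The sketch below is an assumption about how that could be done; `GenerationSettings` and `clamp` are hypothetical names, and the defaults follow the values listed in this section.

```python
from dataclasses import dataclass


def clamp(value, low, high):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class GenerationSettings:
    max_tokens: int = 256
    temperature: float = 0.3
    top_p: float = 0.9

    def as_kwargs(self):
        """Return generation parameters clamped to the documented slider ranges."""
        return {
            "max_tokens": int(clamp(self.max_tokens, 64, 512)),
            "temperature": clamp(self.temperature, 0.1, 1.0),
            "top_p": clamp(self.top_p, 0.1, 1.0),
        }


print(GenerationSettings(max_tokens=1000, temperature=0.0).as_kwargs())
```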
Technical Details
- Model: openai/gpt-oss-20b
- Framework: Gradio (latest stable version)
- Retry Logic: 3 attempts per category with 1-second delay between retries
- Validation: JSON parsing, structure validation, and minimum length checks
- Processing: Categories are deduplicated automatically
- Rate Limiting: 0.5-second delay between categories to avoid API throttling
- Output Files: Named as output_{original_name}_{timestamp}.csv
- Zero GPU Support: Free GPU acceleration available for Spaces
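The retry and validation behaviour listed above can be sketched roughly as follows. This is not the code from app.py: the function names, the `min_length` threshold of 20 characters, and the exact status strings are assumptions, and `generate` stands in for whatever function calls the model.

```python
import json
import time


def validate_response(raw, min_length=20):
    """Parse the raw model output and return the description, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON: {exc}") from exc
    if not isinstance(data, dict) or "Description" not in data:
        raise ValueError("Missing 'Description' key")
    description = str(data["Description"]).strip()
    if len(description) < min_length:  # assumed threshold
        raise ValueError("Description too short")
    return description


def generate_with_retry(category, generate, attempts=3, delay=1.0, sleep=time.sleep):
    """Call `generate(category)` up to `attempts` times.

    Returns a (description, raw_response, status) tuple.
    """
    raw, error = "", "no attempts made"
    for attempt in range(1, attempts + 1):
        try:
            raw = generate(category)
            return validate_response(raw), raw, "Success"
        except Exception as exc:  # model errors and validation errors are both retried
            error = str(exc)
            if attempt < attempts:
                sleep(delay)  # 1-second pause between retries
    return "", raw, f"Failed: {error}"
```

Accepting `sleep` as a parameter lets the delay be replaced with a no-op when experimenting, so the retry path can be exercised without waiting.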
Troubleshooting
"HF_TOKEN not found" error
- Make sure you've added HF_TOKEN as a Secret in your Space settings
- Go to Space Settings → Secrets → Add a secret
- Name must be exactly: HF_TOKEN (case-sensitive)
- Value: your token from https://huggingface.co/settings/tokens
- Restart your Space after adding the secret (or it will restart automatically)
"Column not found" error
- Check that the column name matches exactly (case-sensitive)
- View the error message to see available columns
Authentication errors
- Ensure your HF token has proper permissions (Read access minimum)
- Check that your account has access to the Inference API
- Verify the token hasn't expired
- Make sure you're using a valid token from https://huggingface.co/settings/tokens
Inconsistent or incomplete output
- Lower the Temperature to 0.2-0.4 for more consistent results
- Check the Status column in output CSV to identify failed categories
- Failed categories can be extracted and reprocessed separately
- Zero GPU may provide more reliable processing when it is available to your Space
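To reprocess only the failures, the Status column of an output file can be used to collect the categories that did not succeed. The sketch below assumes that successful rows carry exactly the status "Success"; the helper name `failed_categories` is hypothetical.

```python
import csv
import io


def failed_categories(output_csv_text):
    """Return the Category values of rows whose Status is not 'Success'."""
    reader = csv.DictReader(io.StringIO(output_csv_text))
    return [
        row["Category"]
        for row in reader
        if (row.get("Status") or "").strip() != "Success"
    ]


# Made-up output for illustration only.
example_output = """Category,Description,Raw_Response,Status
Photographer,a photographer holding a camera,{},Success
Mehandi,,,Failed: Invalid JSON
"""
print(failed_categories(example_output))
```

The resulting list can be written to a new single-column CSV and uploaded again.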
Slow processing
- The model processes each unique category individually (includes retries)
- Large files with many unique categories will take longer
- Consider splitting very large files into smaller batches
- Zero GPU acceleration is automatically available for your Space
- Each category has a 0.5s delay to prevent rate limiting
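Splitting a long list of categories into smaller batches, and estimating the time spent on the rate-limiting delay alone, might look like the sketch below. Both helper names are hypothetical, and the estimate is only a lower bound because it ignores model latency and retries.

```python
def split_into_batches(categories, batch_size):
    """Split a list of categories into consecutive batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [categories[i:i + batch_size] for i in range(0, len(categories), batch_size)]


def minimum_delay_seconds(unique_categories, delay_per_category=0.5):
    """Lower bound on run time from the inter-category delay alone."""
    return unique_categories * delay_per_category


batches = split_into_batches(["a", "b", "c", "d", "e"], batch_size=2)
print(batches)
print(minimum_delay_seconds(1000))
```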
Local Development
To run locally:
# Install dependencies
pip install -r requirements.txt
# Set your Hugging Face token as an environment variable
# Windows (PowerShell):
$env:HF_TOKEN="your_hf_token_here"
# Linux/Mac:
export HF_TOKEN="your_hf_token_here"
# Run the app
python app.py
Get your token from: https://huggingface.co/settings/tokens
License
This project uses the GPT-OSS-20B model via the Hugging Face Inference API.