Spaces: piyushdev/gpt-oss

piyushdev committed on Nov 4

Commit 7ba47b9 (verified) · 1 parent: 20be5b8

Update README.md

Files changed (1)

README.md +35 -11

@@ -17,9 +17,12 @@ A Hugging Face Gradio application that generates CLIP-ready visual descriptions
 - 📤 **Upload Multiple CSV Files**: Process one or more CSV files at once
 - 🔄 **Batch Processing**: Automatically processes all unique categories from your files
 - 🤖 **AI-Powered**: Uses OpenAI's GPT-OSS-20B model for high-quality descriptions
-- 📊 **Progress Tracking**: Real-time progress updates during processing
-- 💾 **Automatic Saving**: Output files are automatically generated with timestamps
 - 📥 **Easy Download**: Download all processed files directly from the interface
 ## How to Use
@@ -37,7 +40,12 @@ A Hugging Face Gradio application that generates CLIP-ready visual descriptions
      - **Name**: `HF_TOKEN`
      - **Value**: Your Hugging Face token (get from https://huggingface.co/settings/tokens)
    - Click "Save"
-6. Your app will be deployed and restart automatically!
 ### 2. Prepare Your CSV Files
@@ -71,28 +79,36 @@ Each output CSV file contains:
 | Column | Description |
 |--------|-------------|
 | `Category` | The original category keyword |
-| `Description` | The generated CLIP-ready visual description |
-| `Raw_Response` | The complete model response (JSON format) |
 ## Example Output
 ```csv
-Category,Description,Raw_Response
-Car Rental For Self Driven,"a car available for self-drive rental, parked at a pickup spot without a chauffeur; looks travel-ready, clean, well-maintained, keys handed over to customer","{""Category"": ""Car Rental For Self Driven"", ""Description"": ""...""}"
 ```
 ## Model Settings
-- **Max Tokens**: Controls the maximum length of generated descriptions
-- **Temperature**: Higher values (0.8-1.0) make output more creative, lower values (0.3-0.5) make it more focused
-- **Top-p**: Nucleus sampling parameter, controls diversity
 ## Technical Details
 - **Model**: openai/gpt-oss-20b
 - **Framework**: Gradio (latest stable version)
 - **Processing**: Categories are deduplicated automatically
 - **Output Files**: Named as `output_{original_name}_{timestamp}.csv`
 ## Troubleshooting
@@ -113,10 +129,18 @@ Car Rental For Self Driven,"a car available for self-drive rental, parked at a p
 - Verify the token hasn't expired
 - Make sure you're using a valid token from https://huggingface.co/settings/tokens
 ### Slow processing
-- The model processes each unique category individually
 - Large files with many unique categories will take longer
 - Consider splitting very large files into smaller batches
 ## Local Development

 - 📤 **Upload Multiple CSV Files**: Process one or more CSV files at once
 - 🔄 **Batch Processing**: Automatically processes all unique categories from your files
 - 🤖 **AI-Powered**: Uses OpenAI's GPT-OSS-20B model for high-quality descriptions
+- 🔁 **Automatic Retry Logic**: 3 attempts per category with intelligent error recovery
+- ✅ **Validation**: JSON validation and quality checks for every description
+- 📊 **Progress Tracking**: Real-time progress updates with success/failure reporting
+- 💾 **Automatic Saving**: Output files include a Status column showing the result for each category
 - 📥 **Easy Download**: Download all processed files directly from the interface
+- ⚡ **Zero GPU Support**: Use Zero GPU for faster, free GPU acceleration
 ## How to Use
      - **Name**: `HF_TOKEN`
      - **Value**: Your Hugging Face token (get from https://huggingface.co/settings/tokens)
    - Click "Save"
+6. **Optional: Enable Zero GPU for Faster Processing**:
+   - Zero GPU provides free GPU acceleration
+   - No Pro subscription required
+   - Space will automatically use GPU when available
+   - Significantly speeds up processing for large batches
+7. Your app will be deployed and restart automatically!
 ### 2. Prepare Your CSV Files
 | Column | Description |
 |--------|-------------|
 | `Category` | The original category keyword |
+| `Description` | The generated CLIP-ready visual description (validated) |
+| `Raw_Response` | The complete model response (for debugging) |
+| `Status` | "Success" or "Failed" with error details |
 ## Example Output
 ```csv
+Category,Description,Raw_Response,Status
+Car Rental For Self Driven,"a car available for self-drive rental, parked at a pickup spot without a chauffeur; looks travel-ready, clean, well-maintained, keys handed over to customer","{""Category"": ""Car Rental For Self Driven"", ""Description"": ""...""}",Success
 ```
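
For illustration only: the doubled quotes in the `Raw_Response` field above are standard CSV escaping of a JSON string. The following is a minimal sketch, assuming the file is written with Python's `csv` module. The helper name `rows_to_csv_text` and the `FIELDS` constant are hypothetical and are not taken from the app's source.

```python
import csv
import io
import json

# Column order as shown in the example output above.
FIELDS = ["Category", "Description", "Raw_Response", "Status"]


def rows_to_csv_text(rows):
    """Serialize row dicts to CSV text (hypothetical helper, not the app's code)."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    # Fields containing commas or quotes are quoted automatically,
    # and embedded quotes are doubled ("" inside a quoted field).
    writer.writerows(rows)
    return buffer.getvalue()


example_row = {
    "Category": "X",
    "Description": "a car, parked",
    "Raw_Response": json.dumps({"Category": "X", "Description": "a car, parked"}),
    "Status": "Success",
}
print(rows_to_csv_text([example_row]))
```

Reading the file back with `csv.DictReader` reverses the escaping, so `Raw_Response` can be passed to `json.loads` unchanged.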
 ## Model Settings
+- **Max Tokens**: Controls the maximum length of generated descriptions (default: 256)
+- **Temperature**: Controls output consistency (default: 0.3)
+  - 0.2-0.4: Consistent, focused descriptions (recommended)
+  - 0.5-0.7: Balanced creativity and consistency
+  - 0.8-1.0: More creative variations
+- **Top-p**: Nucleus sampling parameter, controls diversity (default: 0.9)
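
The defaults listed above could be grouped as follows. This is a sketch under assumptions: the class name `GenerationSettings`, its validation bounds, and the `is_recommended` helper are hypothetical. Only the three default values and the 0.2-0.4 recommended band come from the README.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the three settings described above."""

    max_tokens: int = 256      # README default
    temperature: float = 0.3   # README default
    top_p: float = 0.9         # README default

    def __post_init__(self):
        # The bounds below are assumptions chosen for this sketch.
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        if not 0.0 <= self.temperature <= 1.0:
            raise ValueError("temperature is outside the 0-1 range discussed here")
        if not 0.0 < self.top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1]")

    def is_recommended(self):
        """True when temperature falls in the README's recommended 0.2-0.4 band."""
        return 0.2 <= self.temperature <= 0.4


settings = GenerationSettings()
print(asdict(settings), settings.is_recommended())
```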
 ## Technical Details
 - **Model**: openai/gpt-oss-20b
 - **Framework**: Gradio (latest stable version)
+- **Retry Logic**: 3 attempts per category with 1-second delay between retries
+- **Validation**: JSON parsing, structure validation, and minimum length checks
 - **Processing**: Categories are deduplicated automatically
+- **Rate Limiting**: 0.5-second delay between categories to avoid API throttling
 - **Output Files**: Named as `output_{original_name}_{timestamp}.csv`
+- **Zero GPU Support**: Free GPU acceleration available for Spaces
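
The retry, validation, deduplication, and rate-limiting behavior listed above can be sketched roughly as below. This is not the app's actual code. The function names, the `generate` callable (a stand-in for the model call), the 20-character minimum length, the order-preserving deduplication, and the `Failed: <error>` status format are all assumptions made for illustration. The attempt count and the two delays come from the README.

```python
import json
import time

MIN_DESCRIPTION_LENGTH = 20  # assumed threshold; the README does not state one


def validate_response(raw, min_length=MIN_DESCRIPTION_LENGTH):
    """Parse raw as JSON and return its Description, or raise ValueError."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict) or "Description" not in data:
        raise ValueError("missing 'Description' field")
    description = str(data["Description"]).strip()
    if len(description) < min_length:
        raise ValueError("description too short")
    return description


def describe_with_retry(category, generate, attempts=3, retry_delay=1.0):
    """Call generate(category) up to `attempts` times and return one output row."""
    raw, error = "", ""
    for attempt in range(1, attempts + 1):
        try:
            raw = generate(category)
            description = validate_response(raw)
            return {"Category": category, "Description": description,
                    "Raw_Response": raw, "Status": "Success"}
        except Exception as exc:  # assumption: every error is retryable
            error = str(exc)
            if attempt < attempts:
                time.sleep(retry_delay)
    return {"Category": category, "Description": "",
            "Raw_Response": raw, "Status": f"Failed: {error}"}


def process_categories(categories, generate, category_delay=0.5, retry_delay=1.0):
    """Deduplicate categories (keeping first-seen order) and process each in turn."""
    unique = list(dict.fromkeys(c.strip() for c in categories if c.strip()))
    rows = []
    for index, category in enumerate(unique):
        if index:
            time.sleep(category_delay)  # pause between categories, not before the first
        rows.append(describe_with_retry(category, generate, retry_delay=retry_delay))
    return rows
```

In a worst-case run where every attempt fails, each category spends about 2 seconds waiting between its three attempts, on top of the 0.5-second pause between categories and the model's own response time.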
 ## Troubleshooting
 - Verify the token hasn't expired
 - Make sure you're using a valid token from https://huggingface.co/settings/tokens
+### Inconsistent or incomplete output
+- Lower the Temperature to 0.2-0.4 for more consistent results
+- Check the Status column in output CSV to identify failed categories
+- Failed categories can be extracted and reprocessed separately
+- Zero GPU will provide more reliable processing with better resources
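
One way to extract failed categories for reprocessing is sketched below. It assumes the output CSV has the `Category` and `Status` columns described in this README, and that the app accepts an input CSV with a single `Category` column. The input header is an assumption, and the function name `extract_failed_categories` is hypothetical.

```python
import csv


def extract_failed_categories(output_csv_path, retry_csv_path):
    """Write every category whose Status is not 'Success' to a new CSV.

    Returns the list of failed categories in file order.
    """
    with open(output_csv_path, newline="", encoding="utf-8") as src:
        failed = [
            row["Category"]
            for row in csv.DictReader(src)
            if (row.get("Status") or "").strip() != "Success"
        ]
    with open(retry_csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["Category"])  # assumed input header
        writer.writerows([category] for category in failed)
    return failed
```

The resulting file can then be uploaded on its own, so only the failed categories are sent to the model again.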
 ### Slow processing
+- The model processes each unique category individually, including any retries
 - Large files with many unique categories will take longer
 - Consider splitting very large files into smaller batches
+- Zero GPU acceleration is automatically available for your Space
+- A 0.5-second delay is added between categories to avoid rate limiting
 ## Local Development