Qwen3.5-Thinking-AIO-GGUF

prithivMLmods/Qwen3.5-Thinking-AIO-GGUF collects experimental "abliterated" variants (0.8B, 2B, 4B, and 9B) of Alibaba's official Qwen3.5 series. The models apply refusal-minimization techniques to remove safety alignments and internal censorship patterns while aiming to preserve core reasoning, coding, math, and multimodal capabilities across scales.

Created by prithivMLmods as part of their "Unredacted MAX" collection (similar to their Qwen3-VL abliterated releases such as Qwen3-VL-8B-Instruct-Unredacted-MAX), these models undergo continual abliteration training for unrestricted instruction adherence, dense output generation, and reduced hallucinations on diverse prompts without standard safety refusals. GGUF and FP8 quantizations make them suitable for edge deployment in research, dataset creation, or applications that need uncensored reasoning.

Scaling from the ultra-light 0.8B (mobile/Raspberry Pi) to the 9B flagship, the models retain Qwen3.5's hybrid Gated DeltaNet architecture, 262K context window, 201-language vocabulary, and toggleable thinking modes, prioritizing raw capability over guardrails. Community benchmarks report gains over the vanilla counterparts on open-ended tasks, though no official evaluations exist.

prithivMLmods/Qwen3.5-Thinking-AIO-GGUF (main)
+-- README.md (13.5 KB)
+-- .gitattributes (5.1 KB)
+-- config.json (32 B)
+-- Qwen3.5-0.8B-Unredacted-MAX-Thinking-GGUF
|   +-- Qwen3.5-0.8B-Unredacted-MAX.F32.gguf (2.8 GB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.BF16.gguf (1.4 GB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.F16.gguf (1.4 GB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.Q8_0.gguf (774.2 MB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.mmproj-f32.gguf (383.7 MB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.mmproj-bf16.gguf (197.7 MB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.mmproj-f16.gguf (197.7 MB)
|   +-- Qwen3.5-0.8B-Unredacted-MAX.mmproj-q8_0.gguf (110.6 MB)
+-- Qwen3.5-2B-Unredacted-MAX-Thinking-GGUF
|   +-- Qwen3.5-2B-Unredacted-MAX.F32.gguf (7.0 GB)
|   +-- Qwen3.5-2B-Unredacted-MAX.BF16.gguf (3.5 GB)
|   +-- Qwen3.5-2B-Unredacted-MAX.F16.gguf (3.5 GB)
|   +-- Qwen3.5-2B-Unredacted-MAX.Q8_0.gguf (1.9 GB)
|   +-- Qwen3.5-2B-Unredacted-MAX.mmproj-f32.gguf (1.2 GB)
|   +-- Qwen3.5-2B-Unredacted-MAX.mmproj-bf16.gguf (640.3 MB)
|   +-- Qwen3.5-2B-Unredacted-MAX.mmproj-f16.gguf (640.3 MB)
|   +-- Qwen3.5-2B-Unredacted-MAX.mmproj-q8_0.gguf (347.8 MB)
+-- Qwen3.5-4B-Unredacted-MAX-Thinking-GGUF
|   +-- Qwen3.5-4B-Unredacted-MAX.F32.gguf (15.7 GB)
|   +-- Qwen3.5-4B-Unredacted-MAX.BF16.gguf (7.8 GB)
|   +-- Qwen3.5-4B-Unredacted-MAX.F16.gguf (7.8 GB)
|   +-- Qwen3.5-4B-Unredacted-MAX.Q8_0.gguf (4.2 GB)
|   +-- Qwen3.5-4B-Unredacted-MAX.mmproj-f32.gguf (1.2 GB)
|   +-- Qwen3.5-4B-Unredacted-MAX.mmproj-bf16.gguf (644.3 MB)
|   +-- Qwen3.5-4B-Unredacted-MAX.mmproj-f16.gguf (644.3 MB)
|   +-- Qwen3.5-4B-Unredacted-MAX.mmproj-q8_0.gguf (349.9 MB)
+-- Qwen3.5-9B-Unredacted-MAX-Thinking-GGUF
    +-- Qwen3.5-9B-Unredacted-MAX.F32.gguf (33.4 GB)
    +-- Qwen3.5-9B-Unredacted-MAX.BF16.gguf (16.7 GB)
    +-- Qwen3.5-9B-Unredacted-MAX.F16.gguf (16.7 GB)
    +-- Qwen3.5-9B-Unredacted-MAX.Q8_0.gguf (8.9 GB)
    +-- Qwen3.5-9B-Unredacted-MAX.mmproj-f32.gguf (1.7 GB)
    +-- Qwen3.5-9B-Unredacted-MAX.mmproj-bf16.gguf (879.0 MB)
    +-- Qwen3.5-9B-Unredacted-MAX.mmproj-f16.gguf (879.0 MB)
    +-- Qwen3.5-9B-Unredacted-MAX.mmproj-q8_0.gguf (595.3 MB)

Model List

Qwen3.5-0.8B-Unredacted-MAX-Thinking

| File Name | Quant Type | File Size | File Link |
| --- | --- | --- | --- |
| Qwen3.5-0.8B-Unredacted-MAX.BF16.gguf | BF16 | 1.52 GB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.F16.gguf | F16 | 1.52 GB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.F32.gguf | F32 | 3.02 GB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.Q8_0.gguf | Q8_0 | 812 MB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.mmproj-bf16.gguf | mmproj-bf16 | 207 MB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.mmproj-f16.gguf | mmproj-f16 | 207 MB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.mmproj-f32.gguf | mmproj-f32 | 402 MB | Download |
| Qwen3.5-0.8B-Unredacted-MAX.mmproj-q8_0.gguf | mmproj-q8_0 | 116 MB | Download |

Qwen3.5-2B-Unredacted-MAX-Thinking

| File Name | Quant Type | File Size | File Link |
| --- | --- | --- | --- |
| Qwen3.5-2B-Unredacted-MAX.BF16.gguf | BF16 | 3.78 GB | Download |
| Qwen3.5-2B-Unredacted-MAX.F16.gguf | F16 | 3.78 GB | Download |
| Qwen3.5-2B-Unredacted-MAX.F32.gguf | F32 | 7.54 GB | Download |
| Qwen3.5-2B-Unredacted-MAX.Q8_0.gguf | Q8_0 | 2.01 GB | Download |
| Qwen3.5-2B-Unredacted-MAX.mmproj-bf16.gguf | mmproj-bf16 | 671 MB | Download |
| Qwen3.5-2B-Unredacted-MAX.mmproj-f16.gguf | mmproj-f16 | 671 MB | Download |
| Qwen3.5-2B-Unredacted-MAX.mmproj-f32.gguf | mmproj-f32 | 1.33 GB | Download |
| Qwen3.5-2B-Unredacted-MAX.mmproj-q8_0.gguf | mmproj-q8_0 | 365 MB | Download |

Qwen3.5-4B-Unredacted-MAX-Thinking

| File Name | Quant Type | File Size | File Link |
| --- | --- | --- | --- |
| Qwen3.5-4B-Unredacted-MAX.BF16.gguf | BF16 | 8.42 GB | Download |
| Qwen3.5-4B-Unredacted-MAX.F16.gguf | F16 | 8.42 GB | Download |
| Qwen3.5-4B-Unredacted-MAX.F32.gguf | F32 | 16.8 GB | Download |
| Qwen3.5-4B-Unredacted-MAX.Q8_0.gguf | Q8_0 | 4.48 GB | Download |
| Qwen3.5-4B-Unredacted-MAX.mmproj-bf16.gguf | mmproj-bf16 | 676 MB | Download |
| Qwen3.5-4B-Unredacted-MAX.mmproj-f16.gguf | mmproj-f16 | 676 MB | Download |
| Qwen3.5-4B-Unredacted-MAX.mmproj-f32.gguf | mmproj-f32 | 1.33 GB | Download |
| Qwen3.5-4B-Unredacted-MAX.mmproj-q8_0.gguf | mmproj-q8_0 | 367 MB | Download |

Qwen3.5-9B-Unredacted-MAX-Thinking

| File Name | Quant Type | File Size | File Link |
| --- | --- | --- | --- |
| Qwen3.5-9B-Unredacted-MAX.BF16.gguf | BF16 | 17.9 GB | Download |
| Qwen3.5-9B-Unredacted-MAX.F16.gguf | F16 | 17.9 GB | Download |
| Qwen3.5-9B-Unredacted-MAX.F32.gguf | F32 | 35.8 GB | Download |
| Qwen3.5-9B-Unredacted-MAX.Q8_0.gguf | Q8_0 | 9.53 GB | Download |
| Qwen3.5-9B-Unredacted-MAX.mmproj-bf16.gguf | mmproj-bf16 | 922 MB | Download |
| Qwen3.5-9B-Unredacted-MAX.mmproj-f16.gguf | mmproj-f16 | 922 MB | Download |
| Qwen3.5-9B-Unredacted-MAX.mmproj-f32.gguf | mmproj-f32 | 1.82 GB | Download |
| Qwen3.5-9B-Unredacted-MAX.mmproj-q8_0.gguf | mmproj-q8_0 | 624 MB | Download |

Quants Usage

(Sorted by size, not necessarily quality. IQ-quants are often preferable over similar-sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[image: ikawrakow's comparison graph of lower-quality quant types]
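The size gaps between the quant types follow directly from their bits per weight: F32 stores 4 bytes per parameter and F16/BF16 store 2, while GGUF's Q8_0 packs weights into blocks of 32 int8 values plus one fp16 scale, i.e. 34 bytes per 32 weights or 8.5 bits per weight. A quick sketch of the arithmetic, assuming roughly 0.75e9 parameters for the 0.8B model (an estimate; the exact count is not stated in the card):

```python
# Approximate bits per weight for the quant types in this repo.
BITS_PER_WEIGHT = {
    "F32": 32.0,
    "F16": 16.0,
    "BF16": 16.0,
    # GGUF Q8_0: blocks of 32 int8 weights + one fp16 scale
    # = (32 * 8 + 16) bits / 32 weights = 8.5 bits per weight.
    "Q8_0": (32 * 8 + 16) / 32,
}

def approx_size_gb(n_params: float, quant: str) -> float:
    """Rough file size for the weight tensors alone (ignores GGUF metadata
    and any tensors kept at higher precision)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

for q in ("F32", "F16", "Q8_0"):
    print(q, round(approx_size_gb(0.75e9, q), 2), "GB")
```

These estimates (about 3.0, 1.5, and 0.8 GB) track the listed 3.02, 1.52, and 0.812 GB files for the 0.8B model.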

Chat Template

{%- set image_count = namespace(value=0) %}
{%- set video_count = namespace(value=0) %}
{%- macro render_content(content, do_vision_count) %}
    {%- if content is string %}
        {{- content }}
    {%- else %}
        {%- for item in content %}
            {%- if 'image' in item or 'image_url' in item or item.type == 'image' %}
                {%- if do_vision_count %}
                    {%- set image_count.value = image_count.value + 1 %}
                {%- endif %}
                {%- if add_vision_id %}Picture {{ image_count.value }}: {% endif -%}
                <|vision_start|><|image_pad|><|vision_end|>
            {%- elif 'video' in item or item.type == 'video' %}
                {%- if do_vision_count %}
                    {%- set video_count.value = video_count.value + 1 %}
                {%- endif %}
                {%- if add_vision_id %}Video {{ video_count.value }}: {% endif -%}
                <|vision_start|><|video_pad|><|vision_end|>
            {%- elif 'text' in item %}
                {{- item.text }}
            {%- endif %}
        {%- endfor %}
    {%- endif %}
{%- endmacro %}
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- render_content(messages[0].content, false) + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + render_content(messages[0].content, false) + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" %}
        {%- set content = render_content(message.content, false) %}
        {%- if not(content.startswith('<tool_response>') and content.endswith('</tool_response>')) %}
            {%- set ns.multi_step_tool = false %}
            {%- set ns.last_query_index = index %}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- set content = render_content(message.content, True) %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- set reasoning_content = '' %}
        {%- if message.reasoning_content is string %}
            {%- set reasoning_content = message.reasoning_content %}
        {%- else %}
            {%- if '</think>' in content %}
                {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
                {%- set content = content.split('</think>')[-1].lstrip('\n') %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {%- if loop.last or (not loop.last and reasoning_content) %}
                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
            {%- else %}
                {{- '<|im_start|>' + message.role + '\n' + content }}
            {%- endif %}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if (loop.first and content) or (not loop.first) %}
                    {{- '\n' }}
                {%- endif %}
                {%- if tool_call.function %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n{"name": "' }}
                {{- tool_call.name }}
                {{- '", "arguments": ' }}
                {%- if tool_call.arguments is string %}
                    {{- tool_call.arguments }}
                {%- else %}
                    {{- tool_call.arguments | tojson }}
                {%- endif %}
                {{- '}\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
    {%- if enable_thinking is defined and enable_thinking is false %}
        {{- '<think>\n\n</think>\n\n' }}
    {%- else %}
        {{- '<think>\n' }}
    {%- endif %}
{%- endif %}
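For a plain system + user exchange with no tools, the template above reduces to the familiar ChatML layout, with an open `<think>` tag appended when thinking is enabled and an empty `<think></think>` block when it is disabled. A hand-rolled sketch of that simple path (the full Jinja template additionally handles tools, vision tokens, and reasoning-content stripping, which this deliberately omits):

```python
def format_simple_prompt(messages: list[dict], enable_thinking: bool = True) -> str:
    """Mirror the chat template's output for the simple case:
    optional system message plus user/assistant turns, no tools."""
    out = []
    for m in messages:
        out.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Generation prompt: an open <think> tag lets the model reason first;
    # an empty <think></think> block disables thinking.
    out.append("<|im_start|>assistant\n")
    out.append("<think>\n" if enable_thinking else "<think>\n\n</think>\n\n")
    return "".join(out)

prompt = format_simple_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

In practice llama.cpp applies the embedded template automatically; this is only meant to show what the rendered prompt looks like.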
Downloads last month: 2,456
Format: GGUF
Model size: 0.8B params
Architecture: qwen35
