Tested in ComfyUI? It doesn't seem to work in ComfyUI right now:
got prompt
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load ErnieTEModel
0 models unloaded.
Model ErnieTEModel prepared for dynamic VRAM loading. 4434MB Staged. 0 patches attached. Force pre-loaded 53 weights: 318 KB.
!!! Exception during processing !!! mat1 and mat2 shapes cannot be multiplied (601x3072 and 1536x9216)
Traceback (most recent call last):
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\execution.py", line 534, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\execution.py", line 334, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\execution.py", line 308, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\execution.py", line 296, in process_inputs
result = f(**inputs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\nodes.py", line 80, in encode
return (clip.encode_from_tokens_scheduled(tokens), )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\sd.py", line 312, in encode_from_tokens_scheduled
pooled_dict = self.encode_from_tokens(tokens, return_pooled=return_pooled, return_dict=True)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\sd.py", line 376, in encode_from_tokens
o = self.cond_stage_model.encode_token_weights(tokens)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\sd1_clip.py", line 737, in encode_token_weights
out = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\sd1_clip.py", line 45, in encode_token_weights
o = self.encode(to_encode)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\sd1_clip.py", line 306, in encode
return self(tokens)
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\sd1_clip.py", line 279, in forward
outputs = self.transformer(None, attention_mask_model, embeds=embeds, num_tokens=num_tokens, intermediate_output=intermediate_output, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32, embeds_info=embeds_info)
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\text_encoders\llama.py", line 824, in forward
return self.model(input_ids, *args, **kwargs)
~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\text_encoders\llama.py", line 749, in forward
x, current_kv = layer(
~~~~~^
x=x,
^^^^
...<3 lines>...
past_key_value=past_kv,
^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\text_encoders\llama.py", line 581, in forward
x = self.mlp(x)
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\text_encoders\llama.py", line 548, in forward
return self.down_proj(self.activation(self.gate_proj(x)) * self.up_proj(x))
~~~~~~~~~~~~~~^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1779, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\python\Lib\site-packages\torch\nn\modules\module.py", line 1790, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\ops.py", line 392, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "D:\AI-tools\ComfyUI-aki-v3\ComfyUI\comfy\ops.py", line 385, in forward_comfy_cast_weights
x = torch.nn.functional.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (601x3072 and 1536x9216)
Prompt executed in 5.86 seconds
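For what it's worth, the failing matmul says the 601×3072 hidden states are being fed into a Linear whose weight is stored as 1536×9216. 1536 is exactly half of 3072, which is the kind of mismatch you'd see if packed 4-bit weights were loaded as if they were unpacked, or if the checkpoint comes from a differently sized variant. A quick way to see what's actually stored (a minimal sketch; the filename is a placeholder for the actual text encoder file):

```python
# Minimal sketch: dump the MLP projection shapes stored in the checkpoint,
# to compare against the 3072-wide hidden states in the error above.
# "ernie_te_quant.safetensors" is a placeholder filename.
from safetensors import safe_open

with safe_open("ernie_te_quant.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        if "mlp" in name and name.endswith(".weight"):
            print(name, f.get_slice(name).get_shape())
```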
Yeah, I have it marked as broken in the readme for now. With the other text encoders I've done there is always something off: either ComfyUI's way of loading the model, or some layer that wants to be kept at bf16.
Also, this quant only targeted the textual layers and not the vision and other multimodal parts, so there could be an issue there.
There are also a few different versions of ministral-3-3b, so I could be handling the model wrong during quantization with NVIDIA's Model Optimizer.
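For reference, the flow is roughly the modelopt pattern below. This is a sketch, not my exact script: the quant format, the wildcard patterns for skipping the non-textual parts, and `model`/`calib_batches` are all stand-ins.

```python
# Rough sketch of the modelopt quantization flow (not the exact script).
# The wildcard patterns are assumptions -- the real module names depend on
# which ministral-3-3b variant/config gets loaded.
import copy
import modelopt.torch.quantization as mtq

config = copy.deepcopy(mtq.INT4_AWQ_CFG)  # quant format is an assumption
# Only target textual layers: disable the quantizers everywhere else.
config["quant_cfg"]["*vision*"] = {"enable": False}
config["quant_cfg"]["*lm_head*"] = {"enable": False}

def forward_loop(model):
    # Run a handful of calibration prompts through the model.
    for batch in calib_batches:  # calib_batches: placeholder for calib data
        model(batch)

model = mtq.quantize(model, config, forward_loop)
```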