Error while using LangChain with Hugging Face models

from langchain_core.prompts import PromptTemplate
from langchain_community.llms import HuggingFaceEndpoint
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_your_new_token_here"

prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?"
)

llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mistral-7B-Instruct-v0.3",
    temperature=0.7,
    timeout=300
)

chains = prompt | llm
print("LLM Initialized with Token!")

try:
    response = chains.invoke({"product": "camera"})
    print("AI Suggestion:", response)
except Exception as e:
    print(f"Error details: {e}")

When I run this I get a ValueError. Can anyone help me out? It's a basic prompt-template and text-generation script, but it still doesn't work. I've tried various models from Hugging Face, and none of them work well with LangChain when I chain the LLM with the prompt and invoke it.


Because the method for using Hugging Face Inference with LangChain has changed frequently in the past, information online is often confusing, making it difficult to find the correct solution through search engines or AI…


The most likely cause in your case is not the prompt template and not the chain operator. It is usually this: HuggingFaceEndpoint is making a text-generation request, while mistralai/Mistral-7B-Instruct-v0.3 is often exposed by the current Hugging Face provider routing as a chat / conversational model instead. There is a public LangChain issue with this exact model and this exact error family, and the HuggingFaceEndpoint reference explicitly says the class is for models that support the text generation task. (GitHub)

Why your code looks correct but still fails

Your code structure is valid:

prompt = PromptTemplate(...)
llm = HuggingFaceEndpoint(...)
chain = prompt | llm
response = chain.invoke(...)

That pattern is fine. The failure usually happens one layer lower, when LangChain calls Hugging Face through HuggingFaceEndpoint. In the reported Mistral case, the stack trace shows HuggingFaceEndpoint calling client.text_generation(...), and the backend rejects it because that model is mapped to conversational for that provider instead of text-generation. (GitHub)

The main background

Hugging Face now separates Text Generation and Chat Completion more clearly than many older examples did. The Text Generation docs describe prompt-based generation, while the Chat Completion docs describe message-based conversational generation. On top of that, Hugging Face Inference Providers automatically route requests to a provider, and support depends on the model + provider + task combination, not just the model name alone. (Hugging Face)

That is why this can fail even though the code looks simple. The prompt is valid. The chain is valid. But the selected backend may say: “this model is available here as chat, not as plain text generation.” (GitHub)
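The difference between the two tasks shows up in the shape of the request itself. Here is a minimal, pure-logic sketch (no network call is made; the helper simply mirrors the conversion a chat wrapper performs before sending):

```python
# Pure-logic sketch: the two tasks differ in request shape, not in the model.
# Text generation sends one raw prompt string; chat completion sends a list of
# role-tagged messages. A chat wrapper essentially performs this conversion.
def prompt_to_chat_messages(prompt: str) -> list:
    """Wrap a plain prompt in the message format chat-completion endpoints expect."""
    return [{"role": "user", "content": prompt}]

prompt = "What is a good name for a company that makes cameras?"
print(prompt)                           # payload shape for a text-generation call
print(prompt_to_chat_messages(prompt))  # payload shape for a chat-completion call
```

If the backend only accepts the second shape for a given model, sending the first one is exactly the mismatch described above.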


Causes, ranked for your exact code

1. Most likely: model-task mismatch

This is the best match for your case.

HuggingFaceEndpoint is for text-generation models. Your chosen model is an instruct/chat-style model, and there is a documented case where this exact model fails with:

ValueError: Model mistralai/Mistral-7B-Instruct-v0.3 is not supported for task text-generation ...
Supported task: conversational

That means the wrapper is asking for one kind of inference, while the provider only offers another kind for that model. (GitHub)
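One way to check this before involving LangChain at all is to ask the Hub which task it advertises for the model. A hedged diagnostic sketch, assuming huggingface_hub is installed (the lookup needs network access, so it falls back to None offline; note that provider routing can still differ from the model card's tag, so treat this as a hint, not a guarantee):

```python
# Diagnostic sketch: look up the pipeline tag the Hub reports for a repo.
from huggingface_hub import HfApi

def hub_task(repo_id: str):
    """Return the advertised pipeline tag, or None if the lookup fails (e.g. offline)."""
    try:
        return HfApi().model_info(repo_id).pipeline_tag
    except Exception:
        return None

print(hub_task("mistralai/Mistral-7B-Instruct-v0.3"))
```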

2. Also likely: you are using the older import path

Your code imports:

from langchain_community.llms import HuggingFaceEndpoint

Current LangChain docs use the dedicated Hugging Face integration package:

from langchain_huggingface import HuggingFaceEndpoint

and the current LangChain docs and chat docs both point to langchain-huggingface as the active integration path. Using the older community path can leave you on examples or behavior that do not match the current Hugging Face routing model. (7x.mintlify.app)

3. Sometimes relevant: missing explicit task="text-generation"

There is a separate LangChain issue where HuggingFaceEndpoint raised a ValueError until task="text-generation" was added explicitly. That is a real issue, but it is a different one. If your actual error says the model only supports conversational, then adding task="text-generation" alone will not solve the real mismatch. It only fixes the “missing task” variant. (GitHub)

4. Worth checking: token and provider setup

Hugging Face’s current Inference Providers docs say you should use a fine-grained token with Make calls to Inference Providers permission. They also note that provider selection is automatic by default, unless you pin one yourself. Wrong token permissions or provider availability problems can produce other failures, although those often show up as authorization or provider-selection errors rather than your exact ValueError. (Hugging Face)
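A quick way to rule the token out is to validate it directly with huggingface_hub before touching LangChain. A sketch, assuming the token is in HUGGINGFACEHUB_API_TOKEN (whoami needs network access, so the check is wrapped defensively):

```python
# Token sanity check: whoami() succeeds only if the token is accepted.
import os
from huggingface_hub import whoami

def token_is_valid() -> bool:
    """Return True if the token authenticates, False on any failure (bad token, no network)."""
    try:
        info = whoami(token=os.environ.get("HUGGINGFACEHUB_API_TOKEN"))
        return "name" in info  # whoami returns account details on success
    except Exception:
        return False

print(token_is_valid())
```

If this prints False, fix the token and its permissions before debugging the LangChain side.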


What your code is really doing

When you write this:

llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mistral-7B-Instruct-v0.3",
    temperature=0.7,
    timeout=300
)

you are saying, in effect:

“Treat this model like a plain text-generation endpoint.”

That is the key assumption. For many models that is fine. For this exact Mistral model on current provider-backed inference, that assumption often breaks. The public issue for this model shows the failure is coming from the text-generation path, not from PromptTemplate or LCEL chaining. (GitHub)


The best fixes

Fix 1. Move to the current package first

Install the current packages:

pip install -U langchain langchain-huggingface huggingface_hub

Then import from the current package:

from langchain_huggingface import HuggingFaceEndpoint

That aligns your code with the current documented integration path. LangChain’s current Hugging Face docs use langchain_huggingface, not langchain_community, for these examples. (7x.mintlify.app)


Fix 2. If you want plain prompt → text output, use a model that supports text-generation

HuggingFaceEndpoint is the right tool only when the model supports text generation on the chosen backend. The reference page says exactly that. Hugging Face’s text-generation docs also treat this as a separate task from chat completion. (LangChain)

A safer version of your code looks like this:

from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFaceEndpoint
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_your_token_here"

prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?"
)

llm = HuggingFaceEndpoint(
    repo_id="Qwen/Qwen2.5-7B-Instruct-1M",
    task="text-generation",
    provider="auto",
    max_new_tokens=64,
    temperature=0.7,
    timeout=300,
)

chain = prompt | llm
response = chain.invoke({"product": "camera"})
print("AI Suggestion:", response)

This fix does two things:

  • it uses the current package path,
  • it makes the task explicit for a model path that is documented in current Hugging Face task docs. (LangChain Documentation)

Fix 3. If you want to keep mistralai/Mistral-7B-Instruct-v0.3, treat it as a chat model

This is the more natural choice for an instruct model. LangChain’s chat docs show ChatHuggingFace instantiated from a HuggingFaceEndpoint, and Hugging Face’s own docs define chat completion as the message-based interface for conversational models. (LangChain Documentation)

Example:

from langchain_core.prompts import ChatPromptTemplate
from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_your_token_here"

prompt = ChatPromptTemplate.from_messages([
    ("human", "What is a good name for a company that makes {product}?")
])

llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mistral-7B-Instruct-v0.3",
    task="text-generation",
    provider="auto",
    max_new_tokens=64,
    temperature=0.7,
    timeout=300,
)

chat_model = ChatHuggingFace(llm=llm)

chain = prompt | chat_model
response = chain.invoke({"product": "camera"})
print("AI Suggestion:", response.content)

Important nuance: even here, HuggingFaceEndpoint still sits underneath. This approach is most useful when LangChain’s chat wrapper and the selected provider/model route agree. For provider-backed inference, a direct chat client can sometimes be even clearer. (LangChain Documentation)


Fix 4. Test the model outside LangChain first

This is the fastest way to separate:

  • “LangChain wrapper problem”
    from
  • “model/provider/task problem.”

Hugging Face’s Inference Providers docs show InferenceClient using chat completions directly for conversational models. (Hugging Face)

Example:

from huggingface_hub import InferenceClient

client = InferenceClient()

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.3",
    messages=[
        {"role": "user", "content": "What is a good name for a company that makes camera products?"}
    ],
    max_tokens=64,
)

print(response.choices[0].message.content)

If this works and your LangChain code does not, then the problem is almost certainly the wrapper choice rather than the model itself. Hugging Face documents chat.completions.create as the chat-completion path, and the source reference shows it is the intended method for message-based chat use. (Hugging Face)


Fix 5. If your real error is the “missing task” version, add task

This is the smaller fix.

llm = HuggingFaceEndpoint(
    repo_id="some-text-generation-model",
    task="text-generation",
    temperature=0.7,
    timeout=300,
)

That exact change resolved a documented issue where the docs omitted the needed task parameter. (GitHub)


A practical debug checklist

Use this order.

  1. Upgrade to the current package path:
    pip install -U langchain langchain-huggingface huggingface_hub

  2. Make sure your token is a fine-grained token with Make calls to Inference Providers permission.
    (Hugging Face)

  3. If you stay with HuggingFaceEndpoint, add task="text-generation".
    (GitHub)

  4. If the error says Supported task: conversational, stop trying to force that model through a plain text-generation flow. Switch to a chat-model path or choose a model/provider combination documented for text generation.
    (GitHub)

  5. Test the same model once with InferenceClient.chat.completions.create(...). That tells you whether the model route itself works.
    (Hugging Face)


My judgment for your exact snippet

Here is the probability ranking I would use.

  • Most likely: HuggingFaceEndpoint is calling the text-generation path, but your chosen Mistral model is being exposed by the provider as conversational instead. (GitHub)
  • Next most likely: you are mixing current Hugging Face routing with the older langchain_community integration path. (7x.mintlify.app)
  • Possible: you also need an explicit task="text-generation" because of your installed versions. (GitHub)
  • Possible but less likely for this exact error: token/provider configuration issue. (Hugging Face)

So the cleanest summary is:

Your prompt and chain are probably fine. The main problem is that you are using a text-generation wrapper with a model that the current Hugging Face backend often exposes as chat/conversational instead. (GitHub)

I tried what you are trying to make me understand here, but I still got the same errors. Actually, I got multiple errors: ValueError, ValidationError, and a StopIteration error too.
I've possibly tried doing everything. Could you please try running it in an environment and let me know if it works well?


For mistralai/Mistral-7B-Instruct-v0.3, it doesn't seem to be deployed (and is therefore unusable) in the first place, so have you tried it with models that are available?
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

I did not get what you're trying to say. It isn't deployed? Actually, I'm new to the Hugging Face Hub and these Gen AI models and APIs, but I'm trying to learn. Can you explain it in a better way, please?

Thank you.


it isn’t deployed

Yes. These days only a few(?) models are deployed. There is virtually no way to use a model via the API if it hasn't been deployed.
Also, since the number of free API calls is quite limited, it's best to think of it as being for testing purposes.

If you plan to use it heavily, you should consider using a local LLM…

Firstly, thank you so much for the info.

So I will have to check for models with the :high_voltage: Inference badge? Those are the models that can be used?


have to check for models with the :high_voltage: Inference? So those are models which can be used?

Yes. However, since the free quota is limited to a few calls a month, be careful when making API calls… :sweat_smile:

What do you mean by free quota? Is Hugging Face also paid? I just wanted to make small applications and test projects, nothing big as such. I found issues with Hugging Face, so I started using the Groq API and their Llama model. It's pretty good, and the daily credits are enough for my project.


What do you mean by free quota? Is hugging face also paid?

Yeah, that's right. Some features are free, while others require a paid subscription / PAYG. Take a look at this; when it comes to Inference, it's definitely more on the paid side.