Earnings Intelligence Copilot: Fine-tuned Mistral-7B
A QLoRA fine-tuned version of Mistral-7B-Instruct-v0.3 for structured KPI extraction from SEC filings.
What it does
- Extracts financial KPIs (Revenue, Gross Margin, Operating Income, EPS, Free Cash Flow) from SEC filing chunks as structured JSON
- Returns {"confidence": "UNVERIFIABLE"} instead of hallucinating when data is not present in the text
- Always includes a source_quote field grounding the answer in the original text
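The contract described above can be checked mechanically before a result is trusted downstream. The sketch below is a minimal illustration and not part of the released adapter: `is_grounded` is a hypothetical helper name, and it assumes the model output has already been parsed into a Python dict using the field names shown in the Example Output.

```python
def is_grounded(result: dict, chunk: str) -> bool:
    """Return True if a result respects the grounding contract for `chunk`."""
    # An explicit refusal is always acceptable: the model declined to answer.
    if result.get("confidence") == "UNVERIFIABLE":
        return True
    quote = result.get("source_quote")
    # Any other answer must carry a non-empty quote found verbatim in the chunk.
    return isinstance(quote, str) and bool(quote.strip()) and quote in chunk


chunk = (
    "Net sales for Q1 FY2024 were $119.6 billion, "
    "an increase of 2% compared to Q1 FY2023."
)
answer = {
    "metric": "Revenue",
    "value": 119.6,
    "source_quote": "Net sales for Q1 FY2024 were $119.6 billion",
    "confidence": "HIGH",
}

print(is_grounded(answer, chunk))                          # quote is in the chunk
print(is_grounded({"confidence": "UNVERIFIABLE"}, chunk))  # explicit refusal
print(is_grounded({"metric": "EPS", "value": 2.18, "confidence": "HIGH"}, chunk))  # no quote
```

A verbatim substring match is deliberately strict; a caller who expects whitespace or punctuation differences between the quote and the filing text would need to normalize both sides first.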
Training
- Base model: mistralai/Mistral-7B-Instruct-v0.3
- Method: QLoRA (4-bit NF4 quantization, LoRA rank=16, alpha=32)
- Dataset: 619 examples from S&P 500 SEC filings, balanced roughly 50/50 between HIGH confidence and UNVERIFIABLE
- Data source: 10-K and 10-Q filings for 20 S&P 500 companies (2020-2024)
- Hardware: Google Colab T4 GPU (~2 hours)
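To give a sense of scale for rank=16 and alpha=32, the arithmetic below estimates how many adapter weights would be trained. It is a sketch under stated assumptions: this card does not list which modules were adapted, so the choice of `q_proj` and `v_proj` is hypothetical, and the layer shapes are those of the public Mistral-7B architecture (32 decoder layers, hidden size 4096, 8 key/value heads of dimension 128).

```python
# LoRA adds two low-rank matrices per adapted linear layer:
# A (rank x d_in) and B (d_out x rank), i.e. rank * (d_in + d_out) weights.
RANK = 16
ALPHA = 32
NUM_LAYERS = 32  # Mistral-7B decoder layers

# (d_in, d_out) for each adapted projection.
# ASSUMPTION: only q_proj and v_proj are adapted; the model card does not
# state the actual target modules.
TARGET_SHAPES = {
    "q_proj": (4096, 4096),
    "v_proj": (4096, 1024),  # 8 KV heads x 128 dims (grouped-query attention)
}


def lora_param_count(rank: int, shapes: dict, num_layers: int) -> int:
    """Count trainable LoRA weights across all decoder layers."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


trainable = lora_param_count(RANK, TARGET_SHAPES, NUM_LAYERS)
scaling = ALPHA / RANK  # factor applied to the low-rank update B @ A

print(f"trainable adapter weights: {trainable:,}")
print(f"LoRA scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapter holds about 6.8 million trainable weights, well under 1% of the base model, which is consistent with training fitting on a single T4. Adapting more projections would raise the count proportionally.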
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    torch_dtype=torch.float16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "ratnasekhar/earnings-copilot-mistral-7b")
tokenizer = AutoTokenizer.from_pretrained("ratnasekhar/earnings-copilot-mistral-7b")
prompt = """<s>[INST] You are a financial KPI extraction model. Extract metrics from SEC filing chunks as JSON. If a metric cannot be verified from the text, output {"confidence": "UNVERIFIABLE"}. Never invent numbers.
Filing chunk:
Net sales for Q1 FY2024 were $119.6 billion, an increase of 2% compared to Q1 FY2023.
Extract: What was total revenue and its YoY change? [/INST]"""
# Move inputs to whichever device device_map="auto" placed the model on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
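The decoded text is a plain string, so downstream code will usually want to parse it into a dict. The sketch below shows one defensive way to do that using only the standard library. `parse_kpi_output` is a hypothetical helper, and falling back to an UNVERIFIABLE record on malformed output is an assumption about how a caller might prefer to fail, not documented behavior of the adapter.

```python
import json

UNVERIFIABLE = {"confidence": "UNVERIFIABLE"}


def parse_kpi_output(text: str) -> dict:
    """Parse the first JSON object found in generated text."""
    start = text.find("{")
    if start == -1:
        return dict(UNVERIFIABLE)
    try:
        # raw_decode stops at the end of the first complete JSON value,
        # so any trailing tokens after the object are ignored.
        obj, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return dict(UNVERIFIABLE)
    return obj if isinstance(obj, dict) else dict(UNVERIFIABLE)


sample = ' {"metric": "Revenue", "value": 119.6, "confidence": "HIGH"} extra tokens'
print(parse_kpi_output(sample))
print(parse_kpi_output("no json here"))
```

Treating unparseable output as UNVERIFIABLE keeps a truncated generation (for example, one cut off by `max_new_tokens`) from being mistaken for a valid extraction.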
Example Output
{
"metric": "Revenue",
"value": 119.6,
"unit": "billion USD",
"period": "Q1 FY2024",
"yoy_change": "+2%",
"source_quote": "Net sales for Q1 FY2024 were $119.6 billion",
"confidence": "HIGH"
}
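Because `value` and `unit` are reported separately and `yoy_change` is a string, a consumer that compares figures across filings may want to normalize them. The following is a small sketch of that step. The helper names and the scale table are hypothetical, and it assumes unit strings follow the "<scale> USD" pattern of the example above; other unit formats the model may emit (for instance, per-share values) are not handled.

```python
from decimal import Decimal

# ASSUMPTION: unit strings look like "<scale> USD", as in the example output.
SCALES = {"thousand": 10**3, "million": 10**6, "billion": 10**9}


def to_usd(value, unit: str) -> int:
    """Convert a scaled value such as (119.6, "billion USD") to whole dollars."""
    scale_word, currency = unit.split()
    if currency != "USD":
        raise ValueError(f"unsupported currency: {currency}")
    # Decimal(str(...)) avoids binary floating-point rounding on values like 119.6.
    return int(Decimal(str(value)) * SCALES[scale_word])


def pct_to_fraction(pct: str) -> Decimal:
    """Convert a percentage string such as "+2%" to a fraction."""
    return Decimal(pct.rstrip("%")) / 100


record = {"value": 119.6, "unit": "billion USD", "yoy_change": "+2%"}
print(to_usd(record["value"], record["unit"]))
print(pct_to_fraction(record["yoy_change"]))
```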
Part of
This adapter is part of the Earnings Intelligence Copilot project, a multi-agent system that ingests SEC filings, extracts KPIs, and generates citation-grounded investment memos.