Earnings Intelligence Copilot: Fine-tuned Mistral-7B

A QLoRA fine-tuned version of Mistral-7B-Instruct-v0.3 for structured KPI extraction from SEC filings.

What it does

  • Extracts financial KPIs (Revenue, Gross Margin, Operating Income, EPS, Free Cash Flow) from SEC filing chunks as structured JSON
  • Returns {"confidence": "UNVERIFIABLE"} instead of hallucinating when data is not present in the text
  • Always includes a source_quote field grounding the answer in the original text
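The last two behaviours lend themselves to a cheap downstream check: accept an UNVERIFIABLE reply as-is, and otherwise require that the source_quote really occurs in the chunk that was sent to the model. The following is a minimal sketch under stated assumptions, not part of the adapter: `is_grounded` is a hypothetical helper, the reply is assumed to be already parsed into a dict, and collapsing whitespace before matching is an illustrative choice.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace runs so line wraps in a filing do not break matching."""
    return re.sub(r"\s+", " ", text).strip()


def is_grounded(response: dict, chunk: str) -> bool:
    """Return True if the response is safe to accept.

    An UNVERIFIABLE response is accepted as-is. Any other response must
    carry a non-empty source_quote that occurs verbatim in the chunk
    (after whitespace normalization, which is an assumption of this sketch).
    """
    if response.get("confidence") == "UNVERIFIABLE":
        return True
    quote = response.get("source_quote")
    if not isinstance(quote, str) or not quote.strip():
        return False
    return _normalize(quote) in _normalize(chunk)
```

A reply that fails this check can be downgraded to UNVERIFIABLE by the caller rather than trusted.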

Training

  • Base model: mistralai/Mistral-7B-Instruct-v0.3
  • Method: QLoRA (4-bit NF4 quantization, LoRA rank=16, alpha=32)
  • Dataset: 619 examples from S&P 500 SEC filings, balanced roughly 50/50 between HIGH confidence and UNVERIFIABLE
  • Data source: 10-K and 10-Q filings for 20 S&P 500 companies (2020-2024)
  • Hardware: Google Colab T4 GPU (~2 hours)
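To make the LoRA settings concrete, the arithmetic below works out the adapter scaling factor (alpha / rank) and a rough trainable-parameter count. It is a sketch under assumptions: the card does not say which modules were adapted, so the choice of q_proj and v_proj here is purely illustrative. The projection shapes are those of Mistral-7B (hidden size 4096, 32 layers, 1024-wide value projection).

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """LoRA scales the low-rank update B @ A by alpha / rank."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32   # values stated in the Training section
NUM_LAYERS = 32        # Mistral-7B decoder layers

# Assumption: adapters on q_proj and v_proj only; the card does not list target modules.
TARGETS = {
    "q_proj": (4096, 4096),
    "v_proj": (4096, 1024),
}

per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in TARGETS.values())
total = per_layer * NUM_LAYERS

print(lora_scaling(RANK, ALPHA))  # 2.0
print(total)                      # 6815744
```

Under these assumptions the adapter trains about 6.8 million parameters, a small fraction of the 7B base model, which is what makes training on a single T4 feasible.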

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    torch_dtype=torch.float16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "ratnasekhar/earnings-copilot-mistral-7b")
tokenizer = AutoTokenizer.from_pretrained("ratnasekhar/earnings-copilot-mistral-7b")

prompt = """<s>[INST] You are a financial KPI extraction model. Extract metrics from SEC filing chunks as JSON. If a metric cannot be verified from the text, output {"confidence": "UNVERIFIABLE"}. Never invent numbers.

Filing chunk:
Net sales for Q1 FY2024 were $119.6 billion, an increase of 2% compared to Q1 FY2023.

Extract: What was total revenue and its YoY change? [/INST]"""

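# The prompt already starts with "<s>", so skip the tokenizer's automatic BOS token
# to avoid a duplicated start-of-sequence marker. Use model.device rather than a
# hard-coded "cuda" so the inputs follow wherever device_map="auto" placed the model.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))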
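When running the model over many chunks, it may help to build the prompt with a small function instead of editing the string by hand. The sketch below reproduces the template from the snippet above character for character, including the literal `<s>` prefix; `SYSTEM_INSTRUCTION` and `build_prompt` are hypothetical names introduced here for illustration.

```python
# Instruction text copied from the usage example above.
SYSTEM_INSTRUCTION = (
    "You are a financial KPI extraction model. Extract metrics from SEC filing "
    "chunks as JSON. If a metric cannot be verified from the text, output "
    '{"confidence": "UNVERIFIABLE"}. Never invent numbers.'
)


def build_prompt(chunk: str, question: str) -> str:
    """Fill the [INST] template with one filing chunk and one extraction question."""
    return (
        f"<s>[INST] {SYSTEM_INSTRUCTION}\n\n"
        f"Filing chunk:\n{chunk.strip()}\n\n"
        f"Extract: {question.strip()} [/INST]"
    )


prompt = build_prompt(
    "Net sales for Q1 FY2024 were $119.6 billion, an increase of 2% compared to Q1 FY2023.",
    "What was total revenue and its YoY change?",
)
```

Keeping the wording identical to the training-time template is the safer default, since a fine-tuned adapter can be sensitive to prompt changes.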

Example Output

{
  "metric": "Revenue",
  "value": 119.6,
  "unit": "billion USD",
  "period": "Q1 FY2024",
  "yoy_change": "+2%",
  "source_quote": "Net sales for Q1 FY2024 were $119.6 billion",
  "confidence": "HIGH"
}
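Downstream code usually wants typed fields rather than raw text. The following sketch parses the first JSON object in the model's reply into a small dataclass and converts yoy_change into a number. It rests on assumptions: `KpiResult` and `parse_kpi` are hypothetical names, the field set is taken from the example above, and treating a missing confidence field as UNVERIFIABLE is a conservative choice made for this illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class KpiResult:
    confidence: str
    metric: Optional[str] = None
    value: Optional[float] = None
    unit: Optional[str] = None
    period: Optional[str] = None
    yoy_change_pct: Optional[float] = None
    source_quote: Optional[str] = None


def _parse_percent(text: object) -> Optional[float]:
    """Turn a string like '+2%' into 2.0; return None if it is not a percentage."""
    if not isinstance(text, str):
        return None
    try:
        return float(text.strip().rstrip("%"))
    except ValueError:
        return None


def parse_kpi(raw: str) -> KpiResult:
    """Parse the first JSON object found in the raw generation."""
    start = raw.find("{")
    if start == -1:
        raise ValueError("no JSON object in model output")
    # raw_decode stops at the end of the first JSON value and ignores trailing text.
    obj, _ = json.JSONDecoder().raw_decode(raw[start:])
    value = obj.get("value")
    return KpiResult(
        # Assumption: a reply without a confidence field is treated as UNVERIFIABLE.
        confidence=obj.get("confidence", "UNVERIFIABLE"),
        metric=obj.get("metric"),
        value=float(value) if isinstance(value, (int, float)) else None,
        unit=obj.get("unit"),
        period=obj.get("period"),
        yoy_change_pct=_parse_percent(obj.get("yoy_change")),
        source_quote=obj.get("source_quote"),
    )
```

Called on the example output above, `parse_kpi` yields a record with `value` 119.6 and `yoy_change_pct` 2.0; called on `{"confidence": "UNVERIFIABLE"}`, every field other than `confidence` is None.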

Part of

This adapter is part of the Earnings Intelligence Copilot project, a multi-agent system that ingests SEC filings, extracts KPIs, and generates citation-grounded investment memos.
