# GLiNER-relex: Generalist and Lightweight Model for Joint Zero-Shot NER and Relation Extraction
GLiNER-relex is a unified model for zero-shot Named Entity Recognition (NER) and Relation Extraction (RE) that performs both tasks simultaneously in a single forward pass. Built on the GLiNER architecture, it extends the span-based approach to jointly identify entities and extract relationships between them.
## ✨ Key Features
- Joint Extraction: Simultaneously extracts entities and relations in one forward pass
- Zero-Shot: No fine-tuning required; specify entity types and relation types at inference time
- Label Descriptions: Supports natural language descriptions for entity labels, improving accuracy by providing richer semantic context to the model
- "Other" Entity Type: Use the special
"other"label to extract entities whose types are not explicitly defined but can be inferred from the specified relations - Efficient: Single encoder architecture processes both tasks together
- Flexible: Supports custom entity and relation schemas per inference call
- Production-Ready: ONNX export support for deployment
## 📦 Installation
First, install the GLiNER library:
```bash
pip install gliner -U
```
## 🚀 Quick Start
### Basic Usage
```python
from gliner import GLiNER

# Load the model
model = GLiNER.from_pretrained("knowledgator/gliner-relex-base-v1.0")

# Define your entity types and relation types
entity_labels = ["location", "person", "date", "structure"]
relation_labels = ["located in", "designed by", "completed in"]

# Input text
text = "The Eiffel Tower, located in Paris, France, was designed by engineer Gustave Eiffel and completed in 1889."

# Run inference - returns both entities and relations
entities, relations = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.3,
    relation_threshold=0.5,
    return_relations=True,
    flat_ner=False
)

# Print entities
print("Entities:")
for entity in entities[0]:
    print(f"  {entity['text']} -> {entity['label']} (score: {entity['score']:.3f})")

# Print relations
print("\nRelations:")
for relation in relations[0]:
    head = relation["head"]["text"]
    tail = relation["tail"]["text"]
    rel_type = relation["relation"]
    score = relation["score"]
    print(f"  {head} --[{rel_type}]--> {tail} (score: {score:.3f})")
```
Expected output:
```
Entities:
  Eiffel Tower -> structure (score: 0.912)
  Paris -> location (score: 0.934)
  France -> location (score: 0.891)
  Gustave Eiffel -> person (score: 0.923)
  1889 -> date (score: 0.856)

Relations:
  Eiffel Tower --[located in]--> Paris (score: 0.823)
  Eiffel Tower --[designed by]--> Gustave Eiffel (score: 0.847)
  Eiffel Tower --[completed in]--> 1889 (score: 0.789)
```
### Using Label Descriptions
You can provide natural language descriptions for entity labels to give the model richer context, which can improve extraction accuracy, especially for ambiguous or domain-specific types:
```python
entity_labels = {
    "person": "A human individual, including fictional characters",
    "organization": "A company, institution, agency, or other group of people",
    "location": "A physical place, geographic region, or address",
    "date": "A calendar date, time period, or temporal expression"
}
relation_labels = ["works for", "located in", "founded on"]

text = "Tim Cook has been leading Apple Inc. from its headquarters in Cupertino since 2011."

entities, relations = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,
    relation_threshold=0.7,
    return_relations=True,
    flat_ner=False
)
```
When using descriptions, pass labels as a dictionary where keys are label names and values are their descriptions. The model encodes these descriptions alongside the text for better semantic matching.
### Using the "Other" Entity Type
When you want to extract entities involved in specific relations without predefining every entity type, use the special "other" label. The model will identify entities based on the relations they participate in, even if their type does not match any of the explicitly defined labels:
entity_labels = ["person"]
relation_labels = ["author of", "born in"]
text = "Gabriel GarcΓa MΓ‘rquez, born in Aracataca, wrote One Hundred Years of Solitude."
entities, relations = model.inference(
texts=[text],
labels=entity_labels + ["other"], # "other" captures relation-driven entities
relations=relation_labels,
threshold=0.5,
relation_threshold=0.7,
return_relations=True,
flat_ner=False
)
Expected output:
```
Entities:
  Gabriel García Márquez -> person (score: 0.931)
  Aracataca -> other (score: 0.724)
  One Hundred Years of Solitude -> other (score: 0.689)

Relations:
  Gabriel García Márquez --[born in]--> Aracataca (score: 0.812)
  Gabriel García Márquez --[author of]--> One Hundred Years of Solitude (score: 0.795)
```
This is particularly useful when:
- You care more about extracting relations than classifying every entity type
- The set of possible entity types is open-ended or unknown
- You want to discover entities that are connected to known types through specific relations
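When you only care about relation-connected entities, you can post-filter the results so that an "other" span is kept only if it participates in at least one extracted relation. A minimal post-processing sketch over result dicts in the documented output format (the scores are illustrative, and only one of the two relations is included so the filter has something to drop):

```python
# Illustrative results in the documented output format (scores invented;
# only one relation included for demonstration purposes).
entities = [
    {"text": "Gabriel García Márquez", "label": "person", "score": 0.931},
    {"text": "Aracataca", "label": "other", "score": 0.724},
    {"text": "One Hundred Years of Solitude", "label": "other", "score": 0.689},
]
relations = [
    {"head": {"text": "Gabriel García Márquez"},
     "tail": {"text": "Aracataca"},
     "relation": "born in", "score": 0.812},
]

# Keep typed entities unconditionally; keep "other" entities only when
# they appear as the head or tail of an extracted relation.
linked = {r["head"]["text"] for r in relations} | {r["tail"]["text"] for r in relations}
kept = [e for e in entities if e["label"] != "other" or e["text"] in linked]

print([e["text"] for e in kept])  # ['Gabriel García Márquez', 'Aracataca']
```

Here "One Hundred Years of Solitude" is dropped because no extracted relation links to it in this toy input.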
### Batch Processing
```python
texts = [
    "Elon Musk founded SpaceX in Hawthorne, California.",
    "Microsoft, led by Satya Nadella, acquired GitHub in 2018.",
    "The Louvre Museum in Paris houses the Mona Lisa."
]

entity_labels = ["person", "organization", "location", "artwork"]
relation_labels = ["founder of", "CEO of", "located in", "acquired", "houses"]

entities, relations = model.inference(
    texts=texts,
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,
    relation_threshold=0.5,
    batch_size=8,
    return_relations=True,
    flat_ner=False
)

for i, (text_entities, text_relations) in enumerate(zip(entities, relations)):
    print(f"\nText {i + 1}:")
    print(f"  Entities: {[e['text'] for e in text_entities]}")
    print(f"  Relations: {[(r['head']['text'], r['relation'], r['tail']['text']) for r in text_relations]}")
```
### Entity-Only Extraction
If you only need entities without relations:
```python
entities = model.inference(
    texts=[text],
    labels=entity_labels,
    relations=[],  # Empty list for relations
    threshold=0.5,
    return_relations=False,  # Skip relation extraction
    flat_ner=False
)
```
## ⚙️ Advanced Configuration
### Adjusting Thresholds
You can fine-tune extraction sensitivity with separate thresholds:
```python
entities, relations = model.inference(
    texts=texts,
    labels=entity_labels,
    relations=relation_labels,
    threshold=0.5,             # Entity confidence threshold
    adjacency_threshold=0.6,   # Threshold for entity pair candidates
    relation_threshold=0.7,    # Relation classification threshold
    flat_ner=True,             # Enforce non-overlapping entities
    multi_label=False,         # Single label per entity span
    return_relations=True
)
```
We recommend keeping `threshold` (the entity extraction threshold) relatively low, in the range 0.3–0.5. For `adjacency_threshold`, the model gives good results in the 0.5–0.65 range; for `relation_threshold`, use higher values such as 0.7–0.9. Adjust all of these based on your project requirements.
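Because every returned entity and relation carries a confidence score, one practical tuning workflow is to run inference once with lenient thresholds and then filter the results post-hoc while experimenting. A minimal, self-contained sketch (the `filter_by_score` helper and the sample dicts are illustrative, not part of the GLiNER API):

```python
def filter_by_score(items, min_score):
    """Post-hoc filter for entity or relation dicts by confidence score."""
    return [item for item in items if item["score"] >= min_score]

# Illustrative relation dicts in the documented output format.
relations = [
    {"relation": "located in", "score": 0.82},
    {"relation": "designed by", "score": 0.55},
]

# Raising the cutoff to 0.7 keeps only the higher-confidence relation.
print(filter_by_score(relations, 0.7))  # [{'relation': 'located in', 'score': 0.82}]
```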
## 📊 Output Format
### Entity Format
```python
{
    "start": int,    # Start character position
    "end": int,      # End character position
    "text": str,     # Entity text span
    "label": str,    # Entity type
    "score": float   # Confidence score (0-1)
}
```
### Relation Format
```python
{
    "head": {
        "start": int,
        "end": int,
        "text": str,
        "type": str,
        "entity_idx": int  # Index in entities list
    },
    "tail": {
        "start": int,
        "end": int,
        "text": str,
        "type": str,
        "entity_idx": int
    },
    "relation": str,  # Relation type
    "score": float    # Confidence score (0-1)
}
```
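The `entity_idx` fields point back into the entities list, which makes it straightforward to resolve each relation into a plain (head, relation, tail) triple for downstream use. A minimal sketch assuming the formats above (`to_triples` is a hypothetical helper, not part of the library):

```python
def to_triples(entities, relations):
    """Resolve each relation to a (head_text, relation, tail_text) triple
    via the entity_idx back-references into the entities list."""
    triples = []
    for r in relations:
        head = entities[r["head"]["entity_idx"]]
        tail = entities[r["tail"]["entity_idx"]]
        triples.append((head["text"], r["relation"], tail["text"]))
    return triples

# Minimal illustrative data in the documented output format.
entities = [
    {"start": 4, "end": 16, "text": "Eiffel Tower", "label": "structure", "score": 0.91},
    {"start": 29, "end": 34, "text": "Paris", "label": "location", "score": 0.93},
]
relations = [
    {"head": {"start": 4, "end": 16, "text": "Eiffel Tower", "type": "structure", "entity_idx": 0},
     "tail": {"start": 29, "end": 34, "text": "Paris", "type": "location", "entity_idx": 1},
     "relation": "located in", "score": 0.82},
]

print(to_triples(entities, relations))  # [('Eiffel Tower', 'located in', 'Paris')]
```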
## 🏗️ Architecture
GLiNER-relex uses a unified encoder architecture that:
1. Encodes text and labels jointly using a transformer backbone.
2. Identifies entity spans using span-based classification.
3. Constructs an adjacency matrix to identify potential entity pairs using graph convolutional networks.
4. Classifies relations between selected entity pairs.
This joint approach allows the model to leverage entity information when extracting relations, leading to more coherent predictions.
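The adjacency step can be pictured as thresholding a pairwise score matrix over the detected entity spans: only pairs whose score clears `adjacency_threshold` are passed on to relation classification. A toy sketch (the scores here are invented; in the real model they come from the graph-convolutional component, not a hard-coded matrix):

```python
# Hypothetical adjacency scores for three detected entity spans.
# adjacency[i][j] scores span i as a candidate head for tail span j.
adjacency = [
    [0.0, 0.8, 0.2],
    [0.1, 0.0, 0.7],
    [0.6, 0.3, 0.0],
]
adjacency_threshold = 0.6

# Keep only ordered pairs whose score clears the threshold; these are
# the candidates the relation classifier would then label.
candidate_pairs = [
    (i, j)
    for i, row in enumerate(adjacency)
    for j, score in enumerate(row)
    if i != j and score >= adjacency_threshold
]
print(candidate_pairs)  # [(0, 1), (1, 2), (2, 0)]
```

Raising `adjacency_threshold` prunes more pairs before relation classification, trading recall for speed and precision.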
## 🎯 Use Cases
- Knowledge Graph Construction: Extract structured facts from unstructured text
- Information Extraction Pipelines: Build end-to-end IE systems
- Document Understanding: Extract entities and their relationships from documents
- Question Answering: Power QA systems with structured knowledge
- Data Enrichment: Automatically annotate text corpora