---
title: MindMatch
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: 'Hybrid dating recommendation system combining collaborative '
license: mit
---

# Welcome to Streamlit!

Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:

If you have any questions, check out our [documentation](https://docs.streamlit.io) and [community forums](https://discuss.streamlit.io).

# MatchMind: Dating Profile Recommendation System

## Overview

MatchMind is a hybrid dating recommendation system that combines collaborative filtering and content-based approaches to suggest matches. The system generates text-based user profiles, encodes them into vector embeddings, and stores them in a Qdrant vector database. It retrieves potential matches using semantic search, then reranks them with a cross-encoder model to improve match quality. The system mirrors production-grade matching logic and supports benchmarking with LLMs.

## What the Project Does

- **Profile Generation:** Converts structured user data and Q&A into natural language profile templates.
- **Embedding:** Uses the `nomic-ai/nomic-embed-text-v1.5` model (SentenceTransformer) to generate 768-dimensional embeddings for each profile.
- **Vector Database:** Stores and searches embeddings in Qdrant (collections: `dating_M`, `dating_F`).
- **Semantic Search:** Retrieves top candidate matches using vector similarity (cosine distance).
- **Reranking:** Uses the cross-encoder model `cross-encoder/ms-marco-MiniLM-L-6-v2` to rerank the top candidates for final recommendations.
- **Multi-Signal Scoring:** Supports additional signals and benchmarking for match quality.
- **Web App:** Streamlit interface for user profile entry, Q&A, match discovery, and full profile viewing.
- **Benchmarking:** Optional LLM-based scoring for research and evaluation.
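
The retrieve-then-rerank flow described above can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: the tiny hand-written vectors stand in for the 768-dimensional embeddings, the in-memory dictionary stands in for a Qdrant collection, and `word_overlap` is a hypothetical toy scorer standing in for the cross-encoder. All function names here are assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
Scored = List[Tuple[str, float]]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec: Vector, index: Dict[str, Vector], top_k: int) -> Scored:
    """Stage 1: rank every stored profile vector by similarity to the query.

    `index` is a stand-in for a vector database collection.
    """
    scored = [(pid, cosine_similarity(query_vec, vec)) for pid, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def word_overlap(query_text: str, profile_text: str) -> float:
    """Toy pair scorer (Jaccard overlap of lowercase words).

    Hypothetical placeholder for a cross-encoder, which would score the
    (query, profile) text pair jointly.
    """
    a = set(query_text.lower().split())
    b = set(profile_text.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rerank(
    query_text: str,
    candidate_ids: List[str],
    texts: Dict[str, str],
    score_pair: Callable[[str, str], float],
    top_n: int,
) -> Scored:
    """Stage 2: rescore only the retrieved candidates with a pairwise scorer."""
    rescored = [(pid, score_pair(query_text, texts[pid])) for pid in candidate_ids]
    rescored.sort(key=lambda item: item[1], reverse=True)
    return rescored[:top_n]


if __name__ == "__main__":
    # Made-up 2-dimensional "embeddings" and profile texts, for illustration only.
    index = {"p1": [1.0, 0.0], "p2": [0.8, 0.6], "p3": [0.0, 1.0]}
    texts = {
        "p1": "enjoys hiking and cooking",
        "p2": "loves jazz and long hikes and board games",
        "p3": "prefers quiet evenings at home",
    }
    query_text = "looking for someone who loves jazz and board games"

    hits = retrieve([1.0, 0.1], index, top_k=2)
    final = rerank(query_text, [pid for pid, _ in hits], texts, word_overlap, top_n=2)
    print("retrieved:", hits)
    print("reranked:", final)
```

The two-stage split is the point of the design: vector search is cheap enough to run over the whole collection, while the more expensive pairwise scorer only sees the short candidate list.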

## Models and Technologies Used

- **Embedding Model:** `nomic-ai/nomic-embed-text-v1.5` (SentenceTransformer)
- **Reranking Model:** `cross-encoder/ms-marco-MiniLM-L-6-v2` (CrossEncoder)
- **Vector Database:** Qdrant (collections: `dating_M`, `dating_F`)
- **Backend:** Python, Flask, SQLAlchemy, MySQL
- **Frontend:** Streamlit
- **Benchmarking (optional):** LLMs via Ollama

## Installation

1. Clone the repository.
2. Install dependencies:
   ```
   pip install -r requirements.txt
   ```
3. Set up environment variables in a `.env` file (see `config.py` for required keys).

## Try the Web Demo

You can try out the MatchMind web app here: https://ultron3002-mindmatch.hf.space/

## Project Structure

- `app.py`: Streamlit web interface
- `main.py`: Core ranking and embedding logic
- `config.py`: Configuration and environment variables
- `db/`: Database connection and SQL utilities
- `utils/`: Embedding, ranking, and Qdrant utilities
- `page_modules/`: Streamlit page components
- `benchmarking/`: LLM benchmarking scripts
- `migration_notebooks/`, `templating_notebooks/`: Data and template notebooks

## Requirements

See `requirements.txt` for all dependencies.

## License

MIT License