How FinRAG Works
An educational companion to the capstone project at market-sentiment.io
What is RAG?
Retrieval-Augmented Generation (RAG) is a technique that combines search with generation. Instead of relying solely on a model's training data, a RAG system first retrieves relevant documents, then uses them to ground its response. In financial research, this means searching across earnings calls, SEC filings, charts, and news -- using semantic understanding, not just keywords.
Traditional keyword search fails for finance: “revenue beat expectations” and “earnings topped estimates” mean the same thing but share almost no words. Embedding-based search captures this semantic similarity. Click each stage below to see how:
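The keyword gap can be made concrete. A minimal sketch (the `jaccard` helper is written here for illustration): the two paraphrases above share zero content words, so any overlap-based score is zero, while their embeddings would sit close together in vector space.

```python
def jaccard(a: str, b: str) -> float:
    """Keyword overlap: size of the shared word set over the combined word set."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# The two paraphrases share no words at all:
print(jaccard("revenue beat expectations", "earnings topped estimates"))  # -> 0.0
```

A keyword engine scores these two sentences as completely unrelated; an embedding model trained on financial text places them near each other, which is exactly what retrieval needs.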
How Sentiment Analysis Works
FinBERT is a language model fine-tuned specifically for financial text. It classifies text as positive, negative, or neutral. But knowing the label isn't enough -- we need to understand why.
SHAP (SHapley Additive exPlanations) uses game theory to attribute the prediction to individual tokens. Each token gets a SHAP value showing how much it pushed the prediction toward or away from a given class. This is critical for trust in financial AI -- you need to know if the model is responding to meaningful signals or noise.
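The game-theoretic attribution can be sketched directly. The toy below computes exact Shapley values over tokens for a hypothetical scorer whose word weights are invented for illustration; a real pipeline would wrap FinBERT with the `shap` library rather than enumerating subsets by hand.

```python
from itertools import combinations
from math import factorial

# Toy "model": scores a set of tokens for positive sentiment.
# Weights are made up for illustration; a real pipeline would call FinBERT.
WEIGHTS = {"beat": 2.0, "expectations": 0.5, "missed": -2.0, "revenue": 0.1}

def score(tokens: frozenset) -> float:
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)

def shapley_values(tokens: list[str]) -> dict[str, float]:
    """Exact Shapley attribution: each token's value is its average marginal
    contribution to the score, weighted over all subsets of the other tokens."""
    n = len(tokens)
    values = {}
    for t in tokens:
        others = [u for u in tokens if u != t]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                s = frozenset(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (score(s | {t}) - score(s))
        values[t] = total
    return values

print(shapley_values(["revenue", "beat", "expectations"]))
```

Because this toy scorer is additive, each token's Shapley value equals its weight, and the values sum to the full score (the "efficiency" property that makes SHAP attributions consistent with the model's actual output).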
Interactive Demo
Type or paste any financial sentence and see how FinBERT classifies it, with SHAP explanations showing which words drove the prediction.
Why Multimodal?
Financial events produce data across multiple formats. An earnings report generates an audio recording (the call), a PDF filing (the 10-K), news articles (analyst reactions), and chart movements (price action). Analyzing only one modality means missing context from the others. FinRAG embeds all four using a single model, enabling cross-modal search.
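With every modality embedded into one vector space, cross-modal retrieval reduces to a single nearest-neighbor search. A minimal sketch, using a bag-of-words stand-in where the real system would send each chunk (text, audio, PDF, or image) through one multimodal embedding model:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedder: bag-of-words counts. The real system would get
    # dense vectors in a shared space from a single multimodal model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One index across modalities: each chunk carries its modality tag.
index = [
    ("news",  "revenue beat expectations in the third quarter"),
    ("audio", "we are confident about margin expansion next year"),
    ("pdf",   "risk factors include supply chain disruption"),
]
query = embed("third quarter revenue beat")
ranked = sorted(index, key=lambda c: cosine(query, embed(c[1])), reverse=True)
print(ranked[0][0])  # the news chunk ranks first for this query
```

The point of the shared space is that the query never needs to know which modality it is matching against: scoring and ranking are identical for all four chunk types.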
News & Articles
Financial news articles capture market reactions, analyst opinions, and event summaries. Text embeddings excel at capturing semantic nuance -- “revenue beat expectations” and “earnings topped estimates” are close in vector space despite sharing few words.
Text chunks (512-word sliding window, 128-word overlap)
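The sliding-window scheme above can be sketched in a few lines (a minimal illustration of the 512-word window with 128-word overlap, i.e. a stride of 384 words; the helper name is ours, not FinRAG's):

```python
def chunk_words(text: str, window: int = 512, overlap: int = 128) -> list[str]:
    """Sliding-window chunking: each chunk is `window` words and shares
    `overlap` words with the previous chunk (stride = window - overlap)."""
    words = text.split()
    stride = window - overlap
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break
    return chunks
```

The overlap means a sentence falling at a chunk boundary still appears whole in at least one chunk, at the cost of embedding some words twice.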
Earnings Calls
Earnings call audio captures tone, emphasis, and nuance that transcripts miss. Gemini Embedding 2 natively embeds audio -- no transcription needed -- preserving vocal cues that correlate with management confidence.
Audio segments (60s with 10s overlap, MP4 format)
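The segment boundaries follow the same window/overlap pattern. A sketch that only computes the (start, end) times in seconds; the actual pipeline would cut the MP4 audio at these boundaries before embedding:

```python
def audio_segments(duration_s: float, window_s: float = 60.0,
                   overlap_s: float = 10.0) -> list[tuple[float, float]]:
    """Return (start, end) times for overlapping audio segments: each window
    is window_s seconds, overlapping the previous segment by overlap_s."""
    stride = window_s - overlap_s
    segments = []
    start = 0.0
    while True:
        end = min(start + window_s, duration_s)
        segments.append((start, end))
        if end >= duration_s:
            break
        start += stride
    return segments

print(audio_segments(130.0))  # -> [(0.0, 60.0), (50.0, 110.0), (100.0, 130.0)]
```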
SEC Filings
10-K and 10-Q filings contain structured financial data, risk factors, and management discussion. PDF embedding captures both text and layout, preserving table structures and section relationships.
PDF chunks (up to 6 pages per chunk)
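Grouping a filing's pages into chunks of at most six is simple bookkeeping; a sketch (the real pipeline would then embed each page range as one PDF chunk):

```python
def page_chunks(num_pages: int, max_pages: int = 6) -> list[tuple[int, int]]:
    """Group a filing's pages into chunks of at most max_pages pages,
    returned as inclusive (first_page, last_page) ranges, 1-indexed."""
    return [(start, min(start + max_pages - 1, num_pages))
            for start in range(1, num_pages + 1, max_pages)]

print(page_chunks(14))  # -> [(1, 6), (7, 12), (13, 14)]
```

Unlike the text and audio chunkers, there is no overlap here: the 6-page cap exists to keep each chunk within the embedding model's input limit, and layout-aware PDF embedding preserves table and section structure within each range.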
Financial Charts
Candlestick charts, volume bars, and technical indicators encode price action visually. Image embeddings capture visual patterns -- a head-and-shoulders formation is recognized as a pattern, not just pixels.
Single image per chunk (PNG/JPEG)
This project is the companion application to the capstone research at market-sentiment.io