Introducing SimpleRAG: A Multimedia Retrieval-Augmented Generation Pipeline

I’m thrilled to announce the release of SimpleRAG, my latest open-source project hosted on GitHub. SimpleRAG is an advanced Retrieval-Augmented Generation (RAG) pipeline designed for processing images, videos, and audio, with specialized capabilities for financial and trading chart analysis.

Whether you're a researcher, trader, or developer looking to explore multimedia data, SimpleRAG offers a powerful, extensible, and user-friendly solution for embedding, analyzing, and visualizing complex datasets.

What is SimpleRAG?

SimpleRAG is a pipeline that processes multimedia data (images, videos, and audio) and extracts meaningful insights, particularly from financial trading charts. It uses CLIP for embeddings, BLIP for captioning, Whisper for transcription, and Tesseract OCR for text extraction, and stores the results in a vector database (ChromaDB) for fast querying and interactive visualization.
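To give a feel for how two of these tools are usually called from Python, here is a minimal sketch that wraps Tesseract OCR and Whisper behind small functions and merges their output into one record. The function names (ocr_text, transcribe_audio, describe_media), the extension list, and the record layout are hypothetical and are not taken from SimpleRAG's code; only pytesseract.image_to_string, whisper.load_model, and model.transcribe are real library calls. The third-party imports are deferred into the functions so the dispatch logic can be tried without the models installed.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to your own data.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}

def ocr_text(image_path):
    """Extract printed text (e.g. axis labels on a chart) with Tesseract."""
    import pytesseract  # wrapper; the Tesseract binary must be installed
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path)).strip()

def transcribe_audio(media_path, model_name="base"):
    """Transcribe speech with OpenAI Whisper (requires ffmpeg on PATH)."""
    import whisper
    model = whisper.load_model(model_name)
    return model.transcribe(str(media_path))["text"].strip()

def describe_media(path, ocr=ocr_text, speech=transcribe_audio):
    """Build one metadata record: images get OCR text, other files a transcript."""
    path = Path(path)
    if path.suffix.lower() in IMAGE_EXTS:
        kind, text = "image", ocr(path)
    else:
        kind, text = "audio_or_video", speech(path)
    return {"source": str(path), "kind": kind, "text": text}
```

Passing the OCR and transcription functions in as arguments keeps the record-building logic separate from the heavy models, which makes it easy to swap in a different model or a stub while experimenting.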

The project is built for rapid experimentation, extensibility, and deep inspection of data and model outputs, making it ideal for trading analytics, multimedia search, and research.

Key Capabilities

Getting Started with Docker

SimpleRAG is designed for maximum compatibility and ease of use. You can run the entire pipeline in a Docker container to ensure a clean environment with all dependencies pre-installed. Here’s how to set it up:

Build the Docker image:

docker build -t simplerag .

Run the container:

docker run -it --rm \
    --network=host \
    -v /Users/igorkomolov/Movies/RAG_Videos:/app/Movies/RAG_Videos \
    -v /Users/igorkomolov/Pictures/RAG_Photos:/app/Pictures/RAG_Photos \
    -v "$PWD/photo_db:/app/photo_db" \
    -e RAG_VIDEO_FOLDER=/app/Movies/RAG_Videos \
    -e RAG_PHOTO_FOLDER=/app/Pictures/RAG_Photos \
    simplerag

Note that the host paths above point to directories on my machine, so replace them with the locations of your own video and photo folders.

Important: if you are on a Mac with Apple Silicon, do not use Docker. Run start.sh or install the dependencies yourself instead.

The docker run command above mounts your data folders, sets the RAG_VIDEO_FOLDER and RAG_PHOTO_FOLDER environment variables, and runs build_rag.py in a Python 3.12 environment. To run custom scripts or workflows, modify the Dockerfile or its entrypoint. For manual setup without Docker, see start.sh or start.md in the repository.
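If you want to check what the container will see before running the full pipeline, a short script can read the same two environment variables and list the media files under them. This is only a sketch: media_folder, discover, the extension sets, and the fallback paths are my assumptions for illustration, not code from build_rag.py.

```python
import os
from pathlib import Path

# Assumed extension sets; the real pipeline may accept more formats.
VIDEO_EXTS = {".mp4", ".mov", ".mkv"}
PHOTO_EXTS = {".png", ".jpg", ".jpeg"}

def media_folder(env_var, default):
    """Return the folder named by env_var, falling back to default."""
    return Path(os.environ.get(env_var, default)).expanduser()

def discover(folder, extensions):
    """List files under folder (recursively) whose suffix is in extensions."""
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )

if __name__ == "__main__":
    videos = discover(media_folder("RAG_VIDEO_FOLDER", "/app/Movies/RAG_Videos"), VIDEO_EXTS)
    photos = discover(media_folder("RAG_PHOTO_FOLDER", "/app/Pictures/RAG_Photos"), PHOTO_EXTS)
    print(f"{len(videos)} videos, {len(photos)} photos found")
```

An empty result usually means a volume mount or an environment variable points at the wrong path.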

Features & Updates (August 2025)

Model Management

Chart & Video Frame Analysis

Visualization Tools

Requirements & Installation

To run SimpleRAG locally or extend its functionality, install the following dependencies:

Core Pipeline:

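pip install pytesseract opencv-python blip clip whisper ffmpeg-python chromadb

Package names on PyPI may differ from the names above. OpenAI's Whisper, for example, is published as openai-whisper, so check the repository's setup files for the exact names. pytesseract and ffmpeg-python are wrappers and also need the Tesseract and ffmpeg binaries installed on your system.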

GUI/3D Visualization:

pip install PyQt5 PyQtWebEngine plotly networkx
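After installing, a quick check can tell you which pieces are still missing. The sketch below uses only the standard library; the helper names are hypothetical, and the module list is my assumption about import names (opencv-python imports as cv2, ffmpeg-python as ffmpeg). I left out the CLIP and BLIP entries because their import names depend on how they were installed.

```python
import importlib.util
import shutil

# Import names, which can differ from the pip package names.
PYTHON_MODULES = ["pytesseract", "cv2", "whisper", "ffmpeg", "chromadb",
                  "PyQt5", "plotly", "networkx"]
# Command-line tools that the Python wrappers call under the hood.
SYSTEM_BINARIES = ["tesseract", "ffmpeg"]

def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]

def missing_binaries(names):
    """Return the executables that are not on PATH."""
    return [n for n in names if shutil.which(n) is None]

if __name__ == "__main__":
    print("Missing Python modules:", missing_modules(PYTHON_MODULES) or "none")
    print("Missing system binaries:", missing_binaries(SYSTEM_BINARIES) or "none")
```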

Notes:

Example Workflow

  1. Run python build_rag.py and select or create a model.
  2. Add images and videos; the pipeline will embed, transcribe, caption, and extract chart data.
  3. Use [d] in the CLI to delete a model if needed.
  4. Inspect metadata with python visualize_metadata.py.
  5. Explore data in 3D with python visualize_metadata_3d.py (GUI required).

Vector Database

SimpleRAG uses ChromaDB to store embeddings and metadata in the /vectordb directory, which enables fast semantic search and detailed inspection of the stored data. I’m actively researching vector database optimizations to improve performance and scalability, so stay tuned for updates!
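For readers new to ChromaDB, here is a minimal sketch of writing embeddings with metadata and querying them back. The record layout, the collection name "media", the ./vectordb path, and both function names are assumptions for illustration and may not match SimpleRAG's schema. The ChromaDB calls used (PersistentClient, get_or_create_collection, upsert, query) are part of its public Python API. The sketch flattens metadata first because ChromaDB expects metadata values to be scalars (str, int, float, bool).

```python
def to_chroma_batch(records):
    """Convert {id, embedding, metadata} dicts into keyword arguments for upsert().

    None values are dropped and lists are joined into one string, because
    metadata values are expected to be scalars.
    """
    ids, embeddings, metadatas = [], [], []
    for rec in records:
        meta = {}
        for key, value in rec["metadata"].items():
            if value is None:
                continue
            if isinstance(value, (list, tuple)):
                value = ", ".join(str(v) for v in value)
            elif not isinstance(value, (str, int, float, bool)):
                value = str(value)
            meta[key] = value
        ids.append(rec["id"])
        embeddings.append([float(x) for x in rec["embedding"]])
        metadatas.append(meta)
    return {"ids": ids, "embeddings": embeddings, "metadatas": metadatas}

def store_and_query(records, query_embedding, db_path="./vectordb", n_results=3):
    """Persist records, then return the nearest neighbours of query_embedding."""
    import chromadb  # deferred so to_chroma_batch works without ChromaDB installed
    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_or_create_collection(name="media")
    # upsert rather than add, so re-running on the same files does not duplicate ids
    collection.upsert(**to_chroma_batch(records))
    return collection.query(query_embeddings=[query_embedding], n_results=n_results)
```

In a real run, the embeddings would come from a model such as CLIP, and the query embedding would have to be produced by the same model so the vectors are comparable.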

Why SimpleRAG?

SimpleRAG is designed for developers, researchers, and traders who need a flexible, powerful tool for multimedia analysis. Its focus on financial chart analysis makes it particularly valuable for trading analytics, while its extensible architecture supports a wide range of use cases, from multimedia search to research prototyping. The interactive CLI and visualization tools make it easy to experiment and gain insights from your data.

Get Involved

Check out the SimpleRAG repository on GitHub to explore the code, contribute, or provide feedback. I’m excited to see how the community uses SimpleRAG for trading, research, and beyond. Feel free to reach out on X or via the repository’s issues page with questions or ideas!

Happy coding and analyzing!
