A team recently took part in a national hackathon organized by Alfa-Bank, focused on Retrieval-Augmented Generation (RAG) for question-answering systems. The challenge was to build a RAG pipeline that finds the fragments of a data corpus most relevant to a user's query. The team chose this track over an alternative one, building a copilot application for small business clients, and delivered a solution they are excited to share.
The hackathon combined an online stage with an offline final in Moscow, and the team scored 38.5 and 32 points in the two stages, respectively, against a winning score of 40.3. The task was well defined and the organizers' support was strong, with prizes of 250, 150, and 100 for the top three teams. After the event, finalists also had the opportunity to secure fast-track interviews.
The project used a tech stack of Python 3.8+, Jupyter, Pandas, NumPy, and machine learning libraries including Scikit-learn and PyTorch. The team combined RAG techniques with knowledge graphs, so the system not only retrieves relevant text but also captures the structure of and relationships within the information, supporting reasoning over connected facts rather than isolated passages.
Architecturally, the project moved away from a linear pipeline to a dynamic graph-based system that adapts in real time to related documents and a growing knowledge base. The key stages were document ingestion and chunking, with metadata attached to each chunk, followed by vectorization to generate embeddings. For indexing, the team chose a balanced FAISS setup combining HNSW with IndexFlatIP to trade off response time against search quality.
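The ingestion and retrieval stages can be sketched as follows. The chunk size, overlap, and the brute-force inner-product search standing in for FAISS's IndexFlatIP are illustrative assumptions, not the team's actual parameters:

```python
import math

def chunk_document(doc_id, text, size=200, overlap=50):
    """Split a document into overlapping chunks, attaching metadata to each.
    Chunk size and overlap are illustrative; the team's values are not given."""
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append({
            "doc_id": doc_id,
            "offset": start,
            "text": text[start:start + size],
        })
    return chunks

def normalize(vec):
    """L2-normalize so that inner product equals cosine similarity,
    which is how IndexFlatIP is typically used for cosine search."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def search(query_vec, index, top_k=3):
    """Exact inner-product search over normalized vectors -- a brute-force
    stand-in for IndexFlatIP; HNSW would add an approximate fast path."""
    q = normalize(query_vec)
    scored = [(sum(a * b for a, b in zip(q, normalize(v))), meta)
              for v, meta in index]
    return sorted(scored, key=lambda s: -s[0])[:top_k]
```

The design choice mirrors the trade-off named above: a flat inner-product index gives exact results at linear cost, while HNSW answers approximately but much faster on large corpora.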
To create a meaningful knowledge graph, the NLI model MoritzLaurer/mDeBERTa-v3-base-mnli-xnli was employed, which assessed the strength of relationships between text chunks. This nuanced approach significantly improved search quality by categorizing fragments based on semantic proximity.
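The graph-construction step can be sketched as below. The scorer here is a deliberately simple stand-in so the sketch runs without model weights; in the actual system, the NLI model MoritzLaurer/mDeBERTa-v3-base-mnli-xnli fills this role, and the threshold value is an assumed parameter:

```python
from itertools import combinations

def lexical_overlap_score(a, b):
    """Stand-in relatedness scorer (Jaccard word overlap). The team's
    system used an NLI model here to judge relationship strength."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def build_knowledge_graph(chunks, score=lexical_overlap_score, threshold=0.2):
    """Connect chunk pairs whose relationship score clears the threshold.
    Nodes are chunk indices; edges carry the score as a weight."""
    graph = {i: [] for i in range(len(chunks))}
    for i, j in combinations(range(len(chunks)), 2):
        s = score(chunks[i], chunks[j])
        if s >= threshold:
            graph[i].append((j, s))
            graph[j].append((i, s))
    return graph
```

Swapping the stub for an NLI model turns edge weights into semantic judgments (entailment strength) rather than surface overlap, which is what lets retrieval follow meaning across differently worded fragments.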
After testing candidate models for speed, quality, and resource consumption, the team settled on ai-forever/ru-en-RoSBERTa embeddings, which retained bilingual context well while staying fast. For reranking, they used DiTy/cross-encoder-russian-msmarco, well suited to the Russian financial domain, which sorted the top candidates most accurately.
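The two-stage retrieve-then-rerank structure described above can be sketched generically. Both scorers are passed in as plain callables standing in for the bi-encoder (ru-en-RoSBERTa) and cross-encoder (DiTy/cross-encoder-russian-msmarco); the shortlist size is an assumption:

```python
def retrieve_then_rerank(query, chunks, embed_score, cross_score,
                         shortlist=10, top_k=3):
    """Two-stage search: a fast bi-encoder score shortlists candidates,
    then a slower but more accurate cross-encoder reorders them.
    embed_score and cross_score are stand-ins for the actual models."""
    # Stage 1: cheap embedding-based scoring over the whole corpus.
    coarse = sorted(chunks, key=lambda c: -embed_score(query, c))[:shortlist]
    # Stage 2: expensive pairwise scoring only over the shortlist.
    return sorted(coarse, key=lambda c: -cross_score(query, c))[:top_k]
```

The point of the split is cost: the cross-encoder reads query and candidate together, so it is far more accurate but too slow to run over the full corpus; restricting it to a shortlist keeps latency acceptable.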
For the language model, the team settled on Qwen3-MAX 8B, which offered a good balance of generative capacity and efficiency, particularly in handling financial terminology. This hybrid search system not only enhances data retrieval but also sets a high bar for competitors in the market.
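The generation step boils down to packing the top-ranked chunks into the model's context and posing the question. The template and character budget below are illustrative assumptions, not the team's actual prompt:

```python
def build_rag_prompt(question, chunks, max_chars=2000):
    """Pack top-ranked chunks into the prompt in rank order, stopping
    before the character budget is exceeded. Budget and template
    wording are assumptions for illustration."""
    context, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break
        context.append(chunk)
        used += len(chunk)
    joined = "\n---\n".join(context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Truncating by budget rather than by a fixed chunk count keeps the prompt within the model's context window regardless of how long individual fragments are.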
As the market evolves, this development signals a shift towards more sophisticated search methodologies, potentially outpacing traditional vector-based approaches and raising the competitive stakes for other players in the field.
Informational material. 18+.