Commit 24b6a4f9 authored by Harris, Tyrone

Added Redis caching, PostgreSQL support, and error handling; improved the chat UI

parent c241fa3a
+0 −0

File moved.

+1 −1

File changed and moved.


+131 −76
@@ -33,17 +33,19 @@ This system is designed to provide a seamless and informative question-answering

## Features

* Offline Operation: Functions entirely without an internet connection, ensuring data privacy and availability.
* Multilingual Support: Handles questions and provides answers in multiple languages (Spanish, French, German, Thai, Russian, Arabic, Portuguese, Mandarin).
* Contextual Understanding: Maintains conversation history within chat sessions to provide more relevant and coherent responses to follow-up questions.
* Self-Correction: Employs a retry mechanism to iteratively refine answers, minimizing hallucinations and improving accuracy.
* Terminal-Based Chat Interface: Offers a user-friendly, real-time chat interface for interaction.
* UID Tracking & Database: Assigns unique identifiers to each interaction, facilitating tracking, analysis, and debugging.
* Caching: Enhances performance by storing and reusing previous results.
* Document Referencing: Provides transparency by citing the sources used to generate answers.
* Efficient Multilingual Tokenizer: Utilizes SentencePiece for efficient handling of multiple languages.
* Offline LLM: Leverages the Vicuna-7B model for powerful language understanding and answer generation capabilities in an offline setting.
* Queue-Based Processing: Handles multiple chat requests concurrently, ensuring fair and efficient processing.
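The self-correction feature above boils down to a bounded retry loop: generate an answer, ask a critic for problems, and regenerate with that feedback. A minimal sketch, with hypothetical `generate` and `critique` stand-ins rather than the project's actual tools:

```python
def answer_with_retries(question, generate, critique, max_retries=3):
    """Iteratively refine an answer until the critic reports no problems."""
    feedback = None
    answer = None
    for _ in range(max_retries):
        answer = generate(question, feedback)  # feedback steers regeneration
        feedback = critique(question, answer)  # None means the answer passed
        if feedback is None:
            break
    return answer, feedback

# Toy stand-ins: the critic rejects any answer shorter than 5 characters.
def generate(question, feedback):
    return "hi" if feedback is None else "a longer answer"

def critique(question, answer):
    return None if len(answer) >= 5 else "too short"
```

The real system's loop additionally re-retrieves documents using the critic's feedback before regenerating.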

![image](/CRAIG_Graph.svg)

## System Architecture

@@ -67,7 +69,7 @@ The system's modular architecture comprises interconnected components, each fulf
    * Loads offline translation models (MarianMT) for supported language pairs using the SentencePiece tokenizer
    * Loads the local embedded document collection using the specified embedding model (sentence-transformers/all-MiniLM-L6-v2)
    * Initializes the database connection and creates the necessary table
    * Sets up a Redis cache
    * Loads the offline LLM (Vicuna-7B)
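The cache setup can be sketched with a small wrapper that falls back to an in-process dict when no Redis server is reachable; the class name and fallback behaviour here are illustrative, not the project's exact implementation:

```python
class SimpleCache:
    """Tiny get/set cache: Redis when reachable, else an in-process dict."""
    def __init__(self, host="localhost", port=6379, db=0):
        try:
            from redis import Redis
            client = Redis(host=host, port=port, db=db)
            client.ping()  # Redis() connects lazily; force a real round trip
            self.backend = client
        except Exception:  # redis not installed, or server unreachable
            self.backend = {}

    def set(self, key, value):
        if isinstance(self.backend, dict):
            self.backend[key] = value
        else:
            self.backend.set(key, value)

    def get(self, key):
        # note: the Redis backend returns bytes, the dict returns the value as stored
        return self.backend.get(key)
```

The `ping()` call matters because the redis-py client does not open a connection until the first command, so construction alone never fails.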

### `Tasks` (`tasks.py`)
@@ -83,7 +85,7 @@ The system's modular architecture comprises interconnected components, each fulf
    * `translate_to_user_language_task`: Translates the answer back to the original language if needed
    * `display_answer_in_chat_task`: Displays the answer in the chat interface and updates the database
* **Key Considerations:**
    * The `tool_code` blocks within each task contain the actual logic for performing the task. 
    * The `args` dictionaries define how data flows between tasks, specifying which outputs from one task are passed as inputs to another
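The `args`-driven data flow can be sketched as a tiny pipeline runner: each task declares which keys of a shared state it reads, and its result is written back under its own name. The runner and the two toy tasks below are hypothetical, for illustration only:

```python
def run_pipeline(tasks, state):
    """Execute tasks in order; each `args` dict maps parameter names to state keys."""
    for name, func, args in tasks:
        inputs = {param: state[key] for param, key in args.items()}
        state[name] = func(**inputs)  # task output stored under the task name
    return state

tasks = [
    ("translated", lambda text: text.lower(), {"text": "question"}),
    ("answer", lambda q: f"Answer to: {q}", {"q": "translated"}),
]
state = run_pipeline(tasks, {"question": "WHAT IS CRAIG?"})
```

Here the second task consumes the first task's output because its `args` dict points at the `"translated"` key, mirroring how outputs flow between the system's tasks.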

### Tools
@@ -196,7 +198,6 @@ The system's modular architecture comprises interconnected components, each fulf
    ```

*   **Optional Libraries:** 
    *   If you plan to implement the email functionality in the future, you'll also need to install libraries for interacting with the Outlook 365 API (e.g., `requests_oauthlib` and `microsoft-graph`).


@@ -205,7 +206,7 @@ The system's modular architecture comprises interconnected components, each fulf
1.  **Clone the repository:**

    ```bash
    git clone https://code.ornl.gov/6cq/offline-multilingual-question-answering-system
    ```

2.  **Navigate to the project directory:**
@@ -217,7 +218,7 @@ The system's modular architecture comprises interconnected components, each fulf
3.  **Create a virtual environment:**

    ```bash
    python -m venv offlineqa-env  # Create an environment named 'offlineqa-env'
    ```

4.  **Activate the virtual environment:**
@@ -225,13 +226,13 @@ The system's modular architecture comprises interconnected components, each fulf
    *   **On Windows:**

        ```bash
        offlineqa-env\Scripts\activate
        ```

    *   **On macOS/Linux:**

        ```bash
        source offlineqa-env/bin/activate
        ```

5.  **Install dependencies using the provided `requirements.txt` file:**
@@ -330,6 +331,63 @@ The system's modular architecture comprises interconnected components, each fulf

    *   This code snippet demonstrates how to load documents from the `uploads` folder, split them into chunks, generate embeddings using the `all-MiniLM-L6-v2` model (recommended for Vicuna), and store them in a FAISS index.
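For illustration, the chunking step in that snippet can be sketched without LangChain as fixed-size, overlapping character windows, so adjacent chunks share context. The sizes below are illustrative defaults, not the project's actual splitter settings:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars
    return chunks

chunks = chunk_text("a" * 1200)
```

Each chunk would then be embedded with `all-MiniLM-L6-v2` and added to the FAISS index.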

8. **Download the spaCy language model and enable the coherence pipe:**

    ```bash
    python -m spacy download en_core_web_sm
    python -m spacy_experimental.coref.download en  # Download the coreference resolution data
    ```

9. **Set up Redis (if using Redis for caching):**

    *   **On Windows:** Redis does not ship an official native Windows build; install it under WSL (Windows Subsystem for Linux) following the instructions at [https://redis.io/download/](https://redis.io/download/).
    *   **On macOS:**
        ```bash
        brew install redis
        ```
    *   **On Ubuntu:**
        ```bash
        sudo apt update
        sudo apt install redis-server
        ```
    *   **Start the Redis server:** Follow the platform-specific instructions to start the Redis server.

10. **Set up PostgreSQL (if using PostgreSQL for the database):**

    *   **On Windows:** Download and install PostgreSQL from the official website: [https://www.postgresql.org/download/](https://www.postgresql.org/download/). Follow the instructions provided on the website for Windows installation.
    *   **On macOS:**
        ```bash
        brew install postgresql
        brew services start postgresql
        ```
    *   **On Ubuntu:**
        ```bash
        sudo apt update
        sudo apt install postgresql postgresql-contrib
        ```
    *   **Create a database and user:**
        ```bash
        sudo -u postgres psql  # Access PostgreSQL shell
        CREATE DATABASE my_qna_db;
        CREATE USER my_qna_user WITH ENCRYPTED PASSWORD 'your_password';
        GRANT ALL PRIVILEGES ON DATABASE my_qna_db TO my_qna_user;
        \q  # Exit the shell
        ```
    *   Update the database connection details in the `_initialize_database` method in `config.py` with your PostgreSQL credentials.
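For illustration, the settings from the step above can be assembled into a keyword/value DSN string of the form `psycopg2.connect()` accepts; the helper name below is hypothetical, and the credentials are the example values from step 10 (substitute your own):

```python
def build_postgres_dsn(dbname, user, password, host="localhost", port=5432):
    """Assemble a keyword/value DSN string for psycopg2.connect()."""
    return (f"dbname={dbname} user={user} password={password} "
            f"host={host} port={port}")

dsn = build_postgres_dsn("my_qna_db", "my_qna_user", "your_password")
# later, in _initialize_database: conn = psycopg2.connect(dsn)
```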

11. **Start Redis and PostgreSQL servers (if applicable):**

    *   **On Windows:** Use the services management console or the command line to start the Redis and PostgreSQL services.
    *   **On macOS:**
        ```bash
        brew services start redis
        brew services start postgresql
        ```
    *   **On Ubuntu:**
        ```bash
        sudo systemctl start redis-server
        sudo systemctl start postgresql
        ```

### Configuration

@@ -342,7 +400,7 @@ The system's modular architecture comprises interconnected components, each fulf
2.  **Tool Implementations**
    *   In `document_answerer.py`, ensure the  `_run`  method uses your actual offline LLM interface.
    *   Customize the  `SelfCorrectiveAgent`  in  `self_corrective_agent.py`  with your desired evaluation logic and thresholds.
    *   The chat interface logic in  `chat_input_tool.py`  and the  `display_answer_in_chat_task`  in  `tasks.py`  are already implemented using the  `curses`  library.

### Running the System

@@ -391,9 +449,6 @@ This project is licensed under the [MIT License](LICENSE)
    *   If you have domain-specific data, explore fine-tuning the Vicuna-7B LLM to enhance its accuracy and relevance for your particular use case.
*   **Enhance Self-Correction:**
    *   Investigate and implement more advanced techniques for hallucination detection, coherence assessment, and fact-checking to further improve the quality of generated answers.
*   **Expand Document Collection:**
    *   Continuously update and expand your embedded document collection to cover a wider range of topics and domains, making the system more knowledgeable and versatile.
*   **User Feedback Mechanism:**
+33 −6
import logging
import sqlite3

from langchain.embeddings import HuggingFaceEmbeddings
@@ -10,23 +11,49 @@ from transformers import (
    AutoModelForCausalLM,
    pipeline,
)
from redis import Redis  # Import Redis library

# Set up logging
logging.basicConfig(filename='qna_system.log', level=logging.ERROR, 
                    format='%(asctime)s - %(levelname)s - %(filename)s - %(message)s')


class Config:
    def __init__(self):
        # 1. Load Offline Translation Models
        try:
            self.tokenizer, self.models = self._load_translation_models()
        except Exception as e:
            logging.error(f"Error loading translation models: {e}")
            raise

        # 2. Load Embedded Document Collection
        try:
            self.vectorstore = self._load_document_collection()
        except Exception as e:
            logging.error(f"Error loading document collection: {e}")
            raise

        # 3. Initialize Database Connection
        try:
            self.conn, self.cursor = self._initialize_database()
        except Exception as e:
            logging.error(f"Error initializing database: {e}")
            raise

        # 4. Initialize Redis Cache
        try:
            self.cache = Redis(host='localhost', port=6379, db=0)  # Configure Redis connection
            self.cache.ping()  # Redis() connects lazily; ping forces a connection check so errors surface here
        except Exception as e:
            logging.error(f"Error connecting to Redis: {e}")
            raise

        # 5. Load LLM 
        try:
            self.llm = self._load_llm()
        except Exception as e:
            logging.error(f"Error loading LLM: {e}")
            raise

    def _load_translation_models(self):
        # Define language codes and model names 
+77 −46
import time
import queue
import pickle
import logging

from langgraph import Graph

@@ -11,6 +12,10 @@ from tools.document_answerer import DocumentAnswerer
from tools.self_corrective_agent import SelfCorrectiveAgent
from tools.chat_input_tool import ChatInputTool, get_user_input_from_terminal, display_answer_in_terminal

# Set up logging
logging.basicConfig(filename='qna_system.log', level=logging.ERROR, 
                    format='%(asctime)s - %(levelname)s - %(filename)s - %(message)s')

# 3. Main Program

class QuestionAnsweringSystem:
@@ -53,28 +58,46 @@ class QuestionAnsweringSystem:

    def run(self):
        while True:
            try:
                # 1. Check for new chat requests
                self.graph.execute(inputs={
                    "request_queue": self.request_queue, 
                "cache": self.config.cache
                    "cache": self.config.cache,
                    "current_chat_uid": None 
                })

                # 2. Process requests from the queue
                while not self.request_queue.empty():
                    request = self.request_queue.get()
                    _, question, uid, context, _ = request 

                    # 3. Process the chat request
                    try:
                        input_language = language_detection_tool.run(question)
                    except Exception as e:
                        logging.error(f"Error during language detection: {e}")
                        display_answer_in_terminal("Error: Could not detect language.")
                        continue

                    retry_count = 0

                    while retry_count < 3:
                        try:
                            if input_language != 'en':
                                question = translator.run(question) 
                        except Exception as e:
                            logging.error(f"Error during translation to English: {e}")
                            display_answer_in_terminal("Error: Could not translate to English.")
                            break 

                        try:
                            documents = document_retriever.run(question)
                            answer = document_answerer.run(question, documents, context)
                            answer, feedback_or_problems = self_corrective_agent.run(question, answer, documents, retry_count)
                        except Exception as e:
                            logging.error(f"Error during answer generation: {e}")
                            display_answer_in_terminal("Error: Could not generate an answer.")
                            break 

                        if feedback_or_problems is None:
                            break 
@@ -87,19 +110,27 @@ class QuestionAnsweringSystem:
                        documents = document_retriever.run(question, feedback_or_problems)
                        answer = document_answerer.run(question, documents, context)

                    try:
                        if input_language != 'en':
                            answer = translator.run(answer, target_language=input_language) 
                    except Exception as e:
                        logging.error(f"Error during translation to original language: {e}")
                        display_answer_in_terminal("Error: Could not translate to original language.")
                        continue 

                    # Display the answer in the chat interface
                    display_answer_in_terminal(answer)

                    # Update cache with context
                    # redis-py accepts str/bytes values, not dicts, so serialize the context
                    # (requires `import pickle` at the top of this file)
                    self.config.cache.set(
                        uid,
                        pickle.dumps({"question": question, "documents": documents, "answer": answer}),
                    )

                    # Update DB for chat interactions
                    status = "answered" if feedback_or_problems is None else "answered_with_problems"
                    self.tasks.update_interaction_in_db(uid, answer, status)

            except Exception as e:
                logging.error(f"An unexpected error occurred: {e}")

            time.sleep(1)  # Adjust the interval as needed

if __name__ == "__main__":