Commit 24b6a4f9 authored by Harris, Tyrone

Added Redis caching, PostgreSQL support, and error handling; improved the chat UI

parent c241fa3a
+0 −0

File moved.

+1 −1

File changed and moved.


+131 −76
@@ -33,17 +33,19 @@ This system is designed to provide a seamless and informative question-answering

## Features

* Offline Operation: Functions entirely without an internet connection, ensuring data privacy and availability.
* Multilingual Support: Handles questions and provides answers in multiple languages (Spanish, French, German, Thai, Russian, Arabic, Portuguese, Mandarin).
* Contextual Understanding: Maintains conversation history within chat sessions to provide more relevant and coherent responses to follow-up questions.
* Self-Correction: Employs a retry mechanism to iteratively refine answers, minimizing hallucinations and improving accuracy.
* Terminal-Based Chat Interface: Offers a user-friendly, real-time chat interface for interaction.
* UID Tracking & Database: Assigns unique identifiers to each interaction, facilitating tracking, analysis, and debugging.
* Caching: Enhances performance by storing and reusing previous results.
* Document Referencing: Provides transparency by citing the sources used to generate answers.
* Efficient Multilingual Tokenizer: Utilizes SentencePiece for efficient handling of multiple languages.
* Offline LLM: Leverages the Vicuna-7B model for powerful language understanding and answer generation capabilities in an offline setting.
* Queue-Based Processing: Handles multiple chat requests concurrently, ensuring fair and efficient processing.
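The self-correction feature above boils down to a bounded retry loop: generate an answer, ask a critic for problems, and regenerate with that feedback. A minimal sketch, with hypothetical `generate` and `critique` stand-ins rather than the project's actual tools:

```python
def answer_with_retries(question, generate, critique, max_retries=3):
    """Iteratively refine an answer until the critic reports no problems."""
    feedback = None
    answer = None
    for _ in range(max_retries):
        answer = generate(question, feedback)  # feedback steers regeneration
        feedback = critique(question, answer)  # None means the answer passed
        if feedback is None:
            break
    return answer, feedback

# Toy stand-ins: the critic rejects any answer shorter than 5 characters.
def generate(question, feedback):
    return "hi" if feedback is None else "a longer answer"

def critique(question, answer):
    return None if len(answer) >= 5 else "too short"
```

The real system's loop additionally re-retrieves documents using the critic's feedback before regenerating.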

![image](/CRAIG_Graph.svg)

## System Architecture

@@ -67,7 +69,7 @@ The system's modular architecture comprises interconnected components, each fulf
    * Loads offline translation models (MarianMT) for supported language pairs using the SentencePiece tokenizer
    * Loads the local embedded document collection using the specified embedding model (sentence-transformers/all-MiniLM-L6-v2)
    * Initializes the database connection and creates the necessary table
    * Sets up a Redis cache
    * Loads the offline LLM (Vicuna-7B)
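The cache setup can be sketched with a small wrapper that falls back to an in-process dict when no Redis server is reachable; the class name and fallback behaviour here are illustrative, not the project's exact implementation:

```python
class SimpleCache:
    """Tiny get/set cache: Redis when reachable, else an in-process dict."""
    def __init__(self, host="localhost", port=6379, db=0):
        try:
            from redis import Redis
            client = Redis(host=host, port=port, db=db)
            client.ping()  # Redis() connects lazily; force a real round trip
            self.backend = client
        except Exception:  # redis not installed, or server unreachable
            self.backend = {}

    def set(self, key, value):
        if isinstance(self.backend, dict):
            self.backend[key] = value
        else:
            self.backend.set(key, value)

    def get(self, key):
        # note: the Redis backend returns bytes, the dict returns the value as stored
        return self.backend.get(key)
```

The `ping()` call matters because the redis-py client does not open a connection until the first command, so construction alone never fails.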

### `Tasks` (`tasks.py`)
@@ -83,7 +85,7 @@ The system's modular architecture comprises interconnected components, each fulf
    * `translate_to_user_language_task`: Translates the answer back to the original language if needed
    * `display_answer_in_chat_task`: Displays the answer in the chat interface and updates the database
* **Key Considerations:**
    * The `tool_code` blocks within each task contain the actual logic for performing the task. 
    * The `args` dictionaries define how data flows between tasks, specifying which outputs from one task are passed as inputs to another
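The `args`-driven data flow can be sketched as a tiny pipeline runner: each task declares which keys of a shared state it reads, and its result is written back under its own name. The runner and the two toy tasks below are hypothetical, for illustration only:

```python
def run_pipeline(tasks, state):
    """Execute tasks in order; each `args` dict maps parameter names to state keys."""
    for name, func, args in tasks:
        inputs = {param: state[key] for param, key in args.items()}
        state[name] = func(**inputs)  # task output stored under the task name
    return state

tasks = [
    ("translated", lambda text: text.lower(), {"text": "question"}),
    ("answer", lambda q: f"Answer to: {q}", {"q": "translated"}),
]
state = run_pipeline(tasks, {"question": "WHAT IS CRAIG?"})
```

Here the second task consumes the first task's output because its `args` dict points at the `"translated"` key, mirroring how outputs flow between the system's tasks.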

### Tools
@@ -196,7 +198,6 @@ The system's modular architecture comprises interconnected components, each fulf
    ```

*   **Optional Libraries:** 
    *   If you plan to implement the email functionality in the future, you'll also need to install libraries for interacting with the Outlook 365 API (e.g., `requests_oauthlib` and `microsoft-graph`).


@@ -205,7 +206,7 @@ The system's modular architecture comprises interconnected components, each fulf
1.  **Clone the repository:**

    ```bash
    git clone https://code.ornl.gov/6cq/offline-multilingual-question-answering-system
    ```

2.  **Navigate to the project directory:**
@@ -217,7 +218,7 @@ The system's modular architecture comprises interconnected components, each fulf
3.  **Create a virtual environment:**

    ```bash
    python -m venv offlineqa-env  # Create an environment named 'offlineqa-env'
    ```

4.  **Activate the virtual environment:**
@@ -225,13 +226,13 @@ The system's modular architecture comprises interconnected components, each fulf
    *   **On Windows:**

        ```bash
        offlineqa-env\Scripts\activate
        ```

    *   **On macOS/Linux:**

        ```bash
        source offlineqa-env/bin/activate
        ```

5.  **Install dependencies using the provided `requirements.txt` file:**
@@ -330,6 +331,63 @@ The system's modular architecture comprises interconnected components, each fulf

    *   This code snippet demonstrates how to load documents from the `uploads` folder, split them into chunks, generate embeddings using the `all-MiniLM-L6-v2` model (recommended for Vicuna), and store them in a FAISS index.
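For illustration, the chunking step in that snippet can be sketched without LangChain as fixed-size, overlapping character windows, so adjacent chunks share context. The sizes below are illustrative defaults, not the project's actual splitter settings:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping `overlap` chars
    return chunks

chunks = chunk_text("a" * 1200)
```

Each chunk would then be embedded with `all-MiniLM-L6-v2` and added to the FAISS index.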

8. **Download the spaCy language model and enable the coherence pipe:**

    ```bash
    python -m spacy download en_core_web_sm
    python -m spacy_experimental.coref.download en  # Download the coreference resolution data
    ```

9. **Set up Redis (if using Redis for caching):**

    *   **On Windows:** Redis does not ship an official native Windows build; install it under WSL (Windows Subsystem for Linux) following the instructions at [https://redis.io/download/](https://redis.io/download/).
    *   **On macOS:**
        ```bash
        brew install redis
        ```
    *   **On Ubuntu:**
        ```bash
        sudo apt update
        sudo apt install redis-server
        ```
    *   **Start the Redis server:** Follow the platform-specific instructions to start the Redis server.

10. **Set up PostgreSQL (if using PostgreSQL for the database):**

    *   **On Windows:** Download and install PostgreSQL from the official website: [https://www.postgresql.org/download/](https://www.postgresql.org/download/). Follow the instructions provided on the website for Windows installation.
    *   **On macOS:**
        ```bash
        brew install postgresql
        brew services start postgresql
        ```
    *   **On Ubuntu:**
        ```bash
        sudo apt update
        sudo apt install postgresql postgresql-contrib
        ```
    *   **Create a database and user:**
        ```bash
        sudo -u postgres psql  # Access PostgreSQL shell
        CREATE DATABASE my_qna_db;
        CREATE USER my_qna_user WITH ENCRYPTED PASSWORD 'your_password';
        GRANT ALL PRIVILEGES ON DATABASE my_qna_db TO my_qna_user;
        \q  # Exit the shell
        ```
    *   Update the database connection details in the `_initialize_database` method in `config.py` with your PostgreSQL credentials.
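For illustration, the settings from the step above can be assembled into a keyword/value DSN string of the form `psycopg2.connect()` accepts; the helper name below is hypothetical, and the credentials are the example values from step 10 (substitute your own):

```python
def build_postgres_dsn(dbname, user, password, host="localhost", port=5432):
    """Assemble a keyword/value DSN string for psycopg2.connect()."""
    return (f"dbname={dbname} user={user} password={password} "
            f"host={host} port={port}")

dsn = build_postgres_dsn("my_qna_db", "my_qna_user", "your_password")
# later, in _initialize_database: conn = psycopg2.connect(dsn)
```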

11. **Start Redis and PostgreSQL servers (if applicable):**

    *   **On Windows:** Use the services management console or the command line to start the Redis and PostgreSQL services.
    *   **On macOS:**
        ```bash
        brew services start redis
        brew services start postgresql
        ```
    *   **On Ubuntu:**
        ```bash
        sudo systemctl start redis-server
        sudo systemctl start postgresql
        ```

### Configuration

@@ -342,7 +400,7 @@ The system's modular architecture comprises interconnected components, each fulf
2.  **Tool Implementations**
    *   In `document_answerer.py`, ensure the  `_run`  method uses your actual offline LLM interface.
    *   Customize the  `SelfCorrectiveAgent`  in  `self_corrective_agent.py`  with your desired evaluation logic and thresholds.
    *   The chat interface logic in  `chat_input_tool.py`  and the  `display_answer_in_chat_task`  in  `tasks.py`  are already implemented using the  `curses`  library.

### Running the System

@@ -391,9 +449,6 @@ This project is licensed under the [MIT License](LICENSE)
    *   If you have domain-specific data, explore fine-tuning the Vicuna-7B LLM to enhance its accuracy and relevance for your particular use case.
*   **Enhance Self-Correction:**
    *   Investigate and implement more advanced techniques for hallucination detection, coherence assessment, and fact-checking to further improve the quality of generated answers.
*   **Expand Document Collection:**
    *   Continuously update and expand your embedded document collection to cover a wider range of topics and domains, making the system more knowledgeable and versatile.
*   **User Feedback Mechanism:**
+33 −6
import logging
import sqlite3

from langchain.embeddings import HuggingFaceEmbeddings
@@ -10,23 +11,49 @@ from transformers import (
    AutoModelForCausalLM,
    pipeline,
)
from redis import Redis  # Import Redis library

# Set up logging
logging.basicConfig(filename='qna_system.log', level=logging.ERROR, 
                    format='%(asctime)s - %(levelname)s - %(filename)s - %(message)s')


class Config:
    def __init__(self):
        # 1. Load Offline Translation Models
        try:
            self.tokenizer, self.models = self._load_translation_models()
        except Exception as e:
            logging.error(f"Error loading translation models: {e}")
            raise

        # 2. Load Embedded Document Collection
        try:
            self.vectorstore = self._load_document_collection()
        except Exception as e:
            logging.error(f"Error loading document collection: {e}")
            raise

        # 3. Initialize Database Connection
        try:
            self.conn, self.cursor = self._initialize_database()
        except Exception as e:
            logging.error(f"Error initializing database: {e}")
            raise

        # 4. Initialize Redis Cache
        try:
            self.cache = Redis(host='localhost', port=6379, db=0)  # Configure Redis connection
            self.cache.ping()  # Redis() connects lazily; ping forces a connection check so errors surface here
        except Exception as e:
            logging.error(f"Error connecting to Redis: {e}")
            raise

        # 5. Load LLM 
        try:
            self.llm = self._load_llm()
        except Exception as e:
            logging.error(f"Error loading LLM: {e}")
            raise

    def _load_translation_models(self):
        # Define language codes and model names 
+77 −46
import time
import queue
import pickle
import logging

from langgraph import Graph

@@ -11,6 +12,10 @@ from tools.document_answerer import DocumentAnswerer
from tools.self_corrective_agent import SelfCorrectiveAgent
from tools.chat_input_tool import ChatInputTool, get_user_input_from_terminal, display_answer_in_terminal

# Set up logging
logging.basicConfig(filename='qna_system.log', level=logging.ERROR, 
                    format='%(asctime)s - %(levelname)s - %(filename)s - %(message)s')

# 3. Main Program

class QuestionAnsweringSystem:
@@ -53,28 +58,46 @@ class QuestionAnsweringSystem:

    def run(self):
        while True:
            try:
                # 1. Check for new chat requests
                self.graph.execute(inputs={
                    "request_queue": self.request_queue, 
                "cache": self.config.cache
                    "cache": self.config.cache,
                    "current_chat_uid": None 
                })

                # 2. Process requests from the queue
                while not self.request_queue.empty():
                    request = self.request_queue.get()
                    _, question, uid, context, _ = request 

                    # 3. Process the chat request
                    try:
                        input_language = language_detection_tool.run(question)
                    except Exception as e:
                        logging.error(f"Error during language detection: {e}")
                        display_answer_in_terminal("Error: Could not detect language.")
                        continue

                    retry_count = 0

                    while retry_count < 3:
                        try:
                            if input_language != 'en':
                                question = translator.run(question) 
                        except Exception as e:
                            logging.error(f"Error during translation to English: {e}")
                            display_answer_in_terminal("Error: Could not translate to English.")
                            break 

                        try:
                            documents = document_retriever.run(question)
                            answer = document_answerer.run(question, documents, context)
                            answer, feedback_or_problems = self_corrective_agent.run(question, answer, documents, retry_count)
                        except Exception as e:
                            logging.error(f"Error during answer generation: {e}")
                            display_answer_in_terminal("Error: Could not generate an answer.")
                            break 

                        if feedback_or_problems is None:
                            break 
@@ -87,19 +110,27 @@ class QuestionAnsweringSystem:
                        documents = document_retriever.run(question, feedback_or_problems)
                        answer = document_answerer.run(question, documents, context)

                    try:
                        if input_language != 'en':
                            answer = translator.run(answer, target_language=input_language) 
                    except Exception as e:
                        logging.error(f"Error during translation to original language: {e}")
                        display_answer_in_terminal("Error: Could not translate to original language.")
                        continue 

                    # Display the answer in the chat interface
                    display_answer_in_terminal(answer)

                    # Update cache with context
                    # redis-py accepts str/bytes values, not dicts, so serialize the context
                    # (requires `import pickle` at the top of this file)
                    self.config.cache.set(
                        uid,
                        pickle.dumps({"question": question, "documents": documents, "answer": answer}),
                    )

                    # Update DB for chat interactions
                    status = "answered" if feedback_or_problems is None else "answered_with_problems"
                    self.tasks.update_interaction_in_db(uid, answer, status)

            except Exception as e:
                logging.error(f"An unexpected error occurred: {e}")

            time.sleep(1)  # Adjust the interval as needed

if __name__ == "__main__":