Commit c241fa3a authored by Harris, Tyrone

updated tasks

parent 6153aa18
README.md +90 −52
## Offline Multilingual Question Answering System
# Offline Multilingual Question Answering System

This comprehensive system enables users to interact via a terminal-based chat interface, posing questions in multiple languages and receiving accurate, contextually relevant answers, all without requiring an internet connection. By leveraging natural language processing, information retrieval, intelligent agents, and offline models, this system prioritizes data privacy and accessibility even in disconnected environments.
This comprehensive system enables users to interact via a terminal-based chat interface, posing questions in multiple languages and receiving accurate, contextually relevant answers in the same language, all without requiring an internet connection. By leveraging natural language processing, information retrieval, intelligent agents, and offline models, this system prioritizes data privacy and accessibility even in disconnected environments.

### Table of Contents
## Table of Contents

- [Introduction](#introduction)
- [Features](#features)
@@ -45,11 +45,9 @@ This system is designed to provide a seamless and informative question-answering
- **Offline LLM:** Leverages the Vicuna-7B model for powerful language understanding and answer generation capabilities in an offline setting.
- **Queue-Based Processing:** Handles multiple chat requests concurrently, ensuring fair and efficient processing.
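
A minimal sketch of the queue contract this feature implies, based on the 5-tuple unpacked in the main loop later in this commit (the field names are assumptions):

```python
import queue

# Hypothetical request layout: (source, question, uid, context, user_email)
request_queue = queue.Queue()
request_queue.put(("chat", "¿Cómo funciona el sistema sin conexión?", "req-001", None, None))

source, question, uid, context, user_email = request_queue.get()
print(uid, question)
```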

![image](/CRAIG_Graph-2024-10-01-035038.svg)

## System Architecture

The system is built upon a modular architecture, orchestrated using Langgraph, a declarative workflow management framework. It consists of several key components, each responsible for a specific task in the question-answering process:
The system's modular architecture comprises interconnected components, each fulfilling a specific role in the question-answering process.

*   **`Config`**: Centralizes configuration and setup, including loading models, database connection, cache initialization.
*   **`Tasks`**: Defines the Langgraph tasks and their dependencies, representing the system's workflow.
@@ -76,8 +74,6 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

*   **Purpose:** Defines the Langgraph tasks and their dependencies, forming the system's workflow
*   **Tasks**
    *   `fetch_new_emails_task`:  (Placeholder for future email integration)
    *   `check_registry_task`:  (Placeholder for future email integration)
    *   `chat_input_task`: Gets user input from the chat interface and adds it to the request queue
    *   `detect_language_task`: Detects the language of the input question
    *   `translate_to_english_task`: Translates the question to English if needed
@@ -137,7 +133,7 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

## Getting Started

## Prerequisites
### Prerequisites

*   **Python 3.x:**  Ensure you have Python 3.x installed on your system. You can download it from the official Python website: [https://www.python.org/](https://www.python.org/)

@@ -165,8 +161,30 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

*   **Embedded Document Collection:** 
    *   Prepare your document collection in a suitable format (e.g., plain text files).
    *   Embed the documents using an offline embedding model like **Sentence Transformers**. 
    *   Store the embeddings in a local vector database (e.g., FAISS) for efficient retrieval.
    *   **Generate embeddings using Sentence Transformers:**

        ```python
        from langchain.document_loaders import TextLoader
        from langchain.text_splitter import RecursiveCharacterTextSplitter
        from langchain.vectorstores import FAISS
        from langchain.embeddings import HuggingFaceEmbeddings

        # 1. Load and preprocess documents
        loader = TextLoader('./data/documents/your_document.txt')  # Replace with your actual document path
        documents = loader.load()

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
        texts = text_splitter.split_documents(documents)

        # 2. Generate embeddings (all-MiniLM-L6-v2 is recommended for Vicuna)
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        db = FAISS.from_documents(texts, embeddings)

        # 3. Save the vectorstore (FAISS index) to disk
        db.save_local("data/faiss_index")
        ```

    *   This code snippet demonstrates how to load documents, split them into chunks, generate embeddings using the `all-MiniLM-L6-v2` model (recommended for Vicuna), and store them in a FAISS index.
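    *   As a quick sanity check, you can reload the saved index and query it. A minimal sketch, assuming the `data/faiss_index` path and embedding model from the snippet above:

        ```python
        from langchain.vectorstores import FAISS
        from langchain.embeddings import HuggingFaceEmbeddings

        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        db = FAISS.load_local("data/faiss_index", embeddings)

        # Retrieve the top-3 chunks most similar to a test question
        for doc in db.similarity_search("What does the system do offline?", k=3):
            print(doc.metadata, doc.page_content[:80])
        ```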

*   **SQLite3:** 
    *   Ensure you have SQLite3 installed on your system. It's usually included by default in most Python distributions. If not, you can install it using your package manager (e.g., `apt-get install sqlite3` on Debian/Ubuntu or `brew install sqlite3` on macOS).
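    *   To confirm availability from Python (the `sqlite3` module ships with CPython; the package-manager step installs the command-line shell), a quick check:

        ```python
        import sqlite3
        print(sqlite3.sqlite_version)  # version of the bundled SQLite engine
        ```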
@@ -181,12 +199,13 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
    *   If you plan to implement the email functionality in the future, you'll also need to install libraries for interacting with the Outlook 365 API (e.g., `requests_oauthlib` and `microsoft-graph`).
    *   If you choose a different caching solution than the basic in-memory cache, install the necessary library for that (e.g., `redis` for Redis).
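    *   For reference, any object exposing `get`/`set` will satisfy the cache calls made elsewhere in this commit. A minimal in-memory stand-in (a sketch, not the project's actual cache class):

        ```python
        class SimpleCache:
            """Tiny in-memory cache with the get/set interface the tasks expect."""

            def __init__(self):
                self._store = {}

            def get(self, key):
                return self._store.get(key)  # None on a cache miss

            def set(self, key, value):
                self._store[key] = value
        ```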


### Installation

1.  **Clone the repository:**

    ```bash
    git clone https://code.ornl.gov/6cq/offline-multilingual-question-answering-system
    ```

2.  **Navigate to the project directory:**
@@ -221,7 +240,7 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
    pip install -r requirements.txt
    ```

    This will install all the necessary Python packages listed in `requirements.txt` within the virtual environment, ensuring that your project has its own isolated set of dependencies. 
    This will install all the necessary Python packages listed in `requirements.txt` within the virtual environment.

6.  **Download and organize models:**

@@ -256,10 +275,9 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
            MarianMTModel.from_pretrained(from_en_model_name, cache_dir="./models/translation")
        ```


    *   **Download the LLM (Vicuna-7B):**

    *   **Ensure you have enough disk space:** The Vicuna-7B model is quite large (around 13GB). Make sure you have sufficient disk space available before downloading
        *   **Ensure you have enough disk space:** The Vicuna-7B model is quite large (around 13GB). Make sure you have sufficient disk space available before downloading.
        *   **Use the `transformers` library to download:** 

            ```python
@@ -269,7 +287,7 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
            model = AutoModelForCausalLM.from_pretrained("TheBloke/Vicuna-7B-v1.5-GGUF", cache_dir="./models/llm")
            ```

        This code will download both the tokenizer and the model weights to the `./models/llm` directory
            This code will download both the tokenizer and the model weights to the `./models/llm` directory.

        *   **Consider using a quantized version:** If you're running the system on a CPU or have limited memory, you might want to explore using a quantized version of Vicuna-7B, which can significantly reduce its memory footprint and improve inference speed. Refer to the Vicuna-7B model documentation for instructions on how to obtain and use a quantized version.
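        *   As one option, a GGUF build can be loaded on CPU with the `ctransformers` package rather than `transformers`. This is a sketch only; the `model_file` name below is an assumption, so pick an actual file from the model repository:

            ```python
            from ctransformers import AutoModelForCausalLM

            llm = AutoModelForCausalLM.from_pretrained(
                "TheBloke/vicuna-7B-v1.5-GGUF",
                model_file="vicuna-7b-v1.5.Q4_K_M.gguf",  # assumed quantization file name
                model_type="llama",
            )
            print(llm("Q: What is an offline QA system?\nA:", max_new_tokens=64))
            ```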

@@ -283,24 +301,46 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

        This code will download the SentencePiece tokenizer to the `./models/embedding` directory.

*   **Prepare your document collection:**
7.  **Prepare your document collection:**

    1.  **Create a document upload folder:** 
        * Create a folder named `uploads` at the root level of the project. This is where users will upload their documents.
    2.  **Generate embeddings using Sentence Transformers:**

    *   Organize your documents in a suitable format (e.g., plain text files) within the `data/documents` directory.
    *   Use an offline embedding model (e.g., Sentence Transformers) to generate embeddings for your documents and store them in a local vector database (e.g., FAISS) for efficient retrieval. You can refer to the Langchain documentation or other resources for guidance on how to perform document embedding and indexing.
        ```python
        from langchain.document_loaders import DirectoryLoader, TextLoader
        from langchain.text_splitter import RecursiveCharacterTextSplitter
        from langchain.vectorstores import FAISS
        from langchain.embeddings import HuggingFaceEmbeddings

        # 1. Load and preprocess documents from the 'uploads' folder
        loader = DirectoryLoader('./uploads', glob="**/*.txt", loader_cls=TextLoader)  # Load all .txt files with the plain-text loader
        documents = loader.load()

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
        texts = text_splitter.split_documents(documents)

        # 2. Generate embeddings (all-MiniLM-L6-v2 is recommended for Vicuna)
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        db = FAISS.from_documents(texts, embeddings)

        # 3. Save the vectorstore (FAISS index) to disk
        db.save_local("data/faiss_index")
        ```

    *   This code snippet demonstrates how to load documents from the `uploads` folder, split them into chunks, generate embeddings using the `all-MiniLM-L6-v2` model (recommended for Vicuna), and store them in a FAISS index.


### Configuration

1.  **`config.py`**
    *   Update the `language_pairs` dictionary in the `Config` class with the actual paths to your downloaded translation models (a hypothetical example follows this list).
    *   Replace the placeholder in `_load_document_collection` with your actual code to load your embedded document collection.
    *   Ensure the path to your FAISS index in `_load_document_collection` is correct.
    *   Configure the database connection details in `_initialize_database` if you're using a different database system.
    *   Ensure you have the SentencePiece tokenizer downloaded and specify its path in `_load_translation_models`.

2.  **Tool Implementations**
    *   In `document_answerer.py`, replace the `OpenAI` placeholder with your actual offline LLM interface.
    *   In `document_answerer.py`, ensure the `_run` method uses your actual offline LLM interface.
    *   Customize the `SelfCorrectiveAgent` in `self_corrective_agent.py` with your desired evaluation logic and thresholds.
    *   Implement the chat interface logic in `chat_input_tool.py` and the `display_answer_in_chat_task` in `tasks.py` using the `curses` library or a similar approach.
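
For orientation, the `language_pairs` mapping from step 1 might look like this hypothetical example (paths and language codes are placeholders; match them to the MarianMT models you actually downloaded):

```python
# Hypothetical example only; keys and paths must match your downloaded models
language_pairs = {
    ("es", "en"): "./models/translation/opus-mt-es-en",
    ("en", "es"): "./models/translation/opus-mt-en-es",
}
```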

@@ -358,5 +398,3 @@ This project is licensed under the [MIT License](LICENSE)
    *   Continuously update and expand your embedded document collection to cover a wider range of topics and domains, making the system more knowledgeable and versatile.
*   **User Feedback Mechanism:**
    *   Incorporate a mechanism to collect user feedback on the quality and relevance of answers. This feedback can be used to further fine-tune models and improve the system's overall performance.

config.py +12 −5
import sqlite3
from transformers import MarianMTModel, MarianTokenizer, AutoTokenizer, AutoModelForCausalLM, pipeline

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFacePipeline
from langchain.vectorstores import FAISS
from transformers import (
    MarianMTModel,
    MarianTokenizer,
    AutoTokenizer,
    AutoModelForCausalLM,
    pipeline,
)

class Config:
    def __init__(self):
@@ -50,15 +57,15 @@ class Config:
    def _load_document_collection(self):
        # Load your local embedded document collection 
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        return FAISS.load_local("my_faiss_index", embeddings)
        return FAISS.load_local("data/faiss_index", embeddings)

    def _initialize_database(self):
        conn = sqlite3.connect('email_interactions.db')
        conn = sqlite3.connect('chat_interactions.db')  # Local SQLite store for chat interactions
        cursor = conn.cursor()

        # Create the email_interactions table 
        # Create the chat_interactions table 
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS email_interactions (
            CREATE TABLE IF NOT EXISTS chat_interactions (
                uid TEXT PRIMARY KEY,
                timestamp DATETIME,
                question TEXT,
main.py +29 −14
@@ -5,6 +5,11 @@ from langgraph import Graph

from config import Config
from tasks import Tasks
from tools.language_detection_tool import LanguageDetectionTool
from tools.translator import Translator
from tools.document_answerer import DocumentAnswerer
from tools.document_retriever import DocumentRetriever  # assumed module path, mirroring the other tool imports
from tools.self_corrective_agent import SelfCorrectiveAgent
from tools.chat_input_tool import ChatInputTool, get_user_input_from_terminal, display_answer_in_terminal

# 3. Main Program

@@ -18,7 +23,7 @@ class QuestionAnsweringSystem:
    def _build_graph(self):
        graph = Graph()

        # Add tasks to the graph (excluding email-related tasks)
        # Add tasks to the graph 
        graph.add_tasks([
            self.tasks.chat_input_task(), 
            self.tasks.detect_language_task(),
@@ -48,25 +53,28 @@ class QuestionAnsweringSystem:

    def run(self):
        while True:
            # 1. Check for new chat requests (no email checks anymore)
            self.agent_executor.run("check_chat")
            # 1. Check for new chat requests
            self.graph.execute(inputs={
                "request_queue": self.request_queue, 
                "cache": self.config.cache
            })

            # 2. Process requests from the queue
            while not self.request_queue.empty():
                request = self.request_queue.get()
                _, data, uid, context, _ = request  # Ignore 'source' and 'user_email' for now
                _, question, uid, context, _ = request  # Ignore 'source' and 'user_email' 

                # 3. Process the chat request
                input_language = language_detection_tool.run(data)
                input_language = language_detection_tool.run(question)
                retry_count = 0

                while retry_count < 3:
                    if input_language != 'en':
                        data = translator.run(data) 
                        question = translator.run(question) 

                    documents = document_retriever.run(data)
                    answer = document_answerer.run(data, documents, context)
                    answer, feedback_or_problems = self_corrective_agent.run(data, answer, documents, retry_count)
                    documents = document_retriever.run(question)
                    answer = document_answerer.run(question, documents, context)
                    answer, feedback_or_problems = self_corrective_agent.run(question, answer, documents, retry_count)

                    if feedback_or_problems is None:
                        break 
@@ -76,25 +84,32 @@ class QuestionAnsweringSystem:
                        break

                    retry_count += 1
                    documents = document_retriever.run(data, feedback_or_problems)
                    answer = document_answerer.run(data, documents, context)
                    documents = document_retriever.run(question, feedback_or_problems)
                    answer = document_answerer.run(question, documents, context)

                if input_language != 'en':
                    answer = translator.run(answer, target_language=input_language) 

                # Display the answer in the chat interface
                display_answer_in_chat(answer)
                display_answer_in_terminal(answer)

                # Update cache with context for potential follow-up questions
                cache.set(uid, {"question": data, "documents": documents, "answer": answer})
                self.config.cache.set(uid, {"question": question, "documents": documents, "answer": answer})

                # Update DB for chat interactions
                status = "answered" if feedback_or_problems is None else "answered_with_problems"
                update_interaction_in_db(uid, answer, status)
                self.tasks.update_interaction_in_db(uid, answer, status)

            time.sleep(1)  # Adjust the interval as needed

if __name__ == "__main__":
    config = Config()

    # Initialize tools
    language_detection_tool = LanguageDetectionTool()
    translator = Translator(config.cache, config.tokenizer, config.models)
    document_retriever = DocumentRetriever(config.vectorstore)  # assumed constructor; its run() is called in the loop above
    document_answerer = DocumentAnswerer(config.vectorstore, config.cache, config.llm)
    self_corrective_agent = SelfCorrectiveAgent()

    system = QuestionAnsweringSystem(config)
    system.run()
requirements.txt +1 −5
# Core Libraries
langgraph==0.0.13  # Latest version as of Sept 30, 2024
langgraph==0.0.13 
sentence-transformers==2.2.2
spacy==3.6.1
langdetect==1.0.9
@@ -13,10 +13,6 @@ transformers==4.31.0
# - Redis: 
# redis

# Email Handling (If implementing Outlook 365 integration in the future)
requests_oauthlib==1.3.1
microsoft-graph==0.60.0

# Offline LLM (Vicuna-7B)
torch==2.1.0
accelerate==0.23.0
tasks.py +132 −15
@@ -2,6 +2,10 @@ import uuid
import curses  # For the chat interface

from langgraph import Task, tool_code
from typing import List, Optional  # needed for the type hints in the helpers below

from langdetect import detect
from tools.translator import Translator
from tools.self_corrective_agent import SelfCorrectiveAgent
from langchain.chains import RetrievalQA

class Tasks:
    def __init__(self, config):
@@ -56,7 +60,7 @@ class Tasks:
                    return cached_translation

                # 2. Perform translation if not cached
                translation = translator.run(question) 
                translation = translator.run(question)  # Use the translator tool

                # 3. Store translation in cache
                cache.set(question, translation)
@@ -86,7 +90,7 @@ class Tasks:
            tool_code=tool_code(
                """
                # Generate answer using the offline LLM
                answer = generate_answer_from_llm(query, docs, context)
                answer = generate_answer_from_llm(query, docs, context, llm)

                # Format references
                references = [f"Document: {doc.metadata['title']}" for doc in docs]
@@ -150,25 +154,77 @@ class Tasks:

                # Update DB for chat interactions
                status = "answered" if problems is None else "answered_with_problems"
                update_interaction_in_db(uid, answer, status)
                update_interaction_in_db(uid, answer, status, cursor, conn)
                """
            ),
            args={"answer": "{answer}", "problems": "{problems}", "uid": "{uid}", "cursor": self.config.cursor, "conn": self.config.conn},
        )

# Helper function for chat interface (using curses)
# Helper functions for chat interface (using curses)
def get_user_input_from_terminal():
    # ... (Implementation using curses to get user input)
    pass
    # Initialize curses
    stdscr = curses.initscr()
    curses.cbreak()  # Get characters immediately
    curses.noecho()  # Don't echo user input
    stdscr.keypad(True)  # Enable special keys

    # Create chat window and input area
    height, width = stdscr.getmaxyx()
    chat_win = curses.newwin(height - 3, width, 0, 0)  # Leave space for input
    input_win = curses.newwin(3, width, height - 3, 0)
    chat_win.scrollok(True)  # Enable scrolling in chat window

    # Display welcome message
    display_message(chat_win, "Welcome to the Offline Q&A Chatbot!")

    # Get user input
    user_input = get_input(input_win)

    if user_input:
        # Display user input in the chat window
        display_message(chat_win, f"You: {user_input}")

    return user_input.strip()

def get_input(input_win):
    input_win.clear()
    input_win.addstr(1, 0, "You: ")
    input_win.refresh()

    user_input = ""
    while True:
        key = input_win.getch()
        if key == curses.KEY_ENTER or key in [10, 13]:  # Enter key
            break
        elif key == curses.KEY_BACKSPACE or key == 127:  # Backspace
            if user_input:
                user_input = user_input[:-1]
                y, x = input_win.getyx()
                input_win.delch(y, x - 1)
                input_win.refresh()
        elif 32 <= key < 256:
            # Printable byte; multi-byte UTF-8 input would need input_win.get_wch()
            user_input += chr(key)
            input_win.addch(key)
            input_win.refresh()

    return user_input.strip()

def display_answer_in_terminal(answer):
    # Minimal sketch: restore the terminal that the curses input left active, then print
    curses.endwin()
    print(f"Bot: {answer}")

def store_interaction_in_db(uid, question=None, answer=None, status="received", other_metadata=None):
def display_message(chat_win, message):
    chat_win.addstr(message + "\n")
    chat_win.refresh()

    # Scroll the chat window if necessary
    max_y, _ = chat_win.getmaxyx()
    if chat_win.getyx()[0] >= max_y - 1:
        chat_win.scroll(1)

def store_interaction_in_db(uid, question=None, answer=None, status="received", other_metadata=None, cursor=None, conn=None):
    """
    This function should implement the logic to store the interaction details in your database.
    You'll likely need to use your database cursor to execute an INSERT query.
    This function should implement the logic to store the interaction details in your database. You'll likely need to use your database cursor to execute an INSERT query.

    Args:
        uid: The unique identifier for the interaction.
@@ -177,14 +233,13 @@ def store_interaction_in_db(uid, question=None, answer=None, status="received",
        status: The status of the interaction (default is "received").
        other_metadata: Any additional metadata you want to store (optional).
        cursor: Database cursor used to execute the INSERT.
        conn: Database connection used to commit the transaction.
    """

    cursor.execute('''
        INSERT INTO email_interactions (uid, timestamp, question, answer, status, other_metadata)
        INSERT INTO chat_interactions (uid, timestamp, question, answer, status, other_metadata)
        VALUES (?, datetime('now'), ?, ?, ?, ?)
    ''', (uid, question, answer, status, other_metadata))
    conn.commit()

def update_interaction_in_db(uid, answer, status):
def update_interaction_in_db(uid, answer, status, cursor, conn):
    """
    This function should implement the logic to update an existing interaction in your database.
    You'll likely need to use your database cursor to execute an UPDATE query.
@@ -194,10 +249,72 @@ def update_interaction_in_db(uid, answer, status):
        answer: The updated answer to be stored.
        status: The updated status of the interaction.
    """

    cursor.execute('''
        UPDATE email_interactions
        UPDATE chat_interactions
        SET answer = ?, status = ?
        WHERE uid = ?
    ''', (answer, status, uid))
    conn.commit()

def generate_answer_from_llm(query: str, docs: list, context: Optional[List[dict]], llm) -> str:
    """
    Generates an answer to the given query using the provided documents and context.

    Args:
        query: The question to be answered.
        docs: A list of relevant documents retrieved from the vectorstore.
        context: Optional context from previous interactions (for follow-up questions).
        llm: The offline LLM instance for generating answers.

    Returns:
        The generated answer to the query.
    """
    # Placeholder: Implement your answer generation logic here, using the provided LLM, documents, and context
    # ...
    raise NotImplementedError("Implement generate_answer_from_llm using your offline LLM")

def is_answer_valid(query: str, answer: str, documents: list) -> bool:
    """
    Checks if the generated answer is valid based on various criteria.

    Args:
        query: The original question asked by the user.
        answer: The generated answer to be evaluated.
        documents: The list of documents used to generate the answer.

    Returns:
        True if the answer is valid, False otherwise.
    """
    # Placeholder: Implement your answer validation logic here
    # ...
    raise NotImplementedError("Implement is_answer_valid")

def generate_feedback(query: str, answer: str, documents: list) -> str:
    """
    Generates feedback to guide the Document Retriever in case the answer is not valid.

    Args:
        query: The original question asked by the user.
        answer: The generated answer to be evaluated.
        documents: The list of documents used to generate the answer.

    Returns:
        A string containing feedback for the Document Retriever.
    """
    # Placeholder: Implement your feedback generation logic here
    # ...
    raise NotImplementedError("Implement generate_feedback")

def describe_problems(answer: str) -> str:
    """
    Provides a description of the problems identified in the answer.

    Args:
        answer: The generated answer to be evaluated.

    Returns:
        A string describing the problems found in the answer.
    """
    # Placeholder: Implement your problem description logic here
    # ...
    raise NotImplementedError("Implement describe_problems")