Commit c241fa3a authored by Harris, Tyrone

updated tasks

parent 6153aa18
README.md +90 −52
## Offline Multilingual Question Answering System
# Offline Multilingual Question Answering System

This comprehensive system enables users to interact via a terminal-based chat interface, posing questions in multiple languages and receiving accurate, contextually relevant answers, all without requiring an internet connection. By leveraging natural language processing, information retrieval, intelligent agents, and offline models, this system prioritizes data privacy and accessibility even in disconnected environments.
This comprehensive system enables users to interact via a terminal-based chat interface, posing questions in multiple languages and receiving accurate, contextually relevant answers in the same language, all without requiring an internet connection. By leveraging natural language processing, information retrieval, intelligent agents, and offline models, this system prioritizes data privacy and accessibility even in disconnected environments.

### Table of Contents
## Table of Contents

- [Introduction](#introduction)
- [Features](#features)
@@ -45,11 +45,9 @@ This system is designed to provide a seamless and informative question-answering
- **Offline LLM:** Leverages the Vicuna-7B model for powerful language understanding and answer generation capabilities in an offline setting.
- **Queue-Based Processing:** Handles multiple chat requests concurrently, ensuring fair and efficient processing.
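
A minimal sketch of the queue contract this feature implies, based on the 5-tuple unpacked in the main loop later in this commit (the field names are assumptions):

```python
import queue

# Hypothetical request layout: (source, question, uid, context, user_email)
request_queue = queue.Queue()
request_queue.put(("chat", "¿Cómo funciona el sistema sin conexión?", "req-001", None, None))

source, question, uid, context, user_email = request_queue.get()
print(uid, question)
```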

![image](/CRAIG_Graph-2024-10-01-035038.svg)

## System Architecture

The system is built upon a modular architecture, orchestrated using Langgraph, a declarative workflow management framework. It consists of several key components, each responsible for a specific task in the question-answering process:
The system's modular architecture comprises interconnected components, each fulfilling a specific role in the question-answering process.

*   **`Config`**: Centralizes configuration and setup, including loading models, database connection, cache initialization.
*   **`Tasks`**: Defines the Langgraph tasks and their dependencies, representing the system's workflow.
@@ -76,8 +74,6 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

*   **Purpose:** Defines the Langgraph tasks and their dependencies, forming the system's workflow
*   **Tasks**
    *   `fetch_new_emails_task`:  (Placeholder for future email integration)
    *   `check_registry_task`:  (Placeholder for future email integration)
    *   `chat_input_task`: Gets user input from the chat interface and adds it to the request queue
    *   `detect_language_task`: Detects the language of the input question
    *   `translate_to_english_task`: Translates the question to English if needed
@@ -137,7 +133,7 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

## Getting Started

## Prerequisites
### Prerequisites

*   **Python 3.x:**  Ensure you have Python 3.x installed on your system. You can download it from the official Python website: [https://www.python.org/](https://www.python.org/)

@@ -165,8 +161,30 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

*   **Embedded Document Collection:** 
    *   Prepare your document collection in a suitable format (e.g., plain text files).
    *   Embed the documents using an offline embedding model like **Sentence Transformers**. 
    *   Store the embeddings in a local vector database (e.g., FAISS) for efficient retrieval.
    *   **Generate embeddings using Sentence Transformers:**

        ```python
        from langchain.document_loaders import TextLoader
        from langchain.text_splitter import RecursiveCharacterTextSplitter
        from langchain.vectorstores import FAISS
        from langchain.embeddings import HuggingFaceEmbeddings

        # 1. Load and preprocess documents
        loader = TextLoader('./data/documents/your_document.txt')  # Replace with your actual document path
        documents = loader.load()

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
        texts = text_splitter.split_documents(documents)

        # 2. Generate embeddings (all-MiniLM-L6-v2 is recommended for Vicuna)
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        db = FAISS.from_documents(texts, embeddings)

        # 3. Save the vectorstore (FAISS index) to disk
        db.save_local("data/faiss_index")
        ```

    *   This code snippet demonstrates how to load documents, split them into chunks, generate embeddings using the `all-MiniLM-L6-v2` model (recommended for Vicuna), and store them in a FAISS index.
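    *   As a quick sanity check, you can reload the saved index and query it. A minimal sketch, assuming the `data/faiss_index` path and embedding model from the snippet above:

        ```python
        from langchain.vectorstores import FAISS
        from langchain.embeddings import HuggingFaceEmbeddings

        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        db = FAISS.load_local("data/faiss_index", embeddings)

        # Retrieve the top-3 chunks most similar to a test question
        for doc in db.similarity_search("What does the system do offline?", k=3):
            print(doc.metadata, doc.page_content[:80])
        ```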

*   **SQLite3:** 
    *   Ensure you have SQLite3 installed on your system. It's usually included by default in most Python distributions. If not, you can install it using your package manager (e.g., `apt-get install sqlite3` on Debian/Ubuntu or `brew install sqlite3` on macOS).
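    *   To confirm availability from Python (the `sqlite3` module ships with CPython; the package-manager step installs the command-line shell), a quick check:

        ```python
        import sqlite3
        print(sqlite3.sqlite_version)  # version of the bundled SQLite engine
        ```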
@@ -181,12 +199,13 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
    *   If you plan to implement the email functionality in the future, you'll also need to install libraries for interacting with the Outlook 365 API (e.g., `requests_oauthlib` and `microsoft-graph`).
    *   If you choose a different caching solution than the basic in-memory cache, install the necessary library for that (e.g., `redis` for Redis).
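    *   For reference, any object exposing `get`/`set` will satisfy the cache calls made elsewhere in this commit. A minimal in-memory stand-in (a sketch, not the project's actual cache class):

        ```python
        class SimpleCache:
            """Tiny in-memory cache with the get/set interface the tasks expect."""

            def __init__(self):
                self._store = {}

            def get(self, key):
                return self._store.get(key)  # None on a cache miss

            def set(self, key, value):
                self._store[key] = value
        ```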


### Installation

1.  **Clone the repository:**

    ```bash
    git clone https://code.ornl.gov/6cq/offline-multilingual-question-answering-system
    ```

2.  **Navigate to the project directory:**
@@ -221,7 +240,7 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
    pip install -r requirements.txt
    ```

    This will install all the necessary Python packages listed in `requirements.txt` within the virtual environment, ensuring that your project has its own isolated set of dependencies. 
    This will install all the necessary Python packages listed in `requirements.txt` within the virtual environment.

6.  **Download and organize models:**

@@ -256,10 +275,9 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
            MarianMTModel.from_pretrained(from_en_model_name, cache_dir="./models/translation")
        ```


    *   **Download the LLM (Vicuna-7B):**

    *   **Ensure you have enough disk space:** The Vicuna-7B model is quite large (around 13GB). Make sure you have sufficient disk space available before downloading
        *   **Ensure you have enough disk space:** The Vicuna-7B model is quite large (around 13GB). Make sure you have sufficient disk space available before downloading.
        *   **Use the `transformers` library to download:** 

            ```python
@@ -269,7 +287,7 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a
            model = AutoModelForCausalLM.from_pretrained("TheBloke/Vicuna-7B-v1.5-GGUF", cache_dir="./models/llm")
            ```

        This code will download both the tokenizer and the model weights to the `./models/llm` directory
            This code will download both the tokenizer and the model weights to the `./models/llm` directory.

        *   **Consider using a quantized version:** If you're running the system on a CPU or have limited memory, you might want to explore using a quantized version of Vicuna-7B, which can significantly reduce its memory footprint and improve inference speed. Refer to the Vicuna-7B model documentation for instructions on how to obtain and use a quantized version.
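        *   As one option, a GGUF build can be loaded on CPU with the `ctransformers` package rather than `transformers`. This is a sketch only; the `model_file` name below is an assumption, so pick an actual file from the model repository:

            ```python
            from ctransformers import AutoModelForCausalLM

            llm = AutoModelForCausalLM.from_pretrained(
                "TheBloke/vicuna-7B-v1.5-GGUF",
                model_file="vicuna-7b-v1.5.Q4_K_M.gguf",  # assumed quantization file name
                model_type="llama",
            )
            print(llm("Q: What is an offline QA system?\nA:", max_new_tokens=64))
            ```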

@@ -283,24 +301,46 @@ The system is built upon a modular architecture, orchestrated using Langgraph, a

        This code will download the SentencePiece tokenizer to the `./models/embedding` directory.

*   **Prepare your document collection:**
7.  **Prepare your document collection:**

    1.  **Create a document upload folder:** 
        * Create a folder named `uploads` at the root level of the project. This is where users will upload their documents.
    2.  **Generate embeddings using Sentence Transformers:**

    *   Organize your documents in a suitable format (e.g., plain text files) within the `data/documents` directory.
    *   Use an offline embedding model (e.g., Sentence Transformers) to generate embeddings for your documents and store them in a local vector database (e.g., FAISS) for efficient retrieval. You can refer to the Langchain documentation or other resources for guidance on how to perform document embedding and indexing.
        ```python
        from langchain.document_loaders import DirectoryLoader, TextLoader
        from langchain.text_splitter import RecursiveCharacterTextSplitter
        from langchain.vectorstores import FAISS
        from langchain.embeddings import HuggingFaceEmbeddings

        # 1. Load and preprocess documents from the 'uploads' folder
        loader = DirectoryLoader('./uploads', glob="**/*.txt", loader_cls=TextLoader)  # Load all .txt files with the plain-text loader
        documents = loader.load()

        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
        texts = text_splitter.split_documents(documents)

        # 2. Generate embeddings (all-MiniLM-L6-v2 is recommended for Vicuna)
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        db = FAISS.from_documents(texts, embeddings)

        # 3. Save the vectorstore (FAISS index) to disk
        db.save_local("data/faiss_index")
        ```

    *   This code snippet demonstrates how to load documents from the `uploads` folder, split them into chunks, generate embeddings using the `all-MiniLM-L6-v2` model (recommended for Vicuna), and store them in a FAISS index.


### Configuration

1.  **`config.py`**
    *   Update the `language_pairs` dictionary in the `Config` class with the actual paths to your downloaded translation models (a hypothetical example follows this list).
    *   Replace the placeholder in `_load_document_collection` with your actual code to load your embedded document collection.
    *   Ensure the path to your FAISS index in `_load_document_collection` is correct.
    *   Configure the database connection details in `_initialize_database` if you're using a different database system.
    *   Ensure you have the SentencePiece tokenizer downloaded and specify its path in `_load_translation_models`.

2.  **Tool Implementations**
    *   In `document_answerer.py`, replace the `OpenAI` placeholder with your actual offline LLM interface.
    *   In `document_answerer.py`, ensure the `_run` method uses your actual offline LLM interface.
    *   Customize the `SelfCorrectiveAgent` in `self_corrective_agent.py` with your desired evaluation logic and thresholds.
    *   Implement the chat interface logic in `chat_input_tool.py` and the `display_answer_in_chat_task` in `tasks.py` using the `curses` library or a similar approach.
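
For orientation, the `language_pairs` mapping from step 1 might look like this hypothetical example (paths and language codes are placeholders; match them to the MarianMT models you actually downloaded):

```python
# Hypothetical example only; keys and paths must match your downloaded models
language_pairs = {
    ("es", "en"): "./models/translation/opus-mt-es-en",
    ("en", "es"): "./models/translation/opus-mt-en-es",
}
```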

@@ -358,5 +398,3 @@ This project is licensed under the [MIT License](LICENSE)
    *   Continuously update and expand your embedded document collection to cover a wider range of topics and domains, making the system more knowledgeable and versatile.
*   **User Feedback Mechanism:**
    *   Incorporate a mechanism to collect user feedback on the quality and relevance of answers. This feedback can be used to further fine-tune models and improve the system's overall performance.

config.py +12 −5
import sqlite3
from transformers import MarianMTModel, MarianTokenizer, AutoTokenizer, AutoModelForCausalLM, pipeline

from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFacePipeline
from langchain.vectorstores import FAISS
from transformers import (
    MarianMTModel,
    MarianTokenizer,
    AutoTokenizer,
    AutoModelForCausalLM,
    pipeline,
)

class Config:
    def __init__(self):
@@ -50,15 +57,15 @@ class Config:
    def _load_document_collection(self):
        # Load your local embedded document collection 
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
        return FAISS.load_local("my_faiss_index", embeddings)
        return FAISS.load_local("data/faiss_index", embeddings)

    def _initialize_database(self):
        conn = sqlite3.connect('email_interactions.db')
        conn = sqlite3.connect('chat_interactions.db')  # Local SQLite store for chat interactions
        cursor = conn.cursor()

        # Create the email_interactions table 
        # Create the chat_interactions table 
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS email_interactions (
            CREATE TABLE IF NOT EXISTS chat_interactions (
                uid TEXT PRIMARY KEY,
                timestamp DATETIME,
                question TEXT,
main.py +29 −14
@@ -5,6 +5,11 @@ from langgraph import Graph

from config import Config
from tasks import Tasks
from tools.language_detection_tool import LanguageDetectionTool
from tools.translator import Translator
from tools.document_answerer import DocumentAnswerer
from tools.document_retriever import DocumentRetriever  # assumed module path, mirroring the other tool imports
from tools.self_corrective_agent import SelfCorrectiveAgent
from tools.chat_input_tool import ChatInputTool, get_user_input_from_terminal, display_answer_in_terminal

# 3. Main Program

@@ -18,7 +23,7 @@ class QuestionAnsweringSystem:
    def _build_graph(self):
        graph = Graph()

        # Add tasks to the graph (excluding email-related tasks)
        # Add tasks to the graph 
        graph.add_tasks([
            self.tasks.chat_input_task(), 
            self.tasks.detect_language_task(),
@@ -48,25 +53,28 @@ class QuestionAnsweringSystem:

    def run(self):
        while True:
            # 1. Check for new chat requests (no email checks anymore)
            self.agent_executor.run("check_chat")
            # 1. Check for new chat requests
            self.graph.execute(inputs={
                "request_queue": self.request_queue, 
                "cache": self.config.cache
            })

            # 2. Process requests from the queue
            while not self.request_queue.empty():
                request = self.request_queue.get()
                _, data, uid, context, _ = request  # Ignore 'source' and 'user_email' for now
                _, question, uid, context, _ = request  # Ignore 'source' and 'user_email' 

                # 3. Process the chat request
                input_language = language_detection_tool.run(data)
                input_language = language_detection_tool.run(question)
                retry_count = 0

                while retry_count < 3:
                    if input_language != 'en':
                        data = translator.run(data) 
                        question = translator.run(question) 

                    documents = document_retriever.run(data)
                    answer = document_answerer.run(data, documents, context)
                    answer, feedback_or_problems = self_corrective_agent.run(data, answer, documents, retry_count)
                    documents = document_retriever.run(question)
                    answer = document_answerer.run(question, documents, context)
                    answer, feedback_or_problems = self_corrective_agent.run(question, answer, documents, retry_count)

                    if feedback_or_problems is None:
                        break 
@@ -76,25 +84,32 @@ class QuestionAnsweringSystem:
                        break

                    retry_count += 1
                    documents = document_retriever.run(data, feedback_or_problems)
                    answer = document_answerer.run(data, documents, context)
                    documents = document_retriever.run(question, feedback_or_problems)
                    answer = document_answerer.run(question, documents, context)

                if input_language != 'en':
                    answer = translator.run(answer, target_language=input_language) 

                # Display the answer in the chat interface
                display_answer_in_chat(answer)
                display_answer_in_terminal(answer)

                # Update cache with context for potential follow-up questions
                cache.set(uid, {"question": data, "documents": documents, "answer": answer})
                self.config.cache.set(uid, {"question": question, "documents": documents, "answer": answer})

                # Update DB for chat interactions
                status = "answered" if feedback_or_problems is None else "answered_with_problems"
                update_interaction_in_db(uid, answer, status)
                self.tasks.update_interaction_in_db(uid, answer, status)

            time.sleep(1)  # Adjust the interval as needed

if __name__ == "__main__":
    config = Config()

    # Initialize tools
    language_detection_tool = LanguageDetectionTool()
    translator = Translator(config.cache, config.tokenizer, config.models)
    document_retriever = DocumentRetriever(config.vectorstore)  # assumed constructor; its run() is called in the loop above
    document_answerer = DocumentAnswerer(config.vectorstore, config.cache, config.llm)
    self_corrective_agent = SelfCorrectiveAgent()

    system = QuestionAnsweringSystem(config)
    system.run()
requirements.txt +1 −5
# Core Libraries
langgraph==0.0.13  # Latest version as of Sept 30, 2024
langgraph==0.0.13 
sentence-transformers==2.2.2
spacy==3.6.1
langdetect==1.0.9
@@ -13,10 +13,6 @@ transformers==4.31.0
# - Redis: 
# redis

# Email Handling (If implementing Outlook 365 integration in the future)
requests_oauthlib==1.3.1
microsoft-graph==0.60.0

# Offline LLM (Vicuna-7B)
torch==2.1.0
accelerate==0.23.0
tasks.py +132 −15
@@ -2,6 +2,10 @@ import uuid
import curses  # For the chat interface

from langgraph import Task, tool_code
from typing import List, Optional  # needed for the type hints in the helpers below

from langdetect import detect
from tools.translator import Translator
from tools.self_corrective_agent import SelfCorrectiveAgent
from langchain.chains import RetrievalQA

class Tasks:
    def __init__(self, config):
@@ -56,7 +60,7 @@ class Tasks:
                    return cached_translation

                # 2. Perform translation if not cached
                translation = translator.run(question) 
                translation = translator.run(question)  # Use the translator tool

                # 3. Store translation in cache
                cache.set(question, translation)
@@ -86,7 +90,7 @@ class Tasks:
            tool_code=tool_code(
                """
                # Generate answer using the offline LLM
                answer = generate_answer_from_llm(query, docs, context)
                answer = generate_answer_from_llm(query, docs, context, llm)

                # Format references
                references = [f"Document: {doc.metadata['title']}" for doc in docs]
@@ -150,25 +154,77 @@ class Tasks:

                # Update DB for chat interactions
                status = "answered" if problems is None else "answered_with_problems"
                update_interaction_in_db(uid, answer, status)
                update_interaction_in_db(uid, answer, status, cursor, conn)
                """
            ),
            args={"answer": "{answer}", "problems": "{problems}", "uid": "{uid}", "cursor": self.config.cursor, "conn": self.config.conn},
        )

# Helper function for chat interface (using curses)
# Helper functions for chat interface (using curses)
def get_user_input_from_terminal():
    # ... (Implementation using curses to get user input)
    pass
    # Initialize curses
    stdscr = curses.initscr()
    curses.cbreak()  # Get characters immediately
    curses.noecho()  # Don't echo user input
    stdscr.keypad(True)  # Enable special keys

    # Create chat window and input area
    height, width = stdscr.getmaxyx()
    chat_win = curses.newwin(height - 3, width, 0, 0)  # Leave space for input
    input_win = curses.newwin(3, width, height - 3, 0)
    chat_win.scrollok(True)  # Enable scrolling in chat window

    # Display welcome message
    display_message(chat_win, "Welcome to the Offline Q&A Chatbot!")

    # Get user input
    user_input = get_input(input_win)

    if user_input:
        # Display user input in the chat window
        display_message(chat_win, f"You: {user_input}")

    return user_input.strip()

def get_input(input_win):
    input_win.clear()
    input_win.addstr(1, 0, "You: ")
    input_win.refresh()

    user_input = ""
    while True:
        key = input_win.getch()
        if key == curses.KEY_ENTER or key in [10, 13]:  # Enter key
            break
        elif key == curses.KEY_BACKSPACE or key == 127:  # Backspace
            if user_input:
                user_input = user_input[:-1]
                y, x = input_win.getyx()
                input_win.delch(y, x - 1)
                input_win.refresh()
        elif 32 <= key < 256:
            # Printable byte; multi-byte UTF-8 input would need input_win.get_wch()
            user_input += chr(key)
            input_win.addch(key)
            input_win.refresh()

    return user_input.strip()

def display_answer_in_terminal(answer):
    # Minimal sketch: restore the terminal that the curses input left active, then print
    curses.endwin()
    print(f"Bot: {answer}")

def store_interaction_in_db(uid, question=None, answer=None, status="received", other_metadata=None):
def display_message(chat_win, message):
    chat_win.addstr(message + "\n")
    chat_win.refresh()

    # Scroll the chat window if necessary
    max_y, _ = chat_win.getmaxyx()
    if chat_win.getyx()[0] >= max_y - 1:
        chat_win.scroll(1)

def store_interaction_in_db(uid, question=None, answer=None, status="received", other_metadata=None, cursor=None, conn=None):
    """
    This function should implement the logic to store the interaction details in your database.
    You'll likely need to use your database cursor to execute an INSERT query.
    This function should implement the logic to store the interaction details in your database. You'll likely need to use your database cursor to execute an INSERT query.

    Args:
        uid: The unique identifier for the interaction.
@@ -177,14 +233,13 @@ def store_interaction_in_db(uid, question=None, answer=None, status="received",
        status: The status of the interaction (default is "received").
        other_metadata: Any additional metadata you want to store (optional).
        cursor: Database cursor used to execute the INSERT.
        conn: Database connection used to commit the transaction.
    """

    cursor.execute('''
        INSERT INTO email_interactions (uid, timestamp, question, answer, status, other_metadata)
        INSERT INTO chat_interactions (uid, timestamp, question, answer, status, other_metadata)
        VALUES (?, datetime('now'), ?, ?, ?, ?)
    ''', (uid, question, answer, status, other_metadata))
    conn.commit()

def update_interaction_in_db(uid, answer, status):
def update_interaction_in_db(uid, answer, status, cursor, conn):
    """
    This function should implement the logic to update an existing interaction in your database.
    You'll likely need to use your database cursor to execute an UPDATE query.
@@ -194,10 +249,72 @@ def update_interaction_in_db(uid, answer, status):
        answer: The updated answer to be stored.
        status: The updated status of the interaction.
    """

    cursor.execute('''
        UPDATE email_interactions
        UPDATE chat_interactions
        SET answer = ?, status = ?
        WHERE uid = ?
    ''', (answer, status, uid))
    conn.commit()

def generate_answer_from_llm(query: str, docs: list, context: Optional[List[dict]], llm) -> str:
    """
    Generates an answer to the given query using the provided documents and context.

    Args:
        query: The question to be answered.
        docs: A list of relevant documents retrieved from the vectorstore.
        context: Optional context from previous interactions (for follow-up questions).
        llm: The offline LLM instance for generating answers.

    Returns:
        The generated answer to the query.
    """
    # Placeholder: Implement your answer generation logic here, using the provided LLM, documents, and context
    # ...
    raise NotImplementedError("Implement generate_answer_from_llm using your offline LLM")

def is_answer_valid(query: str, answer: str, documents: list) -> bool:
    """
    Checks if the generated answer is valid based on various criteria.

    Args:
        query: The original question asked by the user.
        answer: The generated answer to be evaluated.
        documents: The list of documents used to generate the answer.

    Returns:
        True if the answer is valid, False otherwise.
    """
    # Placeholder: Implement your answer validation logic here
    # ...
    raise NotImplementedError("Implement is_answer_valid")

def generate_feedback(query: str, answer: str, documents: list) -> str:
    """
    Generates feedback to guide the Document Retriever in case the answer is not valid.

    Args:
        query: The original question asked by the user.
        answer: The generated answer to be evaluated.
        documents: The list of documents used to generate the answer.

    Returns:
        A string containing feedback for the Document Retriever.
    """
    # Placeholder: Implement your feedback generation logic here
    # ...
    raise NotImplementedError("Implement generate_feedback")

def describe_problems(answer: str) -> str:
    """
    Provides a description of the problems identified in the answer.

    Args:
        answer: The generated answer to be evaluated.

    Returns:
        A string describing the problems found in the answer.
    """
    # Placeholder: Implement your problem description logic here
    # ...
    raise NotImplementedError("Implement describe_problems")