{"id":4126,"date":"2024-07-08T19:04:00","date_gmt":"2024-07-08T19:04:00","guid":{"rendered":"https:\/\/lancerninja.com\/?p=4126"},"modified":"2025-10-15T10:09:51","modified_gmt":"2025-10-15T10:09:51","slug":"how-to-build-a-personal-pdf-gpt-chat-assistant","status":"publish","type":"post","link":"https:\/\/chatclient.ai\/blog\/how-to-build-a-personal-pdf-gpt-chat-assistant\/","title":{"rendered":"PDF GPT: How to Build Personal PDF Chat Assistant? [Tutorial]"},"content":{"rendered":"\n<h2 id=\"introduction\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Introduction<\/strong><\/h2>\n\n\n\n<p>In this article, we will understand what is PDF GPT and how to build build your personal PDF Chat Assistant through simple steps.<\/p>\n\n\n\n<p>Artificial Intelligence has taken center stage in the tech world since the revolutionary ChatGPT was introduced in late 2022. The groundbreaking model captivated many ideas and possibilities in developers and there has been an uprise in wide range implementation of these Large Language Models(LLMs). <\/p>\n\n\n\n<p>At the heart of all these possibilities lies the LangChain framework that simplifies the development of applications powered by these language models. Many LLM applications require user-specific data that is not part of the model&#8217;s training set. 
One such application is PDF GPT.<\/p>\n\n\n\n<p>You can watch this video tutorial to build a Personal PDF Chat Assistant: <\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"PDFGPT: Chat to Your PDFs for FREE in Minutes (Full Tutorial) #langchain #chatgpt #openai\" width=\"1200\" height=\"900\" src=\"https:\/\/www.youtube.com\/embed\/07eeist5TuY?start=188&#038;feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><figcaption class=\"wp-element-caption\"><strong><em>PDF GPT: Chat to Your PDFs for FREE in Minutes<\/em><\/strong>  <a href=\"https:\/\/www.youtube.com\/hashtag\/langchain\">#langchain<\/a> <a href=\"https:\/\/www.youtube.com\/hashtag\/chatgpt\">#chatgpt<\/a> <a href=\"https:\/\/www.youtube.com\/hashtag\/openai\">#openai<\/a><\/figcaption><\/figure>\n\n\n\n<h2 id=\"quick-start-to-pdf-gpt\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Quick Start to PDF GPT<\/strong><\/h2>\n\n\n\n<p>As discussed, LangChain makes it easy to develop LLM-powered applications. But how? Let us find out by understanding the components of LangChain.<\/p>\n\n\n\n<h3 id=\"data-aware-connect-language-model-to-other-sources-of-data\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Data Aware: Connect Language Model to Other Sources of Data<\/strong><\/h3>\n\n\n\n<p>Most LLM applications need user-specific data (as in the case of PDF GPT, a Business Data Analyzer, a Query Builder, an AI coder, etc.) that is not part of the model&#8217;s training data. LangChain provides building blocks to connect data to the model. 
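<\/p>

<p>Before diving into the components, here is a toy, dependency-free sketch of the whole flow we are about to build &#8211; load, split, embed, store, retrieve &#8211; using a simple bag-of-words counter as a stand-in for real embeddings. The tutorial below swaps in OpenAI embeddings and LangChain loaders; everything here is illustrative only.<\/p>

```python
from collections import Counter
import math

def embed(text):
    # Bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def similarity(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

document = "PDF GPT answers questions. LangChain connects data to models."
chunks = document.split(". ")                        # load + split
store = [(chunk, embed(chunk)) for chunk in chunks]  # embed + store
query = embed("what connects data to models")
best = max(store, key=lambda item: similarity(query, item[1]))[0]  # retrieve
```

<p>Here best is the chunk most similar to the query; the real assistant feeds such retrieved chunks to ChatGPT as context for answering.<\/p>

<p>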
We&#8217;ll discuss all 5 steps of Data Connection in detail while we build the assistant.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"354\" src=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-1024x354.jpg\" alt=\"\" class=\"wp-image-4127\" srcset=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-1024x354.jpg 1024w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-300x104.jpg 300w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-768x266.jpg 768w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-1536x531.jpg 1536w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-2048x708.jpg 2048w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-380x131.jpg 380w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-800x277.jpg 800w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1-1160x401.jpg 1160w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/data_connection-c42d68c3d092b85f50d08d4cc171fc25-scaled-1.jpg 2560w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">                                           This is the flow of data for a typical relevant context retriever of a query\/Question<\/figcaption><\/figure>\n\n\n\n<h3 id=\"agent-allow-a-language-model-to-interact-with-its-environment\" class=\"wp-block-heading 
is-style-cnvs-heading-numbered\"><strong>Agent: Allow a Language Model to interact with its Environment<\/strong><\/h3>\n\n\n\n<p>An <a href=\"https:\/\/lancerninja.com\/agents-in-langchain\/\" target=\"_blank\" rel=\"noreferrer noopener\">Agent<\/a> is a system that decides which action the LLM takes; it observes the result and repeats until it reaches the correct answer. It allows us to combine an LLM&#8217;s capabilities with external sources of computation (tools) or knowledge. To use agents, we require three things: 1) a base LLM, 2) a tool to take an action, and 3) an agent to control and initiate actions and interactions.<\/p>\n\n\n\n<p>There are also Chains. Think of chains as assemblies of other LangChain components arranged in particular ways to accomplish specific use cases. These components can even be other chains, models, memory, and agents.<\/p>\n\n\n\n<p>Tools are functions that agents can use to interact with the world. These tools can be generic utilities (e.g. search), other chains, or even other agents.<\/p>\n\n\n\n<h2 id=\"virtual-environment-setup\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Virtual Environment Setup<\/strong><\/h2>\n\n\n\n<p>Setting up a virtual environment is necessary to avoid version conflicts of packages between the different projects that one might work on.<br><br>First, we will create a virtual environment. 
Navigate to the directory where you want to place the virtual environment, create a project folder, cd to the project folder in the terminal, and run the following command:<br><\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">python -m venv &lt;virtual-environment-name&gt;<\/code><\/pre>\n\n\n\n<p>Once you create the virtual environment, you need to activate it before you can use it in the project. On a Mac, run <\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">source &lt;VIRTUAL-ENVIRONMENT-NAME&gt;\/bin\/activate<\/code><\/pre>\n\n\n\n<p>On Windows, run<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">.\\&lt;VIRTUAL-ENVIRONMENT-NAME&gt;\\Scripts\\activate<\/code><\/pre>\n\n\n\n<p>When you activate the virtual environment, its name will appear on the left side of the terminal prompt. This indicates that the virtual environment is currently active. 
In the terminal, you can run the command &#8220;<strong>pip list<\/strong>&#8221; to check the base packages that are present when you create a new virtual environment.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"147\" src=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404-1024x147.png\" alt=\"\" class=\"wp-image-4141\" style=\"width:888px;height:127px\" srcset=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404-1024x147.png 1024w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404-300x43.png 300w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404-768x111.png 768w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404-380x55.png 380w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404-800x115.png 800w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-154404.png 1097w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<div class=\"wp-block-buttons alignwide is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-circular\"><a class=\"wp-block-button__link has-text-align-center wp-element-button\" href=\"https:\/\/chatclient.ai\/auth\/sign-up\" target=\"_blank\" rel=\"noreferrer noopener\">TRY CHATCLIENT for free NOW<\/a><\/div>\n<\/div>\n\n\n\n<h2 id=\"building-pdf-gpt-chat-assistant\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Building PDF GPT Chat Assistant<\/strong><\/h2>\n\n\n\n<p>Now that we have our virtual environment active, it is time to install the required packages from the terminal, create a main.py file in our project folder, and import these 
packages. Use Visual Studio Code (VSCode) for editing the code, as it is development-friendly.<\/p>\n\n\n\n<h3 id=\"install-and-import-packages\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Install and Import Packages<\/strong><\/h3>\n\n\n\n<p>First, we will install the required packages by running the following command in the command prompt:<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">pip install gradio openai numpy tiktoken langchain unstructured\n<\/code><\/pre>\n\n\n\n<p>Now we will import the packages:<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">import gradio as gr\nimport openai\nimport numpy as np\nfrom time import sleep\nimport tiktoken\nfrom langchain.text_splitter import RecursiveCharacterTextSplitter\nfrom langchain.document_loaders import UnstructuredPDFLoader<\/code><\/pre>\n\n\n\n<p><strong>We will be using Gradio for the front end of the PDF GPT.<\/strong> Gradio is one of the best ways to make your machine-learning projects more interactive. The OpenAI API key lets us send API requests to OpenAI&#8217;s ChatGPT. Tiktoken is a byte-pair encoding (BPE) tokenizer; it splits text strings into tokens that can be passed to GPT models, which see text in the form of tokens. <br><br>RecursiveCharacterTextSplitter splits documents recursively by different characters &#8211; starting with &#8220;\\n\\n&#8221;, then &#8220;\\n&#8221;, then &#8220; &#8221;. The important parameters to know here are chunk_size and chunk_overlap. chunk_size controls the maximum size (in number of characters) of the final documents, and chunk_overlap specifies how much overlap there should be between chunks. <br><br>We obtain the final output from the raw text input as a list of documents formed based on chunk size and chunk overlap. 
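<\/p>

<p>To build intuition for chunk_size and chunk_overlap, here is a deliberately simplified character-based chunker. It only illustrates the size and overlap arithmetic; the real RecursiveCharacterTextSplitter additionally prefers to split at &#8220;\\n\\n&#8221;, &#8220;\\n&#8221;, and space boundaries rather than cutting mid-word.<\/p>

```python
def chunk_text(text, chunk_size=400, chunk_overlap=50):
    # Naive fixed-width chunking: each chunk starts chunk_size - chunk_overlap
    # characters after the previous one, so neighbours share chunk_overlap chars.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(chr(97 + i % 26) for i in range(1000))  # 1000 characters of dummy text
chunks = chunk_text(text)
```

<p>With these defaults, every chunk is at most 400 characters long and the last 50 characters of one chunk reappear at the start of the next.<\/p>

<p>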
UnstructuredPDFLoader uses Python&#8217;s unstructured package under the hood which provides components to process PDFs, HTML, and Word Documents.<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">COMPLETIONS_MODEL = &quot;gpt-3.5-turbo&quot;\nEMBEDDING_MODEL = &quot;text-embedding-ada-002&quot;<\/code><\/pre>\n\n\n\n<p><strong>We choose the &#8220;gpt-3.5-turbo&#8221; model to generate answers and the &#8220;text-embedding-ada-002&#8221; model to generate vector representations that capture the semantic meaning of the text.<\/strong> Later these embeddings are used to compare the similarity between two different pieces of text. Get your key from <a href=\"https:\/\/platform.openai.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">platform.openai<\/a>:<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\"># Initialize the OpenAI API\nopenai.api_key = &quot;&lt;Your_OPENAI_APIKEY&gt;&quot;<\/code><\/pre>\n\n\n\n<h3 id=\"extract-text-from-the-pdf-function\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Extract Text from the PDF Function<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\"># Function to convert a PDF to text\ndef extract_text_from_pdf(pdf_file, progress=gr.Progress()):\n\n    try:\n        reader = UnstructuredPDFLoader(pdf_file.name)\n        data = reader.load()\n        text = data[0].page_content\n\n        text_splitter = RecursiveCharacterTextSplitter(\n            chunk_size=400,\n            chunk_overlap=50,\n            length_function=len,\n        )\n        chunks = text_splitter.create_documents([text])\n\n        embed = compute_doc_embeddings(chunks, progress)\n        return chunks, embed, &quot;uploaded&quot;\n    except Exception as e:\n        print(e)\n        return None, None, &quot;&quot;<\/code><\/pre>\n\n\n\n<p><strong>The 
extract_text_from_pdf function takes two parameters: the pdf_file object and the progress object, which tracks the tqdm iterations inside the function and lets us display the progress of embedding the PDF.<\/strong><br><br>We use UnstructuredPDFLoader to extract content from the PDF and store it in a reader object. We call the load method on the reader to retrieve all the content, i.e. the data from the PDF file, and store it in the data variable. data[0].page_content holds all the text of the first document (in this case, the uploaded PDF is the only document), and we store it in a variable named &#8220;text&#8221;.<\/p>\n\n\n\n<p>After extracting all the text from the PDF document, we use the RecursiveCharacterTextSplitter to create chunks of our text data. Each chunk consists of up to 400 characters, with a 50-character overlap between consecutive chunks (length_function=len measures chunk size in characters).<br><br>We then call a compute_doc_embeddings function inside this function, passing all the chunks and the progress object as input. We want it to return a dictionary containing an embedding for each chunk. 
The function is defined further down in the code.<br><br>The extract_text_from_pdf function covers the text extraction and processing part of the data connection process.<\/p>\n\n\n\n<h2 id=\"function-to-generate-embeddings\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Function to Generate Embeddings<\/strong><\/h2>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">def get_embedding(text, model=EMBEDDING_MODEL):\n    result = openai.Embedding.create(\n        model=model,\n        input=text\n    )\n    return result[&quot;data&quot;][0][&quot;embedding&quot;]<\/code><\/pre>\n\n\n\n<p>The get_embedding function generates an embedding for a given text using EMBEDDING_MODEL.<\/p>\n\n\n\n<h3 id=\"compute_doc_embeddings-function\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>compute_doc_embeddings() Function<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">def compute_doc_embeddings(text, progress):\n    &quot;&quot;&quot;\n    Create an embedding for each text chunk using the OpenAI Embeddings API.\n\n    Return a dictionary that maps the index of each chunk to its embedding vector.\n    &quot;&quot;&quot;\n    result = {}\n    for idx in progress.tqdm(range(len(text))):\n        try:\n            res = get_embedding(text[idx].page_content)\n        except:\n            done = False\n            while not done:\n                sleep(5)\n                try:\n                    res = get_embedding(text[idx].page_content)\n                    done = True\n                except:\n                    pass\n        result[idx] = res\n\n    return result<\/code><\/pre>\n\n\n\n<p>You can probably infer the functionality from the docstring. The function takes the text chunks and the progress object as input. For each chunk, we generate an embedding with the OpenAI Embeddings API and store it in result under the corresponding index. 
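<\/p>

<p>Note that the bare except above retries forever, sleeping 5 seconds between attempts, so a persistent error (e.g. an invalid API key) would hang the loop. A bounded variant of the same sleep-and-retry idea looks like this; it is a sketch with illustrative names, not the tutorial&#8217;s code:<\/p>

```python
import time

def with_retry(fn, retries=3, delay=0.01):
    # Retry fn up to `retries` times, sleeping between attempts;
    # re-raise the last error instead of looping forever.
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(delay)

calls = {"n": 0}
def flaky_embedding():
    # Simulated API call that fails twice before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return [0.1, 0.2]

embedding = with_retry(flaky_embedding)  # succeeds on the third attempt
```

<p>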
The function returns a dictionary of embeddings, with each embedding mapped to the index of its corresponding chunk.<br><br>Hooray! Now that we have defined compute_doc_embeddings, we have covered four steps of the data connection process: text extraction, processing, embedding, and storing. To simplify the storing and retrieving steps, there are various vector store databases such as Pinecone, Chroma, etc.<\/p>\n\n\n\n<h3 id=\"retrieving-from-source-pdf\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Retrieving from Source PDF<\/strong><\/h3>\n\n\n\n<p>The functions defined until now act as the base data storage of our chatbot. For retrieval, the most important text is the user query (the user&#8217;s input or question). When you search for information, a formula, or a topic on Google, you get results based on keywords, your location, etc. <\/p>\n\n\n\n<p><strong>The retrieval method widely used to obtain relevant source documents for user input is vector similarity search.<\/strong> It can be computed as the dot product of the vectors or as the cosine similarity. 
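<\/p>

<p>The two coincide for unit-length vectors: cosine similarity is the dot product divided by the product of the norms, and those norms are 1. OpenAI embeddings are normalized to length 1, which is why a plain dot product suffices. A quick NumPy sanity check:<\/p>

```python
import numpy as np

x = np.array([3.0, 4.0])
y = np.array([1.0, 2.0])
x /= np.linalg.norm(x)  # normalize to unit length
y /= np.linalg.norm(y)

dot = np.dot(x, y)
cosine = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
# For unit vectors, dot and cosine are equal.
```

<p>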
After calculating all the vector similarity scores, we arrange them in decreasing order of similarity magnitude and select the top n similar documents.<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">def vector_similarity(x, y):\n    &quot;&quot;&quot;\n    Returns the similarity between two vectors.\n\n    Because OpenAI Embeddings are normalized to length 1, the cosine similarity is the same as the dot product.\n    &quot;&quot;&quot;\n    return np.dot(np.array(x), np.array(y))\n\n\ndef order_document_sections_by_query_similarity(query, contexts):\n    &quot;&quot;&quot;\n    Find the query embedding for the supplied query, and compare it against all of the pre-calculated document embeddings\n    to find the most relevant sections.\n\n    Return the list of document sections, sorted by relevance in descending order.\n    &quot;&quot;&quot;\n    query_embedding = get_embedding(query)\n    \n    document_similarities = sorted([\n        (vector_similarity(query_embedding, doc_embedding), doc_index) for doc_index, doc_embedding in contexts.items()\n    ], reverse=True)\n\n    return document_similarities<\/code><\/pre>\n\n\n\n<div class=\"wp-block-buttons alignwide is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-circular\"><a class=\"wp-block-button__link has-text-align-center wp-element-button\" href=\"https:\/\/chatclient.ai\/auth\/sign-up\" target=\"_blank\" rel=\"noreferrer noopener\">TRY CHATCLIENT for free NOW<\/a><\/div>\n<\/div>\n\n\n\n<h2 id=\"construct-prompt-from-question-and-context_embeddings\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Construct Prompt from Question and context_embeddings<\/strong><\/h2>\n\n\n\n<p>A Prompt is an input that we pass to the language model. 
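<\/p>

<p>As a simple illustration, a prompt template can be a plain Python format string that slots the retrieved context and the user&#8217;s question into fixed instructions. This mirrors the question-answering header used by construct_prompt below, but is only a sketch:<\/p>

```python
# Illustrative prompt template: {context} and {question} are filled in per query.
TEMPLATE = (
    "Answer the question as truthfully as possible using the context below, "
    'and if the answer is not contained within it, say "I don\'t know."\n\n'
    "Context:\n{context}\n\nQ: {question}\nA:"
)

prompt = TEMPLATE.format(
    context="* PDF GPT lets you chat with your PDF documents.",
    question="What does PDF GPT do?",
)
```

<p>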
For a specific use case, one generally constructs it using a prompt template that takes in certain parameters and generates the desired prompt. This prompt serves as input to the Language Model (LLM). A prompt can contain instructions, few-shot examples, question input, context from the user, and everything the language model needs to generate better answers.<\/p>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">SEPARATOR = &quot;\\n* &quot;\nENCODING = &quot;gpt2&quot;  # used only to estimate token counts; gpt-3.5-turbo itself uses cl100k_base\n\nencoding = tiktoken.get_encoding(ENCODING)\nseparator_len = len(encoding.encode(SEPARATOR))\nCOMPLETIONS_API_PARAMS = {\n    # We use temperature of 0.0 because it gives the most predictable, factual answer.\n    &quot;temperature&quot;: 0.0,\n    &quot;max_tokens&quot;: 300,\n    &quot;model&quot;: COMPLETIONS_MODEL,\n}\n\n\ndef construct_prompt(question, context_embeddings, df):\n    &quot;&quot;&quot;\n    Fetch the most relevant document sections for the question and assemble the prompt.\n    &quot;&quot;&quot;\n    chosen_sections = []\n    chosen_sections_len = 0\n    chosen_sections_indexes = []\n    most_relevant_document_sections = order_document_sections_by_query_similarity(question, context_embeddings)\n\n    if &quot;email&quot; in question:\n        MAX_SECTION_LEN = 2500\n        COMPLETIONS_API_PARAMS[&#039;max_tokens&#039;] = 1000\n        COMPLETIONS_API_PARAMS[&#039;temperature&#039;] = 0.5\n        header = &quot;&quot;&quot;Write email using the provided context \\n\\nContext:\\n &quot;&quot;&quot;\n    elif &quot;summary&quot; in question or &quot;summarize&quot; in question:\n        MAX_SECTION_LEN = 2500\n        COMPLETIONS_API_PARAMS[&#039;max_tokens&#039;] = 1000\n        COMPLETIONS_API_PARAMS[&#039;temperature&#039;] = 0.5\n        header = &quot;&quot;&quot;Write detailed summary of the provided context \\n\\nContext:\\n &quot;&quot;&quot;\n        question = &quot;&quot;\n    else:\n        MAX_SECTION_LEN = 1000\n        
COMPLETIONS_API_PARAMS[&#039;max_tokens&#039;] = 300\n        COMPLETIONS_API_PARAMS[&#039;temperature&#039;] = 0.0\n        header = &quot;&quot;&quot;Answer the question in detail as truthfully as possible, and if the answer is not contained within the text below, say &quot;I don&#039;t know.&quot;\\n\\nContext:\\n &quot;&quot;&quot;\n\n    for _, section_index in most_relevant_document_sections:\n        # Add contexts until we run out of space.\n        document_section = df[section_index].page_content\n        chosen_sections_len += len(document_section) * 0.25 + separator_len\n\n        if chosen_sections_len &gt; MAX_SECTION_LEN:\n            break\n\n        chosen_sections.append(SEPARATOR + document_section.replace(&quot;\\n&quot;, &quot; &quot;))\n        chosen_sections_indexes.append(str(section_index))\n\n    # Useful diagnostic information\n    print(f&quot;Selected {len(chosen_sections)} document sections:&quot;)\n    print(&quot;\\n&quot;.join(chosen_sections_indexes))\n\n    return header + &quot;&quot;.join(chosen_sections) + &quot;\\n\\n Q: &quot; + question + &quot;\\n A:&quot;<\/code><\/pre>\n\n\n\n<p>We use the dictionary &#8216;COMPLETIONS_API_PARAMS&#8217; to pass parameters when calling the ChatGPT API. In the function, you can observe that we pass different parameters for different categories of questions. You can always add more conditions that suit your use case. In the loop over most_relevant_document_sections, we multiply len(document_section) by 0.25 to estimate the token count, assuming roughly 4 characters per token, so that we stay within MAX_SECTION_LEN. 
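<\/p>

<p>The budget-filling loop can be distilled into a few lines. The following standalone sketch (illustrative names, not the tutorial&#8217;s function) shows sections being added greedily, best match first, until the estimated token budget is spent:<\/p>

```python
def pack_sections(sections, max_len=1000, chars_per_token=4):
    # Greedily take sections (assumed sorted best-first) until the
    # estimated token budget is exhausted (tokens ~ characters / 4).
    chosen, used = [], 0.0
    for section in sections:
        used += len(section) / chars_per_token
        if used > max_len:
            break
        chosen.append(section)
    return chosen

sections = ["a" * 2000, "b" * 2000, "c" * 2000]  # ~500 estimated tokens each
packed = pack_sections(sections, max_len=600)    # only the first section fits
```

<p>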
<\/p>\n\n\n\n<h3 id=\"final-step-answering-user-questions-with-pdf-gpt\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Final step: Answering User Questions<\/strong> with PDF GPT<\/h3>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"language-python\" data-line=\"\">def answer_query_with_context(\n        query,\n        df,\n        document_embeddings, history,\n        openchat, show_prompt=True\n):\n    history = history or []\n    prompt = construct_prompt(\n        query,\n        document_embeddings,\n        df\n    )\n\n    if show_prompt:\n        print(prompt)\n    openchat = openchat or [{&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You are a Q&amp;A assistant&quot;}]\n    openchat.append({&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: prompt})\n    response = openai.ChatCompletion.create(\n        messages=openchat,\n        **COMPLETIONS_API_PARAMS\n    )\n    openchat.pop()\n    openchat.append({&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: query})\n    print(COMPLETIONS_API_PARAMS)\n    output = response[&quot;choices&quot;][0][&quot;message&quot;][&quot;content&quot;].replace(&#039;\\n&#039;, &#039;&lt;br&gt;&#039;)\n    openchat.append({&quot;role&quot;: &quot;assistant&quot;, &quot;content&quot;: output})\n    history.append((query, output))\n    return history, history, openchat, &quot;&quot;<\/code><\/pre>\n\n\n\n<p>Looking at the function, it is evident that it uses the constructed prompt, the document embeddings, and the chat history to generate appropriate responses via an OpenAI API call to ChatGPT. To understand this function better, let&#8217;s follow the Gradio interface code.<\/p>\n\n\n\n<div class=\"wp-block-buttons alignwide is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-16018d1d wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-circular\"><a class=\"wp-block-button__link has-text-align-center 
wp-element-button\" href=\"https:\/\/chatclient.ai\/auth\/sign-up\" target=\"_blank\" rel=\"noreferrer noopener\">TRY CHATCLIENT for free NOW<\/a><\/div>\n<\/div>\n\n\n\n<h3 id=\"gradio-interface-of-pdf-gpt\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Gradio Interface of PDF GPT<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-prismatic-blocks\"><code class=\"\" data-line=\"\">with gr.Blocks() as app:\n    history_state = gr.State()\n    document = gr.Variable()\n    embeddings = gr.Variable()\n    chat = gr.Variable()\n    with gr.Row():\n        upload = gr.File(label=None, interactive=True, elem_id=&quot;short-upload-box&quot;)\n        ext = gr.Textbox(label=&quot;Progress&quot;)\n\n    with gr.Row():\n        with gr.Column(scale=3):\n            chatbot = gr.Chatbot().style(color_map=(&quot;#075e54&quot;, &quot;grey&quot;))\n\n    with gr.Row():\n        message = gr.Textbox(label=&quot;What&#039;s on your mind??&quot;,\n                             placeholder=&quot;What&#039;s the answer to life, the universe, and everything?&quot;,\n                             lines=1)\n        submit = gr.Button(value=&quot;Send&quot;, variant=&quot;secondary&quot;).style(full_width=False)\n\n    upload.change(extract_text_from_pdf, inputs=[upload], outputs=[document, embeddings, ext])\n    message.submit(answer_query_with_context, inputs=[message, document, embeddings, history_state, chat],\n                   outputs=[chatbot, history_state, chat, message])\n    submit.click(answer_query_with_context, inputs=[message, document, embeddings, history_state, chat],\n                 outputs=[chatbot, history_state, chat, message])\nif __name__ == &quot;__main__&quot;:\n    app.queue().launch(debug=True)<\/code><\/pre>\n\n\n\n<p>gr.Blocks() sets up the user interface of PDF GPT. It includes a file upload box, a textbox to display the progress of the document embedding, a textbox for the user to input questions, a submit button, and a chatbot 
interface to display the chat. <br><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"462\" src=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-1024x462.png\" alt=\"\" class=\"wp-image-4137\" srcset=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-1024x462.png 1024w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-300x135.png 300w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-768x346.png 768w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-1536x692.png 1536w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-380x171.png 380w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-800x361.png 800w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056-1160x523.png 1160w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-155056.png 1912w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p><br>upload.change() updates the document and embeddings whenever a new PDF file is uploaded: the function <code class=\"\" data-line=\"\">extract_text_from_pdf<\/code> is called with the file object as input, and its outputs are <code class=\"\" data-line=\"\">[document, embeddings, ext]<\/code>.<br><br>message.submit() calls the answer_query_with_context function when a user submits a query by pressing the &#8216;ENTER&#8217; key. This function takes <code class=\"\" data-line=\"\">[message, document, embeddings, history_state, chat]<\/code> as input. At the beginning of the Gradio code, we initialized four variables. 
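<\/p>

<p>Concretely, the conversation state threaded through these variables looks roughly like this (the values are illustrative):<\/p>

```python
# Shape of the state used by answer_query_with_context (values illustrative).
openchat = [
    {"role": "system", "content": "You are a Q&A assistant"},
    {"role": "user", "content": "What is PDF GPT?"},
    {"role": "assistant", "content": "A chat assistant for your PDFs."},
]
# history holds (query, output) pairs; it is what the gr.Chatbot displays.
history = [("What is PDF GPT?", "A chat assistant for your PDFs.")]
```

<p>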
The <code class=\"\" data-line=\"\">history_state<\/code> is initialized as <code class=\"\" data-line=\"\">gr.State()<\/code>, and it stores the chat history of the current session.<br><br>history is initialized when the chat session starts, and the answer_query_with_context function updates it with each (query, output) pair. The query is sent to the answer_query_with_context function, where it is converted into a suitable prompt using the template and then appended to <code class=\"\" data-line=\"\">openchat<\/code> as a message. This message, along with the parameters, is passed to the text generator (openai.ChatCompletion.create()) to obtain a response.<br><br>message.submit() and submit.click() serve the same purpose: one triggers the function when the &#8216;ENTER&#8217; key is pressed, while the other triggers when the submit button is clicked.<br><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"481\" src=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-1024x481.png\" alt=\"PDF GPT demo\" class=\"wp-image-4139\" srcset=\"https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-1024x481.png 1024w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-300x141.png 300w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-768x360.png 768w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-1536x721.png 1536w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-380x178.png 380w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-800x375.png 800w, https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829-1160x544.png 1160w, 
https:\/\/chatclient.ai\/blog\/wp-content\/uploads\/2023\/07\/Screenshot-2023-07-08-160829.png 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 id=\"conclusion\" class=\"wp-block-heading is-style-cnvs-heading-numbered\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>This article provides a step-by-step, comprehensive explanation of how you can build your own PDF GPT. It also covers the components of the chat assistant: LangChain, OpenAI, the GPT-3.5-turbo model (ChatGPT), Gradio for the interface, and tiktoken for tokenization. I suggest you trace the code backward from the Gradio interface, which will help you gain intuition.<\/p>\n\n\n\n<p>Some exciting reads and relevant resources on LangChain and Artificial Intelligence:<br><br><a href=\"https:\/\/lancerninja.com\/document-summarization-with-langchain\/\" target=\"_blank\" rel=\"noopener\" title=\"\">Document Summarization using Langchain<\/a><br><br><a href=\"https:\/\/lancerninja.com\/unlocking-quick-data-insights-with-pandas-and-csv-agents-effortlessly\/\" target=\"_blank\" rel=\"noopener\" title=\"\">Unlocking Quick Data Insights With Pandas and CSV Agents Of LangChain<\/a><br><br><a href=\"https:\/\/www.pinecone.io\/learn\/series\/langchain\/langchain-intro\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">LangChain: Introduction| Pinecone<\/a><br><\/p>\n","protected":false},"excerpt":{"rendered":"Introduction In this article, we will understand what PDF GPT is and how to build your 
personal&hellip;\n","protected":false},"author":1,"featured_media":4169,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_coblocks_attr":"","_coblocks_dimensions":"","_coblocks_responsive_height":"","_coblocks_accordion_ie_support":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[32,38],"tags":[34,35,37],"class_list":{"0":"post-4126","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-langchain","8":"category-openai","9":"tag-gpt-3-5","10":"tag-langchain","11":"tag-openai"},"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/posts\/4126","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/comments?post=4126"}],"version-history":[{"count":4,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/posts\/4126\/revisions"}],"predecessor-version":[{"id":5595,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/posts\/4126\/revisions\/5595"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/media\/4169"}],"wp:attachment":[{"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/media?parent=4126"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/categories?post=4126"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/chatclient.ai\/blog\/wp-json\/wp\/v2\/tags?post=4126"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}
}