AI Document Assistant via Telegram + Supabase

Workflow Overview

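This workflow contains 28 nodes. It turns a Telegram bot into a document assistant: PDF files sent to the bot are embedded with Google Gemini and stored in a Supabase vector table, and text messages are answered by an AI agent that can search those stored documents.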
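
The workflow's `Command Router` Switch node decides which branch handles each incoming Telegram update: `document`, `text`, or the `extra` fallback. The sketch below mirrors that decision in plain Python. The function name `route_update` is hypothetical, and the sketch assumes the first matching rule wins; it is an illustration of the routing idea, not the node's actual implementation.

```python
from typing import Any, Dict


def route_update(update: Dict[str, Any]) -> str:
    """Return the branch name a Telegram update would take.

    Mirrors the two "exists" rules of the Command Router node:
    message.document -> "document", message.text -> "text",
    anything else -> "extra" (the fallback output).
    """
    message = update.get("message") or {}
    if message.get("document") is not None:
        return "document"
    if message.get("text") is not None:
        return "text"
    return "extra"
```

For example, `route_update({"message": {"text": "/start"}})` takes the `text` branch, while an update carrying only a photo falls through to `extra`, where the workflow replies with its "Unsupported command or file" message.
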
Workflow Source Code

{
  "id": "LL0TBxEbXoK2zhqp",
  "meta": {
    "instanceId": "af80dcc2dbd3882359ca17a5fe5b2d4bd4ca3cf3cbe39546ecc263e2e97807e5",
    "templateId": "self-building-ai-agent",
    "templateCredsSetupCompleted": true
  },
  "name": "AI Document Assistant via Telegram + Supabase",
  "tags": [
    {
      "id": "Fo1OtHUY0RXxPbjJ",
      "name": "google-gemini",
      "createdAt": "2025-05-01T23:10:32.399Z",
      "updatedAt": "2025-05-01T23:10:32.399Z"
    },
    {
      "id": "HcgCSAB27xdCFyCf",
      "name": "vectorstore",
      "createdAt": "2025-05-01T23:10:13.148Z",
      "updatedAt": "2025-05-01T23:10:13.148Z"
    },
    {
      "id": "NFkP0TdshXJdwIOG",
      "name": "chatbot",
      "createdAt": "2025-05-01T23:09:53.855Z",
      "updatedAt": "2025-05-01T23:09:53.855Z"
    },
    {
      "id": "QXeMQNrN4XlEXs1I",
      "name": "telegram",
      "createdAt": "2025-05-01T23:09:23.634Z",
      "updatedAt": "2025-05-01T23:09:23.634Z"
    },
    {
      "id": "RLZgltwJo60sK1Dm",
      "name": "embeddings",
      "createdAt": "2025-05-01T23:10:20.621Z",
      "updatedAt": "2025-05-01T23:10:20.621Z"
    },
    {
      "id": "fMH2im2pHJBOzkXp",
      "name": "document-qa",
      "createdAt": "2025-05-01T23:10:07.948Z",
      "updatedAt": "2025-05-01T23:10:07.948Z"
    },
    {
      "id": "ghpuX9kkAqpLyIVR",
      "name": "n8n-ai",
      "createdAt": "2025-05-01T23:10:38.373Z",
      "updatedAt": "2025-05-01T23:10:38.373Z"
    },
    {
      "id": "tSHEttl48VrqMYiV",
      "name": "supabase",
      "createdAt": "2025-05-01T23:10:16.583Z",
      "updatedAt": "2025-05-01T23:10:16.583Z"
    }
  ],
  "nodes": [
    {
      "id": "0213dfab-a1b2-42c9-9ab1-8a0f1de4c4c0",
      "name": "Google Gemini Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "position": [
        480,
        40
      ],
      "parameters": {
        "options": {},
        "modelName": "models/gemini-2.5-flash-preview-04-17"
      },
      "credentials": {
        "googlePalmApi": {
          "id": "QuysglXiB421WI90",
          "name": "Google Gemini(PaLM) Api account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "9c166f83-8ea4-4dc7-8ea2-92ec186c9f32",
      "name": "OpenWeatherMap",
      "type": "n8n-nodes-base.openWeatherMapTool",
      "position": [
        740,
        -100
      ],
      "parameters": {
        "cityName": "={{ /*n8n-auto-generated-fromAI-override*/ $fromAI('City', ``, 'string') }}"
      },
      "credentials": {
        "openWeatherMapApi": {
          "id": "MCzSGdWHBJE7l1aN",
          "name": "OpenWeatherMap account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "aa0abeff-b5e9-497b-9d9c-8f79721a5c11",
      "name": "AI Agent",
      "type": "@n8n/n8n-nodes-langchain.agent",
      "position": [
        480,
        -320
      ],
      "parameters": {
        "text": "={{ $json.message.text }}",
        "options": {
          "systemMessage": "=4. If the user sends you a message starting with a / sign, it is a Telegram bot command. For example, all users send the /start command as their first message. Try to figure out what these commands mean and reply accordingly.
Users can only send PDF files and text messages; if they send anything else, let them know that the type is not supported.
At the start, let them know that they can ask questions about the PDF files they have sent. You can use your own capabilities as well.
Generate a detailed, well-structured response.
Format the response strictly using Telegram's supported HTML syntax. Use tags like <b>, <i>, <u>, <s>, <span class=\"tg-spoiler\">, <code>, <pre> (with optional <code class=\"language-...\"> inside), <a href=\"...\">, and <blockquote> where appropriate.

Structure the content logically using paragraphs and distinct sections. **Be mindful that this text might need to be split into multiple messages due to character limits (Telegram's limit is around 4096 characters per message). Try to make sections or paragraphs relatively self-contained where possible to facilitate splitting.**

**Ensure all <, >, and & symbols within the *text content* (i.e., not part of an HTML tag or entity) are replaced with the corresponding HTML entities: < with &lt;, > with &gt;, and & with &amp;.**

Maintain proper nesting of HTML tags according to Telegram's rules. While the final splitting will be handled by a script, aim for a structure that is easy to break into logical parts without leaving tags improperly open mid-message."
        },
        "promptType": "define",
        "hasOutputParser": true
      },
      "typeVersion": 1.9
    },
    {
      "id": "72b85aff-4fe7-4705-a07c-463f381cb806",
      "name": "Telegram Trigger",
      "type": "n8n-nodes-base.telegramTrigger",
      "position": [
        -20,
        100
      ],
      "webhookId": "d4f286b2-8094-40e3-aeb2-813eb1895ecf",
      "parameters": {
        "updates": [
          "message"
        ],
        "additionalFields": {}
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "ea716dba-2856-40a8-ad73-86132f52dda8",
      "name": "Telegram",
      "type": "n8n-nodes-base.telegram",
      "onError": "continueErrorOutput",
      "position": [
        1540,
        -320
      ],
      "webhookId": "137d8d2f-a941-4803-8646-8932525360c3",
      "parameters": {
        "text": "={{ $json.text }}",
        "chatId": "={{ $json.chatId }}",
        "additionalFields": {
          "parse_mode": "HTML",
          "appendAttribution": false
        }
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2,
      "alwaysOutputData": true
    },
    {
      "id": "59a22620-0d26-4e19-940a-5c07efccbdfa",
      "name": "Think",
      "type": "@n8n/n8n-nodes-langchain.toolThink",
      "position": [
        640,
        -100
      ],
      "parameters": {},
      "typeVersion": 1
    },
    {
      "id": "7bb66887-c9c6-4057-bbc0-306d1e20ea12",
      "name": "Embeddings Google Gemini",
      "type": "@n8n/n8n-nodes-langchain.embeddingsGoogleGemini",
      "position": [
        840,
        340
      ],
      "parameters": {
        "modelName": "models/text-embedding-004"
      },
      "credentials": {
        "googlePalmApi": {
          "id": "QuysglXiB421WI90",
          "name": "Google Gemini(PaLM) Api account"
        }
      },
      "typeVersion": 1
    },
    {
      "id": "668db8fd-3d5f-433a-8ccb-4bea237107ce",
      "name": "Default Data Loader",
      "type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
      "position": [
        1220,
        460
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "d1495354-bfc0-4ef1-9102-dc3577580d5b",
      "name": "Recursive Character Text Splitter",
      "type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
      "position": [
        1440,
        620
      ],
      "parameters": {
        "options": {}
      },
      "typeVersion": 1
    },
    {
      "id": "6a8e5ce6-4c52-4ca0-962f-045ea42dac7c",
      "name": "Extract from File",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        1020,
        480
      ],
      "parameters": {
        "options": {},
        "operation": "pdf"
      },
      "typeVersion": 1,
      "alwaysOutputData": true
    },
    {
      "id": "3f4a9da9-0364-4861-a6f3-33b1d5c501e0",
      "name": "Answer questions with a vector store",
      "type": "@n8n/n8n-nodes-langchain.toolVectorStore",
      "position": [
        860,
        -60
      ],
      "parameters": {
        "description": "Use this data if the user's question appears to reference an uploaded file, document content, or specific information that might be stored in prior user documents. If not relevant, ignore this source."
      },
      "typeVersion": 1.1
    },
    {
      "id": "933a93c7-9401-4bac-9b9c-395866b46d61",
      "name": "Supabase Vector Store",
      "type": "@n8n/n8n-nodes-langchain.vectorStoreSupabase",
      "position": [
        760,
        80
      ],
      "parameters": {
        "options": {
          "queryName": "match_documents"
        },
        "tableName": {
          "__rl": true,
          "mode": "list",
          "value": "user_knowledge_base",
          "cachedResultName": "user_knowledge_base"
        }
      },
      "credentials": {
        "supabaseApi": {
          "id": "jq6dt73fwyUImYqH",
          "name": "Supabase account"
        }
      },
      "typeVersion": 1.1
    },
    {
      "id": "5f37c202-a1ca-4ee0-9de0-267349adffbd",
      "name": "Sticky Note10",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        360,
        200
      ],
      "parameters": {
        "color": 5,
        "width": 1625,
        "height": 779,
        "content": "✅ Scenario 2 – Document Upload and Embedding

Flow for downloading a document sent via Telegram, extracting its text, generating embeddings, and inserting them into Supabase Vector Store."
      },
      "typeVersion": 1
    },
    {
      "id": "6e9c1070-90bc-4ab7-a8a0-62461bede708",
      "name": "Sticky Note11",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        360,
        -420
      ],
      "parameters": {
        "color": 5,
        "width": 1625,
        "height": 599,
        "content": "✅ Scenario 1 – Chatbot Interaction

Flow for handling user messages sent to the bot. Includes accessing weather data, answering questions based on user-uploaded documents, and running code using a code execution tool."
      },
      "typeVersion": 1
    },
    {
      "id": "3b211a14-6813-459f-8d23-b40fc0eb4bd6",
      "name": "Telegram - Embedding Complete",
      "type": "n8n-nodes-base.telegram",
      "position": [
        1760,
        320
      ],
      "webhookId": "4eaead72-f9a7-49a3-95ca-b3bc8f6b9a95",
      "parameters": {
        "text": "=✅ Document saved!
Feel free to start asking questions about it.",
        "chatId": "={{ $('Command Router').item.json.message.chat.id }}",
        "additionalFields": {
          "appendAttribution": false
        }
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "05703266-aaed-491d-87a6-ed7f96a9c49a",
      "name": "Supabase - Save Embeddings",
      "type": "@n8n/n8n-nodes-langchain.vectorStoreSupabase",
      "position": [
        1200,
        320
      ],
      "parameters": {
        "mode": "insert",
        "options": {},
        "tableName": {
          "__rl": true,
          "mode": "list",
          "value": "user_knowledge_base",
          "cachedResultName": "user_knowledge_base"
        }
      },
      "credentials": {
        "supabaseApi": {
          "id": "jq6dt73fwyUImYqH",
          "name": "Supabase account"
        }
      },
      "typeVersion": 1.1,
      "alwaysOutputData": false
    },
    {
      "id": "3b7db0e6-b551-4698-921a-306e837ceffc",
      "name": "Command Router",
      "type": "n8n-nodes-base.switch",
      "position": [
        160,
        100
      ],
      "parameters": {
        "rules": {
          "values": [
            {
              "outputKey": "document",
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "loose"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "895b32db-777d-4d8e-b1d3-596cc9863d09",
                    "operator": {
                      "type": "boolean",
                      "operation": "exists",
                      "singleValue": true
                    },
                    "leftValue": "={{ $json.message.document }}",
                    "rightValue": "={{ $json.message.document }}"
                  }
                ]
              },
              "renameOutput": true
            },
            {
              "outputKey": "text",
              "conditions": {
                "options": {
                  "version": 2,
                  "leftValue": "",
                  "caseSensitive": true,
                  "typeValidation": "loose"
                },
                "combinator": "and",
                "conditions": [
                  {
                    "id": "26c12573-8e00-4832-8410-73d2d739c455",
                    "operator": {
                      "type": "boolean",
                      "operation": "exists",
                      "singleValue": true
                    },
                    "leftValue": "={{ $json.message.text }}",
                    "rightValue": ""
                  }
                ]
              },
              "renameOutput": true
            }
          ]
        },
        "options": {
          "fallbackOutput": "extra"
        },
        "looseTypeValidation": true
      },
      "typeVersion": 3.2
    },
    {
      "id": "fa06fc6c-3661-4065-81fc-09f93d6a4a25",
      "name": "Telegram - Download file",
      "type": "n8n-nodes-base.telegram",
      "position": [
        600,
        540
      ],
      "webhookId": "11b8f884-34bc-401c-8978-b28507d96e40",
      "parameters": {
        "fileId": "={{ $('Telegram Trigger').item.json.message.document.file_id }}",
        "resource": "file"
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "756a36aa-187d-48ca-894c-f8c9a79a4794",
      "name": "Aggregate",
      "type": "n8n-nodes-base.aggregate",
      "notes": "This is used to flag the end of progress—no real aggregation.",
      "position": [
        1580,
        320
      ],
      "parameters": {
        "options": {},
        "fieldsToAggregate": {
          "fieldToAggregate": [
            {}
          ]
        }
      },
      "notesInFlow": true,
      "typeVersion": 1
    },
    {
      "id": "3b49f357-5d21-4710-bd32-3218d23b1bd9",
      "name": "Fallback- No formatting",
      "type": "n8n-nodes-base.telegram",
      "notes": "This is used if, even after HTML formatting, Telegram wasn't able to process the text, so we send it without formatting.",
      "position": [
        1740,
        -260
      ],
      "webhookId": "dd2182fe-0b11-4d96-9838-30d60bf8c229",
      "parameters": {
        "text": "={{ $('Manual Mapping').item.json.text }}",
        "chatId": "={{ $('Manual Mapping').item.json.chatId }}",
        "additionalFields": {
          "appendAttribution": false
        }
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "notesInFlow": true,
      "typeVersion": 1.2
    },
    {
      "id": "eafdbacb-17e5-4de6-a4e9-b986140353e5",
      "name": "Split Out",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        1120,
        -320
      ],
      "parameters": {
        "options": {},
        "fieldToSplitOut": "output"
      },
      "typeVersion": 1
    },
    {
      "id": "538be3ed-4bd6-4295-ac11-e4d46b943f5a",
      "name": "Simple Memory",
      "type": "@n8n/n8n-nodes-langchain.memoryBufferWindow",
      "position": [
        540,
        -100
      ],
      "parameters": {
        "sessionKey": "={{ $('Telegram Trigger').item.json.message.from.id }}",
        "sessionIdType": "customKey"
      },
      "typeVersion": 1.3
    },
    {
      "id": "0afca77d-0e08-4f04-a6d3-b107c1dd54f9",
      "name": "Handle formatting and split",
      "type": "n8n-nodes-base.code",
      "notes": "This is used to prevent HTML formatting issues in Telegram and to split long replies into multiple messages.",
      "position": [
        900,
        -320
      ],
      "parameters": {
        "language": "python",
        "pythonCode": "import re

# Text produced by the AI Agent node
gemini_output_text = _('AI Agent').first().json.output
# Regex to match any HTML tag <...>
HTML_TAG_PATTERN = re.compile(r'(<[^>]*?>)', re.IGNORECASE)

# List of UNSUPPORTED Telegram HTML tag names
UNSUPPORTED_TAG_NAMES = [
    'p', 'li', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'ul', 'ol',
    'table', 'thead', 'tbody', 'tr', 'td', 'th', 'div', 'br', 'font'
    # 'span' and 'a' are intentionally NOT listed: listing them would strip every
    # <span> and <a> tag, including the supported <span class=\"tg-spoiler\"> and
    # <a href=\"...\"> forms. Their unsupported variants are handled by the
    # dedicated span and a patterns defined further down.
    # Add more unsupported tags if you encounter them
]

# Regex to match unsupported opening or closing tags based on the names list
# This pattern is simplified and might misinterpret complex attributes
UNSUPPORTED_TAG_PATTERN = re.compile(r'<\/?(' + '|'.join(UNSUPPORTED_TAG_NAMES) + r')\b[^>]*?>', re.IGNORECASE)

# Regex to match a span tag *without* the class=\"tg-spoiler\" attribute
# This tries to capture the tag and its content to remove both
# (no trailing \b after the closing quote: a quote followed by '>' is not a word
# boundary, so the lookahead would never recognise a valid spoiler span)
UNSUPPORTED_SPAN_FULL_PATTERN = re.compile(r'<span(?![^>]*class=\"tg-spoiler\")[^>]*?>.*?<\/span>', re.IGNORECASE | re.DOTALL) # DOTALL allows . to match newlines

# Regex to match an a tag *without* an href attribute
# This tries to capture the tag and its content to remove both
UNSUPPORTED_A_FULL_PATTERN = re.compile(r'<a(?![^>]*href=)[^>]*?>.*?<\/a>', re.IGNORECASE | re.DOTALL)


# --- Cleaning Function (Regex Only) ---

def unescape_common_html_entities(text):
    \"\"\"
    Unescapes a limited set of common HTML entities in text.
    Does NOT use html.unescape for maximum compatibility with \"no external library\" rule.
    \"\"\"
    # Order matters: &amp; must be replaced first!
    text = text.replace('&amp;', '&')
    text = text.replace('&lt;', '<')
    text = text.replace('&gt;', '>')
    text = text.replace('&quot;', '\"')
    text = text.replace('&apos;', \"'\")
    # Add more common entities here if needed, e.g., text = text.replace('&nbsp;', ' ')
    return text


def clean_html_regex_only(html_string):
    \"\"\"
    Cleans HTML string using regex: removes unsupported tags and escapes text content.
    Handles &apos; and other basic entities.
    WARNING: This is a regex-based approach and is NOT as robust as using an HTML parser.
    It may fail on complex or malformed HTML.

    Args:
        html_string (str): The input HTML string.

    Returns:
        str: The cleaned HTML string.
    \"\"\"
    # 1. Remove unsupported tags and their content where specific attributes are missing
    # Process specific full patterns first
    cleaned_text = UNSUPPORTED_SPAN_FULL_PATTERN.sub('', html_string)
    cleaned_text = UNSUPPORTED_A_FULL_PATTERN.sub('', cleaned_text)

    # 2. Remove remaining unsupported opening/closing tags, leaving content behind
    cleaned_text = UNSUPPORTED_TAG_PATTERN.sub('', cleaned_text)

    # 3. Split the remaining string into tags and text segments
    # This pattern captures the tags themselves so we can differentiate them from text
    parts = HTML_TAG_PATTERN.split(cleaned_text)

    cleaned_parts = []
    for part in parts:
        if not part:
            continue

        if HTML_TAG_PATTERN.fullmatch(part):
            # If the part is a tag (matches the full tag pattern)
            # We assume at this point it's a supported tag due to previous removal steps.
            # Keep the tag as is.
            cleaned_parts.append(part)
        else:
            # If the part is text content
            # 1. Unescape common HTML entities (like &apos;) that might be in the text
            unescaped_text = unescape_common_html_entities(part)

            # 2. Escape the literal characters <, >, & that are *in* the text content
            # This ensures only the characters themselves are escaped, not entities.
            # Need to escape & first to avoid issues with '&amp;' if it resulted from unescaping or was original.
            re_escaped_text = unescaped_text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')

            cleaned_parts.append(re_escaped_text)

    # Join the processed parts back into a single string
    return \"\".join(cleaned_parts)

# --- Splitting Logic ---
SPLIT_PATTERN_REGEX_ONLY = re.compile(r'(<\/blockquote>|<\/pre>|\n\n|\s{2,}|(?<=[.!?])\s+|<[a-z]+[^>]*?>|<\/[a-z]+>)', flags=re.IGNORECASE)


def split_telegram_message_regex_only(text, max_length=4096):
    \"\"\"
    Splits text into multiple messages based on character count and basic patterns.
    Operates on text already cleaned by clean_html_regex_only.
    Does NOT guarantee HTML tag integrity across splits due to lack of parsing.

    Args:
        text (str): The input text (preferably cleaned by clean_html_regex_only).
        max_length (int): The maximum length for each message part.

    Returns:
        list: A list of strings, where each string is a message part.
    \"\"\"
    if len(text) <= max_length:
        return [text]

    messages = []
    current_chunk = \"\"

    # Split by the defined pattern
    parts = SPLIT_PATTERN_REGEX_ONLY.split(text)

    for part in parts:
        # Handle parts that are None (can happen with split) or just short whitespace
        if part is None or (not part.strip() and len(part) < 2 and part != '\n\n'):
             if part is not None and len(part) > 0: # Keep meaningful whitespace splits like \n\n
                  if len(current_chunk) + len(part) <= max_length:
                       current_chunk += part
                  else:
                       # Split happens within meaningful whitespace, finalize chunk
                       if current_chunk.strip(): # Only add if chunk has content
                            messages.append(current_chunk.strip())
                       current_chunk = part # Start new chunk with the whitespace
             continue # Skip to next part


        # Check if adding the current part exceeds the max length
        if len(current_chunk) + len(part) > max_length:
            # If the current chunk is empty or only whitespace after stripping,
            # it means the 'part' itself is too long to fit in a new chunk.
            if not current_chunk.strip():
                # Handle very long individual parts (e.g., a huge code block line, a very long word, a single huge tag)
                # Hard split the long part. WARNING: This can break tags, words, or escape sequences.
                while len(part) > max_length:
                    messages.append(part[:max_length])
                    part = part[max_length:]
                if part.strip():
                    current_chunk = part # Remaining part starts a new chunk
                else:
                     current_chunk = \"\" # If remainder is just whitespace, clear
            else:
                # The current part makes the chunk too long, finalize the current chunk
                messages.append(current_chunk.strip())
                # Start a new chunk with the current part
                current_chunk = part # Keep original part for the new chunk

        else:
            # Add the current part to the chunk
            current_chunk += part

    # Add the last chunk
    if current_chunk.strip(): # Only add if the final chunk has content
        messages.append(current_chunk.strip())

    # Clean up any empty messages that might have been created
    messages = [msg for msg in messages if msg.strip()]

    return messages
  
cleaned_html_regex = clean_html_regex_only(gemini_output_text)
message_parts_regex = split_telegram_message_regex_only(cleaned_html_regex)

return {'output': message_parts_regex}"
      },
      "typeVersion": 2
    },
    {
      "id": "dbea9e13-6ad4-4eb3-8da1-9db9e2116283",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        2000,
        -420
      ],
      "parameters": {
        "width": 1960,
        "height": 3520,
        "content": "# 🤖 Telegram AI Assistant for Your Documents (n8n + Supabase + Gemini)

This project transforms a standard **Telegram bot** into your dedicated AI assistant – designed to understand and answer questions based on **your own documents**. It seamlessly integrates the power of **Google Gemini** for advanced language capabilities and **Supabase's vector database** for efficient, intelligent document retrieval. Built entirely within the no-code platform **n8n**, it allows you to deploy a sophisticated document chatbot without writing a single line of code.

Simply upload any PDF document to the bot, and instantly gain the ability to chat with it, querying its contents as if it were a knowledgeable expert on your uploaded files.

---
## 📹 Watch the Bot in Action

[![𝗨𝗻𝗹𝗲𝗮𝘀𝗵𝗶𝗻𝗴 𝗔𝗜 𝗼𝗻 𝗠𝘆 𝗕𝗼𝗼𝗸𝘀𝗵𝗲𝗹𝗳: 𝗙𝗹𝗼𝘄 𝗣𝗿𝗼𝗴𝗿𝗮𝗺𝗺𝗶𝗻𝗴 𝗣𝗼𝘄𝗲𝗿𝘀 𝗮 𝗡𝗲𝘅𝘁-𝗟𝗲𝘃𝗲𝗹 𝗧𝗲𝗹𝗲𝗴𝗿𝗮𝗺 𝗕𝗼𝘁🤖](https://img.youtube.com/vi/r_KGyJApy5M/maxresdefault.jpg)](https://www.youtube.com/watch?v=r_KGyJApy5M)

**▶️ Click the image above to watch a live demo on YouTube.** 

This video provides a live demonstration of the bot's core features and how it interacts. See a quick walkthrough of its capabilities and user flow.

---

## ✨ Ignite Your Workflow: Use Cases

This project empowers two core interactions:

### 1. Conversational AI Interface (User Inquiry → Telegram Bot → Intelligent Answers)
- Users pose questions directly to the Telegram bot.
- The bot generates relevant, informative answers using the cutting-edge capabilities of the Google Gemini LLM.
- Leveraging a powerful vector search mechanism, it can pull specific, contextual information from previously uploaded documents to provide highly relevant and informed responses.
- (Optional) Augment answers with real-time data, like current **weather information**.

### 2. Effortless Document Integration (User Upload PDF → Processing → Searchable Knowledge)
- Users upload a PDF document directly to the bot.
- The workflow automatically parses the document content and converts it into numerical representations called embeddings, using Gemini's embedding models.
- These embeddings, alongside the document's text content, are then securely stored in a dedicated **Supabase vector table**, creating a searchable knowledge base.
- Immediately after successful processing, the document becomes part of the bot's memory, enabling users to ask questions about its contents via the standard chat interface.

---
## 🧠 Core Intelligence Features

- ✅ **Pure No-Code**: Developed and managed entirely within the intuitive [n8n](https://n8n.io) automation platform.
- 📄 **Seamless PDF Integration**: Easily upload and process PDF documents to expand the bot's knowledge.
- 🧠 **Powered by Google Gemini**: Utilizes Gemini for both generating document embeddings and formulating intelligent conversational responses.
- 🗂 **Vector Database Memory (Supabase)**: Employs **Supabase as a robust vector database** for storing and efficiently searching document embeddings, providing the bot with long-term memory about your content.
- **⚡️ Rapid & Private Retrieval**: The vector search allows for swift identification and retrieval of the most relevant document snippets based on the user's query. This approach enhances response speed and significantly improves data privacy, as **the original document content remains securely stored in your Supabase instance, and only the user's query and the retrieved relevant chunks are sent to the LLM for generating a response.**
- 🧹 **Intelligent HTML Post-processing**: Cleans the LLM's responses by removing HTML tags not supported by Telegram while preserving essential formatting and correctly escaping special characters in the text content.
- 📤 **Adaptive Message Chunking**: Splits lengthy AI-generated answers into multiple messages that adhere to Telegram's 4096-character limit, ensuring the full response is delivered cleanly.
- 🌦️ **Dynamic Weather Data**: (Optional) Integrates with OpenWeatherMap to provide current weather information upon request.
- **📝 Note on Usage**: This workflow is designed primarily for **personal, single-user** scenarios. It processes each message independently and **does not include multi-user session management**, making it unsuitable for public deployment where different users require separate conversational contexts. For a session-based Telegram bot implemented in Python, you may refer to this project, which is a multi-model Telegram bot: [https://github.com/mohamadghaffari/gemini-tel-bot](https://github.com/mohamadghaffari/gemini-tel-bot).

---

## 🛠 Getting Started: Setup

### 1. Deploy the Workflow in n8n

- Click the \"Use this workflow\" button on the n8n template page.
- This will open the workflow directly in your n8n instance, ready for configuration.


### 2. Connect Your Services: Configure Credentials

Create API credentials for the following services within your n8n instance:

| Service          | Purpose                          |
|------------------|------------------------------------|
| Telegram API     | Receiving user messages & sending replies |
| Google Gemini    | Generating embeddings & LLM responses |
| Supabase         | Storing & searching document vectors |
| OpenWeatherMap   | (Optional) Fetching weather data    |

### 3. Prepare Your Supabase Knowledge Base

Set up a vector-enabled table in your Supabase project to store your document embeddings. Execute the following SQL commands in your Supabase SQL Editor:

```sql
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents and their embeddings
create table user_knowledge_base (
  id bigserial primary key,
  content text, -- Stores the text chunk from the document
  metadata jsonb, -- Stores document information (e.g., filename, page number)
  embedding vector(768) -- Stores the vector representation (embedding) generated by Gemini. Adjust dimension if using a different model.
);

-- Create a function to perform vector similarity search against your documents
create function match_documents (
  query_embedding vector(768),
  match_count int default null,
  filter jsonb DEFAULT '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    -- Calculate cosine similarity: 1 - cosine distance (using the '<=>' operator provided by pgvector)
    1 - (user_knowledge_base.embedding <=> query_embedding) as similarity
  from user_knowledge_base
  where metadata @> filter -- Optional: filter results based on metadata
  order by user_knowledge_base.embedding <=> query_embedding -- Order by similarity (closest first)
  limit match_count; -- Limit the number of results
end;
$$;
```

This sets up the necessary table and a function to perform vector similarity searches, allowing you to find document chunks most similar to a user's query.

-----

## 📚 Integrated Technologies

This project brings together powerful tools:

  - [n8n](https://n8n.io) – The central hub for workflow automation and integration.
  - [Telegram Bot API](https://core.telegram.org/bots/api) – The communication layer for user interaction.
  - [Supabase](https://supabase.com/) + [pgvector Extension](https://www.google.com/search?q=https://supabase.com/docs/guides/ai/vector-embeddings) – Provides a scalable database with powerful vector search capabilities.
  - [Google Gemini API](https://ai.google.dev/) – The intelligence engine for embeddings and text generation.
  - [OpenWeatherMap API](https://openweathermap.org/api) – (Optional) For adding real-time weather features.

-----
"
      },
      "typeVersion": 1
    },
    {
      "id": "965ba2bd-747d-4718-a76e-9f7d685dcea4",
      "name": "Manual Mapping",
      "type": "n8n-nodes-base.set",
      "position": [
        1320,
        -320
      ],
      "parameters": {
        "options": {},
        "assignments": {
          "assignments": [
            {
              "id": "cdeb5bf1-c91c-44ae-bebd-ab3f4ba2561a",
              "name": "text",
              "type": "string",
              "value": "={{ $json.output }}"
            },
            {
              "id": "7cd7d120-96fa-4539-b343-25bc9b75abb4",
              "name": "chatId",
              "type": "number",
              "value": "={{ $('Command Router').item.json.message.from.id }}"
            }
          ]
        }
      },
      "typeVersion": 3.4
    },
    {
      "id": "c6a315f1-6f0b-4127-b377-b7b12975929f",
      "name": "Unsupported message",
      "type": "n8n-nodes-base.telegram",
      "position": [
        500,
        760
      ],
      "webhookId": "52f3456a-06ef-4799-b245-0293213dcc4b",
      "parameters": {
        "text": "Unsupported command or file. 😓 Please upload a valid PDF document or ask your question regarding your files.",
        "chatId": "={{ $('Command Router').item.json.message.chat.id }}",
        "additionalFields": {
          "appendAttribution": false
        }
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "375bd185-3836-4f25-8708-d6dcd79b2675",
      "name": "Send processing document message",
      "type": "n8n-nodes-base.telegram",
      "position": [
        920,
        720
      ],
      "webhookId": "32ade357-f14b-4d10-91f2-02c8aa6e198e",
      "parameters": {
        "text": "=<b>Processing document...</b>
<b>Please wait...⏳</b>",
        "chatId": "={{ $('Command Router').item.json.message.chat.id }}",
        "additionalFields": {
          "parse_mode": "HTML",
          "appendAttribution": false
        }
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2
    },
    {
      "id": "d01f8b15-e495-46cf-bfdf-20b4399c23d7",
      "name": "Send embedding Started message",
      "type": "n8n-nodes-base.telegram",
      "position": [
        1220,
        660
      ],
      "webhookId": "32ade357-f14b-4d10-91f2-02c8aa6e198e",
      "parameters": {
        "text": "=<b>Document processed ✅ </b> 
<b>Num of pages:</b> {{ $json.numpages }} 
<b>Creator:</b> {{ $json.info.Creator }}
<b>Title:</b> {{ $json.info.Title }} 
<b>Version:</b> {{ $json.version }}",
        "chatId": "={{ $('Command Router').item.json.message.chat.id }}",
        "additionalFields": {
          "parse_mode": "HTML",
          "appendAttribution": false
        }
      },
      "credentials": {
        "telegramApi": {
          "id": "jOxapcl3g1n1HrCE",
          "name": "Telegram account"
        }
      },
      "typeVersion": 1.2
    }
  ],
  "active": true,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "749ec7d0-e135-478a-b02e-9241dbf4ab68",
  "connections": {
    "Think": {
      "ai_tool": [
        [
          {
            "node": "AI Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "AI Agent": {
      "main": [
        [
          {
            "node": "Handle formatting and split",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Telegram": {
      "main": [
        [],
        [
          {
            "node": "Fallback- No formatting",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Aggregate": {
      "main": [
        [
          {
            "node": "Telegram - Embedding Complete",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out": {
      "main": [
        [
          {
            "node": "Manual Mapping",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Simple Memory": {
      "ai_memory": [
        [
          {
            "node": "AI Agent",
            "type": "ai_memory",
            "index": 0
          }
        ]
      ]
    },
    "Command Router": {
      "main": [
        [
          {
            "node": "Telegram - Download file",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "AI Agent",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Unsupported message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Manual Mapping": {
      "main": [
        [
          {
            "node": "Telegram",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenWeatherMap": {
      "ai_tool": [
        [
          {
            "node": "AI Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    },
    "Telegram Trigger": {
      "main": [
        [
          {
            "node": "Command Router",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract from File": {
      "main": [
        [
          {
            "node": "Supabase - Save Embeddings",
            "type": "main",
            "index": 0
          },
          {
            "node": "Send embedding Started message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Default Data Loader": {
      "ai_document": [
        [
          {
            "node": "Supabase - Save Embeddings",
            "type": "ai_document",
            "index": 0
          }
        ]
      ]
    },
    "Supabase Vector Store": {
      "ai_vectorStore": [
        [
          {
            "node": "Answer questions with a vector store",
            "type": "ai_vectorStore",
            "index": 0
          }
        ]
      ]
    },
    "Embeddings Google Gemini": {
      "ai_embedding": [
        [
          {
            "node": "Supabase - Save Embeddings",
            "type": "ai_embedding",
            "index": 0
          },
          {
            "node": "Supabase Vector Store",
            "type": "ai_embedding",
            "index": 0
          }
        ]
      ]
    },
    "Google Gemini Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "AI Agent",
            "type": "ai_languageModel",
            "index": 0
          },
          {
            "node": "Answer questions with a vector store",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Telegram - Download file": {
      "main": [
        [
          {
            "node": "Extract from File",
            "type": "main",
            "index": 0
          },
          {
            "node": "Send processing document message",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Supabase - Save Embeddings": {
      "main": [
        [
          {
            "node": "Aggregate",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Handle formatting and split": {
      "main": [
        [
          {
            "node": "Split Out",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send embedding Started message": {
      "main": [
        []
      ]
    },
    "Recursive Character Text Splitter": {
      "ai_textSplitter": [
        [
          {
            "node": "Default Data Loader",
            "type": "ai_textSplitter",
            "index": 0
          }
        ]
      ]
    },
    "Answer questions with a vector store": {
      "ai_tool": [
        [
          {
            "node": "AI Agent",
            "type": "ai_tool",
            "index": 0
          }
        ]
      ]
    }
  }
}
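The `match_documents` SQL function embedded in the workflow's setup note ranks rows by cosine similarity, optionally filters them by metadata, and caps the result count. The sketch below mirrors that logic in plain Python so the ranking is easier to follow. It is an illustration under stated assumptions, not the code Supabase or n8n runs: the helper names are hypothetical, the sample vectors have 3 dimensions rather than 768, and the metadata filter only compares top-level keys, whereas the `@>` operator also handles nested containment.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity, i.e. 1 minus the cosine distance that pgvector's <=> returns."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_documents(rows, query_embedding, match_count=None, metadata_filter=None):
    """Hypothetical in-memory mirror of the SQL function of the same name."""
    metadata_filter = metadata_filter or {}
    # Simplified stand-in for `metadata @> filter`: top-level keys only.
    candidates = [
        row for row in rows
        if all(row["metadata"].get(key) == value
               for key, value in metadata_filter.items())
    ]
    scored = [
        {
            "id": row["id"],
            "content": row["content"],
            "metadata": row["metadata"],
            "similarity": cosine_similarity(row["embedding"], query_embedding),
        }
        for row in candidates
    ]
    # Highest similarity first, equivalent to ordering by ascending distance.
    scored.sort(key=lambda row: row["similarity"], reverse=True)
    # As with `limit null` in SQL, a match_count of None returns every row.
    return scored if match_count is None else scored[:match_count]


if __name__ == "__main__":
    # Toy rows; real rows would hold 768-dimensional embeddings.
    sample_rows = [
        {"id": 1, "content": "alpha", "metadata": {"file": "a.pdf"}, "embedding": [1.0, 0.0, 0.0]},
        {"id": 2, "content": "beta", "metadata": {"file": "b.pdf"}, "embedding": [0.0, 1.0, 0.0]},
        {"id": 3, "content": "gamma", "metadata": {"file": "a.pdf"}, "embedding": [1.0, 1.0, 0.0]},
    ]
    for match in match_documents(sample_rows, [1.0, 0.0, 0.0], match_count=2):
        print(match["id"], round(match["similarity"], 4))
```

In the workflow itself, this ranking happens inside Postgres, which avoids pulling every embedding into the calling process.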
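Features

Automatic detection of new emails
AI-powered content analysis
Custom classification rules
Batch processing capability
Detailed processing logs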
Technical Analysis

Node Types and Roles

@n8n/n8n-nodes-langchain.lmChatGoogleGemini
openWeatherMapTool
@n8n/n8n-nodes-langchain.agent
telegramTrigger
telegram
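To see which node types a workflow export uses and how its nodes feed into one another, the JSON can be inspected directly. The sketch below is a hedged illustration: `count_node_types` and `downstream_nodes` are hypothetical helper names, and `WORKFLOW_EXCERPT` contains only a handful of nodes and connections copied from the export above, so the counts do not describe the full workflow. Some connection targets in the excerpt, such as `Telegram`, are not included in its `nodes` list.

```python
from collections import Counter

# Small excerpt of the workflow export: only nodes whose "type" appears above.
WORKFLOW_EXCERPT = {
    "nodes": [
        {"name": "Google Gemini Chat Model",
         "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini"},
        {"name": "Manual Mapping", "type": "n8n-nodes-base.set"},
        {"name": "Unsupported message", "type": "n8n-nodes-base.telegram"},
        {"name": "Send processing document message", "type": "n8n-nodes-base.telegram"},
        {"name": "Send embedding Started message", "type": "n8n-nodes-base.telegram"},
    ],
    "connections": {
        "Split Out": {"main": [[{"node": "Manual Mapping", "type": "main", "index": 0}]]},
        "Manual Mapping": {"main": [[{"node": "Telegram", "type": "main", "index": 0}]]},
        "Telegram": {"main": [[], [{"node": "Fallback- No formatting", "type": "main", "index": 0}]]},
    },
}


def count_node_types(workflow):
    """Tally how many nodes of each type the export contains."""
    return Counter(node["type"] for node in workflow.get("nodes", []))


def downstream_nodes(workflow, start, connection_type="main"):
    """Breadth-first walk of the connections map, starting at one node name."""
    reached = []
    queue = [start]
    while queue:
        name = queue.pop(0)
        outputs = workflow.get("connections", {}).get(name, {}).get(connection_type, [])
        # Each node has a list of outputs; each output is a list of targets.
        for output in outputs:
            for target in output:
                if target["node"] != start and target["node"] not in reached:
                    reached.append(target["node"])
                    queue.append(target["node"])
    return reached


if __name__ == "__main__":
    for node_type, count in count_node_types(WORKFLOW_EXCERPT).most_common():
        print(count, node_type)
    print(downstream_nodes(WORKFLOW_EXCERPT, "Split Out"))
```

Passing `connection_type="ai_tool"` or another key from the connections map would follow the AI sub-node links instead of the main data path.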
Complexity Assessment

Configuration difficulty:
★★★★☆
Maintenance difficulty:
★★☆☆☆
Extensibility:
★★★★☆
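Implementation Guide

Prerequisites

A valid Gmail account
Access to the n8n platform
Google API credentials
A subscription to an AI classification service
Configuration Steps

Import the workflow JSON file into n8n
Configure the authentication details for the Gmail node
Set the API key for the AI classifier
Customize the classification rules and label mappings
Run a test execution of the workflow
Configure a scheduled trigger (optional)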
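Key Parameters

Parameter	Default	Description
maxEmails	50	Maximum number of emails processed per run
confidenceThreshold	0.8	Classification confidence threshold
autoLabel	true	Whether labels are added automatically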
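Best Practices

Optimization Tips

Update the AI classification model regularly to improve accuracy
Adjust the processing batch size to match email volume
Set a reasonable classification confidence threshold
Periodically remove outdated classification rules
Security Considerations

Store API keys and authentication details securely
Restrict access to the workflow
Review processing logs regularly
Enable two-factor authentication to protect the Gmail account
Performance Optimization

Use incremental processing to avoid repeated work
Cache frequently accessed data
Process multiple email classification tasks in parallel
Monitor system resource usage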
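Troubleshooting

Common Issues

Emails are not classified correctly

Check the AI classifier's confidence threshold setting; lower the threshold as appropriate or update the training data.
Gmail authentication fails

Confirm that the Google API credentials are valid and have the correct permission scopes, then repeat the OAuth authorization.
Debugging Tips

Enable verbose logging to inspect the execution of each step
Use test emails to verify the classification logic
Check the network connection and the status of the API services
Step through the workflow to locate the failing node
Error Handling

The workflow includes the following error-handling mechanisms:
Automatic retry on network timeout (up to 3 times)
Logging and alerting for API errors
Isolation of emails that fail processing
Rollback on exceptional conditions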
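
The retry-on-timeout behavior named in the list above can be sketched generically. This is an illustration only and is not taken from the workflow JSON: `call_with_retry` is a hypothetical helper, "up to 3 times" is assumed to mean three attempts in total, and the growing delay between attempts is an added assumption.

```python
import time


def call_with_retry(operation, max_attempts=3, base_delay=1.0, retry_on=(TimeoutError,)):
    """Run operation(), retrying on the given exceptions up to max_attempts in total."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < max_attempts:
                # Wait a little longer after each failed attempt (assumed policy).
                time.sleep(base_delay * attempt)
    # All attempts failed: surface the last error to the caller.
    raise last_error


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_request():
        # Simulated call that times out twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("simulated network timeout")
        return "ok"

    print(call_with_retry(flaky_request, base_delay=0.0), attempts["count"])
```

Errors outside `retry_on` propagate immediately, which keeps non-transient failures such as invalid credentials from being retried.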