Image-Based Data Extraction API using Gemini AI
工作流概述
这是一个包含9个节点的复杂工作流,主要用于自动化处理各种任务。
工作流源代码
{
"id": "YKZBEx4DTf0KGEBR",
"meta": {
"instanceId": "f5267db717c7383a3924a6083f6b9950be64cf36e2b4e9421d42eb2121922a14"
},
"name": "Image-Based Data Extraction API using Gemini AI",
"tags": [],
"nodes": [
{
"id": "e3448003-5c62-4da6-8fcc-6817915dcbb8",
"name": "Webhook",
"type": "n8n-nodes-base.webhook",
"position": [
40,
40
],
"webhookId": "18118afb-7fd2-47a5-a474-50813c5b20c8",
"parameters": {
"path": "data-extractor",
"options": {},
"responseMode": "responseNode"
},
"typeVersion": 2
},
{
"id": "3682c6bf-3442-4fba-ab6c-ae29e361ef93",
"name": "Respond to Webhook",
"type": "n8n-nodes-base.respondToWebhook",
"position": [
1180,
40
],
"parameters": {
"options": {}
},
"typeVersion": 1.1
},
{
"id": "bfa352d0-68a9-4f33-be54-254a5df22664",
"name": "Get image from URL",
"type": "n8n-nodes-base.httpRequest",
"position": [
280,
40
],
"parameters": {
"url": "={{ $json.body.image_url }}",
"options": {}
},
"typeVersion": 4.2
},
{
"id": "c6c8de12-08dc-42e8-9c0e-86e04c7cacc0",
"name": "Call Gemini API (Flash Lite) with Image",
"type": "n8n-nodes-base.httpRequest",
"position": [
760,
40
],
"parameters": {
"url": "=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-lite:generateContent",
"method": "POST",
"options": {},
"jsonBody": "={
\"contents\": [
{
\"role\": \"user\",
\"parts\": [
{
\"inlineData\": {
\"data\": \"{{$json.data1}}\",
\"mimeType\": \"image/jpeg\"
}
}
]
},
{
\"role\": \"user\",
\"parts\": [
{
\"text\": \"check this\"
}
]
}
],
\"systemInstruction\": {
\"role\": \"user\",
\"parts\": [
{
\"text\": \"{{ $('Webhook').first().json.body.Requirement}}\"
}
]
},
\"generationConfig\": {
\"temperature\": 1,
\"topK\": 40,
\"topP\": 0.95,
\"maxOutputTokens\": 8192,
\"responseMimeType\": \"application/json\",
\"responseSchema\": {
\"type\": \"object\",
\"properties\": {{ $('Webhook').first().json.body.properties.toJsonString()}}
}
}
}
",
"sendBody": true,
"specifyBody": "json",
"authentication": "predefinedCredentialType",
"nodeCredentialType": "googlePalmApi"
},
"credentials": {
"googlePalmApi": {
"id": "MhMVz0OkKPSPX2Wn",
"name": "Gemini API Srinivasan Online"
}
},
"typeVersion": 4.2
},
{
"id": "06b0f807-aeba-44d6-bb1d-dfa1d50e1082",
"name": "Edit fields to output required data alone",
"type": "n8n-nodes-base.set",
"position": [
980,
40
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "4a2f1343-4b5d-4de8-b04b-5640e0a38d27",
"name": "result",
"type": "string",
"value": "={{ $json.candidates[0].content.parts[0].text.parseJson()}}"
}
]
}
},
"typeVersion": 3.4
},
{
"id": "8c69dba2-f67c-4f8b-be18-02a414fd2ead",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
20,
280
],
"parameters": {
"color": 5,
"width": 820,
"height": 420,
"content": "## Sample API Call (cURL)
```
curl --request GET \
--url https://your_domain.com/webhook/data-extractor \
--data '{
\"image_url\":\"https://www.immihelp.com/nri/images/sample-pan-card-front.jpg\",
\"Requirement\":\"extract the details from the image\",
\"properties\": {
\"PAN Number\": {
\"type\": \"string\"
},
\"Name\": {
\"type\": \"string\"
},
\"Date of Birth\": {
\"type\": \"string\"
},
\"Valid\": {
\"type\": \"boolean\"
}
}
}'
```"
},
"typeVersion": 1
},
{
"id": "8839f0d7-306f-4dc2-aca5-6ca529e1a2ff",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
20,
740
],
"parameters": {
"color": 5,
"width": 1240,
"height": 140,
"content": "## Sample Output
```
{
\"result\": \"{\\"Date of Birth\\":\\"23/11/1974\\",\\"Name\\":\\"RAHUL GUPTA\\",\\"PAN Number\\":\\"ABCDE1234F\\",\\"Valid\\":true}\"
}
```"
},
"typeVersion": 1
},
{
"id": "df733e11-f194-4878-a514-47ddc9811281",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
40,
-520
],
"parameters": {
"width": 940,
"height": 440,
"content": "## Convert the workflow into an Endpoint
This n8n workflow provides a ready-to-use API endpoint for extracting structured data from images. The API takes an image URL as input, processes it using an AI-powered OCR model, and returns relevant extracted details in a structured JSON format.
- The workflow converts the image to base64 before processing.
- It utilizes an AI-powered model (Gemini API) for text extraction.
- The output is formatted to include only the required fields.
- You can customize the extraction criteria by modifying the request parameters.
- Supports integration with various applications for automated data entry and processing.
It can be used for various use cases, such as:
- Document OCR (ID cards, invoices, receipts)
- Text Extraction from Images
- Automated Form Processing
- Business Card Data Extraction
Simply send a GET request with an image URL, define the extraction requirements, and receive structured JSON data in response.
"
},
"typeVersion": 1
},
{
"id": "aecf7331-6341-411e-8906-e42fc0ef264a",
"name": "Transform image to base64",
"type": "n8n-nodes-base.extractFromFile",
"position": [
520,
40
],
"parameters": {
"options": {
"encoding": "ascii"
},
"operation": "binaryToPropery",
"destinationKey": "data1"
},
"typeVersion": 1
}
],
"active": true,
"pinData": {},
"settings": {
"executionOrder": "v1"
},
"versionId": "b1fad586-998c-47ce-9921-e59527da029a",
"connections": {
"Webhook": {
"main": [
[
{
"node": "Get image from URL",
"type": "main",
"index": 0
}
]
]
},
"Get image from URL": {
"main": [
[
{
"node": "Transform image to base64",
"type": "main",
"index": 0
}
]
]
},
"Transform image to base64": {
"main": [
[
{
"node": "Call Gemini API (Flash Lite) with Image",
"type": "main",
"index": 0
}
]
]
},
"Call Gemini API (Flash Lite) with Image": {
"main": [
[
{
"node": "Edit fields to output required data alone",
"type": "main",
"index": 0
}
]
]
},
"Edit fields to output required data alone": {
"main": [
[
{
"node": "Respond to Webhook",
"type": "main",
"index": 0
}
]
]
}
}
}
功能特点
- 自动检测新邮件
- AI智能内容分析
- 自定义分类规则
- 批量处理能力
- 详细的处理日志
技术分析
节点类型及作用
- Webhook
- Respondtowebhook
- Httprequest
- Set
- Stickynote
复杂度评估
配置难度:
维护难度:
扩展性:
实施指南
前置条件
- 有效的Gmail账户
- n8n平台访问权限
- Google API凭证
- AI分类服务订阅
配置步骤
- 在n8n中导入工作流JSON文件
- 配置Gmail节点的认证信息
- 设置AI分类器的API密钥
- 自定义分类规则和标签映射
- 测试工作流执行
- 配置定时触发器(可选)
关键参数
| 参数名称 | 默认值 | 说明 |
|---|---|---|
| maxEmails | 50 | 单次处理的最大邮件数量 |
| confidenceThreshold | 0.8 | 分类置信度阈值 |
| autoLabel | true | 是否自动添加标签 |
最佳实践
优化建议
- 定期更新AI分类模型以提高准确性
- 根据邮件量调整处理批次大小
- 设置合理的分类置信度阈值
- 定期清理过期的分类规则
安全注意事项
- 妥善保管API密钥和认证信息
- 限制工作流的访问权限
- 定期审查处理日志
- 启用双因素认证保护Gmail账户
性能优化
- 使用增量处理减少重复工作
- 缓存频繁访问的数据
- 并行处理多个邮件分类任务
- 监控系统资源使用情况
故障排除
常见问题
邮件未被正确分类
检查AI分类器的置信度阈值设置,适当降低阈值或更新训练数据。
Gmail认证失败
确认Google API凭证有效且具有正确的权限范围,重新进行OAuth授权。
调试技巧
- 启用详细日志记录查看每个步骤的执行情况
- 使用测试邮件验证分类逻辑
- 检查网络连接和API服务状态
- 逐步执行工作流定位问题节点
错误处理
工作流包含以下错误处理机制:
- 网络超时自动重试(最多3次)
- API错误记录和告警
- 处理失败邮件的隔离机制
- 异常情况下的回滚操作