Convert Parquet, Avro, ORC & Feather via ParquetReader to JSON
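Workflow Overview
This is a four-node workflow (three functional nodes plus one sticky note) that receives a Parquet, Avro, ORC, or Feather file through a webhook, forwards it to the ParquetReader API, and returns the parsed result as JSON.
Workflow Source Code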
{
"id": "VU0kmvnWzctSFm2M",
"meta": {
"instanceId": "534a4ec070e550167af0cc407c76e029ac0ae69bef901c2f9ef440d79bfa5792"
},
"name": "Convert Parquet, Avro, ORC & Feather via ParquetReader to JSON",
"tags": [
{
"id": "1PTaY70kFjD8F12p",
"name": "Convert",
"createdAt": "2025-03-30T09:38:16.466Z",
"updatedAt": "2025-03-30T09:38:16.466Z"
},
{
"id": "98v0QSKrvfH5dl34",
"name": "Avro",
"createdAt": "2025-03-30T09:38:06.656Z",
"updatedAt": "2025-03-30T09:38:06.656Z"
},
{
"id": "Q0sqo52DKATPDab2",
"name": "ORC",
"createdAt": "2025-03-30T09:38:09.923Z",
"updatedAt": "2025-03-30T09:38:09.923Z"
},
{
"id": "b1s8WFj3TfMpoOQu",
"name": "Feather",
"createdAt": "2025-03-30T09:38:12.227Z",
"updatedAt": "2025-03-30T09:38:12.227Z"
},
{
"id": "fFnESRcynarFqlLf",
"name": "Parquet",
"createdAt": "2025-03-30T09:38:04.286Z",
"updatedAt": "2025-03-30T09:38:04.286Z"
}
],
"nodes": [
{
"id": "651a10dc-1c91-4957-bcdd-3e55d7328f04",
"name": "Send to Parquet API",
"type": "n8n-nodes-base.httpRequest",
"position": [
1100,
440
],
"parameters": {
"url": "https://api.parquetreader.com/parquet?source=n8n",
"options": {
"bodyContentType": "multipart-form-data"
},
"requestMethod": "POST",
"jsonParameters": true,
"sendBinaryData": true,
"binaryPropertyName": "=file0"
},
"typeVersion": 1
},
{
"id": "42a7e623-ca11-4d38-94bb-cfb48d021a5c",
"name": "Webhook",
"type": "n8n-nodes-base.webhook",
"notes": "Example trigger flow:\ncurl -X POST http://localhost:5678/webhook-test/convert \\\n-F \"file=@converted.parquet\"",
"position": [
900,
440
],
"webhookId": "0b1223c9-c117-45f9-9931-909f2db28955",
"parameters": {
"path": "convert",
"options": {
"binaryPropertyName": "file"
},
"httpMethod": "POST",
"responseData": "allEntries",
"responseMode": "lastNode"
},
"notesInFlow": false,
"typeVersion": 2
},
{
"id": "9b87f027-7ef2-40a7-88d7-a0ae9ef73375",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
0,
0
],
"parameters": {
"width": 840,
"height": 580,
"content": "### ✅ **How to Use This Flow**\n#### 📥 Trigger via File Upload\nYou can trigger this flow by sending a `POST` request with a file using **curl**, **Postman**, or **from another n8n flow**.\n#### 🔧 Example (via `curl`):\n```bash\ncurl -X POST http://localhost:5678/webhook-test/convert \\\n-F \"file=@converted.parquet\"\n```\n> Replace `converted.parquet` with your local file path. You can also send Avro, ORC or Feather files.\n#### 🔁 Reuse from Other Flows\nYou can **reuse this flow** by calling the webhook from another n8n workflow using an **HTTP Request** node.\nMake sure to send the file as **form-data** with the field name `file`.\n#### 🔍 What This Flow Does:\n- Receives the uploaded file via webhook (`file`)\n- Sends it to `https://api.parquetreader.com/parquet` as `multipart/form-data` (field name: `file`)\n- Receives parsed data, schema, and metadata\n"
},
"typeVersion": 1
},
{
"id": "06d3e569-8724-48f2-951f-a1af5e0f9362",
"name": "Parse API Response",
"type": "n8n-nodes-base.code",
"position": [
1280,
440
],
"parameters": {
"jsCode": "const item = items[0];\n// Convert `data` (stringified JSON array) → actual array\nif (typeof item.json.data === 'string') {\n  item.json.data = JSON.parse(item.json.data);\n}\n// Convert `meta_data` (stringified JSON object) → actual object\nif (typeof item.json.meta_data === 'string') {\n  item.json.meta_data = JSON.parse(item.json.meta_data);\n}\nreturn [item];\n"
},
"typeVersion": 2
}
],
"active": true,
"pinData": {},
"settings": {
"executionOrder": "v1"
},
"versionId": "c10e1897-094e-42c6-bde9-1724972ada3e",
"connections": {
"Webhook": {
"main": [
[
{
"node": "Send to Parquet API",
"type": "main",
"index": 0
}
]
]
},
"Send to Parquet API": {
"main": [
[
{
"node": "Parse API Response",
"type": "main",
"index": 0
}
]
]
}
}
}
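The logic of the "Parse API Response" node can be tried outside n8n. The sketch below is a plain JavaScript version of the same steps. `parseApiResponse` and the sample payload are hypothetical, made up for illustration; only the field names `data` and `meta_data` come from the workflow.

```javascript
// Standalone sketch of the "Parse API Response" Code node.
// `parseApiResponse` is a hypothetical helper name; in n8n the same
// logic runs directly on `items[0]`.
function parseApiResponse(item) {
  // Convert `data` (stringified JSON array) into an actual array
  if (typeof item.json.data === 'string') {
    item.json.data = JSON.parse(item.json.data);
  }
  // Convert `meta_data` (stringified JSON object) into an actual object
  if (typeof item.json.meta_data === 'string') {
    item.json.meta_data = JSON.parse(item.json.meta_data);
  }
  return [item];
}

// Made-up sample shaped like the fields the node expects.
const sample = {
  json: {
    data: '[{"id": 1}, {"id": 2}]',
    meta_data: '{"rows": 2}',
  },
};
const [parsed] = parseApiResponse(sample);
console.log(parsed.json.data.length, parsed.json.meta_data.rows);
```

Values that are already parsed (not strings) pass through unchanged, which matches the `typeof` guards in the node.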
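Features
- Accepts file uploads through a `POST` webhook at the path `convert`
- Handles Parquet, Avro, ORC, and Feather files
- Forwards the file to `https://api.parquetreader.com/parquet` as `multipart/form-data`
- Converts the stringified `data` and `meta_data` fields of the API response into real JSON values
- Returns the output of the last node to the caller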
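Technical Analysis
Node Types and Roles
- HTTP Request (`n8n-nodes-base.httpRequest`): "Send to Parquet API" posts the uploaded binary to the ParquetReader endpoint
- Webhook (`n8n-nodes-base.webhook`): receives the upload and responds with the output of the last node
- Sticky Note (`n8n-nodes-base.stickyNote`): usage instructions on the canvas; it does not execute
- Code (`n8n-nodes-base.code`): "Parse API Response" parses the `data` and `meta_data` strings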
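Implementation Guide
Prerequisites
- Access to an n8n instance (the examples in the workflow assume `http://localhost:5678`)
- Network access from that instance to `https://api.parquetreader.com`
- A Parquet, Avro, ORC, or Feather file to upload
- No credentials are required: the workflow JSON configures none for any node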
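Configuration Steps
- Import the workflow JSON file into n8n
- Confirm the Webhook node uses the `POST` method, the path `convert`, and the binary property name `file`
- Confirm the "Send to Parquet API" node posts the binary property `file0` to `https://api.parquetreader.com/parquet?source=n8n`
- Test the workflow by uploading a sample file as form-data with the field name `file`
- Activate the workflow to expose the production webhook URL (optional)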
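The workflow's sticky note triggers the webhook with curl; the same upload can be sketched in JavaScript. This assumes Node.js 18 or later (for the global `fetch`, `FormData`, and `Blob`). `buildUploadForm` and `uploadFile` are hypothetical helper names, not part of n8n or ParquetReader; the URL and the field name `file` are taken from the sticky note.

```javascript
// Sketch of a client for the workflow's webhook (Node.js 18+ globals:
// fetch, FormData, Blob). Helper names are hypothetical.
function buildUploadForm(bytes, filename) {
  const form = new FormData();
  // The field name must be `file`, as the sticky note requires.
  form.append('file', new Blob([bytes]), filename);
  return form;
}

async function uploadFile(webhookUrl, bytes, filename) {
  const response = await fetch(webhookUrl, {
    method: 'POST',
    body: buildUploadForm(bytes, filename),
  });
  if (!response.ok) {
    throw new Error(`Webhook responded with HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (not executed here): read a local file and post it.
// const fs = require('node:fs');
// uploadFile('http://localhost:5678/webhook-test/convert',
//   fs.readFileSync('converted.parquet'), 'converted.parquet')
//   .then((result) => console.log(result));
```

Passing the form object as the request body lets `fetch` set the multipart boundary itself, so no `Content-Type` header is set by hand.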
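Key Parameters
| Parameter | Value in this workflow | Description |
|---|---|---|
| `path` (Webhook) | `convert` | URL path segment of the webhook |
| `httpMethod` (Webhook) | `POST` | HTTP method the webhook accepts |
| `options.binaryPropertyName` (Webhook) | `file` | Name used for the binary property that holds the upload |
| `responseMode` (Webhook) | `lastNode` | The webhook responds with the output of the last node |
| `url` (Send to Parquet API) | `https://api.parquetreader.com/parquet?source=n8n` | Endpoint that receives the file |
| `binaryPropertyName` (Send to Parquet API) | `=file0` | Binary property sent as the multipart upload |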
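The webhook path `convert` is reachable under two URL prefixes in n8n: `/webhook-test/` while the editor is listening for a test event, and `/webhook/` once the workflow is active. The small sketch below derives both URLs; `webhookUrls` is a hypothetical helper, and the base URL is the one used in the workflow's curl example.

```javascript
// Build the test and production URLs for an n8n webhook path.
// `webhookUrls` is a hypothetical helper, not an n8n API.
function webhookUrls(baseUrl, path) {
  const base = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    test: `${base}/webhook-test/${path}`,
    production: `${base}/webhook/${path}`,
  };
}

const urls = webhookUrls('http://localhost:5678/', 'convert');
console.log(urls.test);       // http://localhost:5678/webhook-test/convert
console.log(urls.production); // http://localhost:5678/webhook/convert
```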
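Best Practices
Optimization Suggestions
- Use the `webhook-test` URL only while testing in the editor; the curl example in the sticky note targets it
- Keep the form-data field name as `file`, which is what the sticky note and the Webhook node expect
- Check that `data` and `meta_data` arrive as strings before relying on the Code node, since it leaves non-string values untouched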
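Security Considerations
- Uploaded files are sent to a third-party service (`api.parquetreader.com`); avoid uploading data that must not leave your environment
- The Webhook node defines no authentication, so restrict who can reach the webhook URL
- Restrict access to the workflow itself
- Review execution logs regularly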
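Performance Optimization
- The Code node processes only `items[0]`, so send one file per request
- Each file is transferred twice (client to n8n, then n8n to the API), so watch request size limits and timeouts for large files
- Monitor system resource usage on the n8n host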
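Troubleshooting
Common Issues
The webhook returns 404 or is reported as not registered
The `webhook-test` URL only works while the workflow is listening for a test event in the editor. For an active workflow, call the production URL (`/webhook/convert`) instead.
The API call fails or returns no data
Confirm the file is sent as form-data under the field name `file`, so that the Webhook node stores it as the binary property `file0` that "Send to Parquet API" reads.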
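Debugging Tips
- Inspect each node's input and output in the n8n execution view
- Test with a small sample file first
- Check network connectivity and the status of the ParquetReader API
- Execute the workflow node by node to locate the failing step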
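Error Handling
The workflow JSON defines no explicit error handling:
- The "Send to Parquet API" node has no retry or timeout options set
- The "Parse API Response" node calls `JSON.parse` without a `try`/`catch`, so a malformed `data` or `meta_data` string fails the execution
- No error workflow is configured in `settings`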
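The "Parse API Response" node calls `JSON.parse` without a guard, so one option is a defensive variant of that step. The following is a minimal sketch, not the workflow author's code: `safeParse` and `parseAll` are hypothetical helpers, and the `parse_error` field is an assumption made for this example, not something the API returns.

```javascript
// Defensive variant of the parsing step (hypothetical helpers).
function safeParse(value) {
  // Non-string values are passed through unchanged.
  if (typeof value !== 'string') return { ok: true, value };
  try {
    return { ok: true, value: JSON.parse(value) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

function parseAll(items) {
  return items.map((item) => {
    const json = { ...item.json };
    for (const key of ['data', 'meta_data']) {
      if (!(key in json)) continue; // skip fields the response lacks
      const result = safeParse(json[key]);
      if (result.ok) {
        json[key] = result.value;
      } else {
        // Assumed convention: record the failure instead of throwing.
        json.parse_error = `${key}: ${result.error}`;
      }
    }
    return { ...item, json };
  });
}

const out = parseAll([
  { json: { data: '[1, 2]', meta_data: '{not json' } },
]);
console.log(out[0].json.data, 'parse_error' in out[0].json);
```

Inside an n8n Code node the equivalent call would be `return parseAll(items);`, which also handles more than one input item.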