Convert Parquet, Avro, ORC & Feather via ParquetReader to JSON

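Workflow Overview

This is a medium-complexity workflow with 4 nodes. It receives a Parquet, Avro, ORC or Feather file through a webhook, forwards it to the ParquetReader API, and returns the parsed data, schema, and metadata as JSON.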

Workflow Source Code

{
  "id": "VU0kmvnWzctSFm2M",
  "meta": {
    "instanceId": "534a4ec070e550167af0cc407c76e029ac0ae69bef901c2f9ef440d79bfa5792"
  },
  "name": "Convert Parquet, Avro, ORC & Feather via ParquetReader to JSON",
  "tags": [
    {
      "id": "1PTaY70kFjD8F12p",
      "name": "Convert",
      "createdAt": "2025-03-30T09:38:16.466Z",
      "updatedAt": "2025-03-30T09:38:16.466Z"
    },
    {
      "id": "98v0QSKrvfH5dl34",
      "name": "Avro",
      "createdAt": "2025-03-30T09:38:06.656Z",
      "updatedAt": "2025-03-30T09:38:06.656Z"
    },
    {
      "id": "Q0sqo52DKATPDab2",
      "name": "ORC",
      "createdAt": "2025-03-30T09:38:09.923Z",
      "updatedAt": "2025-03-30T09:38:09.923Z"
    },
    {
      "id": "b1s8WFj3TfMpoOQu",
      "name": "Feather",
      "createdAt": "2025-03-30T09:38:12.227Z",
      "updatedAt": "2025-03-30T09:38:12.227Z"
    },
    {
      "id": "fFnESRcynarFqlLf",
      "name": "Parquet",
      "createdAt": "2025-03-30T09:38:04.286Z",
      "updatedAt": "2025-03-30T09:38:04.286Z"
    }
  ],
  "nodes": [
    {
      "id": "651a10dc-1c91-4957-bcdd-3e55d7328f04",
      "name": "Send to Parquet API",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        1100,
        440
      ],
      "parameters": {
        "url": "https://api.parquetreader.com/parquet?source=n8n",
        "options": {
          "bodyContentType": "multipart-form-data"
        },
        "requestMethod": "POST",
        "jsonParameters": true,
        "sendBinaryData": true,
        "binaryPropertyName": "=file0"
      },
      "typeVersion": 1
    },
    {
      "id": "42a7e623-ca11-4d38-94bb-cfb48d021a5c",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "notes": "Example trigger flow:

curl -X POST http://localhost:5678/webhook-test/convert \
  -F \"file=@converted.parquet\"",
      "position": [
        900,
        440
      ],
      "webhookId": "0b1223c9-c117-45f9-9931-909f2db28955",
      "parameters": {
        "path": "convert",
        "options": {
          "binaryPropertyName": "file"
        },
        "httpMethod": "POST",
        "responseData": "allEntries",
        "responseMode": "lastNode"
      },
      "notesInFlow": false,
      "typeVersion": 2
    },
    {
      "id": "9b87f027-7ef2-40a7-88d7-a0ae9ef73375",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        0,
        0
      ],
      "parameters": {
        "width": 840,
        "height": 580,
        "content": "### ✅ **How to Use This Flow**

#### 📥 Trigger via File Upload

You can trigger this flow by sending a `POST` request with a file using **curl**, **Postman**, or **from another n8n flow**.

#### 🔧 Example (via `curl`):
```bash
curl -X POST http://localhost:5678/webhook-test/convert \
-F \"file=@converted.parquet\"
```
> Replace `converted.parquet` with your local file path. You can also send Avro, ORC or Feather files.

#### 🔁 Reuse from Other Flows
You can **reuse this flow** by calling the webhook from another n8n workflow using an **HTTP Request** node.  
Make sure to send the file as **form-data** with the field name `file`.

#### 🔍 What This Flow Does:
- Receives the uploaded file via webhook (`file`)
- Sends it to `https://api.parquetreader.com/parquet` as `multipart/form-data` (field name: `file`)
- Receives parsed data, schema, and metadata
"
      },
      "typeVersion": 1
    },
    {
      "id": "06d3e569-8724-48f2-951f-a1af5e0f9362",
      "name": "Parse API Response",
      "type": "n8n-nodes-base.code",
      "position": [
        1280,
        440
      ],
      "parameters": {
        "jsCode": "const item = items[0];

// Convert `data` (stringified JSON array) → actual array
if (typeof item.json.data === 'string') {
  item.json.data = JSON.parse(item.json.data);
}

// Convert `meta_data` (stringified JSON object) → actual object
if (typeof item.json.meta_data === 'string') {
  item.json.meta_data = JSON.parse(item.json.meta_data);
}

return [item];
"
      },
      "typeVersion": 2
    }
  ],
  "active": true,
  "pinData": {},
  "settings": {
    "executionOrder": "v1"
  },
  "versionId": "c10e1897-094e-42c6-bde9-1724972ada3e",
  "connections": {
    "Webhook": {
      "main": [
        [
          {
            "node": "Send to Parquet API",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Send to Parquet API": {
      "main": [
        [
          {
            "node": "Parse API Response",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}
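
The Parse API Response node above only touches `items[0]`. As a minimal sketch, assuming a request could ever yield more than one item, the same parsing can be applied to every item without mutating the input. The names `parseApiItems` and `sampleItems`, and the `num_rows` key in the sample, are hypothetical and used only for illustration; the real response may contain other fields.

```javascript
// Parse the `data` and `meta_data` fields of every item when they arrive as JSON strings.
// Fields that are missing or already parsed are left unchanged.
function parseApiItems(items) {
  return items.map((item) => {
    const json = { ...item.json };
    for (const key of ['data', 'meta_data']) {
      if (typeof json[key] === 'string') {
        json[key] = JSON.parse(json[key]);
      }
    }
    return { ...item, json };
  });
}

// Hypothetical response shape, for illustration only.
const sampleItems = [
  { json: { data: '[{"id":1},{"id":2}]', meta_data: '{"num_rows":2}' } },
];
const parsedItems = parseApiItems(sampleItems);
console.log(parsedItems[0].json.data, parsedItems[0].json.meta_data);
```

Inside an n8n Code node the equivalent would be `return parseApiItems(items);`. Note that `JSON.parse` still throws on malformed input, as it does in the original node.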

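Features

  • Accepts file uploads through a webhook (`POST` on the path `convert`)
  • Forwards the uploaded file to the ParquetReader API as `multipart/form-data`
  • Handles Parquet, Avro, ORC and Feather files
  • Parses the stringified `data` and `meta_data` fields of the API response into JSON
  • Returns the output of the last node to the caller (`responseMode`: `lastNode`)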

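Technical Analysis

Node Types and Roles

  • HTTP Request (`n8n-nodes-base.httpRequest`): sends the uploaded file to the ParquetReader API
  • Webhook (`n8n-nodes-base.webhook`): receives the file upload that triggers the workflow
  • Sticky Note (`n8n-nodes-base.stickyNote`): documents how to use the workflow
  • Code (`n8n-nodes-base.code`): parses the API response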

Complexity Assessment

Configuration difficulty: ★★★☆☆
Maintenance difficulty: ★★☆☆☆
Extensibility: ★★★★☆

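Implementation Guide

Prerequisites

  • Access to an n8n instance
  • Network access from that instance to `api.parquetreader.com`
  • A Parquet, Avro, ORC or Feather file to upload

The workflow JSON configures no credentials for the API.

Configuration Steps

  1. Import the workflow JSON file into n8n
  2. Check that the Webhook node uses the path `convert` and the method `POST`
  3. Check the URL of the Send to Parquet API node and that it sends the binary property `file0`
  4. Use the test URL (`/webhook-test/convert`) while testing the workflow
  5. Test the workflow by uploading a file
  6. Call the webhook from another workflow with an HTTP Request node (optional)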
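
To test the workflow without `curl`, the upload from the sticky note can be sketched in JavaScript. This is a minimal sketch, assuming Node.js 18 or later (for the global `fetch`, `FormData`, and `Blob`); the function names `buildUploadForm` and `convertFile` are hypothetical, while the field name `file` and the webhook URL come from the curl example in the workflow.

```javascript
// Build the multipart body. The field name `file` matches the curl example (-F "file=@...").
function buildUploadForm(bytes, filename) {
  const form = new FormData();
  form.append('file', new Blob([bytes]), filename);
  return form;
}

// POST the form to the webhook and return the parsed JSON response.
async function convertFile(webhookUrl, bytes, filename) {
  const response = await fetch(webhookUrl, {
    method: 'POST',
    body: buildUploadForm(bytes, filename),
  });
  if (!response.ok) {
    throw new Error(`Webhook responded with HTTP ${response.status}`);
  }
  return response.json();
}

// Example usage (requires a running n8n instance listening on the test URL):
// const fs = require('node:fs/promises');
// const bytes = await fs.readFile('converted.parquet');
// const result = await convertFile(
//   'http://localhost:5678/webhook-test/convert', bytes, 'converted.parquet');
```

The `Content-Type` header is deliberately not set by hand: `fetch` derives it, including the multipart boundary, from the `FormData` body.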

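Key Parameters

Parameter                             Value          Description
path (Webhook)                        convert        Path the webhook listens on
httpMethod (Webhook)                  POST           HTTP method the webhook accepts
binaryPropertyName (Webhook)          file           Name used for the uploaded binary data
responseMode (Webhook)                lastNode       Respond with the output of the last node
binaryPropertyName (HTTP Request)     =file0         Binary property sent to the API
bodyContentType (HTTP Request)        multipart-form-data   Encoding of the request body
url (HTTP Request)                    https://api.parquetreader.com/parquet?source=n8n   API endpoint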

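Best Practices

Optimization Suggestions

  • Keep test uploads small while validating the workflow
  • The Code node only processes `items[0]`; extend it if a request can produce several items
  • Switch callers from the test URL (`/webhook-test/`) to the production webhook URL once the workflow is active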

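Security Considerations

  • Store any API keys and credentials securely
  • Restrict access to the workflow; the Webhook node above sets no authentication option
  • Review execution logs regularly
  • Uploaded files are sent to a third-party service (`api.parquetreader.com`), so avoid uploading sensitive data unless that is acceptable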

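Performance Optimization

  • Avoid re-uploading files that have already been converted
  • Cache converted results that are requested frequently
  • Process multiple files in parallel where possible
  • Monitor system resource usage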

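Troubleshooting

Common Issues

The file does not reach the ParquetReader API

Check that the upload uses the form field name `file` and that the Webhook node's output contains the binary property `file0`, which is the property the Send to Parquet API node sends.

`data` or `meta_data` is still a string in the output

Confirm that the Parse API Response node ran and that it is connected after the Send to Parquet API node.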

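Debugging Tips

  • Enable detailed logging to see how each step executed
  • Use a small sample file to validate the conversion
  • Check network connectivity and the status of the API service
  • Execute the workflow step by step to locate the failing node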
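
When stepping through the nodes, it can help to check which binary properties the Webhook node actually produced, since the Send to Parquet API node in the workflow JSON reads `file0`. A minimal sketch follows; `listBinaryProperties` is a hypothetical helper, and the sample item is made up to resemble the output of a webhook that received one upload.

```javascript
// Return, for each item, the names of the binary properties attached to it.
function listBinaryProperties(items) {
  return items.map((item) => Object.keys(item.binary ?? {}));
}

// Hypothetical item, for illustration only.
const webhookItems = [
  {
    json: {},
    binary: {
      file0: { fileName: 'converted.parquet', mimeType: 'application/octet-stream' },
    },
  },
];
console.log(listBinaryProperties(webhookItems));
```

If the list does not contain `file0`, the name expected by the HTTP Request node and the name produced by the Webhook node are out of sync.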

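Error Handling

The workflow JSON above configures no explicit error handling. The following mechanisms are worth adding:

  • Automatic retry on network timeouts (up to 3 attempts)
  • Logging and alerting on API errors
  • Setting aside files that fail to convert
  • A clear error response to the caller when conversion fails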
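
The retry on network timeouts (at most three attempts) could look roughly like the sketch below. It assumes the retry wraps the API call in custom JavaScript; `withRetry`, `maxAttempts`, and `delayMs` are hypothetical names, and nothing in the workflow JSON defines them.

```javascript
// Call `operation` until it succeeds or `maxAttempts` is reached,
// waiting `delayMs` milliseconds between attempts.
async function withRetry(operation, maxAttempts = 3, delayMs = 1000) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Example usage with a hypothetical upload function:
// const result = await withRetry(() => uploadToApi(fileBytes));
```

A fixed delay keeps the sketch short; a growing delay between attempts is a common alternative when the remote service is overloaded.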