fleettickets/n8n-hourly-s3-full-data-exports.md
david kiania df054c92be feat: INC hourly-CSV ingestion (newest-file, ETag dedup, clean + archive)
Rework import_tickets.py from the retired JSON `latest.json` model to the new
hourly full-snapshot CSV export. Strictly INC (CRQ out of scope).

- Ingest the newest automations/inc/<EAT-timestamp>.csv; skip-if-unchanged by
  comparing S3 ETag to tickets.import_meta.metadata.source_etag.
- Upsert on ticket_id (PK; no dups, never delete -> closure history accrues).
  No truncate. On success, move processed files to automations/inc/processed/.
- Clean at ingest: drop is_alarm=true + the "EXPORT STOPPED..." sentinel; drop
  week_*, source_s3_*/source_snapshot_id, department/source_type; lowercase
  region, uppercase raw_status; keep service_type + bucket.
- Force path-style S3 addressing; --inc-csv for local dev; --from-bucket for cron.
- Add migrations/02 (import_meta + freshness); refresh README/.env.example/docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 19:33:16 +03:00


# n8n Hourly S3 Full-Data Exports
Updated on June 15, 2026.
## Overview
Two active n8n workflows export complete datasets to S3 every hour:
1. `FTTH Automation Ticket S3 Export`
2. `Fuel Records S3 Export`
Each execution creates CSV files only. Filenames use the actual execution time
in the `Africa/Nairobi` timezone.
No delta files, JSON files, `latest` files, `changes/` directories, `full/`
directories, or midnight-specific exports are created.
## Hourly Output
Each hour, the two workflows together create exactly three files, one per
dataset:
```text
automations/crq/YYYY-MM-DDTHH-mm-ss.csv
automations/inc/YYYY-MM-DDTHH-mm-ss.csv
fuel_records/YYYY-MM-DDTHH-mm-ss.csv
```
The CRQ and INC files are uploaded to the `tickets` bucket. The Fuel file is
uploaded to the `fuel` bucket.
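As a rough illustration of the layout above, the following Python sketch builds the three bucket/key pairs for one hour. The function and type names are hypothetical and not part of the workflows. It takes two timestamps because each workflow creates its own execution timestamp, so the FTTH and Fuel keys can differ by a few seconds.

```python
from typing import NamedTuple


class S3Object(NamedTuple):
    """A bucket/key pair (hypothetical helper type, not from the workflows)."""
    bucket: str
    key: str


def hourly_objects(ftth_timestamp: str, fuel_timestamp: str) -> list[S3Object]:
    """Return the three objects that one hour of exports should produce."""
    return [
        # Both ticket files share the FTTH workflow's single timestamp.
        S3Object("tickets", f"automations/crq/{ftth_timestamp}.csv"),
        S3Object("tickets", f"automations/inc/{ftth_timestamp}.csv"),
        # The Fuel workflow stamps its own file independently.
        S3Object("fuel", f"fuel_records/{fuel_timestamp}.csv"),
    ]
```

A monitoring script could compare this list against the objects actually present in each bucket after the hour has run.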
## FTTH Automation Ticket S3 Export
Workflow ID: `JI3QkcJeHk9eYRsY`
The workflow:
1. Runs at the start of every hour using the `Africa/Nairobi` workflow timezone.
2. Creates one execution timestamp.
3. Calls the existing authenticated Scoreboard export endpoint with
`export_type: full`.
4. Reads all CRQ and INC rows returned by the endpoint.
5. Converts each complete dataset to CSV.
6. Uploads exactly two files:
   - `automations/crq/<execution-timestamp>.csv`
   - `automations/inc/<execution-timestamp>.csv`
7. Fails the execution unless exactly two successful upload results are
returned.
The workflow still has its existing manual webhook for operational testing.
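The final check in the steps above can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual node code: `UploadResult` and `require_two_uploads` are hypothetical names, and the real upload results in n8n will have a different shape.

```python
from dataclasses import dataclass


@dataclass
class UploadResult:
    """Outcome of one upload attempt (hypothetical shape)."""
    key: str
    ok: bool


def require_two_uploads(results: list[UploadResult]) -> list[str]:
    """Return the uploaded keys, or raise unless exactly two uploads succeeded."""
    successes = [r.key for r in results if r.ok]
    # Both conditions matter: a third result is as wrong as a failed one.
    if len(results) != 2 or len(successes) != 2:
        raise RuntimeError(
            f"expected exactly 2 successful uploads, "
            f"got {len(successes)} of {len(results)}"
        )
    return successes
```

Raising an error, rather than returning a partial list, is what makes the whole execution show as failed instead of quietly publishing only one of the two files.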
## Fuel Records S3 Export
Workflow ID: `IP2KNAfFazAjTesh`
The workflow:
1. Runs at the start of every hour using the `Africa/Nairobi` workflow timezone.
2. Creates one execution timestamp.
3. Reads the complete `logistics_department.fuel_records` table.
4. Converts all returned rows to one CSV.
5. Uploads exactly one file:
   - `fuel_records/<execution-timestamp>.csv`
6. Fails the execution if the S3 upload reports an error.
The workflow still has its existing manual webhook for operational testing.
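The "all rows to one CSV" step can be illustrated with Python's standard `csv` module. This is a sketch under assumptions: the function name is hypothetical, and the column names in the usage note are invented, since this document does not list the columns of `logistics_department.fuel_records`.

```python
import csv
import io


def rows_to_csv(fieldnames: list[str], rows: list[dict]) -> str:
    """Serialize every row into a single CSV string with one header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    # The header is written even when there are no rows, so an empty table
    # still yields a well-formed file.
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `rows_to_csv(["id", "litres"], [{"id": 1, "litres": 40.5}])` produces a header line followed by one data line. Passing the column list explicitly keeps the header stable from one hourly file to the next.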
## Timestamp Format
The timestamp format is:
```text
YYYY-MM-DDTHH-mm-ss
```
Example:
```text
2026-06-15T14-39-53
```
The timestamp is generated once at the start of each workflow execution and is
formatted in `Africa/Nairobi`.
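A minimal Python sketch of this format follows. The function name is hypothetical, and the workflows themselves may format the timestamp differently inside n8n. The fixed UTC+03:00 fallback is an assumption that only applies where the time zone database is unavailable.

```python
from datetime import datetime, timedelta, timezone

try:
    from zoneinfo import ZoneInfo
    NAIROBI = ZoneInfo("Africa/Nairobi")
except Exception:
    # Assumption: fall back to East Africa Time as a fixed UTC+03:00 offset
    # when the tz database is not installed.
    NAIROBI = timezone(timedelta(hours=3), "EAT")


def execution_timestamp(now=None) -> str:
    """Format a timezone-aware moment as YYYY-MM-DDTHH-mm-ss in Nairobi time."""
    moment = now if now is not None else datetime.now(timezone.utc)
    # Hyphens replace colons in the time part, matching the key format above.
    return moment.astimezone(NAIROBI).strftime("%Y-%m-%dT%H-%M-%S")
```

Calling `execution_timestamp()` once at the start of a run and reusing the returned string for every key is what keeps the files from one execution aligned.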
## Credentials and Safety
- Existing n8n PostgreSQL, S3, workflow-variable, and API token configuration is
reused.
- No S3 credentials or API secrets are hard-coded in workflow code.
- Secrets are not included in workflow result messages.
- Source database queries are read-only.
- The workflows do not delete or update source database rows.
- S3 upload nodes retain retry handling. Because every run exports the complete
dataset, the next successful hourly run also covers a failed one.
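To show why a missed hour is harmless for a consumer that reads only the newest snapshot, the sketch below picks the latest INC key from a listing. It assumes a hypothetical consumer and helper name. It relies only on the documented key format, whose zero-padded timestamps sort chronologically as plain strings.

```python
import re
from typing import Iterable, Optional

# Matches keys directly under automations/inc/ and ignores any subdirectories.
INC_KEY = re.compile(r"^automations/inc/\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.csv$")


def newest_inc_snapshot(keys: Iterable[str]) -> Optional[str]:
    """Return the most recent INC snapshot key, or None if there is none."""
    matching = [key for key in keys if INC_KEY.match(key)]
    # Lexicographic order equals chronological order for this fixed-width format.
    return max(matching) if matching else None
```

If one hourly file is missing, `newest_inc_snapshot` simply returns the next file that arrives, which already contains the full dataset.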
## Removed Behavior
The workflows no longer contain:
- Delta export logic or stored delta pointers
- Midnight full-export schedules
- `latest.json` or `latest.csv`
- JSON output
- `changes/` keys
- `full/` keys
- Multipart or additional export files
- FTTH mark-sent state handling
## Deployment Status
Both workflows were saved, published, and activated on June 15, 2026.
Active versions:
```text
Fuel Records S3 Export:
60cf5824-9345-45bb-a2eb-3b20b877fd32
FTTH Automation Ticket S3 Export:
68b7be10-ac3a-43d8-8c17-b46a2cbb48d2
```
## Manual Test Evidence
### Fuel Records S3 Export
Execution ID: `404079`
Rows exported: `2001`
Exact S3 key:
```text
fuel_records/2026-06-15T14-39-50.csv
```
### FTTH Automation Ticket S3 Export
Execution ID: `404080`
Rows exported:
```text
CRQ: 12680
INC: 31434
```
Exact S3 keys:
```text
automations/crq/2026-06-15T14-39-53.csv
automations/inc/2026-06-15T14-39-53.csv
```
Both manual tests completed successfully. Their upload builders generated one
Fuel item and exactly two FTTH items, matching the required three output files.