Skip to main content

Transformation Instructions

info

This is a work in progress feature that is not yet publicly available.

Transformation Instructions

The Data Transformation feature allows you to manipulate and reshape structured data files (CSV, JSON, XML, Excel) directly within your continuity kits. This is accomplished by defining a series of transformation steps using a powerful, SQL-like Domain-Specific Language (DSL).

How It Works

Data transformation is configured as an externalAction within a kit's checklist. The action requires input and output file paths and a list of transforms to be executed in sequence.

Example Action Structure

{
"type": "externalAction",
"actionName": "DataTransform",
"value": {
"input": "/path/to/input.csv",
"output": "/path/to/output.xlsx",
"transforms": [
"UPDATE \"DEBIT EUR\" = 0 WHERE input.\"DEBIT EUR\" ISNOTSET",
"AGGREGATE ROWS input.\"DEBIT EUR\" + input.\"CREDIT EUR\" INTO \"total\"",
"DELETE WHERE input.\"CAT Cash Flow\" = \"A NE PAS COMPTER\"",
"AGGREGATE COLUMN SUM(\"total\") GROUP BY \"CAT Cash Flow\", \"CAT P&L\""
]
}
}

Supported File Formats

  • Input: CSV, JSON, XML, Excel (.xls, .xlsx)
  • Output: CSV, JSON, XML, Excel (.xls, .xlsx)

Advanced Format Handling

Reading Nested XML (SEPA SDD Example)

When dealing with complex, nested XML files like a SEPA Direct Debit (pain.008.001.02), you can use read_params to flatten the data into a tabular format.

  • record_path: A list of keys representing the path to the array of records you want to iterate over.
  • meta: A list of paths to metadata fields that are common to all records. These fields will be added as columns to every row.

Example: Given the following SEPA SDD XML input:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">
<CstmrDrctDbtInitn>
<GrpHdr>
<MsgId>M000038280</MsgId>
<CreDtTm>2025-05-23T10:30:00</CreDtTm>
<InitgPty>
<Nm>CREDIT LYONNAIS</Nm>
</InitgPty>
</GrpHdr>
<PmtInf>
<PmtInfId>FR-202505-38280</PmtInfId>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220347--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">62.27</InstdAmt>
<Dbtr>
<Nm>ACHIM PASCAL</Nm>
</Dbtr>
</DrctDbtTxInf>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220348--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">80.00</InstdAmt>
<Dbtr>
<Nm>JEAN DUPONT</Nm>
</Dbtr>
</DrctDbtTxInf>
</PmtInf>
</CstmrDrctDbtInitn>
</Document>

Use the following read_params in your action's value to flatten the transactions:

record_path:
- Document
- CstmrDrctDbtInitn
- PmtInf
- DrctDbtTxInf

meta:
-
- Document
- CstmrDrctDbtInitn
- GrpHdr
- MsgId
-
- Document
- CstmrDrctDbtInitn
- GrpHdr
- CreDtTm
-
- Document
- CstmrDrctDbtInitn
- PmtInf
- PmtInfId

This will produce the following internal table, which you can then transform using the DSL:

"PmtId.EndToEndId","InstdAmt.#text","InstdAmt.@Ccy","Dbtr.Nm","Document.CstmrDrctDbtInitn.GrpHdr.MsgId","Document.CstmrDrctDbtInitn.GrpHdr.CreDtTm","Document.CstmrDrctDbtInitn.PmtInf.PmtInfId"
"D011220347--05-2025",62.27,"EUR","ACHIM PASCAL","M000038280","2025-05-23T10:30:00","FR-202505-38280"
"D011220348--05-2025",80.00,"EUR","JEAN DUPONT","M000038280","2025-05-23T10:30:00","FR-202505-38280"

Writing Custom Nested XML (SEPA SDD Example)

To generate a nested XML output from flat data, you can provide a detailed template.

  • Set format to "custom_xml" inside write_params.
  • Define the desired XML structure using xml_template.

Template Keywords:

  • tag: The name of the XML element.
  • attributes: An object for element attributes. Values can be static or dynamic (e.g., {"from_column": "ColumnName"}).
  • text: The element's text content. Can be static or dynamic.
  • children: An array of child element templates.
  • iterate_on_rows: Place this key inside an element's template to repeat it for each row in the data.
  • aggregate: Computes a value over the entire dataset (e.g., {"aggregate": "count"} or {"aggregate": "sum", "column": "Amount"}).

Example: To generate a SEPA SDD XML from a flat CSV:

ns1:MsgId,ns1:CreDtTm,ns1:InstdAmt,Ccy,ns1:EndToEndId,ns1:Nm8
M000038280,2025-05-23T10:30:00,62.27,EUR,D011220347--05-2025,ACHIM PASCAL
M000038280,2025-05-23T10:30:00,80.00,EUR,D011220348--05-2025,JEAN DUPONT

Use the following write_params:

format: custom_xml
pretty_print: true
xml_template:
tag: Document
attributes:
xmlns: "urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"
children:
- tag: CstmrDrctDbtInitn
children:
- tag: GrpHdr
children:
- tag: MsgId
text:
from_column: "ns1:MsgId"
- tag: CreDtTm
text:
from_column: "ns1:CreDtTm"
- tag: NbOfTxs
text:
aggregate: count
- tag: CtrlSum
text:
aggregate: sum
column: "ns1:InstdAmt"

- iterate_on_rows:
tag: DrctDbtTxInf
children:
- tag: PmtId
children:
- tag: EndToEndId
text:
from_column: "ns1:EndToEndId"

- tag: InstdAmt
attributes:
Ccy:
from_column: Ccy
text:
from_column: "ns1:InstdAmt"

- tag: Dbtr
children:
- tag: Nm
text:
from_column: "ns1:Nm8"

This will produce the following XML:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">
<CstmrDrctDbtInitn>
<GrpHdr>
<MsgId>M000038280</MsgId>
<CreDtTm>2025-05-23T10:30:00</CreDtTm>
<NbOfTxs>2</NbOfTxs>
<CtrlSum>142.27</CtrlSum>
</GrpHdr>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220347--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">62.27</InstdAmt>
<Dbtr>
<Nm>ACHIM PASCAL</Nm>
</Dbtr>
</DrctDbtTxInf>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220348--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">80.00</InstdAmt>
<Dbtr>
<Nm>JEAN DUPONT</Nm>
</Dbtr>
</DrctDbtTxInf>
</CstmrDrctDbtInitn>
</Document>

Transformation DSL

The DSL provides a set of commands to perform transformations. Note that all column names and string literals must be enclosed in double quotes (").

SELECT

Selects a subset of columns and filters rows based on conditions.

Example: Given the following CSV data:

ID,Name,Status,Amount
1,Alice,Completed,100
2,Bob,Pending,50
3,Charlie,Completed,200

The command:

SELECT "ID", "Name", "Amount" WHERE input."Status" = "Completed"

Will result in:

ID,Name,Amount
1,Alice,100
3,Charlie,200

UPDATE

Updates the values in a column, optionally based on conditions.

Example: Given the following CSV data:

Product,Price,Currency
Apple,1.0,USD
Banana,0.5,EUR
Orange,1.2,USD

The command:

UPDATE "Price" = input."Price" * 1.1 WHERE input."Currency" = "USD"

Will result in:

Product,Price,Currency
Apple,1.1,USD
Banana,0.5,EUR
Orange,1.32,USD

RENAME

Renames a column.

Example: Given the following CSV data:

Transaction ID,Amount
txn_123,100
txn_456,200

The command:

RENAME "Transaction ID" "TransactionId"

Will result in:

TransactionId,Amount
txn_123,100
txn_456,200

DELETE

Deletes rows or columns.

Example: Given the following CSV data:

ID,Name,Amount,InternalNotes
1,Alice,100,Note A
2,Bob,0,Note B
3,Charlie,200,Note C

The command:

DELETE WHERE input."Amount" <= 0

Will result in:

ID,Name,Amount,InternalNotes
1,Alice,100,Note A
3,Charlie,200,Note C

And the command:

DELETE COLUMN "InternalNotes"

Will result in:

ID,Name,Amount
1,Alice,100
3,Charlie,200

AGGREGATE

Performs aggregation operations.

Column Aggregation

Example: Given the following CSV data:

Category,Amount
Fruit,10
Vegetable,20
Fruit,15

The command:

AGGREGATE COLUMN SUM("Amount") GROUP BY "Category"

Will result in:

Category,Amount
Fruit,25
Vegetable,20

Row Aggregation

Example: Given the following CSV data:

Price,Quantity
10,2
5,3

The command:

AGGREGATE ROWS input."Price" * input."Quantity" INTO "Total"

Will result in:

Price,Quantity,Total
10,2,20
5,3,15

SORT

Sorts rows based on one or more columns.

Example: Given the following CSV data:

Name,Date,Amount
Alice,2024-10-12,100
Bob,2024-10-11,200
Alice,2024-10-12,50

The command:

SORT "Name" ASC, "Amount" DESC

Will result in:

Name,Date,Amount
Alice,2024-10-12,100
Alice,2024-10-12,50
Bob,2024-10-11,200

INSERT

Inserts a new row or column.

Example: Given the following CSV data:

ID,Name
1,Alice

The command:

INSERT ROW 2, "Bob"

Will result in:

ID,Name
1,Alice
2,Bob

And the command:

INSERT COLUMN "Status"

Will result in:

ID,Name,Status
1,Alice,
2,Bob,

MOVE

Moves a row or column to a new position.

Example: Given the following CSV data:

ID,Name,Status
1,Alice,A
2,Bob,B
3,Charlie,C

The command:

MOVE ROW 2 0

Will result in:

ID,Name,Status
3,Charlie,C
1,Alice,A
2,Bob,B

Functions

Functions can be used in UPDATE and AGGREGATE ROWS commands for data manipulation.

String Functions

LEFT(column, length) - Extract leftmost characters

UPDATE "ShortCode" = LEFT(input."Code", 3)

RIGHT(column, length) - Extract rightmost characters

UPDATE "Extension" = RIGHT(input."Filename", 3)

SUBSTRING(column, start, length) - Extract substring

UPDATE "MiddlePart" = SUBSTRING(input."Text", 5, 10)

UPPER(column) - Convert to uppercase

UPDATE "Name" = UPPER(input."Name")

LOWER(column) - Convert to lowercase

UPDATE "Email" = LOWER(input."Email")

TRIM(column) - Remove leading/trailing whitespace

UPDATE "Name" = TRIM(input."Name")

CONCAT(value1, value2, ...) - Concatenate strings

UPDATE "FullName" = CONCAT(input."FirstName", " ", input."LastName")

LENGTH(column) - Get string length

UPDATE "NameLength" = LENGTH(input."Name")

REPLACE(column, old, new) - Replace substring

UPDATE "Phone" = REPLACE(input."Phone", "-", "")

Example: Given the following CSV data:

FirstName,LastName,Email
John ,Doe,JOHN.DOE@EXAMPLE.COM
alice,smith,alice.smith@test.com

The commands:

UPDATE "FirstName" = TRIM(input."FirstName")
UPDATE "Email" = LOWER(input."Email")
UPDATE "FullName" = CONCAT(input."FirstName", " ", input."LastName")

Will result in:

FirstName,LastName,Email,FullName
John,Doe,john.doe@example.com,John Doe
alice,smith,alice.smith@test.com,alice smith

Date Functions

TODAY() - Get current date

UPDATE "ProcessedDate" = TODAY()

DATEADD(date, days) - Add days to a date

UPDATE "DueDate" = DATEADD(input."InvoiceDate", 30)

DATEDIFF(date1, date2) - Days between two dates

UPDATE "DaysOverdue" = DATEDIFF(TODAY(), input."DueDate")

YEAR(date), MONTH(date), DAY(date) - Extract date parts

UPDATE "Year" = YEAR(input."Date")
UPDATE "Month" = MONTH(input."Date")
UPDATE "Day" = DAY(input."Date")

WEEKDAY(date) - Get day of week (0=Monday, 6=Sunday)

UPDATE "DayOfWeek" = WEEKDAY(input."Date")

FORMAT_DATE(date, input_format, output_format) - Convert date format

UPDATE "FormattedDate" = FORMAT_DATE(input."Date", "%Y-%m-%d", "%d/%m/%Y")

Example: Given the following CSV data:

InvoiceDate,Amount
2024-10-13,100
2024-11-05,200

The commands:

UPDATE "DueDate" = DATEADD(input."InvoiceDate", 30)
UPDATE "Year" = YEAR(input."InvoiceDate")
UPDATE "Formatted" = FORMAT_DATE(input."InvoiceDate", "%Y-%m-%d", "%d/%m/%Y")

Will result in:

InvoiceDate,Amount,DueDate,Year,Formatted
2024-10-13,100,2024-11-12,2024,13/10/2024
2024-11-05,200,2024-12-05,2024,05/11/2024

Expressions

Expressions can be used in UPDATE and AGGREGATE ROWS commands. They support arithmetic operations (+, -, *, /) and all the functions listed above.

Conditions

Conditions are used in SELECT, UPDATE, DELETE WHERE, and AGGREGATE clauses.

Comparison Operators:

  • = or == - Equal to
  • != - Not equal to
  • > - Greater than
  • >= - Greater than or equal to
  • < - Less than
  • <= - Less than or equal to

Null/Empty Checks:

  • ISSET - Value is not null and not empty
  • ISNOTSET - Value is null or empty

Logical Operators:

  • AND - Both conditions must be true
  • OR - At least one condition must be true

Examples:

DELETE WHERE input."Amount" < 10 AND input."Status" = "Pending"
SELECT "Name", "Amount" WHERE input."Email" ISSET
UPDATE "DefaultValue" = 0 WHERE input."Amount" ISNOTSET

Conditional Aggregations

Aggregation functions can include WHERE clauses to filter which rows are included in the calculation.

Example: Given the following CSV data:

Category,Amount,Status
Fruit,100,Paid
Vegetable,50,Pending
Fruit,75,Paid
Vegetable,25,Paid

The command:

AGGREGATE COLUMN SUM("Amount") WHERE input."Status" = "Paid", COUNT("Amount") GROUP BY "Category"

Will result in:

Category,Amount,Amount_count
Fruit,175,2
Vegetable,25,1

Note: The SUM only includes "Paid" rows (175 for Fruit = 100+75, 25 for Vegetable), while COUNT counts all rows per category.

JOIN

Joins data from another file with the current dataset.

Syntax:

JOIN <TYPE> "file_path" ON <condition>

Join Types:

  • INNER - Only rows that match in both datasets
  • LEFT - All rows from left (current) dataset, matching rows from right
  • RIGHT - All rows from right dataset, matching rows from left (current)
  • OUTER - All rows from both datasets

Example: Given the current CSV data:

OrderID,CustomerID,Amount
1,C001,100
2,C002,200

And a file customers.csv:

CustomerID,Name,City
C001,Alice,Paris
C002,Bob,London
C003,Charlie,Berlin

The command:

JOIN LEFT "customers.csv" ON input."CustomerID" = other."CustomerID"

Will result in:

OrderID,CustomerID,Amount,Name,City
1,C001,100,Alice,Paris
2,C002,200,Bob,London

UNION

Combines data from another file with the current dataset.

Syntax:

UNION <TYPE> "file_path"

Union Types:

  • ROWS - Appends rows from the other file (must have same columns)
  • COLUMNS - Appends columns from the other file (must have same number of rows)

Example (UNION ROWS): Given the current CSV data:

Name,Amount
Alice,100
Bob,200

And a file more_data.csv:

Name,Amount
Charlie,150
Diana,250

The command:

UNION ROWS "more_data.csv"

Will result in:

Name,Amount
Alice,100
Bob,200
Charlie,150
Diana,250

Example (UNION COLUMNS): Given the current CSV data:

Name,Amount
Alice,100
Bob,200

And a file status.csv:

Status
Paid
Pending

The command:

UNION COLUMNS "status.csv"

Will result in:

Name,Amount,Status
Alice,100,Paid
Bob,200,Pending

SIGN and CHECK

Cryptographically sign data and verify its integrity using RSA digital signatures.

SIGN

Adds a cryptographic signature to your data. Each signature automatically generates a unique ephemeral RSA key pair (2048-bit) - no configuration or environment variables needed!

Syntax:

SIGN ALL COLUMNS INTO "column_name"
SIGN COLUMNS "col1", "col2", ... INTO "column_name"

How it works:

  1. Generates a fresh RSA key pair for this signature
  2. Creates a Merkle tree from row hashes
  3. Signs the Merkle root with the private key
  4. Embeds the public key in the signature
  5. Discards the private key (never stored!)
  6. Adds a signature row and column to the data

Example: Given the following CSV data:

Invoice Number,Amount,Vendor
INV001,1000,Acme Corp
INV002,2000,Beta Inc
INV003,1500,Gamma LLC

The command:

SIGN ALL COLUMNS INTO "__SIGNATURE__"

Will add a __SIGNATURE__ column and a signature row at the bottom:

Invoice Number,Amount,Vendor,__SIGNATURE__
INV001,1000,Acme Corp,4bdad51d55da...
INV002,2000,Beta Inc,e0b8107df890...
INV003,1500,Gamma LLC,7ba0349bf59e...
__MERKLE_ROOT__,__SIGNATURE_INFO__,__DO_NOT_EDIT__,{"root":"7928a130..."}

Signing specific columns:

SIGN COLUMNS "Invoice Number", "Amount" INTO "__SIGNATURE__"

This allows you to add additional columns later without invalidating the signature.

CHECK

Verifies a signature and ensures data hasn't been tampered with. Works with no configuration - the public key is embedded in the signature!

Syntax:

CHECK SIGNATURE FROM "column_name"
CHECK SIGNATURE FROM "column_name" KEEP ROW
CHECK SIGNATURE FROM "column_name" KEEP COLUMN
CHECK SIGNATURE FROM "column_name" KEEP ALL

Options:

  • Default: Removes both signature row and column after verification
  • KEEP ROW: Keeps the signature row (removes column)
  • KEEP COLUMN: Keeps the signature column (removes row)
  • KEEP ALL: Keeps both signature row and column

Example:

CHECK SIGNATURE FROM "__SIGNATURE__"

Returns the original data if verification succeeds, or raises an error if:

  • Data values have been modified
  • Rows have been added or removed
  • Signed columns are missing
  • The signature has been tampered with

Use Cases:

  • Multi-party data exchange: Sign data before sending to partners
  • Audit trails: Verify data integrity months later
  • Data pipelines: Ensure no unauthorized modifications between steps
  • Compliance: Prove data provenance and integrity

Security Features:

  • Each signature uses a unique ephemeral RSA key pair
  • Forward secrecy: compromising one key doesn't affect other signatures
  • Self-contained: no key management required
  • No environment variables or configuration needed

Complete Workflow Example:

{
"transforms": [
"UPDATE \"Amount\" = input.\"Amount\" * 1.1",
"SIGN ALL COLUMNS INTO \"__SIGNATURE__\"",
"... (data is stored or transferred) ...",
"CHECK SIGNATURE FROM \"__SIGNATURE__\"",
"AGGREGATE COLUMN SUM(\"Amount\") GROUP BY \"Vendor\""
]
}