Transformation Instructions
This is a work-in-progress feature that is not yet publicly available.
The Data Transformation feature allows you to manipulate and reshape structured data files (CSV, JSON, XML, Excel) directly within your continuity kits. This is accomplished by defining a series of transformation steps using a powerful, SQL-like Domain-Specific Language (DSL).
How It Works
Data transformation is configured as an externalAction within a kit's checklist. The action requires input and output file paths and a list of transforms to be executed in sequence.
Example Action Structure
{
  "type": "externalAction",
  "actionName": "DataTransform",
  "value": {
    "input": "/path/to/input.csv",
    "output": "/path/to/output.xlsx",
    "transforms": [
      "UPDATE \"DEBIT EUR\" = 0 WHERE input.\"DEBIT EUR\" ISNOTSET",
      "AGGREGATE ROWS input.\"DEBIT EUR\" + input.\"CREDIT EUR\" INTO \"total\"",
      "DELETE WHERE input.\"CAT Cash Flow\" = \"A NE PAS COMPTER\"",
      "AGGREGATE COLUMN SUM(\"total\") GROUP BY \"CAT Cash Flow\", \"CAT P&L\""
    ]
  }
}
Supported File Formats
- Input: CSV, JSON, XML, Excel (.xls, .xlsx)
- Output: CSV, JSON, XML, Excel (.xls, .xlsx)
Advanced Format Handling
Reading Nested XML (SEPA SDD Example)
When dealing with complex, nested XML files like a SEPA Direct Debit (pain.008.001.02), you can use read_params to flatten the data into a tabular format.
- record_path: A list of keys representing the path to the array of records you want to iterate over.
- meta: A list of paths to metadata fields that are common to all records. These fields will be added as columns to every row.
Example: Given the following SEPA SDD XML input:
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">
  <CstmrDrctDbtInitn>
    <GrpHdr>
      <MsgId>M000038280</MsgId>
      <CreDtTm>2025-05-23T10:30:00</CreDtTm>
      <InitgPty>
        <Nm>CREDIT LYONNAIS</Nm>
      </InitgPty>
    </GrpHdr>
    <PmtInf>
      <PmtInfId>FR-202505-38280</PmtInfId>
      <DrctDbtTxInf>
        <PmtId>
          <EndToEndId>D011220347--05-2025</EndToEndId>
        </PmtId>
        <InstdAmt Ccy="EUR">62.27</InstdAmt>
        <Dbtr>
          <Nm>ACHIM PASCAL</Nm>
        </Dbtr>
      </DrctDbtTxInf>
      <DrctDbtTxInf>
        <PmtId>
          <EndToEndId>D011220348--05-2025</EndToEndId>
        </PmtId>
        <InstdAmt Ccy="EUR">80.00</InstdAmt>
        <Dbtr>
          <Nm>JEAN DUPONT</Nm>
        </Dbtr>
      </DrctDbtTxInf>
    </PmtInf>
  </CstmrDrctDbtInitn>
</Document>
Use the following read_params in your action's value to flatten the transactions:
record_path:
  - Document
  - CstmrDrctDbtInitn
  - PmtInf
  - DrctDbtTxInf
meta:
  -
    - Document
    - CstmrDrctDbtInitn
    - GrpHdr
    - MsgId
  -
    - Document
    - CstmrDrctDbtInitn
    - GrpHdr
    - CreDtTm
  -
    - Document
    - CstmrDrctDbtInitn
    - PmtInf
    - PmtInfId
This will produce the following internal table, which you can then transform using the DSL:
"PmtId.EndToEndId","InstdAmt.#text","InstdAmt.@Ccy","Dbtr.Nm","Document.CstmrDrctDbtInitn.GrpHdr.MsgId","Document.CstmrDrctDbtInitn.GrpHdr.CreDtTm","Document.CstmrDrctDbtInitn.PmtInf.PmtInfId"
"D011220347--05-2025",62.27,"EUR","ACHIM PASCAL","M000038280","2025-05-23T10:30:00","FR-202505-38280"
"D011220348--05-2025",80.00,"EUR","JEAN DUPONT","M000038280","2025-05-23T10:30:00","FR-202505-38280"
Writing Custom Nested XML (SEPA SDD Example)
To generate a nested XML output from flat data, you can provide a detailed template.
- Set format to "custom_xml" inside write_params.
- Define the desired XML structure using xml_template.
Template Keywords:
- tag: The name of the XML element.
- attributes: An object for element attributes. Values can be static or dynamic (e.g., {"from_column": "ColumnName"}).
- text: The element's text content. Can be static or dynamic.
- children: An array of child element templates.
- iterate_on_rows: Place this key inside an element's template to repeat it for each row in the data.
- aggregate: Computes a value over the entire dataset (e.g., {"aggregate": "count"} or {"aggregate": "sum", "column": "Amount"}).
Example: To generate a SEPA SDD XML from a flat CSV:
ns1:MsgId,ns1:CreDtTm,ns1:InstdAmt,Ccy,ns1:EndToEndId,ns1:Nm8
M000038280,2025-05-23T10:30:00,62.27,EUR,D011220347--05-2025,ACHIM PASCAL
M000038280,2025-05-23T10:30:00,80.00,EUR,D011220348--05-2025,JEAN DUPONT
Use the following write_params:
format: custom_xml
pretty_print: true
xml_template:
  tag: Document
  attributes:
    xmlns: "urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"
  children:
    - tag: CstmrDrctDbtInitn
      children:
        - tag: GrpHdr
          children:
            - tag: MsgId
              text:
                from_column: "ns1:MsgId"
            - tag: CreDtTm
              text:
                from_column: "ns1:CreDtTm"
            - tag: NbOfTxs
              text:
                aggregate: count
            - tag: CtrlSum
              text:
                aggregate: sum
                column: "ns1:InstdAmt"
        - iterate_on_rows:
          tag: DrctDbtTxInf
          children:
            - tag: PmtId
              children:
                - tag: EndToEndId
                  text:
                    from_column: "ns1:EndToEndId"
            - tag: InstdAmt
              attributes:
                Ccy:
                  from_column: Ccy
              text:
                from_column: "ns1:InstdAmt"
            - tag: Dbtr
              children:
                - tag: Nm
                  text:
                    from_column: "ns1:Nm8"
This will produce the following XML:
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">
  <CstmrDrctDbtInitn>
    <GrpHdr>
      <MsgId>M000038280</MsgId>
      <CreDtTm>2025-05-23T10:30:00</CreDtTm>
      <NbOfTxs>2</NbOfTxs>
      <CtrlSum>142.27</CtrlSum>
    </GrpHdr>
    <DrctDbtTxInf>
      <PmtId>
        <EndToEndId>D011220347--05-2025</EndToEndId>
      </PmtId>
      <InstdAmt Ccy="EUR">62.27</InstdAmt>
      <Dbtr>
        <Nm>ACHIM PASCAL</Nm>
      </Dbtr>
    </DrctDbtTxInf>
    <DrctDbtTxInf>
      <PmtId>
        <EndToEndId>D011220348--05-2025</EndToEndId>
      </PmtId>
      <InstdAmt Ccy="EUR">80.00</InstdAmt>
      <Dbtr>
        <Nm>JEAN DUPONT</Nm>
      </Dbtr>
    </DrctDbtTxInf>
  </CstmrDrctDbtInitn>
</Document>
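For reference, both read_params and write_params live inside the action's value, alongside input, output, and transforms. The sketch below shows how the SEPA examples above might be wired together; the file paths are illustrative, the empty transforms list is shown only for brevity, the xml_template body is elided (it is the same structure shown above), and the exact JSON layout of the nested parameters is an assumption based on the YAML in this section:
{
  "type": "externalAction",
  "actionName": "DataTransform",
  "value": {
    "input": "/path/to/sepa_sdd.xml",
    "output": "/path/to/sepa_sdd_copy.xml",
    "read_params": {
      "record_path": ["Document", "CstmrDrctDbtInitn", "PmtInf", "DrctDbtTxInf"],
      "meta": [
        ["Document", "CstmrDrctDbtInitn", "GrpHdr", "MsgId"],
        ["Document", "CstmrDrctDbtInitn", "GrpHdr", "CreDtTm"],
        ["Document", "CstmrDrctDbtInitn", "PmtInf", "PmtInfId"]
      ]
    },
    "transforms": [],
    "write_params": {
      "format": "custom_xml",
      "pretty_print": true,
      "xml_template": { ... }
    }
  }
}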
Transformation DSL
The DSL provides a set of commands to perform transformations. Note that all column names and string literals must be enclosed in double quotes (").
SELECT
Selects a subset of columns and filters rows based on conditions.
Example: Given the following CSV data:
ID,Name,Status,Amount
1,Alice,Completed,100
2,Bob,Pending,50
3,Charlie,Completed,200
The command:
SELECT "ID", "Name", "Amount" WHERE input."Status" = "Completed"
Will result in:
ID,Name,Amount
1,Alice,100
3,Charlie,200
UPDATE
Updates the values in a column, optionally based on conditions.
Example: Given the following CSV data:
Product,Price,Currency
Apple,1.0,USD
Banana,0.5,EUR
Orange,1.2,USD
The command:
UPDATE "Price" = input."Price" * 1.1 WHERE input."Currency" = "USD"
Will result in:
Product,Price,Currency
Apple,1.1,USD
Banana,0.5,EUR
Orange,1.32,USD
RENAME
Renames a column.
Example: Given the following CSV data:
Transaction ID,Amount
txn_123,100
txn_456,200
The command:
RENAME "Transaction ID" "TransactionId"
Will result in:
TransactionId,Amount
txn_123,100
txn_456,200
DELETE
Deletes rows or columns.
Example: Given the following CSV data:
ID,Name,Amount,InternalNotes
1,Alice,100,Note A
2,Bob,0,Note B
3,Charlie,200,Note C
The command:
DELETE WHERE input."Amount" <= 0
Will result in:
ID,Name,Amount,InternalNotes
1,Alice,100,Note A
3,Charlie,200,Note C
And the command:
DELETE COLUMN "InternalNotes"
Will result in:
ID,Name,Amount
1,Alice,100
3,Charlie,200
AGGREGATE
Performs aggregation operations.
Column Aggregation
Example: Given the following CSV data:
Category,Amount
Fruit,10
Vegetable,20
Fruit,15
The command:
AGGREGATE COLUMN SUM("Amount") GROUP BY "Category"
Will result in:
Category,Amount
Fruit,25
Vegetable,20
Row Aggregation
Example: Given the following CSV data:
Price,Quantity
10,2
5,3
The command:
AGGREGATE ROWS input."Price" * input."Quantity" INTO "Total"
Will result in:
Price,Quantity,Total
10,2,20
5,3,15
SORT
Sorts rows based on one or more columns.
Example: Given the following CSV data:
Name,Date,Amount
Alice,2024-10-12,100
Bob,2024-10-11,200
Alice,2024-10-12,50
The command:
SORT "Name" ASC, "Amount" DESC
Will result in:
Name,Date,Amount
Alice,2024-10-12,100
Alice,2024-10-12,50
Bob,2024-10-11,200
INSERT
Inserts a new row or column.
Example: Given the following CSV data:
ID,Name
1,Alice
The command:
INSERT ROW 2, "Bob"
Will result in:
ID,Name
1,Alice
2,Bob
And the command:
INSERT COLUMN "Status"
Will result in:
ID,Name,Status
1,Alice,
2,Bob,
MOVE
Moves a row or column to a new position.
Example: Given the following CSV data:
ID,Name,Status
1,Alice,A
2,Bob,B
3,Charlie,C
The command:
MOVE ROW 2 0
Will result in:
ID,Name,Status
3,Charlie,C
1,Alice,A
2,Bob,B
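The column form of MOVE is not shown above. Assuming it mirrors the row syntax with a column name and a target index, it might look like the following sketch (the exact syntax is an assumption, not confirmed here):
MOVE COLUMN "Status" 0
which would place the "Status" column first.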
Functions
Functions can be used in UPDATE and AGGREGATE ROWS commands for data manipulation.
String Functions
LEFT(column, length) - Extract leftmost characters
UPDATE "ShortCode" = LEFT(input."Code", 3)
RIGHT(column, length) - Extract rightmost characters
UPDATE "Extension" = RIGHT(input."Filename", 3)
SUBSTRING(column, start, length) - Extract substring
UPDATE "MiddlePart" = SUBSTRING(input."Text", 5, 10)
UPPER(column) - Convert to uppercase
UPDATE "Name" = UPPER(input."Name")
LOWER(column) - Convert to lowercase
UPDATE "Email" = LOWER(input."Email")
TRIM(column) - Remove leading/trailing whitespace
UPDATE "Name" = TRIM(input."Name")
CONCAT(value1, value2, ...) - Concatenate strings
UPDATE "FullName" = CONCAT(input."FirstName", " ", input."LastName")
LENGTH(column) - Get string length
UPDATE "NameLength" = LENGTH(input."Name")
REPLACE(column, old, new) - Replace substring
UPDATE "Phone" = REPLACE(input."Phone", "-", "")
Example: Given the following CSV data:
FirstName,LastName,Email
John ,Doe,JOHN.DOE@EXAMPLE.COM
alice,smith,alice.smith@test.com
The commands:
UPDATE "FirstName" = TRIM(input."FirstName")
UPDATE "Email" = LOWER(input."Email")
UPDATE "FullName" = CONCAT(input."FirstName", " ", input."LastName")
Will result in:
FirstName,LastName,Email,FullName
John,Doe,john.doe@example.com,John Doe
alice,smith,alice.smith@test.com,alice smith
Date Functions
TODAY() - Get current date
UPDATE "ProcessedDate" = TODAY()
DATEADD(date, days) - Add days to a date
UPDATE "DueDate" = DATEADD(input."InvoiceDate", 30)
DATEDIFF(date1, date2) - Days between two dates
UPDATE "DaysOverdue" = DATEDIFF(TODAY(), input."DueDate")
YEAR(date), MONTH(date), DAY(date) - Extract date parts
UPDATE "Year" = YEAR(input."Date")
UPDATE "Month" = MONTH(input."Date")
UPDATE "Day" = DAY(input."Date")
WEEKDAY(date) - Get day of week (0=Monday, 6=Sunday)
UPDATE "DayOfWeek" = WEEKDAY(input."Date")
FORMAT_DATE(date, input_format, output_format) - Convert date format
UPDATE "FormattedDate" = FORMAT_DATE(input."Date", "%Y-%m-%d", "%d/%m/%Y")
Example: Given the following CSV data:
InvoiceDate,Amount
2024-10-13,100
2024-11-05,200
The commands:
UPDATE "DueDate" = DATEADD(input."InvoiceDate", 30)
UPDATE "Year" = YEAR(input."InvoiceDate")
UPDATE "Formatted" = FORMAT_DATE(input."InvoiceDate", "%Y-%m-%d", "%d/%m/%Y")
Will result in:
InvoiceDate,Amount,DueDate,Year,Formatted
2024-10-13,100,2024-11-12,2024,13/10/2024
2024-11-05,200,2024-12-05,2024,05/11/2024
Expressions
Expressions can be used in UPDATE and AGGREGATE ROWS commands. They support arithmetic operations (+, -, *, /) and all the functions listed above.
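For example, both forms below combine arithmetic and the functions above in a single expression (the column names are illustrative):
UPDATE "Total" = input."Price" * input."Quantity" + input."Shipping"
AGGREGATE ROWS DATEDIFF(TODAY(), input."DueDate") INTO "DaysOverdue"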
Conditions
Conditions are used in SELECT, UPDATE, DELETE WHERE, and AGGREGATE clauses.
Comparison Operators:
- = or == - Equal to
- != - Not equal to
- > - Greater than
- >= - Greater than or equal to
- < - Less than
- <= - Less than or equal to
Null/Empty Checks:
- ISSET - Value is not null and not empty
- ISNOTSET - Value is null or empty
Logical Operators:
- AND - Both conditions must be true
- OR - At least one condition must be true
Examples:
DELETE WHERE input."Amount" < 10 AND input."Status" = "Pending"
SELECT "Name", "Amount" WHERE input."Email" ISSET
UPDATE "DefaultValue" = 0 WHERE input."Amount" ISNOTSET
Conditional Aggregations
Aggregation functions can include WHERE clauses to filter which rows are included in the calculation.
Example: Given the following CSV data:
Category,Amount,Status
Fruit,100,Paid
Vegetable,50,Pending
Fruit,75,Paid
Vegetable,25,Paid
The command:
AGGREGATE COLUMN SUM("Amount") WHERE input."Status" = "Paid", COUNT("Amount") GROUP BY "Category"
Will result in:
Category,Amount,Amount_count
Fruit,175,2
Vegetable,25,2
Note: The SUM only includes "Paid" rows (175 for Fruit = 100+75, 25 for Vegetable), while COUNT has no WHERE clause and therefore counts all rows per category (2 for each).
JOIN
Joins data from another file with the current dataset.
Syntax:
JOIN <TYPE> "file_path" ON <condition>
Join Types:
- INNER - Only rows that match in both datasets
- LEFT - All rows from the left (current) dataset, matching rows from the right
- RIGHT - All rows from the right dataset, matching rows from the left (current)
- OUTER - All rows from both datasets
Example: Given the current CSV data:
OrderID,CustomerID,Amount
1,C001,100
2,C002,200
And a file customers.csv:
CustomerID,Name,City
C001,Alice,Paris
C002,Bob,London
C003,Charlie,Berlin
The command:
JOIN LEFT "customers.csv" ON input."CustomerID" = other."CustomerID"
Will result in:
OrderID,CustomerID,Amount,Name,City
1,C001,100,Alice,Paris
2,C002,200,Bob,London
UNION
Combines data from another file with the current dataset.
Syntax:
UNION <TYPE> "file_path"
Union Types:
- ROWS - Appends rows from the other file (must have the same columns)
- COLUMNS - Appends columns from the other file (must have the same number of rows)
Example (UNION ROWS): Given the current CSV data:
Name,Amount
Alice,100
Bob,200
And a file more_data.csv:
Name,Amount
Charlie,150
Diana,250
The command:
UNION ROWS "more_data.csv"
Will result in:
Name,Amount
Alice,100
Bob,200
Charlie,150
Diana,250
Example (UNION COLUMNS): Given the current CSV data:
Name,Amount
Alice,100
Bob,200
And a file status.csv:
Status
Paid
Pending
The command:
UNION COLUMNS "status.csv"
Will result in:
Name,Amount,Status
Alice,100,Paid
Bob,200,Pending
SIGN and CHECK
Cryptographically sign data and verify its integrity using RSA digital signatures.
SIGN
Adds a cryptographic signature to your data. Each SIGN operation automatically generates a unique ephemeral 2048-bit RSA key pair - no configuration or environment variables needed!
Syntax:
SIGN ALL COLUMNS INTO "column_name"
SIGN COLUMNS "col1", "col2", ... INTO "column_name"
How it works:
- Generates a fresh RSA key pair for this signature
- Creates a Merkle tree from row hashes
- Signs the Merkle root with the private key
- Embeds the public key in the signature
- Discards the private key (never stored!)
- Adds a signature row and column to the data
Example: Given the following CSV data:
Invoice Number,Amount,Vendor
INV001,1000,Acme Corp
INV002,2000,Beta Inc
INV003,1500,Gamma LLC
The command:
SIGN ALL COLUMNS INTO "__SIGNATURE__"
Will add a __SIGNATURE__ column and a signature row at the bottom:
Invoice Number,Amount,Vendor,__SIGNATURE__
INV001,1000,Acme Corp,4bdad51d55da...
INV002,2000,Beta Inc,e0b8107df890...
INV003,1500,Gamma LLC,7ba0349bf59e...
__MERKLE_ROOT__,__SIGNATURE_INFO__,__DO_NOT_EDIT__,{"root":"7928a130..."}
Signing specific columns:
SIGN COLUMNS "Invoice Number", "Amount" INTO "__SIGNATURE__"
This allows you to add additional columns later without invalidating the signature.
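For instance, a sketch of a sequence that relies on this behavior; the "Notes" column added after signing is purely illustrative and is not covered by the signature:
SIGN COLUMNS "Invoice Number", "Amount" INTO "__SIGNATURE__"
INSERT COLUMN "Notes"
CHECK SIGNATURE FROM "__SIGNATURE__"
The check still succeeds because only the signed columns are verified.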
CHECK
Verifies a signature and ensures data hasn't been tampered with. Works with no configuration - the public key is embedded in the signature!
Syntax:
CHECK SIGNATURE FROM "column_name"
CHECK SIGNATURE FROM "column_name" KEEP ROW
CHECK SIGNATURE FROM "column_name" KEEP COLUMN
CHECK SIGNATURE FROM "column_name" KEEP ALL
Options:
- Default: Removes both signature row and column after verification
- KEEP ROW: Keeps the signature row (removes the column)
- KEEP COLUMN: Keeps the signature column (removes the row)
- KEEP ALL: Keeps both the signature row and column
Example:
CHECK SIGNATURE FROM "__SIGNATURE__"
Returns the original data if verification succeeds, or raises an error if:
- Data values have been modified
- Rows have been added or removed
- Signed columns are missing
- The signature has been tampered with
Use Cases:
- Multi-party data exchange: Sign data before sending to partners
- Audit trails: Verify data integrity months later
- Data pipelines: Ensure no unauthorized modifications between steps
- Compliance: Prove data provenance and integrity
Security Features:
- Each signature uses a unique ephemeral RSA key pair
- Forward secrecy: compromising one key doesn't affect other signatures
- Self-contained: no key management required
- No environment variables or configuration needed
Complete Workflow Example:
{
  "transforms": [
    "UPDATE \"Amount\" = input.\"Amount\" * 1.1",
    "SIGN ALL COLUMNS INTO \"__SIGNATURE__\"",
    "... (data is stored or transferred) ...",
    "CHECK SIGNATURE FROM \"__SIGNATURE__\"",
    "AGGREGATE COLUMN SUM(\"Amount\") GROUP BY \"Vendor\""
  ]
}