Skip to main content

Transformation Instructions

info

This is a work in progress feature that is not yet publicly available.

Transformation Instructions

The Data Transformation feature allows you to manipulate and reshape structured data files (CSV, JSON, XML, Excel) directly within your continuity kits. This is accomplished by defining a series of transformation steps using a powerful, SQL-like Domain-Specific Language (DSL).

How It Works

Data transformation is configured as an externalAction within a kit's checklist. The action requires input and output file paths and a list of transforms to be executed in sequence.

Example Action Structure

{
"type": "externalAction",
"actionName": "DataTransform",
"value": {
"input": "/path/to/input.csv",
"output": "/path/to/output.xlsx",
"transforms": [
"DELETE HEADER",
"UPDATE \"DEBIT EUR\" = 0 WHERE input.\"DEBIT EUR\" ISNOTSET",
"AGGREGATE ROWS input.\"DEBIT EUR\" + input.\"CREDIT EUR\" INTO \"total\"",
"DELETE WHERE input.\"CAT Cash Flow\" = \"A NE PAS COMPTER\"",
"AGGREGATE COLUMN SUM(\"total\") GROUP BY \"CAT Cash Flow\", \"CAT P&L\""
]
}
}

Supported File Formats

  • Input: CSV, JSON, XML, Excel (.xls, .xlsx)
  • Output: CSV, JSON, XML, Excel (.xls, .xlsx)

Advanced Format Handling

Reading Nested XML (SEPA SDD Example)

When dealing with complex, nested XML files like a SEPA Direct Debit (pain.008.001.02), you can use read_params to flatten the data into a tabular format.

  • record_path: A list of keys representing the path to the array of records you want to iterate over.
  • meta: A list of paths to metadata fields that are common to all records. These fields will be added as columns to every row.

Example: Given the following SEPA SDD XML input:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">
<CstmrDrctDbtInitn>
<GrpHdr>
<MsgId>M000038280</MsgId>
<CreDtTm>2025-05-23T10:30:00</CreDtTm>
<InitgPty>
<Nm>CREDIT LYONNAIS</Nm>
</InitgPty>
</GrpHdr>
<PmtInf>
<PmtInfId>FR-202505-38280</PmtInfId>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220347--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">62.27</InstdAmt>
<Dbtr>
<Nm>ACHIM PASCAL</Nm>
</Dbtr>
</DrctDbtTxInf>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220348--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">80.00</InstdAmt>
<Dbtr>
<Nm>JEAN DUPONT</Nm>
</Dbtr>
</DrctDbtTxInf>
</PmtInf>
</CstmrDrctDbtInitn>
</Document>

Use the following read_params in your action's value to flatten the transactions:

"read_params": {
"record_path": ["Document", "CstmrDrctDbtInitn", "PmtInf", "DrctDbtTxInf"],
"meta": [
["Document", "CstmrDrctDbtInitn", "GrpHdr", "MsgId"],
["Document", "CstmrDrctDbtInitn", "GrpHdr", "CreDtTm"],
["Document", "CstmrDrctDbtInitn", "PmtInf", "PmtInfId"]
]
}

This will produce the following internal table, which you can then transform using the DSL:

"PmtId.EndToEndId","InstdAmt.#text","InstdAmt.@Ccy","Dbtr.Nm","Document.CstmrDrctDbtInitn.GrpHdr.MsgId","Document.CstmrDrctDbtInitn.GrpHdr.CreDtTm","Document.CstmrDrctDbtInitn.PmtInf.PmtInfId"
"D011220347--05-2025",62.27,"EUR","ACHIM PASCAL","M000038280","2025-05-23T10:30:00","FR-202505-38280"
"D011220348--05-2025",80.00,"EUR","JEAN DUPONT","M000038280","2025-05-23T10:30:00","FR-202505-38280"

Writing Custom Nested XML (SEPA SDD Example)

To generate a nested XML output from flat data, you can provide a detailed template.

  • Set format to "custom_xml" inside write_params.
  • Define the desired XML structure using xml_template.

Template Keywords:

  • tag: The name of the XML element.
  • attributes: An object for element attributes. Values can be static or dynamic (e.g., {"from_column": "ColumnName"}).
  • text: The element's text content. Can be static or dynamic.
  • children: An array of child element templates.
  • iterate_on_rows: Place this key inside an element's template to repeat it for each row in the data.
  • aggregate: Computes a value over the entire dataset (e.g., {"aggregate": "count"} or {"aggregate": "sum", "column": "Amount"}).

Example: To generate a SEPA SDD XML from a flat CSV:

ns1:MsgId,ns1:CreDtTm,ns1:InstdAmt,Ccy,ns1:EndToEndId,ns1:Nm8
M000038280,2025-05-23T10:30:00,62.27,EUR,D011220347--05-2025,ACHIM PASCAL
M000038280,2025-05-23T10:30:00,80.00,EUR,D011220348--05-2025,JEAN DUPONT

Use the following write_params:

"write_params": {
"format": "custom_xml",
"pretty_print": true,
"xml_template": {
"tag": "Document",
"attributes": {"xmlns": "urn:iso:std:iso:20022:tech:xsd:pain.008.001.02"},
"children": [
{
"tag": "CstmrDrctDbtInitn",
"children": [
{
"tag": "GrpHdr",
"children": [
{"tag": "MsgId", "text": {"from_column": "ns1:MsgId"}},
{"tag": "CreDtTm", "text": {"from_column": "ns1:CreDtTm"}},
{"tag": "NbOfTxs", "text": {"aggregate": "count"}},
{"tag": "CtrlSum", "text": {"aggregate": "sum", "column": "ns1:InstdAmt"}}
]
},
{
"iterate_on_rows": {
"tag": "DrctDbtTxInf",
"children": [
{
"tag": "PmtId",
"children": [{"tag": "EndToEndId", "text": {"from_column": "ns1:EndToEndId"}}]
},
{
"tag": "InstdAmt",
"attributes": {"Ccy": {"from_column": "Ccy"}},
"text": {"from_column": "ns1:InstdAmt"}
},
{
"tag": "Dbtr",
"children": [{"tag": "Nm", "text": {"from_column": "ns1:Nm8"}}]
}
]
}
}
]
}
]
}
}

This will produce the following XML:

<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.008.001.02">
<CstmrDrctDbtInitn>
<GrpHdr>
<MsgId>M000038280</MsgId>
<CreDtTm>2025-05-23T10:30:00</CreDtTm>
<NbOfTxs>2</NbOfTxs>
<CtrlSum>142.27</CtrlSum>
</GrpHdr>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220347--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">62.27</InstdAmt>
<Dbtr>
<Nm>ACHIM PASCAL</Nm>
</Dbtr>
</DrctDbtTxInf>
<DrctDbtTxInf>
<PmtId>
<EndToEndId>D011220348--05-2025</EndToEndId>
</PmtId>
<InstdAmt Ccy="EUR">80.00</InstdAmt>
<Dbtr>
<Nm>JEAN DUPONT</Nm>
</Dbtr>
</DrctDbtTxInf>
</CstmrDrctDbtInitn>
</Document>

Transformation DSL

The DSL provides a set of commands to perform transformations. Note that all column names and string literals must be enclosed in double quotes (").

SELECT

Selects a subset of columns and filters rows based on conditions.

Example: Given the following CSV data:

ID,Name,Status,Amount
1,Alice,Completed,100
2,Bob,Pending,50
3,Charlie,Completed,200

The command:

SELECT "ID", "Name", "Amount" WHERE input."Status" = "Completed"

Will result in:

ID,Name,Amount
1,Alice,100
3,Charlie,200

UPDATE

Updates the values in a column, optionally based on conditions.

Example: Given the following CSV data:

Product,Price,Currency
Apple,1.0,USD
Banana,0.5,EUR
Orange,1.2,USD

The command:

UPDATE "Price" = input."Price" * 1.1 WHERE input."Currency" = "USD"

Will result in:

Product,Price,Currency
Apple,1.1,USD
Banana,0.5,EUR
Orange,1.32,USD

RENAME

Renames a column.

Example: Given the following CSV data:

Transaction ID,Amount
txn_123,100
txn_456,200

The command:

RENAME "Transaction ID" "TransactionId"

Will result in:

TransactionId,Amount
txn_123,100
txn_456,200

DELETE

Deletes rows or columns.

Example: Given the following CSV data:

ID,Name,Amount,InternalNotes
1,Alice,100,Note A
2,Bob,0,Note B
3,Charlie,200,Note C

The command:

DELETE WHERE input."Amount" <= 0

Will result in:

ID,Name,Amount,InternalNotes
1,Alice,100,Note A
3,Charlie,200,Note C

And the command:

DELETE COLUMN "InternalNotes"

Will result in:

ID,Name,Amount
1,Alice,100
3,Charlie,200

AGGREGATE

Performs aggregation operations.

Column Aggregation

Example: Given the following CSV data:

Category,Amount
Fruit,10
Vegetable,20
Fruit,15

The command:

AGGREGATE COLUMN SUM("Amount") GROUP BY "Category"

Will result in:

Category,Amount
Fruit,25
Vegetable,20

Row Aggregation

Example: Given the following CSV data:

Price,Quantity
10,2
5,3

The command:

AGGREGATE ROWS input."Price" * input."Quantity" INTO "Total"

Will result in:

Price,Quantity,Total
10,2,20
5,3,15

SORT

Sorts rows based on one or more columns.

Example: Given the following CSV data:

Name,Date,Amount
Alice,2024-10-12,100
Bob,2024-10-11,200
Alice,2024-10-12,50

The command:

SORT "Name" ASC, "Amount" DESC

Will result in:

Name,Date,Amount
Alice,2024-10-12,100
Alice,2024-10-12,50
Bob,2024-10-11,200

INSERT

Inserts a new row or column.

Example: Given the following CSV data:

ID,Name
1,Alice

The command:

INSERT ROW 2, "Bob"

Will result in:

ID,Name
1,Alice
2,Bob

And the command:

INSERT COLUMN "Status"

Will result in:

ID,Name,Status
1,Alice,
2,Bob,

MOVE

Moves a row or column to a new position.

Example: Given the following CSV data:

ID,Name,Status
1,Alice,A
2,Bob,B
3,Charlie,C

The command:

MOVE ROW 2 0

Will result in:

ID,Name,Status
3,Charlie,C
1,Alice,A
2,Bob,B

Expressions

Expressions can be used in UPDATE and AGGREGATE ROWS commands. They support arithmetic operations (+, -, *, /) and the FORMAT_DATE function.

FORMAT_DATE Function: Converts a date string from one format to another.

Example: Given the following CSV data:

Date,FormattedDate
2024-10-13,

The command:

UPDATE "FormattedDate" = FORMAT_DATE(input."Date", "%Y-%m-%d", "%d/%m/%Y")

Will result in:

Date,FormattedDate
2024-10-13,13/10/2024

Conditions

Conditions are used in SELECT, UPDATE, and DELETE WHERE clauses.

Operators:

  • = (or ==), !=, >, >=, <, <=
  • ISNOTSET (checks for null/empty values)

Example:

DELETE WHERE input."Amount" < 10 and input."Status" = "Pending"

Unsupported Operations

The following operations are defined in the DSL specification but are not yet implemented:

  • JOIN
  • UNION

Using these commands will result in an error.