1. Overview
Data Shaping is a built-in GoInsight utility for cleaning, transforming, and reshaping data structures within your workflows. It provides tools for common data processing tasks, allowing you to filter, deduplicate, and reshape JSON objects and arrays without writing code.
The GoInsight Data Shaping node performs these transformations inside the workflow itself. This is especially useful when preparing data from one API to be sent to another, or when you need to refine a dataset before using it in subsequent workflow steps. Key capabilities include:
- Filtering Fields: Selectively keep or remove specific fields from objects.
- Deduplicating Records: Remove duplicate entries from a list based on one or more keys.
- Paginating Data: Easily extract a subset of records from a large array, similar to API pagination.
2. Prerequisites
This is a built-in GoInsight utility. No external accounts or special prerequisites are required to use this node.
3. Credentials
This is a built-in GoInsight utility and does not require any credentials to be configured.
4. Supported Operations
This node provides operations that act on the fields of individual records and operations that act on lists of records.
Summary
The following table summarizes the available operations, grouped by the type of data they primarily handle.
| Resource | Operation | Description |
|---|---|---|
| Records | Distinct By | Removes duplicate records from an array of objects based on specified keys, keeping either the first or last occurrence. |
| Records | Limit Records | Limits the number of records in an array of objects using pagination parameters limit and offset. |
| Fields | Remove Fields | Removes specified fields from a JSON object or an array of objects, optionally dropping fields recursively in nested structures. |
| Fields | Select Fields | Retains only the specified fields from a JSON object or an array of objects, removing all other properties. |
Operation Details
Distinct By
Removes duplicate records from an array of objects based on specified keys, keeping either the first or last occurrence.
Input Parameters:
- Records: The list of objects to deduplicate.
- Keys: A comma-separated list of key names to determine uniqueness.
Options:
- Keep: Specifies which duplicate to keep: "first" or "last".
Output:
- Records (object-array): The deduplicated list of objects.
- InputCount (number): Number of valid input records before deduplication.
- OutputCount (number): Number of records returned after deduplication.
- ErrorMessage (string): Error message if any; empty string on success.
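The deduplication behavior described above can be approximated in plain Python. This is a minimal sketch for reasoning about the Keep option, not the node's actual implementation: the function name `distinct_by` is hypothetical, and the handling of non-object entries and of result ordering for "last" are assumptions.

```python
from typing import Any


def distinct_by(records: list[Any], keys: str, keep: str = "first") -> dict[str, Any]:
    """Deduplicate records on a comma-separated list of key names (illustrative only)."""
    key_names = [k.strip() for k in keys.split(",") if k.strip()]
    # Assumption: only dict entries count as "valid" input records.
    valid = [r for r in records if isinstance(r, dict)]

    # For keep="last", scan from the end so the last occurrence is seen first.
    ordered = valid if keep == "first" else list(reversed(valid))
    seen: set[tuple[str, ...]] = set()
    kept: list[dict[str, Any]] = []
    for record in ordered:
        # repr() makes unhashable values (lists, dicts) usable in a set.
        signature = tuple(repr(record.get(k)) for k in key_names)
        if signature not in seen:
            seen.add(signature)
            kept.append(record)
    if keep == "last":
        kept.reverse()  # restore the original relative order

    return {
        "Records": kept,
        "InputCount": len(valid),
        "OutputCount": len(kept),
        "ErrorMessage": "",
    }


# Hypothetical sample data: two sign-ups share the same email.
signups = [
    {"email": "ann@example.com", "plan": "free"},
    {"email": "bob@example.com", "plan": "pro"},
    {"email": "ann@example.com", "plan": "pro"},
]
first = distinct_by(signups, "email", keep="first")
last = distinct_by(signups, "email", keep="last")
```

With "first", the record for ann@example.com keeps the "free" plan; with "last", it keeps the "pro" plan. Passing several keys (for example "email, plan") treats records as duplicates only when all listed keys match.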
Limit Records
Limits the number of records in an array of objects using pagination parameters limit and offset.
Input Parameters:
- Records: The list of objects to paginate.
- Limit: The maximum number of records to return.
Options:
- Offset: The starting index for pagination.
- Reverse: Whether to paginate from the end; false for forward (from the beginning), true for backward (from the end).
Output:
- Records (object-array): The paginated subset of records after applying offset and limit.
- OutputCount (number): The number of records returned.
- InputCount (number): The total number of valid records before pagination.
- ErrorMessage (string): Error message if any error occurs; empty string on success.
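The interaction of Limit, Offset, and Reverse can be sketched with list slicing. This is an illustration under stated assumptions rather than the node's implementation: the name `limit_records` is hypothetical, and it is assumed that with Reverse enabled the offset counts back from the last record while the returned page keeps its original order. Verify the actual ordering in the node's run log.

```python
from typing import Any


def limit_records(records: list[Any], limit: int, offset: int = 0,
                  reverse: bool = False) -> dict[str, Any]:
    """Return one page of records (illustrative only)."""
    # Assumption: only dict entries count as "valid" records.
    valid = [r for r in records if isinstance(r, dict)]
    if limit < 0 or offset < 0:
        return {"Records": [], "OutputCount": 0, "InputCount": len(valid),
                "ErrorMessage": "limit and offset must be non-negative"}

    if reverse:
        # Assumption: skip `offset` records from the end, then take `limit`.
        end = max(len(valid) - offset, 0)
        start = max(end - limit, 0)
    else:
        start = offset
        end = offset + limit

    page = valid[start:end]
    return {"Records": page, "OutputCount": len(page),
            "InputCount": len(valid), "ErrorMessage": ""}


rows = [{"id": i} for i in range(1, 11)]          # ids 1..10
forward = limit_records(rows, limit=3, offset=2)   # ids 3, 4, 5
backward = limit_records(rows, limit=3, reverse=True)  # ids 8, 9, 10
```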
Remove Fields
Removes specified fields from a JSON object or an array of objects, optionally dropping fields recursively in nested structures.
Input Parameters:
- JSON: A JSON string representing a single object or an array of objects.
- FieldsToRemove: A comma-separated list of field names to remove.
Options:
- Recursive: Whether to recursively drop fields in nested objects or arrays.
Output:
- Record (object): The resulting object when the input is a dict; empty object if input is an array.
- Records (object-array): The resulting list of objects when the input is an array; empty list if input is a dict.
- Count (number): Number of records returned: 1 for dict input, or the length of "Records" for array input; 0 if an error occurs.
- InputKind (string): Indicates the type of the input: "dict" or "array"; empty string if an error occurs.
- ErrorMessage (string): Error message if any error occurs; empty string on success.
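To make the effect of the Recursive option concrete, the following sketch mimics the documented output shape in Python. The names `remove_fields` and `_drop` are hypothetical, and the error wording is an assumption; only the output field names come from the list above.

```python
import json
from typing import Any


def _drop(value: Any, names: set[str], recursive: bool) -> Any:
    """Remove the named keys from a dict, descending into children only if recursive."""
    if isinstance(value, dict):
        return {k: (_drop(v, names, recursive) if recursive else v)
                for k, v in value.items() if k not in names}
    if isinstance(value, list) and recursive:
        return [_drop(item, names, recursive) for item in value]
    return value


def remove_fields(json_text: str, fields_to_remove: str,
                  recursive: bool = False) -> dict[str, Any]:
    names = {n.strip() for n in fields_to_remove.split(",") if n.strip()}
    failure = {"Record": {}, "Records": [], "Count": 0, "InputKind": ""}
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        return {**failure, "ErrorMessage": f"invalid JSON: {exc.msg}"}

    if isinstance(data, dict):
        return {"Record": _drop(data, names, recursive), "Records": [],
                "Count": 1, "InputKind": "dict", "ErrorMessage": ""}
    if isinstance(data, list):
        records = [_drop(item, names, recursive) for item in data]
        return {"Record": {}, "Records": records, "Count": len(records),
                "InputKind": "array", "ErrorMessage": ""}
    return {**failure, "ErrorMessage": "input must be an object or an array"}


doc = '{"id": 1, "secret": "x", "profile": {"secret": "y", "name": "Ann"}}'
shallow = remove_fields(doc, "secret")                # nested "secret" survives
deep = remove_fields(doc, "secret", recursive=True)   # nested "secret" removed too
```

Without Recursive, only the top-level "secret" is dropped; with it, the copy inside "profile" is dropped as well.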
Select Fields
Retains only the specified fields from a JSON object or an array of objects, removing all other properties.
Input Parameters:
- JSON: A JSON string representing a single object or an array of objects.
- SelectedFields: A comma-separated list of field names to retain.
Output:
- Record (object): The resulting object when the input is a dict; empty object if input is an array.
- Records (object-array): The resulting list of objects when the input is an array; empty list if input is a dict.
- Count (number): Number of records returned: 1 for dict input, or the length of "Records" for array input; 0 if an error occurs.
- InputKind (string): Indicates the type of the input: "dict" or "array"; empty string if an error occurs.
- ErrorMessage (string): Error message if any error occurs; empty string on success.
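The same output shape applies to Select Fields. The sketch below shows an array input, assuming that selection applies to top-level fields only and that a selected field missing from a record is simply skipped; neither behavior is confirmed by this guide, and `select_fields` is a hypothetical name.

```python
import json
from typing import Any


def select_fields(json_text: str, selected_fields: str) -> dict[str, Any]:
    """Keep only the listed top-level fields (illustrative only)."""
    names = [n.strip() for n in selected_fields.split(",") if n.strip()]
    failure = {"Record": {}, "Records": [], "Count": 0, "InputKind": ""}
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        return {**failure, "ErrorMessage": f"invalid JSON: {exc.msg}"}

    def pick(obj: dict[str, Any]) -> dict[str, Any]:
        # Assumption: fields absent from a record are skipped, not set to null.
        return {name: obj[name] for name in names if name in obj}

    if isinstance(data, dict):
        return {"Record": pick(data), "Records": [], "Count": 1,
                "InputKind": "dict", "ErrorMessage": ""}
    if isinstance(data, list):
        records = [pick(o) for o in data if isinstance(o, dict)]
        return {"Record": {}, "Records": records, "Count": len(records),
                "InputKind": "array", "ErrorMessage": ""}
    return {**failure, "ErrorMessage": "input must be an object or an array"}


users = '[{"id": 1, "name": "Ann", "age": 30}, {"id": 2, "name": "Bob"}]'
result = select_fields(users, "id, name")
```

Here `result["Records"]` holds two objects with only "id" and "name", `result["Record"]` is empty, and `result["InputKind"]` is "array".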
5. Example Usage
This section will guide you through creating a simple workflow to deduplicate a list of records. Imagine you have received a list of user sign-ups from an API, but it contains duplicates. We will use the Distinct By operation to clean this list.
The workflow will look like this: Start -> Data_shaping -> Answer.
- Add the Tool Node
  - In the workflow canvas, click the + button to add a new node.
  - Select the "Tools" tab in the pop-up panel.
  - Find and select Data_shaping from the list of tools.
  - From the list of supported operations for Data_shaping, click on Distinct By. This will add the node to your canvas.
- Configure the Node
  - Click on the newly added Distinct By node to open its configuration panel on the right.
  - Credentials: As a built-in utility, this node does not require credentials.
  - Parameter Configuration:
    - Records: This field expects an array of objects. You would typically use an expression to reference the output from a previous node (e.g., {{ upstream_node.output.data }}). For this example, you can paste the following sample JSON array directly into the field:
    - Keys: Enter the field name to use for determining uniqueness. In our case, we want to remove users with the same email address, so enter email.
- Run and Validate
  - Once the parameters are configured, the error indicator on the workflow's top-right corner should disappear.
  - Click the "Run" button in the top-right corner to execute the workflow.
  - After a successful run, you can click the log icon to view the detailed input and output of the node. The output Records will contain only the unique entries, keeping the first one found:
After completing these steps, your workflow is fully configured. When executed, it will effectively remove duplicate records from your dataset based on the specified key.
6. FAQs
Q: What is the difference between Remove Fields and Select Fields?
A: They perform opposite actions:
- Remove Fields is a "blacklist" approach: you specify which fields to *delete*, and all others are kept.
- Select Fields is a "whitelist" approach: you specify which fields to *keep*, and all others are deleted.
Choose the operation that requires listing fewer field names for your specific use case.
Q: Why is my JSON input causing an error?
A: This typically occurs if the input string is not valid JSON. Please ensure your data is correctly formatted. Common issues include:
- Trailing commas in objects or arrays.
- Using single quotes instead of double quotes for keys and string values.
- Unescaped special characters within strings.
You can use an online JSON validator to check your data. Also, ensure the data structure matches what the operation expects (e.g., an array of objects for Distinct By).
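If you prefer to check a payload locally, Python's standard `json` module rejects the same kinds of input. The helper name `check_json` is hypothetical; the module and its `JSONDecodeError` attributes (`lineno`, `colno`, `msg`) are part of the standard library. Note that the node's own error messages may be worded differently.

```python
import json


def check_json(text: str) -> str:
    """Return an empty string for valid JSON, else the location and reason of the failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return ""


samples = {
    "valid": '[{"email": "ann@example.com"}]',
    "trailing comma": '[{"email": "ann@example.com"},]',
    "single quotes": "[{'email': 'ann@example.com'}]",
    # A raw newline inside a string must be written as \\n in the JSON text.
    "unescaped newline": '{"note": "line one\nline two"}',
}

for label, text in samples.items():
    problem = check_json(text)
    print(f"{label}: {problem or 'OK'}")
```

Only the first sample parses; the other three each report a position that points at the offending character.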
Q: Do these operations modify the original data from the previous node?
A: No. GoInsight nodes operate on the data passed to them and produce a new, separate output. The original data from the upstream node remains unchanged and accessible in its own output variables.
7. Official Documentation
As a built-in GoInsight utility, all relevant documentation is contained within this guide. For more advanced data transformation techniques, please refer to the main GoInsight documentation on expressions and data handling.