Dataset expression syntax error

The expression defined on a dataset has a syntax errors.

Understanding the issue

This occurs when an expression-based dataset contains invalid syntax. Expression datasets allow you to create values that depend on other columns or datasets, but they must follow specific formatting rules.

Common syntax errors:

  • Missing or mismatched brackets: $[, $(, )
  • Invalid column references: non-existent column names
  • Incorrect deterministic syntax: malformed dataset references
  • Unescaped special characters in static text

How to fix

Problem examples:

{
  "datasets": [
    {
      "name": "BrokenEmail",
      "type": "Expression",
      "expression": "$(GivenNames($[GivenName]).$(FamilyNames)@company.com"
      // Issues: Missing closing ), missing deterministic syntax for FamilyNames
    }
  ]
}

Corrected example:

{
  "tables": [
    {
      "schema": "Person", 
      "name": "Users",
      "columns": [
        {
          "name": "GivenName",
          "dataset": "GivenNames",
          "deterministic": true
        },
        {
          "name": "FamilyName", 
          "dataset": "FamilyNames",
          "deterministic": true
        },
        {
          "name": "Email",
          "dataset": "EmailAddresses",
          "deterministic": true
        }
      ],
      "datasets": [
        {
          "name": "EmailAddresses",
          "type": "Expression",
          "expression": "$(GivenNames($[GivenName])).$(FamilyNames($[FamilyName]))@company.com"
        }
      ]
    }
  ]
}


Expression syntax rules

Static text:

  • Plain text requires no special syntax
  • Example: @company.com

Column references:

  • Format: $[column_name]
  • Example: $[GivenName]

Dataset references:

  • Non-deterministic: $(dataset_name)
  • Deterministic: $(dataset_name($[column_name]))

Complete expression example:

"$(GivenNames($[Id])).$(FamilyNames($[Id]))@$(Domains($[Id]))$(TopLevelDomains($[Id]))"

Validation checklist

  • All opening brackets have matching closing brackets
  • Column names in $[column_name] exist in the table
  • Dataset names in $(dataset_name) are defined or built-in
  • Deterministic references use proper $(dataset($[column])) format
  • Expression datasets are defined under tables, not globally

Prevention

  • Start with simple expressions and build complexity gradually
  • Validate column names exist in your table schema
  • Test expressions with --dry-run before full processing

Didn't find what you were looking for?