If can be used with all wrangles and connectors for read, write or run. If can be used to set a condition to determine if a wrangle or connector will execute or not.
If uses python syntax and will evaluate to True or False based on whether python considers the result to be 'truthy'.
Recipe variables ${variable} are parameterized and may be used within the statement, the dataframe object (df) is also available to write conditions on.
wrangles:
  - convert.case:
      input: a column
      case: upper
      if: ${variable}
      # Set the case to upper only if the value of ${variable} is true
wrangles:
  - convert.case:
      input: a column
      case: upper
      if: len(df) > 0
      # Set the case to upper only if the dataframe is not empty
For Wrangles, and Write, the additional variables 'columns', 'row_count' and 'column_count' are available.
write:
  - file:
      name: file1.csv
      if: row_count > 0
      # Write a file only if there's at least one row
      
  - file:
      name: file2.csv
      if: '"specific_column" in columns'
      # Write a file if a certain column exists. Note the quotes needed
      # to prevent "spec... being interpreted as a YAML string.
Environment variables are always strings. When using variable sourced from an environment variable, False may become 'False', which Python evaluates as a non-zero length string and therefore as True. Consider using an alternative such as:
if: str(${variable}).upper() == 'TRUE'
Where can be used to filter the rows used by a read, wrangle or write. Unlike if, which either executes or not, where will execute with a subset of the data.
Where uses SQL syntax (specifically the sqlite dialect) and should be written as the where clause of a sql select query.
# Write only the rows where the column
# named category contains the value 'example'
write:
  - file:
      name: file.csv
      where: category = 'example'
Column names that contain spaces can be referenced using square brackets ([example column]), double quotes ("example column") or backticks (`example column`).
Recipe variables that are included in a where clause are vulnerable to sql injection. Values can be parameterized with where_params, to be correctly treated as a value. Parameters can be included in the where condition with ? as a position based variable or :name as a named variable.
# Write only the rows where the column named
# category matches the value passed in as
# the variable ${category_variable}
write:
  - file:
      name: file.csv
      where: category = ?
      where_params:
        - ${category_variable}
      # set value based on position
        
  - file:
      name: file.csv
      where: category = :namedparameter
      where_params:
        namedparameter: ${category_variable}
      # set a named value