Data Observability platform Help

Profiling

Profiling page allows users to set profiling rules for selected objects and columns. This allows users to get insights on what kind of data is available, how the data is changing over time, and detect anomalies and deviations in data.

Profiling

Adding a new object to profile

  1. Press "Add objects" button on the sidebar

  2. Filter out objects you wish to profile

    ProfilingObjects
  3. Choose an object to profile by checking checkboxes and clicking "Insert selected" button

  4. New profiling object will now be shown on the page

Modifying profiling object

Opening profiling object allows to:

  • Schedule rules to run

  • Set alerts for scheduled executions

  • Add SQL filters to profiled object

  • Execute rules manually

  • Add/remove columns to profile

  • Add/remove rules for a column

  • Delete object/remove profiling for an object

  • Set threshold for the object

ProfilingModify

Deleting a profiling object

  • Open profiling object and click “Delete object” button

    ProfilingDelete
  • Select object(s) from list and click "Delete selected" button

    ProfilingDeleteObject

Column values

It is possible to analyze how values in column change over time. This allows the application to detect deviations/anomalies in date and alert users.

ProfilingColumn

By default, column values are analyzed by finding counts for each value. It is possible to change the SQL used to get column value results by clicking “SQL” button:

ProfilingSql

You can use predefined queries to set SQL for finding counts for each value or for range (numeric values):

Disabling column value rule execution

It is possible to disable column values from being analyzed by clicking the slider button:

ColumnValues

Delete values from column value history

You can delete values that no longer exist in column by clicking the trashcan icon for a value:

ProfilingHistory

Exclude values from column value history

Excluded value results are still saved, but are no longer analyzed for anomalies (excluded values do not flag as failed). You can exclude values by clicking the exclude icon for a value:

ProfilingHistoryExec

Exclude executions from column value history

To exclude column value executions from history, simply open the execution history list and deselect executions:

ProfilingExclude

Profiling rules

Profiling rules are used to profile data for columns. By default, a set of rules is available by the application, such as:

  • Average

  • Count of NULLs

  • Count of NOT NULLs

  • Unique values

  • SUM of values

ProfilingRules

Adding profiling rules to a column

There are 4 ways to add rules to a column:

  1. When adding new profiling objects ProfilingRuleObjects

  2. When adding profiled columns for an object

    ProfilingColumnObject
  3. Adding rules manually for a specific column by clicking "+"

    ProfilingInsertRule

Removing profiling rules from a column

Profiling rules are removed/deleted when:

Executing profiling rules for a column

To execute a single rule, open the column and click on “refresh” icon: ProfilingRefresh

Exclude executions from rule history

To exclude executions from rule history, simply select executions from history and click disable button:

ProfilingExcludeExec

View all existing rules for an object

All existing rules can be viewed by clicking Actions-> View all rules

ProfilingViewRules

It is possible to delete specific rules or delete execution history for selected rules:

ProfilingSpecificRule

Manage profiling rules

Profiling rules can be added/removed/changed by clicking on "Manage rules" button from sidebar. Each rule can be assigned for specific database drivers/connectors and data types. A rule must have a unique name. Parameters available to use:

  • <column> – column name

  • <object> – table/view name with schema (schema.table)

A rule can be assigned to all applicable columns by clicking the "Actions" button and then selecting "Assign" for the selected rule. The option "Clear" will remove rule from all columns that are no longer applicable by data type or driver/connector.

Rules with "default" value turned off are not automatically created when generating rules for an object/column. They can only be added manually.

ProfilingManageProfilingRules

Manage data types

Data types for rules can be added/removed/changed by clicking on "Manage data types" button from sidebar. Rule types categorize which rule can be created/assigned for a column based on its data type. Data types with parentheses must be inserted without the parentheses (e.g. VARCHAR(150) as VARCHAR).

ProfilingManageDataTypes
Last modified: 21 October 2024