Profiling
Profiling page allows users to set profiling rules for selected objects and columns. This allows users to get insights on what kind of data is available, how the data is changing over time, and detect anomalies and deviations in data.
Adding a new object to profile
Press "Add objects" button on the sidebar
Filter out objects you wish to profile
Choose an object to profile by checking checkboxes and clicking "Insert selected" button
New profiling object will now be shown on the page
Modifying profiling object
Opening profiling object allows to:
Schedule rules to run
Set alerts for scheduled executions
Add SQL filters to profiled object
Execute rules manually
Add/remove columns to profile
Add/remove rules for a column
Delete object/remove profiling for an object
Set threshold for the object
Deleting a profiling object
Open profiling object and click “Delete object” button
Select object(s) from list and click "Delete selected" button
Column values
It is possible to analyze how values in column change over time. This allows the application to detect deviations/anomalies in date and alert users.
By default, column values are analyzed by finding counts for each value. It is possible to change the SQL used to get column value results by clicking “SQL” button:
You can use predefined queries to set SQL for finding counts for each value or for range (numeric values):
Disabling column value rule execution
It is possible to disable column values from being analyzed by clicking the slider button:
Delete values from column value history
You can delete values that no longer exist in column by clicking the trashcan icon for a value:
Exclude values from column value history
Excluded value results are still saved, but are no longer analyzed for anomalies (excluded values do not flag as failed). You can exclude values by clicking the exclude icon for a value:
Exclude executions from column value history
To exclude column value executions from history, simply open the execution history list and deselect executions:
Profiling rules
Profiling rules are used to profile data for columns. By default, a set of rules is available by the application, such as:
Average
Count of NULLs
Count of NOT NULLs
Unique values
SUM of values
Adding profiling rules to a column
There are 4 ways to add rules to a column:
When adding new profiling objects
When adding profiled columns for an object
Adding rules manually for a specific column by clicking "+"
Removing profiling rules from a column
Profiling rules are removed/deleted when:
Deleting profiled column
Deleting specific rule by clicking “trash” icon:
Executing profiling rules for a column
To execute a single rule, open the column and click on “refresh” icon:
Exclude executions from rule history
To exclude executions from rule history, simply select executions from history and click disable button:
View all existing rules for an object
All existing rules can be viewed by clicking Actions-> View all rules
It is possible to delete specific rules or delete execution history for selected rules:
Manage profiling rules
Profiling rules can be added/removed/changed by clicking on "Manage rules" button from sidebar. Each rule can be assigned for specific database drivers/connectors and data types. A rule must have a unique name. Parameters available to use:
<column> – column name
<object> – table/view name with schema (schema.table)
A rule can be assigned to all applicable columns by clicking the "Actions" button and then selecting "Assign" for the selected rule. The option "Clear" will remove rule from all columns that are no longer applicable by data type or driver/connector.
Rules with "default" value turned off are not automatically created when generating rules for an object/column. They can only be added manually.
Manage data types
Data types for rules can be added/removed/changed by clicking on "Manage data types" button from sidebar. Rule types categorize which rule can be created/assigned for a column based on its data type. Data types with parentheses must be inserted without the parentheses (e.g. VARCHAR(150) as VARCHAR).