Powerful insights come from clean data. The Augmented Analytics Data Profiler helps you understand how clean your data is and guides you through suggestions for cleaning it, right from the DataSet Overview page. Using ML and AI technology, the Profiler calculates a health score for every DataSet so you can see at a glance how complete the DataSet is. It also suggests recommended actions such as handling nulls, dropping columns with no data, correcting suspicious repeating data, dealing with empty values, and more.
Using Data Profiler
You can execute any recommended action directly from the Profiler: choose the action from the dropdown menu, and it is applied in the Views Explorer. You can then use the cleansed view for your visualizations or for machine learning as needed.
- Health Score: Provides a quick glance at how healthy the DataSet is based on null count.
Note: In upcoming iterations, this algorithm will continue to be updated to include a more robust set of data.
- Data Overview: Provides an at-a-glance look at how much data is in the DataSet by showing the total number of columns, rows, size, and missing data points. To see more metrics about the DataSet, select the "View Statistics by Column" button.
- Column Types: Gives you an idea of what kind of data is in the DataSet based on the types of its columns. To see your full schema, select the "View Full Schema" button.
- Recommended Actions: Provides a list of recommended actions you can take on your data. These actions help you clean your DataSet before creating visualizations or using an AutoML model.
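To make the Health Score and Data Overview figures more concrete, here is a minimal Python sketch of how such numbers could be computed for a small table. It is an illustration only, not the Profiler's actual algorithm: the function names (`is_missing`, `profile`), the treatment of empty strings as missing, and the formula (percentage of non-missing cells) are all assumptions.

```python
from typing import Any, Dict, List


def is_missing(value: Any) -> bool:
    """Assumption: None and empty or whitespace-only strings count as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def profile(rows: List[Dict[str, Any]], columns: List[str]) -> Dict[str, Any]:
    """Summarize a table: row and column counts, missing cells, and a score.

    The score here is simply the percentage of cells that are not missing.
    This formula is hypothetical and chosen only for illustration.
    """
    total_cells = len(rows) * len(columns)
    missing_by_column = {
        col: sum(1 for row in rows if is_missing(row.get(col)))
        for col in columns
    }
    missing = sum(missing_by_column.values())
    if total_cells == 0:
        score = 100.0  # nothing to be missing in an empty table
    else:
        score = 100.0 * (total_cells - missing) / total_cells
    return {
        "rows": len(rows),
        "columns": len(columns),
        "missing": missing,
        "missing_by_column": missing_by_column,
        "health_score": round(score, 1),
    }


# Hypothetical sample data: "notes" is entirely empty, "region" has one gap.
sample_rows = [
    {"id": 1, "region": "West", "notes": None},
    {"id": 2, "region": "", "notes": None},
    {"id": 3, "region": "East", "notes": None},
    {"id": 4, "region": "East", "notes": None},
]
summary = profile(sample_rows, ["id", "region", "notes"])
print(summary)
```

In this sample, 5 of the 12 cells are missing, so the sketch reports a score of 58.3. The per-column counts play the role of the column-level statistics mentioned above.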
To apply a recommended action, choose it from the action dropdown menu. You are automatically taken to the Views Explorer, where the action is applied. From there, you can continue applying recommended actions to your DataSet using the menu in the rail.
Select the Data Profiler icon in Views Explorer to open and close the Data Profiler recommendations.
Recommended actions include:
- Remove empty columns
- Rename long column names
- Remove constant columns
- Outlier detection (September Release)
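The first three actions in this list can be sketched in plain Python to show what each one does to a table. This is a hedged illustration rather than the Profiler's implementation: the helper names, the 30-character name limit, and the definition of a "constant" column (every row holds the same non-missing value) are assumptions. Outlier detection is left out because the source does not describe how it works.

```python
from typing import Any, Dict, List, Tuple


def is_missing(value: Any) -> bool:
    """Assumption: None and empty or whitespace-only strings count as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def empty_columns(rows: List[Dict[str, Any]], columns: List[str]) -> List[str]:
    """Columns in which every row is missing a value."""
    if not rows:
        return []
    return [c for c in columns if all(is_missing(r.get(c)) for r in rows)]


def constant_columns(rows: List[Dict[str, Any]], columns: List[str]) -> List[str]:
    """Columns in which every row holds the same non-missing value."""
    if not rows:
        return []
    result = []
    for col in columns:
        first = rows[0].get(col)
        if not is_missing(first) and all(r.get(col) == first for r in rows):
            result.append(col)
    return result


def shorten_names(columns: List[str], max_length: int = 30) -> Dict[str, str]:
    """Map each column name to a truncated name (the limit is hypothetical).

    A numeric suffix keeps names unique, so a suffixed name may slightly
    exceed max_length.
    """
    mapping: Dict[str, str] = {}
    used = set()
    for name in columns:
        short = name if len(name) <= max_length else name[:max_length].rstrip(" _")
        candidate, n = short, 1
        while candidate in used:
            n += 1
            candidate = f"{short}_{n}"
        used.add(candidate)
        mapping[name] = candidate
    return mapping


def clean(
    rows: List[Dict[str, Any]], columns: List[str], max_length: int = 30
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Drop empty and constant columns, then shorten the remaining names."""
    drop = set(empty_columns(rows, columns)) | set(constant_columns(rows, columns))
    kept = [c for c in columns if c not in drop]
    renames = shorten_names(kept, max_length)
    cleaned = [{renames[c]: r.get(c) for c in kept} for r in rows]
    return cleaned, [renames[c] for c in kept]


# Hypothetical sample data for illustration.
long_name = "customer_lifetime_value_in_us_dollars_estimated"
sample_columns = ["id", "country", "notes", long_name]
sample_rows = [
    {"id": 1, "country": "US", "notes": None, long_name: 120.5},
    {"id": 2, "country": "US", "notes": "", long_name: 80.0},
    {"id": 3, "country": "US", "notes": None, long_name: 95.25},
]
cleaned_rows, cleaned_columns = clean(sample_rows, sample_columns)
print(cleaned_columns)
```

Here `notes` is dropped as empty, `country` is dropped as constant, and the long column name is truncated. The original rows are left untouched, which loosely mirrors how the actions produce a cleansed view rather than changing the source DataSet.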