Profiling Data Overview

A profile is a set of metadata that describes the content and structure of a dataset.
Data profiling is often the first step in a project. You can run a profile to evaluate the structure of data and to verify that data columns are populated with the types of information you expect. If a profile reveals problems in the data, you can define steps in your project to fix those problems. For example, if a profile reveals that a column contains values that are longer than expected, you can design data quality processes to remove or fix the problem values.
A profile that analyzes the data quality of selected columns is called a column profile.
Note: You can also use the Developer tool to discover primary key, foreign key, and functional dependency relationships, and to analyze join conditions on data columns.
A column profile provides the following facts about data:
You can run a column profile at any stage in a project to measure data quality and to verify that changes to the data meet your project objectives. You can also run a column profile on a transformation in a mapping to see the effect that the transformation will have on the data.
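
To make the idea of a column profile concrete, the following is a minimal sketch in Python of the kind of facts a profile might report for a single column. It is not how the Developer tool computes a profile. The function name profile_column, the chosen statistics, the expected_max_length parameter, and the sample values are all assumptions made for illustration.

```python
from collections import Counter


def profile_column(values, expected_max_length=None):
    """Return simple profile facts for one column of string values.

    Hypothetical helper for illustration only; it does not reproduce
    the statistics that the Developer tool reports.
    """
    # Treat None and empty strings as null values.
    non_null = [v for v in values if v is not None and v != ""]
    lengths = [len(v) for v in non_null]

    facts = {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "most_common": Counter(non_null).most_common(3),
    }

    # Flag values that are longer than the length you expect.
    if expected_max_length is not None:
        facts["too_long"] = [v for v in non_null if len(v) > expected_max_length]

    return facts


# Made-up sample column of ZIP codes (assumed data, not HypoStores data).
zip_codes = ["10001", "94105", None, "100011234", "94105", ""]
facts = profile_column(zip_codes, expected_max_length=5)
print(facts)
```

In this sketch, the too_long entry corresponds to the earlier example of a column that contains values longer than expected. Those are the values that a later data quality process would remove or fix.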

Story

HypoStores wants to verify that customer data is free from errors, inconsistencies, and duplicate information. Before HypoStores designs the processes to meet its data quality objectives, it needs to measure the quality of its source data files and confirm that the data is ready to process.

Objectives

In this lesson, you complete the following tasks:

Prerequisites

Before you start this lesson, verify the following prerequisite:

Time Required