Data classification is the practice of tagging enterprise content so that data can be found quickly. It comes at a price, however, and the cost of trying to classify *all* of your data can be prohibitive. Striking the right balance is hard: the question is which content, when tagged, will offer the most value. Data value is often derived from financial (business value), regulatory (compliance), legal (liability), and security (privacy) attributes.
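As a rough illustration of tagging only the content that carries value, the sketch below assigns the four value attributes with simple keyword rules. The `RULES` table, its keywords, and the `tag` function are hypothetical and chosen for the example; a real scheme would come from the organization's own classification design.

```python
# Hypothetical keyword rules, one entry per value attribute named in the text.
RULES = {
    "financial": ("invoice", "revenue", "budget"),
    "regulatory": ("audit", "retention", "compliance"),
    "legal": ("contract", "liability", "lawsuit"),
    "security": ("password", "passport", "personal data"),
}


def tag(text):
    """Return the set of value attributes whose keywords appear in the text."""
    lowered = text.lower()
    return {
        attribute
        for attribute, keywords in RULES.items()
        if any(keyword in lowered for keyword in keywords)
    }


documents = {
    "q3_deal.txt": "Signed contract and invoice for Q3",
    "menu.txt": "Lunch menu for Friday",
}

# Tag every document, then keep only those with at least one value attribute.
tags = {name: tag(text) for name, text in documents.items()}
worth_tagging = sorted(name for name, found in tags.items() if found)
print(worth_tagging)
```

Content that matches no rule is left untagged, which keeps the classification effort focused on data with some identifiable value. Plain substring matching is crude and would need refinement before real use.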
Software has attempted to auto-classify content, with varying degrees of success. ComputerWorld describes accurate data classification as elusive. The greatest successes for auto-classification have come within the narrow domain of a single application, such as email.
Data classification is best done under centralized control, especially when organization-wide compliance is a consideration. Before applying data classification at all, it is important to make sure the data being used is current and of good quality. One way to do that is to apply data cleansing, the removal of obsolete and redundant data. An initial audit of all available organizational data provides an overview of the data, and a data classification design and plan can then be based on the audit results.
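The cleansing step can be sketched in a few lines. This is a minimal example under stated assumptions: the `Record` type, the `cleanse` function, and the cutoff date are all hypothetical, "redundant" is taken to mean byte-identical content, and "obsolete" is taken to mean last modified before the cutoff.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """Hypothetical inventory entry for one piece of content."""
    name: str
    content: str
    last_modified: date


def cleanse(records, cutoff):
    """Drop records older than the cutoff and records with duplicate content."""
    seen = set()
    kept = []
    # Visit newest first so the most recent copy of duplicated content survives.
    for rec in sorted(records, key=lambda r: r.last_modified, reverse=True):
        if rec.last_modified < cutoff:
            continue  # obsolete: last modified before the cutoff
        digest = hashlib.sha256(rec.content.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # redundant: identical content already kept
        seen.add(digest)
        kept.append(rec)
    return kept


records = [
    Record("q1_report.txt", "Revenue summary", date(2024, 3, 1)),
    Record("q1_report_copy.txt", "Revenue summary", date(2024, 1, 15)),
    Record("old_memo.txt", "Office move notice", date(2015, 6, 30)),
]
kept = cleanse(records, cutoff=date(2020, 1, 1))
print([r.name for r in kept])
```

Only the records that survive cleansing would then go forward into the classification plan.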














