Sunday, December 17, 2006

Data mining guide

Data mining is also known as Knowledge Discovery in Databases (KDD). It is the process of automatically searching large volumes of data for patterns. The word data is the plural of datum; a data set is a large collection of measurements or observations of a variable. Data mining draws on computational techniques from statistics, machine learning and pattern recognition.

Data mining is a crucial process that helps companies better understand their customers. It has been defined as 'the nontrivial extraction of implicit, previously unknown, and potentially useful information from data' and also as 'the science of extracting useful information from large sets or databases'. The term is interpreted differently in different contexts, but most often it is used in the context of a business's or other organization's need to identify trends.

Just like mining for gold, data mining digs through large databases and extracts a wealth of customer data, which is then translated into useful and predictive information. A classic example is its use in a retail sales department, where a store tracks the purchases of a customer who buys a lot of cotton trousers. The data mining system will make an association between that customer and cotton trousers, and the store might either market cotton trousers directly to that customer or try to get the customer to buy a wider range of products. Data mining also enables automatic detection of patterns in a database and guides marketing professionals to a better understanding of customer psychology.
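The kind of association described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular data mining product works: the function name pair_confidence and the sample baskets are made up for the example. It counts how often two items are bought together and reports the confidence of a rule "a -> b", that is, the share of baskets containing a that also contain b.

```python
from collections import Counter
from itertools import combinations


def pair_confidence(baskets):
    """Return {(a, b): confidence of the rule 'a -> b'} for co-purchased items.

    Confidence = baskets containing both a and b / baskets containing a.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore duplicates within one basket
        item_counts.update(items)
        for a, b in combinations(sorted(items), 2):
            # Record the pair in both directions, since a -> b and b -> a
            # generally have different confidence values.
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {
        pair: count / item_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


# Hypothetical purchase history, invented for illustration only.
baskets = [
    ["cotton trousers", "belt"],
    ["cotton trousers", "belt", "socks"],
    ["cotton trousers", "shirt"],
    ["shirt", "socks"],
]

rules = pair_confidence(baskets)
print(rules[("cotton trousers", "belt")])
print(rules[("belt", "cotton trousers")])
```

A store could use a high-confidence rule such as "belt -> cotton trousers" to decide which product to suggest next. Real systems also filter rules by how often the pair occurs at all, so that a pattern seen only once or twice is not treated as meaningful.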

Data mining software enables users to analyze large databases to provide solutions to business decision problems. Like statistics, data mining is a technology and not a business solution in itself. The software uses information stored in a historical database of earlier interactions with customers, together with attributes such as age, zip code and feedback. From this it gives an idea of which customers are likely to be interested in a new product, which enables the marketing manager to choose appropriate customers to target.
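As a rough sketch of how historical records might be turned into a target list, the Python below groups past interactions by age band and zip code, computes the share of customers in each group who responded, and selects current customers whose group responded often enough. The field names (age_band, zip_code, responded), the 0.5 threshold and the sample records are all assumptions made for this example; real data mining software uses far richer models.

```python
from collections import defaultdict


def response_rate_by_segment(history):
    """Share of past customers in each (age_band, zip_code) segment who responded."""
    totals = defaultdict(int)
    responders = defaultdict(int)
    for record in history:
        segment = (record["age_band"], record["zip_code"])
        totals[segment] += 1
        if record["responded"]:
            responders[segment] += 1
    return {segment: responders[segment] / totals[segment] for segment in totals}


def pick_targets(customers, rates, threshold=0.5):
    """Names of customers whose segment's past response rate meets the threshold.

    Customers in a segment with no history are scored 0.0 and skipped.
    """
    return [
        c["name"]
        for c in customers
        if rates.get((c["age_band"], c["zip_code"]), 0.0) >= threshold
    ]


# Hypothetical historical interactions, invented for illustration only.
history = [
    {"age_band": "18-29", "zip_code": "10001", "responded": True},
    {"age_band": "18-29", "zip_code": "10001", "responded": True},
    {"age_band": "18-29", "zip_code": "10001", "responded": False},
    {"age_band": "50-64", "zip_code": "94105", "responded": False},
    {"age_band": "50-64", "zip_code": "94105", "responded": False},
]

# Hypothetical current customers to score.
customers = [
    {"name": "A", "age_band": "18-29", "zip_code": "10001"},
    {"name": "B", "age_band": "50-64", "zip_code": "94105"},
    {"name": "C", "age_band": "30-49", "zip_code": "60601"},
]

rates = response_rate_by_segment(history)
targets = pick_targets(customers, rates)
print(targets)
```

Grouping by only two attributes keeps the idea visible, but it also shows the limits of the approach: a customer from a segment with no history cannot be scored at all.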

Data mining is also linked with privacy concerns, particularly regarding the source of the data analyzed. For instance, an employer could use it to screen out people with diabetes or a history of heart attacks in order to reduce insurance costs, which raises ethical and legal issues. On the other hand, data mining is also used in medicine to find combinations of drugs with harmful effects.

The output of data mining has to be interpreted properly to give beneficial results. When the data collected involves individual people, issues of privacy, legality and ethics arise.

Data mining can bring pinpoint accuracy to sales. Creating large central stores of customer data for use throughout the enterprise is becoming quite common, but these data warehouses are of no use unless the company has proper applications for accessing and using the data.