A Review: Data Quality Problem in Predictive Analytics
As data volumes continue to grow, there has been a revolution in the computational and statistical methods used to process and analyze data into insight and knowledge. This paradigm shift in data analytics, from explicit to implicit, opens the way to extracting knowledge from data through a prospective approach that determines the value of new observations based on the structure of the relationship between inputs and outputs. Data preparation is a very important stage in predictive analytics. Producing high-quality analytics requires data whose quality meets the relevant criteria. Data quality has played an important role in strategic decision making and planning since before the digital computer era. The main challenge is that raw data cannot be used directly for analysis. One data quality problem that arises is completeness: missing data is a frequent cause of incomplete data sets, and as a result the predictive analysis generated from such data becomes inaccurate. In this paper we discuss problems related to data quality in predictive analytics through a literature study of related research. In addition, we present challenges and possible directions for the predictive analytics domain with respect to data quality problems.