Research on the Prediction Model of Material Cost Based on Data Mining

Abstract: Material cost prediction should be based on scientific mathematical models so that the influence of subjective factors on the decomposition of quotas and other indicators can be reduced. This paper uses the particle swarm optimization (PSO) algorithm to optimize the parameters of the support vector machine, preprocesses the actual data, and establishes a prediction model of material cost by carrying out data mining with the support vector regression (SVR) machine. In the forecasting process, the total material cost is first predicted, the predicted results are then fitted against the actual values, and finally the relative errors are tested. The results indicate that the forecasting effect is satisfactory.


INTRODUCTION
The prediction of material cost refers to the use of specific methods to estimate and predict the level of material cost on the basis of historical data and relevant information. Its characteristic feature is that it predicts the future from history and the unknown level from known information. The updating of statistical data and the characteristics of material production in enterprises mean that the sequence formed by material cost data is generally non-stationary and nonlinear and contains points of abrupt change. The support vector machine implements structural risk minimization: it bounds the error rate of a learning machine on test data (namely the generalization error rate) by the sum of the training error rate and a term that depends on the complexity of the learning machine. A distinctive characteristic of the support vector machine is that it provides good generalization performance in pattern classification even though it does not use knowledge specific to the problem domain.
This paper carries out data mining to establish the prediction model of material cost using the learning method of the support vector regression machine. This method learns a functional mapping between the input space and the output, and researchers generally regard the set of functions in this type of mapping as a learning machine. It starts from the study and observation of the data (namely the samples). Researchers derive rules that cannot currently be obtained from first principles and use these rules to analyze the data obtained. Thus, once values can be predicted, they carry out decision making and value estimation.

PSO-SVR MODELING ALGORITHM
To determine the optimal parameters of the support vector regression (SVR) machine, an advanced optimization algorithm is introduced into the SVR algorithm, which is a hotspot in the support vector machine (SVM) field [1,2].
SVM is an effective new method for research on data mining. Three types of kernel functions are shown in Table 1.

Type of Kernel Function    Formula
Multinomial Function
RBF Function
Sigmoid Function
In this paper, the model of support vector regression machine adopted the RBF function as a kernel function.
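For reference, the commonly used textbook forms of these three kernels are given below. The notation is an assumption rather than the paper's own: $g$ denotes the kernel parameter (matching the parameter g optimized later), $r$ an offset, and $d$ the polynomial degree. The exact expressions printed in Table 1 may differ.

```latex
\begin{align}
K_{\text{poly}}(x_i, x_j)    &= \left(g\, x_i^{\top} x_j + r\right)^{d}, \\
K_{\text{RBF}}(x_i, x_j)     &= \exp\!\left(-g\, \lVert x_i - x_j \rVert^{2}\right), \\
K_{\text{sigmoid}}(x_i, x_j) &= \tanh\!\left(g\, x_i^{\top} x_j + r\right).
\end{align}
```

With the RBF kernel, only one kernel parameter ($g$) has to be tuned together with the penalty coefficient c.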
At present, the genetic algorithm is the most widely used optimization algorithm, but its operations, such as selection, crossover and mutation, are relatively complicated, and its rate of convergence and precision are limited when it handles high-dimensional samples. By comparison, particle swarm optimization is an optimization algorithm based on population-level parallelism and global search; it is simpler than other algorithms and has relatively few parameters to set. In addition, this algorithm exhibits a faster rate of convergence and a stronger global searching ability. Hence, the model is easy to operate and can easily be generalized [3][4][5]. In this paper, a global optimization search is conducted for the parameters c and g in the support vector regression model, and the optimal penalty coefficient and kernel parameter obtained from the search are taken as the parameters of the final model.

Parameters of Selection Optimization
Optimizing the kernel parameter g and the penalty factor c simultaneously, with automatic selection of both, provides a reference approach for solving the parameter optimization problems of the support vector machine. This paper mainly optimizes the kernel parameter g and the penalty factor c of the support vector regression machine; the approach can be extended to other parameter optimization problems of SVR.

Fitness Function Design of Particle Swarm Optimization
The selection of an appropriate fitness function is very important for the particle swarm optimization algorithm. When designing the fitness function, the aim of feature selection is to use a small number of features to obtain the same or a better classification effect. Therefore, the number of selected features should be considered in the evaluation of the fitness function. Specifically, if two feature subsets give the same accuracy rate, the one with fewer features has the higher fitness [6,7].

Operating Steps for Optimizing SVR Parameter by Using PSO Algorithm
The flow chart of the PSO-SVR algorithm is shown in Fig. (1).

1. The particle swarm over the parameters g and c is initialized first, including the confirmation of the group size and the setting of the position and velocity of each particle. Then, after the inertia weight of PSO is preset, the maximum number of iterations is set.

2. The best solution of each particle is set to the current position of the particle. The fitness value of each particle is calculated, and the best solution of the particle with the maximum fitness is taken as the best solution of the current group.

3. The position and velocity of the particles are updated.

4. The fitness function is used to evaluate the fitness of the particles.

5. For each particle, its current fitness value is compared with its individual best; if the current value is better, the individual best is replaced by it.

6. For each particle, its current fitness value is compared with the best solution of the current group; if the current value is better, it is taken as the current global optimum.

7. The process terminates if the stopping conditions are met; otherwise it returns to Step 3.

8. The test sample is predicted by using the well-trained support vector regression machine.

9. The optimal parameter combination (namely penalty coefficient c and kernel parameter g) is substituted into the programmed support vector regression machine, and the test sample data are then used for model prediction.

INPUT AND OUTPUT OF MODEL
In order to avoid the influence on the model caused by the different dimensions of the data and to improve the training speed, the data matrix is normalized so that the input and output data are limited to [0, 1]. The normalized mapping adopted in this paper is as follows:

$$y = (y_{\max} - y_{\min}) \frac{x - x_{\min}}{x_{\max} - x_{\min}} + y_{\min}$$

where $x$ is the original data, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the original data respectively, and $y_{\min}$ and $y_{\max}$ are the range parameters of the mapping. Accordingly, the required predicted value is obtained by inverse normalization of the data after the prediction is finished.

TRAINING AND TEST OF MODEL
This paper adopted the algorithm of particle swarm optimization to obtain the optimal path and the relative optimal value [8]. The steps for the algorithm of particle swarm optimization are as follows:

1. The parameter optimization for the established SVR prediction model is conducted by using PSO, and some parameters of the PSO algorithm are initialized.

2. Selection of the learning factors: the learning factor governing local search is set to 1.5 and the learning factor governing global search is set to 1.7; both lie within [0, 2].

3. The size of the particle swarm is 20. The maximum and minimum of the penalty coefficient c are 100 and 0, respectively. The maximum and minimum of the kernel parameter g are also 100 and 0, respectively. The maximum number of iterations is 200.

In the modeling process, two programs (svmtrain and svmpredict) are mainly used. In the training process, the input parameters should be adjusted repeatedly to obtain the optimal results.

APPLICATION OF MODEL
The time series studied in this paper is obtained from the indicator data. From a time series, a sequence of prediction values ordered in time can be obtained, and different time series can be obtained for different objects of study or problems. For the prediction of material cost, this paper adopts the monthly series of total material cost as the historical data to predict the future value of the variable. The material cost and the actual data of the correlative factors are shown in Table 2.
Since the data have different dimensions under different influence factors, they can have different economic meanings and explanations. To avoid the different degrees of influence caused by different dimensions, and in view of the differences in units among the variables in this paper, the data are normalized. In the prediction of the total material cost, 30 sets of true and valid data are extracted from the historical data. After normalization, the data lie between 0 and 1. In this way, the significant influence caused by different dimensions of data is avoided in the estimation process of the support vector regression machine. To evaluate the validity of the data, the data are first divided into training data and test data: the first 25 sets are chosen as the training set and the last 5 as the test set. The management of material cost controls not only the total material cost but also its key components.
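The min-max mapping from the section INPUT AND OUTPUT OF MODEL, its inverse, and the 25/5 split can be sketched as follows. The function names and the cost series are hypothetical placeholders (the values are not the data of Table 2); the mapping itself is y = (y_max - y_min)(x - x_min)/(x_max - x_min) + y_min with y_min = 0 and y_max = 1.

```python
def minmax_map(x, x_min, x_max, y_min=0.0, y_max=1.0):
    """Map x from [x_min, x_max] to [y_min, y_max]."""
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min


def minmax_unmap(y, x_min, x_max, y_min=0.0, y_max=1.0):
    """Inverse mapping: recover the original scale from a normalized value."""
    return (y - y_min) * (x_max - x_min) / (y_max - y_min) + x_min


# Hypothetical series of 30 monthly totals (placeholder values only).
costs = [100.0 + 3.0 * i for i in range(30)]
x_min, x_max = min(costs), max(costs)

# Normalize to [0, 1], then use the first 25 points for training
# and the last 5 for testing, as described in the text.
scaled = [minmax_map(v, x_min, x_max) for v in costs]
train, test = scaled[:25], scaled[25:]

# After prediction, values are mapped back to the original scale.
restored = [minmax_unmap(v, x_min, x_max) for v in scaled]
```

The same `x_min` and `x_max` must be reused when mapping predictions back, otherwise the restored values are on a different scale from the original data.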

Parameter Optimization
Following the steps of particle swarm parameter optimization, the optimal parameters of the support vector regression machine for predicting the total material cost are obtained: c = 43.22, g = 0.01. The minimum mean square error of cross validation in parameter selection is MSE = 0.0102, which indicates a good optimization effect.
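The cross-validation mean square error used to score a candidate pair (c, g) can be sketched as below. This is an illustrative assumption about how such a score may be computed, not the paper's code: `cv_mse`, the interleaved fold assignment, the number of folds and the `fit_predict` callback are hypothetical, and `mean_predictor` is only a stand-in model so that the sketch runs without an SVR library.

```python
def cv_mse(xs, ys, fit_predict, c, g, k=5):
    """Mean squared error of k-fold cross validation for one (c, g) pair.

    `fit_predict(train_x, train_y, test_x, c, g)` must train a model on the
    training fold and return predictions for `test_x`.
    """
    n = len(xs)
    squared_error = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th sample forms a fold
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        test_x = [xs[i] for i in sorted(held_out)]
        test_y = [ys[i] for i in sorted(held_out)]
        preds = fit_predict(train_x, train_y, test_x, c, g)
        squared_error += sum((p - t) ** 2 for p, t in zip(preds, test_y))
    return squared_error / n


def mean_predictor(train_x, train_y, test_x, c, g):
    """Stand-in model that ignores c and g; replace with an SVR trainer."""
    mean = sum(train_y) / len(train_y)
    return [mean] * len(test_x)


# Toy check with a constant target: the stand-in predicts it exactly.
score = cv_mse(list(range(10)), [2.0] * 10, mean_predictor, c=1.0, g=0.1)
```

Because the PSO steps in this paper select the particle with maximum fitness, a search would use the negative of this value (or its reciprocal) as the fitness of a particle.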

Training and Prediction
The support vector regression machine is trained and used for prediction with the optimal parameters obtained from particle swarm optimization; the correlation coefficient is R = 98.63%. The fitting between the original data and the prediction data is carried out after inverse normalization. The fitting results of material cost prediction are shown in Fig. (2).

Model Evaluation
To illustrate that the support vector regression machine with parameters optimized by particle swarm optimization performs well in the prediction of material cost, the relative error of each data point is calculated. The prediction effect is better when the relative error of prediction is basically within 0.005. The relative errors of material cost prediction are shown in Fig. (3).
For comparison, the same data are also predicted with a BP neural network and with the multiple regression method. The BP neural network has a three-layer structure, and the number of hidden-layer units is set to 6 through a contrast test. The material cost prediction results of the three methods are shown in Table 3. It is observed that the maximum relative error of the support vector regression prediction model established in this paper is 2.32%, that of the BP neural network is 5.52%, and that of multiple regression is 14.93%. The results from the three prediction models show that the support vector regression model with particle swarm optimization gives the best effect.
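The relative-error comparison can be sketched as follows, taking the relative error as (prediction data - original data) / original data. The function names and the numbers are hypothetical and chosen only to illustrate the calculation; they are not values from Table 3.

```python
def relative_errors(predicted, original):
    """Signed relative error of each prediction."""
    return [(p - o) / o for p, o in zip(predicted, original)]


def max_abs_relative_error(predicted, original):
    """Largest relative error in absolute value, used to compare models."""
    return max(abs(e) for e in relative_errors(predicted, original))


# Hypothetical test values (not taken from the paper).
original = [200.0, 250.0]
predicted = [202.0, 245.0]

errors = relative_errors(predicted, original)
worst = max_abs_relative_error(predicted, original)
```

Applying `max_abs_relative_error` to the test-set predictions of each model gives a single figure per method, which is how the three methods are ranked in the text.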

CONCLUSION
The performance optimization of the support vector machine is mainly reflected in the optimization of its parameters, and in this respect the particle swarm optimization algorithm offers a faster rate of convergence and a stronger global searching ability. Therefore, this paper established the prediction model of the support vector regression machine with particle swarm optimization and described the modeling algorithm, input and output, and training and test in detail. The paper has also provided the criteria and method for determining the parameters, and the resulting model predicts material cost with higher accuracy.
Relative error = (Prediction Data - Original Data)/Original Data