19th International Conference on Production Research

AN INTELLIGENT REASONING MODEL FOR YARN MANUFACTURE

Jian-Guo Yang, Fu Zhou, Jing-Zhu Pang, Zhi-Jun Lv
College of Mechanical Engineering, University of DongHua, Ren Min Bei Road 2999, Song Jiang Zone, Shanghai, P R China

Abstract
Although much work has been done to construct prediction models for yarn processing quality, the relation between spinning variables and yarn properties has not been established conclusively so far. Support vector machines (SVMs), based on statistical learning theory, are gaining applications in machine learning and pattern recognition because of their high accuracy and good generalization capability. This study briefly introduces the SVM regression algorithms and presents an SVM-based system architecture for predicting yarn properties. Model selection, which amounts to a search in hyper-parameter space, is performed with a grid-search method to find suitable parameters. Experimental results are compared with those of artificial neural network (ANN) models. The investigation indicates that, for small data sets and real-life production, SVM models are capable of maintaining the stability of predictive accuracy and are more suitable for the noisy and dynamic spinning process.

Keywords: Support vector machines, Structure risk minimization, Predictive model, Kernel function, Yarn quality

1 INTRODUCTION
Changing economic and political conditions and the increasing globalisation of the market mean that the textile sector faces ever greater challenges. To stay competitive, there is an increasing need for companies to invest in new products. Along the textile chain, innovative technologies and solutions are required to continuously optimize the production process. High quality standards and extensive technical and trade know-how are thus prerequisites for keeping abreast of the growing dynamics of the sector [1]. Although much work has been done to construct prediction models for yarn processing quality, the relation between spinning variables and yarn properties has not been established conclusively so far. The increasing quality demands from spinners make clear the need to explore further innovative ways of quality prediction. The widespread use of artificial intelligence (AI) has created a revolution in the domain of quality prediction, for example the application of artificial neural networks (ANN) in textile engineering [2]. This study presents an intelligent predictive model for yarn process quality based on support vector machines. The relevant algorithm, model selection and experiments are presented in detail.

2 SVM REGRESSION ALGORITHMS
The main objective of regression is to approximate a function $g(x)$ from a given noisy set of samples $G = \{(x_i, y_i)\}_{i=1}^{N}$ obtained from the function $g$. The basic idea of support vector machines (SVM) for regression is to map the data $x$ into a high-dimensional feature space via a nonlinear mapping and to perform a linear regression in this feature space:

$$ f(x) = \sum_{i=1}^{D} w_i \phi_i(x) + b \qquad (1) $$

where $w$ denotes the weight vector, $b$ is a constant known as the “bias”, and $\{\phi_i(x)\}_{i=1}^{D}$ are called features. Thus, the problem of nonlinear regression in the lower-dimensional input space is transformed into a linear regression in the high-dimensional feature space. The unknown parameters $w$ and $b$ in Equation (1) are estimated using the training set $G$. To avoid overfitting and thereby improve the generalization capability, the following regularized functional, the sum of the empirical risk and a complexity term $\|w\|^2$, is minimized [3]:

$$ R_{reg} = R_{emp} + \lambda \|w\|^2 = \frac{1}{M} \sum_{i=1}^{M} |f(x_i) - y_i|_{\varepsilon} + \lambda \|w\|^2 \qquad (2) $$

where $M$ is the number of training samples (written $N$ in the definition of $G$), $\lambda$ is a regularization constant, and the cost function defined by

$$ |f(x) - y|_{\varepsilon} = \begin{cases} |f(x) - y| - \varepsilon, & |f(x) - y| \ge \varepsilon \\ 0, & |f(x) - y| < \varepsilon \end{cases} \qquad (3) $$

is called Vapnik’s “ε-insensitive loss function”. It can be shown that the minimizing function has the following form:

$$ f(x, \alpha, \alpha^*) = \sum_{i=1}^{M} (\alpha_i^* - \alpha_i) k(x_i, x) + b \qquad (4) $$

with $\alpha_i \alpha_i^* = 0$ and $\alpha_i, \alpha_i^* \ge 0$, where the kernel function $k(x_i, x)$ describes the dot product in the D-dimensional feature space:

$$ k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle \qquad (5) $$

It is important to note that the features $\phi_j$ need not be computed; what is needed instead is the kernel function, which is very simple and has a known analytical form. The only condition required is that the kernel function satisfies Mercer’s condition. Some of the most commonly used kernels are the linear, polynomial, radial basis function, and sigmoid kernels. Note also that for Vapnik’s ε-insensitive loss function the Lagrange multipliers $\alpha_i, \alpha_i^*$ are sparse, i.e. they take nonzero values after the optimization of (2) only if they are on the boundary, which means that they satisfy the Karush–Kuhn–Tucker conditions.
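As a small illustration of Equations (2) and (3), the ε-insensitive loss and the regularized risk can be written out directly. This is only a minimal sketch: the function names `eps_insensitive_loss` and `regularized_risk` are hypothetical, and the predictions and weights are passed in as plain numbers rather than produced by a trained model.

```python
from typing import Sequence


def eps_insensitive_loss(prediction: float, target: float, eps: float) -> float:
    """Equation (3): zero inside the eps-tube, linear in the residual outside it."""
    residual = abs(prediction - target)
    return residual - eps if residual >= eps else 0.0


def regularized_risk(
    predictions: Sequence[float],
    targets: Sequence[float],
    weights: Sequence[float],
    eps: float,
    lam: float,
) -> float:
    """Equation (2): mean eps-insensitive loss plus lam times the squared norm of w."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equally long")
    m = len(targets)
    # Empirical risk R_emp: average loss over the M training samples.
    empirical = sum(
        eps_insensitive_loss(p, t, eps) for p, t in zip(predictions, targets)
    ) / m
    # Complexity term ||w||^2.
    complexity = sum(w * w for w in weights)
    return empirical + lam * complexity
```

Residuals smaller than ε contribute nothing to the empirical risk, which is why only the samples on or outside the ε-tube end up influencing the solution.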
The coefficients $\alpha_i, \alpha_i^*$ are obtained by maximizing the following form:

$$ \max_{\alpha^*, \alpha} R(\alpha^*, \alpha) = -\frac{1}{2} \sum_{i,j=1}^{M} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j) k(x_i, x_j) - \varepsilon \sum_{i=1}^{M} (\alpha_i^* + \alpha_i) + \sum_{i=1}^{M} y_i (\alpha_i^* - \alpha_i) \qquad (6) $$

subject to

$$ \sum_{i=1}^{M} (\alpha_i^* - \alpha_i) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C \qquad (7) $$

Only some of the coefficients $\alpha_i, \alpha_i^*$ will be different from zero, and the data points associated with them are called support vectors. The parameters C and ε are free and have to be chosen by the user. Computing b requires a more direct use of the Karush–Kuhn–Tucker conditions that lead to the quadratic programming problems stated above. The key idea is to pick those values for a point $x_k$ on the margin, i.e. with $\alpha_k$ or $\alpha_k^*$ in the open interval (0, C). One $x_k$ would be sufficient, but for stability it is recommended to take the average over all points on the margin. A more detailed description of SVM for regression can be found in Refs. [3~6].

3 SVM BASED YARN PREDICTIVE MODEL

3.1 Model Architecture
Considering some salient features of SVM, such as the absence of local minima, the sparseness of the solution and the improved generalization, an SVM-based yarn quality prediction system is proposed (shown in Fig.1). The system architecture mainly consists of three modules, i.e. data acquisition, reasoning machine, and user interface. Among them, the user interface provides friendly interactive operation of the model, including data cleaning, model training, parameter selection, and so on. The data acquisition module collects data from the yarn production process and transforms them into the engineering database. The reasoning machine is in essence an SVM-based yarn process simulator, which is used to train the predictive models and then to make real-world process decisions according to the different raw material inputs.

Fig.1 Yarn Quality Predictive Model Architecture (blocks: Raw Yarn Material Properties, User Interface, Yarn Quality Prediction, SVM-based Process Simulator, Reasoning Machines, Textile Engineering Database, Data Acquisition, Yarn Production Process)

3.2 Model Selections
In the yarn predictive learning task, an appropriate model and parameter estimation method should be selected to obtain a high level of performance from the learning machine. Lacking a priori information about the accuracy of the y-values, it can be difficult to come up with a reasonable value of ε a priori. Instead, one would rather specify the degree of sparseness and let the algorithm automatically compute ε from the data. This is the idea of ν-SVM, a modification of the original ε-SVM introduced by Schölkopf, Smola, Williamson et al. [6], which was used to construct the yarn predictive model in our study. Under this approach, the parameters that usually have to be chosen are the following:

  • the penalty term C, which determines the tradeoff between the complexity of the decision function and the number of training examples misclassified;

  • the sparsity parameter ν, chosen in accordance with the noise in the output values in order to get the highest generalization accuracy;

  • the kernel function $k(x, y)$.

According to reference [7], the sparsity parameter ν may usually be chosen in the interval [0.3, 0.6]; here ν = 0.583. The radial basis function (RBF) kernel, given by Equation (8), is used:

$$ k(x, y) = \exp\left( -\frac{\|x - y\|^2}{2\sigma^2} \right) \qquad (8) $$

where σ is the width parameter of the RBF kernel. The RBF kernel nonlinearly maps samples into a higher dimensional space, so it, unlike the linear kernel, can handle the case when the relation between inputs and outputs is nonlinear. In addition, the sigmoid kernel behaves like RBF for certain parameters.
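Equation (8) is simple enough to evaluate by hand. The following is a minimal sketch of it in plain Python; the function name `rbf_kernel` and the list-based vector representation are assumptions made for illustration, not part of the system described here.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Equation (8): k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance between the two input vectors.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

The value equals 1 when the two inputs coincide and decays towards 0 as they move apart, with σ controlling how quickly the similarity falls off.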
A further reason for using the RBF kernel is the number of hyper-parameters, which influences the complexity of model selection. The polynomial kernel has more hyper-parameters than the RBF kernel. Finally, the RBF kernel has fewer numerical difficulties: a key point is that $0 < k(x, y) \le 1$, in contrast to polynomial kernels, whose values may go to infinity or zero when the degree is large. Moreover, it is noted that the sigmoid kernel is not valid (i.e. not the inner product of two vectors) under some parameters [4].

3.3 Optimization of Model Parameter
Two key parameters of the SVM model still need to be chosen: C and σ. Unfortunately, it is difficult to know beforehand which C and σ are best for a given problem. Our goal is to identify a good pair (C, σ) so that the model can accurately predict unknown data (i.e., testing data). A common way to do so is to separate the training data into two parts, one of which is treated as unknown when training the model. The prediction accuracy on this held-out part then reflects more precisely the performance on unknown data. This procedure is called cross-validation, and it also helps to prevent over-fitting. In this study, the training data were divided into p subsets; the regression function was built on p − 1 of them with a given set of parameters {C, σ}, and the performance of the parameter set was measured by the computational risk, here the mean squared error (MSE, see Equation 9), on the last subset. The above procedure is repeated p times, so that each subset is used once for testing. Averaging the MSE over the p trials gives an estimate of the expected generalization error for training on sets of size $\frac{p-1}{p} \cdot l$, where $l$ is the number of training data:

$$ MSE = \frac{1}{pq} \sum_{j=1}^{p} \sum_{i=1}^{q} \left( y_{ti}^{(j)} - y_{pi}^{(j)} \right)^2 \qquad (9) $$

where q is the number of samples in the tested subset of the training set, and $y_{ti}^{(j)}$ and $y_{pi}^{(j)}$ are the i-th observed value and predicted value under the j-th tested subset, respectively. In order to find better pairs (C, σ), a “grid-search” [8] on C and σ is employed in this work. First, according to the possible range of the two parameters, the (C, σ) space was divided into r pairs; then each pair was tried using cross-validation, and the one with the best cross-validation accuracy was picked as the optimal parameters of the model.

4 THE EXPERIMENTS STUDY
In this work, a small population (a total of twenty-six different data samples) from real worsted spinning was acquired. To demonstrate the generalization performance of the SVM model, different experiments were completed and compared with ANN models. To simplify the problem, as in most ANN models [2, 9], some fibre properties and processing information were selected as the SVM model’s inputs: mean fibre diameter (MFD, μm), diameter distribution (CVD, %), hauteur (HT, mm), fibre length distribution (CVH, %), short fibre content (SFC, %), yarn count (CT, tex), twist (TW, t.p.m), draft ratio (DR), spinning speed (SS, r.p.m), and traveler number (TN). Four yarn properties, namely unevenness (CV %), elongation at break (EB, %), break force (BF, cN) and end-down per 1000 spindle hours (ED), served as the SVM model’s outputs.

One of the primary aspects of developing an SVM regression model is the selection of the penalty term C and the width parameter σ of the RBF kernel. To optimize these two parameters, the “grid-search” method described above was applied in the present work. In fact, optimizing the model parameters needs an iterative process that continuously shrinks the search area and, as a result, obtains a satisfactory solution. Table 1 lists the final search area and optimal values of the four SVM models.

After the completion of model development or training, all the models based on SVM (and ANN) were applied to the unseen testing data set. Statistical parameters such as the correlation coefficient between the actual and predicted values (R), the mean squared error, and the mean error %, were used to compare the predictive power of the SVM-based and ANN-based models. Results are shown in Table 2. It was observed that for the ANN models the mean error (%) exceeds 10% for three of the models, with only the CV% model remaining at about 5%, and the correlation coefficient (R) of the CV% and EB models is very low, at 0.76 and 0.67 respectively. For the SVM models, however, the mean error (%) is less than 10% except for ED, which is still high, and the correlation coefficient (R) of all models is improved to at least 0.80. In addition, the number of cases with over 10% error decreases from 5 and 6 in the ANN models to 2 and 3 in the SVM models. In fact, among the four yarn properties considered in our work, end-down per 1000 spindle hours can be affected by different operators and observers [10], and such data often undermine the prediction accuracy of various regression models. Even so, for ED, almost all statistical parameters obtained with the SVM model are better than those obtained with the ANN model.

5 CONCLUSIONS
Support vector machines are a new learning-by-example paradigm with many potential applications in science and engineering. The salient features of SVM include the absence of local minima, the sparseness of the solution and the improved generalization. As SVMs are a relatively new technique, their application to textile production has hitherto been quite limited. However, the elegance of the formalism involved and their successful use in diverse science and engineering applications confirm the expectations raised by this appealing learning-from-examples approach. In this study, we presented an SVM model for predicting yarn properties and compared it with the BP neural network model. We have found that, like the ANN model, the SVM model is able to predict with reasonably good accuracy in most cases. A more interesting phenomenon is that, for small data sets and real-life production, the predictive power of ANN models appears to decrease, while SVM models are still capable of maintaining the stability of predictive accuracy to some extent. The experimental results indicate that SVM models are more suitable for the noisy and dynamic spinning process. Of course, as with other emerging industrial techniques, applied issues with SVM call for further development and investigation, such as how to design the kernel function and how to set the SVM hyper-parameters (to make industrial model development easier). Our research thus far demonstrates that SVMs provide an alternative solution for spinners to predict yarn properties more correctly and reliably.

6 ACKNOWLEDGMENT
This research was supported by the national science foundation and the technology support plan of the People’s Republic of China, under contract numbers 70371040 and 2006BAF01A44 respectively.

7 REFERENCES
[1] Renate Esswein, “Knowledge assures quality”, International Textile Bulletin, 2004, Vol15, no2, 17~21.
[2] R. Chattopadhyay and A. Guha, “Artificial Neural Networks: Applications to Textiles”, Textile Progress, 2004, Vol35, no1, 1~42.
[3] V. David Sanchez A, “Advanced Support Vector Machines and Kernel Methods”, Neurocomputing, 2003, Vol55, no3, 5-20.
[4] V. N. Vapnik, 1999, The Nature of Statistical Learning Theory, 2nd ed., Berlin: Springer, 31-188.
[5] B. Schölkopf, C. Burges, and A. Smola, 1999, Advances in Kernel Methods: Support Vector Learning, Cambridge, MA: MIT Press, 5-73.
[6] B. Schölkopf, A. Smola, R. C. Williamson, et al, “New support vector algorithms”, Neural Computation, 2000, Vol12, no4, 1207-1245.
[7] Athanassia Chalimourda, B. Schölkopf, A. Smola, “Experimentally Optimal ν in Support Vector Regression for Different Noise Models and Parameter Settings”, IEEE trans. on Neural Netw., 2004, Vol17, no2, 127-141.
[8] Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin, A practical guide to support vector classification, available at http://www.csie.ntu.edu.tw/~cjlin/paper
[9] Refael B., Lijing W., Xungai W., “Predicting worsted spinning performance with an artificial neural network model”, Textile Res. J., 2004, Vol74, no.8, 757-763.
[10] Peter R. Lord, 2003, Handbook of Yarn Production (Technology, Science and Economics), Abington, England: Woodhead Publishing Limited, 95-212.

Table 1 The optimal values of σ and C

Output parameter       σ        C
CV%                    0.973    1606
Elongation at break    0.016    14.55
Breaking force         0.012    101.19
Ends-down              0.287    2.975

Table 2 Comparison of the predictive power of the SVM-based and ANN-based models

                             Predicted value using ANN model    Predicted value using SVM model
Sample No.                   CV%     EB      BF       ED        CV%     EB      BF       ED
W21                          19.32   13.81   113.89   70.41     19.66   12.85   116.24   72.06
W22                          20.52   16.55   61.91    75.78     20.88   12.25   76.87    72.40
W23                          15.62   12.32   153.46   39.40     16.84   15.59   156.57   42.22
W24                          20.66   16.55   61.91    75.78     20.75   12.25   76.87    72.40
W25                          22.60   19.77   47.00    69.84     19.66   12.76   76.86    59.31
W26                          20.70   11.87   66.76    79.22     21.20   12.59   66.62    81.27
Correlation coefficient, R   0.76    0.67    0.96     0.88      0.88    0.80    0.99     0.91
Mean squared error           0.01    0.12    0.07     0.03      0.003   0.05    0.01     0.03
Mean error %                 5.73    24.35   13.67    19.99     2.85    9.23    5.52     17.29
Cases with over 10% error    1       6       5        6         0       2       2        3
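The summary rows of Table 2 can be computed from paired actual and predicted values along the following lines. This is a hedged sketch rather than the procedure actually used: the paper does not give the formulas, so "mean error %" is assumed here to be the mean absolute percentage error, R is assumed to be the Pearson correlation coefficient, and the MSE is computed on unscaled values (the table's MSE figures suggest some normalization that is not stated). The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def correlation_coefficient(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation between actual and predicted values (assumed meaning of R)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def percentage_errors(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Absolute error of each prediction as a percentage of the actual value."""
    return [abs(p - a) * 100.0 / abs(a) for a, p in zip(actual, predicted)]


def summarize(
    actual: Sequence[float], predicted: Sequence[float], threshold: float = 10.0
) -> Dict[str, float]:
    """Collect the four statistics reported per model in Table 2."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equally long")
    errors = percentage_errors(actual, predicted)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return {
        "R": correlation_coefficient(actual, predicted),
        "mse": mse,
        "mean_error_pct": sum(errors) / len(errors),
        "cases_over_threshold": sum(1 for e in errors if e > threshold),
    }
```

Calling `summarize(actual, predicted)` once per yarn property and per model would give one column of the lower part of the table, provided the measured values for samples W21 to W26 are available; they are not listed in the table itself.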