STUDIA INFORMATICA 2009, Volume 30, Number 2A (83)

Jerzy MARTYNA
Uniwersytet Jagielloński, Instytut Informatyki

FUZZY SUPPORT VECTOR MACHINES BASED ON DENSITY ESTIMATION WITH GAUSSIAN MIXTURE FOR MULTICLASS PROBLEMS

Summary. In this paper, we introduce new Fuzzy Support Vector Machines (FSVMs) for multiclass classification. The suggested Fuzzy Support Vector Machines take the data distribution into account through a density estimated with a set of functions defined as a Gaussian mixture. The proposed method gives more appropriate boundaries than the classical FSVM method. We demonstrate some examples which confirm our approach.

Keywords: Fuzzy Support Vector Machine, density, multiclass problems, membership functions

FUZZY SVM METHOD BASED ON DENSITY ESTIMATION WITH A GAUSSIAN MIXTURE FOR SOLVING MULTICLASS PROBLEMS

Abstract (Streszczenie). The paper presents a mathematical model, the Fuzzy Support Vector Machine (FSVM). Density estimation based on a set of functions defined as a mixture of Gaussian functions is introduced into it. The proposed method provides better boundaries than the FSVM model used so far. We demonstrate several examples which confirm the described approach.

Keywords (Słowa kluczowe): fuzzy support vector machine, density, multiclass problems, membership functions
1. Introduction

Support Vector Machines (SVMs) [10, 8] have been used in many applications for classification and regression [9, 4, 6, 7]. The SVM method is mainly used for the classification of two classes, because unclassifiable regions appear when it is applied to multiclass problems. In order to avoid this problem, Fuzzy Support Vector Machines (FSVMs) were proposed [5, 1, 3]. In these papers, the fuzzy memberships are assigned according to the distance between the patterns. Nevertheless, FSVMs treated in this way do not take the distribution of the data into account. Therefore, such FSVMs cannot adjust the decision boundaries well to the regions of the data sets.

The main goal of this paper is to introduce a new method of multiclass classification in which an unclassifiable region can be resolved. The proposed FSVMs use decision boundaries which consider not only the optimal class-separating hyperplanes of the SVM, but also the density of the distribution of the patterns. As one of the best approximations for density estimation, we use a set of functions defined in [10] as Gaussian mixtures. As a result, the multiclass problem can be better solved for data which are generally distributed.

The structure of the paper is as follows: Section 2 explains the FSVM method. Section 3 presents our solution based on an approximation of the density with Gaussian mixtures. In Section 4, we present some numerical results, focusing on the advantages and disadvantages of our approach. Finally, in Section 5, we give our conclusion and propose some future research.

2. Fuzzy Support Vector Machines

The Fuzzy Support Vector Machines were introduced by T. Inoue, S. Abe and D. Tsujinishi in the papers [5, 1, 3]. Let $[x_i, y_i]$, $i = 1, \ldots, M$, be training data, where $x_i \in R^n$ is the input and $y_i \in \{-1, 1\}$ is the output. The optimal separating hyperplane is defined by the decision function $D(x) = w^T x + b$. It can be obtained by solving the following problem:

$$\text{minimize}\quad \frac{1}{2}\|w\|^2 \tag{1}$$

$$\text{subject to}\quad y_i (w^T x_i + b) \ge 1, \quad i = 1, \ldots, M$$

The above Eq. (1) can be formulated in a simple manner, namely
$$\text{maximize}\quad W(\alpha) = \sum_{i=1}^{M} \alpha_i - \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{M} y_i y_j \alpha_i \alpha_j x_i^T x_j \tag{2}$$

$$\text{subject to}\quad \sum_{i=1}^{M} y_i \alpha_i = 0, \quad \alpha_i \ge 0, \quad i = 1, \ldots, M$$

where $\alpha_i$ is a Lagrange multiplier. The optimal weight vector $w^*$ and bias $b^*$ can be obtained as follows:

$$w^* = \sum_{i=1}^{M} y_i \alpha_i x_i, \qquad b^* = -\frac{1}{2} \left[ \max_{y_i = -1} (w^{*T} x_i) + \min_{y_i = 1} (w^{*T} x_i) \right] \tag{3}$$

The decision function $D(x)$ can be calculated from the above results: when $D(x) > 0$, pattern $x$ is classified as class 1. Otherwise, it is classified as class 2.

The multiclass problem is handled by defining the decision function for the class pair $i$ and $j$ as follows:

$$D_{ij}(x) = w_{ij}^T x + b_{ij} \tag{4}$$

where $D_{ij}(x) = -D_{ji}(x)$. For a pattern $x$ and $k$ classes we have

$$D_i(x) = \sum_{j=1,\, j \ne i}^{k} \mathrm{sign}(D_{ij}(x)) \tag{5}$$

where $\mathrm{sign}(\cdot) = 1$ for $(\cdot) > 0$ and zero otherwise. The pattern $x$ is assigned to the class

$$\arg\max_{i = 1, \ldots, k} D_i(x) \tag{6}$$

3. Fuzzy Support Vector Machines based on the density with Gaussian Mixtures for multiclass problems

In this section, we present Fuzzy Support Vector Machines based on the density with Gaussian mixtures for multiclass problems. We resolve the problem of unclassifiable regions in the FSVM classification presented above with the use of an estimator:

$$f_M(x) = \frac{1}{M} \sum_{i=1}^{M} K_M(x, x_i) \tag{7}$$

where $x_1, x_2, \ldots, x_M$ are the empirical data obtained from the observation of an $n$-dimensional random variable $x$ with the probability density function $f$, and $K_M$ is the kernel function. As the kernel function we can use the so-called Parzen kernel [10], given as follows:
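$$K_M(x, u) = h_M^{-n} K\!\left( \frac{x - u}{h_M} \right) \tag{8}$$

where $h_M$ is a function of the length $M$ of the training data such that

$$\lim_{M \to \infty} h_M = 0 \quad \text{and} \quad \lim_{M \to \infty} M h_M^n = \infty \tag{9}$$

It can be shown [2] that

$$E\left[ f_M(x) - f(x) \right]^2 \to 0 \quad \text{as } M \to \infty \tag{10}$$

at the absolutely continuous points of $f$. Function $K$ can be given in the form

$$K(x) = \prod_{i=1}^{n} H(x^{(i)}) \tag{11}$$

where $x^{(i)}$ denotes the $i$-th component of $x$. Assuming that function $H$ is of Gaussian form, we have

$$f_M(x) = \frac{1}{(2\pi)^{n/2} M h_M^n} \sum_{i=1}^{M} \exp\!\left( -\frac{(x - x_i)^T (x - x_i)}{2 h_M^2} \right) \tag{12}$$

With the help of the above equation, we can redefine the decision boundary $D_{ij}(x)$ in Eq. (5), namely

$$D_{ij}(x) = \gamma \left( w_{ij}^T x + b_{ij} \right) + (1 - \gamma) \left( f_i(x) - f_j(x) \right) \tag{13}$$

where $i$ and $j$ denote the class pair, $f_i$ and $f_j$ are the density estimates of Eq. (12) for classes $i$ and $j$, and $\gamma$ is a parameter that indicates the weight between the FSVMs and the approximation of the density with a Gaussian mixture. The membership function of $x$ for a given class $i$ is defined with the help of the minimum operator, namely

$$m_i(x) = \min_{j = 1, \ldots, k} m_{ij}(x) \tag{14}$$

4. Experimental Results

Based on the FSVM concept, this paper makes use of a script implementing this method, included in the Oracle Data Mining Software. The scripts are prepared using kernel-dependent formulas such as the ones given for the polynomial kernel of degree 2 or for the Gaussian mixture as kernel. We use our method for the classification of high-dimensional data sets such as glass and cereals. For instance, Fig. 1 shows a snapshot of the system with the classification tree obtained for the cereals data set with the help of the FSVM with polynomial kernel. Table 1 shows the classification results for these data sets.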
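As an illustration of the method of Section 3, the following minimal Python sketch evaluates the Gaussian density estimate of Eq. (12), the combined pairwise decision function of Eq. (13), and the voting rule of Eqs. (5)-(6). It is not the Oracle Data Mining script used in the experiments. The function names, the fixed bandwidth h (the paper lets it depend on the number of training data), the toy data, and the hand-picked pairwise weights (which would normally come from training an SVM for each class pair) are all assumptions made for this example.

```python
import math


def parzen_density(x, samples, h):
    """Gaussian Parzen estimate of Eq. (12) at point x.

    samples: training points of one class; h: bandwidth (assumed fixed here).
    """
    n = len(x)        # input dimension
    m = len(samples)  # number of training points of the class
    norm = (2.0 * math.pi) ** (n / 2.0) * m * h ** n
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / norm


def combined_decision(x, w_ij, b_ij, samples_i, samples_j, h, gamma):
    """Eq. (13): gamma * (w_ij^T x + b_ij) + (1 - gamma) * (f_i(x) - f_j(x))."""
    svm_term = sum(wk * xk for wk, xk in zip(w_ij, x)) + b_ij
    density_term = parzen_density(x, samples_i, h) - parzen_density(x, samples_j, h)
    return gamma * svm_term + (1.0 - gamma) * density_term


def classify(x, pair_models, class_samples, h, gamma):
    """Pairwise voting of Eqs. (5)-(6), using D_ji(x) = -D_ij(x)."""
    classes = sorted(class_samples)
    votes = {c: 0 for c in classes}
    for (i, j), (w_ij, b_ij) in pair_models.items():
        d = combined_decision(x, w_ij, b_ij,
                              class_samples[i], class_samples[j], h, gamma)
        if d > 0:
            votes[i] += 1   # sign(D_ij) = 1
        elif d < 0:
            votes[j] += 1   # sign(D_ji) = 1
    return max(classes, key=lambda c: votes[c])


# Hypothetical toy data: three classes in the plane.
CLASS_SAMPLES = {
    1: [[0.0, 0.0], [0.0, 1.0]],
    2: [[4.0, 0.0], [4.0, 1.0]],
    3: [[2.0, 4.0], [2.0, 5.0]],
}
# Hand-picked (not trained) hyperplanes (w_ij, b_ij) for each pair i < j.
PAIR_MODELS = {
    (1, 2): ([-1.0, 0.0], 2.0),
    (1, 3): ([0.0, -1.0], 2.5),
    (2, 3): ([0.0, -1.0], 2.5),
}

if __name__ == "__main__":
    for point in ([0.2, 0.5], [2.0, 4.5]):
        print(point, classify(point, PAIR_MODELS, CLASS_SAMPLES, h=1.0, gamma=0.5))
```

With gamma = 1 the combined function reduces to the SVM term of Eq. (4), and with gamma = 0 only the density difference remains, which makes the role of the weighting parameter in Eq. (13) easy to check on small examples.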
Fig. 1. The classification tree obtained for the cereals data set with the help of the FSVM method

Table 1. Classification results for glass and cereals data sets

Data set    Class    Feature
glass       6        6
cereals     6        77

Recognition rate results for these data sets are shown in Fig. 2. We used the first 60% of the data records for training and the remaining 40% of the patterns for testing. Parameter γ was studied in relation to the number of classes. By varying parameter γ we observed that, as it goes to zero, we obtain the same result as the one obtained with the help of the FSVM with polynomial kernel. If parameter γ increases to the value 1, we obtain the highest recognition of the patterns. These results indicate that the decision boundaries are thus not uniformly distributed as in the FSVM method. Figures 3 and 4 illustrate the ROC curves of two classifications for the glass data set that are obtained with the FSVM method based on the polynomial and on the Gaussian mixture as
kernels, respectively. As we can see, the classification model in Fig. 4 achieves a higher true positive rate than the classification model in Fig. 3.

Fig. 2. Recognition rate for the cereals and glass data sets

Fig. 3. The ROC curve of classification for the glass data set obtained with the FSVM method based on the polynomial kernel
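The evaluation protocol used in this section (the first 60% of the records for training, the remaining 40% for testing, a recognition rate, and ROC curves) can be sketched as follows. This is only a minimal Python illustration under stated assumptions: the function names are hypothetical, ties between classifier scores are not treated specially, at least one positive and one negative pattern are assumed, and the code is unrelated to the Oracle Data Mining script used to obtain the reported figures.

```python
def split_first_part(records, train_percent=60):
    """Use the first train_percent % of the records for training, the rest for testing."""
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]


def recognition_rate(true_labels, predicted_labels):
    """Fraction of test patterns whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def roc_points(scores, is_positive):
    """(false positive rate, true positive rate) pairs of a ROC curve.

    The threshold is lowered step by step over the scores sorted in
    decreasing order; each step adds one pattern to the predicted positives.
    """
    n_pos = sum(1 for flag in is_positive if flag)
    n_neg = len(is_positive) - n_pos
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for k in order:
        if is_positive[k]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


if __name__ == "__main__":
    train, test = split_first_part(list(range(10)))
    print(len(train), len(test))
    print(recognition_rate([1, 2, 2, 3], [1, 2, 3, 3]))
    print(roc_points([0.9, 0.8, 0.3, 0.1], [True, True, False, False]))
```

A curve whose points lie closer to the upper-left corner corresponds to a higher true positive rate at the same false positive rate, which is the sense in which the two ROC curves for the glass data set are compared.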
Fig. 4. The ROC curve of classification for the glass data set obtained with the FSVM method based on the Gaussian functions mixture kernel

5. Conclusion

The paper proposed a new method for multiclass classification with the help of Fuzzy Support Vector Machines based on Gaussian density functions. Our method can alleviate the problem of unclassifiable regions, which is typical of FSVMs. Our FSVM method with density based on Gaussian functions allows us to overcome these difficulties. Moreover, selecting appropriate parameters can provide adequate accuracy of classification. In the future, we will investigate the percentage classification errors of the FSVM based on the Gaussian density function and compare the obtained results with other fuzzy classification methods.

BIBLIOGRAPHY

1. Abe S., Inoue T.: Fuzzy Support Vector Machines for Multiclass Problems. Neural Networks, Vol. 16, 2003, p. 785-792.
2. Cacoullos T.: Estimation of Multivariate Density. Ann. Inst. Statist. Math., Vol. 18, 1965, p. 179-189.
3. Tsujinishi D., Abe S.: Fuzzy Least Squares Support Vector Machines for Multiclass Problems. Neural Networks, Vol. 16, 2003, p. 785-792.
4. Drucker H. D., Wu D., Vapnik V. N.: Support Vector Machines for Spam Categorization. IEEE Trans. on Neural Networks, Vol. 10, No. 5, 1999, p. 1048-1054.
5. Inoue T., Abe S.: Fuzzy Support Vector Machines for Pattern Classification. In: Proc. of the Int. Joint Conf. on Neural Networks, 2000, p. 1449-1454.
6. Joachims T.: Text Categorization with Support Vector Machines: Learning with Many Relevant Features. In: Proc. of the European Conference on Machine Learning. Springer-Verlag, 1998, p. 137-142.
7. Müller K. R., Smola A. J., Rätsch G., Schölkopf B., Kohlmorgen J., Vapnik V. N.: Predicting Time Series with Support Vector Machines. In: Proc. Int. Conf. on Artificial Neural Networks, ICANN-97, 1997, p. 999-1004.
8. Cristianini N., Shawe-Taylor J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.
9. Vapnik V. N., Golowich S. E., Smola A.: Support Vector Method for Function Approximation, Regression Estimation, and Signal Processing. In: M. Mozer, M. I. Jordan, T. Petsche (eds.): Advances in Neural Information Processing Systems 9. Morgan Kaufmann, 1997, p. 281-287.
10. Vapnik V. N.: Statistical Learning Theory. John Wiley and Sons, 1998.

Reviewer: Dr inż. Jerzy Respondek

Received by the Editorial Office on 5 March 2009.

Discussion (Omówienie)

The paper presents a fuzzy support vector machine (FSVM) in which density estimation based on a set of Gaussian functions is applied. This solution allows not only an optimal separation of the classes, as in the FSVM method used so far, but also a better approximation of the density of the training patterns. As a result, more accurate boundaries are obtained when solving multiclass problems.

Address

Jerzy MARTYNA: Uniwersytet Jagielloński, Instytut Informatyki, ul. Łojasiewicza 6, 30-348 Kraków, Poland, martyna@softlab.ii.uj.edu.pl.