A HYBRID CLASSIFIER BASED ON SVM METHOD FOR CANCER CLASSIFICATION

STUDIA INFORMATICA 2009, Volume 30, Number 2A (83)

Weronika PIĄTKOWSKA, Jerzy MARTYNA
Jagiellonian University, Institute of Computer Science

Summary. In this paper, we propose a new method of applying Support Vector Machines (SVMs) to cancer classification. We propose a hybrid classifier that considers the degree of the membership function of each class with the help of Fuzzy Naive Bayes (FNB) and then organizes one-versus-rest (OVR) SVMs as the architecture that classifies a sample into the corresponding class. The method uses a novel system of ordering the recognized expression profiles by means of FNB and of generating SVMs with the OVR scheme. The results show that our hybrid classifier is comparable to the conventional methods.

Keywords: SVM method, Fuzzy Naive Bayes, cancer classification

HYBRYDOWY KLASYFIKATOR OPARTY NA METODZIE SVM DLA KLASYFIKACJI CHORÓB ONKOLOGICZNYCH

Streszczenie (translated from Polish). The article proposes a new method for classifying oncological diseases. Among other components, it uses a fuzzy naive Bayes classifier (Fuzzy Naive Bayes) and support vector machines (Support Vector Machines) as the classifying system. The resulting hybrid classifier classifies oncological diseases comparably to conventional methods.

Słowa kluczowe (keywords): SVM method, fuzzy naive Bayes, classification of oncological diseases

1. Introduction

Support Vector Machines (SVMs) are adaptive learning systems which receive labeled training data and transform the learning task into an optimization problem [12]. SVMs are usually trained by solving quadratic programming problems. Originally, SVMs were used for binary pattern classification problems where the data were linearly separable, but the algorithm has been extended to handle data that are not separable by introducing slack variables [3] and to use nonlinear decision regions via kernel functions [9]. Therefore, a solution for SVMs working with suitable kernel functions can be found by solving the quadratic programming problem in the dual observation space rather than in the primal feature space, thereby reducing the overall computation.

DNA microarrays contain information about the gene expression variations of cells in different tissues [1]. Microarrays make it possible to understand the activities of genes underlying different cancers. Thus, the obtained information can in turn be used to identify types or subtypes of cancers. Two types of DNA microarrays are currently in use: the spotted cDNA microarrays [4] developed at Stanford University and the oligonucleotide chips [6] developed by Affymetrix. Spotted microarrays are made of a solid surface onto which minuscule amounts (spots) of single strands of nucleotide sequences are deposited in a grid-like arrangement by an automated process called contact spotting. Each spot defines a specific gene and serves as a probe against which a sample RNA is hybridized. With oligonucleotide chips, the probes are synthesized on the array on the basis of the sequences of existing or hypothetical genes using photolithographic technology. Affymetrix also uses multiple probes to represent the genes.

In most computational experiments with microarrays, the raw data obtained from these arrays must be computationally collected, processed, and integrated. This process of data preparation is called pre-processing. It compensates for systematic measurement errors caused by imperfections of the array equipment and yields a single expression level for each gene. As a result, the data from different microarrays are integrated into a single data matrix. Each row of this gene expression matrix corresponds to a different gene. Each column corresponds to a different sample for which the expression data were measured.

In this paper, we propose a new modified SVM method for cancer classification. The Fuzzy Naive Bayes method described by Randon and Lawry [11], which is used in pattern recognition and data analysis, relies on the use of some distance function. In the proposed method, a selection stage based on the Bayesian likelihood fitness function is added to the conventional SVM method.

The remainder of this paper is organized as follows. In section 2, we give the basic concepts of cancer classification with the use of the SVM method. In section 3, we give an overview of the FNB method that was proposed to resolve unclassifiable regions in multiclass problems. In section 4, we give experimental results to show the validity of our proposed method. Finally, section 5 gives the conclusions.

2. Basic concepts of cancer classification using SVMs

In this section we give the basic concepts of cancer classification with the use of the SVM method. Microarray technologies produce a large volume of gene expression profiles. Microarray techniques help to build a fuller understanding of the molecular variations among diseases. These gene expressions provide information about illnesses, including some types of cancers. Several data mining methods have been developed which involve the classification of gene expressions [8]. The gene expressions yield information which is useful for building the classifier. Irrelevant or redundant data can decrease the accuracy of classification. Therefore, a classifier which is sufficiently robust to such data must be provided.

The SVM method is one of the most important classifiers. We recall that the SVM maps an input sample into a high-dimensional space, minimizes the number of misclassified objects in the training set, and maximizes the margin between the bounding planes. For a training set $\{x_i, y_i\}_{i=1}^{N}$ with the input data $x_i \in \mathbb{R}^{n}$, the output data $y_i \in \mathbb{R}$ and the class label $y_i \in \{-1, 1\}$, the SVM calculates the linear classifier

$$y(x) = \operatorname{sign}[w^{T} x + b] \tag{1}$$

When the data of the two classes are separable, we have the original SVM classifier [12], [13], [14], which satisfies the following conditions:

$$\begin{cases} w^{T}\varphi(x_i) + b \ge +1 & \text{if } y_i = +1 \\ w^{T}\varphi(x_i) + b \le -1 & \text{if } y_i = -1 \end{cases} \tag{2}$$

These two sets of inequalities can be combined into one single set as follows:

$$y_i\,[w^{T}\varphi(x_i) + b] - 1 \ge 0, \quad i = 1, 2, \ldots, N \tag{3}$$

where $\varphi : \mathbb{R}^{n} \to \mathbb{R}^{m}$ is the feature map that maps the input space to a usually high-dimensional feature space. The data points are linearly separable by a hyperplane defined by the pair $(w \in \mathbb{R}^{m},\ b \in \mathbb{R})$. Thus, the classification function is given by

$$f(x) = \operatorname{sign}\{w^{T}\varphi(x) + b\} \tag{4}$$
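The separability condition (3) and the decision rule (4) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the feature map is taken to be the identity (a linear classifier), and the weight vector, bias, and toy samples are hypothetical values, not taken from the paper.

```python
def decision_value(w, b, x):
    """Return w . x + b, assuming the identity feature map phi(x) = x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Sign of the decision value, as in Eq. (4); a value of 0 is mapped to +1."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def satisfies_margin(w, b, samples):
    """Check y_i * (w . x_i + b) - 1 >= 0 for every labelled sample, as in Eq. (3)."""
    return all(y * decision_value(w, b, x) - 1 >= 0 for x, y in samples)


# Hypothetical hyperplane and toy training set (not from the paper).
w = [1.0, 1.0]
b = -3.0
samples = [
    ([1.0, 1.0], -1),
    ([0.0, 1.0], -1),
    ([2.0, 2.0], 1),
    ([3.0, 1.0], 1),
]

print(satisfies_margin(w, b, samples))
print(classify(w, b, [2.5, 2.5]), classify(w, b, [0.5, 0.5]))
```

A point lying exactly on the hyperplane, such as [1.5, 1.5] here, has decision value 0 and therefore violates condition (3) regardless of its label; this is the situation the slack variables introduced next are meant to accommodate.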

Instead of estimating with the help of the feature map, we work with a kernel function in the original space given by

$$K(x, y) = \varphi(x)^{T}\varphi(y) \tag{5}$$

We introduce slack variables $\xi_i$ such that

$$y_i\,[w^{T}\varphi(x_i) + b] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, N \tag{6}$$

This leads to the following minimization problem:

$$\min_{w,\, b,\, \xi} J(w, b, \xi) = \frac{1}{2}\,\|w\|^{2} + C \sum_{i=1}^{N} \xi_i \tag{7}$$

subject to

$$y_i\,[w^{T}\varphi(x_i) + b] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, N, \quad C > 0 \tag{8}$$

where $C$ is a positive constant parameter used to control the tradeoff between the training error and the margin. The dual problem of the system (8), obtained from the Karush-Kuhn-Tucker (KKT) conditions, leads to a well-known convex quadratic programming (QP) problem.

3. A hybrid classifier based on SVMs for cancer classification

In this section, we present our hybrid classifier for cancer classification, which is based on SVMs and Fuzzy Naive Bayes (FNB). An overview of the hybrid classifier is given in Fig. 1. Fuzzy Naive Bayes is used to estimate the probabilities of the classes, while the SVMs classify samples by using the original training data set of gene expression profiles. The proposed SVMs allow for a probabilistic ordering of cancer classes which is then used by our FNB after its estimation.

Fuzzy Naive Bayes is based on Bayes' theorem. We assume that a focal set for each attribute is given. Let the attribute be numeric with universe $\Omega$. Then the likelihood of the attribute value given a focal element can be represented by a density function determined from the gene expression profiles and a prior density according to Jeffrey's rule [5], namely

(9) [equation not recoverable from the source]

From Bayes' theorem, we can obtain

(10) [equation not recoverable from the source]

where

(11) [equation not recoverable from the source; an integral over $\Omega$]

Substituting Eq. (10) into Eq. (11) and re-arranging gives

(12) [equation not recoverable from the source]

where the remaining term can be derived according to

(13) [equation not recoverable from the source]

Fig. 1. Structure of the hybrid classifier for cancer classification

This model, called Fuzzy Naive Bayes (FNB), can provide some measures. The probability of each class can be calculated with the use of Bayes' theorem [7], namely

(14) [equation not recoverable from the source]

where the conditioning variable is a feature of the marker gene.
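As a rough illustration of how class probabilities obtained from Bayes' theorem induce an ordering of the classes, the sketch below computes naive Bayes posteriors and ranks the classes by them. It deliberately swaps in a crisp Gaussian naive Bayes model for the fuzzy variant described above, since the fuzzy formulas are not available here. The class names, priors, means, and variances are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log of the normal density N(x; mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def class_posteriors(x, priors, means, variances):
    """Posterior P(c | x) under the naive (feature-independence) assumption."""
    log_scores = {}
    for c, prior in priors.items():
        score = math.log(prior)
        for xi, m, v in zip(x, means[c], variances[c]):
            score += log_gaussian(xi, m, v)
        log_scores[c] = score
    # Subtract the largest log-score before exponentiating, for numerical stability.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(weights.values())
    return {c: wgt / total for c, wgt in weights.items()}


def rank_classes(posteriors):
    """Classes ordered from most to least probable."""
    return sorted(posteriors, key=posteriors.get, reverse=True)


# Hypothetical two-class, two-gene model (not from the paper).
priors = {"breast": 0.5, "lung": 0.5}
means = {"breast": [0.2, 0.8], "lung": [0.7, 0.3]}
variances = {"breast": [0.04, 0.04], "lung": [0.04, 0.04]}

posteriors = class_posteriors([0.25, 0.75], priors, means, variances)
ranking = rank_classes(posteriors)
print(posteriors, ranking)
```

How the resulting ordering is combined with the one-versus-rest SVM outputs is specific to the hybrid classifier and is not reproduced in this sketch.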

To improve the classification performance, we used the Pearson correlation as a measure of the similarity between an ideal marker and a gene $g$. The Pearson correlation [2] is used here as follows:

$$\mathrm{Pear} = \frac{\sum_{i=1}^{n} ideal_i\, g_i - \dfrac{\sum_{i=1}^{n} ideal_i \sum_{i=1}^{n} g_i}{n}}{\sqrt{\left(\sum_{i=1}^{n} ideal_i^{2} - \dfrac{\left(\sum_{i=1}^{n} ideal_i\right)^{2}}{n}\right)\left(\sum_{i=1}^{n} g_i^{2} - \dfrac{\left(\sum_{i=1}^{n} g_i\right)^{2}}{n}\right)}} \tag{15}$$

where $n$ is the number of genes in the microarray data set and $ideal_i$ is the $i$-th gene in the microarray selected as the ideal marker.

Table 1. Confusion matrix
(Rows are the 14 cancer types. Only the nonzero entries of each row survive, listed in column order; their assignment to the columns 1-14 is not recoverable from the source.)

Cancer type         Nonzero row entries
1. Breast           65, 35
2. Prostate         86, 14
3. Lung             100
4. Colorectal       100
5. Lymphoma         10, 90
6. Bladder          20, 80
7. Melanoma         78, 22
8. Uterus_adeno     100
9. Leukemia         10, 90
10. Renal           67
11. Pancreas        33, 33, 34
12. Ovary           25, 25, 50
13. Mesothelioma    100
14. CNS             100

We assumed that a gene is an informative gene if the distance given by the Pearson correlation $\mathrm{Pear}$ is small, while the gene is not an informative gene if the distance is large.

4. An example of analysis

To evaluate our proposed method, we used the GCM data set published by Ramaswamy et al. (2001) [10]. It consists of 144 training samples and 54 testing samples of 14 cancer classes. Each sample possesses 16063 gene expression levels. The GCM data set is available at: http://www.genome.wi.mit.edu/MPR/GCM. Eight metastatic samples were dropped from the testing samples, so the testing set used here consisted of 46 samples of 14 cancer classes. Following our method, we selected 140 genes for training the FNB on the basis of the Pearson correlation. We used the linear kernel function for the SVMs. The features of the samples are normalized to the range from 0 to 1.
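The gene selection and normalization steps described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code. It assumes that each gene is represented by a vector of expression values across samples, that the ideal marker is a vector of the same length, and that a high correlation corresponds to a small distance. The gene names and values are hypothetical.

```python
import math


def pearson(ideal, g):
    """Pearson correlation in the computational form of Eq. (15)."""
    n = len(ideal)
    sum_i, sum_g = sum(ideal), sum(g)
    numerator = sum(a * b for a, b in zip(ideal, g)) - sum_i * sum_g / n
    var_i = sum(a * a for a in ideal) - sum_i ** 2 / n
    var_g = sum(b * b for b in g) - sum_g ** 2 / n
    denominator = math.sqrt(var_i * var_g)
    # A constant vector has no defined correlation; treat it as uninformative.
    return numerator / denominator if denominator > 0 else 0.0


def select_genes(ideal, genes, k):
    """Names of the k genes most correlated with the ideal marker."""
    ranked = sorted(genes, key=lambda name: pearson(ideal, genes[name]), reverse=True)
    return ranked[:k]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical ideal marker: high in the first two samples, low in the others.
ideal = [1.0, 1.0, 0.0, 0.0]
genes = {
    "g1": [0.9, 0.8, 0.1, 0.2],
    "g2": [0.5, 0.4, 0.5, 0.6],
    "g3": [0.1, 0.2, 0.9, 0.8],
}

selected = select_genes(ideal, genes, k=1)
normalized = min_max_normalize([2.0, 4.0, 6.0])
print(selected, normalized)
```

In the experiment the same selection would be run with k = 140 over the 16063 expression levels; the toy call above uses k = 1 only to keep the example readable.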

The obtained confusion matrix for the 14 cancer classes is given in Table 1. As the coding strategy we used the winner-takes-all method.

Table 2. The accuracy of the methods used

Method               Accuracy (%)
OVR-SVM              72
FNB                  68
Hybrid classifier    80

Our programs are written in the MATLAB language. Additionally, we used the software package for the SVM algorithm which is available at http://www.kernel-machines.org. In Table 2 we compare the accuracy of the methods used. SVMs with the one-versus-rest strategy gave 72% classification accuracy. The FNB achieved 68%. The hybrid method combining the OVR-SVM and the FNB produced an accuracy of 80%. Thus, our method classified better than the OVR-SVM and the FNB used separately.

5. Conclusions

The hybrid classifier based on SVMs for multiclass microarray classification has been investigated for cancer recognition. The proposed method integrates SVMs and the FNB trained with the help of the OVR scheme. To verify our method, we applied it to the GCM cancer data set. To reduce the dimensionality of the coding matrix, we used the Pearson correlation. The suggested method performs comparably to other methods and better than either of its component methods working individually. Further improvement of the performance of the output process depends on the output-coding strategies. Therefore, we will look for an algorithm that improves the accuracy of multiclass classification, especially when the class size is small. Some algorithms, such as heuristic algorithms, could be considered.

BIBLIOGRAPHY

1. Brown P. O., Botstein D.: Exploring the New World of the Genome with DNA Microarrays. Nat. Genet. Suppl., 21, 1999, p. 33–37.
2. Cho S.-B., Ryu J.: Classifying Gene Expression Data of Cancer Using Classifier Ensemble with Mutually Exclusive Features. Proc. IEEE 90 (11), 2002, p. 1744–1753.
3. Cortes C., Vapnik V. N.: Support Vector Networks. Machine Learning, 20, 1995, p. 273–297.

4. Duggan D. J., Bittner M., Chen Y., Meltzer P., Trent J.: Expression Profiling Using cDNA Microarrays. Nature Genetics, 21, 1999, p. 10–14.
5. Jeffrey R. C.: The Logic of Decision. Gordon and Breach Inc., New York 1965.
6. Lipschutz R. J., Fodor S. P. A., Gingeras T. R., Lockhart D. J.: High Density Synthetic Oligonucleotide Arrays. Nature Genetics, 21, 1999, p. 20–24.
7. Liu J., et al.: An Improved Naive Bayesian Classifier Technique Coupled with a Novel Input Solution Method. IEEE Trans. on Systems, Man, and Cybernetics Part C: Appl. Rev. 31, No. 2, 2001, p. 249–256.
8. McLachlan G. J., Do K.-A., Ambroise Ch.: Analyzing Microarray Gene Expression Data. John Wiley and Sons, 2004.
9. Müller K. R., Mika S., Rätsch G., Tsuda K., Schölkopf B.: An Introduction to Kernel-Based Learning Algorithms. IEEE Trans. on Neural Networks, Vol. 12, No. 2, 2001, p. 181–201.
10. Ramaswamy S., et al.: Multiclass Cancer Diagnosis Using Tumor Gene Expression Signatures. Proc. Nat. Acad. Sci., Vol. 98, No. 26, 2001, p. 15149–15154.
11. Randon J., Lawry J.: Classification and Query Evaluation Using Modeling with Words. Information Sciences. Special Issue Computing with Words: Models and Applications, Vol. 176, 2006, p. 438–464.
12. Vapnik V. N.: The Nature of Statistical Learning Theory. Springer-Verlag, Berlin, Heidelberg, New York 1995.
13. Vapnik V. N.: Statistical Learning Theory. John Wiley and Sons, 1998.
14. Vapnik V. N.: The Support Vector Method of Function Estimation. In: J. A. K. Suykens, J. Vandewalle (eds.). Nonlinear Modeling: Advanced Black-box Techniques, Kluwer Academic Publishers, Boston 1998, p. 55–85.

Reviewer: Prof. dr hab. inż. Andrzej Polański
Received by the Editors on 5 March 2009.

Omówienie (overview, translated from Polish)

DNA microarrays make it possible to analyze the occurrence of oncogenes. The occurrence of oncological diseases was examined with a specially constructed hybrid classifier. The classifier was built from the support vector method (Support Vector Machines) and a fuzzy naive Bayes classifier (Fuzzy Naive Bayes). The SVM method was used in a one-versus-rest architecture, which allows each class related to an oncological disease to be classified separately. It was shown that the resulting hybrid classifier has better classification capabilities than the conventional methods currently in use.

Addresses

Weronika PIĄTKOWSKA: Uniwersytet Jagielloński, Instytut Informatyki Stosowanej, ul. Reymonta 4, 30-059 Kraków, Polska.
Jerzy MARTYNA: Uniwersytet Jagielloński, Instytut Informatyki, ul. Łojasiewicza 4, 30-348 Kraków, Polska, martyna@softlab.ii.uj.edu.pl.