Linear Classification and Logistic Regression. Pascal Fua IC-CVLab



Linear Classification

    y(x; w) = +1 if w^T x + w_0 ≥ 0,
              −1 otherwise.

    w = [w_1, ..., w_n],   w̃ = [w_0, w_1, ..., w_n]
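The decision rule above can be sketched in a few lines of Python. This is an illustration only: `linear_classify` is a hypothetical helper name and the weights in the usage below are made-up values, not learned ones.

```python
def linear_classify(x, w, w0):
    """Return +1 if w^T x + w0 >= 0, else -1 (the linear decision rule)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + w0
    return 1 if score >= 0 else -1

# Example: w = [1, 0], w0 = -1 classifies points by whether x[0] >= 1.
label = linear_classify([2.0, 0.0], [1.0, 0.0], -1.0)
```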

Distance to the Decision Boundary

    y(x) = w^T x + w_0
    y(x)/‖w‖ = w^T x/‖w‖ + w_0/‖w‖

→ y(x)/‖w‖ is the signed distance to the decision boundary.
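The signed distance y(x)/‖w‖ can be computed directly; `signed_distance` is an illustrative helper name, not something defined in the slides.

```python
import math

def signed_distance(x, w, w0):
    """Signed distance from x to the hyperplane w^T x + w0 = 0:
    (w^T x + w0) / ||w||, positive on the side w points toward."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + w0) / norm

# The origin lies at distance w0 / ||w|| from the hyperplane.
d = signed_distance([0.0, 0.0], [3.0, 4.0], 10.0)
```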

What about Data that is NOT Linearly Separable?

Map to a higher-dimensional space in which it is!

What about Data that is NOT Linearly Separable?

This is possible surprisingly often. This is the basis of many ML algorithms.

Polynomial Features

(Figure: mapping from dimension d to dimension D ≫ d. Source: Wikipedia, "Machine learning and data mining", https://en.wikipedia.org/wiki/machine_learning)

Mapping to a Higher Dimension

How to define f? How to find the hyperplane?

Deep Learning

High-dimensional feature vector.

Perceptron

Minimize:

    E(w) = −Σ_{n=1}^{N} (w^T x_n) t_n   (summing over the misclassified samples)

Center the x_n's so that w_0 = 0. Set w^1 to 0. Iteratively, pick a random index n. If x_n is correctly classified, do nothing. Otherwise, w^{t+1} = w^t + t_n x_n.
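The perceptron update above can be sketched as follows, assuming labels t_n ∈ {−1, +1} and data that has already been centered (so w_0 = 0). The random-sampling scheme and the iteration budget are arbitrary choices for this sketch, not prescriptions from the slides.

```python
import random

def perceptron_train(xs, ts, epochs=100, seed=0):
    """Perceptron updates: pick a random sample; if it is misclassified
    (t_n * w^T x_n <= 0), set w <- w + t_n * x_n."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])          # w^1 = 0
    for _ in range(epochs * len(xs)):
        n = rng.randrange(len(xs))
        score = sum(wi * xi for wi, xi in zip(w, xs[n]))
        if ts[n] * score <= 0:      # misclassified (or on the boundary)
            w = [wi + ts[n] * xi for wi, xi in zip(w, xs[n])]
    return w
```

On linearly separable data the updates stop once every sample satisfies t_n w^T x_n > 0.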

Linear Classification

    y(x; w) = +1 if w^T x ≥ 0,
              −1 otherwise,

    with x = [1, x_1, ..., x_n].

Pacman Apprenticeship

Examples are states s. Correct actions are those taken by experts. Feature vectors are defined over pairs f(a, s), and the score of a pair is taken to be w · f(a, s). Adjust w so that ∀a, w · f(a*, s) ≥ w · f(a, s) when a* is the correct action for state s.

http://ai.berkeley.edu/project_overview.html

Perceptron Ambiguities

Two different solutions among infinitely many. The perceptron has no way to favor one over the other.

Logistic Regression

Reintroducing Probabilities

y(x) = w^T x + w_0 > 0 assigns a label to each x, but not a probability. We would like a soft version such that

    y(x) ≈ p(C_1 | x) = p(x | C_1) p(C_1) / p(x).   (Bayes' rule)

→ We write y(x) = f(w^T x + w_0). How should we choose f?

Brightness as a Feature

(Figure: some algorithm maps brightness to +1 (salmon) or −1 (sea bass).)

Brightness as a Feature

Given the training set {(x_n, t_n)}_{1 ≤ n ≤ N} with

    t_n = 1 if salmon,
    t_n = 0 if sea bass,

estimate p(t = 1 or 0 | x = b).

Sigmoid Function

    P(t_n = 1 | x_n = b)
      = P(x_n = b | t_n = 1) P(t_n = 1) / P(x_n = b)
      = P(x_n = b | t_n = 1) P(t_n = 1) / [P(x_n = b | t_n = 1) P(t_n = 1) + P(x_n = b | t_n = 0) P(t_n = 0)]
      = 1 / [1 + (P(x_n = b | t_n = 0) P(t_n = 0)) / (P(x_n = b | t_n = 1) P(t_n = 1))]
      = 1 / (1 + exp(−a)) = σ(a),

with

    a = ln[ (P(x_n = b | t_n = 1) P(t_n = 1)) / (P(x_n = b | t_n = 0) P(t_n = 0)) ]   (likelihood × prior odds).

    σ(a) = 1 / (1 + exp(−a)),   ∂σ/∂a = σ(1 − σ).
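The identity ∂σ/∂a = σ(1 − σ) can be verified with a finite-difference check; this is a sketch for illustration, not code from the slides.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def sigmoid_grad(a):
    """Analytic derivative: d sigma / da = sigma(a) * (1 - sigma(a))."""
    s = sigmoid(a)
    return s * (1.0 - s)
```

A central difference (σ(a+h) − σ(a−h)) / 2h should agree with `sigmoid_grad(a)` to high precision for small h.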

One-Parameter Model

In theory:

    a = ln[ (P(x_n = b | t_n = 1) P(t_n = 1)) / (P(x_n = b | t_n = 0) P(t_n = 0)) ].

In practice, these probabilities are hard to estimate. Introduce a parameterized model instead and choose the parameters that maximize its likelihood.

Model:

    a = x − B,   y_n = P(t_n = 1 | x_n = x) = σ(a) = σ(x − B).

Maximizing the Model Likelihood

    p(t = [t_1, ..., t_N] | x = [b_1, ..., b_N]) = Π_n y_n^{t_n} (1 − y_n)^{1 − t_n};   (Naive Bayes)

    E(B) = −ln p(t | x)
         = −Σ_n { t_n ln y_n + (1 − t_n) ln(1 − y_n) }.

    dE/dB = −Σ_n (y_n − t_n).

    B* = arg min_B E(B)

(Figure: fitted sigmoid σ(x − B*) separating salmon (+1) from sea bass (0).)

Computing the Derivatives

We have y_n = σ(x − B), so:

    dy_n/dB = −σ(x − B)(1 − σ(x − B)) = −y_n (1 − y_n)
    d ln(y_n)/dB = (1/y_n) dy_n/dB = y_n − 1
    d ln(1 − y_n)/dB = −(1/(1 − y_n)) dy_n/dB = y_n

Therefore:

    E(B) = −Σ_n { t_n ln y_n + (1 − t_n) ln(1 − y_n) }
    ⇒ dE/dB = −Σ_n { t_n (y_n − 1) + (1 − t_n) y_n }
             = −Σ_n (y_n − t_n).
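The analytic gradient dE/dB = −Σ(y_n − t_n) can be sanity-checked against a finite-difference estimate of E(B). The data values below and the names `loss`/`grad` are illustrative, not from the slides.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def loss(B, xs, ts):
    """E(B) = -sum { t_n ln y_n + (1 - t_n) ln(1 - y_n) }, y_n = sigma(x_n - B)."""
    E = 0.0
    for x, t in zip(xs, ts):
        y = sigmoid(x - B)
        E -= t * math.log(y) + (1 - t) * math.log(1 - y)
    return E

def grad(B, xs, ts):
    """Analytic derivative from the slide: dE/dB = -sum (y_n - t_n)."""
    return -sum(sigmoid(x - B) - t for x, t in zip(xs, ts))
```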

Linear Model

    p(C_1 | x) = p(x | C_1) p(C_1) / [p(x | C_1) p(C_1) + p(x | C_2) p(C_2)]   (Bayes' rule)
               = 1 / (1 + exp(−a))
               = σ(a),

where a = ln[ p(x | C_1) p(C_1) / (p(x | C_2) p(C_2)) ].

→ We write y(x) = σ(w^T x + w_0).

Logistic Regression

For the training set {(x_n, t_n)}_{1 ≤ n ≤ N} where t_n ∈ {0, 1}, we have

    p(t | w) = Π_n y_n^{t_n} (1 − y_n)^{1 − t_n};   (Naive Bayes)

    E(w) = −ln p(t | w)
         = −Σ_n { t_n ln y_n + (1 − t_n) ln(1 − y_n) };

    ∇E(w) = Σ_n (y_n − t_n) x_n.

→ The objective function is a differentiable function of the parameter vector, which enables us to use standard optimization techniques.
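Since ∇E(w) = Σ(y_n − t_n) x_n, plain batch gradient descent suffices. A minimal sketch, assuming each x_n carries a leading 1 so that w[0] plays the role of w_0; the learning rate and step count are arbitrary choices.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def logreg_fit(xs, ts, lr=0.1, steps=2000):
    """Batch gradient descent on E(w) using grad E = sum (y_n - t_n) x_n."""
    w = [0.0] * len(xs[0])
    for _ in range(steps):
        g = [0.0] * len(w)
        for x, t in zip(xs, ts):
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                g[i] += (y - t) * xi          # accumulate (y_n - t_n) x_n
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w
```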

Height and Weight

Addressing Perceptron Ambiguities

(Figure: perceptron vs. logistic regression decision boundaries.)

But…

Outliers Can Cause Problems

Logistic regression tries to minimize the error rate at training time, which can result in poor classification rates at test time.

→ Support Vector Machines.

From Binary to Multi-Class

k classes. Naively using k(k−1)/2 binary classifiers results in ambiguities.

Linear Discriminant

K classifiers of the form y_k(x) = w_k^T x + w_k^0. x is assigned to class k rather than j when

    (w_k − w_j)^T x + (w_k^0 − w_j^0) > 0.

Decision regions are convex:

    (w_k − w_j)^T x_A + (w_k^0 − w_j^0) > 0,
    (w_k − w_j)^T x_B + (w_k^0 − w_j^0) > 0
    ⇒ ∀λ ∈ [0, 1], if x = λ x_A + (1 − λ) x_B, then (w_k − w_j)^T x + (w_k^0 − w_j^0) > 0.

Multi-Class Linear Classification

K classifiers of the form y_k(x) = w_k^T x + w_k^0. Assign x to class k if y_k(x) > y_j(x) for all j ≠ k:

    k = arg max_j w_j^T x

In matrix form: [y_1, ..., y_K]^T = [w_1^T; ...; w_K^T] x, and k = arg max_j y_j.
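The argmax rule can be sketched directly; the weight vectors in the usage below are illustrative, not learned ones.

```python
def multiclass_classify(x, W, w0):
    """Stack the K linear scores y_k = w_k^T x + w0_k and return arg max_k."""
    scores = [sum(wi * xi for wi, xi in zip(wk, x)) + b for wk, b in zip(W, w0)]
    return max(range(len(scores)), key=lambda k: scores[k])

# Three classes, each scoring a different direction of the plane.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
w0 = [0.0, 0.0, 0.0]
k = multiclass_classify([2.0, 1.0], W, w0)
```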

Multi-Class Logistic Regression

K classifiers of the form y_k(x) = σ(w_k^T x + w_k^0). Assign x to class k if y_k(x) > y_j(x) for all j ≠ k:

    k = arg max_j w_j^T x = arg max_j y_j   (since σ is monotonic).

Multi-Class Logistic Regression

    p(C_k | x) = exp(a_k) / Σ_j exp(a_j)   with a_k = w_k^T x + w_k^0,

    E(w_1, ..., w_K) = −Σ_n Σ_k t_nk ln(y_nk),

    ∇_{w_j} E(w) = Σ_n (y_nj − t_nj) x_n.

→ The extension to the multi-class case is natural.
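The softmax p(C_k | x) = exp(a_k) / Σ_j exp(a_j) is usually computed after subtracting max_k a_k, which leaves the probabilities unchanged but avoids overflow for large scores. A minimal sketch:

```python
import math

def softmax(a):
    """Numerically stable softmax: p_k = exp(a_k) / sum_j exp(a_j)."""
    m = max(a)                          # shift by the max score
    e = [math.exp(ak - m) for ak in a]  # exp(a_k - m) never overflows
    s = sum(e)
    return [ek / s for ek in e]

# Probabilities sum to 1 and preserve the ordering of the scores.
p = softmax([1.0, 2.0, 3.0])
```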

Multi-Class Results