Neural Networks (The Machine-Learning Kind) BCS 247 March 2019

Neurons: http://biomedicalengineering.yolasite.com/neurons.php
Networks: https://en.wikipedia.org/wiki/network_theory#/media/file:social_network_analysis_visualization.png
Neural Networks? https://en.wikibooks.org/wiki/artificial_neural_networks/activation_functions

Artificial Neurons
- Neural output is summarized as a non-negative number (analogous to firing rate)
- Inputs from other neurons are weighted (dendrites) and summed (soma)
- Output (axon) is this sum passed through a nonlinear activation function f() (analogous to a spiking threshold)
- f() enforces non-negativity
[Figure: the computational abstraction — weighted inputs w1x1, w2x2, w3x3 feed a summation Σ followed by f()]
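
To make this abstraction concrete, here is a minimal NumPy sketch (not from the slides; the input values, weights, and the choice of ReLU for f() are assumptions):

```python
import numpy as np

def relu(z):
    # Nonlinear activation f(); enforces non-negativity, like a firing rate
    return np.maximum(0.0, z)

def neuron(x, w, b=0.0):
    # Weighted inputs (dendrites) are summed (soma), then passed through f() (axon)
    return relu(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])  # inputs from other neurons
w = np.array([0.1, 0.4, 0.3])   # synaptic weights w1, w2, w3
print(neuron(x, w))             # a non-negative "firing rate"
```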

McCulloch-Pitts Neurons (~1943)
Single neurons act as logic gates (AND, OR, NOT); networks of them can implement any truth table
http://ecee.colorado.edu/~ecen4831/lectures/nnet2.html
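
A hypothetical sketch of such threshold units (the particular weights and thresholds are one choice that realizes the gates, not values from the source):

```python
def mp_neuron(inputs, weights, threshold):
    # McCulloch-Pitts unit: binary inputs, fixed weights, hard threshold
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

AND = lambda a, b: mp_neuron([a, b], [1, 1], threshold=2)
OR  = lambda a, b: mp_neuron([a, b], [1, 1], threshold=1)
NOT = lambda a:    mp_neuron([a],    [-1],   threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", AND(a, b), "OR:", OR(a, b))
print("NOT 0:", NOT(0), "NOT 1:", NOT(1))
```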

Perceptrons (~1957)
Output = classification (the weights define a separating hyperplane)
https://blog.dbrgn.ch/2013/3/26/perceptrons-in-python/
The XOR problem (~1969)
http://ecee.colorado.edu/~ecen4831/lectures/nnet3.html
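
A sketch of the mistake-driven perceptron learning rule (the code and data are illustrative, not from the slides); it converges on the linearly separable OR labels but can never converge on XOR:

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            pred = int(np.dot(w, xi) + b > 0)
            if pred != yi:              # update weights only on mistakes
                w += (yi - pred) * xi
                b += (yi - pred)
                errors += 1
        if errors == 0:                 # a separating hyperplane was found
            return w, b, True
    return w, b, False                  # no separating hyperplane found

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(train_perceptron(X, np.array([0, 1, 1, 1])))  # OR: converges
print(train_perceptron(X, np.array([0, 1, 1, 0])))  # XOR: does not
```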

Multi-Layer Perceptrons
sigmoid activation
http://matlabgeeks.com/tips-tutorials/neural-networks-a-multilayer-perceptron-in-matlab/
Universal Approximation Theorem (~1989): a single hidden layer is sufficient to make MLPs universal approximators
deeplearning.net/tutorial/mlp.html
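
As an illustration of how one hidden layer overcomes the perceptron's limitation, here is a hand-wired sigmoid MLP that computes XOR (the specific weights are assumed for the example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = np.array([[20.0, 20.0],      # hidden unit 1 ~ OR(a, b)
               [-20.0, -20.0]])   # hidden unit 2 ~ NAND(a, b)
b1 = np.array([-10.0, 30.0])
W2 = np.array([20.0, 20.0])       # output ~ AND of the two hidden units
b2 = -30.0

def mlp(x):
    h = sigmoid(W1 @ x + b1)      # hidden layer
    return sigmoid(W2 @ h + b2)   # output layer

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, round(float(mlp(np.array(x, dtype=float))), 3))  # ~XOR
```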

Gartner Hype Cycle
How many hidden-layer neurons? Learnability?
https://en.wikipedia.org/wiki/hype_cycle#/media/file:gartner_hype_cycle.svg

Deep Learning
[Figure: MLPs are shallow vs. a deep network with many hidden layers]
http://neuralnetworksanddeeplearning.com

1980s vs 2010s: what changed?
1980s: single hidden layer; fully connected architecture; sigmoid activation function; small data and toy problems
2010s: multiple hidden layers ("depth"); specialized architectures, e.g. convolution and recurrence; ReLU activation function; Big Data and fast computers; some theoretical progress

Backpropagation
- Credit assignment problem: in a deep network, will adjusting a given weight help or hurt?
- How MLPs and deeper networks are trained
- Originally created for supervised learning problems
- Nothing but the chain rule from calculus applied to a cost function

<latexit sha1_base64="azka2xv7c7i1vzyop7td+gfyhm0=">aaadfnicfvfda9rafj3erzz+bfvrkmffejeluuffhfkl+fkt4lafzblczn4kqyetmhmjlih/oz/fj9/e1776b5xsv1iz1qsdh3povxm/kkpjs2h4y/ovxl12/cbwdndz1u07dwc7945swruby1gq0pwkyffjjwosppckmghfova4ox3b6cdf0fhz6s+0qhbaqkzlkgwqo2adb3fqqdrxbyykkb4xqhmsnou2xsmpr4jz2vi3poglpjf5+qasxu/oq3m77uyrl3aut7pbmbyfy+cbifqbivvf4wzhexfps1exqekoshyshrvnm66iuoj+rs1wie4hw4mdggq002a54py/dsycp6vxtxnfsuszdrtwloreobt+bv/ryh9pxuxbiaxbdcokpvt1tjg6qgm1uogkrrwnknch5xnpujbaoadcsdcmfzm4fzi7exdvoxvw4igr+7fca1sap00mjivga+ugz+jnhfqfueo/roecwg0+6u95exw9h0uvrugnl8pdvdunttgd9og9yrf7xxbze3bixkx4d71978d74j/53/0f/s8lq++tcu6zv8i//w2zigei</latexit> <latexit sha1_base64="+xrvfzvtrjotjencg3qigsvvsbs=">aaac+hicfvfni9raeo3ej13j16wevtqoioimibuwxorlffeirudslkygodjtszrtpen3rzwn+soevilx/43/xs7scgnmtadh8d6rqq6qpflsuhj+8vwrv69d39q+edy8dfvo3chovworaynwllts5jqbi0qwoczjck8rg1akck+ss9edfvizjzw6/eslcqcfzkvmpqby1gzwnu4nicauwjaexemcke/sztg2aytlsddlwv6kb72e7djf35s36wl99dkeetsbdmnruay+caivgljvhm12vlfxxiu6wjkeamsnuvjrtokqcowub22xaneggu4clkfao22ws2z5i8fmeaqneyxxjbue0ubh7ajinlp7r+1rhfkvratoo1eb3dbmakpfthtzvjvhks5+ktakk+bdyfhcghskfg6amninw0uoboxkdhveb9ana/c9q/uhqgokzdmmbpmv8kv1w2fxsw79zyjlp0ahgsbtpurvermcvxhfu6pw497w4hb1g232gd1kt1je9tkbe8eo2jgjj3mpvede6j/73/zv/o8lq++tcu6zv8l/+ruf+/rz</latexit> Backpropagation y = h(g(f(x))) y = h(g(f(x,θf),θg),θh) f x g h y How will a change in θf affect the output y? - Difficult to predict in the forward direction: changes in f affect g which in turn affect h - Chain rule: @y @ f = @f @g @ f @f @h @g @y @h @y = @g @h @ g @ g @g @y @h

<latexit sha1_base64="l+k25s7c8n9bgoqw+jfee6veb1k=">aaadaxicfvhfa9raen6k/mjjr6t99gxxkijimvtb+icukuklwmfrc5fjmoxnkqwbbnidiefiu/8un3wtx/1l/g/cxe+ol+rawsf3ftozm5nuslokw1+ev3ht+o2bm1vbrdt37t4bbn8/tro2asdck21oe7coziljkqtwtdiiralwjdl73eknn9fyqctptkhwwkbwylqkiefnbl9349saaoikdelqpc6a8irtfm17iaqccwzzy1/xyd0ju8oyrjnyvinvma7snbezwtachcvgfrctwjct4mi27b2n51rubzykffg7ickkpk1xush0fwulfygzyhdiyakf2mmz3gbldx0z56k27pxel+zljaykaxdf4pzdf+261ph/0rqktho1wz5hulo6p21kwdwepbj4svortpp3t+nzavcqwjgawkg3dbc5ubwsu3aqv0e3rmh3ru6hcg2qno+bgexwwjfwdz/ftzr0p6ms/xgdcgk3+wh9z31wvdekno3cj8+hb4erg2yyb+whe8qi9oidshfsii2z8la8p96+99i/97/53/0ff1bfw+xssl/c//kbb8b3zw==</latexit> @y @h x f g h y @f @g @f @g @h @g @h @ f @ g @ h Each layer computes local gradient information of its output with respect to (i) its input, and (ii) its parameters Compute backwards from output: chain rule is the product of local gradient with everything in front of it Since all computations are local we can easily compute gradients for h(g(f(x))) or f(f(g(f(x)))) etc.

Stochastic Gradient Descent (SGD)
- Gradient Descent (GD) minimizes a function by making small adjustments to the parameters in the direction of the negative gradient
- The backpropagation algorithm computes the gradient with respect to each weight in the network
- A neural network's loss function is typically a sum of per-data-point losses over a very large number of data points!
- Solution: get a noisy estimate of the gradient on a batch of data
- Take a small step in the noisy gradient direction
- Take smaller and smaller steps over time (simulated annealing)
- Lots of tricks exist for making this process work better in practice

<latexit sha1_base64="m8m21e7gudec7sjpbjb2e4kf6vq=">aaacr3icfvfdaxnbfj1s/ajrv6qpvgwgizusdqvqghrkffhbjxzmu8ik6+zkbjj0zmezudsmlpu7/c0++kp/w9k0fw3qcwpnnnvu5z47aagkwyj61grwbty8dxv9tnj33v0hd9sbj46dka2avjdk2jouo1ayhz5kvhbswoa6vtbiz1439ce5wcdn/hnnbyw0n+qyk4kjp5l20xvjhn2jzju6qerexj9+paxhhhvya2zdzbpmc5ymwtwre/mkxmux9ebw74rnovbz2gs2k3yn6kwloksgxoiowczhstf6y8zglbpyfio7n4yjakcvtyifgjpkpyocizm+gaghodfgrtxce02fewzmm2p9y5eu2d87kq6dm+vuk5td3fvaq/6r1kx0tdfywbems8x2r5xmixihf5ebzkwiaghzatqwfgsquqdcwonnudhllgv0/xgyn+dnwvjg534qwhi09nnfuj1opqu9+qnbatd/hdk/enouhv7y8fu7r4lj7v78ohcdvezshyz/yj08iu9jl8rkh+ytd+sq9ikgx8l38op8dojgejwgxy6lqwvz85j8fyh8bqdk2ai=</latexit> <latexit sha1_base64="5main/4uxvzdlzv0wyqus8v1zx4=">aaac03icfvfbaxqxfm6mtzr1stvhx4klsjvszlrqkejrer+8vhdbwmydzmtp7izmkihj6c5hxsrxf6d4z8xst1666ihal+/7zufcilok69l0exrfuhjp8pwnq8nmtes3bva2bh1a3rioq66lnscfwjrc4dajj/g4nghvifgoohne6uef0fih1qe3qhfcwvsjunbwgcp7dvnqsmhzbw5wlp5zs19ra+kezbapci/2svbjw7ruyg7nzqmx2rsdcncmzntcpkw/fds7vz5sbs4v2mdyznv9ddddbl0h2qr0ysoo8q3ojzto3lsohjdg7shlazf2yjzgetuenrzr4ccwxvgaciq0y7/ct0vvbwzcs23cu44u2t8zpftwlqoiolte7xmti/+ldrvtj2qda4zr48onyy9u3thu/lstsphuadpdg06eqe7kigdgrorhkj+bae7czrl2asowbt+euu9qnoc0ue8zmgkf8zymp2u7hfqfuagzy0bjejafnd/zojh8sjs93e3fp+rvp1vdyipcixfjggtkmdknr8gbgrjofkrrlesb8td28zf466k1jly5t8lfex/7ctio5jk=</latexit> <latexit sha1_base64="h4p6ixvh9wixs5vs76lrrahflye=">aaac5nicfvfdaxnbfj1dv2r8stu3uqadkeoju1qoiejrer8uk5i2kanl7orudujszdjzvxowfffjn/hvv+wf8dc4m6rag/tcwjl7zrncj7ru0meu/qjcc+cvxly0cblz5eq16ze6mzcpnkmsgkewytijldtqusmqjso4ki3wilvwmb6/apndj2cdnpodzksyf3yqzsyfr59kup8zsjwbmmmekt4krocyp1n9qafvjhp0gwwukpi6zvizhbnwkuernw1dok4bljxya2ztz/onzkxj0qf0j25r+/eh5rzreemfw0m3fw2irdb1ek9aj6xip9kmxrgjevubgoxizo3iqmrxzs1koadpsmpbycuxn8liq80lcon6sbkgpvczcc2m9u8jxwrpo2peodcvuq9se3vnutb5l66t6frswfgtjcrmnoxrqcskqytlj1mlkbrahohopawbau4bf1b6yajiueuc/rk77cx4ys289xxflwa5gvuwztxocz5r/pbttt2i/wmlphf61on4zcdn97wodh4n4sed6p1ob+/56gyb5a65t/okjrtkj7wm+2ribpkz3a7ubvfcppwsfg2/lavhsplcin9f+p0x8ddt3a==</latexit> <latexit sha1_base64="9twxksd3c2kbj//uen7smshv8fk=">aaacrhicfvhbihnbeo2mt3w8zfxrl8ygieiyqkavwuinh7yssnldmb5ctadm0mxp99bdo4zh/sqf0vf9ehuscjqgbq2hc04xdarywitpsfjtej07f+hipb3l8zwr167fgo7fppa2crkn0mrrtnpwqjxbksnsefo7hcrxejkfvej1k0/ovllmijy1zhwurhvkagvqnnwvkqbfnrevoqgxofsq0nnshyfcqzdby0x7uenvrffcqxjbgx/g14ztftycjenkvxwxtdzgxdz1onsfvbzzk5skdukn3qetpkasbudkauxi0xisqz5biwmabir0wbsk3vg7gznzwrrwdpev++epfirvl1uenp2cflvryx9pfuffi9bhjiftqhiatcrudagr60mkrnoyvf8znyuhkvqyajbohtbclscbphcmwlzeenbhu9d3q40oylr7rqbxvvclc+fl8abh/zmq89syubyhzu+297wljh+oj4/gycfho4pnmxvssdvsdrvhjuwjo2bv2cgbmsm+su/sb/szjaojki2yttuabp7cyn9vvpwcfexx3w==</latexit> Stochastic Gradient Descent (SGD) Loss = NX i=1 r w Loss = r w Loss = h i E rw Loss error(f(x i ; w), ŷ i ) NX i=1 X b2batch r w error(f(x i ; w), ŷ i ) r w error(f(x b ; w), ŷ b ) = r w Loss <latexit sha1_base64="nkl9+y066kp2d3rs7etg6nn/rbc=">aaacnhicfvfdaxnbfj2sx+36leqjuaadikjhvwvtuylaskglfuxayirwd3i3hto7s8zctyyl/8zf42v74r9xno0qg/tcwogccy/33mlkrtwlya9wdov2nbv31tbj+w8epnrc3njs97zyenvsautom/colceekdj4wjqeitn4kp1/apstb+i8suyrtuscfjaxklcskfcj9q4ogm6yvl6ycaexj3doxval9g0xsmcfguzdaen4zl0ftttjn5kxxwxpantyoo5hg60dmbayktcq1od9ie1kgtbgsemns1huhkuq5zdbqyagcvtdeh50xl8ezsxz68izxofsckcnhfftigvozk9/u2vif2nnrn+i1ugkyvbrvjoslskrqiovn8krzcny5qx8rbxk0tmaqdovwnb5bg4khephyh9dwidhye7neh2qda9qaw5swpdzcd8rrxv0p6myf4wbxxg4fhrzzqug/7abvusmx7y6e+8xf7dgnrhn7cvl2tbby4fsmpwyzd/yt3bjrqlnad/6gb1dw6pwoucp+6ui/m/njtb0</latexit> w w r w Loss

Specialized Architectures

Fully-Connected

Convolutional Neural Networks
http://deeplearning.net/tutorial/lenet.html
http://on-demand.gputechconf.com/gtc/2015/webinar/deep-learning-geoint.pdf
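
A sketch of the core operation: one small filter is slid across the input with shared weights (the image and filter are assumed toy values; like most deep-learning libraries, this computes a "valid" cross-correlation):

```python
import numpy as np

def conv2d(image, kernel):
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # The same kernel weights are reused at every position
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)
edge_filter = np.array([[1.0, -1.0]])  # responds to horizontal changes
print(conv2d(image, edge_filter))
```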

Recurrent Neural Networks
[Figure: RNN unrolled in time — inputs x_{t-1}, x_t, x_{t+1} feed hidden states h_{t-1}, h_t, h_{t+1}, which produce outputs y_{t-1}, y_t, y_{t+1}; each h_t depends on x_t and h_{t-1}]
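
A minimal sketch of that recurrence (the dimensions, tanh, and linear readout are assumptions): the same weights are applied at every time step, with h_t carrying information forward.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 3, 4, 2
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden (recurrence)
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output

def rnn(xs):
    h = np.zeros(n_hid)                    # h_{t-1} starts at zero
    ys = []
    for x in xs:                           # unroll over time
        h = np.tanh(W_xh @ x + W_hh @ h)   # h_t depends on x_t and h_{t-1}
        ys.append(W_hy @ h)                # y_t read out from h_t
    return ys

print(rnn(rng.normal(size=(5, n_in))))     # outputs for a length-5 sequence
```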

Are neural networks neural?
- feedforward
- strict layers
- deterministic
- supervised training
- backpropagation
- violate Dale's law
But can they still tell us something about representations in the brain?