3.7 Novelty & Diversity

Podobne dokumenty
Hard-Margin Support Vector Machines

Revenue Maximization. Sept. 25, 2018

Machine Learning for Data Science (CS4786) Lecture11. Random Projections & Canonical Correlation Analysis

Linear Classification and Logistic Regression. Pascal Fua IC-CVLab

Machine Learning for Data Science (CS4786) Lecture 11. Spectral Embedding + Clustering

tum.de/fall2018/ in2357

OpenPoland.net API Documentation

deep learning for NLP (5 lectures)

Previously on CSCI 4622

Compressing the information contained in the different indexes is crucial for performance when implementing an IR system

TTIC 31210: Advanced Natural Language Processing. Kevin Gimpel Spring Lecture 9: Inference in Structured Prediction

Helena Boguta, klasa 8W, rok szkolny 2018/2019

Proposal of thesis topic for mgr in. (MSE) programme in Telecommunications and Computer Science

Supervised Hierarchical Clustering with Exponential Linkage. Nishant Yadav

Wojewodztwo Koszalinskie: Obiekty i walory krajoznawcze (Inwentaryzacja krajoznawcza Polski) (Polish Edition)

Rozpoznawanie twarzy metodą PCA Michał Bereta 1. Testowanie statystycznej istotności różnic między jakością klasyfikatorów

MaPlan Sp. z O.O. Click here if your download doesn"t start automatically

ARNOLD. EDUKACJA KULTURYSTY (POLSKA WERSJA JEZYKOWA) BY DOUGLAS KENT HALL

Karpacz, plan miasta 1:10 000: Panorama Karkonoszy, mapa szlakow turystycznych (Polish Edition)

TTIC 31210: Advanced Natural Language Processing. Kevin Gimpel Spring Lecture 8: Structured PredicCon 2

Zakopane, plan miasta: Skala ok. 1: = City map (Polish Edition)

Tychy, plan miasta: Skala 1: (Polish Edition)

Weronika Mysliwiec, klasa 8W, rok szkolny 2018/2019

Network Services for Spatial Data in European Geo-Portals and their Compliance with ISO and OGC Standards

Blow-Up: Photographs in the Time of Tumult; Black and White Photography Festival Zakopane Warszawa 2002 / Powiekszenie: Fotografie w czasach zgielku

Wojewodztwo Koszalinskie: Obiekty i walory krajoznawcze (Inwentaryzacja krajoznawcza Polski) (Polish Edition)

Few-fermion thermometry

OBWIESZCZENIE MINISTRA INFRASTRUKTURY. z dnia 18 kwietnia 2005 r.

MoA-Net: Self-supervised Motion Segmentation. Pia Bideau, Rakesh R Menon, Erik Learned-Miller

SSW1.1, HFW Fry #20, Zeno #25 Benchmark: Qtr.1. Fry #65, Zeno #67. like

Katowice, plan miasta: Skala 1: = City map = Stadtplan (Polish Edition)

POLITYKA PRYWATNOŚCI / PRIVACY POLICY


Latent Dirichlet Allocation Models and their Evaluation IT for Practice 2016

Oxford PWN Polish English Dictionary (Wielki Slownik Polsko-angielski)

Wojewodztwo Koszalinskie: Obiekty i walory krajoznawcze (Inwentaryzacja krajoznawcza Polski) (Polish Edition)

Analysis of Movie Profitability STAT 469 IN CLASS ANALYSIS #2

Zarządzanie sieciami telekomunikacyjnymi

ERASMUS + : Trail of extinct and active volcanoes, earthquakes through Europe. SURVEY TO STUDENTS.

Machine Learning for Data Science (CS4786) Lecture 24. Differential Privacy and Re-useable Holdout

Zmiany techniczne wprowadzone w wersji Comarch ERP Altum

Inverse problems - Introduction - Probabilistic approach

Życie za granicą Studia

Camspot 4.4 Camspot 4.5

Sargent Opens Sonairte Farmers' Market

1945 (96,1%) backlinks currently link back (74,4%) links bear full SEO value. 0 links are set up using embedded object

New Roads to Cryptopia. Amit Sahai. An NSF Frontier Center

Domy inaczej pomyślane A different type of housing CEZARY SANKOWSKI

Gradient Coding using the Stochastic Block Model

Klaps za karę. Wyniki badania dotyczącego postaw i stosowania kar fizycznych. Joanna Włodarczyk

DOI: / /32/37

Warsztaty Ocena wiarygodności badania z randomizacją


archivist: Managing Data Analysis Results

METHOD 2 -DIAGNOSTIC OUTSIDE

Towards Stability Analysis of Data Transport Mechanisms: a Fluid Model and an Application

European Crime Prevention Award (ECPA) Annex I - new version 2014

Wojewodztwo Koszalinskie: Obiekty i walory krajoznawcze (Inwentaryzacja krajoznawcza Polski) (Polish Edition)

Instrukcja obsługi User s manual


Installation of EuroCert software for qualified electronic signature

Miedzy legenda a historia: Szlakiem piastowskim z Poznania do Gniezna (Biblioteka Kroniki Wielkopolski) (Polish Edition)

Miedzy legenda a historia: Szlakiem piastowskim z Poznania do Gniezna (Biblioteka Kroniki Wielkopolski) (Polish Edition)

kdpw_stream Struktura komunikatu: Status komunikatu z danymi uzupełniającymi na potrzeby ARM (auth.ste ) Data utworzenia: r.

Immigration Studying. Studying - University. Stating that you want to enroll. Stating that you want to apply for a course.

Nonlinear data assimilation for ocean applications

Pielgrzymka do Ojczyzny: Przemowienia i homilie Ojca Swietego Jana Pawla II (Jan Pawel II-- pierwszy Polak na Stolicy Piotrowej) (Polish Edition)

Label-Noise Robust Generative Adversarial Networks

Council of the European Union Brussels, 7 April 2016 (OR. en, pl)

UMOWY WYPOŻYCZENIA KOMENTARZ

Logistic Regression. Machine Learning CS5824/ECE5424 Bert Huang Virginia Tech

ANKIETA ŚWIAT BAJEK MOJEGO DZIECKA

DODATKOWE ĆWICZENIA EGZAMINACYJNE

RADIO DISTURBANCE Zakłócenia radioelektryczne

SNP SNP Business Partner Data Checker. Prezentacja produktu

Baptist Church Records

DM-ML, DM-FL. Auxiliary Equipment and Accessories. Damper Drives. Dimensions. Descritpion

Polska Szkoła Weekendowa, Arklow, Co. Wicklow KWESTIONRIUSZ OSOBOWY DZIECKA CHILD RECORD FORM

Emilka szuka swojej gwiazdy / Emily Climbs (Emily, #2)

Raport bieżący: 44/2018 Data: g. 21:03 Skrócona nazwa emitenta: SERINUS ENERGY plc

Extraclass. Football Men. Season 2009/10 - Autumn round

RADIO DISTURBANCE Zakłócenia radioelektryczne

Relaxation of the Cosmological Constant

Poland) Wydawnictwo "Gea" (Warsaw. Click here if your download doesn"t start automatically

Dolny Slask 1: , mapa turystycznosamochodowa: Plan Wroclawia (Polish Edition)

Patients price acceptance SELECTED FINDINGS

Cracow University of Economics Poland

OSTC GLOBAL TRADING CHALLENGE MANUAL

Wykaz linii kolejowych, które są wyposażone w urządzenia systemu ETCS

TYLKO DO UŻYTKU WŁASNEGO! PERSONAL USE ONLY!

No matter how much you have, it matters how much you need

Dominika Janik-Hornik (Uniwersytet Ekonomiczny w Katowicach) Kornelia Kamińska (ESN Akademia Górniczo-Hutnicza) Dorota Rytwińska (FRSE)

Wykaz linii kolejowych, które są wyposażone w urzadzenia systemu ETCS

EXAMPLES OF CABRI GEOMETRE II APPLICATION IN GEOMETRIC SCIENTIFIC RESEARCH

SG-MICRO... SPRĘŻYNY GAZOWE P.103

Steeple #3: Gödel s Silver Blaze Theorem. Selmer Bringsjord Are Humans Rational? Dec RPI Troy NY USA


Testy jednostkowe - zastosowanie oprogramowania JUNIT 4.0 Zofia Kruczkiewicz

What our clients think about us? A summary od survey results

EPS. Erasmus Policy Statement

Transkrypt:

3.7 Novelty & Diversity Redundancy in returned results (e.g., near duplicates) has a negative effect on retrieval effectiveness No benefit in showing relevant yet redundant results Bernstein and Zobel [3] identified near duplicates in TREC GOV2; mean MAP dropped by 20% when treating them as irrelevant and increased by 16% when omitting them from results Novelty: How can we avoid redundancy in results? 92

Novelty & Diversity Ambiguity of query needs to be reflected in results to cater for the uncertainty about the user s information need Query ambiguity comes in different forms topic (e.g., jaguar, eclipse, defender, cookies) intent (e.g., java 10 download (transactional) vs. features (information)) time (e.g., olympic games 2012, 2016, 2020) Diversity: How well do results reflect query ambiguity? 93

Implicit vs. Explicit Diversification Implicit diversification methods do not represent query aspects explicitly and instead operate directly on document contents and their (dis)similarity Maximum Marginal Relevance [3] Explicit diversification methods represent query aspects explicitly (e.g., as categories, subqueries or key phrases) and consider which query aspects documents relate to IA-Diversify [4] PM-1 and PM-2 [5,6] 94

<latexit sha1_base64="y/xsgiisde2it553vpll7c00w3e=">aaaclxicdvdbattaef2pt9s9urfoq6gi+seyjjyuasllwdbs+lka0jojziuzrcbkyu2lu6ssi/qt/bv+qz+gj107ccvnoyzm4zyzs7un0dw3lk1/bognm7du39m627t3/8hdr/3ht46sagzdkvo1micfwky5xknjrsytbrbeuenxsxiz0o+/obfcys9uqtexuek+5wycp2b977srpdfrtwvepxkuy+hz11iwfrvw3kw0xrmlffomjdbtqjqacmpienhalrr463y5inaionu5nbr5st++w+jkdgx4t4dy6c2o4dwzg836g3s8m64qug6y8bqng8kzsq7dwf8nlrvrberhard2neu1y1swjrmaux5tlgpgc6jwffin/va8rvajdga5kxu4eptn23w0g2ilwtqlkk6ratzzjlkotxbq2l/m3fwgb7nujupjvlagbtk1pqhkuadxhycflslme8l/ysy1odxpbdcj7njdrufjucog+j842h1nhn/ag0wolniiw+qfeuvikpf9mihvysgzekz+bs+dyrchz8px4dvw3cvogfzupcubfx78dqznzry=</latexit> <latexit sha1_base64="y/xsgiisde2it553vpll7c00w3e=">aaaclxicdvdbattaef2pt9s9urfoq6gi+seyjjyuasllwdbs+lka0jojziuzrcbkyu2lu6ssi/qt/bv+qz+gj107ccvnoyzm4zyzs7un0dw3lk1/bognm7du39m627t3/8hdr/3ht46sagzdkvo1micfwky5xknjrsytbrbeuenxsxiz0o+/obfcys9uqtexuek+5wycp2b977srpdfrtwvepxkuy+hz11iwfrvw3kw0xrmlffomjdbtqjqacmpienhalrr463y5inaionu5nbr5st++w+jkdgx4t4dy6c2o4dwzg836g3s8m64qug6y8bqng8kzsq7dwf8nlrvrberhard2neu1y1swjrmaux5tlgpgc6jwffin/va8rvajdga5kxu4eptn23w0g2ilwtqlkk6ratzzjlkotxbq2l/m3fwgb7nujupjvlagbtk1pqhkuadxhycflslme8l/ysy1odxpbdcj7njdrufjucog+j842h1nhn/ag0wolniiw+qfeuvikpf9mihvysgzekz+bs+dyrchz8px4dvw3cvogfzupcubfx78dqznzry=</latexit> <latexit sha1_base64="y/xsgiisde2it553vpll7c00w3e=">aaaclxicdvdbattaef2pt9s9urfoq6gi+seyjjyuasllwdbs+lka0jojziuzrcbkyu2lu6ssi/qt/bv+qz+gj107ccvnoyzm4zyzs7un0dw3lk1/bognm7du39m627t3/8hdr/3ht46sagzdkvo1micfwky5xknjrsytbrbeuenxsxiz0o+/obfcys9uqtexuek+5wycp2b977srpdfrtwvepxkuy+hz11iwfrvw3kw0xrmlffomjdbtqjqacmpienhalrr463y5inaionu5nbr5st++w+jkdgx4t4dy6c2o4dwzg836g3s8m64qug6y8bqng8kzsq7dwf8nlrvrberhard2neu1y1swjrmaux5tlgpgc6jwffin/va8rvajdga5kxu4eptn23w0g2ilwtqlkk6ratzzjlkotxbq2l/m3fwgb7nujupjvlagbtk1pqhkuadxhycflslme8l/ysy1odxpbdcj7njdrufjucog+j842h1nhn/ag0wolniiw+qfeuvikpf9mihvysgzekz+bs+dyrchz8px4dvw3cvogfzupcubfx78dqznzry=</latexit> <latexit sha1_base64="gki4usoazpcibs7gxwssaexw1q8=">aaaclxicdvbdi9nafj3er7v+vx3wqzbgh5rcbpmsgvsilcjii7ii3v3ykevmcpsdmpmjmzeyjerp+o/8d/4ah53wlljxlwp3cm69z2zoxlfsuzp+d8jr12/cvlvzu3fn7r37d/ophx0701ibe2eqy09zcfhjjrosvofpbrfuxufjvni90k++onxs6m+0rhgqonrylgwqp2b9b7zrhder2ili2hcxovrutrxsyrvcdbgvce6xb960al7lty0wyfgnclsnvrd/2s1g0v4uz3ubqzef++m7jc5nv4b/dcig3ojbwz7tanyfpop9dfxrvzcn1z0dse0dzfo/egfeo1ctqmc5syytadqcjskq7hq8cvidwecjz5dw4g+ftiuahwsx27khq0e5abuodktsqtm3vpkvuggdb5o5mquc3p1ltvodast13rbq4bu1bjoj80elixytpzj5v0xmtsf8n1xsaeffokby48yedj0fz2ug0f/b8f448/jji8hhwsaohfaupwcxy9hldsjessm2yyl9dj4fwyaon4svwjfh29+jybdzecy2kvzwc4uazmi=</latexit> Maximum Marginal Relevance Carbonell and Goldstein [3] return the next document d as the one having maximum marginal relevance (MMR) given the set S of already-returned documents 3 4 arg max sim(q, d) (1 ) max d œs d Õ œs sim(dõ,d) with λ as a tunable parameter controlling relevance vs. novelty and sim as a similarity measure (e.g., cosine similarity between query and document vectors) 95

<latexit sha1_base64="2mdpahaddhqdmgcygz/ydcigh5u=">aaacynicdvblaxsxgjq3fbjui07bw3ty6oslrnc3fjplwdbllgwx1a/wlotw/myllvaqpa0xyn9efktuusxh9f557udd0ehwjwbme2gyyag2yxjv8a4epx7ytpms9fzfy1eh7apxyy1krwbebbnqmmenjbywmtqwmeofmgcmjln+benpzkfpkoqfzi0h4xhz0aul2dgpbacxkcqevfzx5x/1y13y1jiq7tuyqewywcj0i/+th0sl5qmd+zet/lpql2otnrt0z+0qulyzj7urtjth/zjcwh9ion49w87glaoxtnu38vyqkknhcmnaz6jqmsrizshhulxiuopejmdlmofqygkqsusqhixa79uofpidtmzd055pmdd6zbmhisdmts9mquqgz/qfx83ijlg0kkwbgjivpjyyadd2kgeg3khb93wqsie4p+mayqmxacdecd13tgq5eu478p9pxsf9ypefnzudk21pqineoq+oiyl0bq3qkrqiesloet2go/s7ce21vcpvztbqnxy7u/s9vpd/afhfvau=</latexit> <latexit sha1_base64="2mdpahaddhqdmgcygz/ydcigh5u=">aaacynicdvblaxsxgjq3fbjui07bw3ty6oslrnc3fjplwdbllgwx1a/wlotw/myllvaqpa0xyn9efktuusxh9f557udd0ehwjwbme2gyyag2yxjv8a4epx7ytpms9fzfy1eh7apxyy1krwbebbnqmmenjbywmtqwmeofmgcmjln+benpzkfpkoqfzi0h4xhz0aul2dgpbacxkcqevfzx5x/1y13y1jiq7tuyqewywcj0i/+th0sl5qmd+zet/lpql2otnrt0z+0qulyzj7urtjth/zjcwh9ion49w87glaoxtnu38vyqkknhcmnaz6jqmsrizshhulxiuopejmdlmofqygkqsusqhixa79uofpidtmzd055pmdd6zbmhisdmts9mquqgz/qfx83ijlg0kkwbgjivpjyyadd2kgeg3khb93wqsie4p+mayqmxacdecd13tgq5eu478p9pxsf9ypefnzudk21pqineoq+oiyl0bq3qkrqiesloet2go/s7ce21vcpvztbqnxy7u/s9vpd/afhfvau=</latexit> <latexit sha1_base64="2mdpahaddhqdmgcygz/ydcigh5u=">aaacynicdvblaxsxgjq3fbjui07bw3ty6oslrnc3fjplwdbllgwx1a/wlotw/myllvaqpa0xyn9efktuusxh9f557udd0ehwjwbme2gyyag2yxjv8a4epx7ytpms9fzfy1eh7apxyy1krwbebbnqmmenjbywmtqwmeofmgcmjln+benpzkfpkoqfzi0h4xhz0aul2dgpbacxkcqevfzx5x/1y13y1jiq7tuyqewywcj0i/+th0sl5qmd+zet/lpql2otnrt0z+0qulyzj7urtjth/zjcwh9ion49w87glaoxtnu38vyqkknhcmnaz6jqmsrizshhulxiuopejmdlmofqygkqsusqhixa79uofpidtmzd055pmdd6zbmhisdmts9mquqgz/qfx83ijlg0kkwbgjivpjyyadd2kgeg3khb93wqsie4p+mayqmxacdecd13tgq5eu478p9pxsf9ypefnzudk21pqineoq+oiyl0bq3qkrqiesloet2go/s7ce21vcpvztbqnxy7u/s9vpd/afhfvau=</latexit> <latexit sha1_base64="p1sdvl+az3gddasnappa/dsxnl0=">aaacynicdvdpixmxge1hxwvv3xz71mnglxvqz6yi9rjq8lixovl7azrdkem/dkotsuwyy5ywf55/hhdv7lhvpmmfa/er+b7vve+dvewyqk0yfq15dx4+ontcf9j4+uz5+uwzdtntolaepkqworyz1sbodlnddyofvib5xmcebd/v/fktke1f/snsjcqcb3k6pgqbj6xnnczs2ulpp5f+lr/rgqewlhgvkkklxwzwphv5b/xykrfk7cqpae5pyr+mfxrl0j23q+jmxrw+jltzcfudca//let9aoyddma4bd7hk0ekdrkhdgu9jejpeouvoyrb2ygldrktld7aeocss1cj3ydgyntu2hy0xxx0yquejkyludy7np2ihjubyzetymtwpv85btbdxnjcfgzy4ryk2mcqxdvbngplhg0+7ijuosh9sqcmg7gloczk6l6jzcpv86cd//9knuhhjn982xknd0xv0qv0cnvrhn6hebpgyzrfbh1b39ep9lp2zwt4la/9o+rvdjttdatv5s/xllux</latexit> Intent-Aware Selection Agrawal et al. [4] model query aspects as categories (e.g., from a topic taxonomy such as DMOZ) query q belongs to category c with probability P[c q] document d relevant to query q and category c with probability P[d q, c] Given a query q, a baseline retrieval result R, they aim to find a subset of documents S of size k that maximizes P[S q ]= ÿ A P[c q ] 1 Ÿ B (1 P[d q, c ]) c dœs 96

<latexit sha1_base64="2mdpahaddhqdmgcygz/ydcigh5u=">aaacynicdvblaxsxgjq3fbjui07bw3ty6oslrnc3fjplwdbllgwx1a/wlotw/myllvaqpa0xyn9efktuusxh9f557udd0ehwjwbme2gyyag2yxjv8a4epx7ytpms9fzfy1eh7apxyy1krwbebbnqmmenjbywmtqwmeofmgcmjln+benpzkfpkoqfzi0h4xhz0aul2dgpbacxkcqevfzx5x/1y13y1jiq7tuyqewywcj0i/+th0sl5qmd+zet/lpql2otnrt0z+0qulyzj7urtjth/zjcwh9ion49w87glaoxtnu38vyqkknhcmnaz6jqmsrizshhulxiuopejmdlmofqygkqsusqhixa79uofpidtmzd055pmdd6zbmhisdmts9mquqgz/qfx83ijlg0kkwbgjivpjyyadd2kgeg3khb93wqsie4p+mayqmxacdecd13tgq5eu478p9pxsf9ypefnzudk21pqineoq+oiyl0bq3qkrqiesloet2go/s7ce21vcpvztbqnxy7u/s9vpd/afhfvau=</latexit> <latexit sha1_base64="2mdpahaddhqdmgcygz/ydcigh5u=">aaacynicdvblaxsxgjq3fbjui07bw3ty6oslrnc3fjplwdbllgwx1a/wlotw/myllvaqpa0xyn9efktuusxh9f557udd0ehwjwbme2gyyag2yxjv8a4epx7ytpms9fzfy1eh7apxyy1krwbebbnqmmenjbywmtqwmeofmgcmjln+benpzkfpkoqfzi0h4xhz0aul2dgpbacxkcqevfzx5x/1y13y1jiq7tuyqewywcj0i/+th0sl5qmd+zet/lpql2otnrt0z+0qulyzj7urtjth/zjcwh9ion49w87glaoxtnu38vyqkknhcmnaz6jqmsrizshhulxiuopejmdlmofqygkqsusqhixa79uofpidtmzd055pmdd6zbmhisdmts9mquqgz/qfx83ijlg0kkwbgjivpjyyadd2kgeg3khb93wqsie4p+mayqmxacdecd13tgq5eu478p9pxsf9ypefnzudk21pqineoq+oiyl0bq3qkrqiesloet2go/s7ce21vcpvztbqnxy7u/s9vpd/afhfvau=</latexit> <latexit sha1_base64="2mdpahaddhqdmgcygz/ydcigh5u=">aaacynicdvblaxsxgjq3fbjui07bw3ty6oslrnc3fjplwdbllgwx1a/wlotw/myllvaqpa0xyn9efktuusxh9f557udd0ehwjwbme2gyyag2yxjv8a4epx7ytpms9fzfy1eh7apxyy1krwbebbnqmmenjbywmtqwmeofmgcmjln+benpzkfpkoqfzi0h4xhz0aul2dgpbacxkcqevfzx5x/1y13y1jiq7tuyqewywcj0i/+th0sl5qmd+zet/lpql2otnrt0z+0qulyzj7urtjth/zjcwh9ion49w87glaoxtnu38vyqkknhcmnaz6jqmsrizshhulxiuopejmdlmofqygkqsusqhixa79uofpidtmzd055pmdd6zbmhisdmts9mquqgz/qfx83ijlg0kkwbgjivpjyyadd2kgeg3khb93wqsie4p+mayqmxacdecd13tgq5eu478p9pxsf9ypefnzudk21pqineoq+oiyl0bq3qkrqiesloet2go/s7ce21vcpvztbqnxy7u/s9vpd/afhfvau=</latexit> <latexit sha1_base64="p1sdvl+az3gddasnappa/dsxnl0=">aaacynicdvdpixmxge1hxwvv3xz71mnglxvqz6yi9rjq8lixovl7azrdkem/dkotsuwyy5ywf55/hhdv7lhvpmmfa/er+b7vve+dvewyqk0yfq15dx4+ontcf9j4+uz5+uwzdtntolaepkqworyz1sbodlnddyofvib5xmcebd/v/fktke1f/snsjcqcb3k6pgqbj6xnnczs2ulpp5f+lr/rgqewlhgvkkklxwzwphv5b/xykrfk7cqpae5pyr+mfxrl0j23q+jmxrw+jltzcfudca//let9aoyddma4bd7hk0ekdrkhdgu9jejpeouvoyrb2ygldrktld7aeocss1cj3ydgyntu2hy0xxx0yquejkyludy7np2ihjubyzetymtwpv85btbdxnjcfgzy4ryk2mcqxdvbngplhg0+7ijuosh9sqcmg7gloczk6l6jzcpv86cd//9knuhhjn982xknd0xv0qv0cnvrhn6hebpgyzrfbh1b39ep9lp2zwt4la/9o+rvdjttdatv5s/xllux</latexit> Intent-Aware Selection What is the intuition behind this probability? A B P[S q ]= ÿ P[c q ] 1 Ÿ (1 P[d q, c ]) c dœs c{ Probability that d is relevant to q, {not Probability that no document S is relevant to q, c { in Probability that at least one document in S is relevant to q, c It is thus the probability that an average user finds at least one relevant result among the documents in S 97

Intent-Aware Selection Probability P[c q] can be estimated using query classification methods (e.g., Naïve Bayes) Probability P[d q, c] can be decomposed into probability P[c d] that document belongs to category query likelihood P[q d] that documents generates query Agrawal et al. [5] show that finding an optimal subset S is an NP-hard problem, but propose a greedy algorithm that gives an approximation guarantee 98

<latexit sha1_base64="7valo+qgwtclpku2wklqiyewvm8=">aaacthicdzdnahrbfivvt2km40/g6c6bwtle0onuijinmodgliqenssqbprqmjudyurpqhpxkpqzfbn3ltyyf3dptgqrnbgyg4ec+jjn3oi6trhc+sz7lvtw1m9t3n6807977/6drchd7wonf5bhhgmh7wlnhqqucok5f3hqlfjzczyp568v85opab3x6r1fgiwlbrsfcuz9tkrbqcgmdyxchra2vgvjk1iyq6dvmjkckxkdqudm7+bkoelmp7rh2hbpomydw96c+6fvyjin9rjlkzuqj7o7g44fq6fdavcjmgq2kkg8e9s5szwzvgzues4etv1i4dbqnqcnnthmuio2da1qid4uv+oiikp0zegawqkdlc4tzx3dlnsfr5q11nnpa/fp4362xwauzmkjyjhrmkqtf3tn5zxaehj6dplwjhrxty4v1oonvfjmtrtfbpuxnr8dkp/d8d4oj3z0yjjev+ojnmehnsau5pasxvagdmecdd7dv/gof8mx5gfyk/l9ndplrncewyp6g38ah2g2vw==</latexit> <latexit sha1_base64="7valo+qgwtclpku2wklqiyewvm8=">aaacthicdzdnahrbfivvt2km40/g6c6bwtle0onuijinmodgliqenssqbprqmjudyurpqhpxkpqzfbn3ltyyf3dptgqrnbgyg4ec+jjn3oi6trhc+sz7lvtw1m9t3n6807977/6drchd7wonf5bhhgmh7wlnhqqucok5f3hqlfjzczyp568v85opab3x6r1fgiwlbrsfcuz9tkrbqcgmdyxchra2vgvjk1iyq6dvmjkckxkdqudm7+bkoelmp7rh2hbpomydw96c+6fvyjin9rjlkzuqj7o7g44fq6fdavcjmgq2kkg8e9s5szwzvgzues4etv1i4dbqnqcnnthmuio2da1qid4uv+oiikp0zegawqkdlc4tzx3dlnsfr5q11nnpa/fp4362xwauzmkjyjhrmkqtf3tn5zxaehj6dplwjhrxty4v1oonvfjmtrtfbpuxnr8dkp/d8d4oj3z0yjjev+ojnmehnsau5pasxvagdmecdd7dv/gof8mx5gfyk/l9ndplrncewyp6g38ah2g2vw==</latexit> <latexit sha1_base64="7valo+qgwtclpku2wklqiyewvm8=">aaacthicdzdnahrbfivvt2km40/g6c6bwtle0onuijinmodgliqenssqbprqmjudyurpqhpxkpqzfbn3ltyyf3dptgqrnbgyg4ec+jjn3oi6trhc+sz7lvtw1m9t3n6807977/6drchd7wonf5bhhgmh7wlnhqqucok5f3hqlfjzczyp568v85opab3x6r1fgiwlbrsfcuz9tkrbqcgmdyxchra2vgvjk1iyq6dvmjkckxkdqudm7+bkoelmp7rh2hbpomydw96c+6fvyjin9rjlkzuqj7o7g44fq6fdavcjmgq2kkg8e9s5szwzvgzues4etv1i4dbqnqcnnthmuio2da1qid4uv+oiikp0zegawqkdlc4tzx3dlnsfr5q11nnpa/fp4362xwauzmkjyjhrmkqtf3tn5zxaehj6dplwjhrxty4v1oonvfjmtrtfbpuxnr8dkp/d8d4oj3z0yjjev+ojnmehnsau5pasxvagdmecdd7dv/gof8mx5gfyk/l9ndplrncewyp6g38ah2g2vw==</latexit> <latexit sha1_base64="ckqnqxaret8tv8cf08+qplyjwbi=">aaacthicdzbpsxwxgmyz2/qn26rbeuwlubcldwdgbl0ubc/1ifjavcezhkz23tfs/plkxsxmz/kbeovbi36bhnstoxhchlbxizafz/o+gtyl5sy6jlmjwm/ezs0vll5rv/+wtlzs+fjpykqxodcniitzuhilnenoo+y4nggdrjqcjsvr3mn+fahgmiv/uymgxjbksigjxawr6oxnvbufsagwrf3pgn/dmtzqupgbzpjewck4dn16ijdwm3te+0gdfw2ynmxydea+fj1u0ttmhovfqtpr7qslpjoson+ygajjadjrtqw9trptck+my5rd3c7gfjshi1lbkuk00wbyx4es4mxkng4oiqcb+6armdatye1elc9mqdzzrfkqnxkktm8ed8od3dopxw4kdvmdpu7b0gs8ksgew+kdsvzoyiq/2zgtb5exinqo2wtyt0m9/zvar8przi8n/goru7szlworfuzrab2labvtou/oepurrvfon7pfd9f19de6jx6erlvrdgcvzag1/w+krryd</latexit> <latexit sha1_base64="crxpajnoccayqjn3omm/fooao08=">aaacm3icdzdnsgmxfixv+g/9qz87n8vuxehnkoiuc27cciq2fzxhymrrdz1mypirs5jx8e3cudv3ehfiut/bdnrffq+bfjytg7gnksntjgievbhxicmp6znzytz8wujsdxmlo0wuklapsiu6s4jglgxynsykecyvep6k2e36+8o8e4nkm5gdmoheijnexi4zjczzctuidc5js4taskwyyya9gi3ssrfulcz1ys9+ma45rtadxnywvo0vnbvlhdrba1dqkk6+hxec5hwzq1oi9xkzkcayrblguywqya5retonptwngsqsvwr7kdganrinhwaeo45sufhiaanxesctpyyn5mrutitog5lox5+by73iskzmbjpqshkt39aupr+fohkh+ycdp5bocdtppyugb31oqbk64bcouhp+oqj9d53trtpx8u69tffve8zaomzajjrhf1pwaefqbgp38acp8otdey/eq/f29xtm+55zhrf5h59ho67q</latexit> <latexit sha1_base64="crxpajnoccayqjn3omm/fooao08=">aaacm3icdzdnsgmxfixv+g/9qz87n8vuxehnkoiuc27cciq2fzxhymrrdz1mypirs5jx8e3cudv3ehfiut/bdnrffq+bfjytg7gnksntjgievbhxicmp6znzytz8wujsdxmlo0wuklapsiu6s4jglgxynsykecyvep6k2e36+8o8e4nkm5gdmoheijnexi4zjczzctuidc5js4taskwyyya9gi3ssrfulcz1ys9+ma45rtadxnywvo0vnbvlhdrba1dqkk6+hxec5hwzq1oi9xkzkcayrblguywqya5retonptwngsqsvwr7kdganrinhwaeo45sufhiaanxesctpyyn5mrutitog5lox5+by73iskzmbjpqshkt39aupr+fohkh+ycdp5bocdtppyugb31oqbk64bcouhp+oqj9d53trtpx8u69tffve8zaomzajjrhf1pwaefqbgp38acp8otdey/eq/f29xtm+55zhrf5h59ho67q</latexit> <latexit sha1_base64="crxpajnoccayqjn3omm/fooao08=">aaacm3icdzdnsgmxfixv+g/9qz87n8vuxehnkoiuc27cciq2fzxhymrrdz1mypirs5jx8e3cudv3ehfiut/bdnrffq+bfjytg7gnksntjgievbhxicmp6znzytz8wujsdxmlo0wuklapsiu6s4jglgxynsykecyvep6k2e36+8o8e4nkm5gdmoheijnexi4zjczzctuidc5js4taskwyyya9gi3ssrfulcz1ys9+ma45rtadxnywvo0vnbvlhdrba1dqkk6+hxec5hwzq1oi9xkzkcayrblguywqya5retonptwngsqsvwr7kdganrinhwaeo45sufhiaanxesctpyyn5mrutitog5lox5+by73iskzmbjpqshkt39aupr+fohkh+ycdp5bocdtppyugb31oqbk64bcouhp+oqj9d53trtpx8u69tffve8zaomzajjrhf1pwaefqbgp38acp8otdey/eq/f29xtm+55zhrf5h59ho67q</latexit> <latexit sha1_base64="dlbkm10jpvzgyutirhf2glezw+y=">aaacm3icdzdlsgmxgiuz9vbrrerstbebf9kzfseuc27ccbxtbtrdken/1tdjjcyzsyr5dd/envt9b3enlvudtgtd1oihki9z8gf+e4myku15r05uaxllds2/xtjy3nreke7utrvpjyew4tgx3qgrigkclu11df0habmohk40opvkntuqivlkwo8fbawpezqgbgtrhuxpvykldclkphhs+akmsyqzv5l/pdvum9p/ztlhsfj2kjvvotiivcvt2yujmzph8dpvc5iysdsjsvk9qid0yldulmsqffxugcbkhifqw57aamrghsazadmejy0mmiekzhtxudbgptsyrqsmw/pm3ow4h2kcqt+f60e9mdqrqyae2gykxm0pw587ikdaq92lsrska9idlbtjdfcuw0ryvbgyfww9vx2u/od2rvk1fhlsbtrnrexratper6iktlednammaigchtatekyvzqpz5rw7hz9pc85szh/nyfn6bsyirnw=</latexit> Intent-Aware Selection (IA-SELECT) Greedy algorithm iteratively builds up the set S by selecting in each round the document with highest marginal utility ÿ P[ c S ]P[q d ]P[c d ] c with P[ c S] as the probability that none of the documents already in S is relevant to query q and category c P[ c S ]= Ÿ dœs (1 P[q d ]P[c d ]) which is initialized as P[c q] 99

Intent-Aware Selection One problem with IA-SELECT is that its results degenerate: it does not select more results for a category c, once it has been fully satisfied by a single highly relevant result, which is not effective for information needs that require more than a single document 100

Diversity by Proportionality Dang and Croft [5,6] develop the proportionality-based explicit diversification methods PM-1 and PM-2 Given a query q and a baseline result R, their objective is to find a subset of documents S of size k, so that S proportionally represents the query aspects Example: Query jaguar refers to aspect car with 75% probability and to query aspect cat with 25% probability S 1 more proportional than S 2 more proportional than S 3 101

<latexit sha1_base64="u516yjnpoztsbcgayfhwe64qyw8=">aaacgxicdzdnsgmxfixv+g/9qz+4cvpsrla6m0xqzcgng6gcvceow514w0mnk5ckxtl2sdy51zdwj25d+q4+hgnvrrupgxyckxu4j1epnzyi3r2jyanpmdm5+clc4tlysnf17dzirmbuydkv+jjbqynpqgg5telsaukrphsrdi6g+uwptoeyo7n9rzhadszbnkf1vlzcbly0srwx80febe6zmjd2s+eglpadsjuyqvqxwsrodsq1driphhc/mtesdqvllqvozfuykbvlqc1nkq0kza4hhaydbbrcqkeihevtkoks7o/hdjmuzkj8tn5ymkmwpi+sp6zaeznujlj2lcbm1+e2drjlpfndsxlz2qhzv2fcsx4nie0o90/6fqyc4xyyfoqwbn2btettctgouhp+oij9d+fvsuj4dl9co/zqcezgc7zhb0i4gbocqx0awoaohuarnrx779l78v6/nk543zprmcbv7rni46it</latexit> <latexit sha1_base64="u516yjnpoztsbcgayfhwe64qyw8=">aaacgxicdzdnsgmxfixv+g/9qz+4cvpsrla6m0xqzcgng6gcvceow514w0mnk5ckxtl2sdy51zdwj25d+q4+hgnvrrupgxyckxu4j1epnzyi3r2jyanpmdm5+clc4tlysnf17dzirmbuydkv+jjbqynpqgg5telsaukrphsrdi6g+uwptoeyo7n9rzhadszbnkf1vlzcbly0srwx80febe6zmjd2s+eglpadsjuyqvqxwsrodsq1driphhc/mtesdqvllqvozfuykbvlqc1nkq0kza4hhaydbbrcqkeihevtkoks7o/hdjmuzkj8tn5ymkmwpi+sp6zaeznujlj2lcbm1+e2drjlpfndsxlz2qhzv2fcsx4nie0o90/6fqyc4xyyfoqwbn2btettctgouhp+oij9d+fvsuj4dl9co/zqcezgc7zhb0i4gbocqx0awoaohuarnrx779l78v6/nk543zprmcbv7rni46it</latexit> <latexit sha1_base64="u516yjnpoztsbcgayfhwe64qyw8=">aaacgxicdzdnsgmxfixv+g/9qz+4cvpsrla6m0xqzcgng6gcvceow514w0mnk5ckxtl2sdy51zdwj25d+q4+hgnvrrupgxyckxu4j1epnzyi3r2jyanpmdm5+clc4tlysnf17dzirmbuydkv+jjbqynpqgg5telsaukrphsrdi6g+uwptoeyo7n9rzhadszbnkf1vlzcbly0srwx80febe6zmjd2s+eglpadsjuyqvqxwsrodsq1driphhc/mtesdqvllqvozfuykbvlqc1nkq0kza4hhaydbbrcqkeihevtkoks7o/hdjmuzkj8tn5ymkmwpi+sp6zaeznujlj2lcbm1+e2drjlpfndsxlz2qhzv2fcsx4nie0o90/6fqyc4xyyfoqwbn2btettctgouhp+oij9d+fvsuj4dl9co/zqcezgc7zhb0i4gbocqx0awoaohuarnrx779l78v6/nk543zprmcbv7rni46it</latexit> <latexit sha1_base64="w1qn5onig9rsymkedws5n4dtw0e=">aaacgxicdzdnsgmxfiuz9a/wv6rgxk2xg0hpzbtblgtu3agv7a+0w3an3tbqysqkabgmfrj3bvul3ilbv76dd2fa66iwd4f8njmbucesmdpg8z6dznlyyupadj23sbm1vzpf3wtomvau61teqrui0bizbougmrhbuihwkmzm1l+y5m0hks1ecmngegmovyr1gqvjrtb/0okqookwzoo03dnvisucfpxxmc96pbi3uwer/nl09opkplqy/+rccjrgmbgag9zt35mmseezrmmc5zodjrjoh3rybk+crbwkprqcjrrnxxyt4kiddlrexjgc13reowwtg7mbnymh+gyi/edz060ekuvkwgbcbtbf1k1rw5lbj1dzw9yrkrtka9idtbudwxuxa1vclyyoc7ae3w4k/0ojxpitx58vq5vzuvlysi7imfhjoamss1ijdulja3kiz+tfexrentfn/edpxpnn7jm5or/fzjch2q==</latexit> Sainte-Laguë Method Ensuring proportionality is a classic problem that also arises when assigning parliament seats to parties Sainte-Laguë method as used in New Zealand Let v i denote the number of votes received by party p i Let s i denote the number of seats allocated to party p i While not all seats have been allocated assign next seat to party p i with highest quotient v i 2 s i +1 Increment number of seats s i allocated to party p i 102

PM-1 PM-1 is a naïve adaptation of the Sainte-Laguë method to the problem of selecting documents from D for S members of parliament (MoP) belong to a single party only, hence a document d represents only a single aspect c, namely the one for which it has highest probability P[d c] Allocate the k seats available to the query aspects (parties) according to their popularity P[c q] using Sainte-Laguë when allocated a seat, the query aspect (party) c assigns it to the document (MoP) not yet in S having highest P[d c] Problem: Documents relate to more than a single query aspect in practice, but PM-1 cannot handle this 103

<latexit sha1_base64="u516yjnpoztsbcgayfhwe64qyw8=">aaacgxicdzdnsgmxfixv+g/9qz+4cvpsrla6m0xqzcgng6gcvceow514w0mnk5ckxtl2sdy51zdwj25d+q4+hgnvrrupgxyckxu4j1epnzyi3r2jyanpmdm5+clc4tlysnf17dzirmbuydkv+jjbqynpqgg5telsaukrphsrdi6g+uwptoeyo7n9rzhadszbnkf1vlzcbly0srwx80febe6zmjd2s+eglpadsjuyqvqxwsrodsq1driphhc/mtesdqvllqvozfuykbvlqc1nkq0kza4hhaydbbrcqkeihevtkoks7o/hdjmuzkj8tn5ymkmwpi+sp6zaeznujlj2lcbm1+e2drjlpfndsxlz2qhzv2fcsx4nie0o90/6fqyc4xyyfoqwbn2btettctgouhp+oij9d+fvsuj4dl9co/zqcezgc7zhb0i4gbocqx0awoaohuarnrx779l78v6/nk543zprmcbv7rni46it</latexit> <latexit sha1_base64="u516yjnpoztsbcgayfhwe64qyw8=">aaacgxicdzdnsgmxfixv+g/9qz+4cvpsrla6m0xqzcgng6gcvceow514w0mnk5ckxtl2sdy51zdwj25d+q4+hgnvrrupgxyckxu4j1epnzyi3r2jyanpmdm5+clc4tlysnf17dzirmbuydkv+jjbqynpqgg5telsaukrphsrdi6g+uwptoeyo7n9rzhadszbnkf1vlzcbly0srwx80febe6zmjd2s+eglpadsjuyqvqxwsrodsq1driphhc/mtesdqvllqvozfuykbvlqc1nkq0kza4hhaydbbrcqkeihevtkoks7o/hdjmuzkj8tn5ymkmwpi+sp6zaeznujlj2lcbm1+e2drjlpfndsxlz2qhzv2fcsx4nie0o90/6fqyc4xyyfoqwbn2btettctgouhp+oij9d+fvsuj4dl9co/zqcezgc7zhb0i4gbocqx0awoaohuarnrx779l78v6/nk543zprmcbv7rni46it</latexit> <latexit sha1_base64="u516yjnpoztsbcgayfhwe64qyw8=">aaacgxicdzdnsgmxfixv+g/9qz+4cvpsrla6m0xqzcgng6gcvceow514w0mnk5ckxtl2sdy51zdwj25d+q4+hgnvrrupgxyckxu4j1epnzyi3r2jyanpmdm5+clc4tlysnf17dzirmbuydkv+jjbqynpqgg5telsaukrphsrdi6g+uwptoeyo7n9rzhadszbnkf1vlzcbly0srwx80febe6zmjd2s+eglpadsjuyqvqxwsrodsq1driphhc/mtesdqvllqvozfuykbvlqc1nkq0kza4hhaydbbrcqkeihevtkoks7o/hdjmuzkj8tn5ymkmwpi+sp6zaeznujlj2lcbm1+e2drjlpfndsxlz2qhzv2fcsx4nie0o90/6fqyc4xyyfoqwbn2btettctgouhp+oij9d+fvsuj4dl9co/zqcezgc7zhb0i4gbocqx0awoaohuarnrx779l78v6/nk543zprmcbv7rni46it</latexit> <latexit sha1_base64="w1qn5onig9rsymkedws5n4dtw0e=">aaacgxicdzdnsgmxfiuz9a/wv6rgxk2xg0hpzbtblgtu3agv7a+0w3an3tbqysqkabgmfrj3bvul3ilbv76dd2fa66iwd4f8njmbucesmdpg8z6dznlyyupadj23sbm1vzpf3wtomvau61teqrui0bizbougmrhbuihwkmzm1l+y5m0hks1ecmngegmovyr1gqvjrtb/0okqookwzoo03dnvisucfpxxmc96pbi3uwer/nl09opkplqy/+rccjrgmbgag9zt35mmseezrmmc5zodjrjoh3rybk+crbwkprqcjrrnxxyt4kiddlrexjgc13reowwtg7mbnymh+gyi/edz060ekuvkwgbcbtbf1k1rw5lbj1dzw9yrkrtka9idtbudwxuxa1vclyyoc7ae3w4k/0ojxpitx58vq5vzuvlysi7imfhjoamss1ijdulja3kiz+tfexrentfn/edpxpnn7jm5or/fzjch2q==</latexit> PM-2 PM-2 is a probabilistic adaptation of the Sainte-Laguë method that considers to what extent documents related to query aspects Let v i = P[c i q] and s i denote the proportion votes/seats received by/assigned to aspect (party) c i While not all seats have been allocated select query aspect c i with highest quotient v i 2 s i +1 104

<latexit sha1_base64="1wger7epyuseuk2erac3vfrpm4m=">aaacdhicdzblaxsxfiu10zqp9xgnza5zddwflcbzmifmaeimm0ikcrlidmmd+dqvly0usrnixpzo0v/qdxefko6zce0uah3op3tap1kcgzumv4lw2cbzza3tnc6ll69e73b33lwy2wikqyq51fcvgossxqflluov0gii4nhzzb7c88tb1ibj+tzofrycjjubmwrww2x3juf+8qjyo3ysgbrbkrwunx+zkkwfoqz1plxajvphpfhwyxa8xpnomwle6azrxunn5pfjxnszmv3lmlzlt5fg/fr+onwrxys77q32ywloyu7vfcrpi7c2limx11mqbofaw0y5tp28maiazmcc15aqukgln0ep0or5kvaybogmcivmvqadycxcvgumaptj1ayknfmozh/hdnxaofarxmjnpvtilwyn7z+zvaj9ycm3eviqb/g/mysdxbteanxsxf62hv/pywfr0+kih2defz/pdu4feilb5b15tw5jrj6taflkzsiqupkt/a02g63gt3gq9sipd0/dylnzlqxmgp8d9nxasg==</latexit> <latexit sha1_base64="1wger7epyuseuk2erac3vfrpm4m=">aaacdhicdzblaxsxfiu10zqp9xgnza5zddwflcbzmifmaeimm0ikcrlidmmd+dqvly0usrnixpzo0v/qdxefko6zce0uah3op3tap1kcgzumv4lw2cbzza3tnc6ll69e73b33lwy2wikqyq51fcvgossxqflluov0gii4nhzzb7c88tb1ibj+tzofrycjjubmwrww2x3juf+8qjyo3ysgbrbkrwunx+zkkwfoqz1plxajvphpfhwyxa8xpnomwle6azrxunn5pfjxnszmv3lmlzlt5fg/fr+onwrxys77q32ywloyu7vfcrpi7c2limx11mqbofaw0y5tp28maiazmcc15aqukgln0ep0or5kvaybogmcivmvqadycxcvgumaptj1ayknfmozh/hdnxaofarxmjnpvtilwyn7z+zvaj9ycm3eviqb/g/mysdxbteanxsxf62hv/pywfr0+kih2defz/pdu4feilb5b15tw5jrj6taflkzsiqupkt/a02g63gt3gq9sipd0/dylnzlqxmgp8d9nxasg==</latexit> <latexit sha1_base64="1wger7epyuseuk2erac3vfrpm4m=">aaacdhicdzblaxsxfiu10zqp9xgnza5zddwflcbzmifmaeimm0ikcrlidmmd+dqvly0usrnixpzo0v/qdxefko6zce0uah3op3tap1kcgzumv4lw2cbzza3tnc6ll69e73b33lwy2wikqyq51fcvgossxqflluov0gii4nhzzb7c88tb1ibj+tzofrycjjubmwrww2x3juf+8qjyo3ysgbrbkrwunx+zkkwfoqz1plxajvphpfhwyxa8xpnomwle6azrxunn5pfjxnszmv3lmlzlt5fg/fr+onwrxys77q32ywloyu7vfcrpi7c2limx11mqbofaw0y5tp28maiazmcc15aqukgln0ep0or5kvaybogmcivmvqadycxcvgumaptj1ayknfmozh/hdnxaofarxmjnpvtilwyn7z+zvaj9ycm3eviqb/g/mysdxbteanxsxf62hv/pywfr0+kih2defz/pdu4feilb5b15tw5jrj6taflkzsiqupkt/a02g63gt3gq9sipd0/dylnzlqxmgp8d9nxasg==</latexit> <latexit sha1_base64="pbw0sdhom6qhmzd/wwjiwku8j84=">aaacdhicdzblaxsxfiu10zzj3uecdtkshphcstn5mecydgtttsgfoalkhugofo3klkakpakxyn5n6x/iurtcfxeyce0uah3op3tap1kcgzumv4lw2fmxg5tbl3uvxr95u93fexdhzkmpjqjkul9vyjczgkewwy5xsioiiunlnt994je3qa2t9bldkcwetgs2yrsst8r+tc794zhkb/lea3w3jwvdmd8wjys+r1nrfaq0g7eoeukt/exlt/ljm9oi0s2ivmabyophjfmxmvvlmlvlf5dgw/rhonwrxcs7hzbuzsr+ft6wtbfyw8rbmossvbzwoc2jhnte3hhuqocwxwtifsjuhzuifgj1yhv7wynau7hlcyvqgtbmiao1u4d9swpwus4tvoa/cds5lhyrvwoxpp4tputgxvefzcvu/rdk2yipltf8n0zcwejdiobqawiv256v57gd6glxmywzr78fdk6ou6k2yaeyr/zjro7icflkzsiiupkt/ak2gs3gd7gbdskp/56gqbfznqxmgp8fejg/9g==</latexit> <latexit sha1_base64="sj3cxbuym+npyyptzx8rh1qspzw=">aaacohicdzbbsxwxgia/svrtqnvte/mydc+csdmril6ebs9ecgqucs4wfbo/xenojihjieuyx+i/8dzr/qw99sacp/6cztcw2yovixl43ytwvbkqulfx/doyetc7935+4unjcwn540pz9dojkzvm1goykposr0mfl6lnus3otglckrd0mg/3x/npnwndzxlsr4psgyos9zld662suw0yhu6f430jtpoamuuy0u6idizjde0su4nmxduv7lvdz81w3n6mxwpfq6c9oenw9wtmdjg1n5ilyspbpwufgnpeizvnhwrlwuf1i6kmkwrdhna5xgov6dqnsaqyejqdeyxrkendzpqp0kewzityv6zaezlt5liolebmv89tfzd1vfsvpzl5biiu6hnfyttmsfvfo2+jkfpe8dozqebln5fapqvpe6wbvp5/hyrvw8lmu+p5akvv3x3ucrzgdb7conrgb7pwaifqawa38b1+wh1wf/wkholh56szwd83n2fkwe8/49+wpq==</latexit> <latexit sha1_base64="sj3cxbuym+npyyptzx8rh1qspzw=">aaacohicdzbbsxwxgia/svrtqnvte/mydc+csdmril6ebs9ecgqucs4wfbo/xenojihjieuyx+i/8dzr/qw99sacp/6cztcw2yovixl43ytwvbkqulfx/doyetc7935+4unjcwn540pz9dojkzvm1goykposr0mfl6lnus3otglckrd0mg/3x/npnwndzxlsr4psgyos9zld662suw0yhu6f430jtpoamuuy0u6idizjde0su4nmxduv7lvdz81w3n6mxwpfq6c9oenw9wtmdjg1n5ilyspbpwufgnpeizvnhwrlwuf1i6kmkwrdhna5xgov6dqnsaqyejqdeyxrkendzpqp0kewzityv6zaezlt5liolebmv89tfzd1vfsvpzl5biiu6hnfyttmsfvfo2+jkfpe8dozqebln5fapqvpe6wbvp5/hyrvw8lmu+p5akvv3x3ucrzgdb7conrgb7pwaifqawa38b1+wh1wf/wkholh56szwd83n2fkwe8/49+wpq==</latexit> <latexit sha1_base64="sj3cxbuym+npyyptzx8rh1qspzw=">aaacohicdzbbsxwxgia/svrtqnvte/mydc+csdmril6ebs9ecgqucs4wfbo/xenojihjieuyx+i/8dzr/qw99sacp/6cztcw2yovixl43ytwvbkqulfx/doyetc7935+4unjcwn540pz9dojkzvm1goykposr0mfl6lnus3otglckrd0mg/3x/npnwndzxlsr4psgyos9zld662suw0yhu6f430jtpoamuuy0u6idizjde0su4nmxduv7lvdz81w3n6mxwpfq6c9oenw9wtmdjg1n5ilyspbpwufgnpeizvnhwrlwuf1i6kmkwrdhna5xgov6dqnsaqyejqdeyxrkendzpqp0kewzityv6zaezlt5liolebmv89tfzd1vfsvpzl5biiu6hnfyttmsfvfo2+jkfpe8dozqebln5fapqvpe6wbvp5/hyrvw8lmu+p5akvv3x3ucrzgdb7conrgb7pwaifqawa38b1+wh1wf/wkholh56szwd83n2fkwe8/49+wpq==</latexit> <latexit sha1_base64="lztk82y5tan4omagz2eg64avd6e=">aaacohicdzdnsgmxfiuz/lv/qi7ddhyjcj2pklorbddubawrgjmmd+jtjz1mqpirs5gn8u3cuduncodobfc+gwmtsbupifk4jwnck8qmarogz97i6nj4xotudgvmdm5+obq4dkpfosg2qcieok9by8zybbpmmjyxcogngz6lnf1efnadsjorn5iuxjhdo2ctrse4k6lu6yt5u35vx/ejlgjqiyqvvswttvhz2kgxplhx5y97xzzjtrbwn8ke/l/qqpfpseygokqq79gloaxh3namtl5ohnlefprhnmoyehuajdaotpecqgksvwzbkdga1r2ohebauce2p/1qaifr3exph5oduro2uye6bll963pt2okty2vhmkcu66mnmtp1ghrsvg6x4labjnizbiydzgdwnubaldb1h2xf1fpdgf8/ng7ug46pn2t7o4oipsgkwsvrpeg2yr45ieekssi5iw/kktx5996l9+q9fv0d8qzvlsmqvi9patuwuq==</latexit> PM-2 select document d having highest score v i 2 s i +1 P[d c i ]+(1 ) ÿ j =i v j 2 s j +1 P[d c j ] with parameter λ trading of relatedness to aspect c i vs. all other aspects update s i for all query aspects as s i = s i + P[d c i ] q j P[d c j ] 105

Summary Returning relevant yet redundant results to users is not beneficial; MMR returns documents that are relevant and novel, i.e. different from already-returned results Results should cover different interpretations of the query to reflect different information needs; diversification methods (e.g., IA-SELECT and PM) return results that are relevant and cover different interpretations 106

Literature [1] C. D. Manning, P. Raghavan, and H. Schütze: Introduction to Information Retrieval, Cambridge University Press, 2008 (Chapter 12) [2] W. B. Croft, D. Metzler, and T. Strohman: Search Engines Information Retrieval in Practice, Pearson Education, 2009 (Chapter 7) [3] J. Carbonell and J. Goldstein: The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries, SIGIR 1998 107

Literature [4] R. Agrawal et al.: Diversifying Search Results, WSDM 2009 [5] V. Dang and W. Bruce Croft: Diversity by Proportionality: An Election-based Approach to Search Result Diversification, SIGIR 2012 [6] V. Dang and W. Bruce Croft: Term Level Search Result Diversification, SIGIR 2013 108