<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title>Disqus - Latest Comments for jaimearguello</title><link>http://disqus.com/by/jaimearguello/</link><description></description><atom:link href="http://disqus.com/jaimearguello/comments.rss" rel="self"></atom:link><language>en</language><lastBuildDate>Thu, 12 Jun 2008 14:14:12 -0000</lastBuildDate><item><title>Re: window office</title><link>http://windowoffice.tumblr.com/post/38033535#comment-647182</link><description>&lt;p&gt;I think it depends on how click-throughs are used to train the ranker. I can think of two ways to "learn to rank" using click-through data.&lt;/p&gt;&lt;p&gt;(1) Treat clicks as surrogates for relevance and use pairs of predicted (inferred from click-throughs) relevant and non-relevant documents to learn to rank. I think this is the most common approach.&lt;/p&gt;&lt;p&gt;(2) Compare "good" retrievals to "bad" retrievals to learn to rank. Let's say some user runs query A and doesn't click on anything (let's call this retrieval R_1). A different user runs query B and clicks on the top 10 documents (let's call this retrieval R_2). One could argue that R_2 was a more successful retrieval than R_1. This evidence could be used to learn a model that (a) produces a ranking similar to R_2 in response to query B and (b) *does not* produce a ranking like R_1 in response to query A.&lt;/p&gt;&lt;p&gt;I don't know of anyone who does (2). Perhaps it's a terrible idea. However, it seems like a way of getting the most data for training. Click-throughs tend to focus on the top 10 docs, but by comparing "successful" retrievals with "unsuccessful" ones, we might be able to use evidence beyond rank 10.&lt;/p&gt;&lt;p&gt;Any thoughts?&lt;/p&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jaimearguello</dc:creator><pubDate>Thu, 12 Jun 2008 14:14:12 -0000</pubDate></item></channel></rss>