Retrieval evaluation serves two purposes: to assess how well the IR system is performing, and to compare the performance of the IR system with that of other systems objectively. Retrieval evaluation is a critical and integral component of any modern IR system.
interested in the site or not. The user's engagement with the search experience defines their site behavior, including the likelihood of purchasing or completing certain transactions. This calls for studying and analyzing the behavior of site users, their action paths, decision points, interest areas, etc. Relevancy means the relationship between things or events.
Consider:
R: the set of relevant documents
A: the answer set for I, generated by an IR system
Ra: the intersection of the sets R and A
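Precision: the fraction of the answer set A that is relevant, i.e. |Ra| / |A|.
Recall: the fraction of the relevant set R that has been retrieved, i.e. |Ra| / |R|.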
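The following is a minimal sketch of how precision (|Ra| / |A|) and recall (|Ra| / |R|) could be computed from the sets R, A and Ra defined above. The function name `precision_recall` and the document identifiers are hypothetical and chosen only for illustration. Returning 0.0 for an empty set is an assumption of this sketch.

```python
def precision_recall(relevant, answer):
    """Return (precision, recall) for a relevant set R and an answer set A."""
    retrieved_relevant = relevant & answer  # Ra: the intersection of R and A
    # Assumption: an empty answer set or empty relevant set yields 0.0.
    precision = len(retrieved_relevant) / len(answer) if answer else 0.0
    recall = len(retrieved_relevant) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 relevant documents, 3 retrieved, 2 in common.
R = {"d1", "d2", "d3", "d4"}
A = {"d2", "d4", "d9"}
precision, recall = precision_recall(R, A)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three retrieved documents are relevant, and two of the four relevant documents are retrieved.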
Precision of Google
Google, being one of the most popular search engines on the Internet, was selected as one of the search engines for comparison. Google focuses on the link structure of the Web to determine relevant results and is representative of the variety of easy-to-use search engines. This study measures the relevance of the web sites retrieved for each search query. Advanced search options were used for retrieving sites. Only English pages were searched for each query, since web pages in other languages would be difficult to assess for relevancy. The search query was required to appear in the title of the web page. Since the number of search results retrieved was large, only the first 100 sites were selected for analysis.
The E-measure at the j-th position in the ranking is

$$E(j) = 1 - \frac{1 + b^2}{\frac{b^2}{r(j)} + \frac{1}{P(j)}}$$

where
r(j) is the recall at the j-th position in the ranking
P(j) is the precision at the j-th position in the ranking
b ≥ 0 is a user-specified parameter
E(j) is the E metric at the j-th position in the ranking

The parameter b is specified by the user and reflects the relative importance of recall and precision.
If b = 0, then E(j) = 1 - P(j): low values of b make E(j) a function of precision.
If b → ∞, then E(j) → 1 - r(j): high values of b make E(j) a function of recall.
For b = 1, the E-measure becomes the complement of the F-measure, E(j) = 1 - F(j).
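The behavior of the parameter b can be checked numerically. The sketch below assumes the standard form E = 1 - (1 + b²) / (b²/r + 1/P). The function name `e_measure` is hypothetical, and returning 1.0 when precision or recall is zero is a convention adopted here, not something stated in the text.

```python
def e_measure(precision, recall, b=1.0):
    """E-measure for given precision and recall (both in [0, 1]) and b >= 0."""
    # Assumed convention: no relevant document retrieved means the worst score.
    if precision == 0 or recall == 0:
        return 1.0
    return 1.0 - (1.0 + b * b) / (b * b / recall + 1.0 / precision)


p, r = 0.5, 0.2
e_b0 = e_measure(p, r, b=0.0)      # depends on precision only: 1 - P
e_b1 = e_measure(p, r, b=1.0)      # complement of the F-measure: 1 - F
e_big = e_measure(p, r, b=1000.0)  # approaches 1 - r as b grows
print(e_b0, e_b1, e_big)
```

With b = 0 the result equals 1 - P exactly, and for a large b it is very close to 1 - r.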
Ranking for query (an asterisk marks a relevant document): 1. d68* 2. d48 3. d140 4. d19* 5. d13 6. d151 7. d2 8. d5 9. d55* 10. d121
Calculated measures for query: Document d68 corresponds to 10% of all the relevant documents in the set Rq, giving a precision of 1/1, i.e. 100%, and a recall of 1/10, i.e. 10%. The E-measure can be calculated using the formula.
Precision(%) 100 50 33
Recall(%) 10 20 30
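The precision and recall rows above can be reproduced by walking down the ranking and recording both values each time a relevant document is found. This sketch assumes the relevant documents in the ranking are d68, d19 and d55 (ranks 1, 4 and 9), as the precision values imply, and that Rq contains 10 relevant documents in total. The function name is hypothetical.

```python
def precision_recall_at_relevant(ranking, relevant_in_ranking, total_relevant):
    """Return (doc, precision, recall) at each rank holding a relevant document."""
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant_in_ranking:
            hits += 1
            points.append((doc, hits / rank, hits / total_relevant))
    return points


ranking = ["d68", "d48", "d140", "d19", "d13", "d151", "d2", "d5", "d55", "d121"]
relevant_in_ranking = {"d68", "d19", "d55"}  # assumed from the precision row
points = precision_recall_at_relevant(ranking, relevant_in_ranking, total_relevant=10)
for doc, precision, recall in points:
    print(f"{doc}: precision={precision:.0%} recall={recall:.0%}")
```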
Type 2:
Search query: University of Pune
Query type: phrase search query
Rq = {d5, d7, d19, d23, d58, d70, d99, d190}
Ranking for query: 1. d58 2. d5 5. d99 6. d70*
Calculated measures for query:
Document d190 d70 d23
Precision(%) 33 33 38
Recall(%) 13 25 38
E-Measure (b=1) 0.81 0.72 0.62
Type 3:
Search query: Passport AND office
Query type: two word searches connected by a Boolean AND
Rq = {d2, d5, d10, d55, d80, d90, d125, d150, d200, d250}
Ranking for query: 1. d55* 2. d5 3. d80* 4. d125 5. d200* 6. d250 7. d2* 8. d90 9. d100 10. d150*
Calculated measures for query: Document d55 corresponds to 10% of all the relevant documents in the set Rq, giving a precision of 1/1, i.e. 100%, and a recall of 1/10, i.e. 10%.
Precision(%) 100 66 60 57 50
Recall(%) 10 20 30 40 50
E-Measure (b=1) 0.81 0.69 0.60 0.53 0.50
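The E-Measure row can be rechecked from the precision and recall rows. The sketch below assumes the relevant documents sit at ranks 1, 3, 5, 7 and 10, which is what precision values of 100, 66, 60, 57 and 50 imply, with 10 relevant documents in total and b = 1.

```python
relevant_ranks = [1, 3, 5, 7, 10]  # assumed from the precision row
total_relevant = 10
b = 1.0

e_values = []
for hits, rank in enumerate(relevant_ranks, start=1):
    precision = hits / rank
    recall = hits / total_relevant
    # E-measure: 1 - (1 + b^2) / (b^2 / recall + 1 / precision)
    e = 1.0 - (1.0 + b * b) / (b * b / recall + 1.0 / precision)
    e_values.append(e)
    print(f"rank {rank:2d}: P={precision:.2f} r={recall:.2f} E={e:.2f}")
```

Under these assumptions the value at rank 7 comes out at about 0.53, and the first value is about 0.818 before any rounding or truncation.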
Type 4:
Search query: admission requirements AND ME
Query type: a phrase search and a word search connected by a Boolean AND
Rq = {d10, d25, d5, d1, d70, d65, d100, d15, d150, d80}
Ranking for query: 1. d25 2. d5 3. d1 4. d70* 5. d65 6. d100* 7. d10 8. d80 9. d15* 10. d150
Calculated measures for query: The precision at document d70 is 1/4, i.e. 25%, while the recall is 1/10, i.e. 10%.
Precision(%) 25 33 33
Recall(%) 10 20 30
Conclusion
Relevancy is an important concept in evaluating search. Search engine relevancy is a key feature that often tends to be ignored, resulting in losses of time, money and effort spent fixing and tuning the effectiveness of the engine. By giving adequate thought to the various parameters that can boost relevancy, organizations can achieve a self-sustaining, intelligent and useful search system. Precision, recall and the E-measure are the measurements best suited to evaluating a search engine.