Design, query and evaluate information retrieval systems
Introduction
Information Retrieval Systems (IRSs) are essential tools used by the majority, if not all, of Library and Information Professionals (LIPs). Information retrieval (IR) occurs after a user presents an information need in the form of a query to a database. IR is the process of the database seeking, locating, retrieving and presenting that information to the user. An IRS is a database of records. In some cases, these records will all be surrogate records. In other IRSs they will be both surrogate records and the records themselves. In either case, an LIP will often act as an intermediary between the person who has the query and the IRS. IRSs are not uniform in the way they are designed and structured, and a query often needs to be expressed in a specific way for a specific IRS. It is the LIP's role to direct the patron to the most appropriate IRS for his or her query and to show the patron how to translate the information need into a query that is readable by the IRS being used.
Designing an IRS begins the same way almost all user-oriented tasks begin: by considering the user population's needs. The designers of an IRS must decide what information is going to be stored and represented in the database, and whether to use pre-coordinate indexing, post-coordinate indexing or both. In short, the designers of an IRS need to consider the user in all their designing efforts. They should be concerned with how to provide the user with the most intuitive, precise IRS possible. To this end, designers will also want to figure out ways to reduce ambiguity, which arises when an IRS has to determine the meaning of a word that has one or more homonyms. In LISNews, Jeffrey Beall outlines an example of the need for disambiguation using the word "boxers." He says, "the word 'boxers' is a homonym with several different meanings, and the search engine doesn't know which meaning you want. Boxers are a breed of dog, a category of athlete, and a kind of men's garment" (Beall, 2010).
One way to address the problem of ambiguity is through a controlled vocabulary (as opposed to natural language). A controlled vocabulary limits the values that can be used for attributes. Although it doesn't allow for the intuitive user/IRS interface that natural language does, a controlled vocabulary reduces ambiguity and often provides more precise results. If the designers of an IRS want queries to be submitted using a controlled vocabulary, they need to create and establish those specific terms.
Some of the decisions that the designers of an IRS make in the initial stages have tradeoffs in terms of user friendliness. For example, pre-coordinate indexing allows strings of terms to be combined to describe a complex concept. The advantages of this are less ambiguity and higher precision (a larger share of the retrieved records is relevant). The user may find querying a pre-coordinate indexed IRS more difficult but might also be happier with the results. Conversely, a post-coordinate indexed IRS is more intuitive for the user, as post-coordinate IRSs allow many search terms to be combined at query time, usually with Boolean logic and descriptors. Although this type of indexing may be more user-friendly, it also tends to yield high recall (a large share of the relevant records is retrieved) but lower precision (a smaller share of the retrieved records is relevant).
While recall and precision are key indicators of the effectiveness of an IRS, Professor Enid Irwin has also introduced us to the concept of the Search Effectiveness Index (SEI), a measure created by a former SJSU SLIS student. All of the methods of evaluating an IRS are really looking to measure the value of the results retrieved. But remember, IRSs are designed to help a specific user with an information need.
Thus, the value of the results retrieved, as interpreted by the user, is often the best way to determine effectiveness. For example, an IRS could retrieve results that are highly relevant to a user's query but of little value (for instance, if the user already has the information retrieved). So, in all aspects of an IRS, from designing to querying and finally evaluating a system, the user plays an essential role. In fact, although designers of an IRS might create a system that is as user-friendly as possible, its effectiveness will also depend on how much knowledge the user has about IRSs in general and, specifically, about the type of system he or she is using.
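To make the contrast between natural-language matching and a controlled vocabulary concrete, here is a minimal Python sketch of post-coordinate searching. The records, the descriptor labels such as "boxers (dogs)", and the function names are all hypothetical and chosen only to echo the "boxers" example; this is not how DB/Textworks or any particular IRS is implemented.

```python
from typing import Dict, Set

# Hypothetical records, each indexed with controlled-vocabulary descriptors.
RECORDS: Dict[int, Set[str]] = {
    1: {"boxers (dogs)", "training"},
    2: {"boxers (athletes)", "training"},
    3: {"boxers (clothing)", "retail"},
    4: {"boxers (dogs)", "veterinary care"},
}


def build_index(records: Dict[int, Set[str]]) -> Dict[str, Set[int]]:
    """Map each descriptor to the set of record ids that carry it."""
    index: Dict[str, Set[int]] = {}
    for record_id, descriptors in records.items():
        for descriptor in descriptors:
            index.setdefault(descriptor, set()).add(record_id)
    return index


def search_and(index: Dict[str, Set[int]], *terms: str) -> Set[int]:
    """Post-coordinate search: descriptors are combined at query time with Boolean AND."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def natural_language_search(records: Dict[int, Set[str]], word: str) -> Set[int]:
    """Naive word matching: returns every record whose descriptors contain the word."""
    return {
        record_id
        for record_id, descriptors in records.items()
        if any(word in descriptor for descriptor in descriptors)
    }


INDEX = build_index(RECORDS)

# The bare word cannot tell dogs, athletes and clothing apart.
ambiguous = natural_language_search(RECORDS, "boxers")

# Controlled descriptors combined with AND narrow the result to one sense.
precise = search_and(INDEX, "boxers (dogs)", "training")
```

In this toy data the bare word matches all four records, while the two controlled descriptors together match only the dog-training record, which is the precision gain the controlled vocabulary is meant to deliver.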
Evidence #1: Assignment 2, Part A; LIBR 202: Database Design
In LIBR 202, Information Retrieval Systems, Professor Enid Irwin arranged students into groups and asked each to create a database using DB/Textworks. The goal of this assignment was to use the concepts we had read about (e.g., attributes, pre-coordination vs. post-coordination, etc.) in a very hands-on, practical way and design an IRS (a database). This task included outlining the client (user) group, the scope and goals of the IRS, the language we would use (natural vs. controlled), the rules of any controlled languages (pre-coordinate vs. post-coordinate and what the pre-coordinate levels and terms would be), search guidelines, a list of attributes and indexing rules. I'm very grateful that LIBR 202 was a mandatory class in the SLIS program. Based on my areas of interest, I don't think I would have taken the class if it weren't required, and I really believe I will be a better LIP for having completed this class and this database assignment. I am also grateful that we were able to complete this assignment in groups. My teammates and I relied heavily on each other to figure out how to apply the concepts we had learned in lectures and readings to the creation of a database. Although it was a difficult assignment, it helped me develop a clear competency regarding the inner workings of IRSs, and I believe I accomplished the goals Professor Irwin stated when she outlined the database assignment.
Evidence #2: Children's Literature Complete Database Presentation
Aristotle said, "Those that know, do. Those that understand, teach." I can't say that I will ever be a master of IRSs, but I have developed a thorough understanding of them and have found that I am capable of sharing my knowledge on the subject with others. In LIBR 210, Reference and Information Services, Professor Tash asked each of the students to research a database, compile any necessary information about that database and present that information to the rest of the class in an Elluminate session. Although I have used many IRSs in my personal, academic and professional work, I am including my presentation on the Children's Literature Complete Database (CLCD) because it systematically highlights how users can use CLCD most effectively. As I researched CLCD, I learned about some very valuable tools and practices. I was familiar with how to use CLCD before beginning the 210 assignment, but I was interested in approaching the database from the perspective of instructing others how to use it. During the process I developed a strong familiarity with the Help Desk and tutorial functions and, although I thought I knew how to use CLCD effectively, reviewing these helped me to 1) learn more about how to use the database than I had gleaned from my previous, self-directed work, and 2) learn to provide instructional information in a clear way. This was a good reminder that, no matter how familiar I think I am with an IRS, I should always take care to review the guidelines so I can query the IRS in a way that will give me high-value results.
Evidence #3: Assignment 2, Part B; LIBR 202: Database Evaluation
The second part of Assignment 2 required us to evaluate the database we had created. The goal of the assignment was to ascertain the usefulness of our database, emphasizing concepts we had learned in the semester such as recall, precision and a new concept called SEI. Developed by Marc Schatkun, a student at San Jose State University, the Search Effectiveness Index (SEI) is a formula intended to show the end user how close a given search comes to the perfect search result, where relevant records (P) and retrieved records (R) = 1 and 1, respectively, and non-relevant records retrieved (F) and relevant records not retrieved = 0 and 0, respectively (Schatkun, 2009). A "perfect search" would mean that all relevant records, and only relevant records, were retrieved in a query. A perfect SEI = 1; a failed search (a query that retrieved no relevant records) = 0. All other searches fall between these two values and are expressed either as a decimal or as a percentage. I used SEI as one of the ways I evaluated my group's (Team One) database. I also used two other measures, recall and precision, to evaluate the database. Even when these methods can't be employed in a strict mathematical way, it's important for any LIP to understand the concepts behind them and their importance in filling a user's information need.
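As an illustration of the two standard measures named above, the following Python sketch computes precision and recall for a single query. The record ids and the function name are hypothetical, and the numbers are not taken from the Team One evaluation. The SEI formula itself is not reproduced in this statement, so it is not implemented here.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    precision = relevant records retrieved / all records retrieved
    recall    = relevant records retrieved / all relevant records in the database
    """
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 4 relevant records exist, 8 records come back, 3 of them relevant.
relevant = {1, 2, 3, 4}
retrieved = {1, 2, 3, 5, 6, 7, 8, 9}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision = {precision}, recall = {recall}")  # precision = 0.375, recall = 0.75
```

A search that returned exactly the four relevant records would score 1.0 on both measures, while a search that returned no relevant records would score 0.0 on both, which mirrors the two end points described for SEI.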
Conclusion
I entered the LIP field with the goal of dedicating my career to the free flow of information in the context of social impact, so IRS studies were not exactly front and center in my mind when I started at SJSU. As I proceeded toward earning my degree (and in my professional work as a Young Adult Librarian), however, I have learned how critical these systems are to the realization of even my most lofty objectives. Through my work developing an IRS of my own, teaching and evaluating existing systems (such as CLCD), and understanding both traditional and new system measurement techniques (including SEI), I have developed a high degree of competency with the infrastructure required to ensure the information flow and exchange necessary for libraries to serve their purposes in society.
References
Beall, J. (2010, February 11). The importance of word-sense disambiguation in online information retrieval. LISNews. Retrieved from http://lisnews.org/importancewordsensedisambiguationonlineinformationretrieval
Meadow, C., Boyce, B., Kraft, D., & Barry, C. (2007). Text information retrieval systems. London, UK: Academic Press.
Schatkun, M. (2009). Hi LIBR 202 students. Retrieved May 15, 2009 from the San Jose State University ANGEL website for LIBR 202-03-04, Spring 2009 site: https://liffey.sjsu.edu/