
A Query Simulation System To Illustrate Database Query Execution

Brett Allenstein
Skyline Technologies, Green Bay, WI 54301
brett.allenstein@gmail.com

Andrew Yost, Paul Wagner, Joline Morrison
University of Wisconsin-Eau Claire, Eau Claire, WI 54701
yostac@uwec.edu, wagnerpj@uwec.edu, morrisjp@uwec.edu

ABSTRACT
The underlying processes that enable database query execution are fundamental to understanding database management systems. However, these processes are complex and can be difficult to explain and illustrate. To address this problem, we have developed a Java-based query simulation system that enables students to visualize the steps involved in processing DML queries. We performed a field experiment to evaluate the system, and the results suggest that the system improves student comprehension of the query execution process.

Categories and Subject Descriptors

H.2.7 [Information Systems]: Database Management Systems Query processing; K.3.1 [Computer Uses in Education]: Computer Assisted Instruction.

General Terms: None.

Keywords: Database query processing, computer-based simulation, visualization

1. INTRODUCTION

In addition to teaching applied topics such as data modeling, SQL structure and usage, and database application development in our database systems course, we also address the underlying structure of database management systems (DBMSs), including the steps a DBMS performs during query execution. Database students need to be familiar with the underlying query execution process in order to understand how the DBMS implements commits and rollbacks. The query execution process also impacts query optimization techniques, and provides the foundation for how the DBMS implements backup and recovery operations.

Unfortunately, the DBMS query execution process is complex and difficult to explain using conventional teaching methods such as lectures and static visual aids. In most DBMSs, the query execution process involves multiple processes that interact with different structures within both main memory and disk storage. Moreover, the process varies based on whether the query is reading data (e.g., SQL SELECT statement) or writing data (e.g., SQL INSERT, UPDATE, or DELETE). Historically, we (and many other instructors) have used the following techniques to teach this process: presenting the steps sequentially in written form; diagramming the process on a chalkboard or whiteboard; and repeatedly modifying the diagrams as the process continues.

Our experience indicates that the written presentation of the steps is somewhat difficult to follow, especially when conditional points of execution occur, such as when the statement is already in executable cache or the data that the statement retrieves is already in data cache. Diagramming this process on the board becomes difficult because many components change state during the process. Students have historically expressed confusion and frustration with this part of the course.

As a result, we decided to develop an interactive application that simulates the execution of an SQL query and allows the user to step through the process and examine a variety of data, memory, and file components at different points during the process. Our university uses Oracle [11] as an example of a large commercial DBMS, so we have developed the system based on the Oracle DBMS's query process flow. However, the application could be extended to represent variations found in other DBMSs such as SQL Server, DB2, MySQL, or PostgreSQL.

The purpose of this paper is to describe the prototype application and its evaluation. We also describe planned future system enhancements based on the evaluation results.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCSE '08, March 12-15, 2008, Portland, Oregon, USA. Copyright 2008 ACM 978-1-59593-0/08/0003$5.00.

2. BACKGROUND

Early learning theory researchers (e.g., [6]) suggest that learning follows a VAK model, whereby students learn through a combination of visual, auditory, and kinesthetic experiences, or a VARK model [3], which expands the VAK model to include reading/writing experiences. More recent research (e.g., [14]) suggests that today's students favor a higher degree of sensory involvement in the learning process, and respond well to a practice-to-theory approach to learning that provides immediate and concrete gratification. Prior research suggests that computerized learning programs that communicate information using enhanced interactive visual stimuli have had success in a variety of domains (e.g., [8, 10, 15]).

There is not a large body of literature supporting the benefits of visualization in computer science education, and some results have been conflicting, but there have been some positive results [5, 9]. Visualization is not a panacea for all student comprehension problems, but it can be a useful tool for relating to modern learning styles and providing another way to think about a given concept. Care must still be taken not to assume that the visualization by itself is sufficient to ensure student learning; instructors must ensure that students are engaged in using the visualization in order to gain its benefits [9]. Instructors have also used customized applications specifically to teach database systems concepts (e.g., SQL [4, 12] and JDBC [2]), and have used a simplified DBMS engine to teach internals [13].

3. SYSTEM DESCRIPTION

3.1 Overview

Our first task was to develop a unified model of the Oracle query execution process [1, 7] in terms of background processes, memory structures, and disk structures used. We supplemented this material with information gained through Oracle's Web site [11]. We then implemented this model to create an application that enables users to interactively navigate through the steps commonly found in the execution of Oracle data manipulation language (DML) statements.

After the user starts the query simulation application (see Figure 1), he or she is presented with the option of using Static execution (executing one of four predefined statements) or Interact with database (connecting to a live Oracle database and executing any desired command). Under the Static execution option, the user can opt to execute one of the four major types of SQL statements (SELECT, INSERT, UPDATE and DELETE). This enables the user to see the difference in execution between a read (SELECT) statement and a read/write (INSERT, UPDATE, DELETE) statement. The statements are simple but representative:

    SELECT * FROM employees
    UPDATE employees SET name = 'John Doe' WHERE name IS NULL
    INSERT INTO employees (name) VALUES ('John Doe')
    DELETE FROM employees WHERE name = 'John Doe'

Under the Interact with database option, the user provides a username, password, host name, port and database instance name. He or she connects to the specified Oracle database instance, and can then enter and execute any valid SQL DML statement (SELECT, INSERT, UPDATE, or DELETE).
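The connection parameters collected under the Interact with database option map naturally onto a JDBC connection. The following is a minimal sketch, not the authors' code: the class and method names are hypothetical, it assumes an Oracle thin JDBC driver is on the classpath, and it treats the database instance name as the SID in the driver's host:port:SID URL form.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper; names are illustrative and not taken from the paper.
class OracleConnectionSketch {

    // Builds a thin-driver URL of the form jdbc:oracle:thin:@host:port:SID.
    static String buildThinUrl(String host, int port, String instanceName) {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + instanceName;
    }

    // Opens a connection with the user-supplied credentials.
    // Requires an Oracle JDBC driver at run time (assumption).
    static Connection connect(String user, String password,
                              String host, int port, String instanceName)
            throws SQLException {
        return DriverManager.getConnection(
                buildThinUrl(host, port, instanceName), user, password);
    }

    public static void main(String[] args) {
        // Only the URL is printed here, so no database is needed to run this.
        System.out.println(buildThinUrl("dbhost.example.edu", 1521, "ORCL"));
    }
}
```

The host name, port, and instance name in main are placeholders; a real session would pass the values the user typed into the connection dialog.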

Figure 1. Query Simulation Application

Figure 2. Memory tab panel

Figure 3. Disk tab panel

The query simulation application allows the user to navigate through each step of the query process using Back, Restart, and Next buttons. During the simulation process, there are several points where the next step depends on the answer to a question such as "Is the data for this query already in data cache?" At these points, the system prompts the user to answer the current question to specify how the query process will proceed. The application maintains and displays the state of each DBMS memory and disk component, and specifies whether that component is inactive (gray), active (red), or no longer used (blue). The application uses the state information to provide three views of the query execution process, each of which can be seen as a tabbed panel:

1. Main memory structures (e.g., the system global area, including various caches)
2. Disk structures (e.g., the data files, as well as other structures used such as rollback segments and redo logs)
3. Overview (combined memory and disk view, showing the overall process)
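The Back, Restart, and Next navigation with user-answered branch points can be sketched as follows. This is an illustrative model only: the class StepNavigator, the step names, and the single branch question are assumptions made for the example and do not reproduce the application's actual steps.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of step navigation with one branch point.
class StepNavigator {
    // Illustrative step names; the real application defines its own steps.
    static final String PARSE = "Parse statement";
    static final String CHECK_CACHE = "Is the data already in data cache?";
    static final String READ_DISK = "Read data blocks from disk";
    static final String RETURN_ROWS = "Return rows to client";

    private final Deque<String> history = new ArrayDeque<>(); // visited steps, for Back
    private String current = PARSE;

    String current() { return current; }

    // Next: at the branch point, the user's yes/no answer selects the path.
    // The answer is ignored at steps that have only one successor.
    void next(boolean answerYes) {
        String following;
        switch (current) {
            case PARSE:       following = CHECK_CACHE; break;
            case CHECK_CACHE: following = answerYes ? RETURN_ROWS : READ_DISK; break;
            case READ_DISK:   following = RETURN_ROWS; break;
            default:          return; // last step: nothing to advance to
        }
        history.push(current);
        current = following;
    }

    // Back: return to the most recently visited step, if any.
    void back() {
        if (!history.isEmpty()) current = history.pop();
    }

    // Restart: forget the path taken and return to the first step.
    void restart() {
        history.clear();
        current = PARSE;
    }
}
```

Keeping a history stack, rather than decrementing an index, lets Back retrace whichever branch the user actually took.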

The three views are available as tabs within the main Tabbed View panel, on which the Overview panel appears by default, as shown in Figure 1. The application also displays two informational panels positioned below the main Tabbed View panel. On the lower left, the Control panel presents a description of the current query execution step selected by the user. For example, the description might state that the client's query is currently in the System Global Area of the Oracle instance, or that the query is currently being parsed. On the lower right, the Definition panel presents a description of any selected component in any of the tabbed views. For example, the user can select the Shared SQL Pool in the Memory view to view a description of the Shared SQL Pool in the Definition panel.

Figures 2 and 3 show screenshots of the (Main) Memory tab panel and the Disk tab panel. These views show the state of each memory or disk structure (Not Used, In Use, or Completed) in relation to the current process state.

When the user enters a query statement interactively, the visualization system also shows the database system's optimized execution plan, generated by the explain plan functionality of Oracle. Users find this useful because they can immediately see how internal execution plans change with the choices they make in their queries, such as the search conditions and the number of tables involved in join queries. It also allows users to see how creating indexes on columns impacts query execution.

Figure 4 shows an example query explain plan for a user-supplied query.
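One way to obtain such a plan over JDBC is Oracle's EXPLAIN PLAN statement followed by a query of DBMS_XPLAN.DISPLAY. The paper does not say how the application retrieves the plan, so the sketch below is an assumption; the class and method names are hypothetical, and it presumes an open connection and the default plan table.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the application's actual code.
class ExplainPlanSketch {

    // Wraps a user-supplied DML statement so that Oracle records its plan
    // instead of executing it.
    static String explainStatementFor(String sql) {
        return "EXPLAIN PLAN FOR " + sql.trim();
    }

    // Records the plan, then reads it back as formatted text lines.
    static List<String> fetchPlan(Connection conn, String sql) throws SQLException {
        List<String> lines = new ArrayList<>();
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(explainStatementFor(sql));
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())")) {
                while (rs.next()) {
                    lines.add(rs.getString(1));
                }
            }
        }
        return lines;
    }
}
```

Running fetchPlan for the same query before and after creating an index is a simple way to observe the plan change the text describes.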

4. SYSTEM EVALUATION

We have used this Oracle simulation system in one offering of our Database Systems course, and collected comments from the students to initially determine possible benefits. While acknowledging that a formal review of this system is the next step, we think that the student comments provide initial evidence of the value of the system.

4.1 Perceived Usefulness

The following question and associated responses address the perceived usefulness of the system:

Do you feel that using the Oracle Query Simulator was useful to you in better understanding the execution of a query under the Oracle DBMS? If so, how? If not, why not?

All of the students reported that they found the system to be useful. Multiple students cited the visual aspects of the system as better than text explanations, and described how the system helped them understand the timing of different operations. They also appreciated having control over the learning process. Sample responses include: I think that it was definitely useful in understanding. Looking at a PowerPoint is one thing, but when you can actually walk through step by step, and go backwards to repeat something you don't understand makes it a lot easier to learn. It is also very helpful to be able to flip through the different views and see where things are happening with respect to each other. The simulator was useful for me in learning the processes and getting a more visual feel of how things work internally. I think it helped me because I am somewhat of a visual learner. It helped to separate the aspects of the system, and really flesh out the sequence of the query processing (e.g. when the redo log buffer is written to, vs. when the redo log files are written to). It was a great way to visualize the execution of a query. The simulation was far better than any explanation. The simulation was simple and to the point allowing you to understand the concepts very quickly.

Figure 4. Interactive query execution plan

3.2 System Design

The system is designed to be as modular as possible. Early in the design process, we decided to separate the functionality of the query steps from each of the DBMS components in the simulated system. We also realized that it was desirable to use a separate class to manage the step transition process rather than using the step objects themselves. This relieves the step objects from having to know which steps precede and follow them. Therefore, we developed a general base class (QuerySimulationStep) for the simulation steps, and currently have sixteen child classes that derive from this base class, representing sixteen query process steps. A separate singleton class (ComponentStateManager) tracks the state of activity for each DBMS component by keeping a hash map associating component names with their activity state. Another class (SimulationStepTransitionManager) maintains an ArrayList of all of the query steps and returns the appropriate step object when the user moves through them. This organization makes it fairly easy to add both new components and new steps to the application.
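The class organization described above might look roughly like the following. The class names QuerySimulationStep, ComponentStateManager, and SimulationStepTransitionManager come from the text; everything else (the method names, the ActivityState enum, the example step, and the component name it touches) is hypothetical and shown only to illustrate the separation of concerns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Activity states shown in the views (enum and constant names are assumptions).
enum ActivityState { INACTIVE, ACTIVE, COMPLETED }

// Singleton that maps component names to their activity state.
class ComponentStateManager {
    private static final ComponentStateManager INSTANCE = new ComponentStateManager();
    private final Map<String, ActivityState> states = new HashMap<>();

    private ComponentStateManager() { }

    static ComponentStateManager getInstance() { return INSTANCE; }

    void setState(String component, ActivityState state) { states.put(component, state); }

    // Components that no step has touched yet are reported as inactive.
    ActivityState getState(String component) {
        return states.getOrDefault(component, ActivityState.INACTIVE);
    }
}

// Base class for steps; a step knows nothing about its neighbours.
abstract class QuerySimulationStep {
    abstract String description();
    abstract void apply(ComponentStateManager components);
}

// Example child class (a hypothetical step).
class ParseStatementStep extends QuerySimulationStep {
    String description() { return "The statement is being parsed."; }
    void apply(ComponentStateManager components) {
        components.setState("Shared SQL Pool", ActivityState.ACTIVE);
    }
}

// Owns the ordered list of steps and hands out the current one.
class SimulationStepTransitionManager {
    private final List<QuerySimulationStep> steps = new ArrayList<>();
    private int index = 0;

    void addStep(QuerySimulationStep step) { steps.add(step); }

    QuerySimulationStep current() { return steps.get(index); }

    // Moves forward one step, staying put at the last step.
    QuerySimulationStep next() {
        if (index < steps.size() - 1) index++;
        return current();
    }

    // Moves back one step, staying put at the first step.
    QuerySimulationStep back() {
        if (index > 0) index--;
        return current();
    }
}
```

Under this arrangement, adding a step means writing one new subclass and registering it with the transition manager; no existing step needs to change.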

4.2 Positive System Aspects

The following question and associated responses address the positive aspects of the system:

What do you feel are the positive aspects of the Oracle Query Simulator?

Students appreciated the visual aspects of the system and its use of color. Multiple students lauded the system's simplicity and ease of use. Representative responses are as follows: This tool helped me to understand some of the distinctions in Oracle Query Processing between memory processes and disk processes. I used the simulation to flip back and forth between the views to carefully look at exactly what was happening at each stage. The overview view also puts all the pieces together between the different processes to produce a coherent view of the processes behind Oracle query processing. It is great to get a visual look at what is happening and seeing the process slowed down. I like that the simulator has a lot of explanations (both in the step and definitions).

It's a way to help students learn about the query process in a fun way. The questions are a great aspect because it gives you control over how which way the process will go. You can even use them to test yourself to see if you know where it will branch off to in accordance to how to answer the questions. The simplicity of how the system is modeled in the simulation. Too much detail could distract from the main functions. This simulator was very simple and straightforward to understand. In addition the use of distinct colors helps to assure me not only when a component is in use but when it is complete which helped me better follow the flow of the query.

4.3 Potential System Improvements

The following question and associated responses address potential system improvements:

What additions or improvements could make the Oracle Query Simulator more useful to you and other students?

The most-requested improvements involved being able to see all system views simultaneously, and providing information regarding what data is written to the DBMS files. Representative responses include: I would like it if the program were able to show all three views on a single maximized view and then show/explain how the three view are related if at all. When questions are asked, you can't look at the other views until the question is answered. Maybe this could be changed. Also, maybe there could be some feedback about whether questions are answered correctly. Possibly all views could be placed on one screen, but the tabs are nice. It would be nice to see the actual data that is stored in the different parts of memory. It would also be nice to maybe put everything on one screen allowing arrows to be drawn to each level and to explicitly show the database hierarchy. Overall it is completely functional but it would be nice to show more information and allow for more features.

5. CONCLUSIONS AND FUTURE DIRECTIONS

We have developed a prototype application for visualizing Oracle database query execution that allows students to interactively examine and understand the query process. An initial usage and informal evaluation with a database systems class indicates that this application seemed to improve student understanding of the query process. A formal evaluation is needed to confirm this hypothesis.

We envision several possible enhancements to this system. First, we hope to add support for other DBMSs, such as DB2, SQL Server, MySQL, and PostgreSQL. This would be useful for illustrating variations in the query execution process across these systems. Second, we wish to explore displaying additional system information to provide users with more information about the internal state of the DBMS throughout the query execution process. Finally, after some internal cleanup work, we will make this system freely available to the general CS education community.

6. REFERENCES

[1] Abbey, M., Corey, M., and Abramson, I.; Oracle 8i: A Beginner's Guide, Oracle Press, 1999.
[2] Dietrich, S., Urban, S., Kyriakides, I.; JDBC Demonstration Courseware Using Servlets and Java Server Pages. Proceedings of the 33rd SIGCSE Technical Symposium on Computer Science Education, 34, 1 (Feb. 2002), 266-270.
[3] Fleming, N.D. & Mills, C. Not Another Inventory, Rather a Catalyst for Reflection. To Improve the Academy, 11 (1992), 137-155.
[4] Guimaraes, M., The Kennesaw Database Courseware, Journal of Computing Sciences in Colleges, 21, 3 (Feb. 2006).
[5] Hansen, S. R., Narayanan, N. H., and Hegarty, M., Designing Educationally Effective Algorithm Visualizations: Embedding Analogies and Animations in Hypermedia. Journal of Visual Languages and Computing, 13(2):291-317, Academic Press, 2002.
[6] Kolb, D. A. and Fry, R., Toward an Applied Theory of Experiential Learning, in C. Cooper (Ed.), Theories of Group Process, London: John Wiley (1975).
[7] Loney, K., and Koch, G., Oracle 9i: The Complete Reference, Oracle Press, 2002.
[8] Mayer, R. E., & Gallini, J. K., When is an Illustration Worth Ten Thousand Words? J. of Ed. Psych. 82(4), 715-726 (1990).
[9] Naps, T. L., Rößling, G., Almstrum, V., Dann, W., Fleischer, R., Hundhausen, C., Korhonen, A., Malmi, L., McNally, M., Rodger, S., and Velázquez-Iturbide, J. Á.; Exploring the Role of Visualization and Engagement in Computer Science Education. SIGCSE Bulletin 35, 2 (June 2003), 131-152.
[10] Ollerenshaw, A., Aidman, E., and Kidd, G. (1997), Is an Illustration Always Worth Ten Thousand Words? Effects of Prior Knowledge, Learning Style, and Multimedia Illustrations on Text Comprehension. Int. J. of Instructional Media 24, 3, p. 227-238.
[11] Oracle DBMS; http://www.oracle.com.
[12] Sadiq, S., Orlowska, M., Sadiq, W., Lin, J.; SQLator: an Online SQL Learning Workbench, Proceedings of the 9th Conference on Innovation and Technology in Computer Science Education, 36, 3 (June 2004), 223-227.
[13] Sciore, E.; SimpleDB: A Simple Java-Based Multi-User System for Teaching Database Internals, Proceedings of the 38th SIGCSE Technical Symposium on Computer Science Education, 39, 1 (Mar. 2007), 561-565.
[14] Schroeder, C., New Students - New Learning Styles, http://www.virtualschool.edu/mon/Academia/KierseyLearningStyles.html.
[15] Taylor, H.A., Renshaw, C.E., and Choi, E.J.; The Effect of Multiple Formats on Understanding Complex Visual Displays. J. of Geosci. Ed., 52, 2 (March 2004), 115-121.
