Bryan Huddleston
Contents
Introduction
Data Growth
The Growth of Data in Numbers
Why Do We Have All of This Data?
Will It Get Any Better?
Radio Frequency Identification (RFID)
Voice over IP (VoIP)
Digital Audio/Video
Getting Data to the User
The SQL Execution Process: What Happens When I Hit Execute?
SQL Optimization Challenges
Complex/Difficult Task
Time Consuming
Methodology
Go-To Person
Team Approach
Quest Methodology: Identify, Rewrite and Test
Implementation
Manual
Productivity Tools
Quest Tools
Conclusion
About the Author
About Quest Software
Introduction
If your shop has not experienced data growth in the last two to 10 years, you are in the minority. The explosion of the information age has increased the amount of data we send, receive, process and use. Whether it's customer or employee information, portals, dashboards or even personal data like MP3s and downloadable videos, our IT systems and personal computers are becoming just like my dad: pack-rat junkies for data.

This paper is the result of reading industry articles and analyst papers and conducting numerous customer interviews to draw some conclusions about what the future holds for data growth and what it means for application owners and, ultimately, the person responsible for application performance. It informs application owners about business considerations that result from data growth, and explains how these owners can prepare their applications by employing a methodical approach to optimizing code within their organization.

This paper focuses on what developers have control over, what will arguably create the biggest productivity gains and what it takes for management to adopt a "be prepared" approach. Ironically, this is usually the last approach to be implemented for data growth and SQL statement optimization. Lastly, this paper suggests a successful methodology for optimizing SQL code and explains how Toad for Oracle Xpert can make SQL optimization seamless and less time-consuming by identifying potential performance-limiting factors.

The reader should walk away with an understanding of the following areas:
- The breadth and growth of data
- What factors contribute to that growth
- How to apply that knowledge within their company
- How to incorporate a methodical approach for code optimization
- Ways to overcome potential performance issues when applications query data
Data Growth
This section discusses the impact of data growth in the industry, provides some examples of companies using data for a strategic advantage, explains why we are inundated by data growth and identifies future trends.
META Group: Net annual storage growth will average 20-25 percent for enterprise (monolithic), 50-55 percent for midrange (modular), and 80-85 percent for low-cost capacity-based (SATA/ATA) storage, yielding aggregate storage growth of about 45 percent per year.[1] According to Winter Corporation's annual 2003 storage survey, France Telecom's database (the top winner in database size at 29.2TB) was three times as big as the winners for 2001.[2] In addition to data storage, transaction numbers are increasing. The U.S. Bureau of Customs and Border Protection almost doubled its workload from 2001 to 51,450 tps.
The best example is Wal-Mart. Wal-Mart has 460 terabytes of data stored on Teradata mainframes at its Bentonville headquarters. To put that in perspective, the Internet has less than half that much data, according to experts.[3] Now you may be saying to yourself, "I'm a small-to-medium business (SMB); there's no way my data growth can come close to these examples." Wrong. At 2004 GigaWorld, Bob Zimmerman, Forrester analyst, cited an example of an SMB that acquired a company that subscribed to a service that would deliver a terabyte of new data every four to six weeks. This single service would increase data by 250 percent in 12 months. The article goes on to say that administrators were unprepared and ill-equipped to accept the first block of data. Another excellent example is the U.S. Air Force. The USAF is designing a flight data collection system to prototype in 2005. Each flight of a new aircraft will generate more than a terabyte of data. By the time this gets into production in 2006, data volumes are projected to approach 20 petabytes per year, excluding replication. Zimmerman explains that every IT shop should be prepared for a 200 percent unanticipated increase in data volume at any given time. This preparation must consider the impact of such extreme changes on operations, data security and applications.[4]
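The growth figures quoted above can be turned into a quick back-of-the-envelope projection. The sketch below is only an illustration: the 10TB starting volume and the function names are assumptions made for the example, while the 45 percent annual rate, the 200 percent surge and the terabyte-every-four-to-six-weeks feed come from the text.

```python
def compound(current_tb, annual_rate, years):
    """Volume after `years` of compound growth (0.45 means 45 percent per year)."""
    return current_tb * (1 + annual_rate) ** years


def after_increase(current_tb, percent_increase):
    """Volume after a one-off increase; a 200 percent increase triples the total."""
    return current_tb * (1 + percent_increase / 100)


def feed_tb_per_year(weeks_between_deliveries, tb_per_delivery=1.0):
    """Yearly volume of a feed that delivers a fixed amount every N weeks."""
    return 52 / weeks_between_deliveries * tb_per_delivery


if __name__ == "__main__":
    start_tb = 10.0  # hypothetical current volume, not a figure from the paper
    print("After 3 years at 45%/year:", round(compound(start_tb, 0.45, 3), 2), "TB")
    print("After a 200% surge:", after_increase(start_tb, 200), "TB")
    print("1 TB every 4 weeks:", feed_tb_per_year(4), "TB/year")
    print("1 TB every 6 weeks:", round(feed_tb_per_year(6), 2), "TB/year")
```

Under these assumptions, a single unanticipated 200 percent surge is roughly equivalent to three years of "normal" 45 percent growth arriving at once, which may help explain why the administrators in the SMB example were caught unprepared.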
of the products that were stocked for the storm sold out quickly.[3] The price/capacity of storage has been and will continue to be the driver for this trend. As if the 250GB drives for $120 (before rebate) in the Sunday Best Buy ad were not enough validation, META Group indicates that like-for-like price/capacity of storage will improve 35 percent per year.[1] These are the business reasons for the growth of data. In a management capacity, tying this information back to the business allows companies not merely to use IT but to exploit it to their advantage. In these cases, data and the ability to manipulate it are not only powerful, but also a competitive advantage in regulatory compliance, improved productivity and profit.
Digital Audio/Video
Everyone I know either has an iPod or iPod envy. It's the data, the digital video and music that is stored everywhere, that presents the future challenge. Drawing again on personal experience, I purchased an 80-hour, 160GB TiVo over Christmas. My family had missed the last two hours and 40 minutes of the Survivor finale because of a VCR setup error. Upon setting TiVo up, I read about TiVoToGo, which lets you copy shows to a laptop and take them with you. This is great for me because I travel, so I can put shows on my work laptop and watch them at my leisure. Just as retail will have to cope with RFID, media will have to cope with digital formats of video and music. I'd love to be able to link readers to the two-hour special on the History Channel I watched about the rise of Wal-Mart for this paper. Better yet, Google has introduced a new video search utility.[8] The real question is how this technology on the consumer side will be used in the corporate world. What digital media will be stored on corporate servers, and how will users have effective access to this stored data?
The following section describes the SQL execution process, what challenges face developers, development teams, QA teams and ultimately the people responsible for the performance of the application. It also suggests a methodology to overcome these challenges and demonstrates how Toad for Oracle Xpert from Quest can optimize SQL for improved application performance.
SQL statement execution is much like Russ and his plan for accomplishing multiple errands. A SQL statement with data from four tables in the database has to make four stops to pick up the data and deliver the results to the person querying the database. Process efficiency (the shortest distance that the SQL statement travels to retrieve its data from each of the four tables) determines the amount of time it takes to provide results. Only once this SQL statement has been identified can all paths through the Oracle database be calculated and tested and the shortest route determined. This process is called SQL tuning or SQL optimization.
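The errand analogy can be made concrete with a small sketch. It is an illustration of the analogy only, not a description of how the Oracle optimizer works: the four stops (A to D) stand in for four tables, and every cost in the table is invented for the example. The code simply lists every possible order of the four stops and picks the cheapest one.

```python
from itertools import permutations

# Hypothetical cost of travelling between stops; "start" is where the query begins.
# None of these numbers come from the paper.
COST = {
    ("start", "A"): 4, ("start", "B"): 2, ("start", "C"): 7, ("start", "D"): 3,
    ("A", "B"): 1, ("A", "C"): 5, ("A", "D"): 6,
    ("B", "C"): 3, ("B", "D"): 8,
    ("C", "D"): 2,
}


def step_cost(a, b):
    """Cost between two stops; the cost table is treated as symmetric."""
    return COST[(a, b)] if (a, b) in COST else COST[(b, a)]


def route_cost(route):
    """Total cost of visiting the stops in the given order, beginning at 'start'."""
    stops = ("start",) + tuple(route)
    return sum(step_cost(a, b) for a, b in zip(stops, stops[1:]))


def best_route(tables):
    """Try every ordering of the stops and return (best order, its cost, orders tried)."""
    routes = list(permutations(tables))
    best = min(routes, key=route_cost)
    return best, route_cost(best), len(routes)


if __name__ == "__main__":
    order, cost, tried = best_route(["A", "B", "C", "D"])
    print(f"Tried {tried} routes; cheapest is {' -> '.join(order)} at cost {cost}")
```

Even with only four stops there are 24 possible orders to evaluate, and the count grows factorially as stops are added, which gives a feel for why finding the shortest route by hand quickly becomes impractical.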
Complex/Difficult Task
Undoubtedly, this is a difficult task. There are an untold number of papers, books, articles and experts on the subject of the mechanical nature of tuning SQL. As an example, ask members of your team to explain how the database processes SQL statements and how they use that knowledge to optimize SQL, and see how many different answers you get.
Time Consuming
Below is an example of a very simple query that developers and DBAs create every day. This simple SQL statement can be rewritten a total of 11,901 times.

    select emp_name, dpt_name, grd_desc
    from employee, department DEPARTMENT1, grade
    where emp_grade = grd_id
      and emp_dept = dpt_id
      and EXISTS (SELECT 'X'
                  from department DEPARTMENT2
                  WHERE dpt_avg_salary in (select min(dpt_avg_salary)
                                           from department DEPARTMENT3)
                    AND dpt_id = EMPLOYEE.emp_dept)
The above SQL statement has 36 words in it. This sentence has 15 words and took me 22 seconds to type and spell check. Doing this task nonstop 11,901 times would take 261,822 seconds, 4,363.7 minutes, 72.7 hours or 3.03 days. Is a manual effort of rewriting and testing even the simplest of SQL statements an ideal use of developer/DBA time? If a simple statement takes this much time, think of the time required for a complex SQL statement.
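The arithmetic behind these figures is easy to reproduce. The short sketch below uses only the two numbers quoted in the text (11,901 rewrites and 22 seconds each); the function name is an assumption made for the example, and the estimate covers typing time only, with no allowance for actually running or testing each version.

```python
REWRITES = 11_901          # number of alternative versions quoted in the text
SECONDS_PER_REWRITE = 22   # typing time per rewrite quoted in the text


def manual_effort(rewrites, seconds_each):
    """Return the total effort as (seconds, minutes, hours, days)."""
    seconds = rewrites * seconds_each
    minutes = seconds / 60
    hours = minutes / 60
    days = hours / 24
    return seconds, minutes, hours, days


if __name__ == "__main__":
    seconds, minutes, hours, days = manual_effort(REWRITES, SECONDS_PER_REWRITE)
    print(f"{seconds:,} seconds = {minutes:,.1f} minutes "
          f"= {hours:.1f} hours = {days:.2f} days")
```

Because the 22 seconds covers typing alone, the three-day figure is best read as a lower bound on the manual effort.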
Methodology
Based on numerous customer interviews I have conducted over the past five years, most organizations have some sort of code review process in place for optimizing SQL. This process could be using a single individual with SQL optimization expertise (Go-To person) or using a QA or review team (Team Approach) to conduct SQL optimization prior to production. However, there are flaws associated with both processes.
Go-To Person
Some organizations have a Go-To person for optimizing SQL. A single individual provides the expertise to optimize SQL; however, this is only a subset of what that person does. Typically, this person is a senior or principal developer or a DBA who has, by default, become the Go-To person. Their time is spread across the projects they are currently working on and ad hoc requests from other developers. They are usually contacted only when a statement is noticeably slow. With this method, not all SQL statements will be optimized and, depending on the Go-To person's available time, even crucial statements may be overlooked.
Team Approach
Another method is to use a QA or review team to optimize SQL. During QA or load testing, the QA team may discover performance issues in the database. When this occurs, the QA team will send the offending code back to the developers for optimization. It is more efficient to optimize the code initially than to have QA identify an issue, send it to development for correction and retesting, then retest again in QA. This leads us to the recommended Quest methodology for optimizing SQL code, implementation strategies and an overview of Toad for Oracle Xpert.
2. Rewrite SQL
Transform the SQL to obtain different versions
- Get the original SQL's run time
- Always test run your SQL alternatives
- Do not completely trust the Oracle cost estimation
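The rewrite-and-test steps can be sketched in a few lines. The example below is a minimal illustration under loud assumptions: Python's built-in sqlite3 module stands in for an Oracle database, the table definitions and sample rows are invented, and the rewritten statement is one hand-written alternative (not output from Toad for Oracle Xpert) that is equivalent only if dpt_id is unique. It records the original statement's run time, runs the alternative, and checks that both return the same rows instead of relying on a cost estimate.

```python
import sqlite3
import time

# Original statement from the paper (layout changed only).
ORIGINAL = """
select emp_name, dpt_name, grd_desc
from employee, department DEPARTMENT1, grade
where emp_grade = grd_id
  and emp_dept = dpt_id
  and exists (select 'X' from department DEPARTMENT2
              where dpt_avg_salary in (select min(dpt_avg_salary)
                                       from department DEPARTMENT3)
                and dpt_id = EMPLOYEE.emp_dept)
"""

# Hypothetical alternative: the EXISTS test becomes a direct comparison.
# Assumes dpt_id uniquely identifies a department.
REWRITE = """
select e.emp_name, d.dpt_name, g.grd_desc
from employee e
join department d on e.emp_dept = d.dpt_id
join grade g on e.emp_grade = g.grd_id
where d.dpt_avg_salary = (select min(d2.dpt_avg_salary) from department d2)
"""


def build_db():
    """Create an in-memory database with invented tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table department (dpt_id integer primary key, dpt_name text,
                                 dpt_avg_salary real);
        create table grade (grd_id integer primary key, grd_desc text);
        create table employee (emp_name text, emp_grade integer, emp_dept integer);
        insert into department values (1, 'Sales', 50000), (2, 'Support', 30000),
                                      (3, 'R&D', 70000);
        insert into grade values (1, 'Junior'), (2, 'Senior');
        insert into employee values ('Ann', 1, 2), ('Bob', 2, 2),
                                    ('Cy', 2, 1), ('Dee', 1, 3);
    """)
    return conn


def run_timed(conn, sql):
    """Run a statement and return (sorted rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = sorted(conn.execute(sql).fetchall())
    return rows, time.perf_counter() - start


def compare(conn, original, candidate):
    """Time both statements and report whether they return identical rows."""
    base_rows, base_secs = run_timed(conn, original)
    cand_rows, cand_secs = run_timed(conn, candidate)
    return {"same_result": base_rows == cand_rows,
            "original_secs": base_secs,
            "candidate_secs": cand_secs}


if __name__ == "__main__":
    print(compare(build_db(), ORIGINAL, REWRITE))
```

On a toy data set like this the timings are not meaningful; the point of the sketch is the shape of the check. In practice the same comparison would be run against realistic data volumes, and an alternative that returns different rows would be rejected no matter how fast it runs.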
This should be implemented by making everyone responsible for the performance of the application at every phase of the lifecycle. No matter the skill/experience level or the role, everyone from the junior developer to the DBA should be responsible for application performance. (See graphic below).
This ensures that mechanics are in place for quality code creation. It takes the pressure off individuals who are inundated by requests within the organization and shortens QA/Testing time, thereby decreasing delivery time for applications to move to production.
Implementation
One can implement this methodology in multiple ways, based on each company's needs: either through a manual method or through the use of productivity tools.
Manual
With manual implementation, every member of the team must become an expert in optimizing SQL. This could be done through lunch and learn sessions or internal training sessions. Though effective, this method is both complex and time-consuming.
Productivity Tools
A more productive method for SQL optimization is the use of database tools that automate the code quality process. In an inquiry through Forrester, Noel Yuhanna explains, "Usually SQL tuning tools can help improve performance by several times, but it largely depends on how optimized the SQL query already is. Based on customer feedback, on two out of three occasions, SQL tuning tools usually help in improving the performance of a query. Typically developers focus mainly on the logic, not as much on performance, therefore the SQL tuning tool helps fill the gap, enhances productivity and makes applications run faster using less system resources."[9]
Quest Tools
Quest Software's de facto standard development tool, Toad for Oracle, has SQL optimization technology in the Xpert edition. This edition allows developers and DBAs of any skill level, in any phase of the project lifecycle and on most applications, to optimize and enhance the performance of SQL code. Toad for Oracle Xpert has embedded functionality to implement a SQL optimization methodology in any organization. Teams that already have in-house expertise in optimizing SQL will find increased productivity through the simple use of this tool. Those who have not yet implemented SQL optimization in their process will find the intuitive interface easy to use and implement. For more information on Toad for Oracle Xpert, please go to: http://www.quest.com/toad/index.asp
Conclusion
Data growth is, and will continue to be, one of the ongoing challenges IT organizations face. Technologies such as RFID, VoIP and digital media will be the largest data drivers companies contend with in the near future. Over the next few years, many companies will see exponential growth in collected data that is used strategically to maintain a competitive advantage. Understanding industry trends will enable IT management to be prepared to present solutions, not only to handle data growth efficiently, but also to install a methodology for SQL optimization. The best defense is a solid plan of preparation for potential data growth issues. For applications, that means providing developers with the tools to optimize SQL code, which leads to maximum application performance and satisfied end users. Companies must ensure that all individuals involved in application development and administration are responsible for data integrity and SQL code quality.
References:
[1] IT Imperative: Managing Robust Storage Growth, Carl Greiner and Rob Schafer, META Group, December 21, 2004.
[2] Winter Corporation's TopTen Grand Prize Winners, Kathy Auerbach, DM Review, March 2004.
[3] What Wal-Mart Knows About Customers' Habits, Constance L. Hays, The New York Times, November 14, 2004.
[4] Capacity Matters: Plan Ahead for Terabyte Data Growth, Bob Zimmerman, Forrester paper, July 12, 2004.
[5] CIOs' 'Must Do' Resolutions for 2005, J. Mahoney, M. McDonald and M. Raskino, Gartner Research Note, December 17, 2004.
[6] Four Giant Steps to Maximize the Business Value of IT, M. Gerrard, Gartner Commentary, December 22, 2004.
[7] The Tsunami of Data Growth, Elliot King, Windows IT Pro Magazine, March 1, 2004.
[8] Google to Branch Into Television, Michael Liedtke, AP Business Writer, January 25, 2005.
[9] SQL Tuning Tools Remain Important, Especially For Custom Applications, Noel Yuhanna, Forrester IdeaByte, January 2, 2004.
[10] IT moves into Voice Communications, Caron Carison, EWeek, December 20/27, 2004.
[11] Get ready for RFID, Renee Boucher Ferguson, EWeek, January 17, 2005.
World Headquarters 8001 Irvine Center Drive Irvine, CA 92618 www.quest.com e-mail: info@quest.com Inside U.S.: 1.800.306.9329 Outside U.S.: 1.949.754.8000 Please refer to our Web site for regional and international office information. For more information on Quest Software solutions, visit www.quest.com.
Copyright 2005 Quest Software, Inc. Toad and Toad for Oracle Xpert are registered trademarks of Quest Software. The information in this publication is furnished for information use only, does not constitute a commitment from Quest Software Inc. of any features or functions discussed and is subject to change without notice. Quest Software, Inc. assumes no responsibility or liability for any errors or inaccuracies that may appear in this publication.