
Top 10 DB2 Support Nightmares & How to Avoid Them

Triton Consulting The Royal 25 Bank Plain Norwich NR2 4SF Tel: 0870 241 1550 Fax: 0870 241 1549 email:sales@triton.co.uk

Over many years of providing DB2 support and consultancy services we have come across a huge array of support nightmares and difficult scenarios. In this document we share with you our top 10 and give you some advice on how to avoid such situations occurring in your own organisation.

1. Unintended Consequences
The Set-up
We received a support call from a very worried Junior DBA who had discovered that all the rows had been deleted from a critical table in a pre-production environment. The DBA admitted to connecting to the wrong system to clear down the table, and he shouldn't really have had the authority to perform such a task! The first priority was to get the data back from a recent backup; luckily, no issues were encountered during this part of the job.

We then needed to find out exactly why the Junior DBA had access to the table in the first place. This is what we discovered.

The Junior DBA had connected to the wrong system by mistake, using the instance owner userid.

The System Administrator had been trying to get federation to work and, in addition to enabling the FEDERATED database manager parameter, had also set the FED_NOAUTH (bypass federated authentication) parameter to YES.

When FED_NOAUTH is set to YES, FEDERATED is set to YES, and authentication is set to SERVER or SERVER_ENCRYPT, authentication at the instance is bypassed, because it is assumed that authentication will happen at the data source. So it was possible to connect to the database as any user without having to get the password right! Once connected, you only had access to the tables that the user (or the user's group) had access to. However, this meant that anyone who got the username of the DB2 instance owner right could select, add or delete any data they liked!

The Moral
Never forget the law of unintended consequences. In this case the System Administrator had not taken into account what those changes would do to the security of the DB2 database. DB2 can be a complex beast and a little knowledge is dangerous. Fiddling with settings can cause all sorts of problems. Don't underestimate the need for skilled DBA support.
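The dangerous combination described above can be checked for mechanically. The following is a minimal sketch in Python, not a Triton or IBM tool: it assumes you have captured the text printed by `db2 get dbm cfg` and that each parameter appears on a line ending in `(NAME) = VALUE`. The function names are hypothetical and the assumed output shape should be verified against your own DB2 level.

```python
import re

# Assumed line shape: "<description>   (PARAMETER_NAME) = VALUE"
_CFG_LINE = re.compile(r"\((?P<name>[A-Z_0-9]+)\)\s*=\s*(?P<value>\S*)")


def parse_dbm_cfg(text):
    """Return {PARAMETER: VALUE} for every '(NAME) = value' line found."""
    settings = {}
    for line in text.splitlines():
        match = _CFG_LINE.search(line)
        if match:
            settings[match.group("name")] = match.group("value").upper()
    return settings


def authentication_bypassed(settings):
    """True when FEDERATED, FED_NOAUTH and AUTHENTICATION combine to bypass
    authentication at the instance, as described in the text."""
    return (
        settings.get("FEDERATED") == "YES"
        and settings.get("FED_NOAUTH") == "YES"
        and settings.get("AUTHENTICATION") in {"SERVER", "SERVER_ENCRYPT"}
    )


# Hypothetical excerpt of configuration output, for illustration only.
SAMPLE = """
 Federated Database System Support           (FEDERATED) = YES
 Database manager authentication        (AUTHENTICATION) = SERVER
 Bypass federated authentication            (FED_NOAUTH) = YES
"""

if __name__ == "__main__":
    print(authentication_bypassed(parse_dbm_cfg(SAMPLE)))
```

A check like this could run as part of a routine health check, so that a well-meaning configuration change is flagged before anyone exploits it.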

Triton Consulting 2011


2. DBA Performance
The Set-up
We always say that database admin is best performed from the command line with scripts. This means that admin is repeatable and the actions are recorded, which in turn can improve DBA productivity. If you manage a single database and that's it, this may seem like overkill, but what happens if you fail the bus test, and the one person who knows the system is suddenly unavailable? If, on the other hand, you have a full development lifecycle configuration (several unit test, integration, system, user acceptance and performance/OAT environments), then scripting is critical, as it will be impossible to manage all of this effectively by hand.

It is worth considering using db2look to verify that your databases are structurally consistent as you move through each of the testing environments.

The Moral
Striving to get the best performance out of your applications is always high on the priority list, but don't forget that DBA productivity is increasingly important.
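As an illustration of the db2look check suggested above, here is a minimal sketch in Python. It assumes the DDL for each environment has already been extracted to text (for example with `db2look -d <database> -e -o <file>`) and simply compares the two extracts. The function names are hypothetical, and comment lines are ignored on the assumption that they hold only timestamps and similar run-specific details.

```python
import difflib


def normalise_ddl(text):
    """Drop blank lines and SQL comment lines, and trim trailing whitespace."""
    kept = []
    for line in text.splitlines():
        stripped = line.rstrip()
        if not stripped or stripped.lstrip().startswith("--"):
            continue
        kept.append(stripped)
    return kept


def ddl_differences(ddl_a, ddl_b, name_a="env_a", name_b="env_b"):
    """Return unified-diff lines between two DDL extracts (empty if identical)."""
    return list(
        difflib.unified_diff(
            normalise_ddl(ddl_a),
            normalise_ddl(ddl_b),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

An empty result suggests the two environments are structurally consistent. Note that this is a plain line-by-line comparison, so objects emitted in a different order would show up as differences even if the structures match.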

3. Make sure your tools are up to the job


The Set-up
We had a customer who had a COGNOS 8/BI development environment running against DB2 AIX V9.5. They were having issues with extremely long elapsed times for COGNOS-generated queries (over 30 minutes!). Temp space was being exceeded, and users were increasing it but still experiencing the same problem. The users checked COGNOS, which seemed fine, and they also ran DB2 Index Advisor and created additional indexes, but to no avail.

So, Triton stepped in. We ran some analysis using a tool which we find extremely useful: Brother Eagle, from our partners DBI Software. The analysis showed a very high query execution cost. The SQL being generated by COGNOS was captured and found to be a Cartesian join of two tables, each with more than 25M rows. The new indexes were not being used.

The solution was to re-work the COGNOS report definitions to add some missing database relationships. This caused COGNOS to generate correct JOIN predicates. The reports were re-run to successful completion in under 2 minutes, and the developers were educated to look at both COGNOS and native SQL.


The Moral
Many applications (such as BI, ERP and Java frameworks) generate their own SQL, so it can be difficult to know exactly what's getting thrown at DB2. Correct tooling (and, just as importantly, the skills to interpret its output) is essential.
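To see why a missing join predicate is so costly, the sketch below reproduces the effect at toy scale. It uses Python's built-in SQLite module purely as a stand-in for DB2, and the `orders` and `customers` tables are hypothetical; the point is only the difference in row counts between the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
""")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer {i}") for i in range(100)],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, i % 100) for i in range(1000)],
)

# No join predicate: every order is paired with every customer.
cartesian_rows = conn.execute(
    "SELECT COUNT(*) FROM orders, customers"
).fetchone()[0]

# With the join predicate: one row per order.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM orders o "
    "JOIN customers c ON o.customer_id = c.customer_id"
).fetchone()[0]

print(cartesian_rows, joined_rows)
```

Here the Cartesian result is 100 times larger than the joined one. With two tables of 25M rows each, the unconstrained product is 25,000,000 x 25,000,000 = 6.25 x 10^14 row combinations, which is consistent with the exhausted temp space and 30-minute run times described above.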

4. Don't expect to get by with no DBA skills


The Set-up
This is quite a common issue in many organisations where development is carried out offshore and there is a local admin/support function but no local DBA resource. In this story we were initially brought in as a one-off exercise because the developers were complaining about poor query performance and asking to move to another RDBMS. These developers were highly skilled in SQL and could write clever, complex queries, common table expressions, etc. On questioning the developers, we found that they had only a basic knowledge of DB2 and no specific DB2 administration skills whatsoever.

We configured a basic set of automated housekeeping routines, which resolved the performance issues and made the developers happy. In the process, however, a lot of other issues were uncovered with data quality (developers were coding inefficient SQL to get around duplicate data that shouldn't exist). The solution was simple: a Triton support contract to provide DBA support.

The Moral
Skilled developers are not enough. A skilled DBA is needed for most applications.

5. Don't touch the logs!


Following on from number 4, this is another classic "we don't need a DBA" story. We received a frantic call one evening from a customer asking for immediate help. DB2 had hung and no activity could be performed on the production database. As the OLTP database had a 24x7 online SLA, it was not surprising that senior management were waiting when we arrived at the customer site. It soon became apparent that DB2 was looking for an active transaction log file that had gone missing and was nowhere to be found.

After further investigation we found the cause. In a desperate attempt to create some space on the production database server, a junior sysadmin had stumbled upon the DB2 transaction log directory, which in this case housed both the active and archived logs, and thought, "Hmm, this directory could do with some cleanup."


Using his judgement of the age of the files, he deleted a chunk of log files. Unfortunately, he gained space but lost an active transaction log file in the process.

Despite our advice that a restore from a previous backup was the only solution to their problem, the Oracle DBA (who had some DB2 knowledge) tried various methods to deceive DB2, such as creating a dummy log file with the same name as the missing one, not knowing that DB2 transaction log files have header information within them. After a lot of delay, it was finally agreed to carry out a restore from the most recent backup. This proved somewhat of a challenge since no backups were stored on disk, so the correct tape had to be found and mounted. The restore and subsequent rollforward to a consistent point in time did successfully take place, and sighs of relief could be heard in the early morning hours. Even though some hours of business had been lost, jobs had been saved! Yes, even the junior sysadmin was allowed to stay on, since he owned up to his mistake!

The Moral
Do not touch the active log directory! Configure a log archiving strategy which archives log files to a different location from that of the active log directory. Have a scheduled clean-up procedure for the archive log directory. The oldest log file that is still active can be found by listing the database configuration parameters (the "First active log file" entry). If possible, keep at least one backup image on disk. Do not give just anyone permission to access tablespace data, active transaction logs, etc. on the production database server. Have a Remote DBA support arrangement with an expert DB2 organisation if your staff are lacking in DB2 skills.
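A scheduled clean-up of the archive directory can be made much safer by refusing to touch anything at or beyond the first active log. The Python sketch below illustrates that rule under stated assumptions: log files are named in the `SNNNNNNN.LOG` style, the database configuration output contains a line of the form `First active log file = S0000012.LOG`, and log numbers have not wrapped around. The function names and the `keep_days` parameter are hypothetical, and the sketch only selects candidates; it deletes nothing.

```python
import re

_LOG_NAME = re.compile(r"^S(\d{7})\.LOG$")
_FIRST_ACTIVE = re.compile(r"First active log file\s*=\s*(S\d{7}\.LOG)")


def first_active_log(db_cfg_text):
    """Extract the first active log file name from db cfg output, or None."""
    match = _FIRST_ACTIVE.search(db_cfg_text)
    return match.group(1) if match else None


def archive_logs_safe_to_delete(file_ages_days, first_active, keep_days):
    """Return archived log names that are BOTH older than keep_days AND
    numbered below the first active log.

    file_ages_days maps a file name to its age in days.
    """
    boundary = _LOG_NAME.match(first_active)
    if boundary is None:
        raise ValueError(f"unexpected log file name: {first_active!r}")
    first_active_number = int(boundary.group(1))
    safe = []
    for name, age in sorted(file_ages_days.items()):
        match = _LOG_NAME.match(name)
        if match is None:
            continue  # not a log file: leave it alone
        if int(match.group(1)) < first_active_number and age > keep_days:
            safe.append(name)
    return safe
```

The key design choice is that age alone never qualifies a file for deletion. An active log can be very old on a quiet database, so the log number is compared with the first active log as well. In practice `keep_days` would also need to cover the oldest backup you might restore and roll forward from.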

6. Following simple rules can help you avoid BIG mistakes


The Set-up
Here we have another story about a Junior DBA being left alone whilst the Senior DBA goes on holiday (do Junior DBAs never get a break, I wonder?). Said Junior DBA managed to accidentally drop a critical production database, thinking that he was connected to the UAT system. Luckily the Triton team were quickly on hand and assisted with a full recovery afterwards, but complications resulted in nearly a full day's lost business. That is never a good situation, either for the business or for the DBA's career prospects!

The Moral
Colour code your GUI/Telnet sessions! Consider restricting day-to-day access to production data to prevent accidents. Get proper cover for holiday, sickness, pregnancy, etc.
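In the same spirit as colour coding sessions, admin scripts can be written so that a destructive command against production needs a deliberate extra step. The following Python sketch shows one possible guard; it is an illustration rather than a description of what the customer or Triton used, and the function name, the list of production database names and the "retype the name" rule are all hypothetical.

```python
def may_run_destructive(connected_db, production_dbs, typed_confirmation):
    """Allow a destructive command on a production database only if the
    operator has retyped the name of the database they are connected to."""
    production = {name.upper() for name in production_dbs}
    if connected_db.upper() not in production:
        return True  # non-production: no extra confirmation required
    return typed_confirmation.strip().upper() == connected_db.upper()
```

A DBA who believes they are connected to UAT would type the UAT database name, the check would fail, and the mistake would surface before any damage was done.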


7. As bad as it gets
The Set-up
Imagine the scene: a broken database on an unsupported version of DB2, with no backups or log files to recover the database. Yes, this one really was the stuff of nightmares!

An erroneous script had deleted a few transaction log files that had a 'last changed' date more than 45 days old. The same script had caused other errors and a database restart was required, but the database did not start. The database was looking for an old log file, which had just been deleted by the script. As the policy was to retain backups and archive logs for 30 days, this log file had been deleted from the archive logs too.

The database was tiny, at less than 50GB. Nevertheless, it was a very important one, with a number of web-facing apps relying on it for important features. To make matters worse, the version of DB2 in use had passed its "End of Service" date, so DB2 support was not willing to investigate (though they were happy to guide).

When we got involved a few hours after the incident, panic had set in. Based on the information available (saved snapshots and the db2diag.log file), we were able to conclude that there was a transaction which had started in the log file the database was looking for. This transaction had never completed. The rate of change of data was so small that the configured number of logs could last for more than 45 days.

The options available were to extract the data from the latest backup image (using tools like High Performance Unload) or to extract the data from the damaged database (using db2dart). The latter option was chosen as it would allow us to recover the most recent data. Without further delay, we ran db2dart on the database to check for any errors and to get the tablespace ID, table ID and total number of pages allocated for each table. We were then able to use this information to build the db2dart command with the DDEL option to extract the data in delimited format.

db2dart with the DDEL option is interactive (i.e., when the command is run, it prompts for the tablespace ID, table ID and the page range to extract). This meant that the extract could not be scripted but had to be done manually for each of the 300+ tables. Once that mind-numbing task was complete, we created a new database with the DDL that was available (thankfully, they had db2look output taken from the production database less than a week before the incident). Finally, we loaded the extracted data into the new database and ran runstats on the tables and indexes.

After a few hiccups and 15 hours of db2dart, import/load, runstats and data fixes, the database was available for the application. The database was down for more than 20 hours, but it was back in one piece with nearly no data loss. Quite an achievement under the circumstances!

The Moral
See the previous issue for advice on log management. Be aware of advanced recovery tools such as db2dart; they can be lifesavers in extreme situations.
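Although the DDEL extract itself had to be typed in by hand, the list of answers for 300+ tables does not. The Python sketch below builds such a worklist from the tablespace ID, table ID and page count gathered in the first db2dart pass. It assumes the prompt expects the values in the order table ID, tablespace ID, first page, number of pages, separated by commas; treat that order as an assumption to confirm against the db2dart documentation for your version. The `TableInfo` type and function name are hypothetical.

```python
from typing import NamedTuple


class TableInfo(NamedTuple):
    name: str
    tablespace_id: int
    table_id: int
    total_pages: int


def ddel_worklist(tables):
    """Return (table name, prompt answer) pairs, one per table, in name order.

    Assumed answer format: "<table id>,<tablespace id>,<first page>,<pages>".
    """
    return [
        (t.name, f"{t.table_id},{t.tablespace_id},0,{t.total_pages}")
        for t in sorted(tables, key=lambda t: t.name)
    ]
```

Printing the worklist as a checklist makes it easier to see which tables have been extracted and reduces the chance of a typing error part-way through a long session.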


8. Keeping up to date
The Set-up
We often have to ask customers for the DB2 diagnostic log file (db2diag.log), only to be told that it's too large to send or that it's taking too long to open. This is because the DB2 diagnostic files have been append-only since time immemorial, their growth restricted only by the space available in the file system they reside in. The only way to curb this growth was to rename the files, which would then force the creation of new ones. Scripts had to be written to automate this process and to delete old files after a certain period.

With the advent of DB2 9.7, all this is now history! The new diagsize database manager parameter allows a DBA to control the maximum sizes of the DB2 diagnostic log and administration notification log files. The default value of 0 keeps the pre-DB2 9.7 behaviour. When the parameter is set to a non-zero value, a series of 10 rotating diagnostic log files and a series of rotating administration notification log files (only on UNIX and Linux) are used. It is also smart enough to clean old log files out of the diagnostic log directory: when the 10th file is full, the oldest file is deleted and a new file is created.

On UNIX and Linux, diagsize is the total size (in MB) of all the DB2 diagnostic log and administration notification log files. 90% of the total is allocated to the 10 diagnostic log files and 10% to the 10 administration notification log files. On Windows, diagsize is the total size (in MB) of the 10 DB2 diagnostic log files.

It can be difficult for many organisations to keep fully up to date with new features, and indeed it is often the case that organisations are forced to stay on out-of-support versions of DB2. This can be a real issue when problems arise and they find that their DB2 systems are no longer supported by IBM.

The Moral
Where possible, try to keep up to date via fixpaks; new features are delivered in DB2 all the time. Have a Remote DBA support arrangement with an expert DB2 organisation who will automatically keep your systems updated with the latest fixpaks. Arrange regular training for DBAs, or take out a Consultancy on Demand agreement which allows you to use consultancy hours on training, database healthchecks or just to answer technical questions.
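The diagsize split described above is simple arithmetic, sketched here in Python so that a proposed value can be sanity-checked before it is applied (for example with `db2 update dbm cfg using DIAGSIZE <value>`). The function name and the `platform` argument are hypothetical, and the sketch ignores any rounding that DB2 itself may apply to individual file sizes.

```python
def diag_file_sizes_mb(diagsize_mb, platform="unix", files_per_series=10):
    """Approximate size in MB of each rotating file for a given diagsize.

    On UNIX and Linux, 90% of the total goes to the diagnostic log files and
    10% to the administration notification log files. On Windows the whole
    total goes to the diagnostic log files.
    """
    if diagsize_mb <= 0:
        raise ValueError("a diagsize of 0 means no rotation, so there is nothing to split")
    if platform == "windows":
        return {"diagnostic": diagsize_mb / files_per_series, "notification": 0.0}
    return {
        "diagnostic": diagsize_mb * 90 / 100 / files_per_series,
        "notification": diagsize_mb * 10 / 100 / files_per_series,
    }
```

For example, a diagsize of 1000 on UNIX or Linux works out at roughly 90 MB per diagnostic log file and 10 MB per notification log file, which is small enough to open and send without difficulty.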


9. Beware of Over-Federating
The Set-up
During a recent DB2-LDAP configuration at a client site we stumbled upon a bizarre security exposure. Using any DB2 client tool, it was possible to connect to the database as any user without having to get the password right! Once connected to the database, you only had access to the tables that the user (or the user's group) had access to. However, this meant that anyone who got the username of the DB2 instance owner right could select, add or delete any data they liked! In short, they had SYSADM authority, which could potentially lead to a major security exposure.

It so happened that, in a desperate attempt to get federated technology to work, in addition to enabling the FEDERATED database manager parameter, the FED_NOAUTH (bypass federated authentication) parameter had also been enabled (set to YES). This was the problem. When FED_NOAUTH is set to YES, FEDERATED is set to YES and authentication is set to SERVER or SERVER_ENCRYPT, authentication at the instance is bypassed. It is assumed that authentication will happen at the data source.

The Moral
You do NOT need FED_NOAUTH enabled to implement federation in DB2. If in doubt, ask the experts!

10. In the event of an emergency call....


The Set-up
One of our customers was running HADR for disaster recovery. They had no cluster software for monitoring or failover; HADR was being monitored on a regular basis using a shell script. On the primary site, a few disks failed, which caused some of the tablespaces to be put into Rollforward Pending state. Transactions accessing data in these tablespaces failed, but others were successful. The last run of the HADR state monitoring script indicated Peer state, and it was therefore decided to issue the TAKEOVER command on the DR site to switch roles.

When the application started, some transactions failed with the same error as on the primary site. The LIST TABLESPACES command showed a number of tablespaces in Rollforward Pending state. To get out of the pending state, the ROLLFORWARD command was issued with the list of affected tablespaces. The rollforward tried to retrieve a log which was a few thousand logs older than the current one, and this was not available in the archive. After a few tries, the ROLLFORWARD option was abandoned. The database was restored from the latest backup image and the application started.

Analysis


We went through the db2diag.log and the notification logs. We could see that physical errors had been reported in some of the tablespaces on the DR site (the HADR standby) roughly 100 days prior to the incident. This was reported in the db2diag.log, and the affected tablespaces were "excluded from the rollforward set". Based on other entries in the db2diag.log, we were able to confirm that the log file requested for rollforward on the DR site was the one in use at the time the physical errors occurred there. HADR continued to apply logs for the other tablespaces and reported itself to be in "Peer" state. In reality, some of the tablespaces were being ignored.

The Moral
Regular monitoring of the log files is essential. It would have identified the issue on the DR site, and allowed it to be resolved, well in advance of this incident. In case of an emergency, call the experts at Triton on 0870 2411 550!
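The kind of log monitoring recommended above need not be elaborate. The Python sketch below scans db2diag.log text for the phrase quoted in the analysis, "excluded from the rollforward set", and reports each hit with the most recent timestamp seen before it. It assumes that each diagnostic record begins with a timestamp in the `YYYY-MM-DD-HH.MM.SS.ffffff` style; the function name is hypothetical, and the exact message wording should be checked against your own logs.

```python
import re

# Assumed record header: a timestamp at the start of the line.
_TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d{6})")
PHRASE = "excluded from the rollforward set"


def find_excluded_tablespace_events(diag_text):
    """Return (timestamp, line) for each line that mentions the phrase.

    The timestamp is that of the most recent record header seen, or None
    if no header has been seen yet.
    """
    events = []
    current_timestamp = None
    for line in diag_text.splitlines():
        stamp = _TIMESTAMP.match(line)
        if stamp:
            current_timestamp = stamp.group(1)
        if PHRASE in line.lower():
            events.append((current_timestamp, line.strip()))
    return events
```

Run daily against the standby's diagnostic log, a check like this would have raised the alarm on the day the tablespaces were dropped from the rollforward set, rather than around 100 days later during a takeover. A "Peer" state alone was not sufficient evidence that the standby was healthy.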


Triton Consulting Company Profile


Triton Consulting is a highly focused, independent IBM Information Management services provider. We are able to supply software design, architecture and technical individuals to assist organisations in addressing their most pressing IT issues. As an IBM Premier Business Partner, the company is ideally placed to stay up to date with all of the very latest developments in IBM's flagship information management products.

Triton Consulting has a proven track record in helping large and small companies address a wide variety of data management related strategic, technical and managerial issues, ranging from IT strategy reviews to detailed solution performance modelling and tuning. Our consultants are carefully selected to combine in-depth technical knowledge with excellent communication and presentation skills. This combination allows us to work with our clients to quickly investigate their information and data issues and propose effective solutions.

In addition, the majority of our technical specialists are officially certified under IBM's Certified Professional Programme, another guarantee that the advice and guidance that you'll receive from Triton will be timely, accurate and above all dependable. Triton Consulting is committed to retaining this focus as IBM continues to expand the certification programme to cover other products within its Information Management portfolio. By focusing on our core strengths, Triton Consulting is able to ensure that it retains its position as one of the UK's leading providers of independent, specialist information-related technical services.

Triton's unique characteristics include:

Largest Independent: Triton is the largest independent DB2 consultancy in the UK and one of the largest independent information management companies in the UK. We also have clients in Europe and the United States.

Accreditations: Triton has more IBM data management accreditations than any other business partner in the UK.

Authorship of IBM Documents: Triton provides a range of input into IBM's Redbooks, whitepapers and other industry publications.

Membership of IBM Gold Consultancy Program: The Gold Consultancy program provides direct access to IBM laboratories in North America. This program allows Triton to bypass the normal IBM processes and to get answers much faster in emergencies. Often this is quicker than IBM UK can obtain the same results. There are currently only three Gold Consultants in the UK and 100 world-wide, of which Triton has two.

Services Approval: Triton is a preferred supplier for information management resource to IBM, EDS and a number of IBM Business Partners.


For more information: Visit www.triton.co.uk Email enquiries@triton.co.uk Call 0870 2411 550
