Triton Consulting The Royal 25 Bank Plain Norwich NR2 4SF Tel: 0870 241 1550 Fax: 0870 241 1549 email:sales@triton.co.uk
Over many years of providing DB2 support and consultancy services we have come across a huge array of support nightmares and difficult scenarios. In this document we share with you our top 10 and give you some advice on how to avoid such situations occurring in your own organisation.
1. Unintended Consequences
The Set-up
We received a support call from a very worried Junior DBA who had discovered that all the rows had been deleted from a critical table in a pre-production environment. The DBA admitted to connecting to the wrong system to clear down the table. He shouldn't really have had the authority to perform such a task! The first priority was to get the data back from a recent backup; luckily, no issues were encountered during this part of the job.

We then needed to find out exactly why the Junior DBA had access to the table in the first place, and this is what we discovered:
- The Junior DBA connected to the wrong system by mistake, using the instance owner userid.
- The System Administrator had been trying to get federation to work and, in addition to enabling the FEDERATED database manager parameter, had also set the FED_NOAUTH (bypass federated authentication) parameter to YES.

When FED_NOAUTH is set to YES, FEDERATED is set to YES, and authentication is set to SERVER or SERVER_ENCRYPT, authentication at the instance is bypassed, as it is assumed that authentication will happen at the data source. So it was possible to connect to the database as any user without having to get the password right! Once connected to the database, you only had access to the tables that the user (or their group) had access to. However, this meant that anyone who got the right username for the DB2 instance owner could select, add or delete any data they liked!
The Moral
Never forget the law of unintended consequences. In this case the System Administrator had not taken into account what those changes would do to the security of the DB2 database. DB2 can be a complex beast and a little knowledge is dangerous. Fiddling with settings can cause all sorts of problems. Don't underestimate the need for skilled DBA support.
2. DBA Performance
The Set-up
We always say that database admin is best performed from the command line with scripts. This makes admin repeatable and leaves a record of the actions taken, which in turn can improve DBA productivity. If you manage a single database, and that's it, this may seem like overkill, but what happens if you fail the bus test? If, on the other hand, you have a full development lifecycle configuration (several unit test, integration, system, user acceptance and performance/OAT environments), then this is critical, as it will be impossible to manage all of this effectively by hand.
It is worth considering using db2look to verify that your databases are structurally consistent as you move through each of the testing environments.

The Moral
Striving to get the best performance out of your applications is always high on the priority list, but don't forget that DBA productivity is increasingly important.
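As a rough illustration of the db2look consistency check mentioned above, the sketch below compares two DDL extracts that have already been saved as text, one per environment. The function names are hypothetical, and the normalisation rules (dropping `--` comment lines and `CONNECT` statements, which are assumed to differ between environments for non-structural reasons) are assumptions to adjust for your own output.

```python
import difflib


def normalise_ddl(ddl_text):
    """Drop lines expected to differ between environments for non-structural reasons."""
    kept = []
    for raw in ddl_text.splitlines():
        line = " ".join(raw.split())  # collapse runs of whitespace
        if not line or line.startswith("--"):
            continue  # blank lines and SQL comments (assumed to carry timestamps etc.)
        if line.upper().startswith("CONNECT "):
            continue  # the database name is expected to differ per environment
        kept.append(line)
    return kept


def ddl_differences(ddl_a, ddl_b, name_a="env_a", name_b="env_b"):
    """Return unified-diff lines between two normalised DDL extracts."""
    return list(
        difflib.unified_diff(
            normalise_ddl(ddl_a),
            normalise_ddl(ddl_b),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

An empty result suggests the two environments are structurally aligned as far as this simple comparison can tell; any `+` or `-` lines point at objects that exist in only one of them.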
The Moral
Many applications (such as BI, ERP and Java frameworks) generate their own SQL, so it can be difficult to know exactly what's getting thrown at DB2. Correct tooling (and, just as importantly, the skills to interpret its output) is essential.
The Moral
Skilled developers are not enough. A skilled DBA is needed for most applications.
Using his judgement of the age of the files, he deleted a chunk of log files. Unfortunately, he gained space but lost an active transaction log file in the process.
Despite our advice that a restore from a previous backup was the only solution to their problem, the Oracle DBA with some DB2 knowledge tried various methods to deceive DB2, such as creating a dummy log file with the same name as the missing one, not knowing that DB2 transaction log files have header information within them. After a lot of delay, it was finally agreed to carry out a restore from the most recent backup. This proved something of a challenge since no backups were stored on disk, so the correct tape had to be found and mounted. The restore and subsequent rollforward to a consistent point in time did take place successfully, and sighs of relief could be heard in the early morning hours. Even though some hours of business had been lost, jobs had been saved! Yes, even the junior sysadm was allowed to stay on since he owned up to his mistake!

The Moral
- Do not touch the active log directory!
- Configure a log archiving strategy which archives log files to a different location from the active log directory.
- Have a scheduled clean-up procedure for the archive log directory.
- The first active transaction log file (the oldest one DB2 still needs) can be found by listing the database configuration parameters.
- If possible, keep at least one backup image on disk.
- Do not give any Joe Blow permission to access tablespace data, active transaction logs, etc. on the production database server.
- Have a Remote DBA support arrangement with an expert DB2 organisation if your staff are lacking in DB2 skills.
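The clean-up advice above can be sketched as a selection rule that refuses to nominate any archived log at or beyond the first active log, however old the file looks. This is a minimal sketch under stated assumptions: the function names are hypothetical, log files are assumed to follow the `Snnnnnnn.LOG` naming pattern, and the database configuration output is assumed to contain a line labelled "First active log file". It only selects candidates and deletes nothing.

```python
import re

LOG_NAME = re.compile(r"^S(\d{7})\.LOG$")  # assumed log file naming pattern


def first_active_log(db_cfg_text):
    """Pull the 'First active log file' value out of saved db cfg output."""
    for line in db_cfg_text.splitlines():
        if "first active log file" in line.lower() and "=" in line:
            value = line.split("=", 1)[1].strip()
            return value or None
    return None


def log_sequence(file_name):
    """Return the numeric sequence of a log file name, or None if it is not a log."""
    match = LOG_NAME.match(file_name)
    return int(match.group(1)) if match else None


def archive_logs_safe_to_prune(archived, first_active, now, min_age_days):
    """Select archived logs older than the cut-off AND below the first active log.

    archived: iterable of (file_name, mtime_epoch_seconds) from the ARCHIVE directory.
    """
    boundary = log_sequence(first_active) if first_active else None
    if boundary is None:
        return []  # cannot prove anything is safe, so nominate nothing
    cutoff = now - min_age_days * 86400
    safe = []
    for name, mtime in archived:
        seq = log_sequence(name)
        if seq is not None and seq < boundary and mtime < cutoff:
            safe.append(name)
    return sorted(safe)
```

File age alone is never treated as sufficient here. In practice you would also keep every log needed to roll forward from the oldest backup image you retain, which this sketch does not model.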
The Moral
- Colour code your GUI/Telnet sessions!
- Consider restricting day-to-day access to production data to prevent accidents.
- Get proper cover for holiday, sickness, pregnancy, etc.
7. As bad as it gets
The Set-up
Imagine the scene: a broken database on an unsupported version of DB2, with no backups or log files to recover the database. Yes, this one really was the stuff of nightmares!

An erroneous script had deleted a few transaction log files that had a 'last changed' date more than 45 days old. The same script had caused other errors and a database restart was required, but the database did not start. The database was looking for an old log file, which had just been deleted by the script. As the policy was to retain the backups and archive logs for 30 days, this log file had been deleted from the archive logs too. The database was tiny, less than 50GB. Nevertheless, it was a very important one, with a number of web-facing apps relying on it for important features. To make matters worse, the version of DB2 in use had passed its "End of Service" date, so DB2 support was not willing to investigate (though they were happy to guide).

When we got involved a few hours after the incident, panic had set in. Based on the information available (saved snapshots and the db2diag.log file), we were able to conclude that there was a transaction which had started in the log file the database was looking for. This transaction was never completed. The rate of change of data was so small that the configured log files could last for more than 45 days.

The options available were to extract the data from the latest backup image (using tools like High Performance Unload) or to extract the data from the damaged database (using db2dart). The latter option was chosen as this would allow us to recover the most recent data. Without further delay, we ran db2dart on the database to check for any errors and to get the tablespace ID, table ID and total number of pages allocated to each table. We were then able to use this information to build the db2dart command with the DDEL option to extract the data in delimited format.
db2dart with the DDEL option is interactive (i.e., when the command is run, it prompts for the tablespace ID, table ID and the page range to extract). This meant that the extract could not be scripted but had to be done manually for each of the 300+ tables. Once that mind-numbing task was complete, we created a new database with the DDL that was available (thankfully, they had db2look output taken from the production database less than a week before the incident). Finally, we loaded the extracted data into the new database and ran runstats on the tables and indexes.

After a few hiccups and 15 hours of db2dart, import/load, runstats and data fixes, the database was available for the application. The database was down for more than 20 hours, but it was back in one piece with nearly no data loss. Quite an achievement under the circumstances!

The Moral
See the previous issue for advice on log management. Be aware of advanced recovery tools such as db2dart: they can be lifesavers in extreme situations.
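Even when the extract itself has to be keyed in by hand, the answers for each prompt can be prepared in advance from the tablespace ID, table ID and page count gathered in the earlier db2dart run. The sketch below is one hypothetical way to build such a worklist. The function name, the input dictionary keys, and in particular the field order of the answer string (table ID, tablespace ID, first page, number of pages) are assumptions, so check them against the prompt your db2dart version actually displays.

```python
def ddel_worklist(tables):
    """Build (table name, prompt answer) pairs for a manual DDEL extract.

    tables: iterable of dicts with keys name, tablespace_id, table_id, total_pages.
    """
    worklist = []
    ordered = sorted(tables, key=lambda t: (t["tablespace_id"], t["table_id"]))
    for table in ordered:
        if table["total_pages"] <= 0:
            continue  # nothing allocated, so nothing to extract
        # Assumed field order: table ID, tablespace ID, first page, number of pages.
        answer = "{},{},0,{}".format(
            table["table_id"], table["tablespace_id"], table["total_pages"]
        )
        worklist.append((table["name"], answer))
    return worklist
```

Printing the worklist and ticking entries off as each extract completes gives a simple audit trail for a task that is easy to lose track of across several hundred tables.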
8. Keeping up to date
The Set-up
We often have to ask customers for the DB2 diagnostic log file (db2diag.log), only to be told it's too large to send or it's taking too long to open. This is because the DB2 diagnostic files have been append-only since time immemorial, their growth restricted only by the space available in the file system they reside in. The only way to curb this growth was to rename the files, which would then force the creation of new ones. Scripts had to be written to automate this process and to delete old files after a certain period.

With the advent of DB2 9.7, all this is now history! The new diagsize database manager parameter allows a DBA to control the maximum sizes of the DB2 diagnostic log and administration notification log files. When the parameter is left at 0 (the default, which keeps the pre-DB2 9.7 behaviour), each log remains a single append-only file. When it is set to a non-zero value, a series of 10 rotating diagnostic log files and a series of 10 rotating administration notification log files (only on UNIX and Linux) are used. It is also smart enough to clean up old log files from the diagnostic log directory: when the 10th file is full, the oldest file is deleted and a new file is created.

On UNIX and Linux, diagsize is the total size (in MB) of all the DB2 diagnostic log and administration notification log files: 90% of the total is allocated to the 10 diagnostic log files and 10% to the 10 administration notification log files. On Windows, diagsize is the total size (in MB) of the 10 DB2 diagnostic log files.

It can be difficult for many organisations to keep fully up to date with new features, and indeed it is often the case that organisations are forced to stay on out-of-support versions of DB2. This can be a real issue when problems arise and they find that their DB2 systems are no longer supported by IBM.

The Moral
- Where possible, try to keep up to date via fixpaks; new features are delivered in DB2 all the time.
- Have a Remote DBA support arrangement with an expert DB2 organisation who will automatically keep your systems updated with the latest fixpaks.
- Arrange regular training for DBAs, or take out a Consultancy on Demand agreement which allows you to use consultancy hours on training, database healthchecks or just to answer technical questions.
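To make the diagsize arithmetic described above concrete, the sketch below works out the space budget for a given non-zero value. The function name and the returned dictionary keys are hypothetical, and the even split across the 10 files in each series is an assumption based on the description; DB2's own rounding may differ.

```python
ROTATING_FILES = 10  # files in each rotating series, as described in the text


def diagsize_split(diagsize_mb, platform):
    """Return the space budget (in MB) implied by a non-zero diagsize value."""
    if diagsize_mb <= 0:
        raise ValueError("a diagsize of 0 means no rotation, so there is nothing to split")
    if platform.lower() == "windows":
        # On Windows the whole budget goes to the diagnostic log files.
        diag_total, notify_total = float(diagsize_mb), 0.0
    else:
        # On UNIX and Linux: 90% diagnostic logs, 10% administration notification logs.
        diag_total = diagsize_mb * 90 / 100
        notify_total = diagsize_mb * 10 / 100
    return {
        "diag_total_mb": diag_total,
        "diag_per_file_mb": diag_total / ROTATING_FILES,
        "notify_total_mb": notify_total,
        "notify_per_file_mb": notify_total / ROTATING_FILES,
    }
```

For example, with diagsize set to 1000 on Linux, this gives 900 MB for the diagnostic logs (90 MB per file) and 100 MB for the notification logs (10 MB per file).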
9. Beware of Over-Federating
The Set-up
During a recent DB2-LDAP configuration at a client site we stumbled upon a bizarre security exposure. Using any DB2 client tool, it was possible to connect to the database as any user without having to get the password right! Once connected to the database, you only had access to the tables that the user (or their group) had access to. However, this meant that anyone who got the right username for the DB2 instance owner could select, add or delete any data they liked! In short, they had SYSADM authority, which could potentially lead to a major security exposure.

It so happened that, in a desperate attempt to get federated technology to work, the FED_NOAUTH (bypass federated authentication) parameter had been set to YES in addition to enabling the FEDERATED database manager parameter. This was the problem. When FED_NOAUTH is set to YES, FEDERATED is set to YES and authentication is set to SERVER or SERVER_ENCRYPT, authentication at the instance is bypassed. It is assumed that authentication will happen at the data source.

The Moral
You do NOT need FED_NOAUTH enabled to implement federation in DB2. If in doubt, ask the experts!
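The three-way condition described above is easy to overlook when parameters are changed one at a time, so it can be worth encoding as a check in whatever configuration audit you already run. The sketch below is a minimal illustration: the function name is hypothetical, and it assumes you have already read the three database manager parameter values into strings.

```python
BYPASS_AUTH_TYPES = {"SERVER", "SERVER_ENCRYPT"}


def instance_authentication_bypassed(federated, fed_noauth, authentication):
    """Return True when the parameter combination bypasses instance authentication."""

    def is_yes(value):
        return str(value).strip().upper() == "YES"

    return (
        is_yes(federated)
        and is_yes(fed_noauth)
        and str(authentication).strip().upper() in BYPASS_AUTH_TYPES
    )
```

A True result for any instance is a signal to review why FED_NOAUTH was enabled, since federation does not require it.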
We went through the db2diag.log and the notification logs. We could see that physical errors had been reported in some of the tablespaces on the DR site (HADR Standby) roughly 100 days prior to the incident. This was reported in the db2diag.log, and the affected tablespaces were "excluded from the rollforward set". Based on other entries in the db2diag.log file, we were able to confirm that the log file requested for rollforward on the DR site was the one in use at the time the physical errors occurred there. HADR continued to apply logs for the other tablespaces and reported itself to be in "Peer" state. In reality, some of the tablespaces were being ignored.
The Moral
Regular monitoring of the log files is essential; it would have identified the issue on the DR site, and allowed it to be resolved, well in advance of this incident. In case of an emergency, call the experts at Triton on 0870 2411 550!
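As one small example of the kind of monitoring meant here, the sketch below scans diagnostic log text for the phrase quoted earlier, "excluded from the rollforward set". The function name is hypothetical, and it assumes the phrase appears on a single line of the log; it is a starting point for an alert, not a substitute for proper HADR monitoring.

```python
def find_excluded_tablespace_warnings(lines, phrase="excluded from the rollforward set"):
    """Return (line_number, text) for each line containing the warning phrase."""
    needle = phrase.lower()
    hits = []
    for number, line in enumerate(lines, start=1):
        if needle in line.lower():
            hits.append((number, line.strip()))
    return hits
```

Run against the standby's db2diag.log on a schedule, any non-empty result deserves immediate attention, because the standby may still report "Peer" state while silently skipping the affected tablespaces.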
For more information: Visit www.triton.co.uk Email enquiries@triton.co.uk Call 0870 2411 550