Desktop Database Maintenance
After your database is up and running, routine maintenance will keep it in good working order, no different from maintaining your home or automobile. By maintaining your database, you can make sure your data is up to date, verify that your data has integrity, and catch early signs of data corruption. You can also gain insight into the type of data you are storing and predict when you may need to upgrade your hardware and resources to keep performance optimal. Being proactive with your plan can save you significant time later: if you fail to catch problems early, your database backups can become corrupted as well, making recovery efforts much more difficult. Whether you are using an enterprise system like Microsoft SQL Server or a desktop system like Tracker Ten, the core principles are the same.
Creating a Database Maintenance Plan
The easiest way to ensure your database is maintained is to create and follow a comprehensive database maintenance plan. Your plan could include procedures for backing up your database, rebuilding indexes, shrinking the database, updating statistics, performing integrity checks and more. Depending on your database, different toolsets may be available to help with your maintenance efforts. For example, Microsoft SQL Server includes the SQL Server Management Studio (SSMS) software, which offers maintenance plan wizards. If your database supports scripting (for example, PowerShell scripts), those tools can also be used to automate your efforts. Some of these tools have a steep learning curve and may require the skills of an experienced DBA, but there are other tasks you can easily perform yourself. For instance, taking a full backup of your database is typically not very difficult. If you are more skilled, you can also look at differential or transactional backups. See our “Making a Full Database Backup” article for more details.
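As a rough illustration, here is what a minimal full backup looks like in SQL Server, run from a query window; [YourDatabase] and the backup path are placeholders you would replace with your own names:

    -- Write a full backup of the database to a single file
    BACKUP DATABASE [YourDatabase]
    TO DISK = N'C:\Backups\YourDatabase.bak'
    WITH INIT,  -- INIT overwrites any older backup in the file
        NAME = N'YourDatabase full backup';

Swapping in WITH DIFFERENTIAL instead backs up only the pages that changed since the last full backup, which keeps the files small between full backups.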
Shrinking Your Database
Whenever possible you should shrink the size of your database. A smaller database will perform more quickly and will take less time to back up. There are several ways to shrink a database. First, you can look at archiving information. For more information, please see our “When to Archive Data” article.
Next, if you are using a system like Microsoft SQL Server, your database may offer built-in shrinking tools. In SQL Server Management Studio (SSMS) you can simply right-click a database and select the “Shrink” option, and the system will automatically shrink your database (this function moves data around within a file to fill in empty spaces, reducing the overall file size). The downside of this type of operation is that your data can become heavily fragmented, requiring you to rebuild your indexes. For this reason, shrink functions should be used with caution.
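If you prefer a script over the SSMS menu, the same operation is available as a T-SQL command; a minimal sketch, where [YourDatabase] and dbo.YourTable are placeholders:

    -- Shrink the database, leaving roughly 10 percent free space
    DBCC SHRINKDATABASE (N'YourDatabase', 10);

    -- Shrinking fragments indexes, so rebuild them afterwards
    ALTER INDEX ALL ON dbo.YourTable REBUILD;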
You can also look at removing old backups you no longer need and completely purging information that is no longer relevant. If you are still running out of disk space after your shrinking efforts, you can look at adding more storage.
Generating Database Statistics
Another maintenance activity is to generate statistics about your database, so you can better understand your data and your growth needs. This type of metadata can be very valuable. You will know which tables in your database contain the most rows (i.e., have the highest cardinality) and could benefit from indexing.
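On SQL Server, a quick way to get row counts per table is to query the catalog views; a short sketch:

    -- Approximate row count for every user table, largest first
    SELECT t.name AS TableName, SUM(p.rows) AS TotalRows
    FROM sys.tables AS t
    JOIN sys.partitions AS p ON p.object_id = t.object_id
    WHERE p.index_id IN (0, 1)  -- 0 = heap, 1 = clustered index
    GROUP BY t.name
    ORDER BY TotalRows DESC;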
You will also be able to see which queries or database searches are taking the most time. If you find that certain queries are consuming an unreasonable amount of resources, you can look at refactoring them to make them perform faster. If refactoring is not an option, you can look at getting faster processors and more memory.
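On SQL Server, the query-statistics views can surface the most expensive statements since the last restart; a sketch (reading these views requires the VIEW SERVER STATE permission):

    -- Top 10 queries by average CPU time (in microseconds)
    SELECT TOP (10)
        qs.total_worker_time / qs.execution_count AS AvgCpuTime,
        qs.execution_count,
        SUBSTRING(st.text, 1, 200) AS QueryText
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY AvgCpuTime DESC;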
Please note that much of this information will be available in logs your database system may generate automatically. A log contains a sequential record of all the changes that take place in your database, and it may also include a list of errors and system crashes that occur while your database is being used. In SQL Server Management Studio you can view logs under the Management node by expanding “SQL Server Logs”. In other systems like MySQL you may need to enable logging manually.
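If you prefer a query window, SQL Server also exposes the error log through the sp_readerrorlog stored procedure (undocumented but widely used); a sketch:

    -- Read the current error log (0), of type SQL Server log (1),
    -- showing only entries that contain the word 'error'
    EXEC sp_readerrorlog 0, 1, N'error';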
SQL Server also uses statistics internally to optimize queries. It can estimate how many rows are likely to be returned by the predicates in a query, helping it decide in what order data should be retrieved to improve performance. This procedure is called cost-based optimization. Clearly, up-to-date statistics are necessary for cost-based optimization to work well, which is why you can force the system to refresh its statistics on demand (for example, by running “USE [YourDatabase]; EXEC sp_updatestats” in SQL Server, where [YourDatabase] is a placeholder for your database name).
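If the database-wide refresh is not enough, you can also target a single table; a short sketch, with dbo.YourTable as a placeholder:

    -- Refresh all statistics in the current database (default sampling)
    EXEC sp_updatestats;

    -- Or rebuild the statistics for one table by scanning every row
    UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;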
Ensuring Database Integrity
Your data has integrity if it is accurate, complete and consistent. Depending on your industry, you may also need to follow data protection regulations like GDPR. There are several ways to ensure data integrity. First, your entities should have unique keys. This is called “entity integrity”. Second, whenever you are referencing data in another part of your database, you should ensure that the references are valid (i.e., you are never pointing to something that does not exist). This is called “referential integrity”. Third, you should ensure that values stored in a column are valid for that column (for example, you wouldn’t want to store the city “Los Angeles” in a column meant for states). This is called “domain integrity”. Lastly, you may have “user-defined integrity”, which covers any additional rules you create to ensure the validity of your data. All of these rules can be enforced with constraints, as sketched below.
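Here is a minimal sketch of how these rules map to SQL Server constraints; the table and column names, and the three-state list in the CHECK constraint, are purely illustrative:

    CREATE TABLE dbo.Customer (
        CustomerId INT NOT NULL PRIMARY KEY,            -- entity integrity: unique key
        StateCode  CHAR(2) NOT NULL
            CHECK (StateCode IN ('CA', 'NY', 'TX'))     -- domain integrity: valid values only
    );

    CREATE TABLE dbo.CustomerOrder (
        OrderId    INT NOT NULL PRIMARY KEY,
        CustomerId INT NOT NULL
            REFERENCES dbo.Customer (CustomerId)        -- referential integrity: row must exist
    );

With constraints like these in place, the database itself rejects invalid rows instead of relying on every application to check them.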
If, after examining your data, you find integrity issues, you should look for the cause of the problems. The integrity issues could simply be human error, software bugs, or something malicious like viruses or spyware.
Detecting Database Corruption
Database corruption can occur due to hardware and software issues. Database files may become corrupted if the hard drives they are stored on are starting to degrade. So, the first way to detect database corruption early is to ensure the health of your disk drives by regularly checking their integrity with the tools built into your operating system. These tools use error-detection techniques such as checksums, repetition codes, parity bits and cyclic redundancy checks, all of which add extra information to a file that is derived from information already in the file. If this derived information ever differs from what is expected, you know you have a corruption issue. However, computer hardware is becoming much more reliable, so if you do encounter corruption it is more likely to be software related than hardware related.
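Database systems can apply the same idea at the page level. SQL Server, for instance, can write a checksum on every data page so that a corrupted page is flagged the next time it is read; this is the default on recent versions, and [YourDatabase] below is a placeholder:

    -- Verify every data page with a checksum when it is read back
    ALTER DATABASE [YourDatabase] SET PAGE_VERIFY CHECKSUM;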
Database systems like Oracle and SQL Server also have built-in validate, analyze table and analyze index functions (such as Oracle’s ANALYZE ... VALIDATE STRUCTURE or SQL Server’s DBCC CHECKDB) that can be used to verify the integrity of your data. You can also examine database logs to look for unexpected errors and crashes that may be the result of corrupted data. You should check your database system documentation to see what built-in scanning mechanisms are available to detect database corruption, and you should run these checks on a regular basis.
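On SQL Server the workhorse is DBCC CHECKDB, which checks the logical and physical integrity of every object in a database; a minimal run, with [YourDatabase] again being a placeholder:

    -- Check the whole database; suppress informational messages
    DBCC CHECKDB (N'YourDatabase') WITH NO_INFOMSGS;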
By detecting corruption early, you can catch direct corruption, where valid information is overwritten with invalid information, before it leads to indirect corruption, where the invalid data is used to update other data in your database, making that data invalid as well. To limit this kind of spread, you also need to make sure that your data is not erroneously updated by an end user or outside service.