Databases and Data Types
You can store many different types of information in a database. However, no matter what you store, internally it ends up being stored as series of numeric binary values. If everything is stored the same way, then what’s a data type? At an elemental level, a data type is just an interpretation of numbers stored inside a computer.
Internally a computer interprets information differently depending on how the user intends to use the information. For example, “9” could be interpreted as a number if it represents a numeric value, like the available quantity of something. On the other hand, if it’s part of a phrase like “the cat has 9 lives”, it could be interpreted as part of a string of characters.
How Do Computers Store Information?
Internally, all information in a computer is stored as numbers. In fact, all a computer can store is a series of 1’s and 0’s. This is called binary code. When you store anything in a computer’s memory, including music, videos and documents it’s converted into binary code by your computer’s hardware. When you need to view it again, your computer takes the 1’s and 0’s and translates them back to a format that you can understand. This binary code is used regardless of the media that is used to store the information. Internally a Kingston USB jump drive, a SanDisk digital flash card, a hard drive, a computer’s internal RAM (the main memory of your computer) and an old-style magnetic disk drive all store information in binary code. In fact, technically the best quality RAM for your PC will work the same as lower quality RAM.
You may have noticed that memory always seems to be sold in multiples of 4 (i.e., 4GB, 8GB, 12GB, 16GB etc.). This is a side effect of a computer using binary code. It doesn’t matter if you are looking for memory for an Android tablet, Apple iPhone, Microsoft laptop computer or a Lenovo desktop computer. Internally, the RAM and ROM memory is the same.
Five Main Data Types
Regardless of what you are storing there are 5 main data types: Character; string; Integer, Float and Boolean. A character is any alphanumeric character or symbol. A string is a collection of characters. An integer is a number without a decimal. A float is a number with a decimal. A Boolean is either true or false. All of these are “primitive” datatypes expect for string which is a “non-primitive” data type. A primitive datatype is something that can be directly translated into a number and stored in a computer’s memory. A non-primitive type needs to be represented as a series of numbers.
Modern database systems make combine the main data types into composite types to store more complex information. For example, a “Blob” which is used to store documents, images, videos and music files is just a collection of bytes. And the bytes just represent integer values.
Four Data Categories
No matter what you are storing, it can be categorized as “nominal”, “ordinal”, “discrete” or “continuous”:
- Nominal – labels (like gender, country etc.) that do not have a pre-set order.
- Ordinal – categories (like education level, income level etc.) that internally have a natural order.
- Discrete – Something that can be counted (like a countries population).
- Continuous - something that can take any value like the temperature.
By understanding what category data falls into, you can find ways to store it more efficiently in your database. For example, in an ordinal type, each item in the set can be assigned a number. For example, if you were tracking income (either low, medium or high), you could assign the number 1 to low, the number 2 to medium and the number 3 to high. Then when you needed to store a low-income type, you would just save the number 1, instead of the string “low”. Since a number can be stored as primitive data type, it takes less space to store then a string of characters.
Storing Dates in a Computer
The storage of dates and times in a computer can present some challenges. There are different time zones, and time can be broken down into an infinite number of discrete units. One approach might be to simply store a date as a string of characters. But this doesn’t always work very well as a computer doesn’t know how to perform mathematical functions with strings. If you wanted your computer to figure out how many days were left till the new year, it would have no way of calculating if it only knew today’s date as a string.
Accordingly, a computer typically stores a date and time as the number of seconds that have elapsed since a known time, in a particular time zone. For example, in an SQL database a date is stored as seconds that have elapsed since 1900/01/01.
Encoding Images
You might be wondering how a computer can store a picture as a string of 1’s and 0’s. The process is quite straightforward. Each pixel on your screen is translated to a number, depending on its color, hue and brightness. These numeric values that make up the picture are then saved together in a file. When the image is retrieved the numeric values are translated back to a color that your monitor can display.
As you can imagine this list of numbers could get quite large for bigger pictures. This is where image formats like JPG, PNG and GIF come into the picture. These image formats are simply mechanisms for compressing the list of numbers, by detecting patterns that take less space to save then the original picture. This is especially important these days, when pictures are transmitted are through the internet. Inside a database an image can be stored as a “blob”, a collection of bytes representing numbers.
Encoding Music
The encoding of music happens much the same way as the encoding of images, except that audio signals are sliced to discrete sound bites that are translated to a number representing the sound. The different slices are combined together to from an audio file. Again, the audio file can be quite large depending on the number of slices (or sampling rate). Accordingly, file formats like MP3 were created that compress audio files by detecting patterns in the original file that can be represented in a shorter sequence. When you are ready to listen to the music, your computer takes the numbers in the sound file and triggers a corresponding sound in your speaker or headphones. If you are storing a sound file in your database, again it will be stored as a blob.
Ever wonder why vinyl records are making a comeback? It’s because a record player doesn’t need to convert sound to and from numbers. Instead, a record player can read the grooves on the record and send it directly to your speaker, without any translation, making it a “loss less” format.
Encoding Videos
Storage of videos works much the same way as storage of music and images, except that a video is a series of still images with corresponding sound. Compression formats like “MP4” for invented to store videos efficiently. Videos can also be stored as “blobs” in your computer’s database.
Tracker Ten Database
Our Tracker Ten database supports various data types including numbers, strings, integers, money fields, dates, and more. Selection of the data type is easy through a visual user interface.