• Data is stored in a computer system to be accessed by the processor.
• It is a process that allows the computer system to retain information temporarily or permanently.
• This data is usually stored in optical or electromagnetic form.
Types of Data Storage:
• There are two types of data storage, i.e. primary and secondary.
• Primary storage retains data in RAM (Random Access Memory), ROM (Read Only Memory), or L1 & L2 cache.
• Secondary storage stores data on hard disks, RAID (Redundant Array of Independent Disks) systems, Zip drives, etc.
• Primary storage is faster to access whereas secondary storage can store more data.
• Primary storage is also known as Main Storage whereas Secondary Storage is also known as Auxiliary Storage.
File Compression:
• File compression is a process that allows you to package a single file or multiple files so that they use less disk space.
• There are two types of file compression:
1. Lossy file compression
2. Lossless file compression
Lossless File Compression:
• This file compression allows the original file to be reconstructed when uncompressed.
• It is best for file formats where data loss can damage the information. E.g. account statements, attendance spreadsheets, etc.
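The lossless round trip described above can be sketched with Python's standard `zlib` module. The sample data is a made-up attendance sheet, used only for illustration; the point is that the decompressed bytes are identical to the original.

```python
import zlib

# Hypothetical sample data: a small attendance sheet with many repeated rows.
original = b"name,present\n" + b"student,yes\n" * 200

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # reconstruct the original bytes

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after decompression:", restored == original)
```

Because nothing is discarded, this kind of compression is safe for data such as statements and spreadsheets.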
Lossy File Compression:
• In contrast to lossless compression, lossy compression removes unnecessary data to compress the files.
• The original file cannot be reconstructed.
• It is used where quality degradation does not harm the information, e.g. MP3 and JPEG.
• In computer systems, there are various types of file formats. The following are the ones we will discuss in detail:
- MP3
- MP4
- JPEG
- MIDI
- Text and numbers
MP3:
• MP3 is a technology that compresses music files.
• It is a form of audio compression.
• It reduces the size of a typical music file by about 90%.
• E.g. a 100 MB music file can be converted into an MP3 file of about 10 MB.
• These types of files can be used in cellphones, computers or MP3 players.
• The music files are compressed using a technology known as ‘Perceptual Music Shaping’.
• This technology removes the sounds that the human ear cannot hear, so the file is compressed by removing parts of the music without noticeably affecting its overall quality.
• It uses a Lossy Format for compression.
MP4:
• The MP4 format, in contrast to the MP3 format, allows the storage of not only music but also videos, animation, photos, etc.
• Using this format, videos can be streamed over the internet without much loss of quality.
JPEG:
• JPEG stands for Joint Photographic Experts Group.
• JPEG is an image file format that reduces the image resolution, i.e. the pixels per centimeter, to store the image file in less space.
• When the image file is compressed, its size is reduced at the cost of quality.
• Since JPEG reduces the file size by losing quality, it is also an example of lossy compression.
• The original quality cannot be reconstructed once the file is compressed.
MIDI (Musical Instrument Digital Interface):
• It is a standard that allows sound to be represented in binary format.
• It stores the sound description, not the sound itself.
• It stores a series of control messages containing sound events e.g. pitch, volume, and duration.
• When these control messages are received by a MIDI-compatible device, the messages are interpreted and the sound is reproduced.
• The MIDI data can also be compressed however it does not need any special compression algorithm.
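As a rough illustration of storing a description of the sound rather than the sound itself, the sketch below builds two MIDI channel messages as raw bytes. The helper names `note_on` and `note_off` are hypothetical; the byte layout (a status byte followed by a note number and a velocity) follows the MIDI 1.0 channel message format.

```python
NOTE_ON = 0x90   # status byte for "note on"; the low 4 bits carry the channel
NOTE_OFF = 0x80  # status byte for "note off"

def note_on(channel, note, velocity):
    """Build a 3-byte note-on message (hypothetical helper)."""
    return bytes([NOTE_ON | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

def note_off(channel, note):
    """Build a 3-byte note-off message (hypothetical helper)."""
    return bytes([NOTE_OFF | (channel & 0x0F), note & 0x7F, 0])

# Middle C (note number 60) on the first channel, played fairly loudly.
start = note_on(0, 60, 100)
stop = note_off(0, 60)

print(start.hex())  # 903c64
print(stop.hex())   # 803c00
```

Each event takes only three bytes, which is why MIDI files are far smaller than recordings of the same music.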
Text & Numbers:
• Text and numbers can be stored in various formats.
• Typically, the text is stored in ASCII.
• However, numbers can be stored in different number formats. E.g. real numbers, date, time, integers, currency, etc.
• Files containing numbers use lossless compression since this type of data cannot be compromised.
• Text can also be compressed, using algorithms that exploit the redundancy in the text.
• The compression of text is also lossless.
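One simple way to see how redundancy can be exploited is run-length encoding, which replaces a run of repeated characters with the character and a count. This is only an illustrative technique chosen for its simplicity, not the algorithm that real text compressors use; the function names are hypothetical.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of repeated characters into a (character, count) pair."""
    return [(ch, len(list(group))) for ch, group in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original text from (character, count) pairs."""
    return "".join(ch * count for ch, count in pairs)

sample = "AAAABBBCCD"
encoded = rle_encode(sample)

print(encoded)                        # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded) == sample)  # True, so the scheme is lossless
```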
Error Checking Methods
• When you transmit data, there is always a risk of data corruption, e.g. caused by faults in communication equipment, noise, etc.
• In compressed data, the risk of loss of information increases since redundancy has already been reduced to a minimum to reduce the file size.
• Therefore, error control measures are taken to make sure the data that is transferred through communication channels is error-free.
• These error control measures usually contain error detection and correction.
• Error detection detects the errors in the data or message while error correction is the process of reconstruction of the original data.
Error Detection & Correction Methods:
1. Parity Check
2. Checksum
3. Check Digit
4. Automatic Repeat Request (ARQ)
1. Parity Check:
• In this error detection method, a parity bit is added to the original message.
• Systems that use even parity count the occurrences of 1s; they add a 0 parity bit if the count is already even and a 1 parity bit if it is odd, so that the total number of 1s becomes even.
• In an odd parity system, the number of occurrences of 1s needs to be odd, including the parity bit.
Consider the 7 bits 1101100, to which a parity bit is added to complete the byte.
• If this byte is using an even parity system, then the parity bit needs to be ‘0’ since the number of occurrences of 1s is already even.
• However, if it is using an odd parity system then the parity bit needs to be ‘1’ to make the number of occurrences of 1s odd.
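The rule for choosing the parity bit can be sketched as a small function. The name `parity_bit` and the use of bit strings are assumptions made for illustration.

```python
def parity_bit(bits, even=True):
    """Return the parity bit ('0' or '1') for a string of data bits."""
    ones_are_even = bits.count("1") % 2 == 0
    if even:
        # Even parity: add a 1 only when the count of 1s is currently odd.
        return "0" if ones_are_even else "1"
    # Odd parity: add a 1 only when the count of 1s is currently even.
    return "1" if ones_are_even else "0"

data = "1101100"  # four 1s, so the count is already even

print(parity_bit(data, even=True))   # 0
print(parity_bit(data, even=False))  # 1
```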
Now consider the following example bytes and identify the parity system used in each one of them.
• In this byte, the parity system used is odd since the number of occurrences of 1s is odd.
• In this byte, the parity system used is even.
Consider an example in which an even parity (vertical parity check) system is used to transmit 9 bytes of data. The following table shows the data at the receiving end.
• If this table is studied properly then it can be seen that:
• Row 8 has incorrect parity i.e. the number of occurrences of 1s is not even so the parity should have been 1.
• Column 5 also has an odd number of occurrences of 1s and the parity bit is wrong.
• This information reveals that an error has occurred at the intersection of column 5 and row 8.
• And byte 8 should have been this:
Shortcoming of Parity:
• If an even number of bits in a byte are changed during transmission, the parity stays the same and it is impossible to detect the error.
• Suppose, using an even parity system, the following byte has been sent:
• This byte could have been received like this:
• Or like this:
• In both situations, no error would have been detected since the number of occurrences of 1s has remained even.
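Both ideas, locating a single flipped bit from the failing row and column and missing a double flip within one byte, can be sketched on a small block. The block below is made up for illustration (it is not the table from the example above), each byte carries its even parity bit as the leftmost bit, and rows and columns are counted from 0.

```python
def has_even_parity(bits):
    """True when the bit string contains an even number of 1s."""
    return bits.count("1") % 2 == 0

def column_parity_byte(rows):
    """Build the parity byte that makes every column hold an even number of 1s."""
    width = len(rows[0])
    return "".join(
        "1" if sum(row[c] == "1" for row in rows) % 2 else "0"
        for c in range(width)
    )

def flip(bits, index):
    """Return the bit string with the bit at the given index inverted."""
    flipped = "0" if bits[index] == "1" else "1"
    return bits[:index] + flipped + bits[index + 1:]

def locate_single_error(block):
    """Return (row, column) of a single flipped bit, or None if not exactly one."""
    bad_rows = [r for r, row in enumerate(block) if not has_even_parity(row)]
    bad_cols = [
        c for c in range(len(block[0]))
        if not has_even_parity("".join(row[c] for row in block))
    ]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None

# Hypothetical block: three bytes with even parity, plus the parity byte.
sent = ["01101100", "10010110", "01110001"]
sent.append(column_parity_byte(sent))

# One flipped bit: the failing row and column pinpoint it.
received = list(sent)
received[1] = flip(received[1], 4)
print(locate_single_error(received))  # (1, 4)

# Two flipped bits in the same byte: its own parity check still passes.
double = flip(flip(sent[1], 4), 6)
print(has_even_parity(double), double != sent[1])  # True True
```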
2. Checksum:
• Checksum is an error detection method that sends an additional value with the original data.
• This additional value is known as the checksum.
• It is a modular arithmetic sum of the message with a fixed length, e.g. one byte.
• This sum can be negated by a 1s complement operation before sending the data stream or message to detect errors in the message.
• To understand how it works, assume the checksum is 1 byte in length, i.e. the maximum value is 2^8 - 1 = 255.
• If the sum of all the bytes transferred is less than or equal to 255, then the checksum is this sum.
• If the sum of all the bytes transferred is greater than 255 then checksum will be calculated using the following method.
Suppose the sum of the bytes is 1185.
• Since it is greater than 255, we will use the second method.
• First, divide 1185 by 256, i.e. 1185/256 = 4.629
• Round this value down to the nearest whole number, i.e. 4.629 rounds down to 4
• Multiply the rounded value by 256, i.e. 4 * 256 = 1024
• Calculate the difference, i.e. 1185 – 1024 = 161, which is the checksum
• When data is to be transmitted, its checksum is calculated and attached to the original message before the transmission.
• At the receiving end, the checksum of the received block is again calculated and compared with the transmitted checksum.
• If both checksums are the same, then the data is error-free.
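Dividing by 256, rounding down, multiplying back, and subtracting is the same as taking the remainder after division by 256. The sketch below assumes a one-byte checksum and uses hypothetical function names; for a byte sum of 1185 it gives 1185 mod 256 = 161.

```python
def checksum(data, modulus=256):
    """One-byte checksum: the sum of all bytes, reduced modulo 256."""
    return sum(data) % modulus

def verify(data, received_checksum):
    """Recalculate the checksum at the receiving end and compare."""
    return checksum(data) == received_checksum

print(1185 % 256)  # 161

message = bytes([200, 100])         # the bytes sum to 300
sent_checksum = checksum(message)   # 300 - 256 = 44
print(sent_checksum)                # 44

print(verify(message, sent_checksum))           # True
print(verify(bytes([200, 101]), sent_checksum)) # False, the data changed
```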
3. Check Digit:
• It is an error detection system in which an additional number is added to the series (e.g. account no. etc.) to check the accuracy.
• This number is usually derived from the original series of numbers.
• For example, consider a number 232, the sum of these three digits (2+3+2=7) can be added as the last digit to the original series i.e. 2327.
Consider the ISBN-10 number 0 - 2 0 1 - 5 3 0 8 2 - X, of the kind typically used on books, whose check digit X is calculated using the modulo 11 system.
• To calculate the value of X, first, we need to find the position of each digit, counting from 10 for the leftmost digit down to 1 for X.
• Multiply each digit by its position,
(0x10) + (2x9) + (0x8) + (1x7) + (5x6) + (3x5) + (0x4) + (8x3) + (2x2)
= 0 + 18 + 0 + 7 + 30 + 15 + 0 + 24 + 4
= 98
• Divide the total by 11,
98 / 11 = 8 remainder 10
• Subtract the remainder from 11,
11 – 10 = 1
• This value is your check digit and the final ISBN becomes,
0 - 2 0 1 - 5 3 0 8 2 - 1
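The modulo 11 calculation can be sketched as follows. The function name `isbn10_check_digit` is hypothetical; the weights 10 down to 2 match the positions used above, and in ISBN-10 a check value of 10 is written as the letter X.

```python
def isbn10_check_digit(first_nine):
    """Compute the ISBN-10 check digit from the first nine digits."""
    digits = [int(ch) for ch in first_nine if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected exactly nine digits")
    # Weight the digits 10, 9, ..., 2 from left to right and add them up.
    total = sum(d * w for d, w in zip(digits, range(10, 1, -1)))
    remainder = total % 11
    check = (11 - remainder) % 11  # a remainder of 0 gives a check digit of 0
    return "X" if check == 10 else str(check)

# Total is 98, remainder is 10, so the check digit is 11 - 10 = 1.
print(isbn10_check_digit("0-201-53082"))  # 1
```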
4. Automatic Repeat Request (ARQ):
• This error detection method uses acknowledgment and timeout.
• An acknowledgment is a message sent by the receiver to indicate that the data has been received correctly.
• A timeout is the period of time the sender waits for the acknowledgment after sending the data.
• If the acknowledgment is not received by the sender before the timeout, then the message will be sent again automatically.
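The resend-on-timeout behaviour can be sketched with a simulated channel. Everything here is hypothetical: `make_lossy_channel` stands in for a real link and simply reports whether an acknowledgment arrived before the timeout, and `max_attempts` is an assumed limit so that the sender eventually gives up.

```python
def make_lossy_channel(failures):
    """Simulated channel: the first `failures` sends get no acknowledgment in time."""
    remaining = [failures]

    def channel(message):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False  # timeout expired without an acknowledgment
        return True       # acknowledgment received before the timeout

    return channel

def send_with_arq(message, channel, max_attempts=5):
    """Resend the message until it is acknowledged; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        if channel(message):
            return attempt
    raise TimeoutError("no acknowledgment received")

# The first two sends time out, so the third attempt succeeds.
print(send_with_arq(b"hello", make_lossy_channel(2)))  # 3
```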