Deleting Data
Deleting data is not as simple as it should be, and there are many pitfalls to avoid. Usually, when you instruct an operating system to delete a file, it does NOT delete the contents of the file. It only makes them inaccessible to you and marks the space as available for other files. This is done to save the time it would take to erase the data and to reduce wear on the storage device. Suitable programs can recover data that the user believes has been deleted. If you have sensitive data, this should be regarded as a major security flaw, and you will need to take additional steps to really delete the data.
Early days
In the early days of PCs data was stored on magnetic media:
- magnetic tapes (the original PC came with a cassette interface port)
- floppy disks
- hard disks
Erasing a tape was as simple as recording over it, but disks required more care. Deleting a file on a disk does not delete the data within the file. It only removes the record of the filename and its location on the disk, and makes the space it occupied available for other files to use. With the right software the contents of the file can still be read. This is also true of files that have first been moved to the recycle bin and then deleted from the recycle bin. To destroy the data, the contents of the file should be overwritten before the file is deleted.
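Overwriting a file before deleting it can be sketched in a few lines of Python. This is a minimal sketch, not a definitive tool: the function names `overwrite_file` and `overwrite_and_delete` are made up for illustration, and it assumes the file system writes the new bytes over the same disk locations as the old ones, which was generally true of the simple magnetic disks described here but cannot be relied on with the newer technologies covered in the following sections.

```python
import os


def overwrite_file(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite the contents of an existing file with random bytes.

    Assumption: the file system updates the file in place, so the new
    bytes land on the same disk locations as the old ones.
    """
    size = os.path.getsize(path)
    # "r+b" opens the existing file for update without truncating it.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            # Ask the operating system to push its cached data to the device.
            os.fsync(f.fileno())


def overwrite_and_delete(path, passes=1):
    """Overwrite the file's contents first, then delete the file."""
    overwrite_file(path, passes=passes)
    os.remove(path)
```

Calling `overwrite_and_delete("notes.txt")` (a hypothetical filename) replaces every byte of the file with random data and only then removes the file, so the space released to the operating system no longer holds the original contents.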
Modern times
Modern operating systems, solid state drives and bad block handling have made it much harder to delete a file reliably. Overwriting files is no longer sufficient, as the following sections explain.
Swap file
Many modern operating systems use a swap file to allow more programs to run than the PC has memory for. If a new program or file needs to be loaded into memory and there is insufficient memory, the contents of some memory are copied to a special file on the disk, called the swap file, to free up space. When these memory contents are needed again, they are read back from the swap file. This means that a file that has been read into memory may have been written out to the swap file at some point. The contents of the file may still be present in the swap file even after the file has been deleted and the PC has been turned off. The contents captured in the swap file will be those at the time they were swapped out of memory. Later changes to the file, e.g. deliberately overwriting its contents, will not necessarily change the version in the swap file. Either the contents of the swap file need to be overwritten as well, or the swap file should be disabled while working on sensitive files.
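Before working on sensitive files it helps to know whether swapping is active at all. The following is a rough sketch for Linux only, where the kernel lists active swap areas in the text file `/proc/swaps`; other operating systems expose this information differently. The function names and dictionary keys are made up for illustration.

```python
def parse_swaps(text):
    """Parse text in the layout of Linux's /proc/swaps.

    The first line is a header; each later line holds
    Filename, Type, Size, Used and Priority (sizes in KiB).
    """
    areas = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 5:
            continue  # ignore blank or malformed lines
        name, kind, size, used, priority = fields[:5]
        areas.append({
            "name": name,
            "type": kind,
            "size_kib": int(size),
            "used_kib": int(used),
            "priority": int(priority),
        })
    return areas


def active_swap_areas(path="/proc/swaps"):
    """Return the active swap areas, or an empty list if the file is missing."""
    try:
        with open(path) as f:
            return parse_swaps(f.read())
    except OSError:
        return []  # not Linux, or /proc is not available
```

An empty list from `active_swap_areas()` on a Linux machine suggests that no swap file or swap partition is currently in use. A non-empty list, particularly one with a non-zero `used_kib`, means memory contents may already have been written to disk.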
Solid state storage
There are now many types of solid state storage devices commonly used with PCs e.g.
- solid state drives
- memory cards
- memory sticks
On many solid state storage devices, each region can only be rewritten a limited number of times before it starts to give read/write errors. To delay the onset of these errors, a strategy called wear levelling is usually employed. When the contents of a file are changed, rather than updating the region that the data is currently stored in, the region that has been written to least is used instead. The details of the file's location on the device are updated to reflect the change. This means the contents of the earlier version of the file still exist in the original location. Overwriting all the unused space on the device may be necessary to ensure the earlier contents of the file have been overwritten. This is especially a risk with some modern laptops and notepads where all the storage is solid state.
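The effect of wear levelling can be illustrated with a toy model. This is a deliberately simplified sketch and not how any real controller works: the class `ToyFlash`, its block layout and its "least written free block" rule are assumptions made for illustration only.

```python
class ToyFlash:
    """Toy model of a wear-levelled device (illustrative only)."""

    def __init__(self, num_blocks):
        self.physical = [None] * num_blocks  # raw contents of each physical block
        self.writes = [0] * num_blocks       # how often each block has been written
        self.mapping = {}                    # logical block -> physical block

    def write(self, logical, data):
        # Blocks currently holding live data are not candidates, so an
        # update never lands on the block that holds the old version.
        in_use = set(self.mapping.values())
        free = [i for i in range(len(self.physical)) if i not in in_use]
        # Wear levelling: pick the free block that has been written to least.
        target = min(free, key=lambda i: self.writes[i])
        self.physical[target] = data
        self.writes[target] += 1
        # Only the mapping changes; the old physical block is not erased.
        self.mapping[logical] = target

    def read(self, logical):
        return self.physical[self.mapping[logical]]


flash = ToyFlash(num_blocks=4)
flash.write(0, b"secret")      # original contents of the file
flash.write(0, b"XXXXXX")      # the user "overwrites" the file
stale_copy_survives = b"secret" in flash.physical

# Filling the remaining logical space forces every physical block to be reused.
for logical in (1, 2, 3):
    flash.write(logical, b"filler")
stale_copy_after_fill = b"secret" in flash.physical
```

In this model the overwritten file reads back as `XXXXXX`, yet the raw storage still holds `secret` until the rest of the device has been filled. Real devices usually also keep spare regions that no file can reach, so even filling the device is not a guarantee.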
Bad blocks
Both magnetic drives and solid state storage devices can develop regions where data can no longer be reliably read from or written to. These are often referred to as bad blocks or bad sectors. When these are detected by the operating system or the device itself, they are marked as bad and will no longer be used. Even though reading data from bad blocks may be unreliable, specialist programs can still read the data, most of which will be correct. These blocks are invisible to most programs and the operating system, making it very difficult for users to delete the data in them.
Formatting devices
Devices are often formatted to delete all the files on them, but formatting a device typically only makes the data on it inaccessible to normal programs and gives the false impression that the device has no data on it. Formatting should not be relied on as a method of deleting data. A better practice would be to:
- delete all the files
- fill the device with a file that holds random data
- reformat the device
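The second step, filling the device with a file of random data, could be sketched as follows. This is a minimal sketch under stated assumptions: the function name `fill_with_random_file`, the file name `random_fill.tmp` and the `max_bytes` safety cap are all made up for illustration, and `directory` is assumed to be a folder on the device being wiped.

```python
import os


def fill_with_random_file(directory, chunk_size=1024 * 1024, max_bytes=None):
    """Write random data to one file in `directory` until the device is
    full (or `max_bytes` is reached), then remove the file.

    Returns the number of bytes written.
    """
    path = os.path.join(directory, "random_fill.tmp")  # hypothetical name
    written = 0
    try:
        # buffering=0 sends each write straight to the operating system,
        # so a full device is reported by the write call itself.
        with open(path, "wb", buffering=0) as f:
            while max_bytes is None or written < max_bytes:
                if max_bytes is None:
                    n = chunk_size
                else:
                    n = min(chunk_size, max_bytes - written)
                try:
                    written += f.write(os.urandom(n))
                except OSError:
                    break  # typically "no space left on device"
            try:
                os.fsync(f.fileno())  # push cached data to the device
            except OSError:
                pass
    finally:
        if os.path.exists(path):
            os.remove(path)
    return written
```

With `max_bytes` left as `None` the function keeps writing until the device reports that it is full, which can take a long time on large devices. Some file systems limit the size of a single file (FAT32, for example, stops at 4 GiB), in which case one file will not fill the device and several would be needed.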
CD-Rs
CD-Rs can be written to multiple times (if multi-session mode is used), but each time one is written to, only previously unused regions can be used. Eventually the CD will be filled and no further writes can be made. It may seem that a file has been deleted or the contents of the file changed, but the contents of every version of the file remain on the CD. The only way to delete the data is to destroy the CD.
Conclusions
If you are working with sensitive data remember:
- The operating system lies
- Replace the contents of files before deleting them
- Replace the contents of a whole device before reformatting it
- Destroy CD-Rs
- If you are going to update sensitive files, store them on magnetic media instead of solid state
- Disable swap files
- The only guaranteed way to delete data is to destroy the drive it is on
- Don't save sensitive files on a drive you are not prepared to destroy