Disk Imaging

Disk images, in computing, are computer files containing the contents and structure of a disk volume or an entire data storage device, such as a hard disk drive, tape drive, floppy disk, optical disc or USB flash drive. A disk image is usually made by creating a sector-by-sector copy of the source medium, thereby perfectly replicating the structure and contents of a storage device independent of the file system. Depending on the disk image format, a disk image may span one or more computer files.
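The sector-by-sector copy described above can be sketched in a few lines of Python. This is a minimal illustration, not a production imaging tool: the 512-byte sector size, the function name create_image and the sectors_per_read parameter are assumptions made for the example, and the source can be any readable path (an ordinary file here, or a device node on systems that expose one).

```python
SECTOR_SIZE = 512  # assumed sector size; many modern drives use 4096


def create_image(source_path, image_path, sectors_per_read=2048):
    """Copy source_path to image_path in sector-aligned chunks.

    Returns the number of bytes copied. The copy ignores any file system
    on the source: it simply reads raw bytes in order.
    """
    chunk_size = SECTOR_SIZE * sectors_per_read
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of the source medium
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Because the loop never interprets the data, the resulting file is byte-for-byte identical to the source regardless of which file system the source holds.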

Disk image file formats may be open standards, such as the ISO image format for optical disc images, or proprietary to particular software applications.

As disk images contain the contents of entire disks, they can be very large. Some disk imaging utilities are filesystem-aware and can omit unused space on the source media, or compress the image to reduce storage requirements.
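As a rough illustration of the compression option, the sketch below compresses an existing raw image with gzip from the Python standard library. The function names are hypothetical, and gzip is only one possible choice; real imaging utilities may use other algorithms or skip unused blocks entirely.

```python
import gzip
import shutil


def compress_image(image_path, compressed_path):
    """Write a gzip-compressed copy of a raw disk image."""
    with open(image_path, "rb") as src, gzip.open(compressed_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def decompress_image(compressed_path, image_path):
    """Restore the raw disk image from its gzip-compressed copy."""
    with gzip.open(compressed_path, "rb") as src, open(image_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

Images of lightly used disks tend to compress well, because unused regions often consist of long runs of zero bytes.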

History

Disk images were originally (in the late 1960s) used for backup and disk cloning of mainframe disk media. The early disks held as little as 5 megabytes and as much as 330 megabytes, and the copy medium was magnetic tape, which held as much as 200 megabytes per reel. Disk images became much more popular with the spread of floppy disk media, where replication or storage of an exact structure was necessary and efficient, especially in the case of copy-protected floppy disks.

Uses

Disk images are used heavily for duplication of optical media, including DVDs and Blu-ray discs. They are also used to make perfect clones of hard disks.

A virtual disk may emulate any type of physical drive, such as a hard disk drive, tape drive, key drive, floppy drive, CD/DVD/BD/HD DVD or a network share, among others. Since it is not physical, it requires a virtual reader device matched to it (see below). An emulated drive is typically created either in RAM for fast read/write access (known as a RAM disk), or on a hard drive. Typical uses of virtual drives include the mounting of disk images of CDs and DVDs, and the mounting of virtual hard disks for the purpose of on-the-fly disk encryption ("OTFE").

Some operating systems, such as Linux and Mac OS X, have virtual drive functionality built in (such as the loop device), while others, such as older versions of Microsoft Windows, require additional software. Windows 8 and later can mount ISO and VHD images natively.

Virtual drives are typically read-only, being used to mount existing disk images which are not modifiable by the drive. However, some software provides virtual CD/DVD drives which can produce new disk images; this type of virtual drive goes by a variety of names, including "virtual burner".

Enhancement

Using disk images in a virtual drive allows users to shift data between technologies, for example from a CD optical drive to a hard disk drive. This may provide advantages in speed and noise (hard disk drives are typically four or five times faster than optical drives, and also quieter). In addition, it may reduce power consumption, since it may allow just one device (a hard disk) to be used instead of two (hard disk plus optical drive).

Virtual drives may also be used as part of emulation of an entire machine (a virtual machine).

Software distribution

Since the spread of broadband, CD and DVD images have become a common medium for Linux distributions. Applications for Mac OS X are often delivered online as an Apple Disk Image containing a file system that includes the application, documentation for the application, and so on. Online data and bootable recovery CD images are provided for customers of certain commercial software companies.

Disk images may also be used to distribute software across a company network, or for portability (many CD/DVD images can be stored on a hard disk drive). Several types of deployment software allow applications to be distributed to large numbers of networked machines with little or no disruption to the user. Some can even be scheduled to update only at night so that machines are not disturbed during business hours. These technologies reduce end-user impact and greatly reduce the time and manpower needed to ensure a secure corporate environment. Efficiency is also increased because there is much less opportunity for human error. Disk images may also be needed to transfer software to machines without a compatible physical disk drive.

For computers running Mac OS X, disk images are the most common file type used for software downloads, typically downloaded with a web browser. The images are typically compressed Apple Disk Image (.dmg suffix) files. They are usually opened by directly mounting them without using a real disk. The advantage compared with some other technologies, such as Zip and RAR archives, is that they do not need redundant drive space for the unarchived data.

Software packages for Windows are also sometimes distributed as disk images, including ISO images. While Windows versions prior to Windows 7 do not natively support mounting disk images to the file system, several software options are available to do this; see comparison of disc image software.

Security

Virtual hard disks are often used in on the fly disk encryption ("OTFE") software such as FreeOTFE and TrueCrypt, where an encrypted "image" of a disk is stored on the computer. When the disk's password is entered, the disk image is "mounted", and made available as a new volume on the computer. Files written to this virtual drive are written to the encrypted image, and never stored in cleartext.

The process of making a computer disk available for use is called "mounting"; the process of removing it is called "dismounting" or "unmounting". The same terms are used for making an encrypted disk available or unavailable.

Virtualization

A hard disk image is interpreted by a Virtual Machine Monitor as a system hard disk drive. IT administrators and software developers administer them through offline operations using built-in or third-party tools. In terms of naming, a hard disk image for a certain Virtual Machine Monitor has a specific file type extension, e.g., .vmdk for VMware VMDK, .vhd for Xen and Microsoft Hyper-V, .vdi for Oracle VM VirtualBox, etc.

Hard drive imaging is used in several major application areas:

  • Forensic imaging or acquisition is the process where the entire drive contents are imaged to a file and checksum values are calculated to verify the integrity (in court cases) of the image file (often referred to as a "hash value"). Forensic images are acquired with the use of software tools. (Some hardware cloning tools have added forensic functionality.)
  • Drive cloning, as previously mentioned, is typically used to replicate the contents of the hard drive for use in another system. Software-only programs usually suffice, because only the file structure and the files themselves need to be cloned.
  • Data recovery imaging (like forensic imaging) is the process of imaging every single sector on the source drive to another medium from which required files can be retrieved. In data recovery situations, one cannot rely on the integrity of the file structure and therefore a complete sector copy is mandatory (also similar to forensic imaging). The similarities to forensic imaging end there though. Forensic images are typically acquired using software tools such as EnCase and FTK. However, forensic imaging software tools have significantly limited ability to deal with drives that have hard errors (which is often the case in data recovery and why the drive was submitted for recovery in the first place).
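The checksum verification mentioned under forensic imaging can be sketched as follows. This is a minimal example under stated assumptions: SHA-256 is chosen here for illustration (the text does not name a hash algorithm), and the function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(source_path, image_path):
    """True if the image has the same hash value as the source."""
    return sha256_of(source_path) == sha256_of(image_path)
```

Recording the hash value at acquisition time allows anyone to recompute it later and confirm that the image file has not changed.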

Data recovery imaging must be able to pre-configure drives by disabling certain attributes (such as SMART and G-List re-mapping) and to work with unstable drives (drive instability or read instability can be caused by minute mechanical wear and other issues). It must also be able to read data from "bad sectors." Read instability is a major factor when working with drives in operating systems such as Windows. A typical operating system is limited in its ability to deal with drives that take a long time to read. For these reasons, software that relies on the BIOS and operating system to communicate with the hard drive is often unsuccessful in data recovery imaging; separate hardware control of the source hard drive is required to achieve the full spectrum of data recovery imaging. This is because the operating system (through the BIOS) has a certain set of protocols or rules for communication with the drive that cannot be violated (such as when the hard drive detects a bad sector). A hard drive's protocols may not allow "bad" data to be propagated through to the operating system; firmware on the drive may compensate by rereading sectors until checksums, CRCs, or ECCs pass, or use ECC data to recreate damaged data.

Data recovery images may or may not make use of any type of image file. Typically, a data recovery image is performed drive to drive and therefore no image file is required.
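The basic idea of imaging past unreadable sectors can be sketched as below. This only illustrates the skip-and-fill bookkeeping; as the text notes, software working through the operating system is often insufficient for real recovery work. The read_sector callable is a hypothetical stand-in for whatever mechanism actually reads the drive, and the 512-byte sector size is an assumption.

```python
SECTOR_SIZE = 512  # assumed sector size


def image_with_bad_sectors(read_sector, total_sectors, out_file):
    """Image a drive sector by sector, zero-filling unreadable sectors.

    read_sector is a hypothetical callable: read_sector(index) returns
    SECTOR_SIZE bytes, or raises OSError if the sector cannot be read.
    Returns the list of sector indices that failed, so that they can be
    retried or reported later.
    """
    bad = []
    for index in range(total_sectors):
        try:
            data = read_sector(index)
        except OSError:
            # A zero-filled placeholder keeps all later sectors at the
            # correct offset in the output.
            data = bytes(SECTOR_SIZE)
            bad.append(index)
        out_file.write(data)
    return bad
```

Keeping the list of failed sectors separate from the image makes it possible to revisit only those sectors in a later pass.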

There are two schemes predominant across all Virtual Machine Monitor implementations:

  1. Preallocate the entire storage for the virtual disk upon creation
  2. Dynamically grow the storage on demand

Under the preallocated scheme, the virtual disk is implemented either as a collection of flat files, typically 2 GB each and collectively called a split flat file, or as a single large monolithic flat file. The preallocated storage scheme is also referred to as thick provisioning.

Under the dynamic growth scheme, the virtual disk can again be implemented using split or monolithic files, except that storage is allocated on demand. Several Virtual Machine Monitor implementations initialize the storage with zeros before providing it to the virtual machine that is in operation. The dynamic growth storage scheme is also referred to as thin provisioning.
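The difference between the two schemes can be approximated with ordinary files. This is an analogy under stated assumptions rather than a description of any Virtual Machine Monitor's format: the thick variant writes every byte up front, while the thin variant relies on the host file system supporting sparse files, which is common but not guaranteed. The function names are hypothetical.

```python
def create_thick(path, size, block_size=1024 * 1024):
    """Preallocate: write zeros for the full size of the virtual disk."""
    zeros = bytes(block_size)
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(block_size, remaining)
            f.write(zeros[:n])
            remaining -= n


def create_thin(path, size):
    """Grow on demand: set the logical size without writing any data.

    On file systems with sparse file support, storage is only consumed
    when blocks are actually written later.
    """
    with open(path, "wb") as f:
        f.truncate(size)
```

Both files report the same logical size. On Unix-like systems, comparing os.stat(path).st_blocks for the two files will typically show that the thin file occupies far fewer blocks on disk.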

There are two modes in which a raw disk can be mapped for use by a virtual machine:

  • Virtual mode: The mapped disk is presented as if it is a logical volume, or a virtual disk file, to the guest operating system, and its real hardware characteristics are hidden. In this mode, file locking provides data protection through isolation for concurrent updates; the copy-on-write operation enables snapshots. Virtual mode also offers portability across storage hardware because it presents the same behavior as a virtual disk file.
  • Physical mode: In this mode, also called pass-through mode, the Virtual Machine Monitor bypasses the I/O virtualization layer and passes all I/O commands directly to the device. All physical characteristics of the underlying hardware are exposed to the guest operating system. There is no file locking to provide data protection.
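The copy-on-write operation that enables snapshots can be illustrated with a small in-memory model. This is a sketch only: the class name, the block size and the use of a dictionary as the overlay are assumptions for the example and do not reflect any specific Virtual Machine Monitor.

```python
class CopyOnWriteDisk:
    """A read-only base image plus an overlay holding modified blocks."""

    def __init__(self, base, block_size=512):
        self._base = base              # snapshot data, never modified
        self._block_size = block_size
        self._overlay = {}             # block index -> new block contents

    def read_block(self, index):
        # Prefer the overlay; fall back to the unmodified base image.
        if index in self._overlay:
            return self._overlay[index]
        start = index * self._block_size
        return self._base[start:start + self._block_size]

    def write_block(self, index, data):
        if len(data) != self._block_size:
            raise ValueError("data must be exactly one block long")
        # Writes go to the overlay only, so the base stays intact.
        self._overlay[index] = bytes(data)
```

Discarding the overlay reverts the disk to the snapshot, while the base image is never touched by writes.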

System backup

See also: System image and Backup and Restore (for Windows Vista and later)

Some backup programs only back up user files; boot information and files locked by the operating system, such as those in use at the time of the backup, may not be saved on some operating systems. A disk image contains all files, faithfully replicating all data. For this reason, it is also used for backing up CDs and DVDs.

Files that do not belong to installed programs can usually be backed up with file-based backup software. This is preferred because file-based backup usually saves time and space: it never copies unused space (as a bit-identical image does), it is usually capable of incremental backups, and it is generally more flexible. For files of installed programs, however, file-based backup solutions may fail to reproduce all necessary characteristics, particularly on Windows systems. For example, certain Windows registry keys use short filenames, which are sometimes not reproduced by file-based backup; some commercial software uses copy protection that causes problems if a file is moved to a different disk sector; and file-based backups do not always reproduce metadata such as security attributes. Creating a bit-identical disk image is one way to ensure that the system backup is exactly the same as the original. Bit-identical images can be made in Linux with dd, available on nearly all live CDs.

Most commercial imaging software is "user-friendly" and "automatic" but may not create bit-identical images. These programs have most of the same advantages, except that they may allow restoring to partitions of a different size or file-allocation size, and thus may not put files on the same exact sector. Additionally, if they do not support Windows Vista, they may slightly move or realign partitions and thus make Vista unbootable (see Windows Vista startup process).

Rapid deployment of clone systems

Large enterprises often need to buy or replace computer systems in large numbers. Installing the operating system and programs on each of them one by one requires a lot of time and effort and has a significant possibility of human error. Therefore, system administrators use disk imaging to quickly clone the fully prepared software environment of a reference system. This method saves time and effort and allows administrators to focus on the unique distinctions that each system must bear.

There are several types of disk imaging software that use single-instancing technology to reduce the time, bandwidth, and storage required to capture and archive disk images. This makes it possible to rebuild and transfer information-rich disk images quickly, a significant improvement over manually configuring each machine within an organization, which could take hours.
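Single instancing can be sketched as block-level deduplication: each distinct block is stored once and every image is kept as a list of block references. The block size, the use of SHA-256 as the block key and the function names below are assumptions for illustration; commercial products may work at the file level or use different techniques.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed deduplication granularity


def store_image(image, store):
    """Add an image (bytes) to the shared store; return its block recipe."""
    recipe = []
    for offset in range(0, len(image), BLOCK_SIZE):
        block = image[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy
        recipe.append(digest)
    return recipe


def rebuild_image(recipe, store):
    """Reassemble an image from its recipe and the shared store."""
    return b"".join(store[digest] for digest in recipe)
```

Two images that share most of their content, such as clones of the same reference system, then add only their differing blocks to the store.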

Legacy hardware emulation

Emulators frequently use disk images to simulate the floppy drive of the computer being emulated. This is usually simpler to program than accessing a real floppy drive (particularly if the disks are in a format not supported by the host operating system), and allows a large library of software to be managed.
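A sketch of how an emulator might serve a floppy sector request from a raw image follows. The geometry constants are those of a standard 1.44 MB 3.5-inch floppy (80 cylinders, 2 heads, 18 sectors per track, 512-byte sectors); the function names are hypothetical, and the image is assumed to be a plain raw dump with sectors stored in logical order.

```python
CYLINDERS = 80
HEADS = 2
SECTORS_PER_TRACK = 18
SECTOR_SIZE = 512


def chs_to_lba(cylinder, head, sector):
    """Convert a cylinder/head/sector address to a linear sector number.

    Sector numbers start at 1, following the usual floppy convention.
    """
    if not (0 <= cylinder < CYLINDERS
            and 0 <= head < HEADS
            and 1 <= sector <= SECTORS_PER_TRACK):
        raise ValueError("address outside the disk geometry")
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


def read_sector(image, cylinder, head, sector):
    """Return one sector from a raw floppy image held as bytes."""
    offset = chs_to_lba(cylinder, head, sector) * SECTOR_SIZE
    return image[offset:offset + SECTOR_SIZE]
```

Since the image is just an array of bytes, the emulator needs no access to floppy hardware or to a host file system driver for the guest's disk format.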

Copy protection circumvention

A mini image is an optical disc image file in a format that fakes the disk's content to bypass CD/DVD copy protection.

Because full images are the full size of the original disc, mini images are stored instead. Mini images are small, on the order of kilobytes, and contain just the information necessary to bypass CD checks. The mini image is therefore a form of No-CD crack, used with both pirated games and legally backed-up games. Mini images do not contain the real data from an image file, just the code needed to satisfy the CD check. They cannot provide CD- or DVD-backed data, such as on-disc image or video files, to the computer program.

Creation

Creating a disk image is achieved with a suitable program. Different disk imaging programs have varying capabilities, and may focus on hard drive imaging (including hard drive backup, restore and rollout), or optical media imaging (CD/DVD images).

A virtual disk writer or virtual burner is a computer program that emulates an actual disc authoring device such as a CD writer or DVD writer. Instead of writing data to an actual disc, it creates a virtual disk image. A virtual burner, by definition, appears as a disc drive in the system with writing capabilities (as opposed to conventional disc authoring programs that can create virtual disk images), thus allowing software that can burn discs to create virtual discs.

File formats

In most cases, a file format is tied to a particular software package. The software defines and uses its own, often proprietary, image format. Some formats, however, are open standards, such as the ISO image format for optical discs, and are supported by nearly all optical disc software packages.

Utilities

RawWrite and WinImage are examples of floppy disk image file writers/creators for MS-DOS and Microsoft Windows. They can be used to create raw image files from a floppy disk, and to write such image files to a floppy.

In Unix or similar systems the dd program can be used to create disk images, or to write them to a particular disk. It is also possible to mount and access them at block level using a loop device.

Apple Disk Copy can be used on Mac OS systems to create and write disk image files.

Authoring software for CDs/DVDs such as Nero Burning ROM can generate and load disk images for optical media.

Source: Wikipedia, Google