A computer file is a resource for storing information that is available to a computer program and is usually based on some form of durable data storage.
A file is durable in the sense that it remains available for programs to use after the current program has stopped running. Computer files can be considered the modern counterpart of paper documents, which were traditionally stored in office files; this is the source of the term.
At the lowest level, many operating systems consider a file simply as a sequence of bytes. At a higher level, where the content of the file is being considered, these bytes may represent various entities, including integer values, text characters, image pixels and audio. It is up to the program using the file to understand the meaning and internal layout of information in it.
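As a small illustration of this point, the Python sketch below interprets the same four bytes in several ways. The byte values and variable names are arbitrary choices made for the example and do not come from any particular file format.

```python
import struct

# Four arbitrary bytes, as they might appear at the start of some file.
raw = b"ABCD"

# The same bytes under four different interpretations:
as_text = raw.decode("ascii")                # four ASCII characters
as_uint32_be = int.from_bytes(raw, "big")    # one big-endian 32-bit integer
as_uint16_pair = struct.unpack("<2H", raw)   # two little-endian 16-bit integers
as_pixel = tuple(raw)                        # e.g. one pixel, one byte per channel

print(as_text, hex(as_uint32_be), as_uint16_pair, as_pixel)
```

Nothing in the bytes themselves says which reading is intended; that knowledge lives in the program that opens the file.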
A file is typically accessed using a name.
A file can be located in a directory. Because a directory can itself be located in another directory, the term file is taken here to include directories. This permits the existence of directory hierarchies, where each directory can contain an arbitrary number of files and other directories. These other directories are known as subdirectories. Subdirectories can contain further files and directories, and so on, forming a tree-like structure in which one master directory, or root directory, can contain any number of levels of other directories and files.
The root directory is the top-most directory in a hierarchy, and can be likened to the root of a tree as the starting point where all branches originate. Unix abstracts this hierarchy, and in Unix-like systems, the root directory is denoted by /, where the directory entry itself has no name (its name is the empty part before the initial directory delimiter (/)). All filesystem entries, including mounted volumes, are "branches" of this root. Under DOS and Windows, each volume has a drive letter assignment (e.g. C:\) and there is no common root directory above that.
When directories are used, each file and directory has not only a name, but also a path, which identifies the directories in which it resides. In the path, a special character, such as a slash (/) on Unix-like systems or a backslash (\) on DOS and Windows, is used to separate the file and directory names.
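The difference between the two path conventions can be sketched with Python's pathlib, which parses paths without touching the filesystem. The Windows path below is a hypothetical example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A Unix-style path: a single root "/" and "/" as the separator.
posix = PurePosixPath("/usr/share/misc/termcap")

# A Windows-style path (hypothetical): a drive letter and "\" as the separator.
windows = PureWindowsPath(r"C:\Users\alice\notes.txt")

print(posix.parts)      # path components, starting with the root
print(posix.parent)     # the directory in which the file resides
print(windows.drive)    # the drive letter part
print(windows.parts)    # components, starting with the drive and root
```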
A file is essentially an abstract concept presented to a user or an operating system. However, for a file to be useful, it must have a corresponding physical manifestation. In physical terms, a computer file is normally stored on a recording medium, for example, a hard disk for non-volatile storage or RAM if it contains only temporary information.
In Unix-like operating systems, many files have no direct association with a physical storage device, for example /dev/null and most files in the /dev, /proc and /sys directories. These can be accessed as files in user space although they are really virtual files that exist as objects within the operating system kernel.
Unix does not impose any internal file structure for normal files. This implies that from the point of view of the operating system there is only one file type, and the structure and interpretation are entirely dependent on how the file is interpreted by software.
Unix does however have some special files. These special files can be identified with the ls -l command, which displays the type of the file as the first character of the permissions field.
A normal file is indicated by a dash (-).
The most common special file is the directory.
A directory is marked with d as the first letter of the permissions field, for example:
drwxr-xr-x /
A symbolic link is a reference to another file. This special file is stored as a textual representation of the referenced file's path.
A symbolic link is marked with l as the first letter of the permissions field, for example:
lrwxrwxrwx termcap -> /usr/share/misc/termcap
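A minimal sketch of this behaviour in Python, assuming a Unix-like system where the process is allowed to create symbolic links. The file names and the function name are hypothetical.

```python
import os
import stat
import tempfile

def symlink_demo():
    """Create a file and a symbolic link to it, then inspect both."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        link = os.path.join(d, "link")
        with open(target, "w") as f:
            f.write("hello")
        os.symlink(target, link)
        with open(link) as f:          # opening the link follows it to the target
            content = f.read()
        return {
            "target": target,
            "stored_path": os.readlink(link),                        # path held by the link
            "link_type": stat.filemode(os.lstat(link).st_mode)[0],   # the link itself
            "followed_type": stat.filemode(os.stat(link).st_mode)[0],  # what it points to
            "content": content,
        }

result = symlink_demo()
print(result["link_type"], result["stored_path"])
```

os.lstat examines the link itself, whereas os.stat follows the link and reports on the referenced file.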
One of the facilities provided by Unix for inter-process communication is the pipe. A pipe connects the output of one Unix process to the input of another. This is particularly important if the processes have to be executed under different user names and permissions.
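The sketch below connects two child processes with a pipe from Python. Both children are small Python one-liners invented for the example; any two programs that write to standard output and read from standard input could take their place.

```python
import subprocess
import sys

def run_pipeline():
    """Pipe the output of a producer process into a consumer process."""
    producer = subprocess.Popen(
        [sys.executable, "-c", "print('hello from the producer')"],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=producer.stdout,      # read end of the producer's pipe
        stdout=subprocess.PIPE,
    )
    producer.stdout.close()  # allow the producer to receive SIGPIPE if the consumer exits
    output, _ = consumer.communicate()
    producer.wait()
    return output.decode()

print(run_pipeline())
```

This is roughly what a shell does for a command line of the form producer | consumer.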
Named pipes are special files that can exist anywhere in the filesystem. They are created with the command mkfifo.
A named pipe is marked with p as the first letter of the permissions field, for example:
prw-rw---- mypipe
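A named pipe can also be created from a program. The following Python sketch assumes a Unix-like system; it uses os.mkfifo, the library counterpart of the mkfifo command, and a thread stands in for the second process to keep the example self-contained. The function name and message are hypothetical.

```python
import os
import stat
import tempfile
import threading

def fifo_demo(message=b"ping"):
    """Create a FIFO, send one message through it, and return what was read."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "mypipe")
        os.mkfifo(path, 0o660)
        type_letter = stat.filemode(os.stat(path).st_mode)[0]

        def writer():
            # Opening for writing blocks until a reader has opened the FIFO.
            with open(path, "wb") as f:
                f.write(message)

        t = threading.Thread(target=writer)
        t.start()
        # Opening for reading blocks until the writer has opened the FIFO;
        # read() returns once the writer closes its end.
        with open(path, "rb") as f:
            received = f.read()
        t.join()
        return type_letter, received

print(fifo_demo())
```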
A socket is a special file used for inter-process communication between two processes. In addition to sending data, processes can send file descriptors across a Unix domain socket connection.
A socket is marked with s as the first letter of the permissions field, for example:
srwxrwxrwx X0
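As a rough sketch, a Unix domain socket can be bound to a filesystem path from Python, after which it shows up as a socket special file. The path demo.sock and the function name are hypothetical, and both ends live in one process only to keep the example short; passing file descriptors is not shown.

```python
import os
import socket
import stat
import tempfile

def socket_demo():
    """Bind a Unix domain socket to a path and pass a few bytes through it."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            server.bind(path)          # creates the socket special file
            server.listen(1)
            type_letter = stat.filemode(os.stat(path).st_mode)[0]
            client.connect(path)       # queued until the server accepts
            conn, _ = server.accept()
            with conn:
                client.sendall(b"hello")
                data = conn.recv(16)
        finally:
            client.close()
            server.close()
        return type_letter, data

print(socket_demo())
```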
Device nodes are used to apply access rights and to direct operations to appropriate device drivers.
Nodes are created with the mknod command.
Unix makes a distinction between character devices and block devices. Basically, a character device provides only a serial stream of input or output, whereas a block device is randomly accessible.
A character device node is marked with c as the first letter of the permissions field and a block device node is marked with b, for example:
crw------- /dev/kbd
brw-rw---- /dev/hda
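The type letters listed so far can be reproduced with Python's stat.filemode, which formats a mode value in the style of ls -l. The mode values below are constructed by hand to match the examples in the text rather than read from real files.

```python
import stat

# File type bits combined with permission bits, one entry per type discussed above.
EXAMPLE_MODES = {
    "regular file": stat.S_IFREG | 0o644,
    "directory": stat.S_IFDIR | 0o755,
    "symbolic link": stat.S_IFLNK | 0o777,
    "named pipe": stat.S_IFIFO | 0o660,
    "socket": stat.S_IFSOCK | 0o777,
    "character device": stat.S_IFCHR | 0o600,
    "block device": stat.S_IFBLK | 0o660,
}

# Map each description to its ls -l style permissions field.
listing = {name: stat.filemode(mode) for name, mode in EXAMPLE_MODES.items()}

for name, field in listing.items():
    print(f"{field}  {name}")
```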
Device nodes correspond to resources that an operating system kernel has already allocated. Unix identifies these resources by a major number and a minor number, both stored as part of the structure of a node. Generally the major number identifies the device driver and the minor number identifies a particular device that the driver controls.
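A short Python sketch of how the two numbers can be read back from a device node on a Unix-like system. The helper name device_numbers is hypothetical; os.major, os.minor and os.makedev are the standard library functions for splitting and combining device numbers.

```python
import os
import stat

def device_numbers(path):
    """Return (type_letter, major, minor) for a device node, or None otherwise."""
    st = os.stat(path)
    if not (stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode)):
        return None
    # For device nodes, st_rdev holds the combined major/minor number.
    return stat.filemode(st.st_mode)[0], os.major(st.st_rdev), os.minor(st.st_rdev)

# The two numbers are packed into a single device number and can be split again.
combined = os.makedev(8, 19)
print(os.major(combined), os.minor(combined))

print(device_numbers("/dev/null"))
```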
character devices
Character device nodes relate to devices through which the system transmits data one character at a time. These device nodes often serve for stream communication with devices such as teletype machines, virtual terminals, and serial modems, and usually do not support random access to data.
block devices
Block device nodes correspond to devices through which the system moves data in the form of blocks. These device nodes often represent addressable devices such as hard drives, CD-ROM drives, or memory regions.
pseudo-devices
Device nodes on Unix-like systems do not necessarily have to correspond to physical devices. Nodes that lack this correspondence are referred to as pseudo-devices.
The following prefixes are commonly used in Linux-based systems to identify device nodes in the /dev directory:
fb | frame buffer
fd | floppy drive
hd | IDE hard drive
hda | master device on first ATA channel
hdb | slave device on first ATA channel
hdc | master device on second ATA channel
hdd | slave device on second ATA channel
lp | printer
par | parallel port
pt | pseudo-terminal
s | SCSI device in general: mainly hard disks, but also SATA and USB disks
scd | SCSI audio-oriented optical disk drive
scd0 | first CD-ROM
scd1 | second CD-ROM
scd2 | third CD-ROM
sd | SCSI, SATA or USB hard drive
sda | first drive
sdb | second drive
sdc | third drive
sg | SCSI generic device
sr | SCSI data-oriented optical disk drive
st | SCSI magnetic tape
tty | terminal
ttyS | serial port
console
purpose | The console device node provides access to the system console.
description | Device node console provides access to the device or file designated as the system console. The system console is typically a terminal or display located near the system unit. It has two functions in the operating system.
file | /dev/console
initrd
purpose | The initrd device node is a RAM disk initialised before the kernel is started that can be used as the basis for a two-phased system startup.
description | Device node initrd is a read-only block device. It is a RAM disk that is initialised by the bootloader before the kernel is started. The kernel can then use the contents of the initial root device initrd for a two-phased system startup. In the first startup phase, the kernel starts up and mounts an initial root filesystem from the contents of initrd. In the second phase, additional drivers or other modules are loaded from the initial root device's contents. After loading the additional modules, a new root filesystem (the normal root filesystem) is mounted from a different device.
files | /dev/initrd
null, zero
purpose | The null and zero device nodes provide access to the null device.
description | The null device node provides character access to the null device driver. This device driver is normally accessed to write data to the bit bucket. Data written to a null or zero device node is discarded. Reads from the null device node return end of file, whereas reads from zero return \0 characters. null and zero are typically created by:
mknod -m 666 /dev/null c 1 3
mknod -m 666 /dev/zero c 1 5
files | /dev/null, /dev/zero
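The read and write behaviour of the null and zero device nodes can be checked with a few lines of Python on a Unix-like system. The function name is hypothetical.

```python
def null_zero_demo():
    """Read from /dev/null and /dev/zero, and write to /dev/null."""
    with open("/dev/null", "rb") as f:
        from_null = f.read(16)       # end of file immediately: empty bytes
    with open("/dev/zero", "rb") as f:
        from_zero = f.read(16)       # as many \0 bytes as requested
    with open("/dev/null", "wb") as f:
        written = f.write(b"discarded")   # accepted, then thrown away
    return from_null, from_zero, written

print(null_zero_demo())
```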
ram
purpose | The ram device node makes the RAM disk device available.
description | The ram device node references a block device to access the RAM disk in raw mode. It is typically created by:
mknod -m 660 /dev/ram b 1 1
file | /dev/ram
scd, sr
purpose | The scd and sr device nodes provide access to CD-ROM drivers.
description | CD-ROM and DVD drives and WORM devices are accessible via the scd and sr device drivers.
files | /dev/scd<n>
sd?
purpose | The sd? device nodes provide access to drivers for SCSI disk drives.
description | The block device name has the following form: sdlp, where l is a letter denoting the physical drive, and p is a number denoting the partition on that physical drive. Often, the partition number will be left off when the device corresponds to the whole drive. SCSI disks have a major device number of 8, and a minor device number of the form (16 * drive_number) + partition_number, where drive_number is the number of the physical drive in order of detection, and partition_number is as follows:
For example, /dev/sda will have major 8, minor 0, and will refer to all of the first SCSI drive in the system; and /dev/sdb3 will have major 8, minor 19, and will refer to the third DOS primary partition on the second SCSI drive in the system.
files |
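The numbering rule for SCSI disk nodes can be written out as a small Python function. This is only a sketch of the formula (16 * drive_number) + partition_number: the function name is hypothetical, and it assumes drive letters a to p and partition numbers 0 to 15, which is the range that fits under the single major number 8.

```python
import re

SD_MAJOR = 8  # major device number for SCSI disks, as stated in the text

def sd_device_numbers(name):
    """Map a name such as 'sda' or 'sdb3' to its (major, minor) pair."""
    match = re.fullmatch(r"sd([a-p])(\d*)", name)
    if match is None:
        raise ValueError(f"not a recognised sd device name: {name!r}")
    letter, digits = match.groups()
    drive_number = ord(letter) - ord("a")        # a -> 0, b -> 1, ...
    partition_number = int(digits) if digits else 0   # no digits: whole drive
    if partition_number > 15:
        raise ValueError("partition number does not fit in 16 minors per drive")
    return SD_MAJOR, 16 * drive_number + partition_number

print(sd_device_numbers("sda"))
print(sd_device_numbers("sdb3"))
```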
systty
purpose | In many Linux distributions device node systty is a symbolic link to the device that is used as the attached keyboard and monitor.
description | The keyboard and monitor attached to the system unit are collectively known as the physical console. The console where system messages appear is known as the logical console. As an illustration of the difference, the X Window System should start on the physical console, but system messages issued by failures when starting it should be written to the logical console. These distinctions are also made in the naming of devices. Device console is used to send messages to the logical console. Symbolic link systty points to the device that is used by the attached keyboard and monitor, often /dev/tty0.
file | /dev/systty
tty
purpose | Device node tty supports the controlling terminal interface.
description | For each process, the tty device node is a synonym for the controlling terminal associated with that process, if any. By directing messages to the tty file, programs and shell scripts can ensure that messages are written to the terminal even if output is redirected. Programs can also direct their display output to this file so that it is not necessary to identify the active terminal. File tty is a character file with major number 5 and minor number 0, usually of mode 0666 and owner.group root.tty.
file | /dev/tty
A door is a special file for inter-process communication between a client and server, currently implemented in the Sun Solaris operating system only.
A door is marked with D as the first letter of the permissions field, for example:
Dr--r--r-- name_service_door