In general, the I/O system software organization consists of the following layers:
- The interrupt handler layer: when an interrupt occurs, control is transferred to the appropriate interrupt handler.
- The device driver layer, implements the operations of each device controller.
- The I/O subsystem layer or I/O kernel, provides I/O interfaces or functions for the OS or applications.
- The application I/O library layer, implements an I/O access library or API (Application Programming Interface) for applications to perform I/O operations.
I/O management software is designed with the following goals in mind:
- Uniform naming. File or device names are strings or integers, completely independent of the device.
- Error handling. Generally error handling is handled as close to the hardware as possible.
- Synchronous vs. asynchronous transfers. Most physical I/O is asynchronous: the processor starts the transfer and goes off to do other work until an interrupt arrives. User programs are much easier to write if I/O operations are blocking; after a read command, the program is automatically suspended until data is available in the buffer.
- Shareable vs. dedicated devices. Some devices, such as disks, can be shared, but others must be used by only one user at a time. A printer is an example of a device that must be dedicated.
(a) Without a standard driver interface. (b) With a standard driver interface.
1. Interrupt Handler
An interrupt is an event that causes the execution of one program to be suspended and another program to be executed. If an interrupt occurs, the program is stopped first to run the interrupt routine. When the running program is stopped, the processor saves the register value containing the program address to the stack, and begins executing the interrupt routine.
a. Basic Interrupt Mechanism
When the CPU detects that a controller has sent a signal to the interrupt request line (generating an interrupt), the CPU then responds to the interrupt (also called catching the interrupt) by saving some information about the current state of the CPU, for example the value of the instruction pointer, and calling an interrupt handler so that it can service the controller or device that sent the interrupt.
b. Additional Features on Modern Computers
In modern computer architecture, the CPU and the interrupt controller provide three hardware features to handle interrupts better: the ability to defer interrupt handling while the CPU is in a critical state; efficient dispatch to the proper handler, with no need to poll every device to find the one that raised the interrupt; and multilevel interrupts, so that interrupts can be handled according to priority (implemented with an interrupt priority level system).
c. Types of Interrupts
From the processor's point of view, not all interrupts are equally important to the process currently being executed. If less important interrupts keep preempting the processor, execution of the process takes longer. Therefore, the OS usually divides interrupts into two types:
- Software interrupts, caused by software; these are often called system calls.
- Hardware interrupts, caused by hardware events, such as pressing a keyboard key or moving the mouse.
d. Interrupt Request Line
CPU hardware has a wire called the interrupt request line. Most CPUs have two types of interrupt request line: nonmaskable and maskable. A maskable interrupt can be disabled by the CPU before it executes a critical instruction sequence that must not be interrupted. This is the type of interrupt that device controllers usually use to request CPU service.
e. Interrupt Vector and Interrupt Chaining
The interrupt mechanism accepts the address of a specific interrupt-handling routine from a set. In most current computer architectures, this address is a number representing an offset into a table, often called the interrupt vector, which stores the addresses of the interrupt handlers in memory. The advantage of using a vector is that it removes the need for a single interrupt handler to search all possible interrupt sources to find the one that raised the interrupt.
However, interrupt vectors have a limitation because in reality, existing computers have more devices (and interrupt handlers) compared to the number of addresses in the interrupt vector. Therefore, the interrupt chaining technique is used where each element of the interrupt vector points to the first element of a list of interrupt handlers. With this technique, the overhead generated by the large size of the table and the inefficiency of using an interrupt handler (a feature of the CPU mentioned earlier) can be reduced, so that both are more or less balanced.
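The vector-plus-chaining idea can be sketched in a few lines of Python. This is an illustrative model only, not real kernel code: the "vector" is a table mapping interrupt numbers to chains of handlers, and dispatch walks a chain until some handler claims the interrupt.

```python
# Illustrative sketch of an interrupt vector with interrupt chaining.
# The vector maps an interrupt number to a list (chain) of handlers.
interrupt_vector = {}

def register_handler(irq, handler):
    """Append a handler to the chain for interrupt number irq."""
    interrupt_vector.setdefault(irq, []).append(handler)

def dispatch(irq):
    """Walk the chain for irq until one handler claims the interrupt.
    Each handler returns its device name if it claims it, else None."""
    for handler in interrupt_vector.get(irq, []):
        result = handler()
        if result is not None:
            return result
    return None                      # spurious interrupt: nobody claimed it

# Two devices sharing interrupt line 5.
def disk_handler():
    return None                      # the disk did not raise this interrupt

def network_handler():
    return "network"                 # the network card claims it

register_handler(5, disk_handler)
register_handler(5, network_handler)
```

With chaining, line 5 can serve both devices: `dispatch(5)` asks the disk first, which declines, and then the network card, which claims the interrupt.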
f. Causes of Interrupts
Interrupts can be caused by various things, including exceptions, page faults, interrupts sent by device controllers, and system calls. An exception is a condition in which an operation produces a result special enough to need extra attention, for example division by zero or access to a restricted or invalid memory address. A system call is a function in an application (software) that executes a special instruction in the form of a software interrupt, or trap.
g. Use of interrupts
- Error recovery. Computers use a variety of techniques to ensure that all hardware components operate properly. If an error occurs, the control hardware detects the error and notifies the CPU by issuing an interrupt.
- Debugging. Another important use of interrupts is as an aid in program debugging. Debugging is the method programmers use to find and reduce bugs, or defects, in a program or piece of hardware so that it works as expected.
- Interprogram Communication. Software interrupt commands are used by the operating system to communicate with and control the execution of other programs.
2. Device Driver
Each device driver handles one type of equipment. The device driver is responsible for accepting abstract requests from device-independent software on it and performing services according to those requests.
Device driver working mechanism:
- Translating abstract commands into concrete commands.
- Once it has determined the commands to be given to the controller, the device driver begins writing to the device's control registers.
- After the operation is completed by the equipment, the device driver checks for errors that occurred.
- If all goes well, the device driver passes the data to the device independent software.
- Finally, the device driver returns status information and any errors to the caller.
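The steps above can be sketched as a toy driver in Python. Everything here is hypothetical (a simulated controller with a register file), intended only to show how an abstract "read block n" request becomes writes to control registers followed by a status check:

```python
# Hypothetical sketch of the driver steps above: translate an abstract
# "read block" request into writes to simulated controller registers.

class FakeController:
    """Toy device controller with a few control registers."""
    def __init__(self, blocks):
        self.registers = {"command": 0, "block": 0, "status": "idle"}
        self._blocks = blocks            # simulated disk contents

    def start(self):
        # Real hardware works asynchronously; here the operation
        # completes immediately for simplicity.
        self.registers["status"] = "done"
        self.data = self._blocks[self.registers["block"]]

class FakeDriver:
    def __init__(self, controller):
        self.ctrl = controller

    def read_block(self, n):
        # 1. Translate the abstract request into concrete commands.
        self.ctrl.registers["block"] = n
        self.ctrl.registers["command"] = 1   # 1 = READ
        # 2. Write to the device's control registers to start it.
        self.ctrl.start()
        # 3. Check for errors after the operation completes.
        if self.ctrl.registers["status"] != "done":
            raise IOError("device error")
        # 4. Pass the data up to device-independent software.
        return self.ctrl.data

driver = FakeDriver(FakeController({0: b"boot", 1: b"data"}))
```

A call such as `driver.read_block(1)` exercises all four steps and hands the block's bytes back to the layer above.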
3. Device Independent Operating System Software
The main function of this level of software is to establish I/O functions that apply to all devices and provide a uniform interface to user-level software.
Functions commonly performed include:
- Uniform interface for all drivers
- Equipment naming
- Equipment protection
- Providing a device-independent block size.
- Performing buffering
- Storage allocation on block devices
- Allocating and releasing dedicated devices
- Error reporting
I/O Protection
A user program can disrupt the operating system by issuing illegal I/O instructions, accessing memory locations that belong to the operating system, or refusing to relinquish the processor. To prevent this, all I/O instructions are treated as privileged instructions, so user programs cannot execute I/O instructions directly but must go through the operating system. I/O protection is complete only if the user can never gain control of the machine in monitor mode; if a user program can do so, I/O protection can be compromised.
4. I/O Application Interface
When an application wants to open data on a disk, the application must actually be able to distinguish what type of disk it will access. To facilitate access, the operating system standardizes the way it accesses I/O devices. This approach is called the I/O application interface.
The I/O application interface involves abstraction, encapsulation, and software layering. Abstraction is done by dividing the details of I/O devices into more general classes. With these general classes, it will be easier to create standard functions (interfaces) to access them. Then there is a device driver on each I/O device, which functions to encapsulate the differences between each member of the general classes. The device driver encapsulates each I/O device into each of the general classes (standard interfaces). The purpose of this device driver layer is to hide the differences in the device controller from the I/O subsystem in the kernel.
Because of this, the I/O subsystem can be independent of the hardware. Because the I/O subsystem is independent of the hardware, this will be very beneficial in terms of hardware development. There is no need to wait for the operating system vendor to issue support code for new hardware that will be issued by the hardware vendor.
Examples of I/O Interfaces
a. Block and Character Equipment
The block-device interface is expected to meet the access needs of disk drives and other block devices. It must support read, write, and seek commands on devices with random-access properties.
The keyboard is one example of a tool that can access character streams. The basic system call of this interface can make an application understand how to take and write a character. Then in further development, we can create a library that can access data/messages per line.
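The idea of layering a line-oriented library on top of a per-character primitive can be sketched as follows. The `CharDevice` class and `get_char` name are invented for illustration; `get_char` stands in for the basic per-character system call, and `read_line` is the library routine built on top of it:

```python
# Sketch: a per-character primitive (the basic "system call") and a
# line-oriented library routine layered on top of it.
import io

class CharDevice:
    """Wraps a character stream; get_char plays the system-call role."""
    def __init__(self, text):
        self._stream = io.StringIO(text)

    def get_char(self):
        c = self._stream.read(1)
        return c if c else None      # None signals end of stream

def read_line(dev):
    """Library routine: collect characters until a newline arrives."""
    chars = []
    while True:
        c = dev.get_char()
        if c is None or c == "\n":
            break
        chars.append(c)
    return "".join(chars)

kbd = CharDevice("hello\nworld\n")
```

Applications then call `read_line(kbd)` to receive whole lines, without ever seeing the character-by-character transfers underneath.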
b. Network Equipment
Because network I/O differs from disk I/O in both performance and addressing, most operating systems provide a network I/O interface that is separate from the read-write-seek interface used for disks. One of the most widely used is the socket interface.
Sockets function to connect computers to a network. System calls on the socket interface can facilitate an application to create a local socket, and connect it to a remote socket. By connecting a computer to a socket, communication between computers can be done.
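A minimal taste of the socket interface, using Python's standard `socket` module. Here `socketpair()` creates two already-connected local sockets, standing in for the local and remote endpoints described above (a real remote connection would use `socket()`, `bind()`, `listen()`, and `connect()` instead):

```python
# Minimal sketch of the socket interface: two connected endpoints
# exchanging bytes. socketpair() stands in for a real network link.
import socket

local, remote = socket.socketpair()

local.sendall(b"ping")          # send through the local socket
reply = remote.recv(4)          # the "remote" side receives it

remote.sendall(b"pong")         # and answers back
answer = local.recv(4)

local.close()
remote.close()
```

The same `sendall`/`recv` calls work unchanged once the endpoints are on different machines, which is exactly the uniformity the socket interface provides.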
c. Clock and Timer
Clocks and timers in computer hardware serve at least three functions: providing the current time, measuring elapsed time, and triggering an operation at a given time. These functions are used heavily by the operating system. Unfortunately, the system calls for these functions are not standardized across operating systems. The hardware that measures elapsed time and triggers operations is called a programmable interval timer. It can be set to wait a certain amount of time and then raise an interrupt. One application is in the scheduler, where the timer raises an interrupt that stops a process at the end of its time slice.
The operating system can support more timer requests than there are hardware timers. In that case, the kernel or the timer device driver maintains a list of pending timer requests sorted earliest-deadline-first and programs the hardware timer for the earliest one, repeating as each expires.
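The "wait a certain time, then interrupt" behaviour of a programmable interval timer can be imitated in user space with Python's `threading.Timer`. This is an analogy, not the hardware mechanism: the callback plays the role of the timer interrupt handler.

```python
# User-space analogy for a programmable interval timer: schedule a
# callback ("interrupt") after a delay using threading.Timer.
import threading

fired = threading.Event()

def on_timer():
    # This callback plays the role of the timer interrupt handler.
    fired.set()

t = threading.Timer(0.05, on_timer)   # "wait 50 ms, then interrupt"
t.start()
fired.wait(timeout=2.0)               # block until the timer fires
```

A scheduler built this way would, in its `on_timer`, preempt the running task, just as the real interrupt ends a time slice.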
d. Blocking and Nonblocking I/O
When an application uses a blocking system call, the execution of the application is temporarily stopped. The application is moved to the wait queue. After the system call is completed, the application is returned to the run queue, so that the execution of the application can continue. Physical actions of I/O devices are usually asynchronous. However, many operating systems are blocking, this is because blocking applications are easier to understand than nonblocking applications.
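The contrast between blocking and nonblocking calls is easy to demonstrate with a socket pair. In nonblocking mode, a read with no data available fails immediately with `BlockingIOError` instead of suspending the caller; once data arrives, the same call succeeds:

```python
# Sketch contrasting blocking and nonblocking reads on a socket pair.
import select
import socket

a, b = socket.socketpair()
b.setblocking(False)              # switch endpoint b to nonblocking mode

# A nonblocking read with no data available fails immediately
# rather than moving the caller to a wait queue.
got_would_block = False
try:
    b.recv(16)
except BlockingIOError:
    got_would_block = True

# After data arrives, the same call succeeds.
a.sendall(b"hi")
select.select([b], [], [], 2.0)   # wait until b is readable
data = b.recv(16)

a.close()
b.close()
```

A blocking read would simply have suspended the process at the first `recv`, which is why blocking code is easier to follow: the line after the call can assume the data is there.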
5. Kernel I/O Subsystem
The kernel provides many services related to I/O. In this section, we will describe some of the services provided by the kernel I/O subsystem, and we will discuss how to create hardware infrastructure and device drivers. The services we will discuss are I/O scheduling, buffering, caching, spooling, device reservation, error handling.
a. Scheduling
To schedule a set of I/O requests, we must determine a good order in which to execute them. Scheduling can improve overall system performance, share a device more evenly among processes, and reduce the average waiting time for I/O to complete.

Here is a simple example. Suppose the disk arm is located near the beginning of the disk, and three blocking applications make read calls to that disk. Application 1 requests a block near the end of the disk, application 2 a block near the beginning, and application 3 one in the middle. The operating system can reduce the distance the disk arm must travel by serving the applications in the order 2, 3, 1. Rearranging the order of service in this way is the essence of I/O scheduling.

The operating system implements scheduling by maintaining a request queue for each device. When an application issues a blocking I/O system call, the request is placed in the queue for that device. The I/O scheduler rearranges the queue to improve the efficiency of the system and the average response time experienced by applications. The operating system also tries to be fair, so that no application receives especially poor service, or it can give priority to important pending requests; for example, requests from a subsystem may get higher priority than requests from an application. Several scheduling algorithms for disk I/O are described in the Disk Scheduling section.
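The 2, 3, 1 ordering above falls out of a shortest-seek-first rule: always serve the pending request closest to the current arm position. The sketch below is a simplification of real disk-scheduling algorithms (block positions and application names are made up for illustration):

```python
# Sketch: order pending requests so the disk arm always moves to the
# nearest requested block next (shortest-seek-time-first).

def sstf_order(head, requests):
    """Return request names in the order the arm should serve them."""
    pending = dict(requests)          # name -> block position
    order = []
    pos = head
    while pending:
        # Pick the pending request closest to the current arm position.
        name = min(pending, key=lambda n: abs(pending[n] - pos))
        pos = pending.pop(name)       # move the arm there
        order.append(name)
    return order

# Arm near the start of a 1000-block disk; requests near the end,
# the beginning, and the middle, as in the example above.
schedule = sstf_order(0, {"app1": 950, "app2": 50, "app3": 500})
```

Starting from position 0, the closest block is app2's (50), then app3's (500), then app1's (950), reproducing the 2, 3, 1 order.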
One way to improve the efficiency of a computer's I/O subsystem is to schedule I/O operations. Another way is to use storage space in main memory or on disk, through techniques called buffering, caching, and spooling.
b. Buffering
A buffer is a memory area that stores data while it is being transferred between two devices or between a device and an application. Buffering smooths out spikes in I/O demand and improves both operating-system efficiency and process performance. There are two common buffering schemes:
- Single buffering. This is the simplest technique. When a process issues a command to an I/O device, the operating system assigns a buffer in system main memory to the operation.
- Double buffering. Improvements can be made with two system buffers. Processes can be transferred to/from one buffer while the operating system empties (or fills) the other buffer. This technique is called double buffering or buffer swapping. Double buffering ensures that processes do not wait for I/O operations. This improvement comes at the cost of increased complexity.
(a) Unbuffered input. (b) Buffering in user space. (c) Buffering in the kernel followed by copying to user space. (d) Double buffering in the kernel.
Buffering is done for three reasons. The first is to cope with a speed mismatch between the producer and the consumer of a data stream. For example, suppose a file is being received via modem and stored on a hard disk. The modem is about a thousand times slower than the hard disk, so a buffer is created in main memory to accumulate the bytes received from the modem.
When all the data in the buffer has arrived, it can be written to disk in a single operation. Since disk writes are not instantaneous and the modem still needs space to store incoming data, two buffers are used. Once the modem fills the first buffer, a request is made to write to disk. The modem then begins filling the second buffer while the first buffer is used for writing to disk. By the time the modem has filled the second buffer, the disk write from the first buffer should have been completed, so the modem switches back to filling the first buffer and using the second buffer for writing. This double buffering method creates a dual pairing of producers and consumers while reducing the time required between them.
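The fill-one-while-writing-the-other scheme can be shown as a small single-threaded simulation. The buffer size and data are arbitrary, and the "disk write" is just recording the buffer's contents; a real implementation would overlap the two activities in time:

```python
# Single-threaded simulation of double buffering: the "modem" fills
# one buffer while the "disk" writes out the other.

BUF_SIZE = 4

def transfer(data):
    """Move data through two alternating buffers; return disk writes."""
    buffers = [bytearray(), bytearray()]
    filling = 0                  # index of the buffer being filled
    writes = []                  # each completed "disk write"
    for byte in data:
        buffers[filling].append(byte)
        if len(buffers[filling]) == BUF_SIZE:
            # Buffer full: hand it to the disk and swap buffers, so
            # the modem keeps filling while the disk is writing.
            writes.append(bytes(buffers[filling]))
            buffers[filling].clear()
            filling = 1 - filling
    if buffers[filling]:         # flush the partial last buffer
        writes.append(bytes(buffers[filling]))
    return writes

chunks = transfer(b"abcdefghij")
```

Each element of `chunks` corresponds to one full-buffer disk write, with the incoming stream never stalled waiting for the disk.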
Networking may involve many copies of a packet.
The second reason for buffering is to accommodate devices that have different data transfer sizes. This is very common in computer networks, where buffers are used extensively for fragmentation and reassembly of received messages. At the sender's end, a large message is broken into small packets. The packets are sent over the network, and the receiver places them in a buffer to be reassembled.
The third reason for buffering is to support copy semantics for I/O applications. An example will illustrate what copy semantics means. Suppose an application has a buffer of data that it wants to write to disk. The application makes a write call, providing a pointer to the buffer, and an integer to indicate the number of bytes to write. After the call, what happens if the application changes the contents of the buffer? With copy semantics, the data that is being written is the same as when the application made the write call, regardless of the changes to the buffer.
A simple way for the operating system to guarantee copy semantics is for the write system call to copy the application's data into a kernel buffer before returning control to the application. The disk write is then performed from the kernel buffer, so later changes to the application's buffer have no effect. Copying data between kernel buffers and application buffers is common in operating systems, despite the overhead it introduces, because of the clean semantics. The same effect can be achieved more efficiently by clever use of virtual-memory mapping and copy-on-write page protection.
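The copy-into-a-kernel-buffer approach can be shown in a few lines. The `kernel_write` function and `pending_writes` list are invented stand-ins for the kernel's write path; the point is that the copy is taken at call time, so the application may immediately reuse its buffer:

```python
# Sketch of copy semantics: the "kernel" copies the caller's buffer
# at the moment of the write call, so later changes have no effect.

pending_writes = []                 # stand-in for queued disk writes

def kernel_write(buf, nbytes):
    # Snapshot the data into a kernel buffer before returning.
    pending_writes.append(bytes(buf[:nbytes]))

app_buf = bytearray(b"AAAA")
kernel_write(app_buf, 4)
app_buf[:] = b"BBBB"                # application reuses its buffer
```

Even though the application overwrote `app_buf` right after the call, the queued write still holds the original bytes, which is exactly what copy semantics promises.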
c. Caching
A cache is a fast memory area that holds copies of data; access to the cached copy is more efficient than access to the original. For example, the instructions of the currently executing process are stored on disk, cached in physical memory, and copied again into the CPU's secondary and primary caches. The difference between a buffer and a cache is that a buffer may hold the only existing copy of a data item, whereas a cache, by definition, holds a copy, on faster storage, of an item that resides elsewhere.
Caching and buffering are distinct functions, but sometimes a single memory area can serve both. For example, to preserve copy semantics and schedule disk I/O efficiently, the operating system uses buffers in main memory to hold disk data.
This buffer is also used as a cache, to improve I/O efficiency for files that are shared by multiple applications, or that are being read and written repeatedly.
When the kernel receives a file I/O request, it accesses the buffer cache to see if the memory area is already available in main memory. If so, a physical disk I/O can be avoided or not used. Disk writes also accumulate in the buffer cache for a few seconds, so large transfers are batched to make the write schedule more efficient. This method of delaying writes to improve I/O efficiency is discussed in the Remote File Access section.
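The cache-lookup-before-disk-I/O path can be sketched as follows. The `disk` dictionary and block contents are made up; the counter shows that a physical read happens only on a cache miss:

```python
# Sketch of a buffer cache: physical disk reads happen only on a
# cache miss; repeated reads of the same block come from memory.

disk = {0: b"superblock", 7: b"inode table"}   # simulated disk blocks
buffer_cache = {}
physical_reads = 0

def read_block(n):
    global physical_reads
    if n not in buffer_cache:            # cache miss
        physical_reads += 1              # one real disk I/O
        buffer_cache[n] = disk[n]
    return buffer_cache[n]               # cache hit path

first = read_block(7)
second = read_block(7)                   # served from the cache
```

The second read of block 7 avoids the disk entirely, which is the efficiency gain the buffer cache provides for shared or repeatedly accessed files.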
d. Spooling and Device Reservation
A spool is a buffer that holds output for a device, such as a printer, that cannot accept interleaved data streams. Although a printer can only handle one job at a time, multiple applications can request a printer to print, without having their output printed in a jumbled manner.
The operating system solves this problem by intercepting all output to the printer. Each application's output is spooled to a separate disk file. When an application finishes printing, the spooling system moves on to the next in the queue.
In some operating systems, spooling is handled by a system daemon process; in others, by an in-kernel thread. In both cases, the operating system provides a control interface that lets users and system administrators view the queue and remove unwanted jobs before they print.
For some devices, such as tape drives and printers, I/O requests from multiple applications cannot be multiplexed. Spooling is one way to deal with this. Another is to provide explicit facilities for coordinating concurrent access among applications.
Some operating systems provide support for exclusive device access by letting a process allocate an idle device and deallocate it when it is no longer needed. Other operating systems enforce a limit of one open file handle to such a device. Many operating systems provide functions that let processes coordinate exclusive access among themselves.
Layers of the I/O system and the main functions of each layer.
e. Error Handling
An operating system that uses protected memory can guard against many possible errors caused by hardware or applications. Devices and I/O transfers can fail in many ways, either for transient reasons, such as network overload, or for permanent reasons, such as disk controller failure. The operating system can often compensate for transient errors. For example, a read error on the disk will result in a reread, and a transmit error on the network will result in a retransmission if the protocol is known. However, for permanent errors, the operating system generally cannot recover the situation.
(a) A disk track with a bad sector. (b) Substituting a spare for the bad sector. (c) Shifting all the sectors to bypass the bad one.
As a general rule, an I/O system call returns one bit of information about the status of the call, indicating success or failure. UNIX operating systems use an additional integer variable, called errno, to return an error code, one of about a hundred values indicating the cause of the error. Some hardware can provide much more detailed error information, although many operating systems do not expose this facility.
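The errno mechanism is visible even from Python's standard library: a failed system call raises `OSError` carrying the integer error code, and `os.strerror` turns that code into a human-readable message. The file name below is invented and guaranteed not to exist:

```python
# Observing UNIX-style error reporting: a failed system call yields
# an errno code, which os.strerror maps to a descriptive message.
import errno
import os
import tempfile

missing = os.path.join(tempfile.mkdtemp(), "no-such-file")
code = None
try:
    open(missing)
except OSError as e:
    code = e.errno                    # the errno value, e.g. ENOENT

message = os.strerror(errno.ENOENT)   # human-readable cause
```

The single success/failure bit is the exception itself; the extra integer (`e.errno`) narrows the failure down to one of the roughly one hundred standard causes.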
f. Kernel Data Structure
The kernel requires state information about the use of I/O components. The kernel uses many similar structures to track network connections, character-device communications, and other I/O activity.
UNIX provides file-system access to several kinds of entities, such as user files, raw devices, and the address spaces of processes. Although each of these entities supports a read operation, the semantics differ. For example, to read a user file, the kernel needs to check the buffer cache before deciding whether to perform disk I/O. To read a raw disk, the kernel needs to ensure that the request size is a multiple of the disk sector size and is aligned on a sector boundary. To read a process's address space, it is enough to copy the data from memory. UNIX encapsulates these differences within a uniform structure by using object-oriented techniques.
Some operating systems use object-oriented methods even more extensively. For example, Windows NT uses a message-passing implementation for I/O. An I/O request is converted into a message that is sent through the kernel to the I/O manager and then to the device driver, each of which can modify the message content. For output, the message content is the data to be written. For input, the message contains a buffer to receive the data. This message-passing approach adds overhead, compared to procedural techniques that share data structures, but it simplifies the structure and design of the I/O system and adds flexibility.
In summary, the I/O subsystem coordinates a large collection of services, available from applications and other parts of the kernel. The I/O subsystem oversees:
- Name management for files and devices.
- Access control for files and devices.
- Operation control (for example, a modem cannot seek).
- File system space allocation.
- Device allocation.
- Buffering, caching, spooling.
- I/O scheduling
- Monitoring device status, error handling, and failure recovery.
- Configuration and utilization of device drivers.
g. I/O Request Handling
In the previous section, we described the handshaking between the device driver and the device controller, but we did not explain how the operating system connects an application's request to a set of network wires or to a specific disk sector.
Modern operating systems gain significant flexibility from the multiple stages of lookup tables in the path between a request and the physical device controller. New devices and drivers can be introduced to the computer without recompiling the kernel. In fact, some operating systems are capable of loading device drivers on demand. At boot time, the system first probes the hardware buses to determine what devices are present, and then loads the appropriate drivers either immediately or when first required by an I/O request.
UNIX System V has an interesting mechanism, called streams, that allows applications to assemble pipelines of driver code dynamically. A stream is a full-duplex connection between a device driver and a user-level process. It consists of a stream head that interfaces with the user process, a driver end that controls the device, and zero or more stream modules between them. Modules can be pushed onto a stream to add functionality in a layered fashion. As a simple illustration, a process can open a serial-port device through a stream and push on a module to handle input editing. Streams can be used for interprocess and network communication; in fact, in System V, the socket mechanism is implemented with streams.
The following describes a typical lifecycle of a block read request. A process issues a blocking read system call to a file descriptor of a previously opened file. The system-call code in the kernel checks the parameters for correctness. In the case of input, if the data is already in the buffer cache, the data is returned to the process and the I/O request is completed. If the data is not in the buffer cache, a physical I/O occurs, so the process is taken out of the run queue and placed in the wait queue for the device, and the I/O request is scheduled. Finally, the I/O subsystem sends the request to the device driver. Depending on the operating system, the request is sent either through a subroutine call or through an in-kernel message.
The device driver allocates buffer space in the kernel to receive data, and schedules the I/O. Finally, the driver sends commands to the device controller by writing to the device control register. The device controller operates the device hardware to perform the data transfer. The driver can receive status and data, or it can prepare a DMA transfer to kernel memory. We assume that the transfer is managed by a DMA controller, which uses interrupts when the transfer is complete.
The appropriate interrupt handler receives the interrupt through the interrupt-vector table, stores any required data, signals the device driver, and returns from the interrupt. The device driver receives the signal, determines which I/O request has completed, checks the request's status, and signals the kernel I/O subsystem that the request has finished. The kernel transfers the data or return code to the address space of the requesting process and moves the process from the wait queue back to the ready queue. Moving the process to the ready queue unblocks it; when the scheduler assigns it to the CPU, the process resumes execution at the completion of the system call.
h. I/O Performance
1. Impact of I/O on Performance
I/O greatly influences the performance of a computer system. It demands heavy CPU time to execute device-driver code and to schedule processes, and the resulting context switches burden the CPU and its hardware caches. I/O also loads the memory bus when data is copied between controllers and physical memory, and again between kernel buffers and application data space. Because of this large influence, the field of computer architecture pays great attention to these problems.
2. How to Improve I/O Efficiency
- Reduce the number of context switches.
- Reduce the number of times data is copied in memory while passing between device and application.
- Reduce the frequency of interrupts by using large transfer sizes, smart controllers, and polling.
- Increase concurrency by using controllers or channels that support DMA.
- Move processing primitives into hardware, so that device-controller operations can proceed concurrently with the CPU.
- Balance CPU, memory-subsystem, bus, and I/O performance.
3. Implementation of I/O Functions
We generally implement I/O algorithms at the application level first. Application code is flexible, and application bugs are unlikely to crash the whole system. Furthermore, developing at the application level avoids rebooting or reloading device drivers every time the code changes. An application-level implementation can also be very inefficient, however, because of the overhead of context switches and because the application cannot take advantage of internal kernel data structures and kernel functionality (such as efficient in-kernel messaging, threading, and locking).
Once the application-level algorithm has proven its benefits, we may implement it in the kernel. This step can improve performance but its development becomes more challenging, due to the size of the operating system kernel, and the complexity of the software system. Furthermore, we must debug the entire in-kernel implementation to avoid data corruption and system crashes.
We may obtain the highest performance with a specialized implementation in hardware, in the device or controller. The disadvantages of a hardware implementation include the difficulty and expense of making further improvements or fixing bugs, increased development time, and decreased flexibility. For example, a hardware RAID controller may not provide any means for the kernel to influence the order or location of individual block reads and writes, even when the kernel has special information about the workload that would let it improve I/O performance.
Introduction to I/O System Organization
Outline
- Hardware Organization
- I/O Device Classification
- I/O Device Addressing
- I/O Device Schematic
I/O Device Organization can be viewed in terms of Physical/Hardware Organization and Software Organization.
1. Hardware Organization
The I/O system in a computer system can be viewed in terms of physical organization or hardware as well as in terms of software organization. Physically, the organization of the I/O system is divided into:
- I/O Device
- Device Controller (Adapter)
- I/O Bus
I/O Device
I/O devices connected to a computer have unique characteristics according to their functions and the technology they use. I/O devices can be electrical or mechanical components. Examples of I/O devices include monitors, keyboards, mice, printers, scanners, and others.
Device Controller (Adapter)
In order for I/O devices to be controlled and communicate with the computer system, there must be a device controller that functions as an interface between the I/O device and the computer's internal system. This device controller is a digital circuit that functions to control the work of other mechanical or electrical components of the I/O device.
I/O Bus
The I/O bus consists of data, address, and control lines that connect the device controllers with internal elements of the computer, such as the processor and memory. In addition, there are expansion I/O buses, such as parallel, serial, and PS/2 buses, used to communicate with I/O devices that are easily attached and removed and that generally sit outside the computer case.
2. Classification of I/O Devices
Input/output devices are the most numerous and varied class of components and can be grouped by various criteria, including:
- Based on communication targets
- Based on the nature of the data flow.
Based on communication targets
- Human-readable devices, that is, devices suitable for communicating with the user. For example, the video display terminal (VDT), consisting of a screen, keyboard, and mouse.
- Machine-readable devices, that is, devices suitable for communicating with electronic equipment. For example, disks and tapes, sensors, and controllers.
- Communication devices, that is, devices suitable for communicating with remote equipment. For example, a modem.
There are major differences between these classes of equipment, including:
- Data rate
- Application
- Control complexity
- Units transferred
- Data representation
- Error conditions
These differences make a uniform and consistent approach to I/O devices desirable, both from the operating system's perspective and from that of the processes using them.
Based on the nature of the data flow
a. Block-oriented devices
A device for storing or exchanging information as fixed-size blocks. Each block has its own address. Block sizes can vary from 128 bytes to 1024 bytes, depending on the device.
The main feature of this device is that it allows reading or writing blocks independently, that is, it can read or write any block without having to go through other blocks. For example: disk, tape, CD ROM, Optical disk.
b. Character-oriented devices
These devices transfer data to and from the system as character streams. Examples: terminals, line printers, punched cards, network interfaces, paper tape, and the mouse.
Some devices fit neither of the above categories, for example the clock and the memory-mapped screen.
3. I/O Device Addressing
To access I/O devices, namely reading and writing data to I/O devices, each I/O device needs to be given a special address. In fact, what is given an address is the registers on the device controller. There are two methods for giving an address to an I/O device:
a. Direct-Mapped I/O Addressing
In this addressing scheme, I/O devices have a separate address space from memory addresses.
b. Memory-Mapped I/O Addressing
In this addressing scheme, I/O devices have addresses that are part of the global memory address space.
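A loose user-space analogy for memory-mapped I/O is Python's `mmap` module: it makes a file's bytes addressable as ordinary memory, much as memory-mapped I/O makes a controller's registers addressable as ordinary memory locations. The temporary file below merely stands in for a bank of device registers; real memory-mapped device access happens in the kernel, not through files:

```python
# Analogy only: mmap maps a file into the address space, so plain
# loads and stores reach its bytes, just as memory-mapped I/O lets
# plain loads and stores reach device registers.
import mmap
import tempfile

f = tempfile.TemporaryFile()
f.write(b"\x00" * 8)             # pretend these 8 bytes are registers
f.flush()

m = mmap.mmap(f.fileno(), 8)
m[0] = 0x2A                      # a plain memory store "programs the device"
status = m[0]                    # a plain memory load reads it back

m.close()
f.close()
```

With direct-mapped (port) I/O, the same operations would instead use special instructions (such as x86 `in`/`out`) against a separate I/O address space.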