Communications protocol

In telecommunications, a communication protocol is a system of rules that allow two or more entities of a communications system to transmit information via any kind of variation of a physical quantity. The protocol defines the rules syntax, semantics and synchronization of communication and possible error recovery methods. Protocols may be implemented by hardware, software, or a combination of both.

Communicating systems use well-defined formats (protocol) for exchanging various messages. Each message has an exact meaning intended to elicit a response from a range of possible responses pre-determined for that particular situation. The specified behavior is typically independent of how it is to be implemented. Communications protocols have to be agreed upon by the parties involved. To reach agreement, a protocol may be developed into a technical standard. A programming language describes the same for computations, so there is a close analogy between protocols and programming languages: protocols are to communications what programming languages are to computations.

Multiple protocols often describe different aspects of a single communication. A group of protocols designed to work together are known as a protocol suite; when implemented in software they are a protocol stack.

Internet communication protocols are published by the Internet Engineering Task Force (IETF). The IEEE handles wired and wireless networking; the International Organization for Standardization (ISO) other types. The ITU-T handles telecommunications protocols and formats for the public switched telephone network (PSTN). As the PSTN and Internet converge, the standards are also being driven towards convergence.

Protocols are to communications what algorithms or programming languages are to computations.

This analogy has important consequences for both the design and the development of protocols. One has to consider the fact that algorithms, programs and protocols are just different ways of describing expected behavior of interacting objects. A familiar example of a protocolling language is the HTML language used to describe web pages which are the actual web protocols.

In programming languages the association of identifiers to a value is termed a definition. Program text is structured using block constructs and definitions can be local to a block. The localized association of an identifier to a value established by a definition is termed a binding and the region of program text in which a binding is effective is known as its scope. The computational state is kept using two components: the environment, used as a record of identifier bindings, and the store, which is used as a record of the effects of assignments.

That’s not because TCP is perfect or because computer scientists have had trouble coming up with possible alternatives; it’s because those alternatives are too hard to test. The routers in data center networks have their traffic management protocols hardwired into them. Testing a new protocol means replacing the existing network hardware with either reconfigurable chips, which are labor-intensive to program, or software-controlled routers, which are so slow that they render large-scale testing impractical. The system maintains a compact, efficient computational model of a network running the new protocol, with virtual data packets that bounce around among virtual routers. On the basis of the model, it schedules transmissions on the real network to produce the same traffic patterns. Researchers could thus run real web applications on the network servers and get an accurate sense of how the new protocol would affect their performance.

Traffic control

Each packet of data sent over a computer network has two parts: the header and the payload. The payload contains the data the recipient is interested in — image data, audio data, text data, and so on. The header contains the sender’s address, the recipient’s address, and other information that routers and end users can use to manage transmissions. When multiple packets reach a router at the same time, they’re put into a queue and processed sequentially. With TCP, if the queue gets too long, subsequent packets are simply dropped; they never reach their recipients. When a sending computer realizes that its packets are being dropped, it cuts its transmission rate in half, then slowly ratchets it back up. A better protocol might enable a router to flip bits in packet headers to let end users know that the network is congested, so they can throttle back transmission rates before packets get dropped. Or it might assign different types of packets different priorities, and keep the transmission rates up as long as the high-priority traffic is still getting through. These are the types of strategies that computer scientists are interested in testing out on real networks.

Speedy simulation

When a server on the real network wants to transmit data, it sends a request to the emulator, which sends a dummy packet over a virtual network governed by the new protocol. When the dummy packet reaches its destination, the emulator tells the real server that it can go ahead and send its real packet. If, while passing through the virtual network, a dummy packet has some of its header bits flipped, the real server flips the corresponding bits in the real packet before sending it. If a clogged router on the virtual network drops a dummy packet, the corresponding real packet is never sent. And if, on the virtual network, a higher-priority dummy packet reaches a router after a lower-priority packet but jumps ahead of it in the queue, then on the real network, the higher-priority packet is sent first.

The servers on the network thus see the same packets in the same sequence that they would if the real routers were running the new protocol. There’s a slight delay between the first request issued by the first server and the first transmission instruction issued by the emulator. But thereafter, the servers issue packets at normal network speeds. The ability to use real servers running real web applications offers a significant advantage over another popular technique for testing new network management schemes: software simulation, which generally uses statistical patterns to characterize the applications’ behavior in a computationally efficient manner.