Difference between revisions of "IPv4"

 
{{IPstack}}
 
'''Internet Protocol version 4''' is the fourth iteration of the [[Internet Protocol]] (IP) and it is the first version of the protocol to be widely deployed.
 
IPv4 is the dominant [[network layer]] protocol on the [[Internet]] and apart from [[IPv6]] it is the only standard internetwork-layer protocol used on the [[Internet]].
 
  
It is described in [[IETF]] RFC 791 (September 1981) which made obsolete RFC 760 (January 1980).  The [[United States Department of Defense]] also standardized it as MIL-STD-1777.
 
 
IPv4 is a data-oriented protocol to be used on a [[packet switched]] [[internetwork]] (e.g., [[Ethernet]]). It is a [[best effort delivery|best effort]] protocol in that it does not guarantee delivery. It also makes no guarantees about the correctness of the data, and packets may arrive duplicated or out of order. These aspects are addressed by an [[upper layer protocol]] (e.g., [[Transmission Control Protocol|TCP]], and partly by [[User Datagram Protocol|UDP]]).
 
 
The entire purpose of IP is to provide unique global computer addressing to ensure that two computers communicating over the Internet can uniquely identify one another.
 
 
==Addressing==
 
<!-- Note: [[IP address]] points to this heading so if you rename this then fix that link -->
 
IPv4 uses 32-[[bit]] (4-[[byte]]) addresses, which limits the [[address space]] to 4,294,967,296 possible unique addresses.
 
However, some are reserved for special purposes such as [[private network]]s (~18 million addresses) or [[multicast]] addresses (~268 million addresses, the 224.0.0.0/4 block). This reduces the number of addresses that can be allocated as public Internet addresses. As the available addresses are consumed, an [[#Exhaustion|IPv4 address shortage]] appears to be inevitable; however, [[Network Address Translation]] (NAT) has significantly delayed it.
 
 
This limitation has helped stimulate the push towards [[IPv6]], which is in the early stages of deployment and is currently the only contender to replace IPv4.
 
 
===Address representations===
 
When writing IPv4 addresses in human readable form, the most common notation is the [[dot-decimal notation]]. There are other notations based on the values of the address's four octets. The table below takes the example address <tt>192.0.2.235</tt> in dot-decimal notation, which comprises four octets in [[decimal]] separated by periods, as the base format for each conversion:
 
 
{| class="wikitable"
 
|-
 
! Notation !! Value !! Conversion from dot-decimal
 
|-
 
| [[Dot-decimal notation]]
 
| <tt>192.0.2.235</tt>
 
| N/A
 
|-
 
| Dotted Hexadecimal
 
| <tt>0xC0.0x00.0x02.0xEB</tt>
 
| Each octet is individually converted to hex
 
|-
 
| Dotted Octal
 
| <tt>0300.0000.0002.0353</tt>
 
| Each octet is individually converted into octal
 
|-
 
| [[Hexadecimal]]
 
| <tt>0xC00002EB</tt>
 
| Concatenation of the octets from the dotted hexadecimal
 
|-
 
| [[Decimal]]
 
| <tt>3221226219</tt>
 
| The hexadecimal form converted to decimal
 
|-
 
| [[Octal]]
 
| <tt>030000001353</tt>
 
| The hexadecimal form converted to octal
 
|}
 
 
Most of these formats should work in most browsers, although support is not guaranteed. Additionally, in dotted format, each octet can use a different base. For example, <tt>192.0x00.0002.235</tt> is a valid (though unconventional) equivalent to the above addresses.
 
 
A final form is not really a notation, since it is rarely written as an ASCII string: the raw binary form of the address.  The difference is merely the representational difference between the string "0xC00002EB" and the 32-bit integer value 0xC00002EB.  This form is used for assigning the source and destination fields in a [[software]] program.
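
The conversions in the table above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name <tt>representations</tt> and the dictionary keys are hypothetical and not part of any standard.

```python
def representations(dotted):
    """Return alternative notations for a dot-decimal IPv4 address."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dot-decimal IPv4 address: {dotted!r}")

    # Concatenate the four octets into a single 32-bit integer,
    # most significant octet first.
    value = 0
    for octet in octets:
        value = (value << 8) | octet

    return {
        "dotted_hex": ".".join(f"0x{o:02X}" for o in octets),
        "dotted_octal": ".".join(f"{o:04o}" for o in octets),
        "hex": f"0x{value:08X}",
        "decimal": str(value),
        "octal": f"0{value:o}",
        "integer": value,  # the binary form used inside programs
    }


if __name__ == "__main__":
    for name, text in representations("192.0.2.235").items():
        print(f"{name:>12}: {text}")
```

The <tt>integer</tt> entry is the 32-bit value itself rather than a string, which is the distinction drawn in the preceding paragraph.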
 
 
===Allocation===
 
Originally, the IP address was divided into two parts:
 
* Network id &ndash; first octet
 
* Host id &ndash; last three octets
 
 
This created an upper limit of 256 networks. As the networks began to be allocated, this was soon seen to be inadequate.
 
 
To overcome this limit, different classes of network were defined, in a system which later became known as [[classful network]]ing.
 
Five classes were created (A, B, C, D, & E), three of which (A, B, & C) had different lengths for the network field. The rest of the address field in these three classes was used to identify a host on that network, which meant that each network class had a different maximum number of hosts. Thus there were a few networks with lots of host addresses and numerous networks with only a few addresses.
 
Class D was for [[multicast]] addresses and class E was reserved.
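
As a rough illustration, the class of an address can be read off its first octet. The boundaries used below (128, 192, 224, 240) follow from the leading-bit patterns of the classful scheme; they are not spelled out in the text above, and the function name <tt>address_class</tt> is hypothetical.

```python
def address_class(address):
    """Return the historical class (A to E) of a dot-decimal IPv4 address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"invalid first octet in {address!r}")
    if first_octet < 128:   # leading bit  0    -> 8-bit network field
        return "A"
    if first_octet < 192:   # leading bits 10   -> 16-bit network field
        return "B"
    if first_octet < 224:   # leading bits 110  -> 24-bit network field
        return "C"
    if first_octet < 240:   # leading bits 1110 -> multicast
        return "D"
    return "E"              # leading bits 1111 -> reserved


if __name__ == "__main__":
    for example in ("10.0.0.1", "172.16.0.1", "192.0.2.235", "224.0.0.1"):
        print(example, address_class(example))
```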
 
 
Around [[1993]], these classes were replaced with a [[Classless Inter-Domain Routing]] (CIDR) scheme, and the previous scheme was dubbed "classful", by contrast.
 
CIDR's primary advantage is to allow re-division of Class A, B & C networks so that smaller (or larger) blocks of addresses may be allocated to entities (such as [[Internet service provider]]s, or their customers) or local area networks.
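
The re-division that CIDR permits can be tried out with Python's standard <tt>ipaddress</tt> module. The sketch below is only an example: the choice of blocks (a documentation /24 split into four /26 blocks, and a /16 widened to a /15) is arbitrary.

```python
import ipaddress

# Split a former class C sized block into four smaller blocks.
block = ipaddress.ip_network("192.0.2.0/24")
quarters = list(block.subnets(new_prefix=26))

# Go the other way: widen a /16 into the /15 that contains it.
aggregate = ipaddress.ip_network("198.18.0.0/16").supernet(new_prefix=15)

if __name__ == "__main__":
    for network in quarters:
        print(network, network.num_addresses)
    print(aggregate, aggregate.num_addresses)
```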
 
 
The actual assignment of an address is not arbitrary. The fundamental principle of [[routing]] is that an address encodes information about a device's location within a network. This implies that an address assigned to one part of a network will not function in another part of the network. A hierarchical structure, created by CIDR and overseen by the [[Internet Assigned Numbers Authority]] (IANA) and its [[Regional Internet Registry|Regional Internet Registries]] (RIRs), manages the assignment of Internet addresses worldwide. Each RIR maintains a publicly searchable [[WHOIS]] database that provides information about IP address assignments; information from these databases plays a central role in numerous tools that attempt to locate IP addresses geographically.
 
 
{| class="wikitable"
 
|+ Reserved address blocks
 
|-
 
! [[Classless Inter-Domain Routing|CIDR]] address block || Description || Reference
 
|-
 
| 0.0.0.0/8 || Current network (only valid as source address) || RFC 1700
 
|-
 
| 10.0.0.0/8 || [[Private network]] || RFC 1918
 
|-
 
| 14.0.0.0/8 || Public data networks || RFC 1700
 
|-
 
| 127.0.0.0/8 || [[Localhost|Loopback]] || RFC 3330
 
|-
 
| 128.0.0.0/16 || Reserved (IANA) || RFC 3330
 
|-
 
| 169.254.0.0/16 || [[Zeroconf|Link-Local]] || RFC 3927
 
|-
 
| 172.16.0.0/12 || [[Private network]] || RFC 1918
 
|-
 
| 191.255.0.0/16 || Reserved (IANA) || RFC 3330
 
|-
 
| 192.0.0.0/24 || Reserved (IANA) || RFC 3330
 
|-
 
| 192.0.2.0/24 || Documentation and example code || RFC 3330
 
|-
 
| 192.88.99.0/24 || [[IPv6]] to IPv4 relay || RFC 3068
 
|-
 
| 192.168.0.0/16 || [[Private network]] || RFC 1918
 
|-
 
| 198.18.0.0/15 || Network benchmark tests || RFC 2544
 
|-
 
| 223.255.255.0/24 || Reserved (IANA) || RFC 3330
 
|-
 
| 224.0.0.0/4 || [[Multicast]]s (former Class D network) || RFC 3171
 
|-
 
| 240.0.0.0/4 || Reserved (former Class E network) || RFC 1700
 
|-
 
| 255.255.255.255 || Broadcast ||
 
|}
 
 
===Private networks===
 
{{main|private network}}
 
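Of the roughly 4 billion addresses allowed in IPv4, four ranges are reserved for use that is never routed on the public Internet: three for [[private network]]ing (RFC 1918) and one for link-local addressing (RFC 3927).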
 
These ranges are not routable outside of private networks, and private machines cannot directly communicate with public networks.
 
They can, however, do so through [[network address translation]].
 
 
The following are the four ranges reserved for private networks:
 
 
{| class="wikitable"
 
|-
 
 
! Name !! IP address range !! number of IPs
 
! ''[[classful network|classful]]'' description !! largest [[Classless Inter-Domain Routing|CIDR]] block
 
|-
 
| 24-bit block || 10.0.0.0 &ndash; 10.255.255.255 || 16,777,216 || single class A || 10.0.0.0/8
 
|-
 
| 20-bit block || 172.16.0.0 &ndash; 172.31.255.255 || 1,048,576 || 16 contiguous class Bs || 172.16.0.0/12
 
|-
 
| 16-bit block || 192.168.0.0 &ndash; 192.168.255.255 || 65,536 || 256 contiguous class Cs || 192.168.0.0/16
 
|-
 
| 16-bit block || 169.254.0.0 &ndash; 169.254.255.255 || 65,536 || single class B || 169.254.0.0/16
 
|}
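
A program can test whether an address falls in one of the ranges in the table above. The following is a minimal sketch using the standard <tt>ipaddress</tt> module; the names <tt>NON_PUBLIC_BLOCKS</tt> and <tt>non_public_block</tt> are hypothetical.

```python
import ipaddress

# The four ranges from the table, written as their largest CIDR blocks.
NON_PUBLIC_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("169.254.0.0/16"),
]


def non_public_block(address):
    """Return the CIDR block containing the address, or None if it is in none."""
    addr = ipaddress.IPv4Address(address)
    for network in NON_PUBLIC_BLOCKS:
        if addr in network:
            return str(network)
    return None


if __name__ == "__main__":
    for example in ("10.1.2.3", "172.31.255.255", "172.32.0.0", "192.0.2.235"):
        print(example, non_public_block(example))
```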
 
 
===Localhost===
 
{{main|localhost}}
 
 
In addition to private networking, the IP range 127.0.0.0 &ndash; 127.255.255.255 (or 127.0.0.0/8 in [[Classless Inter-Domain Routing|CIDR]] notation) is reserved for [[localhost]] communication.
 
No address within this range should ever appear on an actual network. A packet sent to such an address does not leave the source computer; instead it appears as an incoming packet on that computer (known as [[Loopback]]).
 
 
===IP Addresses ending in 0 or 255===
 
{{main|IPv4 subnetting reference}}
 
 
It is a common misconception that IP addresses ending in 255 or 0 can never be assigned to hosts on a subnet, but this is purely an artifact of classful addressing.
 
 
In classful addressing (now obsolete with the advent of CIDR), there are only 3 possible subnet masks: 255.0.0.0 (Class A), 255.255.0.0 (Class B), 255.255.255.0 (Class C). If we have the subnet 192.168.5.0/255.255.255.0, the network identifier 192.168.5.0 refers to the entire network, so to avoid confusion, it cannot be assigned to a device on the network.
 
 
A [[broadcast address]] is an IP address that allows information to be sent to all machines on a given subnet rather than a specific machine. Generally, the broadcast address is found by taking the bit complement of the subnet mask and then OR-ing it bitwise with the network identifier. More simply, the broadcast address is the last IP address in the range belonging to the subnet. In our example, the broadcast address would be 192.168.5.255, so to avoid confusion this IP address also cannot be assigned to a host. On a Class A, B, or C subnet, the broadcast address would always end in 255.
 
 
However, this does not mean that every IP address ending in 255 is unusable as a host address. For example, a subnet with a 16-bit mask, 192.168.0.0/255.255.0.0, is equivalent to the range 192.168.0.0 - 192.168.255.255. The broadcast address would be 192.168.255.255. However, we can assign 192.168.1.255, 192.168.2.255, etc. (though this can cause confusion). Also, 192.168.0.0 is the network identifier and so cannot be assigned, but 192.168.1.0, 192.168.2.0, etc. can be assigned (though this can also cause confusion).
 
 
With the advent of CIDR, broadcast addresses may not necessarily end with 255.
 
 
In general, the first and last IP addresses in a subnet are used as the network identifier and broadcast address, respectively. All other IP addresses in the subnet can be assigned to hosts on the subnet.
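
The rule described above (complement the subnet mask, then OR it with the network identifier) can be written out directly. This is a sketch for illustration; the function name <tt>network_and_broadcast</tt> is hypothetical.

```python
import ipaddress


def network_and_broadcast(address, mask):
    """Return (network identifier, broadcast address) as dot-decimal strings."""
    addr = int(ipaddress.IPv4Address(address))
    netmask = int(ipaddress.IPv4Address(mask))

    # The network identifier keeps only the bits covered by the mask.
    network = addr & netmask
    # The broadcast address sets every host bit: OR with the mask's complement,
    # truncated to 32 bits because Python integers are unbounded.
    broadcast = network | (~netmask & 0xFFFFFFFF)

    return (str(ipaddress.IPv4Address(network)),
            str(ipaddress.IPv4Address(broadcast)))


if __name__ == "__main__":
    print(network_and_broadcast("192.168.5.17", "255.255.255.0"))
    print(network_and_broadcast("192.168.1.255", "255.255.0.0"))
    # A CIDR-style mask whose broadcast address does not end in 255.
    print(network_and_broadcast("192.168.5.17", "255.255.255.192"))
```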
 
 
===Resolving===
 
{{main|Domain Name System}}
 
 
The [[Internet]] is most publicly known not by IP addresses but by names (e.g., www.whitehouse.gov, www.freebsd.org, www.berkeley.edu).
 
The routing of IP packets across the Internet is oblivious to such names.
 
This requires translating (or resolving) names to IP addresses.
 
 
The [[Domain Name System]] (DNS) provides such a system to convert names to IP address(es) and IP addresses to names.
 
Much like [[Classless Inter-Domain Routing|CIDR]] addressing, the DNS naming is also hierarchical and allows for subdelegation of name spaces to other DNS servers.
 
 
Think of this in a similar way to how you find a phone number. You want to call The Acme Bakers but don't know the number. You ring directory enquiries and they can tell you the number you need to dial or can even connect you. Next you might want to call Acme Builder. Again, you only need to know the phone number of directory enquiries; they will almost always have the number you want and connect you. Only if you ask directory enquiries for the number of a company which doesn't exist will they say they can't connect you, similar to a DNS error in your web browser.
 
 
===Exhaustion===
 
{{main|IP address exhaustion}}
 
Concern about the exhaustion of available IP addresses dates back to the [[1980s]].
 
This was the driving factor in [[classful network]]s and then later in the creation of [[Classless Inter-Domain Routing|CIDR]] addressing.
 
 
Today, there are several driving forces to the next address allocation solution:
 
* Mobile devices &mdash; [[laptop computer]]s, [[personal digital assistant|PDA]]s, [[mobile phone]]s
 
* Always-on devices &mdash; [[Asymmetric Digital Subscriber Line|ADSL]] modems, [[cable modem]]s
 
* Rapidly growing number of internet users
 
 
The most visible solution is to migrate to [[IPv6]] since the address size jumps dramatically from 32-bit to 128-bit which would allow about 18 [[quintillion]] people their own set of 18 quintillion addresses (3.4e38 total addresses). However, migration has proved to be a challenge in itself, and total Internet adoption of IPv6 is unlikely to occur for many years.
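
The figures quoted above can be checked with a few lines of arithmetic. The variable names below are illustrative only.

```python
ipv4_total = 2 ** 32    # 32-bit addresses
ipv6_total = 2 ** 128   # 128-bit addresses
half_space = 2 ** 64    # "about 18 quintillion"

if __name__ == "__main__":
    print(f"IPv4 addresses: {ipv4_total:,}")
    print(f"2**64:          {half_space:,}")
    print(f"IPv6 addresses: {ipv6_total:.1e}")
    # 18 quintillion sets of 18 quintillion addresses is the whole IPv6 space.
    print(half_space * half_space == ipv6_total)
```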
 
 
Some things that can be done to mitigate the IPv4 address exhaustion are (not mutually exclusive):
 
* [[Network address translation]] (NAT)
 
* Use of [[private network]]s
 
* [[Dynamic Host Configuration Protocol]] (DHCP)
 
* Name-based [[virtual hosting]]
 
* Tighter control by [[Regional Internet Registry|Regional Internet Registries]] on the allocation of addresses to Local Internet Registries
 
* Network renumbering to reclaim large blocks of address space allocated in the early days of the Internet
 
 
As of May 2007, predictions of exhaustion date of the unallocated IANA pool seem to converge to between '''March 2010''' and '''May 2010.'''
 
 
==Network address translation==
 
{{main|Network address translation}}
 
 
One method to increase both address utilization and security is to use [[network address translation]] (NAT).
 
Assigning one IP address to a public machine that acts as an [[internet]] [[Gateway (telecommunications)|gateway]], and using a [[private network]] for an organization's computers, allows for considerable address savings.
 
This also increases security by making all of the computers on a private network not directly accessible from the public network.
 
 
==Virtual private networks==
 
{{main|Virtual private network}}
 
 
Since private address ranges are deliberately ignored by all public routers, it is not normally possible to connect two private networks (e.g., two branch offices) via the public Internet. [[Virtual private network]]s (VPNs) solve this problem.
 
 
VPNs work by inserting an IP packet (encapsulated packet) directly into the data field of another IP packet (encapsulating packet) and using a publicly routable address in the encapsulating packet. Once the VPN packet is routed across the public network and reaches the endpoint, the encapsulated packet is extracted and then transmitted on the private network just as if the two private networks were directly connected.
 
 
Optionally, the encapsulated packet can be encrypted to secure the data while over the public network (see VPN article for more details).
 
 
==Address Resolution Protocol==
 
{{main|Address Resolution Protocol}}
 
 
IP is an upper layer protocol relative to the [[data link layer]]. The data link layer of the underlying physical network segment over which two communicating computers are directly connected (typically through a [[Ethernet hub|hub]] or a [[Network switch|switch]]) uses its own addressing scheme at the hardware level. In order to send a packet from computer A to computer B, A needs to know the hardware address of B. This discovery and mapping of IP addresses onto hardware addresses is done using the [[Address Resolution Protocol]] (ARP).
 
 
==Reverse Address Resolution Protocol/DHCP==
 
{{main|Reverse Address Resolution Protocol|BOOTP|Dynamic Host Configuration Protocol}}
 
 
In contrast to the situation outlined for ARP, a computer may know its [[data link layer]] address but not its IP address.
 
This is a common scenario in [[private network]]s and [[Digital Subscriber Line]] (DSL) connections, where the particular IP addresses of the machines are irrelevant.
 
This is usually the case for [[work station]]s but not [[server (computing)|servers]].
 
 
RARP is an obsoleted method for answering this question: This is my hardware address, what is my IP address?
 
RARP was replaced by [[BOOTP]] which, in turn, was replaced by [[Dynamic Host Configuration Protocol]] (DHCP).
 
 
In addition to sending the IP address, DHCP can also send the [[Network Time Protocol|NTP]] server, [[Domain Name System|DNS]] servers, and more.
 
 
==Packet structure==
 
An IP packet consists of two sections:
 
* header
 
* data
 
 
===Header===
 
The header consists of 13 fields, of which only 12 are required.  The 13<sup>th</sup> field is optional (red background in table) and aptly named: options.  The fields in the header are packed with the most significant byte first ([[Endianness|big endian]]), and for the diagram and discussion, the most significant bits are considered to come first.  The most significant bit is numbered 0, so the version field is actually found in the 4 most significant bits of the first byte, for example.
 
 
{| class="wikitable" style="text-align:center"
 
|-
 
! width="4%"|+
 
! colspan="4" width="12%"| Bits 0–3
 
! colspan="4" width="12%"| 4–7
 
! colspan="8" width="24%"| 8–15
 
! colspan="3" width="9%"| 16–18
 
! colspan="13" width="39%"| 19–31
 
|-
 
! 0
 
| colspan="4"| Version
 
| colspan="4"| Header length
 
| colspan="8"| Type of Service<br />(now [[Differentiated services|DiffServ]] and [[Explicit Congestion Notification|ECN]])
 
| colspan="16"| Total Length
 
|-
 
! 32
 
| colspan="16"| Identification
 
| colspan="3"| Flags
 
| colspan="13"| Fragment Offset
 
|-
 
! 64
 
| colspan="8"| Time to Live
 
| colspan="8"| Protocol
 
| colspan="16"| Header Checksum
 
|-
 
! 96
 
| colspan="32"| Source Address
 
|-
 
! 128
 
| colspan="32"| Destination Address
 
|-
 
! 160
 
| colspan="32" bgcolor="#FFDDDD"| Options
 
|-
 
! 160<br>or<br>192+
 
| colspan="32"| &nbsp;<br />Data<br />&nbsp;
 
|}
 
 
; Version : The first header field in an IP [[packet]] is the 4-bit version field.  For IPv4, this has a value of 4 (hence the name IPv4).
 
; Internet Header Length (IHL) : The second field is a 4-bit Internet Header Length (IHL) giving the number of 32-bit [[Word (computer science)|words]] in the header.  Since an IPv4 header may contain a variable number of options, this field specifies the size of the header (this also coincides with the offset to the data).  The minimum value for this field is 5 (RFC 791), which is a length of 5×32 = 160 bits.  Being a 4-bit field, the maximum length is 15 words or 480 bits.
 
; Type of Service (TOS) : In RFC 791, the following 8 bits were allocated to a Type of Service (TOS) field:
 
:* bits 0-2: precedence
 
:* bit 3: 0 = Normal Delay, 1 = Low Delay
 
:* bit 4: 0 = Normal Throughput, 1 = High Throughput
 
:* bit 5: 0 = Normal Reliability, 1 = High Reliability
 
:* bits 6-7: Reserved for future use
 
:This field is now used for [[Differentiated services|DiffServ]] and [[Explicit Congestion Notification|ECN]].  The original intention was for a sending host to specify a preference for how the datagram would be handled as it made its way through an internetwork.  For instance, one host could set its IPv4 datagrams' TOS field value to prefer low delay, while another might prefer high reliability.  In practice, the TOS field has not been widely implemented.  However, a great deal of experimental, research and deployment work has focused on how to make use of these eight bits.  These bits have been redefined, most recently through the [[DiffServ]] working group in the IETF and the [[Explicit Congestion Notification]] codepoints (see RFC 3168).  New technologies are emerging that require real-time data streaming and therefore will make use of the TOS field. An example is [[Voice over IP]] (VoIP), which is used for interactive voice data exchange.
 
; Total Length : This 16-bit field defines the entire datagram size, including header and data, in bytes.  The minimum-length datagram is 20 bytes (20 bytes header + 0 bytes data) and the maximum is 65,535 &mdash; the maximum value of a 16-bit word.  The minimum size datagram that any host is '''required''' to be able to handle is 576 bytes, but most modern hosts handle much larger packets. Sometimes [[subnetwork]]s impose further restrictions on the size, in which case datagrams must be [[Fragmentation (computer)|fragmented]].  Fragmentation is handled in either the host or packet switch in IPv4 (''see [[#Fragmentation and reassembly|Fragmentation and reassembly]]'').
 
; Identification : This field is an identification field and is primarily used for uniquely identifying fragments of an original IP datagram.  Some experimental work has suggested using the ID field for other purposes, such as for adding packet-tracing information to datagrams in order to help trace back datagrams with spoofed source addresses.
 
; Flags : A 3-bit field follows and is used to control or identify fragments. They are (in order, from high order to low order):
 
:* Reserved; must be zero.  As an April Fools joke, proposed for use in RFC 3514 as the "[[Evil bit]]".
 
:* Don't Fragment (DF)
 
:* More Fragments (MF)
 
:If the DF flag is set and fragmentation is required to route the packet then the packet will be dropped.  This can be used when sending packets to a host that does not have sufficient resources to handle fragmentation.
 
:When a packet is fragmented all fragments have the MF flag set except the last fragment, which does not have the MF flag set.  The MF flag is also not set on packets that are not fragmented &mdash; clearly an unfragmented packet can be considered the last fragment.
 
; Fragment Offset : The fragment offset field, measured in units of 8-byte blocks, is 13-bits long and specifies the offset of a particular fragment relative to the beginning of the original unfragmented IP datagram.  The first fragment has an offset of 0. This allows a maximum offset of 65,528 (<math>(2^{13}-1)\times8</math>) which would exceed the maximum IP packet length of 65,535 with the header length included.
 
; Time To Live (TTL) : An 8-bit [[time to live]] (TTL) field helps prevent datagrams from persisting (e.g. going in circles) on an internetwork.  Historically the TTL field limited a datagram's lifetime in seconds, but has come to be a [[hop count]] field.  Each packet switch (or [[router]]) that a datagram crosses decrements the TTL field by one.  When the TTL field hits zero, the packet is no longer forwarded by a packet switch and is discarded.  Typically, an [[Internet Control Message Protocol|ICMP]] message (specifically the [[ICMP Time Exceeded|time exceeded]]) is sent back to the sender that it has been discarded.  The reception of these ICMP messages is at the heart of how [[traceroute]] works.
 
; Protocol : This field defines the protocol used in the data portion of the IP datagram.  The [[Internet Assigned Numbers Authority]] maintains a list of protocol numbers, which were originally defined in RFC 790.  Common protocols and their decimal values are shown below (''see [[#Data|Data]]'').
 
; Header Checksum : The 16-bit [[checksum]] field is used for error-checking of the header.  At each hop, the checksum of the header must be compared to the value of this field.  If a header checksum is found to be mismatched, then the packet is discarded.  Note that errors in the data field are up to the encapsulated protocol to handle &mdash; indeed, both [[User Datagram Protocol|UDP]] and [[Transmission Control Protocol|TCP]] have checksum fields. 
 
: Since the TTL field is decremented on each hop and fragmentation is possible at each hop then at each hop the checksum will have to be recomputed.  The method used to compute the checksum is defined within RFC 791:
 
:: ''The checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header.  For purposes of computing the checksum, the value of the checksum field is zero.''
 
: In other words, all 16-bit words are summed together using [[one's complement]] (with the checksum field set to zero).  The sum is then one's complemented.  This final value is then inserted as the checksum field.
 
; Source address : An [[IP address]] is a group of four 8-bit octets for a total of 32 bits.  The value for this field is determined by taking the binary value of each octet and concatenating them together to make a single 32-bit value.
 
: For example, the address 10.9.8.7 (00001010.00001001.00001000.00000111 in binary) would be 00001010000010010000100000000111.
 
: This address is the address of the sender of the packet.  Note that this address may not be the "true" sender of the packet due to [[network address translation]].  Instead, the source address will be translated by the NATing machine to its own address.  Thus, reply packets sent by the receiver are routed to the NATing machine, which translates the destination address to the original sender's address.
 
; Destination address : Identical to the source address field but indicates the receiver of the packet.
 
; Options : Additional header fields (called ''[[IPv4 options|options]]'') may follow the destination address field, but these are not often used. Note that the value in the IHL field must include enough extra 32-bit words to hold all the options (plus any padding needed to ensure that the header contains an integral number of 32-bit words). The list of options may be terminated with an EOL ([[End of Options List]]) option; this is only necessary if the end of the options would not otherwise coincide with the end of the header.
 
: The use of the [[Loose Source and Record Route|LSRR]] and [[Strict Source and Record Route|SSRR]] options (Loose and Strict Source and Record Route) is discouraged because they create security concerns; many routers block packets containing these options.
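
The field layout and the checksum rule described above can be sketched with Python's standard <tt>struct</tt> module. This is a minimal illustration that assumes a 20-byte header without options; the function names (<tt>header_checksum</tt>, <tt>build_header</tt>, <tt>parse_header</tt>) and default values are hypothetical and not taken from any particular implementation.

```python
import ipaddress
import struct

# Version/IHL, TOS, total length, identification, flags/fragment offset,
# TTL, protocol, checksum, source, destination: 20 bytes, big endian.
HEADER_FORMAT = "!BBHHHBBH4s4s"


def header_checksum(header):
    """One's complement of the one's complement sum of all 16-bit words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(source, destination, payload_len, protocol=17, ttl=64,
                 identification=0, flags=0, fragment_offset=0):
    """Pack a 20-byte header (no options) with a valid checksum."""
    fields = [
        (4 << 4) | 5,                  # version 4, IHL 5 words
        0,                             # TOS
        20 + payload_len,              # total length: header plus data
        identification,
        (flags << 13) | fragment_offset,
        ttl,
        protocol,
        0,                             # checksum is zero while computing it
        ipaddress.IPv4Address(source).packed,
        ipaddress.IPv4Address(destination).packed,
    ]
    fields[7] = header_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


def parse_header(raw):
    """Unpack the fixed 20-byte part of a header into a dictionary."""
    (version_ihl, tos, total_length, identification, flags_fragment,
     ttl, protocol, checksum, source, destination) = struct.unpack(
        HEADER_FORMAT, raw[:20])
    return {
        "version": version_ihl >> 4,
        "ihl": version_ihl & 0x0F,
        "tos": tos,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_fragment >> 13,
        "fragment_offset": flags_fragment & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": str(ipaddress.IPv4Address(source)),
        "destination": str(ipaddress.IPv4Address(destination)),
    }


if __name__ == "__main__":
    header = build_header("10.9.8.7", "192.0.2.235", payload_len=100)
    print(parse_header(header))
    # Summing a header that already contains its checksum yields zero.
    print(header_checksum(header) == 0)
```

A receiver can verify a header by running the same sum over all ten words, checksum included; a result of zero means the header is intact.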
 
 
===Data===
 
The last field is not a part of the header and, consequently, not included in the checksum field.
 
The contents of the data field are identified by the protocol field of the header and are typically one of the [[transport layer]] protocols, although other protocols such as ICMP are carried in the same way.
 
 
Some of the most commonly used protocols are listed below including their value used in the protocol field:
 
* 1: [[Internet Control Message Protocol]] (ICMP)
 
* 2: [[Internet Group Management Protocol]] (IGMP)
 
* 6: [[Transmission Control Protocol]] (TCP)
 
* 17: [[User Datagram Protocol]] (UDP)
 
* 89: [[Open Shortest Path First]] (OSPF)
 
* 132: [[Stream Control Transmission Protocol]] (SCTP)
 
 
See [[List of IPv4 protocol numbers]] for a complete list.
 
 
==Fragmentation and reassembly==
 
{{main|IP fragmentation}}
 
To make IPv4 more tolerant of different networks the concept of [[fragmentation (computer)|fragmentation]] was added so that, if necessary, a device could break up the data into smaller pieces.
 
This is necessary when the [[MTU (networking)|maximum transmission unit]] (MTU) is smaller than the packet size.
 
 
For example, the maximum size of an IP packet is 65,535 bytes while the typical MTU for [[Ethernet]] is 1,500 bytes.
 
The IP header consumes 20 bytes (without options) of the 1,500 bytes, leaving 1,480 bytes of IP data per Ethernet frame (that is, an effective limit of 1,480 data bytes per IP packet).
 
Therefore, a 65,535-byte data payload would require 45 packets (65535/1480 = 44.28, rounded up).
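
The packet count above is a ceiling division. The sketch below, with the hypothetical name <tt>fragments_needed</tt>, also rounds the per-packet data size down to a multiple of 8 bytes, because fragment offsets are expressed in 8-byte units.

```python
import math


def fragments_needed(payload_bytes, mtu=1500, header_bytes=20):
    """Number of IP packets needed to carry a payload over a link."""
    # Data per packet, rounded down to a multiple of 8 bytes.
    per_packet = (mtu - header_bytes) // 8 * 8
    return math.ceil(payload_bytes / per_packet)


if __name__ == "__main__":
    print(fragments_needed(65535))            # Ethernet MTU of 1,500 bytes
    print(fragments_needed(4500, mtu=2500))
```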
 
 
The reason fragmentation was chosen to occur at the IP layer is that IP is the first layer that connects hosts instead of machines.
 
If fragmentation were performed on higher layers (TCP, UDP, etc.) then this would make fragmentation/reassembly be redundantly implemented (once per protocol); if fragmentation were performed on a lower layer (Ethernet, ATM, etc.) then this would require fragmentation/reassembly be performed on each hop (could be quite costly) and redundantly implemented (once per link layer protocol).
 
Therefore, the IP layer is the most efficient one for fragmentation.
 
 
===Fragmentation===
 
When a device receives an IP packet it examines the destination address and determines the outgoing interface to use.
 
This interface has an associated MTU that dictates the maximum data size for its payload.
 
If the MTU is smaller than the data size then the device must fragment the data.
 
 
The device then segments the data into segments where each segment is less-than-or-equal-to the MTU less the IP header size (20 bytes minimum; 60 bytes maximum).
 
Each segment is then put into its own IP packet with the following changes:
 
* The ''total length'' field is set to the segment size plus the IP header size
 
* The ''more fragments'' (MF) flag is set for all segments except the last one
 
* The ''fragment offset'' field is set accordingly based on the offset of the segment in the original data payload. This is measured in units of 8-byte blocks.
 
 
For example, for an IP header of length 20 bytes and an Ethernet MTU of 1,500 bytes the fragment offsets would be: 0, (1480/8) = 185, (2960/8) = 370, (4440/8) = 555, (5920/8) = 740, etc.
 
 
Notice that if (MTU – header length) is not a multiple of 8, then only a multiple of 8 number of bytes of data will be included in the datagram, even if that leaves a total datagram size of less than MTU (could only be off by 4 bytes because header is always multiple of 4 bytes).
 
 
If a fragment later crosses a link with a smaller MTU, for instance because the link layer protocol changes, the fragment is itself fragmented again.
 
 
For example, if a 4,500 byte data payload is inserted into an IP packet with no options (thus total length is 4,520 bytes) and is transmitted over a link with an MTU of 2,500 bytes then it will be broken up into two fragments:
 
 
{| class="wikitable" style="text-align:center"
 
|-
 
!rowspan="2"| #
 
!colspan="2" width="200"| Total length
 
!rowspan="2"| More fragments (MF)<BR>flag set?
 
!rowspan="2"| Fragment offset
 
|-
 
!width="100"| Header
 
!width="100"| Data
 
|-
 
|rowspan="2"| 1 ||colspan="2"| 2500 ||rowspan="2" {{yes}} ||rowspan="2"| 0
 
|-
 
| 20 || 2480
 
|-
 
|rowspan="2"| 2 ||colspan="2"| 2040 ||rowspan="2" {{no}} ||rowspan="2"  |310
 
|-
 
| 20 || 2020
 
|}
 
 
Now, let's say the MTU drops to 1,500 bytes.  Each fragment will individually be split up into two more fragments each:
 
 
{| class="wikitable" style="text-align:center"
 
|-
 
!rowspan="2"| #
 
!colspan="2" width="200"| Total length
 
!rowspan="2"| More fragments (MF)<BR>flag set?
 
!rowspan="2"| Fragment offset
 
|-
 
!width="100"| Header
 
!width="100"| Data
 
|-
 
|rowspan="2"| 1 ||colspan="2"| 1500 ||rowspan="2" {{yes}} ||rowspan="2"| 0
 
|-
 
| 20 || 1480
 
|-
 
|rowspan="2"| 2 ||colspan="2"| 1020 ||rowspan="2" {{yes}} ||rowspan="2"| 185
 
|-
 
| 20 || 1000
 
|-
 
|rowspan="2"|  3 ||colspan="2"| 1500 ||rowspan="2" {{yes}} ||rowspan="2"| 310
 
|-
 
| 20 || 1480
 
|-
 
|rowspan="2"| 4 ||colspan="2"| 560 ||rowspan="2" {{no}} ||rowspan="2"| 495
 
|-
 
| 20 || 540
 
|}
 
 
Indeed, the amount of data has been preserved (1480 + 1000 + 1480 + 540 = 4500), and the last fragment's offset plus its data length (3960 + 540 = 4500) also equals the total data length.
 
 
Note that fragments 3 & 4 were derived from the original fragment 2.  When a device must fragment the last fragment of a datagram, it sets the MF flag on every fragment it creates except the last one (here the flag is set on fragment 3 but not on fragment 4).
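
The two tables above can be reproduced with a short function. This is a sketch under simplifying assumptions (a fixed 20-byte header, no options, lengths only rather than real packet bytes); the name <tt>fragment</tt> and its parameters are hypothetical.

```python
def fragment(data_len, mtu, header_len=20, offset_units=0, more_follow=False):
    """Split data_len bytes of IP data for a link with the given MTU.

    offset_units is the fragment offset (in 8-byte units) of the data being
    split, and more_follow says whether its MF flag was already set, so an
    existing fragment can be fragmented again.
    Returns a list of (total_length, data_length, mf_flag, fragment_offset).
    """
    # Every fragment but the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")

    fragments = []
    sent = 0
    while sent < data_len:
        size = min(max_data, data_len - sent)
        is_last = sent + size == data_len
        # MF stays set on the final piece if the original had MF set.
        mf_flag = more_follow or not is_last
        fragments.append((header_len + size, size, mf_flag,
                          offset_units + sent // 8))
        sent += size
    return fragments


if __name__ == "__main__":
    first_pass = fragment(4500, mtu=2500)
    print(first_pass)
    # Fragment each piece again for a 1,500-byte MTU.
    for total, size, mf_flag, offset in first_pass:
        print(fragment(size, mtu=1500, offset_units=offset, more_follow=mf_flag))
```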
 
 
===Reassembly===
 
When a receiver detects an IP packet where either of the following is true:
 
 
* "more fragments" flag set
 
* "fragment offset" field is non-zero
 
 
then the receiver knows the packet is a fragment.
 
The receiver then stores the data with the identification field, fragment offset, and the more fragments flag.
 
When the receiver receives a fragment with the more fragments flag not set then it knows the length of the original data payload since the fragment offset plus the data length is equivalent to the original data payload size.
 
 
Using the example above, when the receiver receives fragment #4 the fragment offset (3960) and the data length (540) added together yield 4500 &mdash; the original data length.
 
 
Once it has all the fragments then it can reassemble the data in proper order (by using the fragment offsets) and pass it up the stack for further processing.
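
A receiver's bookkeeping might look like the following sketch. It assumes all fragments share one identification value and do not overlap, and it ignores timeouts; the function name <tt>reassemble</tt> and the tuple layout are hypothetical.

```python
def reassemble(fragments):
    """Rebuild the original data from (offset_units, mf_flag, data) tuples.

    Returns the payload as bytes, or None while fragments are still missing.
    """
    pieces = {}
    total_len = None
    for offset_units, mf_flag, data in fragments:
        start = offset_units * 8          # offsets count 8-byte blocks
        pieces[start] = data
        if not mf_flag:
            # The fragment without MF fixes the original data length.
            total_len = start + len(data)

    if total_len is None:
        return None                       # last fragment not yet seen

    payload = bytearray()
    for start in sorted(pieces):
        if start != len(payload):
            return None                   # a gap: some fragment is missing
        payload += pieces[start]
    return bytes(payload) if len(payload) == total_len else None


if __name__ == "__main__":
    original = bytes(i % 251 for i in range(4500))
    # The four fragments from the example above, arriving out of order.
    arrived = [
        (495, False, original[3960:]),
        (0, True, original[:1480]),
        (310, True, original[2480:3960]),
        (185, True, original[1480:2480]),
    ]
    print(reassemble(arrived) == original)
```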
 
 
==See also==
 
* [[Classful network]]
 
* [[Classless Inter-Domain Routing]]
 
* [[Internet Assigned Numbers Authority]]
 
* [[IPv6]]
 
* [[List of assigned /8 IP address blocks]]
 
* [[List of IP protocol numbers]]
 
* [[Regional Internet Registry]]
 
 
==External links==
 
{{external links}}
 
* RFC 791 – Internet Protocol
 
* http://www.iana.org &ndash; Internet Assigned Numbers Authority (IANA)
 
 
* http://www.networksorcery.com/enp/protocol/ip.htm &ndash; IP Header Breakdown, including specific options
 
* RFC 3344 IPv4 Mobility
 
 
Address exhaustion
 
* [http://www.ripe.net/rs/news/ipv4-ncc-20031030.html RIPE report on address consumption as of October 2003]
 
* [http://www.iana.org/assignments/ipv4-address-space Official current state of IPv4 /8 allocations, as maintained by IANA]
 
* [http://www.potaroo.net/tools/ipv4/index.html Dynamically generated graphs of IPv4 address consumption with predictions of exhaustion dates – Geoff Huston]
 
* [http://bgp.potaroo.net/ipv4/ Slightly historic (June 10, 2006) version of previous link. Good for historical comparison – Geoff Huston]
 
 
* [http://www.potaroo.net/ispcol/2005-11/numerology.html Historic (November, 2005) version of previous link.]
 
* [http://www.potaroo.net/ispcol/2003-08/ale.html Historic (August 2003) version of Geoff Huston article]
 
* [http://www.cisco.com/en/US/about/ac123/ac147/archived_issues/ipj_8-3/ipv4.html  A Pragmatic Report on IPv4 Address Space Consumption by Tony Hain (Cisco) as of September 2005.]
 
* [http://www.tndh.net/~tony/ietf/ipv4-pool-combined-view.pdf Quarterly update by Tony Hain (Cisco), last as of Nov 20, 2006]
 
* [http://www.ripe.net/info/info-services/ipv4/index.html Article on IPv4 Exhaustion – "Running Out of Time?"]
 
* [http://www.apnic.net/news/hot-topics/internet-gov/ip-china.html APNIC hot topics – IP addressing in China and the myth of address shortage]
 
 
[[Category:Internet Protocol|v4]]
 
[[Category:Network layer protocols]]
 
 
<!-- interwiki -->
 
 
[[bs:IPv4]]
 
[[da:IPv4]]
 
[[de:IPv4]]
 
[[es:IPv4]]
 
[[fr:IPv4]]
 
[[hr:IPv4]]
 
[[ko:IPv4]]
 
[[id:Alamat IP versi 4]]
 
[[it:IPv4]]
 
[[lv:IPv4]]
 
[[mk:IPv4]]
 
[[nl:Internet Protocol Version 4]]
 
[[ja:IPv4]]
 
[[no:IPv4]]
 
[[pl:IPv4]]
 
[[sk:IPv4]]
 
[[sv:IPv4]]
 
[[vi:IPv4]]
 
[[tr:IPv4]]
 
[[zh:IPv4]]
 

Latest revision as of 06:47, 14 September 2007