IP(7P) | Protocols | IP(7P) |
#include <sys/socket.h>
#include <netinet/in.h>
s = socket(AF_INET, SOCK_RAW, proto);
t = t_open ("/dev/rawip", O_RDWR);
IP is the internetwork datagram delivery protocol that is central to the Internet protocol family. Programs may use IP through higher-level protocols such as the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP), or may interface directly to IP. See tcp(7P) and udp(7P). Direct access may be by means of the socket interface, using a "raw socket," or by means of the Transport Level Interface (TLI). The protocol options defined in the IP specification may be set in outgoing datagrams.
Packets sent to or from this system may be subject to IPsec policy. See ipsec(7P) for more information.
The STREAMS driver /dev/rawip is the TLI transport provider that provides raw access to IP.
Raw IP sockets are connectionless and are normally used with the sendto() and recvfrom() calls (see send(3SOCKET) and recv(3SOCKET)), although the connect(3SOCKET) call may also be used to fix the destination for future datagrams. In this case, the read(2) or recv(3SOCKET) and write(2) or send(3SOCKET) calls may be used. If proto is IPPROTO_RAW or IPPROTO_IGMP, the application is expected to include a complete IP header when sending. Otherwise, that protocol number will be set in outgoing datagrams and used to filter incoming datagrams, and an IP header will be generated and prepended to each outgoing datagram. In either case, received datagrams are returned with the IP header and options intact.
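The sendto() usage described above can be sketched as follows. This is a minimal illustration, not a reference implementation: raw_send() is a hypothetical helper name, and the sketch assumes proto is a protocol number for which the stack generates the IP header. Opening a raw socket normally requires privilege, so the call is expected to fail with errno set for an unprivileged caller.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: send one payload over a raw IP socket.
 * Assumes proto is neither IPPROTO_RAW nor IPPROTO_IGMP, so the
 * payload starts right after the IP header that the stack prepends.
 * Returns the number of bytes sent, or -1 with errno set.
 */
ssize_t
raw_send(int proto, const char *dst_text, const void *payload, size_t len)
{
	struct sockaddr_in dst;
	ssize_t n;
	int s, saved;

	memset(&dst, 0, sizeof (dst));
	dst.sin_family = AF_INET;
	if (inet_pton(AF_INET, dst_text, &dst.sin_addr) != 1) {
		errno = EINVAL;		/* not a dotted-decimal IPv4 address */
		return (-1);
	}

	s = socket(AF_INET, SOCK_RAW, proto);
	if (s < 0)
		return (-1);		/* typically a privilege error */

	n = sendto(s, payload, len, 0, (struct sockaddr *)&dst, sizeof (dst));
	saved = errno;			/* close() may overwrite errno */
	(void) close(s);
	errno = saved;
	return (n);
}
```

A caller might use raw_send(IPPROTO_ICMP, "192.0.2.1", msg, msglen), where msg holds a complete ICMP message. On Solaris the program would be linked with the socket libraries.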
If an application uses IP_HDRINCL and provides the IP header contents, the IP stack does not modify the following supplied fields under any conditions: Type of Service, DF Flag, Protocol, and Destination Address. The IP Options and IHL fields are set by use of IP_OPTIONS, and Total Length is updated to include any options. Version is set to the default. Identification is chosen by the normal IP ID selection logic. The source address is updated if none was specified, and the TTL is changed if the packet has a broadcast destination address. Since an application cannot send down fragments (as IP assigns the IP ID), Fragment Offset is always 0. The IP Checksum field is computed by IP. None of the data beyond the IP header is changed, including the application-provided transport header.
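As a rough sketch of the division of labor described above, the following fills a 20-byte, option-free IPv4 header with only the fields the application controls and leaves the rest zero for the stack. The helper name fill_hdrincl_header() is hypothetical, and writing Total Length in network byte order is an assumption of this sketch; the byte order expected by a given platform should be checked.

```c
#include <stdint.h>
#include <string.h>

#define	IPV4_HDR_LEN	20	/* header without options */

/*
 * Hypothetical helper: prepare a header for use with IP_HDRINCL.
 * Identification, Fragment Offset, Checksum and Source Address are
 * left as zero because the text says the stack supplies them.
 */
void
fill_hdrincl_header(uint8_t hdr[IPV4_HDR_LEN], uint8_t tos, int dont_fragment,
    uint8_t ttl, uint8_t protocol, uint16_t total_len, const uint8_t dst[4])
{
	memset(hdr, 0, IPV4_HDR_LEN);
	hdr[0] = 0x45;			/* version 4, IHL 5; stack sets its own */
	hdr[1] = tos;			/* Type of Service: kept as supplied */
	hdr[2] = (uint8_t)(total_len >> 8);	/* assumed network byte order */
	hdr[3] = (uint8_t)(total_len & 0xff);
	if (dont_fragment)
		hdr[6] = 0x40;		/* DF flag: kept as supplied */
	hdr[8] = ttl;			/* changed only for broadcast */
	hdr[9] = protocol;		/* Protocol: kept as supplied */
	memcpy(&hdr[16], dst, 4);	/* Destination: kept as supplied */
}
```

The transport header and payload would follow this header in the same buffer passed to sendto().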
The socket options supported at the IP level are:
IP_OPTIONS
IP_SEC_OPT
IP_ADD_MEMBERSHIP
IP_DROP_MEMBERSHIP
IP_BOUND_IF
The following option takes in_pktinfo_t as the parameter:
IP_PKTINFO
typedef struct in_pktinfo {
    unsigned int   ipi_ifindex;   /* send/recv interface index */
    struct in_addr ipi_spec_dst;  /* matched source addr. */
    struct in_addr ipi_addr;      /* src/dst addr. in IP hdr */
} in_pktinfo_t;
When passed in (on transmit) via ancillary data with IP_PKTINFO, ipi_spec_dst is used as the source address and ipi_ifindex is used as the interface index to send the packet out.
The following options are boolean switches controlling the reception of ancillary data:
IP_RECVDSTADDR
IP_RECVIF
IP_RECVOPTS
IP_RECVPKTINFO
IP_RECVSLLA
IP_RECVTTL
IP_RECVTOS
The IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP options listed earlier take a struct ip_mreq as the parameter. The structure contains a multicast address, which must be set to the CLASS-D IP multicast address, and an interface address. Normally the interface address is set to INADDR_ANY, which causes the kernel to choose the interface on which to join.
IP_BLOCK_SOURCE
IP_UNBLOCK_SOURCE
IP_ADD_SOURCE_MEMBERSHIP
IP_DROP_SOURCE_MEMBERSHIP
The IP_BLOCK_SOURCE, IP_UNBLOCK_SOURCE, IP_ADD_SOURCE_MEMBERSHIP, and IP_DROP_SOURCE_MEMBERSHIP options listed above take a struct ip_mreq_source as the parameter. The structure contains a multicast address (which must be set to the CLASS-D IP multicast address), an interface address, and a source address.
MCAST_JOIN_GROUP
MCAST_BLOCK_SOURCE
MCAST_UNBLOCK_SOURCE
MCAST_LEAVE_GROUP
MCAST_JOIN_SOURCE_GROUP
MCAST_LEAVE_SOURCE_GROUP
The MCAST_ options listed above take a struct group_req or struct group_source_req as the parameter. The group_req structure contains an interface index and a multicast address, which must be set to the CLASS-D multicast address. The group_source_req structure is used for those options that include a source address. It contains an interface index, a multicast address, and a source address.
IP_MULTICAST_IF
IP_MULTICAST_TTL
IP_MULTICAST_LOOP
IP_TOS
IP_NEXTHOP
The multicast socket options (IP_MULTICAST_IF, IP_MULTICAST_TTL, IP_MULTICAST_LOOP and IP_RECVIF) can be used with any datagram socket type in the Internet family.
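As an illustration of using these options on an ordinary datagram socket, the sketch below sets the multicast TTL and loopback behavior. set_mcast_ttl_loop() is a hypothetical helper name, and passing the values as unsigned char follows the traditional BSD convention; this is an assumption of the sketch, and some platforms also accept an int.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: set the TTL used for outgoing multicast
 * datagrams and enable (1) or disable (0) local loopback of them.
 * Returns 0 on success, or -1 with errno set by setsockopt().
 */
int
set_mcast_ttl_loop(int s, unsigned char ttl, unsigned char loop)
{
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL,
	    &ttl, sizeof (ttl)) != 0)
		return (-1);
	if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP,
	    &loop, sizeof (loop)) != 0)
		return (-1);
	return (0);
}
```

For example, a UDP sender created with socket(AF_INET, SOCK_DGRAM, 0) could call set_mcast_ttl_loop(s, 4, 0) before its first sendto() to a group address.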
At the socket level, the socket option SO_DONTROUTE may be applied. This option forces datagrams being sent to bypass routing and forwarding by forcing the IP Time To Live field to 1, meaning that the packet will not be forwarded by routers.
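A minimal sketch of applying this option follows. set_dontroute() is a hypothetical helper name; the option itself is a standard socket-level boolean taking an int.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: turn SO_DONTROUTE on (non-zero) or off (0)
 * for socket s. Returns 0 on success, or -1 with errno set.
 */
int
set_dontroute(int s, int on)
{
	return (setsockopt(s, SOL_SOCKET, SO_DONTROUTE, &on, sizeof (on)));
}
```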
Raw IP datagrams can also be sent and received using the TLI connectionless primitives.
Datagrams flow through the IP layer in two directions: from the network up to user processes and from user processes down to the network. Using this orientation, IP is layered above the network interface drivers and below the transport protocols such as UDP and TCP. The Internet Control Message Protocol (ICMP) is logically a part of IP. See icmp(7P).
IP provides for a checksum of the header part, but not the data part, of the datagram. The checksum value is computed and set in the process of sending datagrams and checked when receiving datagrams.
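The header checksum can be illustrated with the one's-complement sum defined for the Internet protocols. The sketch below is not the Solaris implementation; ip_header_checksum() is a hypothetical helper that operates on a header held as raw bytes with the checksum field set to zero.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the 16-bit one's-complement checksum
 * over len bytes of an IP header. The header is read as big-endian
 * 16-bit words. When computing a checksum to send, bytes 10 and 11
 * (the checksum field) must be zero on entry. When verifying a
 * received header, a result of 0 means the checksum is correct.
 */
uint16_t
ip_header_checksum(const uint8_t *hdr, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	if (i < len)			/* odd trailing byte, padded with 0 */
		sum += (uint32_t)hdr[i] << 8;

	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);

	return ((uint16_t)~sum);
}
```

Because only the header is covered, corruption of the payload is left to be detected by the transport protocol's own checksum.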
IP options in received datagrams are processed in the IP layer according to the protocol specification. Currently recognized IP options include: security, loose source and record route (LSRR), strict source and record route (SSRR), record route, and internet timestamp.
By default, the IP layer will not forward IPv4 packets that are not addressed to it. This behavior can be overridden by using routeadm(1M) to enable the ipv4-forwarding option. IPv4 forwarding is configured at boot time based on the setting of routeadm(1M)'s ipv4-forwarding option.
For backwards compatibility, IPv4 forwarding can be enabled or disabled using ndd(1M)'s ip_forwarding variable. It is set to 1 if IPv4 forwarding is enabled, or 0 if it is disabled.
Additionally, finer-grained forwarding can be configured in IP. Each interface can be configured to forward IP packets by setting the IFF_ROUTER interface flag. This flag can be set and cleared using ifconfig(1M)'s router and -router options. If an interface's IFF_ROUTER flag is set, packets can be forwarded to or from the interface. If it is clear, packets will neither be forwarded from this interface to others, nor forwarded to this interface. Setting the ip_forwarding variable sets all of the IPv4 interfaces' IFF_ROUTER flags.
For backwards compatibility, each interface creates an <ifname>:ip_forwarding /dev/ip variable that can be modified using ndd(1M). An interface's :ip_forwarding ndd variable is a boolean variable that mirrors the status of its IFF_ROUTER interface flag. It is set to 1 if the flag is set, or 0 if it is clear. This interface-specific <ifname>:ip_forwarding ndd variable is obsolete and may be removed in a future release of Solaris. The ifconfig(1M) router and -router options are preferred.
The IP layer sends an ICMP message back to the source host in many cases when it receives a datagram that cannot be handled. A "time exceeded" ICMP message is sent if the "time to live" field in the IP header drops to zero in the process of forwarding a datagram. A "destination unreachable" message is sent if a datagram cannot be forwarded because there is no route to the final destination, or if it cannot be fragmented. If the datagram is addressed to the local host but is destined for a protocol that is not supported or a port that is not in use, a destination unreachable message is also sent. The IP layer may send an ICMP "source quench" message if it is receiving datagrams too quickly. ICMP messages are only sent for the first fragment of a fragmented datagram and are never returned in response to errors in other ICMP messages.
The IP layer supports fragmentation and reassembly. Datagrams are fragmented on output if the datagram is larger than the maximum transmission unit (MTU) of the network interface. Fragments of received datagrams are dropped from the reassembly queues if the complete datagram is not reconstructed within a short time period.
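The arithmetic behind output fragmentation can be sketched as follows. This illustrates the general IPv4 rules (fragment offsets are counted in 8-byte units, so every fragment except the last carries a multiple of 8 data bytes) rather than the stack's actual code. plan_fragments() and struct frag are hypothetical names, and the sketch assumes every fragment carries a header of the same length.

```c
#include <stddef.h>
#include <stdint.h>

struct frag {
	uint16_t offset_units;	/* Fragment Offset field, in 8-byte units */
	uint16_t data_len;	/* data bytes carried by this fragment */
	int more_fragments;	/* 1 if the MF flag would be set */
};

/*
 * Hypothetical helper: split data_len bytes of datagram data into
 * fragments that fit in mtu, given a header of hdr_len bytes.
 * Returns the number of fragments written to out, or 0 if the MTU
 * is too small or out has fewer than the required entries.
 */
size_t
plan_fragments(size_t data_len, size_t hdr_len, size_t mtu,
    struct frag *out, size_t max_out)
{
	size_t per_frag, off = 0, n = 0;

	if (mtu < hdr_len + 8)
		return (0);
	per_frag = ((mtu - hdr_len) / 8) * 8;	/* round down to 8 bytes */

	while (off < data_len) {
		size_t chunk = data_len - off;

		if (chunk > per_frag)
			chunk = per_frag;
		if (n == max_out)
			return (0);
		out[n].offset_units = (uint16_t)(off / 8);
		out[n].data_len = (uint16_t)chunk;
		out[n].more_fragments = (off + chunk < data_len);
		n++;
		off += chunk;
	}
	return (n);
}
```

For instance, 3980 data bytes with a 20-byte header and an MTU of 1500 yield three fragments carrying 1480, 1480, and 1020 bytes at offsets 0, 185, and 370.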
Errors in sending discovered at the network interface driver layer are passed by IP back up to the user process.
Multi-Data Transmit allows more than one packet to be sent from the IP module to another module in a given call, thereby reducing the per-packet processing costs. The behavior of Multi-Data Transmit can be overridden by using ndd(1M) to set the /dev/ip variable ip_multidata_outbound to 0. Note that the IP module will only initiate Multi-Data Transmit if the network interface driver supports it.
Through the netinfo framework, this driver provides the following packet events:
Physical in
Physical out
Forwarding
Loopback in
Loopback out
Currently, only a single function may be registered for each event. As a result, if the slot for an event is already occupied by someone else, a second attempt to register a callback fails.
To receive packet events in a kernel module, it is first necessary to obtain a handle for either IPv4 or IPv6 traffic. This is achieved by passing NHF_INET or NHF_INET6 through to a net_protocol_lookup() call. The value returned from this call must then be passed into a call to net_hook_register(), along with a description of the hook to add. For a description of the structure passed through to the callback, see hook_pkt_event(9S). For IP packets, this structure is filled out as follows:
hpe_ifp
hpe_ofp
hpe_hdr
hpe_mp
hpe_mb
In addition to events describing packets as they move through the system, it is also possible to receive notification of events relating to network interfaces. These events are all reported back through the same callback. The list of events is as follows:
plumb
unplumb
up
down
address change
ifconfig(1M), routeadm(1M), ndd(1M), read(2), write(2), socket.h(3HEAD), bind(3SOCKET), connect(3SOCKET), getsockopt(3SOCKET), recv(3SOCKET), send(3SOCKET), defaultrouter(4), icmp(7P), if_tcp(7P), inet(7P), ip(7P), ip6(7P), ipsec(7P), routing(7P), tcp(7P), udp(7P), net_hook_register(9F), hook_pkt_event(9S)
Braden, R., RFC 1122, Requirements for Internet Hosts − Communication Layers, Information Sciences Institute, University of Southern California, October 1989.
Postel, J., RFC 791, Internet Protocol − DARPA Internet Program Protocol Specification, Information Sciences Institute, University of Southern California, September 1981.
A socket operation may fail with one of the following errors returned:
EACCES
Setting the IP_NEXTHOP was attempted by a process lacking the PRIV_SYS_NET_CONFIG privilege.
EADDRINUSE
EADDRNOTAVAIL
EINVAL
EINVAL
EINVAL
EISCONN
EISCONN
EMSGSIZE
ENETUNREACH
ENOTCONN
ENOBUFS
ENOBUFS
EINVAL
EHOSTUNREACH
Invalid (offlink) nexthop address for IP_NEXTHOP.
EINVAL
EADDRNOTAVAIL
EADDRINUSE
ENOENT
ENOPROTOOPT
EPERM
Raw sockets should receive ICMP error packets relating to the protocol; currently such packets are simply discarded.
Users of higher-level protocols such as TCP and UDP should be able to see received IP options.
September 18, 2020 |