
address
IPv4: 32 bits long (4 bytes). The address is composed of a network and a host portion, which depend on the address class. Various address classes are defined: A, B, C, D, or E, depending on the initial few bits. The total number of IPv4 addresses is 4 294 967 296. The text form of the IPv4 address is nnn.nnn.nnn.nnn, where 0<=nnn<=255, and each n is a decimal digit. Leading zeros may be omitted. The maximum number of print characters is 15, not counting a mask.
IPv6: 128 bits long (16 bytes). The basic architecture is 64 bits for the network number and 64 bits for the host number. Often, the host portion of an IPv6 address (or part of it) will be a MAC address or other interface identifier. Depending on the subnet prefix, IPv6 has a more complicated architecture than IPv4. The number of IPv6 addresses is about 7.9 x 10^28 (exactly 79 228 162 514 264 337 593 543 950 336, or 2^96) times larger than the number of IPv4 addresses. The text form of the IPv6 address is xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx, where each x is a hexadecimal digit, representing 4 bits. Leading zeros may be omitted. The double colon (::) may be used once in the text form of an address, to designate any number of 0 bits. For example, ::ffff:10.120.78.40 is an IPv6 IPv4-mapped address. (See RFC 2373 for details. To view this RFC, see RFC Editor (http://www.rfc-editor.org/rfcsearch.html).)

address allocation
IPv4: Originally, addresses were allocated by network class. As address space is depleted, smaller allocations using Classless Inter-Domain Routing (CIDR) are made. Allocation has not been balanced among institutions and nations.
IPv6: Allocation is in the earliest stages. The Internet Engineering Task Force (IETF) and Internet Architecture Board (IAB) have recommended that essentially every organization, home, or entity be allocated a /48 subnet prefix length. This would leave 16 bits for the organization to do subnetting. The address space is large enough to give every person in the world their own /48 subnet prefix length.
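The IPv6 text form and the IPv4-mapped example above can be explored with the standard conversion functions inet_pton() and inet_ntop(). The following is a minimal sketch, not part of the original comparison; the helper names parseIPv6, isV4Mapped and toText are hypothetical and chosen only for illustration.

```cpp
#include <arpa/inet.h>    // inet_pton(), inet_ntop()
#include <netinet/in.h>   // struct in6_addr, IN6_IS_ADDR_V4MAPPED, INET6_ADDRSTRLEN
#include <string>

// Hypothetical helper: parse IPv6 text into binary form.
// Returns false when the text is not a valid IPv6 address.
bool parseIPv6(const std::string &text, in6_addr &out)
{
    return inet_pton(AF_INET6, text.c_str(), &out) == 1;
}

// Hypothetical helper: true when the text is a valid IPv6 address
// of the IPv4-mapped form ::ffff:a.b.c.d.
bool isV4Mapped(const std::string &text)
{
    in6_addr addr;
    return parseIPv6(text, addr) && IN6_IS_ADDR_V4MAPPED(&addr);
}

// Hypothetical helper: convert binary form back to text.
// Returns an empty string if the conversion fails.
std::string toText(const in6_addr &addr)
{
    char buf[INET6_ADDRSTRLEN];
    if (inet_ntop(AF_INET6, &addr, buf, sizeof(buf)) == NULL)
        return std::string();
    return std::string(buf);
}
```

Passing the fully written loopback address through parseIPv6() and toText() is one way to see the leading-zero and double-colon shortening rules applied by the library.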

address lifetime
IPv4: Generally, not an applicable concept, except for addresses assigned using DHCP.
IPv6: IPv6 addresses have two lifetimes: preferred and valid, with the preferred lifetime always <= valid. After the preferred lifetime expires, the address is not to be used as a source IP address. After the valid lifetime expires, the address is not used (recognized) as a valid destination IP address for incoming packets. Some IPv6 addresses have, by definition, infinite preferred and valid lifetimes; for example link-local (see address scope).

address mask
IPv4: Used to designate network from host portion.
IPv6: Not used (see address prefix).

address prefix
IPv4: Sometimes used to designate network from host portion. Sometimes written as /nn suffix on presentation form of address.
IPv6: Used to designate the subnet prefix of an address. Written as /nnn (up to 3 decimal digits, 0 <= nnn <= 128) suffix after the print form. An example is fe80::982:2a5c/10, where the first 10 bits comprise the subnet prefix.

Address Resolution Protocol (ARP)
IPv4: Address Resolution Protocol is used by IPv4 to find a physical address, such as the MAC or link address, associated with an IPv4 address.
IPv6: IPv6 embeds these functions within IP itself as part of the algorithms for stateless autoconfiguration and neighbor discovery using Internet Control Message Protocol version 6 (ICMPv6). Hence, there is no such thing as ARP6.

address scope
IPv4: For unicast addresses, the concept does not apply. There are designated private address ranges and loopback. Outside of that, addresses are assumed to be global.
IPv6: In IPv6, address scope is part of the architecture. Unicast addresses have 3 defined scopes, including link-local, site-local and global; and multicast addresses have 14 scopes. Default address selection for both source and destination takes scope into account. A scope zone is an instance of a scope in a particular network. As a consequence, IPv6 addresses sometimes have to be entered or associated with a zone ID. The syntax is %zid where zid is a number (usually small) or a name. The zone ID is written after the address and before the prefix. For example, 2ba::1:2:14e:9a9b:c%3/48.
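To make the /nnn prefix notation concrete, the sketch below compares the first nnn bits of two IPv6 addresses. It is an illustration only: the function name inPrefix is hypothetical, and it assumes plain addresses without a zone ID (inet_pton() does not parse the %zid suffix).

```cpp
#include <arpa/inet.h>    // inet_pton()
#include <netinet/in.h>   // struct in6_addr
#include <string>

// Hypothetical helper: true when the first prefixBits bits of addrText
// equal the first prefixBits bits of prefixText (for example "fe80::" and 10).
bool inPrefix(const std::string &addrText, const std::string &prefixText, int prefixBits)
{
    if (prefixBits < 0 || prefixBits > 128)
        return false;

    in6_addr addr, prefix;
    if (inet_pton(AF_INET6, addrText.c_str(), &addr) != 1)
        return false;
    if (inet_pton(AF_INET6, prefixText.c_str(), &prefix) != 1)
        return false;

    int fullBytes = prefixBits / 8;   // bytes that must match completely
    int restBits  = prefixBits % 8;   // leading bits of the next byte that must match

    for (int i = 0; i < fullBytes; ++i)
        if (addr.s6_addr[i] != prefix.s6_addr[i])
            return false;

    if (restBits != 0)
    {
        // Keep only the top restBits bits of the partially covered byte.
        unsigned char mask = static_cast<unsigned char>(0xFF << (8 - restBits));
        if ((addr.s6_addr[fullBytes] & mask) != (prefix.s6_addr[fullBytes] & mask))
            return false;
    }
    return true;
}
```

With the example from the table, inPrefix("fe80::982:2a5c", "fe80::", 10) compares the whole first byte and the top two bits of the second byte.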

address types
IPv4: Unicast, multicast, and broadcast.
IPv6: Unicast, multicast, and anycast. See IPv6 address types for descriptions.

communications trace
IPv4: A tool to collect a detailed trace of TCP/IP (and other) packets that enter and leave an iSeries server.
IPv6: Same for IPv6, and IPv6 is supported, including ICMPv6 and IPv6 packets tunneled in IPv4.

configuration
IPv4: Configuration must be done on a newly installed system before it can communicate; that is, IP addresses and routes must be assigned.
IPv6: Configuration is optional, depending on functions required. An appropriate Ethernet or tunnel interface must be designated as an IPv6 interface, using iSeries Navigator. Once that is done, IPv6 interfaces are self-configuring. So, the system will be able to communicate with other IPv6 systems that are local and remote, depending on the type of network and whether an IPv6 router exists.

Domain Name System (DNS)
IPv4: Applications accept host names and then use DNS to get an IP address, using socket API gethostbyname(). Applications also accept IP addresses and then use DNS to get host names using gethostbyaddr(). For IPv4, the domain for reverse lookups is in-addr.arpa.
IPv6: Same for IPv6. Support for IPv6 exists using AAAA (quad A) record type and reverse lookup (IP-to-name). An application may elect to accept IPv6 addresses from DNS (or not) and then use IPv6 to communicate (or not). The socket API gethostbyname() is unchanged for IPv6 and the getaddrinfo() API can be used to obtain (at application choice) IPv6 only, or IPv4 and IPv6 addresses. For IPv6, the domain used for reverse nibble lookups is ip6.arpa, and if not found then ip6.int (see API getnameinfo()).

Dynamic Host Configuration Protocol (DHCP)
IPv4: Used to dynamically obtain an IP address and other configuration information.
IPv6: Currently, DHCP does not support IPv6.

File Transfer Protocol (FTP)
IPv4: File Transfer Protocol allows you to send and receive files across networks.
IPv6: Currently, FTP does not support IPv6.

fragments
IPv4: When a packet is too big for the next link over which it is to travel, it can be fragmented by the sender (host or router).
IPv6: For IPv6, fragmentation can only occur at the source node, and reassembly is only done at the destination node. Currently, the fragmentation extension header is not supported.

host table
IPv4: On iSeries Navigator, a configurable table that associates an Internet address with a host name; for example, 127.0.0.1, loopback. This table is used by the sockets name resolver, either before a DNS lookup or after a DNS lookup fails (determined by host name search priority).
IPv6: Currently, this table does not support IPv6. Customers need to configure a AAAA record in a DNS for IPv6 domain resolution. You may run the DNS locally on the same system as the resolver, or you may run it on a different system.

interface
IPv4: The conceptual or logical entity used by TCP/IP to send and receive packets and always closely associated with an IPv4 address, if not named with an IPv4 address. Sometimes referred to as a logical interface. Can be started and stopped independently of each other and independently of TCP/IP using STRTCPIFC and ENDTCPIFC commands and using iSeries Navigator.
IPv6: Same concept as IPv4. Can be started and stopped independently of each other and independently of TCP/IP using iSeries Navigator only.

Internet Control Message Protocol (ICMP)
IPv4: ICMP is used by IPv4 to communicate network information.
IPv6: Used similarly for IPv6; however, Internet Control Message Protocol version 6 (ICMPv6) provides some new attributes. Basic error types remain, such as destination unreachable, echo request and reply. New types and codes are added to support neighbor discovery and related functions.

Internet Group Management Protocol (IGMP)
IPv4: IGMP is used by IPv4 routers to find hosts that want traffic for a particular multicast group, and used by IPv4 hosts to inform IPv4 routers of existing multicast group listeners (on the host).
IPv6: Replaced by MLD (multicast listener discovery) protocol for IPv6. Does essentially what IGMP does for IPv4, but uses ICMPv6 by adding a few MLD-specific ICMPv6 type values.

IP header
IPv4: Variable length of 20-60 bytes, depending on IP options present.
IPv6: Fixed length of 40 bytes. There are no IP header options. Generally, the IPv6 header is simpler than the IPv4 header.

IP header options
IPv4: Various options that may accompany an IP header (before any transport header).
IPv6: The IPv6 header has no options. Instead, IPv6 adds additional (optional) extension headers. The extension headers are AH and ESP (unchanged from IPv4), hop-by-hop, routing, fragment, and destination. Currently, IPv6 does not support any extension headers.

IP header protocol byte
IPv4: The protocol code of the transport layer or packet payload; for example, ICMP.
IPv6: The type of header immediately following the IPv6 header. Uses the same values as the IPv4 protocol field. But the architectural effect is to allow a currently defined range of next headers, and is easily extended. The next header will be a transport header, an extension header, or ICMPv6.

IP header Type of Service (TOS) byte
IPv4: Used by QoS and differentiated services to designate a traffic class.
IPv6: Designates the IPv6 traffic class, similarly to IPv4. Uses different codes. Currently, IPv6 does not support TOS.

iSeries Navigator support
IPv4: iSeries Navigator provides a full configuration function for TCP/IP.
IPv6: The optional configuration for IPv6 is provided in full by iSeries Navigator, including the IPv6 Configuration wizard.

LAN connection
IPv4: Used by an IP interface to get to the physical network. Many types exist; for example, token ring, Ethernet, and PPP. Sometimes referred to as the physical interface, link, or line.
IPv6: IPv6 has the same concept. Currently, only the 2838 and 2849 Ethernet cards and tunnel lines are supported.

Layer 2 Tunnel Protocol (L2TP)
IPv4: L2TP can be thought of as virtual PPP, and works over any supported line type.
IPv6: Currently, L2TP does not support IPv6.

loopback address
IPv4: An interface with address of 127.*.*.* (typically 127.0.0.1) that can only be used by a node to send packets to itself. The physical interface (line description) is named *LOOPBACK.
IPv6: The concept is the same as in IPv4, and the single loopback address is 0000:0000:0000:0000:0000:0000:0000:0001 or ::1 (shortened version). The virtual physical interface is named *LOOPBACK6.

Maximum Transmission Unit (MTU)
IPv4: Maximum transmission unit of a link is the maximum number of bytes that a particular link type, such as Ethernet or modem, supports. For IPv4, 576 is the typical minimum.
IPv6: IPv6 has an architected lower bound on MTU of 1280 bytes. That is, IPv6 will not fragment packets below this limit. To send IPv6 over a link with less than 1280 MTU, the link-layer must transparently fragment and defragment the IPv6 packets.

netstat
IPv4: A tool to look at status of TCP/IP connections, interfaces, or routes. Available using iSeries Navigator and 5250.
IPv6: Same for IPv6, and IPv6 is supported for both 5250 and iSeries Navigator.

Network Address Translation (NAT)
IPv4: Basic firewall functions integrated into TCP/IP, configured using iSeries Navigator.
IPv6: Currently, NAT does not support IPv6. More generally, IPv6 does not require NAT. The expanded address space of IPv6 eliminates the address shortage problem and enables easier renumbering.

network table
IPv4: On iSeries Navigator, a configurable table that associates a network name with an IP address without mask. For example, host Network14 and IP address 1.2.3.4.
IPv6: Currently, no changes are made to this table for IPv6.

node info query
IPv4: Does not exist.
IPv6: A simple and convenient network tool that should work like ping, except with content: an IPv6 node may query another IPv6 node for the target's DNS name, IPv6 unicast address, or IPv4 address. Currently, not supported.

packet filtering
IPv4: Basic firewall functions integrated into TCP/IP, configured using iSeries Navigator.
IPv6: Currently, packet filtering does not support IPv6. However, IPv4 filtering can be applied to tunneled IPv6 traffic.

packet forwarding
IPv4: The iSeries server can be configured to forward IP packets it receives for nonlocal IP addresses. Typically, the inbound interface and outbound interface are connected to different LANs.
IPv6: Currently, IPv6 packets are not forwarded.

packet tunneling
IPv4: In IPv4, tunneling occurs in VPN for tunnel-mode VPN connections (IPv4 tunneled in IPv4) and in L2TP.
IPv6: For IPv6, tunneling in IPv4 packets is expected to be a major part of its evolution. Currently, at least 5 different types of 6-in-4 tunneling are defined by IETF, each with different attributes and advantages. A basic and flexible type of IPv6-in-IPv4 tunneling is supported to allow IPv6 nodes to communicate across the existing IPv4 Internet. Called configured tunneling, it provides a virtual point-to-point link between two IPv6 nodes and uses a new type of tunnel line called *TNLCFG64.

PING
IPv4: Basic TCP/IP tool to test reachability. Available using iSeries Navigator and 5250.
IPv6: Same for IPv6, and IPv6 is supported, for both 5250 and iSeries Navigator.

Point-to-Point Protocol (PPP)
IPv4: PPP supports dial-up interfaces over various modem and line types.
IPv6: Currently, PPP does not support IPv6.

port restrictions
IPv4: These iSeries panels allow a customer to configure selected port number or port number ranges for TCP or UDP so that they are only available for a specific profile.
IPv6: Not supported for IPv6. Configured restrictions apply only to IPv4.

ports
IPv4: TCP and UDP have separate port spaces, each identified by port numbers in the range 1-65535.
IPv6: For IPv6, ports work the same as IPv4. Because these are in a new address family, there are now four separate port spaces. For example, there are two TCP port 80 spaces to which an application can bind, one in AF_INET and one in AF_INET6.

private and public addresses
IPv4: All IPv4 addresses are public, except for three address ranges that have been designated as private by IETF RFC 1918: 10.*.*.* (10/8), 172.16.0.0 through 172.31.255.255 (172.16/12), and 192.168.*.* (192.168/16). Private address domains are commonly used within organizations. Private addresses cannot be routed across the Internet.
IPv6: IPv6 has an analogous concept, but with important differences. Addresses are public or temporary, previously termed anonymous. See RFC 3041. Unlike IPv4 private addresses, temporary addresses can be globally routed. The motivation is also different; IPv6 temporary addresses are meant to shield the identity of a client when it initiates communication (a privacy concern). Temporary addresses have a limited lifetime, and do not contain an interface identifier that is a link (MAC) address. They are generally indistinguishable from public addresses. IPv6 has the notion of limited address scope using its architected scope designations (see address scope).

protocol table
IPv4: On iSeries Navigator, a configurable table that associates a protocol name with its assigned protocol number; for example, UDP, 17. The system is shipped with a small number of entries: IP, TCP, UDP, ICMP.
IPv6: The table supports IPv6 without change.

Quality of service (QoS)
IPv4: Quality of service allows you to request packet priority and bandwidth for TCP/IP applications.
IPv6: Currently, QoS does not support IPv6. However, when IPv6 is tunneled in IPv4, existing iSeries QoS facilities can be applied to the IPv4 traffic, which will then transparently handle the IPv6 payloads.

renumbering
IPv4: Done by manual reconfiguration, with the possible exception of DHCP. Generally, for a site or organization, a difficult and troublesome process to avoid if possible.
IPv6: Is an important architectural element of IPv6, and is supposed to be largely automatic especially within the /48 prefix.

route
IPv4: Logically, a mapping of a set of IP addresses (may contain only 1) to a physical interface and a single next hop IP address. IP packets whose destination address is defined as part of the set are forwarded to the next hop using the line. IPv4 routes are associated with an IPv4 interface, hence, an IPv4 address. The default route is *DFTROUTE.
IPv6: Conceptually, the same as IPv4. One important difference: IPv6 routes are associated (bound) to a physical interface (a link, such as *TNLCFG64 or ETH03) rather than an interface. There are various reasons for this. One reason is that source address selection functions differently for IPv6 than for IPv4. See source address selection. Duplicate routes are allowed to improve robustness, but they are ignored during route lookup.

Routing Information Protocol (RIP)
IPv4: RIP is a routing protocol supported by the routed daemon.
IPv6: Currently, RIP does not support IPv6. IPv6 routing uses static routes.

services table
IPv4: On the iSeries server, a configurable table that associates a service name with a port and protocol; for example, service name FTP-control, port 21, TCP and UDP. A large number of well-known services are listed in the services table. Many applications use this table to determine which port to use.
IPv6: No changes are made to this table for IPv6.

Simple Network Management Protocol (SNMP)
IPv4: SNMP is a protocol for system management.
IPv6: Currently, SNMP does not support IPv6.

sockets API
IPv4: These APIs are the way applications use TCP/IP. Applications that do not need IPv6 are not affected by sockets changes to support IPv6.
IPv6: IPv6 enhances sockets so that applications can now use IPv6, using a new address family: AF_INET6. The enhancements have been designed so that existing IPv4 applications are completely unaffected by IPv6 and API changes. Applications that want to support concurrent IPv4 and IPv6 traffic, or IPv6-only traffic, are easily accommodated using IPv4-mapped IPv6 addresses of the form ::ffff:a.b.c.d, where a.b.c.d is the IPv4 address of the client. The new APIs also include support for converting IPv6 addresses from text to binary and from binary to text. See Use AF_INET6 address family for more information on sockets enhancements for IPv6.

source address selection
IPv4: An application may designate a source IP (typically, using sockets bind()). If it binds to INADDR_ANY, a source IP is chosen based on the route.
IPv6: As with IPv4, an application may designate a source IPv6 address using bind(). Similarly to IPv4, it can let the system choose an IPv6 source address by using in6addr_any. But since IPv6 lines have many IPv6 addresses, the internal method of choosing a source IP is different.

starting and stopping
IPv4: Use STRTCP and ENDTCP to start or end TCP/IP.
IPv6: Same as IPv4. IPv4 and IPv6 are not started or stopped independently of one another or independently of TCP/IP. That is, you start and stop all of TCP/IP, not just IPv4 or IPv6. Any IPv6 interfaces are automatically started if the AUTOSTART parameter = *YES (the default). IPv6 cannot be used or configured without IPv4, and IPv6 must have IPv6 loopback configured (::1).

Telnet
IPv4: Telnet allows you to log on and use a remote computer as though you were connected to it directly.
IPv6: Currently, Telnet does not support IPv6.

trace route
IPv4: Basic TCP/IP tool to do path determination. Available using iSeries Navigator and 5250.
IPv6: Same for IPv6, and IPv6 is supported for both 5250 and iSeries Navigator.

transport layers
IPv4: TCP, UDP, RAW. A new transport, Stream Control Transmission Protocol (SCTP), aims to offer the best features of TCP and UDP, that is, guaranteed connectionless communication. SCTP is in the earliest stage of use, and is not supported on iSeries.
IPv6: Same three transports exist and are functionally unchanged for IPv6.

unspecified address
IPv4: Apparently, not defined, as such. Socket programming uses 0.0.0.0 as INADDR_ANY.
IPv6: Defined as ::/128 (128 0 bits). It is used as the source IP in some neighbor discovery packets, and various other contexts, like sockets. Socket programming uses ::/128 as in6addr_any.

virtual private networking (VPN)
IPv4: Virtual private networking (using IPsec) allows you to extend a secure, private network over an existing public network.
IPv6: Currently, VPN does not support IPv6. However, when IPv6 is tunneled in IPv4, existing iSeries VPN facilities can be applied to the IPv4 traffic, which then transparently handles the IPv6 payloads.
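The DNS entry in the comparison above mentions getaddrinfo() as the way for an application to obtain IPv4 addresses, IPv6 addresses, or both. The sketch below shows the general shape of such a call on a POSIX system; it is an illustration, not iSeries-specific code, and the helper name addressFamilyOf is hypothetical. Setting numericOnly adds AI_NUMERICHOST so that no DNS query is made.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>      // getaddrinfo(), freeaddrinfo(), struct addrinfo
#include <cstring>      // memset()

// Hypothetical helper: returns the address family (AF_INET or AF_INET6)
// of the first result getaddrinfo() reports for host, or -1 on failure.
int addressFamilyOf(const char *host, bool numericOnly)
{
    struct addrinfo hints;
    std::memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;     // accept IPv4 or IPv6 results
    hints.ai_socktype = SOCK_STREAM;   // one result per address, not per socket type
    if (numericOnly)
        hints.ai_flags = AI_NUMERICHOST;   // host must be a literal address

    struct addrinfo *result = NULL;
    if (getaddrinfo(host, NULL, &hints, &result) != 0)
        return -1;

    int family = result->ai_family;
    freeaddrinfo(result);
    return family;
}
```

An application that wants IPv6 only would set hints.ai_family to AF_INET6 instead of AF_UNSPEC.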

Sockets are an inter-process network communication implementation using an Internet Protocol (IP) stack on an Ethernet transport. Sockets are language and protocol independent and available to "C", Perl, Python, Ruby and Java (and more) programmers. The "C" language BSD API is used on Linux, all popular variants of Unix, Microsoft Windows (NT, 2000, XP, ... and later) and even embedded OSs like VxWorks. It is by far the most popular implementation of inter-process network communication.

Sockets allow one process to communicate with another whether it is local on the same computer system or remote over the network. Many other higher level protocols are built upon sockets technology.

The sockets API provides many configuration options, so we will try to cover the socket API components and then give examples of a few implementations. It would be very difficult to cover all variations of its use.

Sockets utilize the following standard protocols:

Protocol: Description
IP: Internet Protocol provides network routing using IP addressing, e.g. 192.168.1.204
UDP: User Datagram Protocol - IP with ports to distinguish among processes running on same host. No data verification.
TCP: Transmission Control Protocol - IP with ports to distinguish among processes running on same host. Connection oriented, stream transfer, full duplex, reliable with data verification.

BSD socket API:


Typically one configures a socket server to which a socket client may attach and communicate. The IP protocol layer also requires that the domain name or IP addresses of the communicating processes be made known. Within the IP protocol it is also important to specify the transport mechanism used: TCP or UDP. The BSD socket API is a "C" programming API. The examples shown are compiled using the GNU C++ compiler on Linux:

Basic steps in using a socket:

Socket include files:

Include File: Description
sys/types.h: Types used in sys/socket.h and netinet/in.h
netinet/in.h: Internet domain address structures and functions
netinet/tcp.h: Socket option macro definitions, TCP headers, enums, etc.
sys/socket.h: Structures and functions used for socket API, i.e. accept(), bind(), connect(), listen(), recv(), send(), setsockopt(), shutdown(), etc.
netdb.h: Used for domain/DNS hostname lookup
sys/select.h: Used by the select(), pselect() functions and defines FD_CLR, FD_ISSET, FD_SET, FD_ZERO macros
sys/time.h: select() uses argument of type struct timeval and pselect() uses struct timespec defined by this include file.
arpa/inet.h: Definitions for internet operations. Prototypes functions such as htonl(), htons(), ntohl(), ntohs(), inet_addr(), inet_ntoa(), etc.
unistd.h: Defines constants and types
errno.h: Defines system error numbers

Create the socket instance:

Open a socket using TCP: Basic declarations and call to "socket".

    #include <iostream>
    #include <cstdlib>        // exit(), EXIT_FAILURE
    #include <cstdio>         // perror()
    #include <sys/types.h>    // Types used in sys/socket.h and netinet/in.h
    #include <netinet/in.h>   // Internet domain address structures and functions
    #include <sys/socket.h>   // Structures and functions used for socket API
    #include <netdb.h>        // Used for domain/DNS hostname lookup
    #include <unistd.h>
    #include <errno.h>

    using namespace std;

    int main()
    {
       int socketHandle;

       // create socket
       if((socketHandle = socket(AF_INET, SOCK_STREAM, IPPROTO_IP)) < 0)
       {
          // No descriptor was created, so there is nothing to close here.
          perror("socket");
          exit(EXIT_FAILURE);
       }
       ...
       ...
    }

Socket function prototype:


int socketHandle = socket(int socket_family, int socket_type, int protocol);

Choose socket communications family/domain:

   o Internet IPV4: AF_INET
   o Internet IPV6: AF_INET6
   o Unix path name (communicating processes are on the same system): AF_UNIX

Choose socket type:

   o TCP: SOCK_STREAM
   o UDP: SOCK_DGRAM
   o Raw protocol at network layer: SOCK_RAW

Choose socket protocol: (See /etc/protocols)

   o Internet Protocol (IP): 0 or IPPROTO_IP
   o ICMP: 1
   o ...
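The family and type choices above can be tried out directly. The following sketch creates a socket for a given family/type pair and asks the system which type it actually created, using the standard SO_TYPE socket option. The helper name createdSocketType is hypothetical and the protocol argument is assumed to be 0 (the default protocol for the pair).

```cpp
#include <sys/types.h>
#include <sys/socket.h>   // socket(), getsockopt(), SOL_SOCKET, SO_TYPE
#include <netinet/in.h>
#include <unistd.h>       // close()

// Hypothetical helper: create a socket, then report the socket type the
// system recorded for it. Returns -1 if the socket cannot be created.
int createdSocketType(int family, int type)
{
    int fd = socket(family, type, 0);   // 0 selects the default protocol
    if (fd < 0)
        return -1;

    int actual = -1;
    socklen_t len = sizeof(actual);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &actual, &len) < 0)
        actual = -1;

    close(fd);
    return actual;
}
```

For example, createdSocketType(AF_INET, SOCK_DGRAM) exercises the UDP combination, and AF_UNIX with SOCK_STREAM exercises the local path-name family.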

Also see: socket man page, protocols man page

Configure the socket as a client or server:

This is specific to whether the application is a socket client or a socket server.

Comparison of sequence of BSD API calls:

Socket Server      Socket Client
socket()           socket()
bind()             gethostbyname()
listen()
accept()           connect()
recv()/send()      recv()/send()
close()            close()

Socket server:

   o bind(): bind the socket to a local socket address. This assigns a name to the socket.
   o listen(): listen for connections on a socket created with "socket()" and "bind()" and accept incoming connections. This is used for TCP and not UDP. Zero is returned on success.
   o accept(): accept a connection on a socket. Accept the first connection request on the queue of pending connections, create a new connected socket with mostly the same properties as defined by the call to "socket()", and allocate a new file descriptor for the socket, which is returned. The newly created socket is no longer in the listening state. Note this call blocks until a client connects.

    ...
    ...
    #define MAXHOSTNAME 256
    ...
    ...

    struct sockaddr_in socketInfo;
    char sysHost[MAXHOSTNAME+1];  // Hostname of this computer we are running on
    struct hostent *hPtr;
    int portNumber = 8080;

    bzero(&socketInfo, sizeof(sockaddr_in));  // Clear structure memory

    // Get system information

    gethostname(sysHost, MAXHOSTNAME);  // Get the name of this computer we are running on
    if((hPtr = gethostbyname(sysHost)) == NULL)
    {
       cerr << "System hostname misconfigured." << endl;
       exit(EXIT_FAILURE);
    }

    // Load system information into socket data structures

    socketInfo.sin_family = AF_INET;
    // Use any address available to the system. This is a typical configuration for a server.
    // Note that this is where the socket client and socket server differ.
    // A socket client will specify the server address to connect to.
    socketInfo.sin_addr.s_addr = htonl(INADDR_ANY); // Translate long integer to network byte order.
    socketInfo.sin_port = htons(portNumber);        // Set port number

    // Bind the socket to a local socket address

    if( bind(socketHandle, (struct sockaddr *) &socketInfo, sizeof(struct sockaddr_in)) < 0)
    {
       perror("bind");          // Report the error before close() can change errno
       close(socketHandle);
       exit(EXIT_FAILURE);
    }

    listen(socketHandle, 1);

    int socketConnection;
    if( (socketConnection = accept(socketHandle, NULL, NULL)) < 0)
    {
       close(socketHandle);
       exit(EXIT_FAILURE);
    }
    ...
    ...
    // read/write to socket here

Socket functions:

bind():

Function prototype:

    int bind(int sockfd, struct sockaddr *my_addr, socklen_t addrlen);

Bind arguments:

   o int sockfd: Socket file descriptor. Returned by call to "socket".
   o struct sockaddr: Socket information structure
   o socklen_t addrlen: Size of structure
   o Returns 0: Success, -1: Failure and errno is set.

Also see the bind man page

listen():

Function prototype:

    int listen(int s, int backlog);

Listen arguments:

   o int s: Socket file descriptor. Returned by call to "socket". Identifies a bound but unconnected socket.
   o int backlog: Set maximum length of the queue of pending connections for the listening socket. A reasonable value is 10. Actual maximum permissible: SOMAXCONN
     Example: int iret = listen(socketHandle, SOMAXCONN);
     The include file sys/socket.h will include /usr/include/bits/socket.h which defines the default value for SOMAXCONN as 128.
     The actual value set for the operating system: cat /proc/sys/net/core/somaxconn
     In kernels before 2.4.25, this limit was a hard coded value and thus would require a kernel recompile with the SOMAXCONN value as defined in /usr/include/linux/socket.h
     For very heavy server use, modify the system limit in the proc file and set "backlog" to the same value (e.g. 512).
   o Returns 0: Success, -1: Failure and errno is set.

Also see the listen man page

accept():

Function prototype:

    int accept(int s, struct sockaddr *addr, socklen_t *addrlen);

Accept arguments:

   o int s: Socket file descriptor. Returned by call to "socket".
   o struct sockaddr *addr: Pointer to a sockaddr structure. This structure is filled in with the address of the connecting entity.
   o socklen_t *addrlen: initially contains the size of the structure pointed to by addr; on return it will contain the actual length (in bytes) of the address returned. When addr is NULL nothing is filled in.
   o Returns:
     Success: non-negative integer, which is a descriptor of the accepted socket. Argument "addrlen" will have a return value.
     Fail: -1, and errno is set.

Also see the accept man page


[Potential Pitfall]: If you get the following message:

    bind: Address already in use

The solution is to choose a different port or kill the process which is using the port and creating the conflict. You may have to be root to see all processes with netstat.

    netstat -punta | grep 8080
    tcp 0 0 :::8080 :::* LISTEN

Socket client:

connect(): initiate a connection with a remote entity on a socket. Zero is returned on success. Supports both TCP (SOCK_STREAM) and UDP (SOCK_DGRAM). For SOCK_STREAM, an actual connection is made. For SOCK_DGRAM, the address is the address to which datagrams are sent and from which they are received.

    ...
    ...
    struct sockaddr_in remoteSocketInfo;
    struct hostent *hPtr;
    int socketHandle;
    const char *remoteHost="dev.megacorp.com";
    int portNumber = 8080;

    bzero(&remoteSocketInfo, sizeof(sockaddr_in));  // Clear structure memory

    // The socket must exist before connect() can use it
    if((socketHandle = socket(AF_INET, SOCK_STREAM, IPPROTO_IP)) < 0)
    {
       exit(EXIT_FAILURE);
    }

    // Get system information

    if((hPtr = gethostbyname(remoteHost)) == NULL)
    {
       cerr << "System DNS name resolution not configured properly." << endl;
       cerr << "Error number: " << h_errno << endl;   // gethostbyname() reports errors in h_errno
       exit(EXIT_FAILURE);
    }

    // Load system information for remote socket server into socket data structures

    memcpy((char *)&remoteSocketInfo.sin_addr, hPtr->h_addr, hPtr->h_length);
    remoteSocketInfo.sin_family = AF_INET;
    remoteSocketInfo.sin_port = htons((u_short)portNumber);  // Set port number

    if( connect(socketHandle, (struct sockaddr *)&remoteSocketInfo, sizeof(sockaddr_in)) < 0)
    {
       close(socketHandle);
       exit(EXIT_FAILURE);
    }
    ...
    ...

Connect function prototype:

    int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);

Connect arguments: (Same as server's bind() arguments)

   o int sockfd: Socket file descriptor. Returned by call to "socket".
   o struct sockaddr: Socket information structure
   o socklen_t addrlen: Size of structure
   o Returns 0: Success, -1: Failure and errno is set appropriately.

Also see the connect man page
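To see the server-side and client-side call sequences working together, the sketch below runs both ends inside one process over the loopback interface. It is an illustration under several assumptions: the function name loopbackRoundTrip is hypothetical, the message is assumed to be non-empty and at most 256 bytes, and the listening socket is bound to port 0 so that the system picks a free port, which getsockname() then reports to the client side.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <cstring>
#include <string>

// Hypothetical helper: send message from a client socket to a server socket
// on 127.0.0.1 and return what the server side received.
// Returns an empty string on any failure.
std::string loopbackRoundTrip(const std::string &message)
{
    std::string received;
    if (message.empty() || message.size() > 256)
        return received;

    int listener = socket(AF_INET, SOCK_STREAM, 0);   // server: socket()
    int client   = socket(AF_INET, SOCK_STREAM, 0);   // client: socket()
    int conn     = -1;

    struct sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);    // 127.0.0.1
    addr.sin_port        = htons(0);                  // let the system choose a port
    socklen_t len = sizeof(addr);

    if (listener >= 0 && client >= 0 &&
        bind(listener, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&   // server: bind()
        listen(listener, 1) == 0 &&                                      // server: listen()
        getsockname(listener, (struct sockaddr *)&addr, &len) == 0 &&    // learn the chosen port
        connect(client, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&  // client: connect()
        (conn = accept(listener, NULL, NULL)) >= 0 &&                    // server: accept()
        send(client, message.data(), message.size(), 0) == (ssize_t)message.size())
    {
        char buf[256];
        ssize_t n = recv(conn, buf, sizeof(buf), 0);                     // server: recv()
        if (n > 0)
            received.assign(buf, n);
    }

    if (conn >= 0) close(conn);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return received;
}
```

The client's connect() can complete before accept() is called because the pending connection waits in the listen queue. A real server and client would normally be separate processes, and a single recv() is only adequate here because the message is small.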

The sockaddr_in data structure: /usr/include/linux/in.h

    /* Internet address. */
    struct in_addr {
        __u32 s_addr;   /* 32-bit IPv4 address in network byte order */
    };

    /* Structure describing an Internet (IP) socket address. */
    #define __SOCK_SIZE__ 16              /* sizeof(struct sockaddr) */
    struct sockaddr_in {
        sa_family_t        sin_family;    /* Address family   */
        unsigned short int sin_port;      /* Port number      */
        struct in_addr     sin_addr;      /* Internet address */

        /* Pad to size of `struct sockaddr'. */
        unsigned char      __pad[__SOCK_SIZE__ - sizeof(short int) -
                                 sizeof(unsigned short int) - sizeof(struct in_addr)];
    };

Note:

IP addresses: Note the specific IP address could be specified:

    #include <arpa/inet.h>  // IP from string conversion

    socketInfo.sin_addr.s_addr = inet_addr("127.0.0.1");
    cout << inet_ntoa(socketInfo.sin_addr) << endl;

or bind to all network interfaces available:

    socketInfo.sin_addr.s_addr = htonl(INADDR_ANY);

Port numbers: Note the specific port can be specified:

    socketInfo.sin_port = htons(8080);
    cout << ntohs(socketInfo.sin_port) << endl;

or let the system define one for you by requesting port 0:

    socketInfo.sin_port = htons(0);
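When the system chooses the port, the program still needs a way to find out which one it got. A minimal sketch of one way to do that, assuming a TCP socket bound to the loopback address, is to call getsockname() after bind(). The function name bindToSystemChosenPort is hypothetical.

```cpp
#include <arpa/inet.h>    // htons(), ntohs(), htonl()
#include <netinet/in.h>   // struct sockaddr_in, INADDR_LOOPBACK
#include <sys/socket.h>   // socket(), bind(), getsockname()
#include <sys/types.h>
#include <unistd.h>       // close()
#include <cstring>        // memset()

// Hypothetical helper: bind a TCP socket to 127.0.0.1 with port 0 and
// return the port number the system assigned, in host byte order.
// Returns 0 on failure.
unsigned short bindToSystemChosenPort()
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 0;

    struct sockaddr_in socketInfo;
    std::memset(&socketInfo, 0, sizeof(socketInfo));
    socketInfo.sin_family      = AF_INET;
    socketInfo.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    socketInfo.sin_port        = htons(0);    // 0 asks the system to pick a port

    unsigned short port = 0;
    socklen_t len = sizeof(socketInfo);
    if (bind(fd, (struct sockaddr *)&socketInfo, sizeof(socketInfo)) == 0 &&
        getsockname(fd, (struct sockaddr *)&socketInfo, &len) == 0)
    {
        port = ntohs(socketInfo.sin_port);    // convert back to host byte order
    }

    close(fd);
    return port;
}
```

The socket is closed before returning, so the reported port is only a demonstration value; a server would keep the descriptor open and go on to call listen().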

Read from or write to socket: Use send() and recv(), or write() and read(), or sendto() and recvfrom() to read/write to/from a socket.

TCP recv() or UDP recvfrom(): receive a message from a socket

    char *pcIpAddress;
    unsigned short shPort;
    ...
    if (iSocketType == SOCK_STREAM)
    {
        rc = recv(socketHandle, (char*) _pcMessage, (int) _iMessageLength, 0);
        if ( rc == 0 )
        {
            cerr << "ERROR! Socket closed" << endl;
        }
        else if (rc == -1)
        {
            cerr << "ERROR! Socket error" << endl;
            closeSocket();
        }
    }
    else if (iSocketType == SOCK_DGRAM)
    {
        socklen_t iLength;
        struct sockaddr_in stReceiveAddr;

        iLength = sizeof(struct sockaddr_in);
        memset((void*) &stReceiveAddr, 0, iLength);

        rc = recvfrom(socketHandle, (char*) _pcMessage, (int) _iMessageLength, 0,
                      (struct sockaddr *) &stReceiveAddr, &iLength);
        if ( rc == 0 )
        {
            cerr << "ERROR! Socket closed" << endl;
        }
        else if (rc == -1)
        {
            cerr << "ERROR! Socket error" << endl;
            closeSocket();
        }

        // The sender's address is only available inside this block
        pcIpAddress = inet_ntoa(stReceiveAddr.sin_addr);
        shPort = ntohs(stReceiveAddr.sin_port);

        cout << "Socket Received: " << rc << " bytes from " << pcIpAddress << ":" << shPort << endl;
    }
    ...

read(): read a specific number of bytes from a file descriptor

    int rc = 0;           // Actual number of bytes read by function read()
    int count = 0;        // Running total "count" of bytes read
    int numToRead = 32;   // Total number of bytes we want to read
    char buf[512];
    char *pBuf = buf;     // Position in buf for the next read
    ...
    while(count < numToRead)
    {
        // rc is the number of bytes returned.
        if( (rc = read(socketHandle, pBuf, numToRead - count)) > 0)
        {
            count += rc;
            pBuf += rc;   // Set buffer pointer for next read
        }
        else if(rc < 0)
        {
            close(socketHandle);
            exit(EXIT_FAILURE);
        }
        else              // rc == 0: the remote side closed the connection
        {
            break;
        }
    }
    buf[count] = '\0';    // Null terminate before printing as a string
    cout << "Number of bytes read: " << count << endl;
    cout << "Received: " << buf << endl;
    ...

send(): send a message from a socket. Used only when in a connected state. The only difference between send() and write() is the presence of flags. With a zero flags parameter, send() is equivalent to write().

    #include <string.h>
    ...
    char buf[512];
    strcpy(buf,"Message to send");
    ...
    send(socketHandle, buf, strlen(buf)+1, 0);
    ...
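On a stream socket, send() may accept fewer bytes than requested, so its return value should be checked. The following is a minimal sketch of a loop that keeps sending until the whole buffer has been written. The helper name sendAll() is hypothetical and the retry on EINTR is an assumption about the desired behavior.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <cerrno>
#include <cstddef>

// Hypothetical helper: keep calling send() until all "length" bytes are
// written. Returns true on success, false on error (errno is left set).
bool sendAll(int socketHandle, const char *data, size_t length)
{
    size_t sent = 0;
    while (sent < length)
    {
        ssize_t rc = send(socketHandle, data + sent, length - sent, 0);
        if (rc < 0)
        {
            if (errno == EINTR)
                continue;               // Interrupted by a signal: retry
            return false;
        }
        sent += (size_t)rc;             // Advance past the bytes accepted
    }
    return true;
}
```

Usage would look like sendAll(socketHandle, buf, strlen(buf)+1) in place of the single send() call.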

TCP send() or UDP sendto():

    int iSocketType;
    int iBytesSent = 0;
    const char *pMessage = "message to send";
    int iMessageLength = 16;   // number of bytes (includes NULL termination)
    struct sockaddr_in pSendAddress;
    ...
    if (iSocketType == SOCK_STREAM)
    {
        if ((iBytesSent = send(socketHandle, (char*) pMessage, (int) iMessageLength, 0)) < 0 )
        {
            cerr << "Send failed with error " << errno << endl;
            close(socketHandle);
        }
    }
    else if (iSocketType == SOCK_DGRAM)
    {
        if ((iBytesSent = sendto(socketHandle, (char*) pMessage, (int) iMessageLength, 0,
                                 (struct sockaddr*) &pSendAddress,
                                 (int) sizeof(struct sockaddr_in))) < 0 )
        {
            cerr << "Sendto failed with error " << errno << endl;
            close(socketHandle);
        }
    }
    else
    {
        // Failed - Socket type not defined
    }
    ...

Close the socket when done:

    #include <unistd.h>
    ...
    close(socketHandle);
    ...

This is the "C" library function to close a file descriptor. Returns zero on success, or -1 if an error occurred.

Socket Server:
Simple Socket Server: File: server.cpp

    #include <iostream>
    #include <cstdio>
    #include <cstdlib>
    #include <strings.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>
    #define MAXHOSTNAME 256
    using namespace std;

    int main()
    {
        struct sockaddr_in socketInfo;
        char sysHost[MAXHOSTNAME+1];  // Hostname of this computer we are running on
        struct hostent *hPtr;
        int socketHandle;
        int portNumber = 8080;

        bzero(&socketInfo, sizeof(sockaddr_in));  // Clear structure memory

        // Get system information
        gethostname(sysHost, MAXHOSTNAME);  // Get the name of this computer we are running on
        if((hPtr = gethostbyname(sysHost)) == NULL)
        {
            cerr << "System hostname misconfigured." << endl;
            exit(EXIT_FAILURE);
        }

        // create socket
        if((socketHandle = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        {
            perror("socket");
            exit(EXIT_FAILURE);
        }

        // Load system information into socket data structures
        socketInfo.sin_family = AF_INET;
        socketInfo.sin_addr.s_addr = htonl(INADDR_ANY);  // Use any address available to the system
        socketInfo.sin_port = htons(portNumber);         // Set port number

        // Bind the socket to a local socket address
        if( bind(socketHandle, (struct sockaddr *) &socketInfo, sizeof(socketInfo)) < 0)
        {
            close(socketHandle);
            perror("bind");
            exit(EXIT_FAILURE);
        }

        listen(socketHandle, 1);

        int socketConnection;
        if( (socketConnection = accept(socketHandle, NULL, NULL)) < 0)
        {
            exit(EXIT_FAILURE);
        }
        close(socketHandle);

        int rc = 0;      // Actual number of bytes read
        char buf[512];

        // rc is the number of characters returned.
        // Note this is not typical. Typically one would only specify the number
        // of bytes to read a fixed header which would include the number of bytes
        // to read. See "Tips and Best Practices" below.
        rc = recv(socketConnection, buf, sizeof(buf) - 1, 0);  // Leave room for the terminator
        if(rc < 0)
        {
            perror("recv");
            exit(EXIT_FAILURE);
        }
        buf[rc] = '\0';  // Null terminate string

        cout << "Number of bytes read: " << rc << endl;
        cout << "Received: " << buf << endl;
    }

Forking Socket Server:

In order to accept connections while processing previous connections, use fork() to handle each connection. Use establish() and get_connection() to allow multiple connections.

File: serverFork.cpp

    #include <iostream>
    #include <cstdio>
    #include <cstdlib>
    #include <strings.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <netdb.h>
    #include <errno.h>
    #include <unistd.h>
    #include <signal.h>
    #include <netinet/in.h>
    #define MAXHOSTNAME 256
    using namespace std;

    // Catch signals from child processes
    void handleSig(int signum)
    {
        while(waitpid(-1, NULL, WNOHANG) > 0);
    }

    int main()
    {
        struct sockaddr_in socketInfo;
        char sysHost[MAXHOSTNAME+1];  // Hostname of this computer we are running on
        struct hostent *hPtr;
        int socketHandle;
        int portNumber = 8080;

        signal(SIGCHLD, handleSig);

        bzero(&socketInfo, sizeof(sockaddr_in));  // Clear structure memory

        // Get system information
        gethostname(sysHost, MAXHOSTNAME);  // Get the name of this computer we are running on
        if((hPtr = gethostbyname(sysHost)) == NULL)
        {
            cerr << "System hostname misconfigured." << endl;
            exit(EXIT_FAILURE);
        }

        // create socket
        if((socketHandle = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        {
            perror("socket");
            exit(EXIT_FAILURE);
        }

        // Load system information into socket data structures
        socketInfo.sin_family = AF_INET;
        socketInfo.sin_addr.s_addr = htonl(INADDR_ANY);  // Use any address available to the system
        socketInfo.sin_port = htons(portNumber);         // Set port number

        // Bind the socket to a local socket address
        if( bind(socketHandle, (struct sockaddr *) &socketInfo, sizeof(struct sockaddr_in)) < 0)
        {
            close(socketHandle);
            perror("bind");
            exit(EXIT_FAILURE);
        }

        listen(socketHandle, 1);

        int socketConnection;
        for(;;)  // infinite loop to handle remote connections. This should be limited.
        {
            if( (socketConnection = accept(socketHandle, NULL, NULL)) < 0)
            {
                if(errno == EINTR) continue;  // Interrupted by SIGCHLD: keep listening
                perror("accept");
                close(socketHandle);
                exit(EXIT_FAILURE);
            }
            switch(fork())
            {
                case -1:
                    perror("fork");
                    close(socketHandle);
                    close(socketConnection);
                    exit(EXIT_FAILURE);
                case 0:   // Child process - do stuff
                    close(socketHandle);
                    // Do your server stuff like read/write messages to the socket here!
                    exit(0);
                default:  // Parent process, look for another connection
                    close(socketConnection);
                    continue;
            }
        }
    }

For more on the use of the fork() function see the YoLinux.com fork() tutorial.

Socket Client:
Simple Socket client: File: client.cpp

    #include <iostream>
    #include <cstdlib>
    #include <string.h>
    #include <strings.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netdb.h>
    #include <unistd.h>
    #include <errno.h>
    #define MAXHOSTNAME 256
    using namespace std;

    int main()
    {
        struct sockaddr_in remoteSocketInfo;
        struct hostent *hPtr;
        int socketHandle;
        const char *remoteHost = "localhost";
        int portNumber = 8080;

        bzero(&remoteSocketInfo, sizeof(sockaddr_in));  // Clear structure memory

        // Get system information
        if((hPtr = gethostbyname(remoteHost)) == NULL)
        {
            cerr << "System DNS name resolution not configured properly." << endl;
            cerr << "Error number: " << ECONNREFUSED << endl;
            exit(EXIT_FAILURE);
        }

        // create socket
        if((socketHandle = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        {
            exit(EXIT_FAILURE);
        }

        // Load system information into socket data structures
        memcpy((char *)&remoteSocketInfo.sin_addr, hPtr->h_addr, hPtr->h_length);
        remoteSocketInfo.sin_family = AF_INET;
        remoteSocketInfo.sin_port = htons((u_short)portNumber);  // Set port number

        if(connect(socketHandle, (struct sockaddr *)&remoteSocketInfo, sizeof(sockaddr_in)) < 0)
        {
            close(socketHandle);
            exit(EXIT_FAILURE);
        }

        char buf[512];

        strcpy(buf,"Message to send");
        send(socketHandle, buf, strlen(buf)+1, 0);
    }

Test Simple Client and Server Socket program:


Note that this runs on a single system using "localhost".

Compile the simple client and the simple server:

    g++ server.cpp -o server
    g++ client.cpp -o client

Start the server: ./server
This will block at the "accept()" call and await a connection from a client.

Start the client: ./client
This will connect to the server and write the message "Message to send". The server will receive the message and write it out.

Name Resolution and Network Information Lookup:

Network information data lookup for:

    DNS: Name resolution associates IP address to a hostname using DNS (Domain Name System) name servers. Name resolution invokes a series of library routines to query the name servers.
    TCP/IP ports and associated services
    Network Protocols
    Network name information

Function and Description:

    gethostname(char *name, size_t len)
        Returns hostname of local host.
    gethostbyname(const char *name)
        Returns a structure of type hostent for the given host name.
    getservbyport(int port, const char *proto)
        Returns a servent structure for the line that matches the port "port", given in network byte order, using protocol proto. If proto is NULL, any protocol will be matched.
    getservbyname(const char *name, const char *proto)
        Returns a structure of type servent for the given service name and protocol.
    getservent(void)
        Returns a structure servent containing the broken out fields from the line in /etc/services.
    getprotobyname(const char *name)
        Returns a structure protoent containing the broken out fields from the line in /etc/protocols.
    getprotobynumber(int proto)
        Returns a protoent structure for the line that matches the protocol number.
    getprotoent(void)
        Returns a structure protoent containing the broken out fields from the line in /etc/protocols.
    getnetbyname(const char *name)
        Returns a structure netent containing the broken out fields from the line in /etc/networks.
    getnetbyaddr(long net, int type)
        Returns a netent structure for the line that matches the network number net of type "type".
    getaddrinfo(const char *node, const char *service, const struct addrinfo *hints, struct addrinfo **res)
        Network address and service translation. Returns 0 if it succeeds or an error code.
    void freeaddrinfo(struct addrinfo *res)
        Frees the memory that was allocated by getaddrinfo().
    getnetent(void)
        Returns a structure netent containing the broken out fields from the line in /etc/networks.

Data Structures returned:

hostent defined in <netdb.h>

    struct hostent {
        char  *h_name;       /* official name of host */
        char **h_aliases;    /* alias list */
        int    h_addrtype;   /* host address type */
        int    h_length;     /* length of address */
        char **h_addr_list;  /* list of addresses */
    };

Lookup of data in /etc/hosts or from DNS resolution.

servent defined in <netdb.h>

    struct servent {
        char  *s_name;     /* official service name */
        char **s_aliases;  /* alias list */
        int    s_port;     /* port number */
        char  *s_proto;    /* protocol to use */
    };

Lookup of data in /etc/services

protoent defined in <netdb.h>

    struct protoent {
        char  *p_name;     /* official protocol name */
        char **p_aliases;  /* alias list */
        int    p_proto;    /* protocol number */
    };

Lookup of data in /etc/protocols

netent defined in <netdb.h>

    struct netent {
        char  *n_name;            /* official network name */
        char **n_aliases;         /* alias list */
        int    n_addrtype;        /* net address type */
        unsigned long int n_net;  /* network number */
    };

Lookup of data in /etc/networks
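Of the lookup functions in this section, getaddrinfo() returns a linked list that must be released with freeaddrinfo(). The following is a minimal sketch of resolving a host and service into a sockaddr_in suitable for connect(). The helper name resolveIPv4() is hypothetical, and restricting the lookup to IPv4 and TCP is an assumption made so the result matches the sockaddr_in structure used elsewhere in this tutorial.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <cstring>

// Hypothetical helper: resolve host/service to the first IPv4 TCP address.
// Returns true and fills "out" on success.
bool resolveIPv4(const char *host, const char *service, sockaddr_in &out)
{
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;          // IPv4 only, to match sockaddr_in
    hints.ai_socktype = SOCK_STREAM;    // TCP

    struct addrinfo *res = NULL;
    if (getaddrinfo(host, service, &hints, &res) != 0)  // 0 means success
        return false;

    bool found = false;
    if (res != NULL && res->ai_addrlen == sizeof(out))
    {
        memcpy(&out, res->ai_addr, sizeof(out));  // Copy the first result
        found = true;
    }
    freeaddrinfo(res);                  // Release the list from getaddrinfo()
    return found;
}
```

For example, resolveIPv4("localhost", "8080", remoteSocketInfo) fills in the address family, address and port (already in network byte order) in one call.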

Socket Configuration Options:

Socket options: One can "get" (read) the current socket options or "set" them to new values. The default values are obtained from the OS:

    Level: IPPROTO_TCP   Option: TCP_NODELAY   Type: int   Default: 0
        Don't delay send to coalesce packets. If set, disable the Nagle algorithm. When not set, data is buffered until there is a sufficient amount to send out, thereby avoiding the frequent sending of small packets, which results in poor utilization of the network. Don't use with TCP_CORK. This option is overridden by TCP_CORK.

    Level: IPPROTO_TCP   Option: TCP_MAXSEG   Type: int   Default: 536
        Maximum segment size for outgoing TCP packets. TCP will impose its minimum and maximum bounds over the value provided.

    Level: IPPROTO_TCP   Option: TCP_CORK   Type: int   Default: 0
        Control sending of partial frames. If set, don't send out partial frames. Not cross platform.

    Level: IPPROTO_TCP   Option: TCP_KEEPIDLE   Type: int   Default: 7200
        When the SO_KEEPALIVE option is enabled, TCP probes a connection that has been idle for some amount of time. The default value for this idle period is 2 hours. The TCP_KEEPIDLE option can be used to affect this value for a given socket, and specifies the number of seconds of idle time between keepalive probes. Not cross platform. This option takes an int value, with a range of 1 to 32767.

    Level: IPPROTO_TCP   Option: TCP_KEEPINTVL   Type: int   Default: 75
        Specifies the interval between packets that are sent to validate the connection. Not cross platform.

    Level: IPPROTO_TCP   Option: TCP_KEEPCNT   Type: int   Default: 9
        When the SO_KEEPALIVE option is enabled, TCP probes a connection that has been idle for some amount of time. If the remote system does not respond to a keepalive probe, TCP retransmits the probe a certain number of times before a connection is considered to be broken. The TCP_KEEPCNT option can be used to affect this value for a given socket, and specifies the maximum number of keepalive probes to be sent. This option takes an int value, with a range of 1 to 32767. Not cross platform.

    Level: IPPROTO_TCP   Option: TCP_SYNCNT   Type: int   Default: 5
        Number of SYN retransmits that TCP should send before aborting the attempt to connect. It cannot exceed 255.

    Level: IPPROTO_TCP   Option: TCP_LINGER2   Type: int   Default: 60
        Life time of orphaned FIN-WAIT-2 state. Not to be confused with option SO_LINGER. Not cross platform.

    Level: SOL_SOCKET   Option: SO_REUSEADDR   Type: int (bool)
        Allow local address reuse. Use this option if a problem is encountered when attempting to bind to a port which has been closed but not released (may take up to 2 minutes as defined by TIME_WAIT). Apply the SO_REUSEADDR socket option to get around the TIME_WAIT state. 0 = disables, 1 = enables

    Level: SOL_SOCKET   Option: SO_REUSEPORT   Type: int (bool)
        This option is AF_INET socket-specific. This option allows multiple processes to share a port. All incoming multicast or broadcast UDP datagrams that are destined for the port are delivered to all sockets that are bound to the port. All processes that share the port must specify this option. 0 = disables, 1 = enables

    Level: SOL_SOCKET   Option: SO_ERROR   Type: int (bool)   Default: 0
        When an error occurs on a socket, set error variable so_error and notify process. 0 = disables, 1 = enables

    Level: SOL_SOCKET   Option: SO_BROADCAST   Type: int (bool)
        Permit sending of broadcast datagrams. 0 = disables, 1 = enables

    Level: SOL_SOCKET   Option: SO_SNDBUF   Type: int (value)   Default: 16384
        Send buffer size.

    Level: SOL_SOCKET   Option: SO_RCVBUF   Type: int (value)   Default: 87380
        Receive buffer size.

    Level: SOL_SOCKET   Option: SO_KEEPALIVE   Type: int (bool)   Default: 0
        Periodically test if connection is alive. 0 = disables, 1 = enables

    Level: SOL_SOCKET   Option: SO_SNDTIMEO   Type: timeval (struct)   Default: 0, 0
        Set timeout period for socket send. Disable by setting timeval.tv_sec = 0 sec, timeval.tv_usec = 0 usec (default). Affects write() writev() send() sendto() and sendmsg().

    Level: SOL_SOCKET   Option: SO_RCVTIMEO   Type: timeval (struct)   Default: 0, 0
        Set timeout period for socket receive. Disable by setting timeval.tv_sec = 0 sec, timeval.tv_usec = 0 usec (default). Affects read() readv() recv() recvfrom() and recvmsg().

    Level: SOL_SOCKET   Option: SO_LINGER   Type: linger (struct)   Default: 0, 0
        Specifies how the close function will operate for connection protocols (TCP).
        l_onoff: 0 = disables, 1 = enables
        l_linger: timeout in seconds. 0 = unsent data discarded, nonzero = close() does not return until all unsent data is transmitted, the remote connection is closed, or the timeout expires.
        Structure defined in sys/socket.h

    Level: SOL_SOCKET   Option: SO_RCVLOWAT   Type: int (value)   Default: 1
        Specifies number of bytes used as a threshold by select() to consider a socket read ready.

    Level: SOL_SOCKET   Option: SO_SNDLOWAT   Type: int (value)   Default: 1
        Specifies number of bytes used as a threshold by select() to consider a socket write ready.

    Level: SOL_SOCKET   Option: SO_TYPE   Type: int (value)   Default: undefined
        Specifies socket type (e.g., tcp (SOCK_STREAM), udp (SOCK_DGRAM), etc.) For use with getsockopt() only.

The TCP_* option macros (level IPPROTO_TCP) are found in /usr/include/netinet/tcp.h
SOL_SOCKET macro defines require /usr/include/sys/socket.h
For a full list of TCP options see the TCP man page
For a full list of IP options see the IP(7) man page

Function Prototypes:

    int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
    int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);

getsockopt/setsockopt arguments:

    int s: Socket file descriptor. Returned by call to "socket".
    int level: See table above
    int optname: See table above
    void *optval: Pointer to value or data structure
    optlen: Length of "optval"

Returns 0: Success, -1: Failure and errno may be set.

Code to read socket options: File: printSocketOptions.c

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int main()
    {
        int socketHandle;

        // create socket
        if((socketHandle = socket(AF_INET, SOCK_STREAM, IPPROTO_IP)) < 0)
        {
            perror("socket");
            return 1;
        }

        int iSocketOption = 0;
        socklen_t iSocketOptionLen = sizeof(int);

        struct linger SocketOptionLinger;
        socklen_t iSocketOptionLingerLen = sizeof(struct linger);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_NODELAY, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_NODELAY = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_MAXSEG, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_MAXSEG = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_CORK, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_CORK = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_KEEPIDLE, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_KEEPIDLE = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_KEEPINTVL, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_KEEPINTVL = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_KEEPCNT, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_KEEPCNT = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_SYNCNT, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_SYNCNT = %d\n", iSocketOption);

        getsockopt(socketHandle, IPPROTO_TCP, TCP_LINGER2, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket TCP_LINGER2 = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_REUSEADDR, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_REUSEADDR = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_ERROR, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_ERROR = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_BROADCAST, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_BROADCAST = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_KEEPALIVE, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_KEEPALIVE = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_SNDBUF, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_SNDBUF = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_RCVBUF, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_RCVBUF = %d\n", iSocketOption);

        getsockopt(socketHandle, SOL_SOCKET, SO_LINGER, (char *)&SocketOptionLinger, &iSocketOptionLingerLen);
        printf("Socket SO_LINGER = %d time = %d\n", SocketOptionLinger.l_onoff, SocketOptionLinger.l_linger);

        getsockopt(socketHandle, SOL_SOCKET, SO_RCVLOWAT, (char *)&iSocketOption, &iSocketOptionLen);
        printf("Socket SO_RCVLOWAT = %d\n", iSocketOption);

        close(socketHandle);
        return 0;
    }

Compile: gcc -o printSocketOptions printSocketOptions.c

getsockopt man page: get a particular socket option for the specified socket.

Set socket options:

Socket "keep-alive":

    int iOption = 1;  // Turn on keep-alive, 0 = disables, 1 = enables

    if (setsockopt(socketHandle, SOL_SOCKET, SO_KEEPALIVE, (const char *) &iOption, sizeof(int)) == -1)
    {
        cerr << "Set keepalive: Keepalive option failed" << endl;
    }
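The keep-alive timing can also be adjusted per socket with the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options described in the table earlier. The following is a minimal sketch, assuming a Linux system where these options are available. The helper name enableKeepAlive() and its parameters are hypothetical.

```cpp
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

// Hypothetical helper: enable keep-alive on a TCP socket and tune its timing.
// idleSec:     idle seconds before the first probe   (TCP_KEEPIDLE)
// intervalSec: seconds between probes                (TCP_KEEPINTVL)
// probeCount:  unanswered probes before giving up    (TCP_KEEPCNT)
// Returns true only if every setsockopt() call succeeds.
bool enableKeepAlive(int socketHandle, int idleSec, int intervalSec, int probeCount)
{
    int on = 1;
    return setsockopt(socketHandle, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0
        && setsockopt(socketHandle, IPPROTO_TCP, TCP_KEEPIDLE, &idleSec, sizeof(idleSec)) == 0
        && setsockopt(socketHandle, IPPROTO_TCP, TCP_KEEPINTVL, &intervalSec, sizeof(intervalSec)) == 0
        && setsockopt(socketHandle, IPPROTO_TCP, TCP_KEEPCNT, &probeCount, sizeof(probeCount)) == 0;
}
```

With enableKeepAlive(socketHandle, 60, 10, 5), a dead peer would be detected after roughly 60 + 10 * 5 seconds of silence instead of the two hour default.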

Set socket client options:

Socket re-use:

    int iOption = 1;  // Reuse address option to set, 0 = disables, 1 = enables

    if (setsockopt(socketHandle, SOL_SOCKET, SO_REUSEADDR, (const char *) &iOption, sizeof(int)) == -1)
    {
        cerr << "Set reuse address: Client set reuse address option failed" << endl;
    }

When a socket connection is closed with a call to close(), shutdown() or exit(), both the client and server will send a FIN (final) packet and will then send an acknowledgment (ACK) that they received the packet. The side which initiates the closure will be in a TIME_WAIT state until the process has been completed. This time out period is generally 2-4 minutes in duration. It is hoped that all packets are received in a timely manner and the entire time out duration is not required. When an application is abnormally terminated, the TIME_WAIT period is entered for the full duration.

Setting the SO_REUSEADDR option explicitly allows a process to bind a port in the TIME_WAIT state. This is to avoid the error "bind: Address Already in Use". One caveat is that the new connection cannot be to the same address and port as the previous connection. If it is, the SO_REUSEADDR option will not help and the duration of the TIME_WAIT will be in effect. For more info see How to avoid the "Address Already in Use" error.

Solution: Enable socket linger:

    linger Option;
    Option.l_onoff = 1;
    Option.l_linger = 0;

    if(setsockopt(socketHandle, SOL_SOCKET, SO_LINGER, (const char *) &Option, sizeof(linger)) == -1)
    {
        cerr << "Set SO_LINGER option failed" << endl;
    }

This allows the socket to die quickly and allows the address to be reused again. Warning: This linger configuration may result in data loss upon socket termination, thus it would not have the robustness required for a banking transaction but would be ok for a recreational app.

Broadcast:

    int iOption = 1;  // Broadcast option to set, 0 = disables, 1 = enables

    if (setsockopt(socketHandle, SOL_SOCKET, SO_BROADCAST, (const char *) &iOption, sizeof(int)) == -1)
    {
        cerr << "Set broadcast: Client set broadcast option failed" << endl;
    }

Struct: remoteSocketInfo.sin_addr.s_addr = htonl(INADDR_BROADCAST);

setsockopt man page: set a particular socket option for the specified socket.
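The SO_RCVTIMEO option from the table above takes a struct timeval rather than an int. The following is a minimal sketch of setting a receive timeout and recognizing a timed-out recv(). The helper names setReceiveTimeout() and receiveTimedOut() are hypothetical.

```cpp
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <cerrno>

// Hypothetical helper: set the receive timeout, given in milliseconds.
// Returns true if setsockopt() succeeds.
bool setReceiveTimeout(int socketHandle, long milliseconds)
{
    struct timeval tv;
    tv.tv_sec = milliseconds / 1000;            // Whole seconds
    tv.tv_usec = (milliseconds % 1000) * 1000;  // Remainder in microseconds
    return setsockopt(socketHandle, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

// Hypothetical helper: call right after recv() returned -1 to tell a
// timeout apart from other errors.
bool receiveTimedOut()
{
    return errno == EAGAIN || errno == EWOULDBLOCK;
}
```

After setReceiveTimeout(socketHandle, 100), a recv() on a socket with no incoming data returns -1 after about 100 milliseconds instead of blocking indefinitely.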

Test Socket Availability:


Function to test if a socket or set of sockets has data and can be read (test so you don't get blocked on a read) or written.

    #include <sys/select.h>
    #include <sys/time.h>
    ...
    bool isReadyToRead(int _socketHandle, const long &_lWaitTimeMicroseconds)
    {
        int iSelectReturn = 0;    // Number of sockets meeting the criteria given to select()
        timeval timeToWait;
        int fd_max = -1;          // Max socket descriptor to limit search plus one.
        fd_set readSetOfSockets;  // Bitset representing the socket we want to read.
                                  // Each bit reflects a socket descriptor based on its
                                  // bit position (descriptors 0 to FD_SETSIZE - 1).

        // tv_usec must stay below one million, so split off whole seconds
        timeToWait.tv_sec = _lWaitTimeMicroseconds / 1000000;
        timeToWait.tv_usec = _lWaitTimeMicroseconds % 1000000;

        FD_ZERO(&readSetOfSockets);
        FD_SET(_socketHandle, &readSetOfSockets);

        if(_socketHandle > fd_max)
        {
            fd_max = _socketHandle;
        }

        iSelectReturn = select(fd_max + 1, &readSetOfSockets, (fd_set*) 0, (fd_set*) 0, &timeToWait);

        // iSelectReturn -1: ERROR, 0: no data, >0: Number of descriptors found which pass test given to select()
        if ( iSelectReturn == 0 )  // Not ready to read. No valid descriptors
        {
            return false;
        }
        else if ( iSelectReturn < 0 )  // Handle error
        {
            cerr << "*** Failed with error " << errno << " ***" << endl;
            close(_socketHandle);
            return false;
        }

        // Got here because iSelectReturn > 0 thus data available on at least one descriptor
        // Is our socket in the return list of readable sockets
        if ( FD_ISSET(_socketHandle, &readSetOfSockets) )
        {
            return true;
        }
        return false;
    }

Arguments to select():

1. int fd_max: Highest socket descriptor number. When opening a socket with the call socket(), the function returns a socket descriptor number. The call to select() will loop through all of the socket descriptors from zero up to fd_max to perform the "test".

2. fd_set *readSetOfSockets: This is a pointer to the variable holding the set of bits representing the set of sockets to test for readability. (Read will not block) The default number of bytes detected for the socket to be considered ready to read is 1. To change this default use (in this example 8 bytes):

    int nBytes = 8;
    setsockopt(socketHandle, SOL_SOCKET, SO_RCVLOWAT, (const char *) &nBytes, sizeof(int));

3. fd_set *writeSetOfSockets: This is a pointer to the variable holding the set of bits representing the set of sockets to test for writeability. (Write will not block)

4. fd_set *exceptfds: This is a pointer to the variable holding the set of bits representing the set of sockets to test for exceptions.

5. struct timeval *timeout: This structure holds the upper bound on the amount of time elapsed before select returns. It may be zero, causing select to return immediately. If timeout is NULL (no timeout), select can block indefinitely.

    struct timeval {
        long tv_sec;   /* seconds */
        long tv_usec;  /* microseconds */
    };

Note: Any of the tests (read/write/exceptions) can be set to NULL to ignore that test. Also see the select man page
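The same select() call can watch several sockets at once by adding each descriptor to the set and passing the highest descriptor plus one. The following is a minimal sketch under the assumption that every descriptor is below FD_SETSIZE. The helper name waitForReadable() and its return codes are hypothetical.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: wait up to timeoutMs for any socket in "sockets" to
// become readable. Returns the first readable descriptor, -1 on timeout,
// or -2 if select() fails.
int waitForReadable(const std::vector<int> &sockets, long timeoutMs)
{
    fd_set readSet;
    FD_ZERO(&readSet);
    int fdMax = -1;
    for (size_t i = 0; i < sockets.size(); ++i)
    {
        FD_SET(sockets[i], &readSet);       // Add each socket to the test set
        if (sockets[i] > fdMax)
            fdMax = sockets[i];
    }

    struct timeval tv;
    tv.tv_sec = timeoutMs / 1000;
    tv.tv_usec = (timeoutMs % 1000) * 1000;

    int rc = select(fdMax + 1, &readSet, NULL, NULL, &tv);
    if (rc < 0)
        return -2;                          // select() error, check errno
    if (rc == 0)
        return -1;                          // Timed out, nothing readable

    for (size_t i = 0; i < sockets.size(); ++i)
        if (FD_ISSET(sockets[i], &readSet)) // select() left only ready bits set
            return sockets[i];
    return -1;
}
```

Because select() modifies the set it is given, the set is rebuilt on every call rather than reused.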

Port Numbers:
The use of port numbers below 1024 is limited to processes running with root privileges. For a list of port numbers with established uses see the file /etc/services. Ports are 16-bit identifiers. Many are reserved and managed by the Internet Assigned Numbers Authority (IANA). See RFC 1700.

Host and Network Byte Order:


Note that when transferring data between different platforms or with Java, the byte order (endianness) will have to be considered. The network (the neutral medium) byte order is big endian, and it is the byte order to which data is usually marshalled.

Host processor byte order:

    Host Processor               Endianness
    Intel x86 processor family   Little endian
    Power PC processor family    Big endian
    SUN SPARC                    Big endian
    MIPS                         Big endian (IRIX)
    MIPS                         Little endian (NT)

Note that it is the processor architecture which determines the endianness and NOT the OS. The exception is for processors which support both big and little endian byte ordering, such as the MIPS processor.

Also note: Java processes and stores data in big endian byte ordering on any platform.

Character data is not subject to this problem as each character is one byte in length, but integer data is.

Integer data can be converted from/to host or network byte order with the following routines:

    Function   Description
    ntohl()    Network to host byte order conversion for long integer data (uint32_t)
    ntohs()    Network to host byte order conversion for short integer data (uint16_t)
    htonl()    Host to network byte order conversion for long integer data (uint32_t)
    htons()    Host to network byte order conversion for short integer data (uint16_t)

Requires #include <arpa/inet.h>

The routines are aware of the processor they are running on and thus either perform a conversion or not, depending on whether it is required.

Man pages:

    ntohl/htonl, ntohs/htons: convert values between host and network byte order

Note that one uses these calls for portable code on any platform. The port of the library to that platform determines whether a byte swap will occur and not your source code.
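As an illustration of the conversion routines, the following minimal sketch writes a 32-bit integer into a byte buffer in network byte order and reads it back. The helper names packUint32() and unpackUint32() are hypothetical. memcpy() is used instead of a pointer cast to avoid alignment problems.

```cpp
#include <arpa/inet.h>
#include <stdint.h>
#include <cstring>

// Hypothetical helper: write a 32-bit value into "out" in network byte order.
void packUint32(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value);        // Host to network (big endian)
    memcpy(out, &net, sizeof(net));
}

// Hypothetical helper: read a 32-bit network-order value back into host order.
uint32_t unpackUint32(const unsigned char in[4])
{
    uint32_t net;
    memcpy(&net, in, sizeof(net));
    return ntohl(net);                  // Network to host
}
```

On any host, packing 0x01020304 places the most significant byte (0x01) first in the buffer, which is what big endian network order requires.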

Include file/code snippet to determine the processor endianness: File: is_bigendian.h

    #ifndef __IS_BIGENDIAN__
    #define __IS_BIGENDIAN__

    const int bsti = 1;  // Byte swap test integer

    #define is_bigendian() ( (*(char*)&bsti) == 0 )

    #endif // __IS_BIGENDIAN__

Code snippet to swap bytes "in place" and convert endian:

    #include <netdb.h>
    #include "is_bigendian.h"

    /**
      In-place swapping of bytes to match endianness of hardware

      @param[in/out] *object : memory to swap in-place
      @param[in]     _size   : length in bytes
    */
    void swapbytes(void *_object, size_t _size)
    {
        unsigned char *start, *end;

        if(!is_bigendian())
        {
            for ( start = (unsigned char *)_object, end = start + _size - 1; start < end; ++start, --end )
            {
                unsigned char swap = *start;
                *start = *end;
                *end = swap;
            }
        }
    }
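An alternative to testing the host endianness and swapping in place is to build the network representation with shifts, which produces the same bytes on any host. This is a different technique from the swapbytes() routine above, shown here as a minimal sketch for 64-bit values, which htonl() does not cover. The helper names packUint64() and unpackUint64() are hypothetical.

```cpp
#include <stdint.h>

// Hypothetical helper: store a 64-bit value in network (big endian) order.
// Shifts operate on the value, not on memory, so host byte order is irrelevant.
void packUint64(uint64_t value, unsigned char out[8])
{
    for (int i = 7; i >= 0; --i)
    {
        out[i] = (unsigned char)(value & 0xFF);  // Least significant byte goes last
        value >>= 8;
    }
}

// Hypothetical helper: rebuild the 64-bit value from network order bytes.
uint64_t unpackUint64(const unsigned char in[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; ++i)
        value = (value << 8) | in[i];            // Most significant byte comes first
    return value;
}
```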

Swapping bit field structures:

    #include <netdb.h>
    #include <string.h>
    #include "is_bigendian.h"

    void swapbytes(void *_object, size_t _size);

    struct ExampleA
    {
    #ifdef BIGENDIAN
        unsigned int a:1;
        unsigned int b:1;
        unsigned int c:1;
        unsigned int d:1;
        unsigned int e:4;
        unsigned int f:8;
        unsigned int g:8;
        unsigned int h:8;
    #else
        unsigned int h:8;
        unsigned int g:8;
        unsigned int f:8;
        unsigned int e:4;
        unsigned int d:1;
        unsigned int c:1;
        unsigned int b:1;
        unsigned int a:1;
    #endif
    };
    ...
    // Bits:
    // |B31....B24 |B23....B16 |B15....B8 |B7....B0 |   Big endian
    // |B7....B0 |B15....B8 |B23....B16 |B31....B24 |   Little endian

    // Intel host to network:
    //   Reverse bit field structure and then byte swap.
    //   Just byteswap for int.

    // Code body
    struct ExampleA exampleA;
    char tmpStore[4];
    ...
    // assign member variables:
    exampleA.a = 0;
    exampleA.b = 0;
    exampleA.c = 1;
    exampleA.d = 1;
    exampleA.e = 3;
    ...
    // Use memcpy() because we can't cast a bitfield
    memcpy(&tmpStore, &exampleA, sizeof(ExampleA));
    swapbytes((void *)&tmpStore, sizeof(ExampleA));
    ...

A bit reversal routine (use as needed):

    unsigned long reverseBitsUint32l(unsigned long x)
    {
        unsigned long tmplg = 0;
        int ii = 0;

        for(ii = 0; ii < 32; ii++)
        {
            tmplg = (tmplg << 1) + (x & 1);  // Shift the result left and append the lowest bit of x
            x >>= 1;
        }
        return tmplg;
    }

Socket Tips and Best Practices:

Re-connect: If a socket client attempts to connect to a socket server and fails, one should attempt to re-attach after a given waiting period, a fixed number of times. view source print?
...
...

int iTry = 0;
int mRetrymax = 10;
int mRetrywait = 2;

// If this client can not connect to the server, try again after period "Retrywait".

while ( connect(socketHandle, (struct sockaddr *)&remoteSocketInfo, sizeof(sockaddr_in)) < 0 )
{
    iTry++;
    if( iTry > mRetrymax )
    {
        cerr << "Failed to connect to server. Exceeded maximum allowable attempts: " << mRetrymax << endl;
        break; // Done retrying!
    }
    sleep( mRetrywait );
}

...
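The retry loop above reuses one socket handle across attempts. POSIX leaves the state of a socket unspecified after a failed connect(), so a more portable variant creates a new socket for each attempt. The following is a minimal sketch, assuming IPv4 and blocking sockets; connectWithRetry and its parameters are hypothetical names, not part of the sockets API.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper (not part of the sockets API): try to connect to
// `addr`, making at most `maxAttempts` attempts and sleeping `waitSeconds`
// between them. Returns a connected descriptor, or -1 if all attempts fail.
int connectWithRetry(const sockaddr_in &addr, int maxAttempts, unsigned waitSeconds)
{
    assert(maxAttempts > 0);

    for (int attempt = 1; attempt <= maxAttempts; ++attempt)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;                  // Could not create a socket at all

        if (connect(fd, reinterpret_cast<const sockaddr *>(&addr), sizeof(addr)) == 0)
            return fd;                  // Connected

        // The socket state is unspecified after a failed connect(),
        // so discard it and start the next attempt with a new one.
        close(fd);

        if (attempt < maxAttempts)
            sleep(waitSeconds);
    }
    return -1;                          // Exceeded maximum allowable attempts
}
```

The caller owns the returned descriptor and should close() it when done. A fixed wait is used here to mirror the loop above; a growing wait between attempts is a common alternative.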

Check return codes: Just because the socket was opened doesn't mean it will stay open. Dropped connections do happen. Check all read/write socket API function return codes:

 o return code > 0: Data received/sent
 o return code == 0: Socket closed
 o return code == -1: Check system errno and interpret (or call "perror()")

For even more robust code for a socket client, close, then open a new socket connection for return codes 0 or -1.

Read entire message: (Variable and fixed length messages)

When sending messages using a socket, be aware that the other side of the connection must know how much to read. This means that you must either:

 o Send fixed-size messages, where every message has the same known number of bytes, or
 o Send a message with a header of a known size, where the message size is sent in the header. The receiver reads the fixed-size header, determines the size of the whole message, then reads the remaining bytes.

Note that with UDP client/server communications, the socket read will read the whole datagram sent. This is not necessarily true for a TCP stream. The function used to read the stream returns the number of bytes read. If the required number of bytes has not been read, repeat the read for the remaining bytes until the whole message has been read.

In this example, our message header is a single integer used to store the message size in the first 32 bits of the message.



#include <iostream>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>

using namespace std;
...
...

// Data: Message Header and Size
int rc = 0;
int messageSize = 0;
int remainingMessageSize = 0;
int messageHeaderSize = sizeof(int);
unsigned char *headerBuf = new unsigned char[messageHeaderSize];

// Read Message Header
int socketConnection;

// Prototype: ssize_t recv(int s, void *buf, size_t len, int flags);
// rc is the number of bytes returned.
rc = recv(socketConnection, headerBuf, messageHeaderSize, 0);
if ( rc == 0 )
    cerr << "Socket closed" << endl;
else if ( rc == -1 )
    cerr << "Socket error" << endl;
else if ( rc != messageHeaderSize )
    cerr << "Problem reading header" << endl;

// Read message

// Copy the header bytes into an integer, then convert from network to host byte order.
uint32_t tmp = 0;
memcpy((void *)&tmp, (void *)headerBuf, messageHeaderSize);
messageSize = ntohl(tmp);

// Assumption: the size stored in the header counts the header itself.
int messageTotalSize = messageSize;

// Create storage buffer for the message
unsigned char *messageTotalBuf = new unsigned char[messageTotalSize];

// Copy header into message buffer
memcpy(messageTotalBuf, headerBuf, messageHeaderSize);

// How much more to read
remainingMessageSize = messageTotalSize - messageHeaderSize;

// Character buffer pointer math: Put rest of message after header.
// A separate pointer is advanced so messageTotalBuf keeps pointing at the start.
unsigned char *readPtr = messageTotalBuf + messageHeaderSize;

while(remainingMessageSize > 0)
{
    rc = recv(socketConnection, readPtr, remainingMessageSize, 0);

    if ( rc == 0 )
    {
        cerr << "Socket closed" << endl;
        break;
    }
    else if ( rc == -1 )
    {
        cerr << "Socket error" << endl;
        perror("recv");
        close(socketConnection);
        break;
    }
    else if( rc != remainingMessageSize)
    {
        // Still more to read but less by what we have just read.
        remainingMessageSize = remainingMessageSize - rc;

        // Character buffer pointer math: Put rest of message after header
        // and what has already been put there by recv().
        readPtr += rc;
    }
    else
        break;
}

// Now have message in messageTotalBuf of size messageTotalSize

...
...

This example applies to either client or server TCP sockets. Another method of ensuring a complete read of a fixed size message is to use the MSG_WAITALL flag:
...
...

int flags = MSG_WAITALL;
int rc = recv(socketConnection, buffer, length, flags);

...
...

On "SOCK_STREAM" (TCP) sockets, this flag requests that the recv() function block until the full amount of data specified (length) can be returned. The function may still return a smaller amount of data if the socket is a message-based socket, if a signal is caught or if the connection is terminated.
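The reading examples above assume a sender that writes a size header followed by the message body. The following sketch shows both sides of such an exchange. It makes a few assumptions of its own: the 4-byte header carries only the payload length (not the header length) in network byte order, and sendAll, recvAll, sendMessage and recvMessage are hypothetical helper names.

```cpp
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <stdint.h>
#include <unistd.h>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: send exactly `len` bytes, repeating on short writes.
bool sendAll(int fd, const void *buf, size_t len)
{
    const char *p = static_cast<const char *>(buf);
    while (len > 0)
    {
        ssize_t rc = send(fd, p, len, 0);
        if (rc <= 0)
            return false;               // Error (or nothing could be sent)
        p += rc;
        len -= static_cast<size_t>(rc);
    }
    return true;
}

// Hypothetical helper: receive exactly `len` bytes, repeating on short reads.
bool recvAll(int fd, void *buf, size_t len)
{
    char *p = static_cast<char *>(buf);
    while (len > 0)
    {
        ssize_t rc = recv(fd, p, len, 0);
        if (rc <= 0)
            return false;               // 0: socket closed, -1: error
        p += rc;
        len -= static_cast<size_t>(rc);
    }
    return true;
}

// Assumed wire format: 4-byte payload length (network byte order), then payload.
bool sendMessage(int fd, const std::string &payload)
{
    uint32_t header = htonl(static_cast<uint32_t>(payload.size()));
    return sendAll(fd, &header, sizeof(header)) &&
           sendAll(fd, payload.data(), payload.size());
}

bool recvMessage(int fd, std::string &payload)
{
    uint32_t header = 0;
    if (!recvAll(fd, &header, sizeof(header)))
        return false;

    std::vector<char> body(ntohl(header));
    if (!body.empty() && !recvAll(fd, body.data(), body.size()))
        return false;

    payload.assign(body.begin(), body.end());
    return true;
}
```

A real receiver would normally also reject header values above some application-defined maximum before allocating the buffer.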

UDP message order: TCP guarantees that the order of message delivery is maintained. UDP does not guarantee the message delivery order, so you will have to maintain a counter in the message if the order is important.

Signals: The socket layer can issue signals which, if not handled, can terminate your application. When a socket is terminated at one end, the other may receive a SIGPIPE signal.
Program received signal SIGPIPE, Broken pipe.
0x000000395d00de45 in send () from /lib64/libpthread.so.0
(gdb) where
#0  0x000000395d00de45 in send () from /lib64/libpthread.so.0
#1  0x00000000004969c5 in bla bla bla
...
...

Note that GDB will report a received signal even if it's being ignored by the application. In this example, we tried to send() over a broken socket link. Set up a signal handler at the beginning of your program to handle SIGPIPE:
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

/// Ignore signal with this handler
void handleSigpipe(int signum)
{
    // write() is async-signal-safe; stream output such as cout is not.
    const char msg[] = "SIGPIPE ignored\n";
    write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    return;
}

int main()
{
    /// Ignore SIGPIPE "Broken pipe" signals when socket connections are broken.
    signal(SIGPIPE, handleSigpipe);

    ...
    ...
}
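An alternative to installing a handler function is to ignore SIGPIPE outright and detect the broken connection from the send() return code instead. The following is a minimal sketch under that assumption; ignoreSigpipe and sendChecked are hypothetical helper names.

```cpp
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process, so that writing
// to a broken connection fails with errno EPIPE instead of raising the signal.
// Returns true on success.
bool ignoreSigpipe()
{
    struct sigaction action = {};
    action.sa_handler = SIG_IGN;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGPIPE, &action, nullptr) == 0;
}

// Hypothetical helper: send a buffer and report whether the peer has gone away.
// Returns the send() result; `peerClosed` is set when the failure was
// EPIPE or ECONNRESET.
ssize_t sendChecked(int fd, const void *buf, size_t len, bool &peerClosed)
{
    ssize_t rc = send(fd, buf, len, 0);
    peerClosed = (rc == -1 && (errno == EPIPE || errno == ECONNRESET));
    return rc;
}
```

Call ignoreSigpipe() once at program start, then treat peerClosed as the cue to close the socket and reconnect. On Linux, passing the MSG_NOSIGNAL flag to send() suppresses the signal for that one call without changing the process-wide setting.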

Also see:

 o C++ Signal class tutorial - YoLinux.com tutorial
 o signal: Signal handling
 o sigaction: examine and change a signal action

Win32 and cross platform considerations:


The beauty of the BSD API is that it is natively supported by Microsoft Visual C++, but there are some slight differences which can be wrapped in an "ifdef".
#ifdef WIN32                 // Windows
#include "StdAfx.h"
#include <winsock2.h>        // Must come before windows.h, which otherwise pulls in the older winsock.h
#include <windows.h>
#else                        // Unix/Linux
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#endif

...
...

#ifdef WIN32
    SOCKET socketHandle;
    WSADATA oWSAData;
    memset(&oWSAData, 0, sizeof(oWSAData));
#else
    int socketHandle;
#endif

...
...

#ifdef WIN32
    int iWSAError;

    // Load the WSA structure
    if ((iWSAError = WSAStartup(MAKEWORD(2, 2), &oWSAData)) != 0)
    {
        cerr << "Creating the WSAStartup failed with error " << iWSAError << endl;
    }
#endif

...
...   // Make sockets calls here
...
...

// Manage errors here
cerr << "Failed with error "
#ifdef WIN32
     << WSAGetLastError() << endl;
WSACleanup();
#else
     << errno << endl;
#endif

#ifdef WIN32
    closesocket(socketHandle);
#else
    close(socketHandle);
#endif

Note that the declaration and error handling are different and platform dependent. The close() function call is replaced by the more specific closesocket() on the Win32 platform.

Microsoft Visual C++ settings (VC5.0):

"Configuration Properties" --> "C/C++" --> "Preprocessor"

Add: UTILSOCKETS_USE_IPC=0

Error codes: The error codes returned by the MS/Windows BSD API socket calls use different defined macro representations and may have to be handled differently.

Config files: MS/Windows has an equivalent of /etc/hosts which can be found in %SYSTEMROOT%\SYSTEM32\DRIVERS\ETC\HOSTS where the format is identical to the Unix/Linux file format.
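The error-code and close() differences noted above can be hidden behind a few small wrappers so that the rest of the program contains no "ifdef". This is only a sketch: SocketHandle, lastSocketError, closeSocket and isWouldBlock are hypothetical names, and only the Unix/Linux branch has been exercised.

```cpp
#if defined(WIN32) || defined(_WIN32)
#include <winsock2.h>
typedef SOCKET SocketHandle;
#else
#include <errno.h>
#include <unistd.h>
typedef int SocketHandle;
#endif

// Hypothetical wrapper: error code of the last failed socket call.
inline int lastSocketError()
{
#if defined(WIN32) || defined(_WIN32)
    return WSAGetLastError();
#else
    return errno;
#endif
}

// Hypothetical wrapper: close a socket. Returns 0 on success.
inline int closeSocket(SocketHandle s)
{
#if defined(WIN32) || defined(_WIN32)
    return closesocket(s);
#else
    return close(s);
#endif
}

// Hypothetical wrapper: true if the error code means "operation would block".
inline bool isWouldBlock(int err)
{
#if defined(WIN32) || defined(_WIN32)
    return err == WSAEWOULDBLOCK;
#else
    return err == EWOULDBLOCK || err == EAGAIN;
#endif
}
```

Application code then compares against wrappers such as isWouldBlock() rather than against platform-specific macros directly.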

Socket BSD API man pages:


Sockets API:

socket: establish socket interface
gethostname: obtain hostname of system
gethostbyname: returns a structure of type hostent for the given host name
bind: bind a name to a socket
listen: listen for connections on a socket
accept: accept a connection on a socket
connect: initiate a connection on a socket
setsockopt: set a particular socket option for the specified socket.
close: close a file descriptor
shutdown: shut down part of a full-duplex connection

Interrogate a socket:

select: synchronous I/O multiplexing
FD_ZERO(), FD_CLR(), FD_SET(), FD_ISSET(): Set socket bit masks
poll: check on the state of a socket in a set of sockets. The set can be tested to see if any socket can be written to, read from or if an error occurred.
getsockopt: retrieve the current value of a particular socket option for the specified socket.
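As an illustration of select and the FD_ macros listed above, the following sketch waits for a single socket to become readable, with a timeout. The helper name waitReadable and its millisecond parameter are hypothetical.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

// Hypothetical helper: wait up to `timeoutMs` milliseconds for `fd` to have
// data to read. Returns 1 if readable, 0 on timeout, -1 on error.
int waitReadable(int fd, int timeoutMs)
{
    fd_set readSet;
    FD_ZERO(&readSet);          // Clear the bit mask
    FD_SET(fd, &readSet);       // Watch this one descriptor

    struct timeval timeout;
    timeout.tv_sec  = timeoutMs / 1000;
    timeout.tv_usec = (timeoutMs % 1000) * 1000;

    // First argument is the highest descriptor number plus one.
    int rc = select(fd + 1, &readSet, nullptr, nullptr, &timeout);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readSet)) ? 1 : 0;
}
```

A readable result also covers the "socket closed" case, so the following recv() must still be checked for a return code of 0.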

Read/Write:

recv/recvfrom/recvmsg: Read socket
send/sendto/sendmsg: Write socket

Convert:

ntohl/htonl, ntohs/htons: convert values between host and network byte order
inet_pton: Convert an address from text form into a binary network address structure
inet_ntop: Convert a binary network address structure into text form
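The conversion functions above are typically used together when filling in or printing a socket address. The following is a minimal IPv4-only sketch; makeAddress and formatAddress are hypothetical helper names.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string>

// Hypothetical helper: build an IPv4 socket address from dotted-decimal text
// and a port given in host byte order. Returns false if the text is not a
// valid IPv4 address.
bool makeAddress(const std::string &ip, uint16_t port, sockaddr_in &out)
{
    out = sockaddr_in();
    out.sin_family = AF_INET;
    out.sin_port   = htons(port);       // Host to network byte order
    return inet_pton(AF_INET, ip.c_str(), &out.sin_addr) == 1;
}

// Hypothetical helper: format an IPv4 socket address as "a.b.c.d:port".
// Returns an empty string on failure.
std::string formatAddress(const sockaddr_in &addr)
{
    char text[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr.sin_addr, text, sizeof(text)) == nullptr)
        return std::string();
    return std::string(text) + ":" + std::to_string(ntohs(addr.sin_port));
}
```

For IPv6 the same two calls apply with AF_INET6, sockaddr_in6 and INET6_ADDRSTRLEN.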

Other supporting system calls:


exit: Terminate process
perror: Output explanation of an error code
protocols: Network Protocols (see /etc/protocols)

Links:

Apache Portable Runtime (APR) - [documentation] - Portable socket wrapper functions.
SCTP - Stream Control Transmission Protocol
RPC - Remote Procedure Calls
IBM: Secure programming with the OpenSSL API